[This is a cross-post from my other blog]
I decided to try to brush off some very rusty SQL skills and write a query against the OGF database to find out what our monthly active user counts looked like.
This data can only be compiled accurately for dates since the “migration”, which was in August, 2021. Before that changesets are not properly dated as they were all loaded at once from the old instance (Thilo’s) to the new instance (Luciano’s).
I decided to categorize unique user counts for each month by their “start year” (e.g. a user like myself would be under start year 2014). That way I could also see how old “generations” of users drop off over time.
I started with a nested SQL query, which is basically what I used to do professionally about 20 years ago, but it’s been a while since I wrote raw queries against complex data like this. It comes back fairly quickly though, and this is definitely the result of a little bit of trial and error. I’m working with two tables in the openstreetmap PostgreSQL database: users and changesets. I’m counting changesets by the month they occur and by a breakdown of users by when they started using the OGF site (with a special category for users who are “new” in the month they are active).
Here’s the query I made and ran at the PSQL prompt on the server (these data structures are all public knowledge, part of the OSM specification, so I don’t feel worried sharing it).
SELECT t.changeset_month, t.user_first_year, t.new_user_flag, COUNT(DISTINCT t.user_id) AS count_users, COUNT(t.changeset_id) AS count_changesets FROM ( SELECT i.changeset_id, i.changeset_month, i.user_id, LEFT(i.creation_month, 4) AS user_first_year, CASE WHEN i.creation_month = i.changeset_month THEN 'new_user' ELSE 'old_user' END AS new_user_flag FROM ( SELECT to_char(changesets.closed_at, 'YYYY-MM') AS changeset_month, changesets.user_id AS user_id, changesets.id AS changeset_id, to_char(users.creation_time, 'YYYY-MM') as creation_month FROM changesets INNER JOIN users ON users.id = changesets.user_id WHERE changesets.closed_at >= '2021-10-01'::date AND changesets.closed_at <= '2022-12-31'::date ) AS i ) AS t GROUP BY t.changeset_month, t.user_first_year, t.new_user_flag ORDER BY t.changeset_month, t.user_first_year, t.new_user_flag DESC;
Here’s the raw output I got.
changeset_month | user_first_year | new_user_flag | count_users | count_changesets -----------------+-----------------+---------------+-------------+------------------ 2021-10 | 2013 | old_user | 10 | 112 2021-10 | 2014 | old_user | 17 | 1098 2021-10 | 2015 | old_user | 23 | 720 2021-10 | 2016 | old_user | 24 | 448 2021-10 | 2017 | old_user | 34 | 787 2021-10 | 2018 | old_user | 45 | 1088 2021-10 | 2019 | old_user | 35 | 750 2021-10 | 2020 | old_user | 28 | 557 2021-10 | 2021 | old_user | 34 | 492 2021-10 | 2021 | new_user | 24 | 280 2021-11 | 2012 | old_user | 1 | 1 2021-11 | 2013 | old_user | 8 | 207 2021-11 | 2014 | old_user | 19 | 687 2021-11 | 2015 | old_user | 26 | 697 2021-11 | 2016 | old_user | 21 | 408 2021-11 | 2017 | old_user | 36 | 738 2021-11 | 2018 | old_user | 41 | 1271 2021-11 | 2019 | old_user | 29 | 663 2021-11 | 2020 | old_user | 26 | 749 2021-11 | 2021 | old_user | 28 | 439 2021-11 | 2021 | new_user | 23 | 498 2021-12 | 2013 | old_user | 10 | 335 2021-12 | 2014 | old_user | 17 | 773 2021-12 | 2015 | old_user | 23 | 892 2021-12 | 2016 | old_user | 25 | 657 2021-12 | 2017 | old_user | 31 | 703 2021-12 | 2018 | old_user | 43 | 1506 2021-12 | 2019 | old_user | 29 | 769 2021-12 | 2020 | old_user | 25 | 664 2021-12 | 2021 | old_user | 45 | 854 2021-12 | 2021 | new_user | 22 | 250 2022-01 | 2012 | old_user | 1 | 3 2022-01 | 2013 | old_user | 7 | 174 2022-01 | 2014 | old_user | 13 | 1023 2022-01 | 2015 | old_user | 22 | 805 2022-01 | 2016 | old_user | 23 | 568 2022-01 | 2017 | old_user | 32 | 753 2022-01 | 2018 | old_user | 39 | 1642 2022-01 | 2019 | old_user | 30 | 1062 2022-01 | 2020 | old_user | 30 | 878 2022-01 | 2021 | old_user | 42 | 703 2022-01 | 2022 | new_user | 24 | 192 2022-02 | 2013 | old_user | 10 | 397 2022-02 | 2014 | old_user | 19 | 1251 2022-02 | 2015 | old_user | 21 | 768 2022-02 | 2016 | old_user | 22 | 540 2022-02 | 2017 | old_user | 30 | 1136 2022-02 | 2018 | old_user | 41 | 1612 2022-02 | 2019 | old_user | 25 | 617 2022-02 | 2020 | old_user | 27 | 888 2022-02 | 2021 | old_user | 41 | 722 2022-02 | 2022 | old_user | 8 | 117 2022-02 | 2022 | new_user | 21 | 234 2022-03 | 2013 | old_user | 8 | 61 2022-03 | 2014 | old_user | 17 | 1384 2022-03 | 2015 | old_user | 25 | 927 2022-03 | 2016 | old_user | 22 | 705 2022-03 | 2017 | old_user | 35 | 1498 2022-03 | 2018 | old_user | 41 | 1211 2022-03 | 2019 | old_user | 33 | 861 2022-03 | 2020 | old_user | 30 | 1018 2022-03 | 2021 | old_user | 38 | 1458 2022-03 | 2022 | old_user | 13 | 491 2022-03 | 2022 | new_user | 27 | 758 2022-04 | 2013 | old_user | 7 | 186 2022-04 | 2014 | old_user | 17 | 871 2022-04 | 2015 | old_user | 24 | 500 2022-04 | 2016 | old_user | 19 | 683 2022-04 | 2017 | old_user | 30 | 1093 2022-04 | 2018 | old_user | 39 | 1291 2022-04 | 2019 | old_user | 29 | 833 2022-04 | 2020 | old_user | 28 | 673 2022-04 | 2021 | old_user | 32 | 753 2022-04 | 2022 | old_user | 25 | 900 2022-04 | 2022 | new_user | 21 | 231 2022-05 | 2012 | old_user | 1 | 1 2022-05 | 2013 | old_user | 8 | 214 2022-05 | 2014 | old_user | 22 | 1093 2022-05 | 2015 | old_user | 25 | 526 2022-05 | 2016 | old_user | 25 | 899 2022-05 | 2017 | old_user | 28 | 1100 2022-05 | 2018 | old_user | 40 | 1097 2022-05 | 2019 | old_user | 30 | 873 2022-05 | 2020 | old_user | 30 | 673 2022-05 | 2021 | old_user | 34 | 1328 2022-05 | 2022 | old_user | 27 | 558 2022-05 | 2022 | new_user | 30 | 348 2022-06 | 2013 | old_user | 7 | 249 2022-06 | 2014 | old_user | 17 | 979 2022-06 | 2015 | old_user | 26 | 605 2022-06 | 2016 | old_user | 26 | 593 2022-06 | 2017 | old_user | 30 | 988 2022-06 | 2018 | old_user | 39 | 941 2022-06 | 2019 | old_user | 27 | 985 2022-06 | 2020 | old_user | 26 | 783 2022-06 | 2021 | old_user | 36 | 919 2022-06 | 2022 | old_user | 44 | 606 2022-06 | 2022 | new_user | 31 | 523 2022-07 | 2013 | old_user | 8 | 84 2022-07 | 2014 | old_user | 19 | 1414 2022-07 | 2015 | old_user | 25 | 630 2022-07 | 2016 | old_user | 24 | 558 2022-07 | 2017 | old_user | 29 | 1209 2022-07 | 2018 | old_user | 39 | 1199 2022-07 | 2019 | old_user | 27 | 632 2022-07 | 2020 | old_user | 29 | 796 2022-07 | 2021 | old_user | 35 | 1082 2022-07 | 2022 | old_user | 41 | 750 2022-07 | 2022 | new_user | 38 | 632 2022-08 | 2013 | old_user | 10 | 165 2022-08 | 2014 | old_user | 17 | 1236 2022-08 | 2015 | old_user | 24 | 687 2022-08 | 2016 | old_user | 25 | 636 2022-08 | 2017 | old_user | 29 | 961 2022-08 | 2018 | old_user | 41 | 1145 2022-08 | 2019 | old_user | 35 | 808 2022-08 | 2020 | old_user | 25 | 585 2022-08 | 2021 | old_user | 38 | 847 2022-08 | 2022 | old_user | 50 | 1049 2022-08 | 2022 | new_user | 40 | 589 2022-09 | 2013 | old_user | 10 | 175 2022-09 | 2014 | old_user | 19 | 1534 2022-09 | 2015 | old_user | 20 | 604 2022-09 | 2016 | old_user | 24 | 750 2022-09 | 2017 | old_user | 23 | 735 2022-09 | 2018 | old_user | 41 | 1302 2022-09 | 2019 | old_user | 28 | 841 2022-09 | 2020 | old_user | 24 | 503 2022-09 | 2021 | old_user | 30 | 609 2022-09 | 2022 | old_user | 54 | 1035 2022-09 | 2022 | new_user | 25 | 261 2022-10 | 2012 | old_user | 1 | 3 2022-10 | 2013 | old_user | 8 | 231 2022-10 | 2014 | old_user | 18 | 1112 2022-10 | 2015 | old_user | 22 | 433 2022-10 | 2016 | old_user | 25 | 789 2022-10 | 2017 | old_user | 25 | 809 2022-10 | 2018 | old_user | 39 | 1267 2022-10 | 2019 | old_user | 28 | 895 2022-10 | 2020 | old_user | 30 | 736 2022-10 | 2021 | old_user | 27 | 382 2022-10 | 2022 | old_user | 54 | 1040 2022-10 | 2022 | new_user | 17 | 173 2022-11 | 2013 | old_user | 8 | 248 2022-11 | 2014 | old_user | 16 | 1164 2022-11 | 2015 | old_user | 22 | 357 2022-11 | 2016 | old_user | 23 | 675 2022-11 | 2017 | old_user | 25 | 927 2022-11 | 2018 | old_user | 37 | 897 2022-11 | 2019 | old_user | 25 | 702 2022-11 | 2020 | old_user | 25 | 679 2022-11 | 2021 | old_user | 25 | 577 2022-11 | 2022 | old_user | 49 | 1100 2022-11 | 2022 | new_user | 31 | 1641 2022-12 | 2013 | old_user | 6 | 317 2022-12 | 2014 | old_user | 18 | 981 2022-12 | 2015 | old_user | 23 | 543 2022-12 | 2016 | old_user | 20 | 541 2022-12 | 2017 | old_user | 25 | 941 2022-12 | 2018 | old_user | 34 | 973 2022-12 | 2019 | old_user | 28 | 935 2022-12 | 2020 | old_user | 22 | 500 2022-12 | 2021 | old_user | 29 | 432 2022-12 | 2022 | old_user | 59 | 1623 2022-12 | 2022 | new_user | 25 | 317
I plugged these data into a spreadsheet, did a few changes and a pivot table. Here’s a graph of the result.
The top band, in light blue, is the “new users” band – these are users each month who are active in their first month of joining OGF. The lower bands represent each year back to OGF’s founding, in 2012 (there were only 2 users in the first year, Thilo and Joschi).
Based on that graph, I would say really OGF is quite stable. We acquire a certain number of new users each month, we lose about an equal proportion of old users, but some subset of long-term users stick around.