Since we're doing many queries to get stats for each age group and each
geozone, testing shows these indices make stats calculation about 25%
faster on processes with 100,000 participants.
Debugging shows that the bottleneck in the stats calculation is the
number of times we're querying the users table using the same array of
IDs in the `where` condition but each time combined with other
conditions.
So we're inserting the results of querying the users table with the
array of IDs in a temporary table and using this temporary table for the
other calculations. When querying this temporary table, there's no need
to filter for IDs anymore.
For budget stats, the `generate` method is now about 10-20 times faster
for a budget with 20,000 participants. For budgets with only a few dozen
participants, there's no significant difference in performance.
I thought about modifying the `participants` method and use the
temporary table there. The problem, however, is that in this case it
isn't clear when to drop the temporary table, and we could end up with
thousands of temporary tables in the database if we don't do it right.
Creating and dropping the temporary table in the same transaction, on
the other hand, guarantees that won't be the case.
Note there's no risk of duplicate tables since they're created and
dropped inside a transaction, so we're always using the same table name
for the same resource. We're adding a test that fails with a
`PG::DuplicateTable: ERROR: relation "participants__1"` error if we
don't use a transaction.
Back in commit 383909e16, we said:
> Even if this class looks very simple now, we're trying a few things
> related to these stats. Having a class for it makes future changes
> easier and, if there weren't any future changes, at least it makes
> current experiments easier.
Since there haven't been any changes in the last 5 years and we've found
cases where using the GeozoneStats class results in a slightly worse
performance, we're removing this class. The code is now a bit easier to
read, and is consistent with the way we calculate participants by age.
This reverts commit 383909e16.
Using `>` and `<` for dates is a bit confusing because it usually
requires mentally translating `<` into "before" and `>` into "after",
meaning it takes a few seconds to check which is the starting date and
which is the ending date. When using a range, it's easier to see which
is which.
Note that we were using `>` and `<`, but now we're using the equivalent
of `>=` and `<=`. This makes sense IMHO, since the beginning of the year
and the end of the year should also be included in this case.
We were calculating the age stats based on the age of the users who
participated... at the moment where we were calculating the stats. That
means that, if 20 years ago, 1000 people who were 16 years old
participated, they would be shown as having 36 years in the stats.
Instead, we want to show the stats at the time when the process took
place, so we're implementing a `participation_date` method.
Note that, for polls, we could actually use the `age` column in the
`poll_voters` table. However, doing so would be harder, would only work
for polls but not for budgets, and it wouldn't be statistically very
relevant, since the stats are shown by age groups, and only a small
percentage of people would change their age group (and only to the
nearest one) between the time they participate and the time the process
ends.
We might use the `poll_voters` table in the future, though, since we
have a similar issue with geozones and genders, and using the
information in `poll_voters` would solve it as well (only for polls,
though).
Also note that we're using the `ends_at` dates because some people but
be too young to vote when a process starts but old enough to vote when
the process ends.
Finally, note that we might need to change the way we calculate the
participation date for a budget, since some budgets might not enabled
every phase. Not sure how stats work in that scenario (even before these
changes).
We were already testing it as part of the statisticable concern, but
it's easier to understand when we have specific tests for it as well.
Note that we're using `eq` insteab of `be` because it's what we use most
often in the test. For consistency, we're also changing the tesst in
budget stats so it uses `eq`.
According to the README [1]:
> To mask previously collected IPs, use:
> Ahoy::Visit.find_each do |visit|
> visit.update_column :ip, Ahoy.mask_ip(visit.ip)
> end
We're adapting the code with our version, since we use the `Visit` model
instead of the `Ahoy::Visit` model.
[1] https://github.com/ankane/ahoy/blob/v5.0.2/README.md#ip-masking
As mentioned in Ahoy's README [1]:
> Ahoy provides a number of options to help with GDPR compliance.
> Update config/initializers/ahoy.rb with:
>
> class Ahoy::Store < Ahoy::DatabaseStore
> def authenticate(data)
> # disables automatic linking of visits and users
> end
> end
>
> Ahoy.mask_ips = true
> Ahoy.cookies = :none
As also mentioned in the README:
> If Ahoy was installed before v5, add an index before making this
> change.
> (...)
> For Active Record, create a migration with:
> add_index :ahoy_visits, [:visitor_token, :started_at]
However, the `visitor_token` doesn't exist in our table, since we
generated the `visits` table when Ahoy used the `visitor_id` column. So
we're using this column for the index.
Note we also need to change the `visit` method, since otherwise we get
an exception [2]. As mentioned on the issue reporting the exception:
> you'll need to copy the latest version of that method and adapt it to
> your model. I believe you'll want to replace:
>
> where(visit_token: ahoy.visit_token) with
> where(id: ensure_uuid(ahoy.visit_token))
>
> where(visitor_token: ahoy.visitor_token) with
> where(visitor_id: ensure_uuid(ahoy.visitor_token))
So we're copying the latest version of that method and changing it
accordingly.
[1] https://github.com/ankane/ahoy/blob/v5.0.2/README.md
[2] Issue 549 in https://github.com/ankane/ahoy
This was added in commit 02f19aa4b, before we started tracking events.
I don't think we ever used it; in any case, we now use the `Ahoy::Chart`
class to deal with the stats Ahoy used to generate.
This page isn't linked from anywhere and most Consul Democracy
installations don't even know it exists, so it's useless for most
people.
If we ever bring it back, we should at least add a link pointing to this
page.
The test was supposed to be testing notifications without a link, but it
was adding a link. It was only passing because the expectations was
checking whether there was a link whose text was a URL, but what we
want to check here is whether the URL is present in the `href`
attribute.
We're about to delete the page showing public stats and, since this link
was using the `/stats` path, we're fixing it.
Now that we've moved the logic to generate the events data to a model,
and we've got access to the model in the component rendering the chart,
we can render the data inside the chart instead of doing an extra AJAX
request to get the same data.
Originally, this was problaby done this way so the page wouldn't take
several seconds to load while preparing the data for the chart when
there are thousands of dates being displayed. With an AJAX call, the
page would load as fast as usual, and then the chart would render after
a few seconds. However, we can have an even better performance
improvement in this scenario if we use a Set instead of an Array. The
method `Array#include?`, which we were calling for every date in the
data, is much slower that `Set#merge`. So now both the page and the
chart load as fast as expected.
We could also use something like:
```
def add
(...)
shared_keys.push(*collection.keys)
end
def build
(...)
shared_keys.uniq.each do |k|
(...)
end
def shared_keys
@shared_keys ||= []
end
```
Or other approaches to avoid using `Array#include?`. The performance
would be similar to the one we get when using `Set`. We're using a `Set`
because it makes more obvious that `shared_keys` is supposed to contain
unique elements.
We've had some tests failing in the past due to these AJAX requests
being triggered automatically during the tests and no expectations
checking the requests have finished, so now we're reducing the amount of
flaky tests.
We were always displaying the event names in English.
Note we're changing the `user_supported_budgets` key because it didn't
make much sense; the investments are supported, and not the budgets.
We're also adding "created" to most of the event names in order to make
the texts more explicit, since not all the events refer to created data.
Note we're delegating the `t` method because i18n-tasks doesn't detect
code like `ApplicationController.helpers.t` and so reports we aren't
using the `admin.stats.graph` translations.
We were tracking some events with Ahoy, but in an inconsistent way. For
example, we were tracking when a debate was created, but (probably
accidentally) we were only tracking proposals when they were created
from the management section. For budget investments and their supports,
we weren't using Ahoy events but checking their database tables instead.
And we were only using ahoy events for the charts; for the other stats,
we were using the real data.
While we could actually fix these issues and start tracking events
correctly, existing production data would remain broken because we
didn't track a certain event when it happened. And, besides, why should
we bother, for instance, to track when a debate is created, when we can
instead access that information in the debates table?
There are probably some features related to tracking an event and their
visits, but we weren't using them, and we were storing more user data
than we needed to.
So we're removing the track events, allowing us to simplify the code and
make it more consistent. We aren't removing the `ahoy_events` table in
case existing Consul Democracy installations use it, but we'll remove it
after releasing version 2.2.0 and adding a warning in the release notes.
This change fixes the proposal created chart, since we were only
tracking proposals created in the management section, and opens the
possibility to add more charts in the future using data we didn't track
with Ahoy.
Also note the "Level 2 user Graph" test wasn't testing the graph, so
we're changing it in order to test it. We're also moving it next to the
other graphs test and, since we were tracking the event when we were
confirming the phone, we're renaming to "Level 3 users".
Finally, note that, since we were tracking events when something was
created, we're including the `with_hidden` scope. This is also
consistent with the other stats shown in the admin section as well as
the public stats.
The only way to use campaigns is to manually insert them in the
database, which IMHO isn't very practical.
We're going to change every piece of code that generates an Ahoy event
and, in this case, the easiest way to change the Campaing model so it
doesn't use Ahoy events is to simply remove it.
Note we're keeping the database tables until we release a new version,
just in case some Consul Democracy installations are using them. We'll
inform in the release notes that we'll remove the campaigns table after
the release, so existing installations using the `campaigns` table can
move the data somewhere else before we remove the table.
Using <h3> headings for the links had two disadvantages.
First, it was the wrong heading level to use, since there was no <h2>
tag before it.
Second, headings are supposed to be followed by content associated to
that heading; here, we had no content following the headings.
So we're using a list of links and giving it a heading. We're adding
styles so the page still looks like it used to, although these styles
are certainly asking for improvements.
The JavaScript required to display the chart wasn't loaded on the admin
stats page.
We're not adding a test because we're going to move the budgets graph to
a different page on the pull request containing this commit.
Note we're changing the "Go back" link, since using a turbolinks refresh
broke this link when using the Chromium browser. Besides, there was an
inconsistency where some of the "Go back" links in admin stats pointed
to the admin stats page but other links pointed to `:back`.
We were only using it in one place: the debates controller. All the
other controllers including CommentableActions were overwriting this
action, except the ones in the admin area, where creating proposals,
debates or investments isn't possible.
Note this means that, most of the times, we weren't tracking events
creating a resource.
Also note that since, as mentioned in commit 3752fef6b, there are no
geozones in the debates form, we don't have to load them when creating a
debate fails due to validation rules.
We were using Foundation's accordion menu to open/close nested lists of
links. Unfortunately, Foundation's accordion makes it impossible to
access links in nested links using the keyboard [1] (note the issue is
closed, but in the latest version of Foundation, 6.8.1, it's still
present, and Foundation's development is mostly discontinued).
Furtheremore, it adds the `menuitem` role to links, but ARIA menus are
not ment for navigation but for application behavior and, since it
doesn't add the `menubar` or `menu` roles to the parent elements, it
results in accessibility issues for people using screen readers (also
reported by the Axe accessibility testing engine).
So we need to implement our own solution. We're using the most commonly
used pattern: a buttton with the `aria-expanded` attribute. And, for
people using browsers where JavaScript hasn't loaded, we're keeping the
submenus open at all times (just like we were doing until now), and
we're disabling the buttons (since they do nothing without JavaScript).
This might not be an ideal solution, but it's probably good enough, and
way better than what we had until now.
We've also considered using the <details> and <summary> elements instead
of using buttons to open/close items on the list. However, these
elements still present some accessibility issues [2], and the transition
between open and closed can't be animated unless we overwrite the
`click` event with JavaScript. The pattern of using these elements to
open/close a nested list of links isn't common either, and some people
using screen readers might get confused when entering/leaving the nested
list.
We tried other approaches to get the animation effect, all of them based
on adding `[aria-expanded="false"]:not([disabled]) + * { display: none;
}` to the CSS file.
Unfortunately, animation using CSS isn't feasible right now because
browsers can't animate a change form `height: 0` to `height: auto`.
There are some hacks like animating the `max-height` or the `flex-grow`
property, but the resulting animation is inconsistent. A perfect
animation can be done using the `grid-template-rows` property [3], but
it requires adding a grid container and only works in Firefox and recent
versions of Chrome and similar browsers.
Getting to a solution with JavaScript was also tricky. With the
following approach, `slideToggle()` opened the menu the first time, even
if it was already open (not sure why):
```
toggle_buttons.on("click", function() {
$(this).attr("aria-expanded", !JSON.parse($(this).attr("aria-expanded")));
$(this).next().slideToggle();
});
```
This made the arrow turn after the menu had slided instead of doing it
at the same time:
```
toggle_buttons.on("click", function() {
var button = $(this);
button.next().slideToggle(function() {
button.attr("aria-expanded",
!JSON.parse(button.attr("aria-expanded")));
});
}
```
With this, everything disappeared quickly:
```
toggle_buttons.on("click", function() {
var expanded = JSON.parse($(this).attr("aria-expanded"));
if (expanded) {
$(this).next().slideUp();
} else {
$(this).next().slideDown();
}
$(this).attr("aria-expanded", !expanded);
}
```
So, in the end, we're hiding the nested link lists with JavaScript
instead of CSS.
[1] Issue 12046 in https://github.com/foundation/foundation-sites
[2] https://www.scottohara.me/blog/2022/09/12/details-summary.html
[3] https://css-tricks.com/css-grid-can-do-auto-height-transitions