- added 2 new types
- modified the models to get data through GraphQL
- modified the corresponding spec
- also testing that hidden comments do not show up
- modified the comments specs because it now returns comments on budget
  investments
It was a bit strange to leave the end date blank and have a message
associated with the start date, so we're using presence validations
instead.
For the range validation, we're using the comparison validator
(`validates_comparison_of`), which was added in Rails 7.0.
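A minimal sketch of what this looks like; the model and attribute names
here are illustrative, not the actual code:
```
class ExamplePhase < ApplicationRecord
  # Each date gets its own presence message, instead of a message about
  # the end date being attached to the start date
  validates :starts_at, presence: true
  validates :ends_at, presence: true

  # The Rails 7.0 comparison validator replaces the custom range check
  validates :ends_at, comparison: { greater_than: :starts_at }, allow_blank: true
end
```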
We aren't changing the `date_range` validation in polls yet because it's
a bit complex; we'll do it in the next commit.
This method was added in Rails 7.0 and makes the code slightly more
readable.
The downside is that it generates two queries instead of one, so it
might generate some confusion when debugging SQL queries. Its impact on
performance is probably negligible.
This method was introduced in Rails 7.0, and thanks to it we can
simplify the code that gets the translations in order.
We tried to use this method to simplify the `Randomizable` concern as
well. However, we found out that, when ordering tens of thousands of
records, the query could take several minutes, so we aren't using it in
this case. Using it for translation fallbacks is OK, since there's a
good chance we're never going to have tens of thousands of available
locales.
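Assuming the method in question is `in_order_of`, added in Rails 7.0 (a
guess based on the description above), its use for translation fallbacks
would look roughly like:
```
# Hypothetical sketch: return translations following the locale fallback
# chain, so the first one is the most appropriate translation
fallback_locales = I18n.fallbacks[I18n.locale]
translations.in_order_of(:locale, fallback_locales)
```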
Note that automated security tools reported a false positive related to
SQL Injection due to the way we used `LEFT JOIN`, so now we get one less
false positive in these reports.
These cases aren't covered by the `Rails/WhereRange` rubocop rule, but
IMHO using ranges makes them more consistent. Besides, they generate SQL
which is more consistent with what Rails usually generates. For example,
`Poll.where("starts_at <= :time and ends_at >= :time", time:
Time.current)` generates:
```
SELECT "polls"."id", (...) WHERE "polls"."hidden_at" IS NULL AND
(starts_at <= '2024-07-(...)' and ends_at >= '2024-07-(...)')
```
And `Poll.where(starts_at: ..Time.current, ends_at: Time.current..)`
generates:
```
SELECT "polls"."id", (...) WHERE "polls"."hidden_at" IS NULL AND
"polls"."starts_at" <= '2024-07-(...)' AND "polls"."ends_at" >=
'2024-07-(...)'
```
Note that the `not_archived` scope in proposals slightly changes, since
we were using `>` and now we use the equivalent of `>=`. However, since
the `created_at` field is a time, this will only mean that a proposal
will be archived about one microsecond later.
For consistency, we're also changing the `archived` scope, so a proposal
is never archived and not archived at the same time (not even for a
microsecond).
This rule was added in rubocop-rails 2.25.0. Applying it allows us to
simplify the code a little bit. For example, now there's no need to
specify the `proposals` table in proposal scopes, which was actually
causing a bug in the `Legislation::Proposal` model, which was using the
`proposals` table instead of the `legislation_proposals` table (but,
since we don't use this scope, it didn't affect the application).
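As an illustration (`archival_date` is a placeholder, not the actual
method), the string condition needs the hard-coded table name, while the
range version lets Rails qualify the column with each model's own table:
```
# Before: "proposals" is hard-coded, so Legislation::Proposal ended up
# querying the wrong table
scope :not_archived, -> { where("proposals.created_at > ?", archival_date) }

# After: Rails qualifies created_at with the model's own table and
# generates the equivalent of >= and <
scope :not_archived, -> { where(created_at: archival_date..) }
scope :archived,     -> { where(created_at: ...archival_date) }
```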
Until now, we've stored the text of the answer somebody replied to. The
idea was to handle the scenarios where the user votes for an option but
then that option is deleted and restored, or the texts of the options
are accidentally edited and so the option "Yes" is now "No" and vice
versa.
However, since commit 3a6e99cb8, options can no longer be edited once
the poll starts, so there's no risk of the option changing once somebody
has voted.
This means we can now store the ID of the option that has been voted
for. That'll also help us deal with a bug introduced in 673ec075e, since
answers in different locales are not counted as the same answer. Note we
aren't dealing with this bug right now.
We're still keeping (and storing) the answer as well. There are two
reasons for that.
First, we might add an "open answer" type of questions in the future and
use this column for it.
Second, we've still got logic depending on the answer, and we need to be
careful when changing it because there are existing installations where
the answer is present but the option_id is not.
Note that we're using `dependent: :nullify`. The reasoning is that, since
we're storing both the option_id and the answer text, we can still use
the answer text when removing the option. In practice, this won't matter
much, though, since we've got a validation rule that makes it impossible
to destroy options once the poll has started.
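A rough sketch of the associations this describes; the exact class and
association names are assumptions:
```
class Poll::Question::Option < ApplicationRecord
  # Destroying an option keeps the existing answers and clears their
  # option_id, so the stored answer text remains usable
  has_many :answers, class_name: "Poll::Answer", dependent: :nullify
end

class Poll::Answer < ApplicationRecord
  # option_id may be nil on existing installations, so the association
  # is optional
  belongs_to :option, class_name: "Poll::Question::Option", optional: true
end
```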
Also note we're still allowing duplicate records when the option is nil.
We need to do that until we've removed every duplicate record in the
database.
Note that, when taking votes from an erased user, since poll answers
don't belong to poll voters, we were not migrating them in the
`take_votes_from` method (and we aren't migrating them now either).
This call was added in commit 81f65f1ac, and the test for its need was
added in commit cb1542874. However, both the test and the helper method
relying on the `touch` call were removed in commit f90d0d9c4.
We were checking we didn't have more votes than allowed in the case of
questions with multiple answers, but we weren't checking it in the case
of questions with a single answer. This made it possible to create more
than one answer to the same question. This could happen because the
method `find_or_initialize_user_answer` might initialize two answers in
different threads, due to a race condition.
Having a class named `Poll::Question::Answer` and another class named
`Poll::Answer` was so confusing that no developer working on the project
has ever been capable of remembering which is which for more than a few
seconds.
Furthermore, we're planning to add open answers to polls, and we might
add a reference from the `poll_answers` table to the
`poll_question_answers` table to properly differentiate between open
answers and closed answers. Having yet another thing named answer would
be more than what our brains can handle (we know it because we did this
once in a prototype).
So we're renaming `Poll::Question::Answer` to `Poll::Question::Option`.
Hopefully that'll make it easier to remember. The name is also (more or
less) consistent with the `Legislation::QuestionOption` class, which is
similar.
We aren't changing the table or columns names for now in order to avoid
possible issues when upgrading (old code running with the new database
tables/columns after running the migrations but before deployment has
finished, for instance). We might do it in the future.
I've tried not to change the internationalization keys either so
existing translations would still be valid. However, since we have to
change the keys in `activerecord.yml` so methods like
`human_attribute_name` keep working, I'm also changing them in places
where similar keys were used (like `poll_question_answer` or
`poll/question/answer`).
Note that it isn't clear whether we should use `option` or
`question_option` in some cases. In order to keep things simple, we're
using `option` where we were using `answer` and `question_option` where
we were using `question_answer`.
Also note we're adding tests for the admin menu component, since at
first I forgot to change the `answers` reference there and all tests
passed.
We added the code indicating the table in commit 673ec075e because back
then we had a `title` column in both the `poll_question_answers` table
and the `poll_question_answer_translations` table.
Since that's no longer the case since commit 7a7877656, we can simplify
the code.
Not sure when we stopped needing them, but all usages of
require_dependency definitely became obsolete after we started using
Zeitwerk in commit 5f24ee912.
We're also going to rename `Poll::Question::Answer` to
`Poll::Question::Option`, so the conflict of having two `Answer`
classes, which made us add this code in the first place, will not even
exist.
We were using the same code, and the same regular expressions, in two
places. To do so, we were including a helper inside a model, which is
something we don't usually do.
This way we can simplify the view, particularly the form. However, we're
still adding some complexity to the form so inputs are inside labels and
so the collection is easier to style with CSS.
Note that, for everything to work consistently, we need to make sure
that the default locale is one of the available locales.
Also note that we aren't overwriting the `#save` method set by
globalize. I didn't feel too comfortable changing a monkey-patch which
ideally shouldn't be there in the first place. Besides, I haven't found
a case where `Globalize.locale` is `nil` (since it defaults to
`I18n.locale`, which should never be `nil`), so using
`I18n.default_locale` probably doesn't affect us.
We're only calling the `block_for` method from one place: the
`content_block` helper, and we're never passing the locale parameter to
that helper, which means we're always calling `block_for` using
`I18n.locale`.
So I'm not sure why we were doing the `locale ||=` assignment, since
`I18n.locale` doesn't return nil.
In any case, since the time this was added in commit c1de2dced, Ruby has
gained support for arguments forwarding, so we can use it here to
simplify the code a little bit.
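A rough sketch of what arguments forwarding allows here (the exact
signatures are assumptions):
```
# Ruby 3.0+ arguments forwarding: whatever the helper receives is passed
# straight through to block_for, with no locale default needed
def content_block(...)
  block_for(...)
end
```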
Note that, currently, we take these settings from the database but we
don't provide a way to edit them through the admin interface, so the
locales must be manually introduced through a Rails console.
While we did consider using a comma-separated list, we're using spaces
in order to be consistent with the way we store the allowed content
types settings.
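So, for now, enabling locales looks something like this in a Rails
console (the setting keys are the ones used in the test below; the
values are just an example):
```
# Space-separated values, consistent with the allowed content types settings
Setting["locales.default"] = "en"
Setting["locales.enabled"] = "en es fr"
```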
The `enabled_locales` nomenclature, which contrasts with
`available_locales`, is probably subconsciously based on similar
patterns like the one Nginx uses to enable sites.
Note that we aren't using `Setting.enabled_locales` in the globalize
initializer when setting the fallbacks. This means the following test
(which we could add to the shared globalizable examples) would fail:
```
it "Falls back to an enabled locale if the fallback is not enabled" do
  Setting["locales.default"] = "en"
  Setting["locales.enabled"] = "fr en"
  allow(I18n.fallbacks).to receive(:[]).and_return([:fr, :es])
  Globalize.set_fallbacks_to_all_available_locales
  I18n.with_locale(:fr) do
    expect(record.send(attribute)).to eq "In English"
  end
end
```
The reason is that the code making this test pass could be:
```
def Globalize.set_fallbacks_to_all_available_locales
  Globalize.fallbacks = I18n.available_locales.index_with do |locale|
    ((I18n.fallbacks[locale] & Setting.enabled_locales) + Setting.enabled_locales).uniq
  end
end
```
However, this would make it impossible to run `rake db:migrate` on new
applications because the initializer would try to load the `Setting`
model but the `settings` table wouldn't exist at that point.
Besides, this is a really rare case that IMHO we don't need to support.
For this scenario, an installation would have to enable a locale, create
records with contents in that locale, then disable that locale and have
that locale as a fallback for a language where content for that record
wasn't created. If that happened, it would be solved by creating content
for that record in every enabled language.
When calculating the stats on, say, January 1st 2024, and using an age
group of, say, between 20 and 24 years, we were considering that
everyone born between 2000 and 2004 was between 20 and 24 years old.
This wasn't accurate, since most people born in 2004 would be 19 years
old at that point, and most people born in 1999 would be 24.
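To illustrate the difference (this code is just an example, not the
actual implementation), the accurate version compares full birth dates
against the reference date instead of comparing birth years:
```
reference_date = Date.new(2024, 1, 1)

# Year-based (inaccurate): anyone born between 2000 and 2004 counts as 20-24
birth_years = (reference_date.year - 24)..(reference_date.year - 20)

# Date-based (accurate): people aged 20-24 on the reference date were born
# in this range; someone born on 2004-01-02 is still 19 on 2024-01-01
birth_dates = (reference_date - 25.years + 1.day)..(reference_date - 20.years)
```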
In commit e51e03446, we started using the same code to show stats in the
public area and in the admin area. However, in doing so we introduced a
bug. Stats in the public area are only shown after a certain part of the
process has finished, meaning the stats appearing on the page never
change (in theory), so it's perfectly fine to cache them. In the admin
area, however, stats can be accessed while the process is still ongoing,
so caching the stats will lead to the wrong results being displayed.
We've thought about expiring the cache when new supports or ballot lines
are added; however, that means the methods calculating the stats for the
supporting phase would expire when supports are added/removed but the
methods calculating the stats for the voting phase would expire when
ballot lines are added/removed. It gets even more complex because the
`headings` method calculates stats for both the supporting and the
voting phases.
So, since loading stats in the admin section is fast even without the
cache because they only load very basic statistics, we're taking the
simple approach of disabling the cache in this case, so everything works
the same way it did before commit e51e03446.
Co-authored-by: Javi Martín <javim@elretirao.net>
When we first started caching the stats, generating them was a process
that took several minutes, so we never expired the cache.
However, there have been cases where we ran into issues where the stats
shown on the screen were outdated. That's why we introduced a task to
manually expire the cache.
But now, generating the stats only takes a few seconds, so we can
automatically expire them every day, remove all the logic needed to
manually expire them, and get rid of most of the issues related to the
cache being outdated.
We're expiring them every day because that's the same expiration we were
applying to public stats (which we removed in commit 631b48f58), only
now we're using `expires_at:` to set the expiration time, in order to
simplify the code.
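A sketch of the idea; the cache key and the stats call are placeholders:
```
# Expire the cached stats at the end of the day instead of relying on a
# manual expiration task
Rails.cache.fetch("stats/#{stats_cache_key}", expires_at: Time.current.end_of_day) do
  generate_stats
end
```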
Note that, in the test, we're using `travel_to(time)` so the test passes
even when it starts an instant before midnight. We aren't using
`:with_frozen_time` because, in similar cases (although not in this
case, but I'm not sure whether that's intentional), `travel_to` shows
this error:
> Calling `travel_to` with a block, when we have previously already made
> a call to `travel_to`, can lead to confusing time stubbing.
Since now generating stats (assuming the results aren't in the cache)
only takes a few seconds even when there are a hundred thousand
participants, as opposed to the several minutes it took to generate them
when we introduced the Cron job, we can simply generate the stats during
the first request to the stats page.
Note that, in order to avoid creating a temporary table when the stats
are cached, we're making sure we only create this table when we need to.
Otherwise, we could spend up to 1 second on every request to the stats
page creating a table that isn't going to be used.
Also note we're using an instance variable to check whether we're
creating a table; I tried to use `table_exists?`, but it didn't work. I
wonder whether `table_exists?` doesn't detect temporary tables.
Since we're doing many queries to get stats for each age group and each
geozone, testing shows these indices make stats calculation about 25%
faster on processes with 100,000 participants.
Debugging shows that the bottleneck in the stats calculation is the
number of times we're querying the users table using the same array of
IDs in the `where` condition but each time combined with other
conditions.
So we're inserting the results of querying the users table with the
array of IDs in a temporary table and using this temporary table for the
other calculations. When querying this temporary table, there's no need
to filter for IDs anymore.
For budget stats, the `generate` method is now about 10-20 times faster
for a budget with 20,000 participants. For budgets with only a few dozen
participants, there's no significant difference in performance.
I thought about modifying the `participants` method and using the
temporary table there. The problem, however, is that in this case it
isn't clear when to drop the temporary table, and we could end up with
thousands of temporary tables in the database if we don't do it right.
Creating and dropping the temporary table in the same transaction, on
the other hand, guarantees that won't be the case.
Note there's no risk of duplicate tables since they're created and
dropped inside a transaction, so we're always using the same table name
for the same resource. We're adding a test that fails with a
`PG::DuplicateTable: ERROR: relation "participants__1"` error if we
don't use a transaction.
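A rough sketch of the approach; table and column names are illustrative,
not the real implementation:
```
ActiveRecord::Base.transaction do
  connection = ActiveRecord::Base.connection

  # Filter the users table by participant IDs once, into a temporary table
  connection.create_table :participant_users, temporary: true, id: false do |t|
    t.integer :user_id
    t.integer :geozone_id
    t.date :date_of_birth
    t.string :gender
  end

  connection.execute <<~SQL
    INSERT INTO participant_users (user_id, geozone_id, date_of_birth, gender)
    SELECT id, geozone_id, date_of_birth, gender
    FROM users
    WHERE id IN (#{participants.select(:id).to_sql})
  SQL

  # Age, gender and geozone stats now query participant_users directly,
  # with no need to filter by IDs every time
  stats = calculate_stats

  # Created and dropped inside the same transaction, so there's no risk of
  # leaving thousands of temporary tables behind
  connection.drop_table :participant_users

  stats
end
```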
Back in commit 383909e16, we said:
> Even if this class looks very simple now, we're trying a few things
> related to these stats. Having a class for it makes future changes
> easier and, if there weren't any future changes, at least it makes
> current experiments easier.
Since there haven't been any changes in the last 5 years and we've found
cases where using the GeozoneStats class results in a slightly worse
performance, we're removing this class. The code is now a bit easier to
read, and is consistent with the way we calculate participants by age.
This reverts commit 383909e16.
Using `>` and `<` for dates is a bit confusing because it usually
requires mentally translating `<` into "before" and `>` into "after",
meaning it takes a few seconds to check which is the starting date and
which is the ending date. When using a range, it's easier to see which
is which.
Note that we were using `>` and `<`, but now we're using the equivalent
of `>=` and `<=`. This makes sense IMHO, since the beginning of the year
and the end of the year should also be included in this case.
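For example (the model and column are placeholders):
```
# Before: < and > need a mental translation into "before" and "after"
User.where("created_at > ? AND created_at < ?", date.beginning_of_year, date.end_of_year)

# After: reads as "within this year", and both endpoints are included
User.where(created_at: date.beginning_of_year..date.end_of_year)
```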
We were calculating the age stats based on the age of the users who
participated... at the moment when we were calculating the stats. That
means that, if 20 years ago, 1000 people who were 16 years old
participated, they would be shown as being 36 years old in the stats.
Instead, we want to show the stats at the time when the process took
place, so we're implementing a `participation_date` method.
Note that, for polls, we could actually use the `age` column in the
`poll_voters` table. However, doing so would be harder, would only work
for polls but not for budgets, and it wouldn't be statistically very
relevant, since the stats are shown by age groups, and only a small
percentage of people would change their age group (and only to the
nearest one) between the time they participate and the time the process
ends.
We might use the `poll_voters` table in the future, though, since we
have a similar issue with geozones and genders, and using the
information in `poll_voters` would solve it as well (only for polls,
though).
Also note that we're using the `ends_at` dates because some people might
be too young to vote when a process starts but old enough to vote when
the process ends.
Finally, note that we might need to change the way we calculate the
participation date for a budget, since some budgets might not have every
phase enabled. Not sure how stats work in that scenario (even before
these changes).
This was added in commit 02f19aa4b, before we started tracking events.
I don't think we ever used it; in any case, we now use the `Ahoy::Chart`
class to deal with the stats Ahoy used to generate.
This page isn't linked from anywhere and most Consul Democracy
installations don't even know it exists, so it's useless for most
people.
If we ever bring it back, we should at least add a link pointing to this
page.
Now that we've moved the logic to generate the events data to a model,
and we've got access to the model in the component rendering the chart,
we can render the data inside the chart instead of doing an extra AJAX
request to get the same data.
Originally, this was probably done this way so the page wouldn't take
several seconds to load while preparing the data for the chart when
there are thousands of dates being displayed. With an AJAX call, the
page would load as fast as usual, and then the chart would render after
a few seconds. However, we can get an even better performance
improvement in this scenario if we use a Set instead of an Array. The
method `Array#include?`, which we were calling for every date in the
data, is much slower than `Set#merge`. So now both the page and the
chart load as fast as expected.
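The Set-based version looks roughly like this (method and variable names
are assumptions):
```
def add(collection)
  # Set#merge skips duplicates without scanning the whole collection,
  # unlike calling Array#include? for every date
  shared_keys.merge(collection.keys)
end

def shared_keys
  @shared_keys ||= Set.new
end
```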
We could also use something like:
```
def add
  (...)
  shared_keys.push(*collection.keys)
end

def build
  (...)
  shared_keys.uniq.each do |k|
    (...)
  end
end

def shared_keys
  @shared_keys ||= []
end
```
Or other approaches to avoid using `Array#include?`. The performance
would be similar to the one we get when using `Set`. We're using a `Set`
because it makes it more obvious that `shared_keys` is supposed to contain
unique elements.
We've had some tests failing in the past due to these AJAX requests
being triggered automatically during the tests and no expectations
checking the requests have finished, so now we're reducing the amount of
flaky tests.
We were always displaying the event names in English.
Note we're changing the `user_supported_budgets` key because it didn't
make much sense; the investments are supported, and not the budgets.
We're also adding "created" to most of the event names in order to make
the texts more explicit, since not all the events refer to created data.
Note we're delegating the `t` method because i18n-tasks doesn't detect
code like `ApplicationController.helpers.t` and so reports we aren't
using the `admin.stats.graph` translations.
We were tracking some events with Ahoy, but in an inconsistent way. For
example, we were tracking when a debate was created, but (probably
accidentally) we were only tracking proposals when they were created
from the management section. For budget investments and their supports,
we weren't using Ahoy events but checking their database tables instead.
And we were only using Ahoy events for the charts; for the other stats,
we were using the real data.
While we could actually fix these issues and start tracking events
correctly, existing production data would remain broken because we
didn't track a certain event when it happened. And, besides, why should
we bother, for instance, to track when a debate is created, when we can
instead access that information in the debates table?
There are probably some features related to tracking an event together
with its visit, but we weren't using them, and we were storing more user
data than we needed to.
So we're removing the tracked events, allowing us to simplify the code and
make it more consistent. We aren't removing the `ahoy_events` table in
case existing Consul Democracy installations use it, but we'll remove it
after releasing version 2.2.0 and adding a warning in the release notes.
This change fixes the proposal created chart, since we were only
tracking proposals created in the management section, and opens the
possibility to add more charts in the future using data we didn't track
with Ahoy.
Also note the "Level 2 user Graph" test wasn't testing the graph, so
we're changing it in order to test it. We're also moving it next to the
other graph tests and, since we were tracking the event when confirming
the phone, we're renaming it to "Level 3 users".
Finally, note that, since we were tracking events when something was
created, we're including the `with_hidden` scope. This is also
consistent with the other stats shown in the admin section as well as
the public stats.
The only way to use campaigns is to manually insert them in the
database, which IMHO isn't very practical.
We're going to change every piece of code that generates an Ahoy event
and, in this case, the easiest way to change the Campaign model so it
doesn't use Ahoy events is to simply remove it.
Note we're keeping the database tables until we release a new version,
just in case some Consul Democracy installations are using them. We'll
mention in the release notes that we'll remove the campaigns table after
the release, so existing installations using the `campaigns` table can
move the data somewhere else before we remove the table.