This table will store which reports (stats, results, ...) will be shown
for a certain process (polls, budgets, ...).
Note Rails fails to save a poll and its report when both are new records
if we add a `validates :process, presence: true` rule. Since it caused a
lot of trouble when creating records for tests using factories, I've
removed that rule completely. Instead, I've created the
`results_enabled=` and `stats_enabled=` methods, so tests are easier to
set up, while also automatically creating a report if it doesn't already
exist. This also decouples form structure and database implementation.
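As a rough illustration of those setters, here is a minimal sketch in plain Ruby rather than ActiveRecord, so it runs on its own. `Report` and `Poll` are stand-in classes, not the real models, and the method bodies are only an assumption about how such setters could build the report on demand:

```ruby
# Stand-in for a row in the reports table: one boolean per report type.
Report = Struct.new(:stats, :results, keyword_init: true)

# Stand-in for a process (a poll); not the real ActiveRecord model.
class Poll
  # Lazily build the report so callers never have to create it first.
  def report
    @report ||= Report.new(stats: false, results: false)
  end

  def stats_enabled=(enabled)
    report.stats = enabled
  end

  def results_enabled=(enabled)
    report.results = enabled
  end

  def stats_enabled?
    report.stats
  end

  def results_enabled?
    report.results
  end
end

poll = Poll.new
poll.stats_enabled = true # the report is created behind the scenes
```

With this shape, a test or a form only sets `stats_enabled` or `results_enabled` and doesn't need to know a separate report record exists.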
Originally I named this table `enabled_reports` and instead of having
`stats` and `results` columns, it had an `enabled` column and a `kind`
column, which would be set to "stats" or "results". However, although
that table would allow us to add arbitrary reports easily, I found the
way we had to handle the `has_many` relationship was a bit too complex.
This code might be slightly slower because it performs one query per
field in the form, but I didn't notice any differences on my development
machine, and the code is now much easier to understand.
Due to technical issues, sometimes users voted in booths and their vote
couldn't be added to the database. So we're including them in the users
with no demographic data.
Using SQL's `select` instead of converting the records to a Ruby array
increases performance dramatically when there are thousands of records.
For a poll with 200000 voters, calculating stats took more than 7
minutes, and now it takes less than 2 minutes.
We're generating stats every 2 hours because it's less than the time it
will take to generate stats for every process. Once stats are generated,
this task should take less than a second.
The regenerate task has been added so we can manually execute it.
These methods are only used while stats are being generated; once stats
are generated, they aren't used anymore. So there's no need to store
them using the Dalli cache.
Furthermore, there are polls (and even budgets) with hundreds of
thousands of participants. Calculating stats for them takes a very long
time because we can't store all those records in the Dalli cache.
However, since these records aren't used once the stats are generated,
we can store them in an instance variable while we generate the stats,
speeding up the process.
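The instance-variable approach can be sketched like this. `StatsGenerator`, `load_participants` and the `queries` counter are hypothetical names used only for illustration; the counter is there to show the expensive lookup runs once per object:

```ruby
# Hypothetical stats generator; not the real class.
class StatsGenerator
  attr_reader :queries

  def initialize(voter_ids)
    @voter_ids = voter_ids
    @queries = 0
  end

  def generate
    # Both values reuse the same in-memory list of participants.
    { total: participants.size, unique: participants.uniq.size }
  end

  private

  # Memoized for the lifetime of this object only; nothing is written
  # to an external cache.
  def participants
    @participants ||= load_participants
  end

  # Stand-in for the expensive database lookup.
  def load_participants
    @queries += 1
    @voter_ids.dup
  end
end

generator = StatsGenerator.new([1, 2, 2, 3])
stats = generator.generate
```

Once `generate` returns, the generator object can be discarded along with the records it held, and only the small `stats` hash needs to be cached.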
We need a way to manually expire the cache for a budget or poll without
expiring the cache of every budget or poll.
Using the `updated_at` column would be dangerous because most of the
time we update a budget or a poll, we don't need to regenerate its
stats.
We've considered adding a `stats_updated_at` column to each of these
tables. However, in that case we would also need to add a similar column
in the future to every process type whose stats we want to generate.
If users participated and were hidden after participating, we should
still count them in the participants stats.
In the tests, we set users' `hidden_at` attribute before they vote.
Although in real life they would vote first and then they would be
hidden, I've written the tests like this for the sake of simplicity.
So these styles are available in CONSUL.
Note we're not including these styles inside `.participation-stats`
because this class is used in Plaza de España's statistics.
This implementation is a bit more robust because we don't have to change
any of the "or_later?" methods if we add or remove a phase.
We could also use metaprogramming to reduce code duplication in these
methods. So far, I've decided to keep the code simple since the
duplication seems reasonable.
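One way to get that robustness is to compare positions in an ordered list of phases. The sketch below assumes this approach; the phase names, the `Budget` stand-in class and the `at_or_after?` helper are illustrative, not taken from the real code:

```ruby
# Stand-in for the budget model, holding only its current phase.
class Budget
  # Assumed phase list, for illustration only. Order matters.
  PHASES = %w[drafting accepting selecting valuating balloting finished].freeze

  attr_reader :phase

  def initialize(phase)
    @phase = phase
  end

  # The small duplication between these methods is kept on purpose
  # instead of generating them with metaprogramming.
  def valuating_or_later?
    at_or_after?("valuating")
  end

  def balloting_or_later?
    at_or_after?("balloting")
  end

  private

  # Adding or removing a phase in PHASES needs no change here.
  def at_or_after?(name)
    PHASES.index(phase) >= PHASES.index(name)
  end
end
```

For example, `Budget.new("finished").balloting_or_later?` is true, while `Budget.new("selecting").balloting_or_later?` is false.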
As the Rails guides say:
> All scope methods will return an ActiveRecord::Relation object
That means `find_by_kind` will return a relation when nothing is found;
the expected behaviour is to return `nil`, like all `find_by` methods
do.
Using scopes also means strange things happen when we try to chain
scopes like `phases.published.drafting`. With scopes, the `drafting`
part would be ignored and all published phases would be returned.
If there's demographic data for all participants, it doesn't make sense
to show the message.
We're using translations instead of an `if` in the view because the text
is also different when there's only one participant. In some languages
the text might also be different depending on how many people with no
demographic data participated.
Another possibility would be to use an `if` in the view so we don't
display an empty paragraph when the count is zero, and then use
translations for `one` and `other`. I haven't gone that way because I
thought the logic would be more complex and the benefits wouldn't be
that great.
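The count-based choice of text can be sketched in plain Ruby. This only imitates the idea of `zero`, `one` and `other` translation keys; it isn't the I18n library, and the texts and the `no_demographic_data_text` helper are made up for illustration:

```ruby
# Made-up texts; the real ones live in the locale files.
TRANSLATIONS = {
  zero: "",
  one: "%{count} participant has no demographic data.",
  other: "%{count} participants have no demographic data."
}.freeze

# Picks the text for the given count, so the view needs no conditional.
def no_demographic_data_text(count)
  key = if count.zero?
          :zero
        elsif count == 1
          :one
        else
          :other
        end

  format(TRANSLATIONS.fetch(key), count: count)
end
```

The trade-off mentioned above shows up here: with a count of zero the helper returns an empty string, so the view still renders an empty paragraph.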
Even if this class looks very simple now, we're trying a few things
related to these stats. Having a class for it makes future changes
easier and, if there weren't any future changes, at least it makes
current experiments easier.
Note we keep the `participants_by_geozone` method returning a hash
because we're caching the stats; storing `GeozoneStats` objects would
need a lot more memory and we would get an error.
The code is easier to read now and it returns the same results it used
to return. Performance-wise it's probably the same; if it's not, we'll
trust Rails to do optimizations that we don't do when we manually pluck
the IDs.
It is way more efficient because we're caching the result of that
method, and this way we only store each voter once in the cache. We were
storing many voters several times and then we were filtering them with
`uniq`.