Sanitizing descriptions before saving a record has a few drawbacks:
1. It makes the application rely on data being safe in the database. If
somehow dangerous data enters the database, the application will be
vulnerable to XSS attacks
2. It makes the code complicated
3. It isn't backwards compatible; if we decide to disallow a certain
HTML tag in the future, we'd need to sanitize existing data.
On the other hand, sanitizing the data in the view means we don't need
to triple-check dangerous HTML has already been stripped when we see the
method `auto_link_already_sanitized_html`, since now every time we use
it we sanitize the text in the same line we call this method.
We could also sanitize the data twice, both when saving to the database
and when displaying values in the view. However, doing so wouldn't make
the application safer, since we sanitize text introduced through
textarea fields but we don't sanitize text introduced through input
fields.
Finally, we could also overwrite the `description` method so it
sanitizes the text. But we're already introducing Globalize which
overwrites that method, and overwriting it again is a bit too confusing
in my humble opinion. It can also lead to hard-to-debug behaviour.
Using the `_html` suffix in an i18n key is the same as using `html_safe`
on it, which means that translation could potentially be used for XSS
attacks.
Sometimes we're interpolating a link inside a translation, and marking
the whole translations as HTML safe.
However, some translations added by admins to the database or through
crowdin are not entirely under our control.
Although AFAIK crowdin checks for potential cross-site scripting
attacks, it's a good practice to sanitize parts of a string potentially
out of our control before marking the string as HTML safe.
The name `safe_html_with_links` was confusing and could make you think
it takes care of making the HTML safe. So I've renamed it in a way that
makes it a bit more intuitive that it expects its input to be already
sanitized.
I've changed `text_with_links` as well so now the two method names
complement each other.
The label text was always in English, and it wasn't associated with any
input field.
The `SecureRandom` part is a quick hack so we don't get duplicate IDs.
Using "your_answer_#{question.id}" might work as well, but right now I'm
not sure if the form is sometimes rendered twice for the same question.
If there's demographic data for all participants, it doesn't make sense
to show the message.
We're using translations instead of an `if` in the view because the text
is also different when there's only one participant. In some languages
the text might also be different depending on how many people with no
demographic data participated.
Another possibility would be to use an `if` in the view so we don't
display an empty paragraph when the cont is zero, and then using
translation for `one` and `other`. I haven't gone that way because I
thought the logic would be more complex and the benefits wouldn't be
that great.
So if we don't have information regarding gender, age or geozone, stats
regarding those topics will not be shown.
Note we're using `spec/models/statisticable_spec.rb` because having the
same file in `spec/models/concerns` caused the tests to be executed
twice.
Also note the implementation behind the `gender?`, `age?` and `geozone?`
methods is a bit primitive. We might need to make it more robust in the
future.
It will make it far easier to call other methods on the stats object,
and we're already caching the methods.
We had to remove the view fragment caching because the stats object
isn't as easy to cache. The good thing about it is the view will
automatically be updated when we change logic regarding which stats to
show, and the methods taking long to execute are cached in the model.
For now we think showing them would be showing too much data and it
would be a bit confusing.
I've been tempted to just remove the view and keep the methods in the
model in case they're used by other institutions using CONSUL. However,
it's probably better to wait until we're asked to re-implement them, and
in the meantime we don't maintain code nobody uses. The code wasn't that
great to start with (I know it because I wrote it).