PERF: Add scheduled job to delete old stylesheet cache rows (PR #13747)

Old stylesheet cache entries stick around for too long. We currently keep a maximum of 50 entries per target. Plus, any changes in target names (due to refactors) leave redundant entries in the table.

This adds a job that removes stylesheet cache entries older than 150 days. This is a long timeframe to account for Discourse sites that update only a few times a year. (We could easily clean up entries that are 50 or 100 days old.)

The stylesheet_cache table can be quite large. On, for example, it is the largest table:

# rake db:stats

table_name                           | row_estimate | table_size | index_size | total_size
stylesheet_cache                     | 6176         | 264 MB     | 936 kB     | 265 MB

1573 rows are over 150 days old, and 5238 are over 100 days old.


What happens to sites that don’t update in 150 days? Will the stylesheets disappear?

The stylesheets won’t disappear. Should the job remove all cache entries, the first request to hit the site will regenerate the stylesheet caches. In other words, if a site doesn’t update and all caches get emptied, the first request to hit the site after the cleanup will be slow.
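The regenerate-on-first-request behaviour described above can be sketched in plain Ruby. This is only an illustration of a lazily filled cache, not Discourse's actual code: `LazyStylesheetCache`, its `fetch` and `clear` methods, and the fake compiler block are all hypothetical names.

```ruby
# Hypothetical in-memory stand-in for a lazily filled stylesheet cache.
class LazyStylesheetCache
  def initialize(&compiler)
    @store = {}
    @compiler = compiler # block that does the slow compile step
  end

  # Returns the cached stylesheet, compiling it only on a cache miss.
  def fetch(target)
    @store[target] ||= @compiler.call(target)
  end

  # Simulates the cleanup job removing every cached entry.
  def clear
    @store.clear
  end
end

compile_count = 0
cache = LazyStylesheetCache.new do |target|
  compile_count += 1 # counts how often the slow path runs
  "/* css for #{target} */"
end

cache.fetch("desktop") # miss: compiles
cache.fetch("desktop") # hit: served from the cache
cache.clear            # all entries removed, as after a full cleanup
css = cache.fetch("desktop") # miss again: this request pays the compile cost
```

In this sketch only the first request after `clear` runs the compiler again, which matches the "first request will be slow" point above.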

Excellent, thanks for the clarification.

    StylesheetCache.where('created_at < ?', 150.days.ago).delete_all

I’m not sure what TIMESTAMP does, but we don’t need it because ActiveRecord will handle the conversion.

Out of curiosity, why do we keep so many old stylesheets in the cache?

  describe ".clean_up" do
      StylesheetCache.first.update!(created_at: 151.days.ago)

I also think we should extract 150 days to a constant so that it doesn’t have to be duplicated in tests.
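A rough sketch of what extracting the constant could look like, written in plain Ruby so it runs without Rails. `StylesheetCacheSketch`, `Row`, `CLEAN_UP_AFTER_DAYS`, and `SECONDS_PER_DAY` are hypothetical names, and the in-memory array stands in for the table; the real model would keep using ActiveRecord.

```ruby
# Plain-Ruby stand-in for a cache row; the real model is an ActiveRecord class.
Row = Struct.new(:target, :created_at)

class StylesheetCacheSketch
  # Hypothetical constant: the single place the retention window is defined.
  CLEAN_UP_AFTER_DAYS = 150
  SECONDS_PER_DAY = 24 * 60 * 60

  attr_reader :rows

  def initialize(rows)
    @rows = rows
  end

  # Removes rows created before the cutoff and returns how many were removed.
  def clean_up(now: Time.now)
    cutoff = now - CLEAN_UP_AFTER_DAYS * SECONDS_PER_DAY
    before = @rows.size
    @rows.reject! { |row| row.created_at < cutoff }
    before - @rows.size
  end
end

# A test can derive its ages from the constant instead of hard-coding 151.
now = Time.now
day = StylesheetCacheSketch::SECONDS_PER_DAY
cache = StylesheetCacheSketch.new([
  Row.new("desktop", now - (StylesheetCacheSketch::CLEAN_UP_AFTER_DAYS + 1) * day),
  Row.new("mobile", now - (StylesheetCacheSketch::CLEAN_UP_AFTER_DAYS - 1) * day),
])
removed = cache.clean_up(now: now)
```

With the window defined once, changing it to 100 or 50 days later would not require touching the spec.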

Out of curiosity, why do we keep so many old stylesheets in the cache?

Because of MAX_TO_KEEP = 50 in the StylesheetCache model, which will keep up to 50 versions of the same target. As to why 50, I don’t know… not sure why we keep previous versions in the cache.
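To make the per-target cap concrete, here is a minimal plain-Ruby sketch. Only the value `MAX_TO_KEEP = 50` comes from the model as described above; `Entry`, `trim_per_target`, and the assumption that entries are stored oldest-first are hypothetical and not taken from the Discourse source.

```ruby
# Hypothetical stand-in for a cache entry: one compiled version of one target.
Entry = Struct.new(:target, :version)

MAX_TO_KEEP = 50 # value mentioned in the thread

# Keeps only the newest `max` entries for each target.
# Assumes `entries` is ordered oldest-first.
def trim_per_target(entries, max: MAX_TO_KEEP)
  entries.group_by(&:target).flat_map { |_target, rows| rows.last(max) }
end

entries =
  (1..60).map { |v| Entry.new("desktop", v) } +
  (1..3).map { |v| Entry.new("admin", v) }

kept = trim_per_target(entries)
desktop_versions = kept.select { |e| e.target == "desktop" }.map(&:version)
admin_versions = kept.select { |e| e.target == "admin" }.map(&:version)
```

A cap like this is applied per target name, so rows left under a name that is no longer used would presumably never be trimmed by it. An age-based cleanup covers that case.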