DEV - Stamp javascript files (PR #10575)

rake javascript:update will copy files from the public/javascripts/ directory, and add the git hash of the public/javascripts/ to the filename. I’m calling these ‘stamped files’

(rake javascript:remove_stamped_files will remove the stamped files created by rake javascript:update)

It also creates a file public/javascripts/REVISION which contains the same git hash.

When the Rails app is booted, the value of this file is read and subsequently returned to the client via the discourse-setup hash. If the appropriate value is found, it is added to the URL when loadScript is called, via the method addHashToURL.

If you wish to manually test, something like the following will get you started:

  1. rake javascript:update and (re)start your Rails app.
  2. Navigate to / - you should see a request for jquery.magnific-popup.min-[GIT_HASH].js
  3. Navigating to something like /admin/customize/themes/1/common/scss/edit should generate a request for /javascripts/ace-[GIT_HASH]/ace.js and other related ace files.

GitHub

The title of this pull request changed from “Stamp javascript files” to "DEV - Stamp javascript files

This is nice, but I find we risk accumulating multiple versions of the libraries in the repo. If I understand correctly, devs would need to run javascript:remove_stamped_files and then javascript:update, right?

We also lose some git history in having the outputted library change filenames on every update. (Granted, the history here isn’t all that useful, since these are minified files.)

Did you try passing the git hash as a query parameter, so that the request to the JS would look like /javascripts/ace/ace.js?v=[GIT-HASH]? In theory, that should work for cache busting and it feels like a simpler change.

we should probably remove it rather than commit it pending.

This is nice, but I find we risk accumulating multiple versions of the libraries in the repo. If I understand correctly, devs would need to run javascript:remove_stamped_files and then javascript:update, right?

https://github.com/discourse/discourse/pull/10575/files#diff-d0b1b9e93608b3963155b0eddff5c957R57

remove_stamped_files is set up as a dependency on update - it’ll get called every time you run update.

We also lose some git history in having the outputted libraries change filenames on every update. (Granted, the history here isn’t all that useful, since these are minified files.)

Shouldn’t we be looking at the history of the source files, rather than the outputted files?

To be fair, i’m not super crazy about how the versioned/stamped files are placed in to the repo. Perhaps there’s a cleaner way to do it as part of a deploy/app startup process?

Did you try passing the git hash as a query parameter, so that the request to the JS would look like /javascripts/ace/ace.js?v=[GIT-HASH]? In theory, that should work for cache busting and it feels like a simpler change.

Apparently it’s not as bad as it used to be, but some badly behaved proxy caches would not cache things with query strings, and/or ignore query string when caching items.

Essentially it’s safer to change the filename, rather than throw things in the query params: https://guides.rubyonrails.org/asset_pipeline.html#what-is-fingerprinting-and-why-should-i-care-questionmark

(Note that Rails fingerprints each file individually. Given how infrequently we update anything in public/javascript, I felt a reasonable compromise was to have them use the git hash of the directory. Perhaps it would be better to place the files under public/javascript/[GIT_HASH]/filename.js, rather than rewriting the filename directly?)

The non-pending version of this works locally for me, but fails on the build/CI servers. Is it worth exploring why it works in one environment but not another?

remove_stamped_files is set up as a dependency on update - it’ll get called every time you run update.

My bad, this does indeed cleanup other stamped files, but it also leaves the unstamped files there, so we would have two versions of the same file in the repo. (I do see the note in the rake task to remove the the copied entry once we’re confident there are no hardcoded calls to non-stamped filenames.)

Shouldn’t we be looking at the history of the source files, rather than the outputted files?

We don’t have any history for the source files, they are in node_modules, which is git ignored.

To be fair, i’m not super crazy about how the versioned/stamped files are placed in to the repo. Perhaps there’s a cleaner way to do it as part of a deploy/app startup process?

Yes, I think we will move to a cleaner way, eventually, likely after Ember CLI is well integrated into core.

Essentially it’s safer to change the filename, rather than throw things in the query params: https://guides.rubyonrails.org/asset_pipeline.html#what-is-fingerprinting-and-why-should-i-care-questionmark

Strictly speaking you are right, however these vendored scripts don’t quite go through the usual Rails asset pipeline, so some of the Rails-specific warnings don’t apply (points 2 and 3 in the Rails link). They’re usually served via the CDN (using a 1-day nginx cache rule).

I’m still on the fence about this, I see a few disadvantages. If one vendor script is updated, say ACE, all other vendor scripts will be renamed (because they all share the same GIT HASH). At least temporarily, we would have two copies of each script in the repo. And some of the generated filenames are not pretty, especially minified files (example: Chart.min-d96086014da541ebd4c4b8fadb21df8e447b1661.js).

Can this be closed @jbrw now that the alternative solution has been merged?