DEV: Prefabrication (test optimization) (PR #7414)

Sure, last minor round of feedback and I will merge this in first thing on Monday (Australia time)

@SamSaffron, This comment was extremely helpful.

I had the same thought yesterday, but I could see a clear performance regression when I did that. It was looking like master vs distributed caches with dirty tracking were roughly equivalent. I reasoned that there must be an important difference around cache initialization vs clearing.

When I revisited it this morning, with a fresh brain, that explanation didn’t hold water for me. I’ve rerun the tests with and without dirty tracking a few times and the difference isn’t as pronounced as I thought. I looked into this further and now I suspect the extra ~40 seconds isn’t down to redis traffic or cache initialization at all. It’s just that the tests use cached values from prior tests.

Here’s the data. This is cache hits by cache where the value was written in a different test:

["icon_manager", 10854]
["theme", 6829]
["am_serializer_fragment_cache", 4840]
["banner_json", 1299]
["scheme_hex_for_name", 1263]
["discourse_stylesheet", 311]
["category_topic_ids", 148]
["svg_sprite", 66]
["csp_extensions", 12]
["developer_ids", 4]
["last_read_only", 0]
["category_url", 0]

When you give each test a clear cache, it rebuilds it by doing queries against the DB. It seems obvious in retrospect, but these things often do.

My plan is to prime these caches explicitly before running the test suite and to prevent changes unless opted into.
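As a rough illustration of that plan (hypothetical `PrimedCache` class, not Discourse’s actual cache API): populate the caches up front, then seal them so that writes raise unless a test explicitly opts in.

```ruby
# Hypothetical sketch: prime shared caches once before the suite runs,
# then disallow writes unless a test explicitly opts in.
class PrimedCache
  def initialize
    @store = {}
    @sealed = false
  end

  # Populate the cache up front (e.g. from the DB) before any test runs.
  def prime(key, value)
    @store[key] = value
  end

  # After priming, further writes require an explicit opt-in.
  def seal!
    @sealed = true
  end

  def write(key, value, allow_changes: false)
    if @sealed && !allow_changes
      raise "cache key '#{key}' is sealed; pass allow_changes: true to opt in"
    end
    @store[key] = value
  end

  def read(key)
    @store[key]
  end
end
```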

@SamSaffron, I should have said so earlier, but this is ready to be looked at again.

No probs, first thing when I start my day tomorrow!


:confetti_ball: merging this in (once I resolve the conflict) … my current test time is 7:10 … let’s see what this does

Thanks heaps for this work Daniel!


6:06, down from 7:10, for 11146 specs. This is not too shabby at all, awesome work!

Thank you! It’s great to see this merged :smile:

I wish I’d done a better job of spreading the word about it since 2010. :sweat_smile: It’s the exact same idea, just called set instead of fab!.

Edit: Back in the day, contexts were not wrapped in SQL transactions themselves, so objects created using set leaked across tests.

@pcreux Yeah, someone told me about rspec-set when I was talking about test-prof. If I’d known about it before, I’d have preferred to contribute to it or integrate it into test-prof.

The idea is very similar though with one significant difference:

Back in the day, contexts were not wrapped in SQL transactions themselves

That remains the same: fab! (or, more precisely, before_all) takes care of it. Though this feature will probably find its way upstream into rspec-rails someday.
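To illustrate the semantics before_all provides, here’s a toy model in plain Ruby (hypothetical FakeDB, not the real implementation): shared records are created once inside an outer transaction, each example’s changes are rolled back by a nested transaction, and the outer rollback cleans everything up at the end.

```ruby
# Toy model of before_all semantics via nested transactions.
class FakeDB
  def initialize
    @rows = []
    @savepoints = []
  end

  # Each transaction rolls back its own inserts on exit, mimicking
  # the savepoint-per-example behavior described above.
  def transaction
    @savepoints.push(@rows.size)    # remember where this transaction began
    yield
  ensure
    @rows.slice!(@savepoints.pop..) # discard rows inserted since then
  end

  def insert(row)
    @rows << row
  end

  def count
    @rows.size
  end
end

db = FakeDB.new
db.transaction do          # outer transaction, opened once by before_all
  db.insert(:shared_user)  # fabricated once, shared by every example

  2.times do               # each example runs in its own nested transaction
    db.transaction do
      db.insert(:per_example_post)
      # the example sees :shared_user plus its own rows
    end
    # nested rollback removed the example's rows; :shared_user survives
  end
end
# outer rollback at the end: the database is clean again
```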

It appears Instructure also created a gem to tackle this problem. I’m just linking it for reference. Seems a lot of people have tried solving this before :smile:

Also how is state leak handled?

You are able to mutate the objects in the database

That’s only part of the story. From rspec-rails doc:

Even though database updates in each example will be rolled back, the
object won't know about those rollbacks so the object and its backing
data can easily get out of sync.

An addon to detect object state modifications is about to land in test-prof’s let_it_be.

The problem with using refind: true is that if an object’s associated object is modified, the associated object won’t be re-found:

Are you sure this is a correct example? refind: true is equivalent to user = User.find(user.id), and it creates a fresh AR object with no associations loaded. The reason why refind: true exists is exactly the above example. So, let_it_be(:user, refind: true) is (almost) equal to let!(:user).
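A toy illustration of that refind behavior (hypothetical FakeRecord, not ActiveRecord or test-prof internals): re-finding by id produces a fresh object built from the persisted attributes, so in-memory mutations from a previous example are not carried over.

```ruby
# Toy model of refind semantics: a fresh object is fetched by id,
# discarding unsaved in-memory mutations.
class FakeRecord
  STORE = {} # id => attributes persisted "in the database"
  attr_accessor :id, :name

  def self.create(id:, name:)
    STORE[id] = { name: name }
    find(id)
  end

  def self.find(id)
    record = new
    record.id = id
    record.name = STORE[id][:name]
    record
  end
end

user = FakeRecord.create(id: 1, name: "alice")
user.name = "bob"                  # in-memory mutation, never saved
refound = FakeRecord.find(user.id) # what refind: true effectively does
refound.name # => "alice" — rebuilt from the stored attributes
```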

Correction: this is only true if use_transactional_fixtures is set to false, but it’s set to true, so no worries here.