DEV: Prefabrication (test optimization) (PR #7414)

Whereas:

let!(:user) { Fabricate(:user) }

Creates a user object in the database for each test.

prefabricate(:user) { Fabricate(:user) }

Creates a user object in the database for each group of tests. These objects get cleaned up by virtue of being created in a transaction which is rolled back when the group completes.

GitHub

You’ve signed the CLA, danielwaterworth. Thank you! This pull request is ready for review.

1 Warning
:warning: This pull request is big! We prefer smaller PRs whenever possible, as they are easier to review. Can this be split into a few smaller PRs?

Generated by :no_entry_sign: Danger

what kind of performance difference does this make to the test suite?

This PR makes the tests ~10% faster, but there are still many unexploited opportunities to apply this.

The performance win is something that I love, but I worry that the new syntax makes it harder for people to author tests, they always need to think, should I be using prefab vs let?

What I wonder is if somehow we can:

  1. Clear a bunch of rows in the test db.
  2. Prefabricated a pile of objects and rows in the test DB and hold refs
  3. Start the test suite (leaning on (1)) so the db is no longer “blank” when all the tests start.

Then simply amend the behaviour of Fabricate to make use of (2), so the end result is the individual tests don’t need to be aware of this.

The trouble with (1) and (2) is that it would add cost to every time we run a single test in the suite. That said we could have logic that makes (1) and (2) only run on full test suite runs.

I am not sure this is all pretty speculative but I worry about adding all this new syntax in the PR and decision points for devs.

My first attempt to solve this problem was similar to what you suggest. It would build a set of objects before any tests were run. I abandoned this effort because, unfortunately, a significant number of the tests assume they are starting with a blank slate.

As far as deciding between prefabricate vs let (vs let!), this is how I think about it:

If the choice was just between let vs let! and performance wasn’t a concern, I would always choose let!. It’s easier to predict what a test using let! will do since you don’t have to figure out what order the let bodies will execute in or if they will execute at all and this matters when there are side effects.

In this light, let is a cheaper let!. It’s almost observationally the same, except in the presence of side effects or divergence.

However, prefabricate is also just a cheaper let! and is also almost observationally equivalent. Moreover, prefabricate is cheaper or equal to let as long as the thing in question is actually used.

So, my answer to “which should I use?” is “default to prefabricate for active record objects”.

I follow … maybe the name fab! here will alleviate some of the concerns. Cause it is both clear and short.

prefabricate is both a mouthful to type and is kind of surprising cause we are used to let only sometimes is happening.

@eviltrout / @tgxworld what are your feelings here on introducing a new fab! method that allows reuse of objects between tests in a spec file?

Also to clarify is fidelity of “group” here? Is this per context?

eg: in this example then Fabricate(:user) will be called twice?:

context "test1" do
  fab!(:user) { Fabricate(:user ) } 
   it "something 1" do
     ....
   end
    it "something 2" do
     ....
   end
end
context "test" do
   fab!(:user) { Fabricate(:user ) } 
    it "something 2" do
     ....
   end
end

Changing prefabricate to fab! is a great suggestion.

Your intuition is correct, the user is created twice and it won’t exist in the database when its not in scope.

Also how is state leak handled? I am kind of OK to have rules about immutability in some cases, but want to know what the base rule is here?

context "test" do
   fab!(:user) { Fabricate(:user) }

    it "test 1" do
       user.name = "bob17"
       DB.exec("update users set name = ? where id = ?", "testing", user.id)
       user.save!
    end

    it "test 2" do
         puts user.name
         puts user.username
    end
end

You are able to mutate the objects in the database, you can even delete them. There’s a transaction around each test and there’s a transaction around each group with prefabricated objects (rails fakes nested transactions with savepoints).

So, you are completely able to mutate a prefabricated object with arbitrary SQL provided you do so on the default DB connection. These objects aren’t visible on any other connection because the transaction never commits. Even without this PR, this rule already applies to objects created during tests.

As far as state that doesn’t get persisted is concerned, the object itself is created afresh each test from its id. So, it’s also fine to mutate each object’s unpersisted state, as long as you don’t expect unpersisted state that you create in the fab! block to be present in the test itself.

Wouldn’t this leak the methods into other examples?

I don’t believe so. RSpec creates a class for each context and define_method puts the method on the class, not the module, as illustrated here:

module Test
  def foo
    define_method(:bar) do
      :bar
    end
  end
end

class Foo
  extend Test

  foo
end

class Bar
  extend Test
end

p Foo.new.respond_to?(:bar) # => true
p Bar.new.respond_to?(:bar) # => false

@danielwaterworth Do you have the actual numbers across multiple runs? I’m curious if it is a consistent 10% gain here. I think the usage is quite interesting here even though it feels like a per context fixture. How about overriding let! instead? That maintains the same API but comes with the per context cache introduced here.

Also the code looks fine to me but can you try running the whole test suite 10 times to ensure there aren’t any failures as a result of the change made in this PR? I see that Travis has been failing a couple of times here so that makes me worry.

I’ve added specs for fab! to demonstrate that it does what you’d expect w.r.t scoping.

@tgxworld, You’re right, with the tests taking so long to run, it is difficult to get statistical accuracy. I’ve just done another run after rebasing, it came out at 13m29s vs 14m27s for master - which is only a 7% improvement. It’s much easier to become convinced of the efficacy of these changes by running individual spec files before and after instead.

I’ve also noticed the failures. I’m looking into that now. There are also intermittent failures on master that I’m looking into as well.

One problem with overriding let! is that fab! only works for active record objects and let! can be used for any ruby object.

Nice I ran this locally and saw a speed up of about 50 seconds from 9mins 32 seconds.

My biggest concern here is the potential for this to introduce leaky state / bugs that might introduce heisentests in the future. The speed gain is nice, but there is a risk of more hard to catch bugs as a result.

Maybe we could introduce an option to disable the behavior for easy diagnosis?

Will it be better to move the disabling of rate limiting into an after(:each) instead? https://github.com/discourse/discourse/blob/646cdfa4493d7809c0afdc19adc23ba619c3c789/spec/rails_helper.rb#L165

I am with @eviltrout in that we should have a switch to disable this optimisation just-in-case maybe just an env var?

One problem with overriding let! is that fab! only works for active record objects and let! can be used for any ruby object.

I wonder though if that is an unsolvable problem? you could look at the object you get back from let! and if ActiveRecord::Base === o then go down one path else go down another.

I am kind of leaning towards just “turbo boosting” let!, for a few reasons.

  1. This general pattern will be easier to upstream to rspec as an option
  2. Less stuff to teach devs

Overall, this looks like a very safe change to me, only real risk here heisentest wise are going to be static methods that leaks objects, but I feel the risk is tiny and it already exists in all tests anyway. The transaction savepoint trick is saving us from enormous amounts of danger here.