PERF: Improve query performance all inbox private messages. (PR #14304)

First reported in Slow loading on private-message-topic-tracking-state.json - #19 by forkythetoy - bug - Discourse Meta

There are two optimization being applied here:

  1. Fetch a user’s group id in a seperate query instead of including it as a sub-query. When I tried a subquery, the query plan becomes very inefficient.

  2. Join against the topic_allowed_users and topic_allowed_groups table instead of doing an IN against a subquery where we UNION the topic_ids from the two tables. From my profiling, this enables PG to do a backwards index scan on the index_topics_on_timestamps_private index.

GitHub

This pull request has been mentioned on Discourse Meta. There might be relevant details there:

Minor, but should you have the same “Executing an extra query…” comment here to stop people changing this? Or is it only the filter_archived query that suffers from the performance penalty?

Is this group_with_messages_ids method called in a loop or otherwise with several different users? If not then why do we have to keep the user.id as the key of a hash instead of just storing @group_with_messages_ids_for_user = user.groups.where(has_messages: true).pluck(:id)? Just curious, I cannot tell from just this diff + file.

Only the filtered_archived query that will call this again which is why I extracted it into a method to memoize the values.

Good point let me include that.

Actually, I’m constraint by the API design here where we don’t have a acting_user and target_user pattern for this calss. I opted for a hash because I want to err on the safe side since the target_user is always passed into the query method calls above. Who knows when someone might call this in a loop.

Fair enough!