PERF: Improve database query perf when loading topics for a category. (#14416)

PERF: Improve database query perf when loading topics for a category. (#14416)

  • PERF: Improve database query perf when loading topics for a category.

Instead of left joining the topics table against categories by filtering with categories.id, we can improve the query plan by filtering against topics.category_id first before joining which helps to reduce the number of rows in the topics table that has to be joined against the other tables and also make better use of our existing index.

The following is a before and after of the query plan for a category with many subcategories.

Before:

                                                                                                       QUERY PLAN

-------------------------------------------------------------------------------------------------------------------------------------
-----------------------------------------------------------------------------------
 Limit  (cost=1.28..747.09 rows=30 width=12) (actual time=85.502..2453.727 rows=30 loops=1)
   ->  Nested Loop Left Join  (cost=1.28..566518.36 rows=22788 width=12) (actual time=85.501..2453.722 rows=30 loops=1)
         Join Filter: (category_users.category_id = topics.category_id)
         Filter: ((topics.category_id = 11) OR (COALESCE(category_users.notification_level, 1) <> 0) OR (tu.notification_level > 1))
         ->  Nested Loop Left Join  (cost=1.00..566001.58 rows=22866 width=20) (actual time=85.494..2453.702 rows=30 loops=1)
               Filter: ((COALESCE(tu.notification_level, 1) > 0) AND ((topics.category_id <> 11) OR (topics.pinned_at IS NULL) OR ((t
opics.pinned_at <= tu.cleared_pinned_at) AND (tu.cleared_pinned_at IS NOT NULL))))
               Rows Removed by Filter: 1
               ->  Nested Loop  (cost=0.57..528561.75 rows=68606 width=24) (actual time=85.472..2453.562 rows=31 loops=1)
                     Join Filter: ((topics.category_id = categories.id) AND ((categories.topic_id <> topics.id) OR (categories.id = 1
1)))
                     Rows Removed by Join Filter: 13938306
                     ->  Index Scan using index_topics_on_bumped_at on topics  (cost=0.42..100480.05 rows=715549 width=24) (actual ti
me=0.010..633.015 rows=464623 loops=1)
                           Filter: ((deleted_at IS NULL) AND ((archetype)::text <> 'private_message'::text))
                           Rows Removed by Filter: 105321
                     ->  Materialize  (cost=0.14..36.04 rows=30 width=8) (actual time=0.000..0.002 rows=30 loops=464623)
                           ->  Index Scan using categories_pkey on categories  (cost=0.14..35.89 rows=30 width=8) (actual time=0.006.
.0.040 rows=30 loops=1)
                                 Index Cond: (id = ANY ('{11,53,57,55,54,56,112,94,107,115,116,117,97,95,102,103,101,105,99,114,106,1
13,104,98,100,96,108,109,110,111}'::integer[]))
               ->  Index Scan using index_topic_users_on_topic_id_and_user_id on topic_users tu  (cost=0.43..0.53 rows=1 width=16) (a
ctual time=0.004..0.004 rows=0 loops=31)
                     Index Cond: ((topic_id = topics.id) AND (user_id = 1103877))
         ->  Materialize  (cost=0.28..2.30 rows=1 width=8) (actual time=0.000..0.000 rows=0 loops=30)
               ->  Index Scan using index_category_users_on_user_id_and_last_seen_at on category_users  (cost=0.28..2.29 rows=1 width
=8) (actual time=0.004..0.004 rows=0 loops=1)
                     Index Cond: (user_id = 1103877)
 Planning Time: 1.359 ms
 Execution Time: 2453.765 ms
(23 rows)

After:

                                                                                                                            QUERY PLAN
------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
 Limit  (cost=1.28..438.55 rows=30 width=12) (actual time=38.297..657.215 rows=30 loops=1)
   ->  Nested Loop Left Join  (cost=1.28..195944.68 rows=13443 width=12) (actual time=38.296..657.211 rows=30 loops=1)
         Filter: ((categories.topic_id <> topics.id) OR (topics.category_id = 11))
         Rows Removed by Filter: 29
         ->  Nested Loop Left Join  (cost=1.13..193462.59 rows=13443 width=16) (actual time=38.289..657.092 rows=59 loops=1)
               Join Filter: (category_users.category_id = topics.category_id)
               Filter: ((topics.category_id = 11) OR (COALESCE(category_users.notification_level, 1) <> 0) OR (tu.notification_level > 1))
               ->  Nested Loop Left Join  (cost=0.85..193156.79 rows=13489 width=20) (actual time=38.282..657.059 rows=59 loops=1)
                     Filter: ((COALESCE(tu.notification_level, 1) > 0) AND ((topics.category_id <> 11) OR (topics.pinned_at IS NULL) OR ((topics.pinned_at <= tu.cleared_pinned_at) AND (tu.cleared_pinned_at IS NOT NULL))))
                     Rows Removed by Filter: 1
                     ->  Index Scan using index_topics_on_bumped_at on topics  (cost=0.42..134521.06 rows=40470 width=24) (actual time=38.267..656.850 rows=60 loops=1)
                           Filter: ((deleted_at IS NULL) AND ((archetype)::text <> 'private_message'::text) AND (category_id = ANY ('{11,53,57,55,54,56,112,94,107,115,116,117,97,95,102,103,101,105,99,114,106,113,104,98,100,96,108,109,110,111}'::integer[])))
                           Rows Removed by Filter: 569895
                     ->  Index Scan using index_topic_users_on_topic_id_and_user_id on topic_users tu  (cost=0.43..1.43 rows=1 width=16) (actual time=0.003..0.003 rows=0 loops=60)
                           Index Cond: ((topic_id = topics.id) AND (user_id = 1103877))
               ->  Materialize  (cost=0.28..2.30 rows=1 width=8) (actual time=0.000..0.000 rows=0 loops=59)
                     ->  Index Scan using index_category_users_on_user_id_and_last_seen_at on category_users  (cost=0.28..2.29 rows=1 width=8) (actual time=0.004..0.004 rows=0 loops=1)
                           Index Cond: (user_id = 1103877)
         ->  Index Scan using categories_pkey on categories  (cost=0.14..0.17 rows=1 width=8) (actual time=0.001..0.001 rows=1 loops=59)
               Index Cond: (id = topics.category_id)
 Planning Time: 1.633 ms
 Execution Time: 657.255 ms
(22 rows)
  • PERF: Optimize index on topics bumped_at.

Replace index_topics_on_bumped_at index with a partial index on Topic#bumped_at filtered by archetype since there is already another index that covers private topics.

diff --git a/app/models/topic.rb b/app/models/topic.rb
index 95cb1b9..6f155fe 100644
--- a/app/models/topic.rb
+++ b/app/models/topic.rb
@@ -1846,7 +1846,7 @@ end
 #  idx_topics_user_id_deleted_at           (user_id) WHERE (deleted_at IS NULL)
 #  idxtopicslug                            (slug) WHERE ((deleted_at IS NULL) AND (slug IS NOT NULL))
 #  index_topics_on_bannered_until          (bannered_until) WHERE (bannered_until IS NOT NULL)
-#  index_topics_on_bumped_at               (bumped_at)
+#  index_topics_on_bumped_at_public        (bumped_at) WHERE ((deleted_at IS NULL) AND ((archetype)::text <> 'private_message'::text))
 #  index_topics_on_created_at_and_visible  (created_at,visible) WHERE ((deleted_at IS NULL) AND ((archetype)::text <> 'private_message'::text))
 #  index_topics_on_id_and_deleted_at       (id,deleted_at)
 #  index_topics_on_id_filtered_banner      (id) UNIQUE WHERE (((archetype)::text = 'banner'::text) AND (deleted_at IS NULL))
diff --git a/db/post_migrate/20210922064213_alter_bumped_at_indexes_on_topics.rb b/db/post_migrate/20210922064213_alter_bumped_at_indexes_on_topics.rb
new file mode 100644
index 0000000..037b6c2
--- /dev/null
+++ b/db/post_migrate/20210922064213_alter_bumped_at_indexes_on_topics.rb
@@ -0,0 +1,27 @@
+# frozen_string_literal: true
+
+class AlterBumpedAtIndexesOnTopics < ActiveRecord::Migration[6.1]
+  disable_ddl_transaction!
+
+  def up
+    execute(<<~SQL)
+    CREATE INDEX CONCURRENTLY IF NOT EXISTS index_topics_on_bumped_at_public
+    ON topics (bumped_at)
+    WHERE ((deleted_at IS NULL) AND ((archetype)::text <> 'private_message'::text));
+    SQL
+
+    execute(<<~SQL)
+    DROP INDEX IF EXISTS index_topics_on_bumped_at;
+    SQL
+
+    # The following index is known to have not been properly renamed. Drop it if
+    # exists just in case.
+    execute(<<~SQL)
+    DROP INDEX IF EXISTS index_forum_threads_on_bumped_at;
+    SQL
+  end
+
+  def down
+    raise ActiveRecord::IrreversibleMigration
+  end
+end
diff --git a/lib/topic_query.rb b/lib/topic_query.rb
index 25362ab..57ffb9a 100644
--- a/lib/topic_query.rb
+++ b/lib/topic_query.rb
@@ -625,11 +625,11 @@ class TopicQuery
     @options[:category_id] = category_id
     if category_id
       if options[:no_subcategories]
-        result = result.where('categories.id = ?', category_id)
+        result = result.where('topics.category_id = ?', category_id)
       else
-        result = result.where("categories.id IN (?)", Category.subcategory_ids(category_id))
+        result = result.where("topics.category_id IN (?)", Category.subcategory_ids(category_id))
         if !SiteSetting.show_category_definitions_in_topic_lists
-          result = result.where("categories.topic_id <> topics.id OR categories.id = ?", category_id)
+          result = result.where("categories.topic_id <> topics.id OR topics.category_id = ?", category_id)
         end
       end
       result = result.references(:categories)
diff --git a/spec/components/topic_query_spec.rb b/spec/components/topic_query_spec.rb
index cd65957..a7e7f2e 100644
--- a/spec/components/topic_query_spec.rb
+++ b/spec/components/topic_query_spec.rb
@@ -268,33 +268,100 @@ describe TopicQuery do
       let!(:subcategory) { Fabricate(:category_with_definition, parent_category_id: category.id) }
       let(:subsubcategory) { Fabricate(:category_with_definition, parent_category_id: subcategory.id) }
 
+      # Not used in assertions but fabricated to ensure we're not leaking topics
+      # across categories
+      let!(:_category) { Fabricate(:category_with_definition) }
+      let!(:_subcategory) { Fabricate(:category_with_definition, parent_category_id: _category.id) }
+
       it "works with subcategories" do
-        expect(TopicQuery.new(moderator, category: category.id).list_latest.topics.size).to eq(1)
-        expect(TopicQuery.new(moderator, category: subcategory.id).list_latest.topics.size).to eq(1)
-        expect(TopicQuery.new(moderator, category: category.id, no_subcategories: true).list_latest.topics.size).to eq(1)
+        expect(
+          TopicQuery
+            .new(moderator, category: category.id)
+            .list_latest.topics
+        ).to contain_exactly(category.topic)
+
+        expect(
+          TopicQuery
+            .new(moderator, category: subcategory.id)
+            .list_latest.topics
+        ).to contain_exactly(subcategory.topic)
+
+        expect(
+          TopicQuery
+            .new(moderator, category: category.id, no_subcategories: true)
+            .list_latest.topics
+        ).to contain_exactly(category.topic)
       end
 
       it "shows a subcategory definition topic in its parent list with the right site setting" do
         SiteSetting.show_category_definitions_in_topic_lists = true
-        expect(TopicQuery.new(moderator, category: category.id).list_latest.topics.size).to eq(2)
+
+        expect(
+          TopicQuery
+            .new(moderator, category: category.id)
+            .list_latest.topics
+        ).to contain_exactly(category.topic, subcategory.topic)
       end
 
       it "works with subsubcategories" do
         SiteSetting.max_category_nesting = 3
 
-        Fabricate(:topic, category: category)
-        Fabricate(:topic, category: subcategory)
-        Fabricate(:topic, category: subsubcategory)
+        category_topic = Fabricate(:topic, category: category)
+        subcategory_topic = Fabricate(:topic, category: subcategory)
+        subsubcategory_topic = Fabricate(:topic, category: subsubcategory)
 
         SiteSetting.max_category_nesting = 2
-        expect(TopicQuery.new(moderator, category: category.id).list_latest.topics.size).to eq(3)
-        expect(TopicQuery.new(moderator, category: subcategory.id).list_latest.topics.size).to eq(3)
-        expect(TopicQuery.new(moderator, category: subsubcategory.id).list_latest.topics.size).to eq(2)
+
+        expect(
+          TopicQuery
+            .new(moderator, category: category.id)
+            .list_latest.topics
+        ).to contain_exactly(category.topic, category_topic, subcategory_topic)
+
+        expect(
+          TopicQuery
+            .new(moderator, category: subcategory.id)
+            .list_latest.topics
+        ).to contain_exactly(
+          subcategory.topic,
+          subcategory_topic,
+          subsubcategory_topic
+        )
+
+        expect(
+          TopicQuery
+            .new(moderator, category: subsubcategory.id)
+            .list_latest.topics
+        ).to contain_exactly(subsubcategory.topic, subsubcategory_topic)
 
         SiteSetting.max_category_nesting = 3
-        expect(TopicQuery.new(moderator, category: category.id).list_latest.topics.size).to eq(4)
-        expect(TopicQuery.new(moderator, category: subcategory.id).list_latest.topics.size).to eq(3)
-        expect(TopicQuery.new(moderator, category: subsubcategory.id).list_latest.topics.size).to eq(2)
+
+        expect(
+          TopicQuery
+            .new(moderator, category: category.id)
+            .list_latest.topics
+        ).to contain_exactly(
+          category.topic,
+          category_topic,
+          subcategory_topic,
+          subsubcategory_topic
+        )
+
+        expect(
+          TopicQuery
+            .new(moderator, category: subcategory.id)
+            .list_latest.topics
+        ).to contain_exactly(
+          subcategory.topic,
+          subcategory_topic,
+          subsubcategory_topic
+        )
+
+        expect(
+          TopicQuery
+            .new(moderator, category: subsubcategory.id)
+            .list_latest.topics
+        ).to contain_exactly(subsubcategory.topic, subsubcategory_topic)
       end
     end
   end

GitHub sha: cd64e887118819476507950f01ac24fba91bef8a

This commit appears in #14416 which was approved by udan11 and danielwaterworth. It was merged by tgxworld.