FEATURE: SidekiqStats metrics (#194)

  • FEATURE: SidekiqStats metrics

  • SidekiqStats: new description for workers_size

  • Rewrite Sidekiq section of the README

diff --git a/README.md b/README.md
index a0bf9d6..05872c7 100644
--- a/README.md
+++ b/README.md
@@ -367,43 +367,49 @@ Metrics collected by Process instrumentation include labels `type` (as given wit
 
 #### Sidekiq metrics
 
-Including Sidekiq metrics (how many jobs ran? how many failed? how long did they take? how many are dead? how many were restarted?)
-
-`‍``ruby
-Sidekiq.configure_server do |config|
-   config.server_middleware do |chain|
-      require 'prometheus_exporter/instrumentation'
-      chain.add PrometheusExporter::Instrumentation::Sidekiq
-   end
-   config.death_handlers << PrometheusExporter::Instrumentation::Sidekiq.death_handler
-end
-`‍``
-
-To monitor Queue size and latency:
+There are different kinds of Sidekiq metrics that can be collected. A recommended setup looks like this:
 
 `‍``ruby
 Sidekiq.configure_server do |config|
+  require 'prometheus_exporter/instrumentation'
+  config.server_middleware do |chain|
+    chain.add PrometheusExporter::Instrumentation::Sidekiq
+  end
+  config.death_handlers << PrometheusExporter::Instrumentation::Sidekiq.death_handler
   config.on :startup do
-    require 'prometheus_exporter/instrumentation'
+    PrometheusExporter::Instrumentation::Process.start type: 'sidekiq'
+    PrometheusExporter::Instrumentation::SidekiqProcess.start
     PrometheusExporter::Instrumentation::SidekiqQueue.start
+    PrometheusExporter::Instrumentation::SidekiqStats.start
   end
 end
 `‍``
 
-This will only monitor the queues that are consumed by the sidekiq process you are on.  You can pass an `all_queues` parameter to monitor metrics on all queues.
+* The middleware and death handler will generate job-specific metrics (how many jobs ran? how many failed? how long did they take? how many are dead? how many were restarted?).
+* The [`Process`](#per-process-stats) metrics provide basic Ruby metrics.
+* The `SidekiqProcess` metrics provide the concurrency and busy metrics for this process.
+* The `SidekiqQueue` metrics provide size and latency for the queues run by this process.
+* The `SidekiqStats` metrics provide general, global Sidekiq stats (size of Scheduled, Retries, Dead queues, total number of jobs, etc).
+
+For `SidekiqQueue`, if you run more than one process for the same queues, note that the same metrics will be exposed by all the processes, just like the `SidekiqStats` will if you run more than one process of any kind. You might want to use `avg` or `max` when consuming their metrics.
 
-To monitor Sidekiq process info:
+An alternative would be to expose these metrics in a lone, long-lived process. Using a rake task, for example:
 
 `‍``ruby
-Sidekiq.configure_server do |config|
-  config.on :startup do
-    require 'prometheus_exporter/instrumentation'
-    PrometheusExporter::Instrumentation::Process.start type: 'sidekiq'
-    PrometheusExporter::Instrumentation::SidekiqProcess.start
-  end
+task :sidekiq_metrics do
+  server = PrometheusExporter::Server::WebServer.new
+  server.start
+
+  PrometheusExporter::Client.default = PrometheusExporter::LocalClient.new(collector: server.collector)
+
+  PrometheusExporter::Instrumentation::SidekiqQueue.start(all_queues: true)
+  PrometheusExporter::Instrumentation::SidekiqStats.start
+  sleep
 end
 `‍``
 
+The `all_queues` parameter for `SidekiqQueue` will expose metrics for all queues.
+
 Sometimes the Sidekiq server shuts down before it can send metrics, that were generated right before the shutdown, to the collector. Especially if you care about the `sidekiq_restarted_jobs_total` metric, it is a good idea to explicitly stop the client:
 
 `‍``ruby
@@ -447,7 +453,21 @@ Both metrics will have a `queue` label with the name of the queue.
 | Gauge | `sidekiq_process_busy`        | Number of busy workers for this process |
 | Gauge | `sidekiq_process_concurrency` | Concurrency for this process            |
 
-Both metrics will include the labels `labels`, `queues`, `quiet`, `tag`, `hostname` and `identity`, as returned by the [Sidekiq API](https://github.com/mperham/sidekiq/wiki/API#processes).
+Both metrics will include the labels `labels`, `queues`, `quiet`, `tag`, `hostname` and `identity`, as returned by the [Sidekiq Processes API](https://github.com/mperham/sidekiq/wiki/API#processes).
+
+**PrometheusExporter::Instrumentation::SidekiqStats**
+| Type  | Name                            | Description                             |
+| ---   | ---                             | ---                                     |
+| Gauge | `sidekiq_stats_dead_size`       | Size of the dead queue                  |
+| Gauge | `sidekiq_stats_enqueued`        | Number of enqueued jobs                 |
+| Gauge | `sidekiq_stats_failed`          | Number of failed jobs                   |
+| Gauge | `sidekiq_stats_processed`       | Total number of processed jobs          |
+| Gauge | `sidekiq_stats_processes_size`  | Number of processes                     |
+| Gauge | `sidekiq_stats_retry_size`      | Size of the retries queue               |
+| Gauge | `sidekiq_stats_scheduled_size`  | Size of the scheduled queue             |
+| Gauge | `sidekiq_stats_workers_size`    | Number of jobs actively being processed |
+
+Based on the [Sidekiq Stats API](https://github.com/mperham/sidekiq/wiki/API#stats).
 
 _See [Metrics collected by Process Instrumentation](#metrics-collected-by-process-instrumentation) for a list of metrics the Process instrumentation will produce._
 
diff --git a/lib/prometheus_exporter/instrumentation.rb b/lib/prometheus_exporter/instrumentation.rb
index d3eb62c..1effe2d 100644
--- a/lib/prometheus_exporter/instrumentation.rb
+++ b/lib/prometheus_exporter/instrumentation.rb
@@ -6,6 +6,7 @@ require_relative "instrumentation/method_profiler"
 require_relative "instrumentation/sidekiq"
 require_relative "instrumentation/sidekiq_queue"
 require_relative "instrumentation/sidekiq_process"
+require_relative "instrumentation/sidekiq_stats"
 require_relative "instrumentation/delayed_job"
 require_relative "instrumentation/puma"
 require_relative "instrumentation/hutch"
diff --git a/lib/prometheus_exporter/instrumentation/sidekiq_stats.rb b/lib/prometheus_exporter/instrumentation/sidekiq_stats.rb
new file mode 100644
index 0000000..d4900ce
--- /dev/null
+++ b/lib/prometheus_exporter/instrumentation/sidekiq_stats.rb
@@ -0,0 +1,43 @@
+# frozen_string_literal: true
+
+module PrometheusExporter::Instrumentation
+  class SidekiqStats
+    def self.start(client: nil, frequency: 30)
+      client ||= PrometheusExporter::Client.default
+      sidekiq_stats_collector = new
+
+      Thread.new do
+        loop do
+          begin
+            client.send_json(sidekiq_stats_collector.collect)
+          rescue StandardError => e
+            STDERR.puts("Prometheus Exporter Failed To Collect Sidekiq Stats metrics #{e}")
+          ensure
+            sleep frequency
+          end
+        end
+      end
+    end
+
+    def collect
+      {
+        type: 'sidekiq_stats',
+        stats: collect_stats
+      }
+    end
+
+    def collect_stats
+      stats = ::Sidekiq::Stats.new
+      {
+        'dead_size' => stats.dead_size,
+        'enqueued' => stats.enqueued,
+        'failed' => stats.failed,
+        'processed' => stats.processed,
+        'processes_size' => stats.processes_size,
+        'retry_size' => stats.retry_size,
+        'scheduled_size' => stats.scheduled_size,
+        'workers_size' => stats.workers_size,
+      }
+    end
+  end
+end
diff --git a/lib/prometheus_exporter/server.rb b/lib/prometheus_exporter/server.rb
index 423a7d7..cd3e40b 100644
--- a/lib/prometheus_exporter/server.rb
+++ b/lib/prometheus_exporter/server.rb
@@ -7,6 +7,7 @@ require_relative "server/process_collector"
 require_relative "server/sidekiq_collector"
 require_relative "server/sidekiq_queue_collector"
 require_relative "server/sidekiq_process_collector"
+require_relative "server/sidekiq_stats_collector"
 require_relative "server/delayed_job_collector"

[... diff too long, it was truncated ...]

GitHub sha: 854de20e2521048e79e07ff58b7b2efbfbbcf22e

This commit appears in #194 which was merged by SamSaffron.
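
For readers who want to see the shape of the payload that `SidekiqStats#collect` builds in the diff above, here is a minimal sketch that runs without Sidekiq or Redis. `FakeStats` and `RecordingClient` are hypothetical stand-ins (for `::Sidekiq::Stats` and the exporter client respectively) and `build_payload` is an illustrative helper, not part of the gem. Only the `type: 'sidekiq_stats'` key, the `stats` hash and its eight stat names are taken from the commit.

```ruby
require 'json'

# Hypothetical stand-in for ::Sidekiq::Stats, so the sketch needs no Redis.
FakeStats = Struct.new(
  :dead_size, :enqueued, :failed, :processed,
  :processes_size, :retry_size, :scheduled_size, :workers_size,
  keyword_init: true
)

# Hypothetical client: records serialized payloads instead of sending them.
class RecordingClient
  attr_reader :payloads

  def initialize
    @payloads = []
  end

  def send_json(obj)
    @payloads << JSON.generate(obj)
  end
end

# The eight stat names collected in the diff above.
STAT_NAMES = %w[
  dead_size enqueued failed processed
  processes_size retry_size scheduled_size workers_size
].freeze

# Illustrative helper: builds the same hash shape as SidekiqStats#collect.
def build_payload(stats)
  {
    type: 'sidekiq_stats',
    stats: STAT_NAMES.to_h { |name| [name, stats.public_send(name)] }
  }
end

client = RecordingClient.new
stats = FakeStats.new(
  dead_size: 1, enqueued: 5, failed: 2, processed: 100,
  processes_size: 3, retry_size: 0, scheduled_size: 4, workers_size: 2
)

client.send_json(build_payload(stats))
puts client.payloads.first
```

In the actual instrumentation this payload is rebuilt and sent every `frequency` seconds (30 by default) from a background thread. The sketch sends it once to keep the example deterministic.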