FIX: Ruby HTTP Client not backing off polling on error.

FIX: Ruby HTTP Client not backing off polling on error.

diff --git a/lib/message_bus/http_client.rb b/lib/message_bus/http_client.rb
index 89a6e82..b665b3a 100644
--- a/lib/message_bus/http_client.rb
+++ b/lib/message_bus/http_client.rb
@@ -113,6 +113,7 @@ module MessageBus
           rescue StandardError => e
             @stats.failed += 1
             warn("#{e.class} #{e.message}: #{e.backtrace.join("\n")}")
+            sleep interval
             retry
           ensure
             stop
@@ -226,7 +227,7 @@ module MessageBus
     def interval
       if @enable_long_polling
         if (failed_count = @stats.failed) > 2
-          (@min_poll_interval * failed_count).clamp(
+          (@min_poll_interval * 2**failed_count).clamp(
             @min_poll_interval, @max_poll_interval
           )
         else
diff --git a/spec/integration/http_client_spec.rb b/spec/integration/http_client_spec.rb
index f70540c..77321de 100644
--- a/spec/integration/http_client_spec.rb
+++ b/spec/integration/http_client_spec.rb
@@ -40,6 +40,41 @@ describe MessageBus::HTTPClient do
 
       assert_equal(new_threads, Thread.list - threads)
     end
+
+    describe 'when an error is encountered while trying to poll' do
+      let(:base_url) { "http://0.0.0.0:12312123" }
+
+      let(:client) do
+        MessageBus::HTTPClient.new(base_url, min_poll_interval: 1)
+      end
+
+      it 'should handle errors correctly' do
+        begin
+          original_stderr = $stderr
+          $stderr = fake = StringIO.new
+
+          client.channels[channel] = MessageBus::HTTPClient::Channel.new(
+            callbacks: [-> {}]
+          )
+
+          client.start
+
+          assert_equal(MessageBus::HTTPClient::STARTED, client.status)
+
+          while stats.failed < 1 do
+            sleep 0.05
+          end
+
+          # Sleep for more than the default min_poll_interval to ensure
+          # that we sleep for the right interval after failure
+          sleep 0.5
+
+          assert_equal(1, fake.string.scan("Errno::ECONNREFUSED").size)
+        ensure
+          $stderr = original_stderr
+        end
+      end
+    end
   end
 
   describe '#subscribe' do

GitHub sha: 621d73a1

One thing we can do here is add a bit of random to the backoff, that way if a bunch of clients are hammering the server at the exact same time they can jitter a bit.

Also… you need an upper bound here… 2** can get pretty big. maybe maximum delay is 1 minute or something.

1 Like

One thing we can do here is add a bit of random to the backoff, that way if a bunch of clients are hammering the server at the exact same time they can jitter a bit.

I feel like this is a really edge case where you’ll need all whole bunch of clients to be started at the same time. For the normal use case, we already have clients keeping multiple connections opened to the server at the same time so I’m not sure why we need the random backoff here.

Also… you need an upper bound here… 2** can get pretty big. maybe maximum delay is 1 minute or something.

There is already protection in place which restricts it to the @max_poll_interval

          (@min_poll_interval * 2**failed_count).clamp(
            @min_poll_interval, @max_poll_interval
          )
2 Likes