PERF: backoff background requests when overloaded (PR #10888)

When the server gets overloaded and lots of requests start queuing server will attempt to shed load by returning 429 errors on background requests.

The client can flag a request as background by setting the header: Discourse-Background to true

Out-of-the-box we shed load when the queue time goes above 0.5 seconds.

The only request we shed at the moment is the request to load up a new post when someone posts to a topic.

We can extend this as we go with a more general pattern on the client.

Previous to this change, rate limiting would “break” the post stream which would make suggested topics vanish and users would have to scroll the page to see more posts in the topic.

Server needs this protection for cases where tons of clients are navigated to a topic and a new post is made. This can lead to a self inflicted denial of service if enough clients are viewing the topic.

Due to the internal security design of Discourse it is hard for a large number of clients to share a channel where we would pass the full post body via the message bus.

GitHub

Bear in mind that later could happen after topic controller has been destroyed.

A cleaner pattern would be to hold reference to later and cancel it in willDestroy I think: https://api.emberjs.com/ember/3.21/classes/Controller/methods/willDestroy?anchor=willDestroy

It can seem like a nitpick, but I generally prefer to use null in this case, undefined means it has never been assigned, which is not the case here.

            this._loadingPostIds = null;

You can use this now I think, I’m still fixing few issues with babels and plugins but fore core that should be good:

    if (opts?.background) {

I think you can use optional chaining here:

xhr?.responseJson?.extras?.wait_seconds

Let me know if it doesn’t work

Is the topic controller ever destroyed? I added some protection in case the topic id flips

Oh nice we have optional chaining

Yes AFAIK, it shouldn’t be destroyed in normal workflow but there might be cases where it would, like tests maybe? It’s just like DOM listeners you don’t absolutely always need to remove them, I just think it makes things clearer/cleaner to do it, and might avoid some complex to track bugs in the future.

Oh nice we have optional chaining

yes be sure to read DEV: updates js transpiler to use babel 7 (#10627) · discourse/discourse@bbddce4 · GitHub to know what’s available now

I will merge a commit this week to make it available everywhere, but core should be good.

I’m curious, why this change?

prettier was checking my ruby files in vim

For clarity in the comment, move || times <= 0 into this if-statement

Checking the rest of the code… there is server randomization, uniform between 5-10sec @ 10ms.

The client is in a better position to implement randomized exponential backoff because it has better visibility into how many consecutive attempts it’s made. Uniform randomization is decent and decays into a normal distribution after multiple attempts.

Checking other implementations… they trust the backoff value from the server, and use randomized exponential when the server can’t give a backoff value.

https://github.com/menghanl/grpc-go/blob/4e06cefbe5f1619362397ea2ad765f1c6311f748/stream.go#L546-L558

Okay, so concrete suggestion: Please document that the value from the server is randomized @ 10ms granularity. Don’t floor at 5 seconds, floor at 0.5 seconds - self documenting + allows more server customization flexiblity.

Hmm, loadingLastPost gets set to false in-between retries? is that going to flicker the display at all?

comment this as “uniform random between 5-10sec @ 10ms”

> 0.5 as we might need to tweak the exact allocation

From my testing, the flicker and weird all happens when you have stuff in the post stream which you have not loaded. loading last post does not cause too much issue and flicker.

I do not really agree with 0.5 seconds here as a minimum. Certainly without seeing longer delays as a problem in production. End user will be oblivious here, it will look like a reply happened just now even if it happened 5 seconds ago.

I really need to observe logs from an event with these params prior to tweaking anything.

I am merging since @ZogStriP approved :slight_smile: I want to get this into production so we can start tuning params