How to implement a random exponential backoff algorithm in JavaScript

Using an exponential backoff algorithm with randomization when reconnecting to services can prevent sudden spikes on your server after a crash.

Imagine, for example, that you have a WebSocket server with 10,000 clients connected. Your server crashes but comes back up rather quickly. That’s great, because you implemented a reconnection script along the lines of the following:

var ws;

function connect() {
  ws = new WebSocket("ws://localhost:8080");
  // Reconnect immediately whenever the connection closes.
  ws.addEventListener('close', connect);
}

All your clients will reconnect automatically and no one even knows the server went down, right? Right, but there are some problems with this straightforward approach.

Making it back off exponentially

The client will start making reconnect attempts as soon as the socket closes and will keep doing so, back to back, for as long as the server is down. This creates a lot of unnecessary requests and can cause network congestion. A simple fix is to add a timeout between the reconnect attempts.
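A minimal sketch of that fixed-delay variant could look like the following. The `reconnectDelay` name and its value of 1000 ms are assumptions chosen for illustration, and the URL is the same placeholder as above.

```javascript
// Hypothetical constant: how long to wait between reconnect attempts, in milliseconds.
var reconnectDelay = 1000;
var ws = null;

function connect() {
    ws = new WebSocket("ws://localhost:8080");
    ws.addEventListener('close', onWebsocketClose);
}

function onWebsocketClose() {
    ws = null;
    // Wait a fixed amount of time before trying again instead of reconnecting right away.
    setTimeout(connect, reconnectDelay);
}
```

With this in place, each client makes at most one attempt per `reconnectDelay` while the server is down.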

This timeout limits the number of reconnect attempts per client in a given period of time. However, just after disconnecting you want to reconnect the client as soon as possible, because the cause might be a local network error, a short outage, or a quick reload of the server. When the initial attempts fail, it becomes more likely that the problem will take longer than a few seconds to resolve, so you don’t want to keep hitting the server with reconnect attempts at the same rate.

This is where the exponential backoff algorithm comes in. By doubling the delay after each failed attempt, we reconnect as fast as possible at the start and gradually raise the interval up to a fixed maximum.

var initialReconnectDelay = 1000; // first wait, in milliseconds
var currentReconnectDelay = initialReconnectDelay;
var maxReconnectDelay = 16000; // upper bound for the wait, in milliseconds
var ws = null;

function connect() {
    ws = new WebSocket("ws://localhost:8080");
    ws.addEventListener('open', onWebsocketOpen);
    ws.addEventListener('close', onWebsocketClose);
}

function onWebsocketOpen() {
    currentReconnectDelay = initialReconnectDelay;
}

function onWebsocketClose() {
    ws = null;
    setTimeout(() => {
        reconnectToWebsocket();
    }, currentReconnectDelay);
}

function reconnectToWebsocket() {
    if(currentReconnectDelay < maxReconnectDelay) {
        currentReconnectDelay*=2;
    }
    connect();
}
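To see which delays this produces, the doubling rule can be pulled out into a small pure function. This is only a sketch: `nextDelay` and `delaySchedule` are hypothetical helper names that are not part of the code above, and the constants simply repeat the values used there.

```javascript
var initialReconnectDelay = 1000;
var maxReconnectDelay = 16000;

// Same rule as reconnectToWebsocket(): double the delay while it is below the maximum.
function nextDelay(currentDelay) {
    if (currentDelay < maxReconnectDelay) {
        return currentDelay * 2;
    }
    return currentDelay;
}

// Collect the waits used before each of the first `attempts` reconnect attempts.
function delaySchedule(attempts) {
    var delays = [];
    var delay = initialReconnectDelay;
    for (var i = 0; i < attempts; i++) {
        delays.push(delay);
        delay = nextDelay(delay);
    }
    return delays;
}

console.log(delaySchedule(7).join(", "));
// prints: 1000, 2000, 4000, 8000, 16000, 16000, 16000
```

Note that this check can overshoot the maximum when the maximum is not the initial delay multiplied by a power of two. If that matters for your numbers, `Math.min(currentDelay * 2, maxReconnectDelay)` caps it exactly.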

Making it random

So now our clients are throttled on their reconnect attempts. That’s good, but what happens when the server goes down and instantly comes back up while 10,000 clients are connected? That’s right, all of those clients will try to reconnect in the exact same second. This can cause a spike on the server and, in the worst case, bring it back down.

To prevent this spike we can add a random component to the backoff delay, so that not all of our clients reconnect at the same moment. The reconnects are then spread out over a few seconds instead of hitting the server all at once.

function onWebsocketClose() {
    ws = null;
    // Add a random extra of 0 to 2999 ms to the delay.
    setTimeout(() => {
        reconnectToWebsocket();
    }, currentReconnectDelay + Math.floor(Math.random() * 3000));
}

Of course, you can tweak the numbers to your specific use case based on your server capacity and concurrent clients.
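One way to make those numbers easy to tweak is to move the calculation into a helper. The sketch below assumes a hypothetical `delayWithJitter` function (not part of the code above) and uses a quick simulation to get a feel for how 10,000 clients spread out. The optional `random` parameter is only there so the result can be made predictable.

```javascript
// Hypothetical helper: base delay plus a random extra of 0 to maxJitter - 1 ms.
// `random` defaults to Math.random and can be swapped out for a predictable function.
function delayWithJitter(baseDelay, maxJitter, random) {
    random = random || Math.random;
    return baseDelay + Math.floor(random() * maxJitter);
}

// Rough illustration: count how many of 10,000 clients reconnect in each second
// when the base delay is 1000 ms and up to 3000 ms of jitter is added.
var clientsPerSecond = {};
for (var i = 0; i < 10000; i++) {
    var second = Math.floor(delayWithJitter(1000, 3000) / 1000);
    clientsPerSecond[second] = (clientsPerSecond[second] || 0) + 1;
}
console.log(clientsPerSecond); // roughly a third of the clients in each of seconds 1, 2 and 3
```

In the close handler this would be used as `setTimeout(reconnectToWebsocket, delayWithJitter(currentReconnectDelay, 3000));`. A larger jitter window spreads the load further, at the cost of some clients waiting longer.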

That’s it. If you took or know of another approach worth sharing, please do so!
