gRPC Connection Backoff Protocol
================================

When a connection attempt to a backend fails, it is typically desirable not to
retry immediately (to avoid flooding the network or the server with requests)
and instead to apply some form of exponential backoff.

We have several parameters:
 1. INITIAL_BACKOFF (how long to wait after the first failure before retrying)
 1. MULTIPLIER (factor by which to multiply the backoff after a failed retry)
 1. JITTER (by how much to randomize backoffs)
 1. MAX_BACKOFF (upper bound on backoff)
 1. MIN_CONNECT_TIMEOUT (minimum time we're willing to give a connection to
    complete)

## Proposed Backoff Algorithm

Exponentially back off the start time of connection attempts up to a limit of
MAX_BACKOFF, with jitter.

```
ConnectWithBackoff()
  current_backoff = INITIAL_BACKOFF
  current_deadline = now() + INITIAL_BACKOFF
  while (TryConnect(Max(current_deadline, now() + MIN_CONNECT_TIMEOUT))
         != SUCCESS)
    SleepUntil(current_deadline)
    current_backoff = Min(current_backoff * MULTIPLIER, MAX_BACKOFF)
    current_deadline = now() + current_backoff +
      UniformRandom(-JITTER * current_backoff, JITTER * current_backoff)
```
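
The pseudocode above can be rendered as a small runnable sketch. The Python version below is illustrative only and is not taken from any gRPC implementation. The `try_connect`, `clock`, `sleep`, and `rng` arguments are hypothetical injection points so the loop can be exercised without a real network, and `max_attempts` is an added safety bound that the pseudocode does not have. The constants use the parameter values given in this document.

```python
import random
import time
from typing import Callable, Optional

# Parameter values given in this document (times in seconds).
INITIAL_BACKOFF = 1.0
MULTIPLIER = 1.6
JITTER = 0.2
MAX_BACKOFF = 120.0
MIN_CONNECT_TIMEOUT = 20.0


def connect_with_backoff(
    try_connect: Callable[[float], bool],
    clock: Callable[[], float] = time.monotonic,
    sleep: Callable[[float], None] = time.sleep,
    rng: Optional[random.Random] = None,
    max_attempts: Optional[int] = None,
) -> bool:
    """Call try_connect(deadline) until it succeeds, backing off between tries.

    try_connect is a hypothetical callable: it receives an absolute deadline
    (in clock() units) and returns True on success, False on failure.
    Returns False only if max_attempts (an assumption of this sketch) is hit.
    """
    rng = rng or random.Random()
    current_backoff = INITIAL_BACKOFF
    current_deadline = clock() + INITIAL_BACKOFF
    attempts = 0
    while True:
        # Every attempt is given at least MIN_CONNECT_TIMEOUT to complete.
        attempt_deadline = max(current_deadline, clock() + MIN_CONNECT_TIMEOUT)
        if try_connect(attempt_deadline):
            return True
        attempts += 1
        if max_attempts is not None and attempts >= max_attempts:
            return False
        # SleepUntil(current_deadline): a no-op if that time has already passed.
        sleep(max(0.0, current_deadline - clock()))
        # Grow the backoff up to the cap, then schedule the next start with jitter.
        current_backoff = min(current_backoff * MULTIPLIER, MAX_BACKOFF)
        current_deadline = clock() + current_backoff + rng.uniform(
            -JITTER * current_backoff, JITTER * current_backoff
        )
```

Passing a fake `clock` and `sleep` makes the timing deterministic enough to inspect: the first wait is exactly INITIAL_BACKOFF, and each later wait falls within JITTER of the multiplied backoff.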

The specific parameter values are:
 1. MIN_CONNECT_TIMEOUT = 20 seconds
 1. INITIAL_BACKOFF = 1 second
 1. MULTIPLIER = 1.6
 1. MAX_BACKOFF = 120 seconds
 1. JITTER = 0.2
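
To make these values concrete, the sketch below computes the nominal (jitter-free) wait before each retry, together with the range that JITTER allows around it. The function name `nominal_backoffs` is hypothetical and introduced only for illustration. It follows the pseudocode in this document, where the first wait is INITIAL_BACKOFF with no jitter applied.

```python
from typing import List, Tuple

# Parameter values given in this document (times in seconds).
INITIAL_BACKOFF = 1.0
MULTIPLIER = 1.6
JITTER = 0.2
MAX_BACKOFF = 120.0


def nominal_backoffs(count: int) -> List[Tuple[float, float, float]]:
    """Return (lowest, nominal, highest) wait in seconds for the first `count` waits."""
    schedule: List[Tuple[float, float, float]] = []
    if count <= 0:
        return schedule
    backoff = INITIAL_BACKOFF
    # The first wait is INITIAL_BACKOFF exactly; the pseudocode adds no jitter here.
    schedule.append((backoff, backoff, backoff))
    for _ in range(count - 1):
        # The cap is applied before jitter, mirroring the pseudocode's order.
        backoff = min(backoff * MULTIPLIER, MAX_BACKOFF)
        schedule.append((backoff * (1 - JITTER), backoff, backoff * (1 + JITTER)))
    return schedule


if __name__ == "__main__":
    for i, (low, nominal, high) in enumerate(nominal_backoffs(13), start=1):
        print(f"wait {i:2d}: {nominal:7.2f}s  (range {low:7.2f}s to {high:7.2f}s)")
```

With these values the nominal wait reaches the MAX_BACKOFF cap at the 12th wait. Because the pseudocode applies jitter after the cap, an individual wait at the cap can range from 96 to 144 seconds.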

Implementations with pressing concerns (such as minimizing the number of wakeups
on a mobile phone) may wish to use a different algorithm, and in particular
different jitter logic.

Alternate implementations must ensure that connection backoffs started at the
same time disperse, and must not attempt connections substantially more often
than the above algorithm.

## Reset Backoff

The backoff should be reset to INITIAL_BACKOFF at some point, so that
reconnection behavior is consistent regardless of whether the connection is
newly started or was previously disconnected.

We choose to reset the backoff when the SETTINGS frame is received, because at
that point we know for certain that the connection was accepted by the server.
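
One way to picture the reset rule is a small state object that the connection logic consults for each wait and resets once the server's SETTINGS frame arrives. In the sketch below, `BackoffState` and `on_settings_frame_received` are hypothetical names; the latter stands in for whatever hook a given HTTP/2 transport exposes and is not an actual gRPC API.

```python
import random
from typing import Optional

# Parameter values given in this document (times in seconds).
INITIAL_BACKOFF = 1.0
MULTIPLIER = 1.6
JITTER = 0.2
MAX_BACKOFF = 120.0


class BackoffState:
    """Tracks the current backoff between connection attempts (illustrative)."""

    def __init__(self, rng: Optional[random.Random] = None) -> None:
        self._rng = rng or random.Random()
        self._current = INITIAL_BACKOFF
        self._fresh = True  # True until the first wait after a reset is handed out

    def next_wait(self) -> float:
        """Return how many seconds to wait before starting the next attempt."""
        if self._fresh:
            # First wait after a reset: INITIAL_BACKOFF, without jitter.
            self._fresh = False
            return self._current
        self._current = min(self._current * MULTIPLIER, MAX_BACKOFF)
        return self._current + self._rng.uniform(
            -JITTER * self._current, JITTER * self._current
        )

    def reset(self) -> None:
        """Return to INITIAL_BACKOFF, as for a newly started connection."""
        self._current = INITIAL_BACKOFF
        self._fresh = True


def on_settings_frame_received(state: BackoffState) -> None:
    # Hypothetical transport callback: the server accepted the connection,
    # so a later disconnect should start backing off from INITIAL_BACKOFF again.
    state.reset()
```

Resetting only after SETTINGS is received, rather than as soon as the TCP connection is established, means that a server which accepts connections and then immediately drops them still sees growing backoff from the client.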