Question about failure and fail-over

Branden Visser mrvisser at gmail.com
Thu Jul 18 11:10:27 UTC 2013


Hi all, I have a general question about server failure and failover
within an upstream group to ensure I understand it correctly.

Let's say I have the following configuration:

proxy_next_upstream timeout;
proxy_connect_timeout 5;
...
upstream backend {  # "backend" is a placeholder name
  server 127.0.0.1 max_fails=3 fail_timeout=10s;
  server 127.0.0.2 max_fails=3 fail_timeout=10s;
  server 127.0.0.3 max_fails=3 fail_timeout=10s;
}

And then the server 127.0.0.1 starts "hanging" indefinitely on
connection attempts.

a) Once 3 connection attempts to 127.0.0.1 have each timed out after 5
seconds, it will be marked down. However, during that 5-second window,
30 (or N) other connections / requests may also be in the process of
timing out, so you may end up with 30 internal connection failures as
a result of 127.0.0.1's issue. Although they are all retried on the
next available upstream, 30 end users notice a 5-second hang in their
request while waiting for the timeout to occur.

b) After 10 seconds, if the server is still hanging, a) basically
repeats in the same manner.
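
To put rough numbers on a) and b), here is a small Python sketch of the
arithmetic as I understand it. It is only a model of my assumptions, not
of how nginx actually accounts for failures: requests are assumed to be
spread evenly (round-robin) across the servers at a steady rate, every
attempt on the hung server is assumed to take the full connect timeout,
and the function name and the request rate are made up for illustration.

```python
def stalled_requests(rate_per_s, n_servers, connect_timeout_s, max_fails):
    """Estimate how many requests hang on a dead server before it is marked down.

    Assumed model (not nginx's real accounting): the hung server receives
    rate_per_s / n_servers evenly spaced attempts per second, and it is
    marked down once the max_fails-th attempt has timed out.
    """
    per_server_rate = rate_per_s / n_servers
    # Attempts already in flight when the first timeout fires:
    in_flight = per_server_rate * connect_timeout_s
    # The remaining (max_fails - 1) timeouts arrive one attempt-interval
    # apart, and each interval lets one more attempt through.
    return in_flight + (max_fails - 1)


# Hypothetical load: 18 requests/s over 3 servers, 5 s connect timeout.
print(stalled_requests(18, 3, 5, 3))  # 32.0
```

Under these assumptions, roughly 32 requests would see the 5-second hang
each time the server is retried, which is in line with the "30, or N"
estimate above.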

Is this correct? If I add "keepalive 64;" to the upstream block,
does the above scenario change? If a server is marked down because
no new connections can be established to it, are all of its existing
persistent connections destroyed as well?

Any insight on this understanding would be appreciated.

Cheers,
Branden


