incorrect upstream max_fails behaviour
Jan Prachař
jan.prachar at gmail.com
Thu Mar 26 19:08:01 UTC 2020
Hello,
the upstream module documentation says:
max_fails=number
sets the number of unsuccessful attempts to communicate with the server
that should happen in the duration set by the fail_timeout parameter to
consider the server unavailable for a duration also set by the
fail_timeout parameter.
And also:
fail_timeout=time
sets the time during which the specified number of unsuccessful attempts
to communicate with the server should happen to consider the server
unavailable;
Load balancing documentation at
http://nginx.org/en/docs/http/load_balancing.html says:
The max_fails directive sets the number of consecutive unsuccessful
attempts to communicate with the server that should happen during
fail_timeout.
But I have found that the actual nginx behaviour is different. Every
time an upstream fails, peer->accessed and peer->checked are set to now
and peer->fails is incremented. peer->checked is also set to now
before connecting to the upstream, if
now - peer->checked > peer->fail_timeout. (1)
peer->fails is reset to 0 only on a successful request, and only if
peer->accessed < peer->checked, which can happen only if condition (1)
was fulfilled.
Therefore, peer->fails is reset to zero only if no upstream error
happens during a fail_timeout interval. So, for example, if an upstream
fails once every fail_timeout, it will be marked as unavailable after
max_fails*fail_timeout.
Or, if there are no successful requests to an upstream, peer->fails is
incremented with every request, independently of the fail_timeout setting.
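
The bookkeeping described above can be replayed with a small model. This is
only a sketch of the rules as stated in this message, not nginx code: the
Peer class, the run() helper, the integer timestamps in seconds and the
values fail_timeout=10 and max_fails=3 are all assumptions made for
illustration.

```python
from dataclasses import dataclass


@dataclass
class Peer:
    """Hypothetical model of the per-peer counters described above."""
    fail_timeout: int
    fails: int = 0
    accessed: int = 0
    checked: int = 0

    def request(self, now: int, ok: bool) -> None:
        # Condition (1): refresh `checked` before connecting.
        if now - self.checked > self.fail_timeout:
            self.checked = now
        if ok:
            # The reset is possible only if condition (1) fired since
            # the last failure, because a failure sets both fields to now.
            if self.accessed < self.checked:
                self.fails = 0
        else:
            self.fails += 1
            self.accessed = now
            self.checked = now


def run(events, fail_timeout=10):
    """Replay (time, ok) pairs and return the final fails counter."""
    peer = Peer(fail_timeout)
    for now, ok in events:
        peer.request(now, ok)
    return peer.fails


MAX_FAILS = 3  # assumed value, compared against the counter below

# One failure every fail_timeout with successes in between.
spaced = [(0, False), (5, True), (10, False), (15, True), (20, False)]
# Failures only, spaced far wider than fail_timeout.
only_failures = [(0, False), (100, False), (200, False)]
# A success after a failure-free fail_timeout interval.
quiet = [(0, False), (11, True)]

for name, events in [("spaced", spaced),
                     ("only_failures", only_failures),
                     ("quiet", quiet)]:
    fails = run(events)
    print(name, fails, fails >= MAX_FAILS)
```

In this model the first two sequences end with the counter at 3, so the
max_fails threshold is reached even though no fail_timeout window contains
three failures. Only the third sequence, where a success arrives after a
failure-free fail_timeout interval, resets the counter to 0.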
My test confirms that nginx indeed behaves like this.
Is the documented behavior only part of the commercial subscription, or
am I missing something?
Best regards,
Jan Prachař