fail_timeout in upstream not respected?
Maxim Dounin
mdounin at mdounin.ru
Mon Jan 30 12:45:59 UTC 2017
Hello!
On Mon, Jan 30, 2017 at 02:41:06AM -0500, plrunner wrote:
> Hi everybody,
>
> I am running nginx v1.11 and I noticed something pretty weird in my
> error.log.
>
> I have fail_timeout=1800s along with max_fails=1 in my upstream and
> proxy_next_upstream is set to "error timeout", so I expect an upstream host
> to be taken off the list for 30 minutes just after the first failed
> connection.
>
> Here is what I unexpectedly get in the error.log
>
> 2017/01/23 09:49:48 [error] 30676#30676: *2202666 connect() failed (111:
> Connection refused) while connecting to upstream, client: 93.XX.YYY.228,
> server: *.foobar.com, request: "GET /generic/api/v1/tag/1006 HTTP/2.0",
> upstream: "http://[beaf:beaf:1001:a001::003D:4]:8080/generic/api/v1/tag/1006
> host: "cy1.foobar.com", referrer: "https://web.foobar.com/"
> 2017/01/23 09:49:48 [warn] 30676#30676: *2202666 upstream server temporarily
> disabled while connecting to upstream, client: 93.XX.YYY.228, server:
> *.foobar.com, request: "GET /generic/api/v1/tag/1006 HTTP/2.0", upstream:
> "http://[beaf:beaf:1001:a001::003D:4]:8080/generic/api/v1/tag/1006 host:
> "cy1.foobar.com", referrer: "https://web.foobar.com/"
> 2017/01/23 09:57:53 [error] 30695#30695: *2205681 connect() failed (111:
> Connection refused) while connecting to upstream, client: 93.XX.YYY.228,
> server: *.foobar.com, request: "GET /generic/api/v1/tag/1006 HTTP/2.0",
> upstream: "http://[beaf:beaf:1001:a001::003D:4]:8080/generic/api/v1/tag/1006
> host: "cy1.foobar.com", referrer: "https://web.foobar.com/"
> 2017/01/23 09:57:53 [warn] 30695#30695: *2205681 upstream server temporarily
> disabled while connecting to upstream, client: 93.XX.YYY.228, server:
> *.foobar.com, request: "GET /generic/api/v1/tag/1006 HTTP/2.0", upstream:
> "http://[beaf:beaf:1001:a001::003D:4]:8080/generic/api/v1/tag/1006 host:
> "cy1.foobar.com", referrer: "https://web.foobar.com/"
>
> The host is reused after just 8 minutes, instead of 30 minutes.
>
> Is there anything wrong in my conf or something I forgot to take into
> account?
As can be seen from "30676#" and "30695#", these messages are from
different worker processes. By default each worker process keeps
its own run-time state for the upstream servers, so the failure
recorded by worker 30676 at 09:49:48 is not visible to worker 30695,
which tries the server itself at 09:57:53. If you want worker
processes to use shared state, you can configure this using
the "zone" directive in the "upstream" block, see details here:
http://nginx.org/en/docs/http/ngx_http_upstream_module.html#zone
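
A minimal sketch of what this might look like. The upstream name
"backend", the zone size of 64k, the second server address and the
"location" block are assumptions for illustration only, they are not
taken from your configuration. The second server is there because
max_fails and fail_timeout are ignored when a group contains only a
single server:

```nginx
upstream backend {
    # Shared memory zone: the group's configuration and run-time
    # state (including failure counters) are shared between all
    # worker processes. Name and size are illustrative.
    zone backend 64k;

    # Address taken from the error.log above.
    server [beaf:beaf:1001:a001::003D:4]:8080 max_fails=1 fail_timeout=1800s;

    # Hypothetical second server (documentation address).
    server [2001:db8::2]:8080 max_fails=1 fail_timeout=1800s;
}

server {
    location / {
        proxy_pass http://backend;
        proxy_next_upstream error timeout;
    }
}
```

With the zone in place, a failure seen by one worker should mark the
server as unavailable for the other workers as well, for the duration
of fail_timeout.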
--
Maxim Dounin
http://nginx.org/