No live upstreams with a single upstream
domleb
nginx-forum at forum.nginx.org
Wed Sep 19 12:22:48 UTC 2018
Maxim Dounin Wrote:
-------------------------------------------------------
> Hello!
>
> On Tue, Sep 18, 2018 at 06:02:46AM -0400, domleb wrote:
>
> > While running a load test that injects 10k TPS across 3 Nginx instances, we
> > are seeing spikes of errors where Nginx returns HTTP 502 and logs the
> > message 'no live upstreams while connecting to upstream'. There are no
> > other errors logged, e.g. connection errors.
> >
> > Also, we have a single upstream virtual IP (we use iptables to balance load
> > across the backend) and according to the docs the upstream should never be
> > marked as down in this case:
> >
> > 'If there is only a single server in a group, max_fails, fail_timeout and
> > slow_start parameters are ignored, and such a server will never be
> > considered unavailable'
> >
> > Testing locally with our config confirms this and I cannot reproduce the 'no
> > live upstreams while connecting to upstream' message when simulating
> > connection and read errors with a single upstream.
> >
> > To debug I tried enabling debug logs but under load that degraded
> > performance too much. I also traced the worker process with strace and
> > didn't find any socket or other errors during the 502 spike.
> >
> > I was able to reproduce this issue on Nginx 1.12.2 and 1.15.3.
> >
> > So given that we don't see any source error and we have a single upstream,
> > I'm interested to know what other scenarios could result in a 502 with the
> > log message 'no live upstreams while connecting to upstream'?
>
> Could you please show the upstream configuration you are using?
>
> With a single server in the upstream block, "no live upstreams"
> error may happen if:
>
> - the server is marked "down" in the configuration, or
> - the server reached the max_conns limit.
>
> Also note that "a single server" does not apply to cases when
> there is a single hostname which resolves to multiple IP addresses
> (this defines multiple servers at once).
>
> --
> Maxim Dounin
> http://mdounin.ru/
I removed our max_conns limit and that resolved the issue - thanks for the
help.
It might be worth changing the log message in this case, as I believe the
upstream is still live and there are no other log messages to indicate what
the problem is.
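
For anyone who hits the same error, here is a minimal sketch of the kind of
upstream block that can produce it. The upstream name, address and limit are
placeholders for illustration, not our actual configuration:

```nginx
# Hypothetical example: the name, address and limit are placeholders.
upstream backend {
    # A single server (in our case, a virtual IP balanced by iptables).
    # max_fails and fail_timeout are ignored for a lone server, but
    # max_conns still applies: once the limit is reached, further
    # requests get a 502 and nginx logs
    # "no live upstreams while connecting to upstream".
    server 192.0.2.10:8080 max_conns=100;
}

server {
    listen 80;

    location / {
        proxy_pass http://backend;
    }
}
```

Removing max_conns from the server line is what made the errors go away for
us. As far as I understand the docs, the default value of 0 means no limit,
and without a shared memory zone the limit is counted per worker process, so
it can be hit sooner than the configured number suggests under uneven load.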
Posted at Nginx Forum: https://forum.nginx.org/read.php?2,281255,281298#msg-281298