<html>
<head>
<meta http-equiv="Content-Type" content="text/html; charset=utf-8">
</head>
<body style="background-color: rgb(255, 255, 255); color: rgb(0, 0,
0);" text="#000000" bgcolor="#FFFFFF">
We're seeing <tt>502 bad gateway</tt> responses to clients on an
nginx load-balanced upstream due to "<tt>no live upstreams</tt>".<br>
<br>
The <tt>upstream</tt> in question has 2 servers defined with
default settings running over https (<tt>proxy_pass <a
class="moz-txt-link-freetext" href="https://myupstream"
moz-do-not-send="true">https://myupstream</a></tt>).<br>
<br>
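For reference, a minimal sketch of the setup described above. The
server host names and the <tt>listen</tt>/<tt>location</tt> lines are
placeholders assumed for illustration, not our real configuration:<br>
<pre>upstream myupstream {
    # Two servers with default settings, i.e. max_fails=1 fail_timeout=10s
    server backend1.example.com:443;   # placeholder host
    server backend2.example.com:443;   # placeholder host
}

server {
    listen 80;                         # assumed listener

    location / {
        # Proxy to the upstream group over https
        proxy_pass https://myupstream;
    }
}</pre>
<br>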
When this happens we see "<tt>no live upstreams while connecting to
upstream</tt>" in the nginx error log and just prior to this:<br>
"<tt>peer closed connection in SSL handshake (54: Connection reset
by peer) while SSL handshaking to upstream</tt>".<br>
<br>
We currently believe that the client closing the connection is
causing the upstream to have a failure counted against it.<br>
<br>
With the defaults of <tt>max_fails=1</tt> and <tt>fail_timeout=10s</tt>,
it only takes two such closes (one per server) within a 10 second
window to take down all upstream nodes, resulting in the
"<tt>no live upstreams</tt>" error. All subsequent connections for the
next 10 seconds then fail instantly with <tt>502 bad gateway</tt>.<br>
<br>
Does this explanation seem plausible and, if so, is this a bug in nginx?<br>
<br>
We're currently testing with <tt>max_fails=10</tt> as a potential workaround.<br>
<br>
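A sketch of that workaround, with placeholder host names rather than
our real servers; <tt>fail_timeout=10s</tt> is the default and is
written out only to make the window explicit:<br>
<pre>upstream myupstream {
    # A server is only marked unavailable after 10 failed attempts
    # within the fail_timeout window, instead of after a single one.
    server backend1.example.com:443 max_fails=10 fail_timeout=10s;   # placeholder host
    server backend2.example.com:443 max_fails=10 fail_timeout=10s;   # placeholder host
}</pre>
<br>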
Regards<br>
Steve<br>
</body>
</html>