In-flight HTTP requests fail during hot configuration reload (SIGHUP)

Tue Jun 2 13:15:40 UTC 2015

Hello!

On Mon, Jun 01, 2015 at 09:21:12PM +0100, Matthew O'Riordan wrote:

[...]

> > Your problem is in step (5).  While you've started new nginx 
> > workers to handle new requests in step (4), this doesn't guarantee 
> > that old upstream servers are no longer needed.
> 
> I realise that is the problem, but I am not quite sure what the 
> best strategy to correct this is.  We are experiencing this 
> problem in production environments because Nginx sits behind an 
> Amazon ELB.  ELB by default will maintain a connection to the 
> client (browser for example) and a backend server (Nginx in this 
> case).  What we seem to be experiencing is that because ELB has 
> opened a connection to Nginx, Nginx has automatically assigned 
> this socket to an existing healthy upstream server.  So even if 
> a SIGHUP is sent to Nginx, ELB’s next request will always be 
> processed by the old upstream server at the time the connection 
> to Nginx was opened.  So therefore for us to do rolling 
> deployments, we have to keep the old server running for periods 
> of up to say 2 minutes to ensure existing connection requests 
> are completed.  We have designed our upstream server so that it 
> will complete existing in-flight requests, however our upstream 
> server thinks that an in-flight request is one that is being 
> responded to, not one that is perhaps just opened and no data 
> has been sent from the client to the server on the socket yet.

Ideally, you should keep old upstream servers running till all old 
worker processes are terminated.  This way you won't depend on 
configuration and/or implementation details.  This can be a bit 
too long though, as old workers are usually busy sending big 
responses to slow clients.
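
As a rough illustration of "wait for the old workers, but not forever", 
here is a minimal Python sketch.  It assumes you recorded the PIDs of the 
worker processes before sending SIGHUP; the function names, the timeout 
and the polling interval are all hypothetical and not part of nginx.  
Note that a PID can be reused by an unrelated process, so treat this as 
a sketch under those assumptions rather than a robust tool.

```python
import os
import time


def pid_alive(pid):
    """Return True if a process with this PID currently exists."""
    try:
        os.kill(pid, 0)  # signal 0 only checks existence, nothing is delivered
    except ProcessLookupError:
        return False
    except PermissionError:
        return True  # the process exists but belongs to another user
    return True


def wait_for_old_workers(old_pids, timeout, poll=1.0,
                         alive=pid_alive, sleep=time.sleep):
    """Wait until every PID in old_pids is gone or `timeout` seconds pass.

    Returns the set of PIDs that are still alive.  An empty set means all
    old workers have exited and the old upstream servers can be stopped.
    """
    deadline = time.monotonic() + timeout
    remaining = {p for p in old_pids if alive(p)}
    while remaining and time.monotonic() < deadline:
        sleep(poll)
        remaining = {p for p in remaining if alive(p)}
    return remaining
```

If the returned set is not empty when the timeout expires, some old 
workers are still serving requests, and stopping the old upstream 
servers at that point may still break those requests.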

> > Only new connections will be processed by new worker processes with new 
> > nginx configuration.  Old workers continue to service requests 
> > started before you've reconfigured nginx, and will only terminate 
> > once all previously started requests are finished.  This includes 
> > requests already sent to an upstream server and reading a 
> > response, and requests not yet read from a client.  For these 
> > requests the previous configuration applies, and you shouldn't stop old 
> > upstream servers till old worker processes are shut down.
> 
> Ok, however we do need a sensible timeout to ensure we do 
> actually shut down our old upstream servers too. This is the 
> problem I am finding with the strategy we currently have.
> ELB, for example, pipelines requests using a single TCP 
> connection in accordance with the HTTP/1.1 spec.  When a SIGHUP 
> is sent to Nginx, how does it then deal with pipelined requests?  
> Will it process all received requests and then issue a 
> "Connection: Close" header, or will it process the current 
> request and then close the connection?  If the former, then it’s 
> quite possible that in the time those in-flight requests are 
> responded to, another X number of requests will have been 
> received also in the pipeline.

Upon a SIGHUP, nginx will finish processing the requests it has 
already started to process.  No additional requests will be processed, 
including pipelined requests.  This is considered to be an 
implementation detail though, not something guaranteed.
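
From the client side, this means a pipelining client can see the 
connection closed with some requests left unanswered.  HTTP/1.1 allows 
a client to retry unanswered idempotent requests on a new connection.  
The following Python sketch shows that bookkeeping only; the function 
name and the (method, path) tuple layout are hypothetical, and it says 
nothing about what ELB actually does.

```python
# Methods defined as idempotent by HTTP/1.1.
IDEMPOTENT_METHODS = {"GET", "HEAD", "PUT", "DELETE", "OPTIONS", "TRACE"}


def split_unanswered(pipelined, responses_received):
    """Split the requests left unanswered when the connection closed.

    pipelined: list of (method, path) tuples in the order they were sent.
    responses_received: number of complete responses read before the close.

    Returns (retry, report): idempotent requests that may be resent on a
    new connection, and the remaining ones, which need a caller decision.
    """
    unanswered = pipelined[responses_received:]
    retry = [r for r in unanswered if r[0].upper() in IDEMPOTENT_METHODS]
    report = [r for r in unanswered if r[0].upper() not in IDEMPOTENT_METHODS]
    return retry, report
```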

-- 
Maxim Dounin
http://nginx.org/