Strange requests stalling inside nginx
mdounin at mdounin.ru
Mon Feb 13 12:37:57 UTC 2017
On Mon, Feb 13, 2017 at 07:16:00AM -0500, MichaelLogutov wrote:
> Maxim Dounin Wrote:
> > Hello!
> > On Sat, Feb 11, 2017 at 04:06:29PM -0500, MichaelLogutov wrote:
> > > Hello.
> > > We have some strange issues when requests seems to stall inside
> > nginx - in
> > > nginx log we see that request took 1 second (and was terminated by
> > client
> > > timeout), while excactly the same request (we have special unique
> > headers to
> > > mark them) from the uwsgi logs took only 100ms to complete.
> > It is not clear how do you measure that request took 1 second.
> > Note well that request time as seen by nginx is unlikely to be
> > less that the time seen by your backend, but can easily be much
> > longer: for example, nginx has to read the request from client,
> > and it sometimes takes significant time.
> > > We also saw some very strange pauses in nginx debug log:
> > [...]
> > I don't see any "strange pauses" in the logs provided. If you
> > think there are any - please elaborate.
> > Note well that it's almost impossible to analize debug logs for
> > pauses if these logs doesn't contain information about processing
> > of events. To make sure all needed information is present it is
> > important to configure debug logs at global level, as recommended
> > at http://nginx.org/en/docs/debugging_log.html.
> Thank you.
> I was talking about 3 seconds gap here:
> 2017/01/26 20:17:51 [debug] 2749#0: *75780960 http upstream recv(): -1 (11:
> Resource temporarily unavailable)
> 2017/01/26 20:17:54 [debug] 2749#0: *75780960 http upstream request:
Ah, sorry, I didn't notice this.
The time gap here looks like waiting of connect() to complete. It
2017/01/26 20:17:51 [debug] 2749#0: *75780960 connect to 10.0.0.176:80, fd:500 #75780961
and in turn it is followed by sending the request to the upstream
2017/01/26 20:17:54 [debug] 2749#0: *75780960 http upstream send request handler
(The "http upstream recv()" message is a part of checking if the
client closed the connection, and is unrelated.)
The 3 seconds here suggests that there is a packet loss somewhere
between nginx and the backend. Most likely the backend is simply
overloaded, and its listen socket queue overflows and drops
packets. On Linux it may be a good idea to use
net.ipv4.tcp_abort_on_overflows=1 to better diagnose such
problems. Examining listen queue via "ss -nltp" on the backend
might also help.
More information about the nginx