nginx stuck in tight loop sometimes ( sorry for new thread )
mdounin at mdounin.ru
Tue Jan 19 15:10:11 UTC 2021
On Tue, Jan 19, 2021 at 01:33:54PM +0000, James Beal wrote:
> > Are you able to reproduce the problem without any 3rd party
> > modules? Since nginx itself does not use pipes, this looks
> > like a pagespeed problem.
> Not really, we use about 500mb/s of bandwidth with pagespeed
> turned on and we are using about 2 gigabits a second with it
> turned off ( although I think that is more hitting the limits of
> the interconnect ).
Well, so you'll have to debug what happens then. Debugger and
strace are your friends. Most likely you'll end up fixing
pagespeed (or working on removing it from your system). Some
useful links from a quick look into pagespeed code below.
> A two minute look at the nginx source shows pipes around
> upstream I think.
These aren't OS pipes, but rather nginx code named
> Is there a method of working out where an nginx process is stuck
> ( It does not respond to the normal signals ).
As long as the process is stuck in 3rd party module code,
spinning in a loop, the only thing you can do is to send the KILL
signal to the process, which cannot be caught or ignored and
kills the process. Or you can attach to the process with a
debugger and take a look where it spins. I suspect this will be
this code, which simply re-tries writing on EAGAIN:
The "TODO(oschaaf): should we worry about spinning here?" line in
the code looks like exactly what you are seeing. The relevant
issue is linked in the same file:
That is, the code has been known not to work under load for at least
three years. This basically matches my impression from previous
attempts to look into the pagespeed code: it is not expected to
work. I would really recommend reconsidering its usage. If you
need optimizations of the responses returned, consider doing this
during your deployment.
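
To illustrate the failure mode, here is a hypothetical sketch in C. It
is not the pagespeed source, and make_full_pipe, spin_write and
wait_writable are names invented for the example. A non-blocking
write to a full pipe fails with EAGAIN, and a loop that simply
retries never sleeps, so it burns CPU until the reader drains the
pipe. Waiting in poll() for POLLOUT is one common alternative. The
max_tries bound exists only so the demonstration terminates.

```c
#ifndef _POSIX_C_SOURCE
#define _POSIX_C_SOURCE 200809L
#endif
#include <errno.h>
#include <fcntl.h>
#include <poll.h>
#include <unistd.h>

/* Hypothetical helper: create a pipe, make its write end non-blocking,
 * and fill it until the kernel refuses more data. Returns 0 on success. */
static int make_full_pipe(int fds[2])
{
    char chunk[4096] = {0};
    int flags;

    if (pipe(fds) == -1)
        return -1;
    flags = fcntl(fds[1], F_GETFL);
    if (flags == -1 || fcntl(fds[1], F_SETFL, flags | O_NONBLOCK) == -1)
        return -1;
    for (;;) {
        if (write(fds[1], chunk, sizeof(chunk)) == -1) {
            if (errno == EINTR)
                continue;
            return errno == EAGAIN ? 0 : -1;   /* EAGAIN: pipe is full */
        }
    }
}

/* The risky pattern: retry immediately on EAGAIN without waiting.
 * Returns the number of failed attempts, or -1 on an unexpected error.
 * Without the max_tries bound this would spin for as long as the
 * reader is stalled. */
static long spin_write(int fd, const char *buf, size_t len, long max_tries)
{
    long tries = 0;

    while (tries < max_tries) {
        if (write(fd, buf, len) >= 0)
            break;
        if (errno != EAGAIN && errno != EINTR)
            return -1;
        tries++;
    }
    return tries;
}

/* One alternative: sleep in poll() until the descriptor is writable.
 * Returns 1 if writable, 0 on timeout, -1 on error. */
static int wait_writable(int fd, int timeout_ms)
{
    struct pollfd p = { .fd = fd, .events = POLLOUT };
    int rc;

    do {
        rc = poll(&p, 1, timeout_ms);
    } while (rc == -1 && errno == EINTR);
    if (rc <= 0)
        return rc;
    return (p.revents & POLLOUT) ? 1 : -1;
}
```

With a stalled reader, spin_write(fd, "x", 1, 1000) makes all 1000
attempts and returns 1000, while wait_writable(fd, 10) sleeps for the
timeout and returns 0 without consuming CPU.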
Hope this helps.