nginx crash

Maxim Dounin mdounin at mdounin.ru
Wed Nov 23 12:29:36 UTC 2011


Hello!

On Wed, Nov 23, 2011 at 04:01:29AM +0000, António P. P. Almeida wrote:

> On 14 Nov 2011 16h57 WET, magicbearmo at gmail.com wrote:
> 
> > Hello,
> >
> > I think it is also not related to the crash: since I increased the zone
> > key size, the crash hasn't happened again (but my upstreams are also
> > up), so I still don't know where the crash happens.
> > I have just updated all of my servers to 1.1.8. I want to test your
> > config because some of my servers are running dynamic pages, and they
> > don't return the Content-Length.
> 
> I'm having a similar issue. The php-fpm workers stop responding and
> since the FCGI cache lifetime is 15s I get a bunch of SEGVs on the
> cache manager.

By "lifetime" you mean "fastcgi_cache_path ... inactive=15s"?

If yes, then you may want to change your configuration: this isn't 
going to work well, as nginx uses "inactive" as a guard against 
stale cache entry locks (left by crashed workers, if any).  It 
was never expected to be lower than the time required to fetch 
a resource from an upstream, and setting it lower is expected to 
cause problems.
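
A rough sketch of what I mean, assuming the 15s is meant as the 
time a cached response stays fresh (the path, the zone name and 
the exact values below are placeholders, not taken from your 
configuration):

    # keep "inactive" well above the worst-case upstream fetch time
    fastcgi_cache_path  /var/cache/nginx/fcgi  levels=1:2
                        keys_zone=fcgicache:10m  inactive=10m;

    # upper bound on waiting for the upstream response
    fastcgi_read_timeout  60s;

    # response freshness is set separately from "inactive"
    fastcgi_cache_valid   200  15s;

With something like this, freshness is controlled by 
fastcgi_cache_valid, while "inactive" stays larger than 
fastcgi_read_timeout.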

If no, you may want to provide more details.  (Ideally I would 
like to see full debug log showing the crash, the "ignore long 
locked inactive cache entry" alert and some time before it, but I 
understand it would be hard to obtain unless you are able to 
reproduce the problem at will.)
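
For reference, a debug log is normally enabled with something 
like the following, assuming nginx was built with --with-debug 
(the log path here is just a placeholder):

    # only produces debug output in a binary built with --with-debug
    error_log  /var/log/nginx/error.log  debug;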

> I'm increasing the cache keys zone size from 5M to 10M. I know you
> guys at nginx.com are busy people, but please look into this issue. It
> is really making Nginx sensitive to failures in apps, which it shouldn't
> be, since there's a clear separation between app and server. That's
> IMHO one big selling point, among many others of course.

The recently posted patch series [1] addresses a major part of the 
problem: it resolves the deadlock after such crashes.  It would be 
good to resolve the crash itself too, but I haven't yet been able to 
trace its cause (unless it's caused by a too low inactive 
time, which is mostly a configuration problem, see above).

[1] http://mailman.nginx.org/pipermail/nginx-devel/2011-November/001471.html

Maxim Dounin


