nginx crash

Maxim Dounin mdounin at mdounin.ru
Sat Nov 12 16:16:26 UTC 2011


Hello!

On Sat, Nov 12, 2011 at 10:07:22PM +0800, MagicBear wrote:

> It happens when one of the main upstream servers is dead.
> 
> Here is the config
> 
> proxy_cache_path /dev/shm/cdn_cache_comment  levels=1:2
> keys_zone=cache_comment_mem:32m max_size=128m;

[...]

> 	proxy_read_timeout   60s;

How long does it take for a response to be removed from the cache, 
given the "max_size=128m" and "keys_zone=" size constraints, under 
your load?

> 2011/11/12 MagicBear <magicbearmo at gmail.com>:
> > 2011/11/12 19:00:16 [alert] 7552#0: ignore long locked inactive cache
> > entry 26b0312d67bd41ef132ce5b8a4445ffa, count:1
> > 2011/11/12 19:02:17 [alert] 7552#0: ignore long locked inactive cache
> > entry ac307ce9b33a01a04f4f17c187d9b11a, count:1
> > 2011/11/12 19:02:45 [alert] 7552#0: ignore long locked inactive cache
> > entry e5fa15e3f856238feb5e0b7128120e20, count:1
> > 2011/11/12 19:03:59 [alert] 7552#0: ignore long locked inactive cache
> > entry 1eb06fe015c489159f15b514bb333931, count:1
> > 2011/11/12 19:04:46 [alert] 7552#0: ignore long locked inactive cache
> > entry 5023f1eb7e74908ae75d6a7a57ac4dfd, count:2
> > 2011/11/12 19:05:41 [alert] 7552#0: ignore long locked inactive cache
> > entry 9fda125ea01601b6a32536afd2c59aa2, count:1
> > 2011/11/12 19:06:02 [alert] 7547#0: worker process 7548 exited on
> > signal 11 (core dumped)

(Just in case: are there any other alerts before the "ignore long 
locked inactive cache entry..." ones?)

It looks like nginx decided to remove the cache entry (assuming it 
had been left locked by a worker that died earlier) while the entry 
was in fact locked because nginx was waiting for a backend timeout.

The resulting segfault is somewhat expected with the current code.  
It looks like the code needs some (more) safeguards against such 
situations.

As a workaround, you may want to make sure that talking to the 
backend always takes less time than the expected cache entry 
lifetime; this will prevent such situations.
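
A minimal sketch of that workaround, reusing the cache path and zone 
from the quoted config.  The timeout values and the explicit 
"inactive=" setting are illustrative assumptions, not tested 
recommendations; the idea is only that the backend timeouts stay 
well below the time an entry is expected to live in the cache.  
Note that proxy_read_timeout limits the gap between two successive 
reads rather than the whole response, so the real time spent on a 
slow backend may be longer than any single value here.

```nginx
# http context.  "inactive=10m" spells out how long an unused entry
# may stay in the cache (10m is also the nginx default); the
# max_size and keys_zone limits can still evict entries earlier
# under load.
proxy_cache_path /dev/shm/cdn_cache_comment  levels=1:2
                 keys_zone=cache_comment_mem:32m max_size=128m
                 inactive=10m;

# http, server or location context.  Illustrative values, chosen to
# be much shorter than the expected cache entry lifetime.
proxy_connect_timeout 5s;
proxy_send_timeout    10s;
proxy_read_timeout    10s;   # was 60s in the quoted config
```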

Maxim Dounin


