Maybe a bug in Nginx 0.6.30 x86_64
is at rambler-co.ru
Sun May 11 18:43:28 MSD 2008
On Sun, May 11, 2008 at 11:04:02AM +0200, François Battail wrote:
> I've an issue with Nginx 0.6.30. So far it seems that the following
> things are needed to reproduce this bug:
> - keepalives enabled,
> - Nginx running on Linux x86_64,
> - many concurrent connections (> 300).
> Nginx crashes on a SIGSEGV, but it looks like a stack overflow caused by
> recursive calls according to gdb.
> Here is the setup:
> Ubuntu 8.04 64 bits, kernel 2.6.24
> Amd 64 3000+ (single core), 512 MB
> Nginx 0.6.30 configuration:
> 1 worker, no master process, no daemon, epoll, sendfile, worker
> connections: 8000, 1 server serving a 10 kB index.html static file.
> ./configure --prefix=.
> Then using Apache Bench like this:
> ab -k -c 500 -n 1000000 http://localhost:8000/index.html
> will cause a stack overflow (after ~24000 requests).
> Using ab without the keepalive option (-k) works on the same computer
> without problem so far even with a large number of simultaneous
> connections ie:
> ab -c 4000 -n 1000000 http://localhost:8000/index.html
> Never could reproduce it on my main computer (Ubuntu 8.04 i386 32 bits).
> I will try to run more tests on a Q6600 (quad core) running Ubuntu
> 8.04 x86_64 and keep you informed.
> A gdb session showing the first 500 stack backtraces is attached, hope
> it will help.
It seems that you are able to run ab/nginx in resonance:
1) ab issues request, nginx gets epoll notification and sends response,
2) kernel schedules ab, it gets response, sends a new request,
3) kernel schedules nginx, it reads the request, sends a response,
4) goto #2.
So nginx processes all requests after a single epoll notification, without
ever blocking. This can happen only with greedy event notification methods:
epoll and rtsig. Also, I'm not sure that it can be reproduced from a remote host.
I see two ways to fix it: the simplest is to limit the number of keepalive requests.
BTW, please note that running nginx without a master process is not good
for production: it has problems with reconfiguration, etc.
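
To illustrate why a cap on keepalive requests would bound the recursion, here is a toy model of the resonance described above. It is only a sketch and not nginx's actual code: the names serve, handle and keepalive_limit are hypothetical. It assumes that the client's next request is always readable by the time the previous response has been sent, so the handler re-enters itself instead of returning to the event loop.

```python
def serve(total_requests, keepalive_limit=None):
    """Return the deepest handler nesting seen while serving total_requests.

    Toy model only: 'handle' stands in for a request handler that tries to
    read the next keepalive request right after sending a response.
    """
    remaining = total_requests
    deepest = 0

    def handle(depth, served_on_connection):
        nonlocal remaining, deepest
        deepest = max(deepest, depth)
        remaining -= 1                    # read one request, send one response
        served_on_connection += 1
        if remaining == 0:
            return
        if keepalive_limit is not None and served_on_connection >= keepalive_limit:
            return                        # close the connection, unwind the stack
        # Resonance assumption: the next request is already readable, so the
        # handler runs again without going back to the event loop.
        handle(depth + 1, served_on_connection)

    while remaining > 0:                  # one iteration = one event notification
        handle(1, 0)
    return deepest


if __name__ == "__main__":
    print(serve(300))                       # nesting grows with the request count
    print(serve(300, keepalive_limit=100))  # nesting is capped by the limit
```

In this model the nesting depth without a limit equals the number of requests served after one notification, while with a limit it never exceeds the limit, because closing the connection forces a return to the event loop.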