Stop handling SIGTERM and zombie processes after reconfigure
Florian S.
f_los_ch at yahoo.com
Wed Jul 3 14:48:29 UTC 2013
Hi all,
I'm occasionally having trouble with worker processes being left <defunct> and
nginx no longer handling signals (HUP and even TERM) at all.
Upon a reconfigure signal, the log shows four new worker processes being
spawned while the old four processes shut down:
> [notice] 5159#0: using the "epoll" event method
> [notice] 5159#0: nginx/1.4.1
> [notice] 5159#0: built by gcc 4.4.3 (Ubuntu 4.4.3-4ubuntu5.1)
> [notice] 5159#0: OS: Linux 3.9.7-147-x86
> [notice] 5159#0: getrlimit(RLIMIT_NOFILE): 100000:100000
> [notice] 5159#0: start worker processes
> [notice] 5159#0: start worker process 5330
> [notice] 5159#0: start worker process 5331
> [notice] 5159#0: start worker process 5332
> [notice] 5159#0: start worker process 5333
> [notice] 5159#0: signal 1 (SIGHUP) received, reconfiguring
> [notice] 5159#0: reconfiguring
> [notice] 5159#0: using the "epoll" event method
> [notice] 5159#0: start worker processes
> [notice] 5159#0: start worker process 12457
> [notice] 5159#0: start worker process 12458
> [notice] 5159#0: start worker process 12459
> [notice] 5159#0: start worker process 12460
> [notice] 5159#0: start cache manager process 12461
> [notice] 5159#0: start cache loader process 12462
> [notice] 5331#0: gracefully shutting down
> [notice] 5330#0: gracefully shutting down
> [notice] 5331#0: exiting
> [notice] 5330#0: exiting
> [notice] 5331#0: exit
> [notice] 5330#0: exit
> [notice] 5332#0: gracefully shutting down
> [notice] 5159#0: signal 17 (SIGCHLD) received
> [notice] 5159#0: worker process 5331 exited with code 0
> [notice] 5332#0: exiting
> [notice] 5332#0: exit
> [notice] 5333#0: gracefully shutting down
> [notice] 5333#0: exiting
> [notice] 5333#0: exit
After that, nginx is fully operational and serving requests -- however,
ps yields:
> root 5159 0.0 0.0 6248 1696 ? Ss 10:43 0:00 nginx: master process /chroots/nginx/nginx -c /chroots/nginx/conf/nginx.conf
> nobody 5330 0.0 0.0 0 0 ? Z 10:43 0:00 [nginx] <defunct>
> nobody 5332 0.0 0.0 0 0 ? Z 10:43 0:00 [nginx] <defunct>
> nobody 5333 0.0 0.0 0 0 ? Z 10:43 0:00 [nginx] <defunct>
> nobody 12457 0.0 0.0 8332 2940 ? S 10:44 0:00 nginx: worker process
> nobody 12458 0.0 0.0 8332 2940 ? S 10:44 0:00 nginx: worker process
> nobody 12459 0.0 0.0 8332 3544 ? S 10:44 0:00 nginx: worker process
> nobody 12460 0.0 0.0 8332 2940 ? S 10:44 0:00 nginx: worker process
> nobody 12461 0.0 0.0 6296 1068 ? S 10:44 0:00 nginx: cache manager process
> nobody 12462 0.0 0.0 0 0 ? Z 10:44 0:00 [nginx] <defunct>
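For anyone who wants to check for the same symptom, the defunct entries can be picked out of such output mechanically. The following is only a minimal sketch, assuming `ps aux`-style columns (USER PID %CPU %MEM VSZ RSS TTY STAT START TIME COMMAND); the function name `find_zombies` is made up for illustration and the sample lines are copied from the listing above.

```python
def find_zombies(ps_lines):
    """Return the PIDs whose STAT column starts with 'Z' (zombie/defunct).

    Assumes `ps aux`-style lines: the PID is the 2nd field, STAT the 8th.
    """
    zombies = []
    for line in ps_lines:
        # Split into at most 11 fields so the command (which may contain
        # spaces) stays together in the last field.
        fields = line.split(None, 10)
        if len(fields) < 11:
            continue  # header fragment or wrapped line, skip it
        pid, stat = fields[1], fields[7]
        if pid.isdigit() and stat.startswith("Z"):
            zombies.append(int(pid))
    return zombies


sample = [
    "root 5159 0.0 0.0 6248 1696 ? Ss 10:43 0:00 nginx: master process "
    "/chroots/nginx/nginx -c /chroots/nginx/conf/nginx.conf",
    "nobody 5330 0.0 0.0 0 0 ? Z 10:43 0:00 [nginx] <defunct>",
    "nobody 5332 0.0 0.0 0 0 ? Z 10:43 0:00 [nginx] <defunct>",
    "nobody 5333 0.0 0.0 0 0 ? Z 10:43 0:00 [nginx] <defunct>",
    "nobody 12457 0.0 0.0 8332 2940 ? S 10:44 0:00 nginx: worker process",
    "nobody 12461 0.0 0.0 6296 1068 ? S 10:44 0:00 nginx: cache manager process",
    "nobody 12462 0.0 0.0 0 0 ? Z 10:44 0:00 [nginx] <defunct>",
]

print(find_zombies(sample))
```

On the sample lines this reports 5330, 5332, 5333 and 12462, i.e. three old workers and the cache loader.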
The log shows that SIGCHLD was received only once, for 5331, which does
not show up as a zombie -- in contrast to the workers 5330, 5332, 5333,
and the cache loader 12462.
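The same cross-check can be done against the error log: compare the processes that logged their own "exit" with those the master reported as "exited with code". This is a rough sketch, not anything nginx ships; the helper name `exited_but_not_reaped` and the two regular expressions are assumptions that only match the notice lines quoted in this mail.

```python
import re

# A worker logging its own exit, e.g. "[notice] 5330#0: exit"
EXIT_RE = re.compile(r"\[notice\] (\d+)#0: exit$")
# The master reporting a reaped child, e.g.
# "[notice] 5159#0: worker process 5331 exited with code 0"
REAPED_RE = re.compile(r"process (\d+) exited with code")


def exited_but_not_reaped(log_lines):
    """Return PIDs that logged 'exit' but were never reported by the master."""
    exited, reaped = [], set()
    for line in log_lines:
        line = line.strip()
        m = EXIT_RE.search(line)
        if m:
            exited.append(int(m.group(1)))
        m = REAPED_RE.search(line)
        if m:
            reaped.add(int(m.group(1)))
    return [pid for pid in exited if pid not in reaped]


log = [
    "[notice] 5331#0: exit",
    "[notice] 5330#0: exit",
    "[notice] 5159#0: signal 17 (SIGCHLD) received",
    "[notice] 5159#0: worker process 5331 exited with code 0",
    "[notice] 5332#0: exit",
    "[notice] 5333#0: exit",
]

print(exited_but_not_reaped(log))
```

For the excerpt above this yields 5330, 5332 and 5333. The cache loader 12462 is not caught this way, because no "exit" line for it appears in the log excerpt.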
Much more serious is that neither
> /chroots/nginx/nginx -c /chroots/nginx/conf/nginx.conf -s(stop|reload)
nor
> kill 5159
seems to be handled by nginx anymore (nothing in the log and no effect).
Maybe the master process is stuck waiting on some mutex?
> strace -p 5159
> Process 5159 attached - interrupt to quit
> futex(0xb7658e6c, FUTEX_WAIT_PRIVATE, 2, NULL
Unfortunately, I failed to get a core dump of the master process while
it was running. Additionally, there is no debug log available, sorry. As
I have not been able to reproduce this issue reliably, I'll most probably
have to wait...
Many thanks in advance and kind regards,
Florian