nginx worker stuck, potential mod_zip bug
W. Andrew Loe III
andrew at andrewloe.com
Sat Feb 18 00:03:46 UTC 2012
I'm running an older version of nginx (0.7.67) with mod_zip 1.1.6. I
believe we have found a rare bug, and I'm trying to figure out
whether it is in mod_zip or in nginx itself, and whether upgrading
nginx would potentially resolve it.
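(For reference, the exact build and the statically compiled modules,
mod_zip included, can be confirmed from the configure arguments; the
paths in the backtrace below come from that build.)
$ nginx -V    # prints version and configure arguments, including the --add-module path for mod_zip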
The symptom is a worker process getting "stuck" at 100% CPU, leaving
all connections in CLOSE_WAIT, and servicing no requests. It appears
that the trigger for this is downloading an archive from mod_zip, but
we have never been able to reproduce it, only observe it in
production.
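In case it helps, this is roughly how I catch the condition when it
happens (the exact commands are just what I happen to use; ss(8)
would work in place of netstat):
$ top -b -n 1 | grep nginx             # one worker pinned at ~100% CPU
$ netstat -tn | grep CLOSE_WAIT        # its connections piling up in CLOSE_WAIT
$ ps -o pid,pcpu,etime,args -C nginx   # pick out the spinning worker's PID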
I was finally able to catch one of the workers and get a backtrace from gdb:
$ cat backtrace.log | addr2line -e /usr/lib/debug/usr/sbin/nginx -f
ngx_http_postpone_filter
/build/buildd/nginx-0.7.67/src/http/ngx_http_postpone_filter_module.c:125
ngx_http_ssi_body_filter
/build/buildd/nginx-0.7.67/src/http/modules/ngx_http_ssi_filter_module.c:430
ngx_http_charset_body_filter
/build/buildd/nginx-0.7.67/src/http/modules/ngx_http_charset_filter_module.c:643
ngx_http_zip_body_filter
/build/buildd/nginx-0.7.67/modules/mod_zip-1.1.6/ngx_http_zip_module.c:336
ngx_output_chain
/build/buildd/nginx-0.7.67/src/core/ngx_output_chain.c:58
ngx_http_copy_filter
/build/buildd/nginx-0.7.67/src/http/ngx_http_copy_filter_module.c:114
ngx_http_range_body_filter
/build/buildd/nginx-0.7.67/src/http/modules/ngx_http_range_filter_module.c:549
ngx_http_output_filter
/build/buildd/nginx-0.7.67/src/http/ngx_http_core_module.c:1716
ngx_event_pipe_write_to_downstream
/build/buildd/nginx-0.7.67/src/event/ngx_event_pipe.c:627
ngx_http_upstream_process_upstream
/build/buildd/nginx-0.7.67/src/http/ngx_http_upstream.c:2509
ngx_http_upstream_handler
/build/buildd/nginx-0.7.67/src/http/ngx_http_upstream.c:844
ngx_event_process_posted
/build/buildd/nginx-0.7.67/src/event/ngx_event_posted.c:30
ngx_worker_process_cycle
/build/buildd/nginx-0.7.67/src/os/unix/ngx_process_cycle.c:793
ngx_spawn_process
/build/buildd/nginx-0.7.67/src/os/unix/ngx_process.c:201
ngx_reap_children
/build/buildd/nginx-0.7.67/src/os/unix/ngx_process_cycle.c:612
main
/build/buildd/nginx-0.7.67/src/core/nginx.c:396
??
??:0
_start
??:0
??
??:0
??
??:0
??
??:0
??
??:0
??
??:0
??
??:0
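For anyone who wants to do the same, the rough procedure is to attach
gdb to the spinning worker, dump the raw frame addresses into
backtrace.log, and run them through addr2line as above (<worker_pid>
here stands for the PID of the stuck worker):
$ gdb -p <worker_pid>
(gdb) bt        # raw frame addresses, saved to backtrace.log
(gdb) detach
(gdb) quit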
I also had the error log in debug mode (it is incredibly large,
roughly 150GB uncompressed), and it is completely filled with the
following little loop:
2012/02/17 11:58:45 [debug] 10150#0: *1888 mod_zip: entering subrequest body filter
2012/02/17 11:58:45 [debug] 10150#0: *1888 http postpone filter "/s3/bucket/key" 0000000000000000
2012/02/17 11:58:45 [debug] 10150#0: *1888 copy filter: 0 "/s3/bucket/key"
2012/02/17 11:58:45 [debug] 10150#0: *1888 pipe write busy: 8192
2012/02/17 11:58:45 [debug] 10150#0: *1888 pipe write: out:0000000000000000, f:1
2012/02/17 11:58:45 [debug] 10150#0: *1888 http output filter "/s3/bucket/key"
2012/02/17 11:58:45 [debug] 10150#0: *1888 copy filter: "/s3/bucket/key"
2012/02/17 11:58:45 [debug] 10150#0: *1888 mod_zip: entering subrequest body filter
2012/02/17 11:58:45 [debug] 10150#0: *1888 http postpone filter "/s3/bucket/key" 0000000000000000
2012/02/17 11:58:45 [debug] 10150#0: *1888 copy filter: 0 "/s3/bucket/key"
2012/02/17 11:58:45 [debug] 10150#0: *1888 pipe write busy: 8192
2012/02/17 11:58:45 [debug] 10150#0: *1888 pipe write: out:0000000000000000, f:1
2012/02/17 11:58:45 [debug] 10150#0: *1888 http output filter "/s3/bucket/key"
2012/02/17 11:58:45 [debug] 10150#0: *1888 copy filter: "/s3/bucket/key"
2012/02/17 11:58:45 [debug] 10150#0: *1888 mod_zip: entering subrequest body filter
2012/02/17 11:58:45 [debug] 10150#0: *1888 http postpone filter "/s3/bucket/key" 0000000000000000
2012/02/17 11:58:45 [debug] 10150#0: *1888 copy filter: 0 "/s3/bucket/key"
2012/02/17 11:58:45 [debug] 10150#0: *1888 pipe write busy: 8192
2012/02/17 11:58:45 [debug] 10150#0: *1888 pipe write: out:0000000000000000, f:1
2012/02/17 11:58:45 [debug] 10150#0: *1888 http output filter "/s3/bucket/key"
2012/02/17 11:58:45 [debug] 10150#0: *1888 copy filter: "/s3/bucket/key"
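The log itself is far too big to read directly, so this is roughly
how I confirmed it consists of essentially nothing but this loop (the
connection id *1888 and the file name error.log are specific to this
capture):
$ grep -c 'pipe write busy: 8192' error.log
$ grep '\*1888' error.log | awk '{$1=$2=""; print}' | sort | uniq -c | sort -rn | head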
Am I right in assuming this is a bug in mod_zip? It looks like a
buffer is never being drained to the client.