Socket leak
Fasih
faskiri.devel at gmail.com
Fri May 24 13:39:58 UTC 2013
Hi all
I have been seeing slow but steady socket leak in nginx ever since I
upgraded from 1.0.5 to 1.2.6. I have my custom module in nginx which I was
sure what was the leak. This is how I went about investigating:
1. Configure nginx with one worker
2. strace on the worker process, tracing
read/readv/write/writev/close/shutdown calls
3. Every now and then, for all the open fds (from ls -l /proc/<pid>/fd),
check the socket that is not available in netstat -pane
4. What I saw was, the leaking socket always had the last operation as
writev which returned an error.
5. Increased the nginx log level to info and verified that nginx was
getting ECONNRESET or EPIPE on writev failure. Which was OK.
6. Traced back in code to see how it is handled, the error translates to
CHAIN_ERROR and eventually causes ngx_http_finalize_request to be called.
This in turn calls ngx_http_terminate_request.
However, in this function, the request is not terminated if
r->write_event_handler is set. This seems to be set if the request handler
is a user module. I think the rationale for the check is, if there is a
module who is handling the request, dont terminate yet, wait for a write
event on the socket and then terminate it (which is why I thought it is
setting r->write_event_handler to ngx_http_terminate_handler).
I tried to repro this w/ empty_gif_handler however, it sends header and
body in one call to writev which I cant get to fail in my test environment.
To reproduce the bug, if I replace the call to ngx_http_send_response with
ngx_http_send_header and ngx_http_output_filter (as used by ngx_upstream or
other modules which dont have the headers and body together), I could
reproduce the leak. I have a client that sends a request and closes the
socket immediately, nginx sees the error, prints the info log, and then it
doesnt close the socket.
I have a small patch attached, the fix I did is basically saying that if
there is a connection error, there is no point setting write_event_handler
as there wont be any activity on the socket, so just terminate it
immediately.
I could be very wrong in the understanding of the code flow. My patch just
fixes this and I am not very sure if this is the right fix. Please let me
know.
I will try to add a testcase to reproduce this in the nginx test framework.
Thank you for your patience.
Regards
+Fasih
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://mailman.nginx.org/pipermail/nginx-devel/attachments/20130524/f8220833/attachment.html>
-------------- next part --------------
diff --git a/nginx-1.2.6/src/http/ngx_http_request.c b/nginx-1.2.6/src/http/ngx_http_request.c
--- a/nginx-1.2.6/src/http/ngx_http_request.c
+++ b/nginx-1.2.6/src/http/ngx_http_request.c
@@ -2172,7 +2172,7 @@ ngx_http_terminate_request(ngx_http_request_t *r, ngx_int_t rc)
"http terminate cleanup count:%d blk:%d",
mr->count, mr->blocked);
- if (mr->write_event_handler) {
+ if (mr->write_event_handler && !c->error) {
if (mr->blocked) {
return;
More information about the nginx-devel
mailing list