faskiri.devel at gmail.com
Fri May 24 16:49:57 UTC 2013
Thanks for the really quick reply. The explanation of
ngx_http_run_posted_requests() made sense and covered the bit I was missing.
I get the bug when a writev() called in the context of a request handler
returns an error. The repro I had was basically with nginx running on a server and
client on my laptop over wireless @ work. I am not @ work now and from my
home connection I am unable to repro this. Will send you the backtrace as
soon as I get it again.
On Fri, May 24, 2013 at 8:24 PM, Maxim Dounin <mdounin at mdounin.ru> wrote:
> On Fri, May 24, 2013 at 07:09:58PM +0530, Fasih wrote:
> > Hi all
> > I have been seeing slow but steady socket leak in nginx ever since I
> > upgraded from 1.0.5 to 1.2.6. I have my custom module in nginx which I
> > sure what was the leak. This is how I went about investigating:
> > 1. Configure nginx with one worker
> > 2. strace on the worker process, tracing
> > read/readv/write/writev/close/shutdown calls
> > 3. Every now and then, for all the open fds (from ls -l /proc/<pid>/fd),
> > check the socket that is not available in netstat -pane
> > 4. What I saw was, the leaking socket always had the last operation as
> > writev which returned an error.
> > 5. Increased the nginx log level to info and verified that nginx was
> > getting ECONNRESET or EPIPE on writev failure. Which was OK.
> > 6. Traced back in code to see how it is handled, the error translates to
> > CHAIN_ERROR and eventually causes ngx_http_finalize_request to be called.
> > This in turn calls ngx_http_terminate_request.
> > However, in this function, the request is not terminated if
> > r->write_event_handler is set. This seems to be set if the request
> > is handled by a user module. I think the rationale for the check is: if a
> > module is handling the request, don't terminate yet; wait for a write
> > event on the socket and then terminate it (which is why I thought it
> > sets r->write_event_handler to ngx_http_terminate_handler).
> Rationale is to make sure there are no functions on the stack which
> assume the request object is still there and would try to access it
> after the request data has been freed.
> The r->write_event_handler (that is, ngx_http_terminate_handler())
> is expected to be called by ngx_http_run_posted_requests(), which
> in turn is called by low-level event handling functions (notably,
> > I tried to repro this w/ empty_gif_handler; however, it sends header and
> > body in one call to writev, which I can't get to fail in my test
> > If I replace the call to ngx_http_send_response with
> > ngx_http_send_header and ngx_http_output_filter (as used by ngx_upstream and
> > other modules which don't have the headers and body together), I can
> > reproduce the leak. I have a client that sends a request and closes the
> > socket immediately; nginx sees the error, prints the info log, and then
> > doesn't close the socket.
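
A minimal sketch of the kind of client described above (send a request, then close the socket immediately). This is not the client used in the original report: send_and_abort() is a hypothetical helper, and setting SO_LINGER with a zero timeout is an assumption added here because, on Linux, it typically makes close() send a RST, which tends to surface as ECONNRESET or EPIPE on the server's next write.

```c
#include <arpa/inet.h>
#include <netinet/in.h>
#include <string.h>
#include <sys/socket.h>
#include <sys/types.h>
#include <unistd.h>

/*
 * Hypothetical helper (not part of nginx or its test suite): connect to
 * an IPv4 host:port, send one request, then abort the connection at once.
 * Returns 0 on success, -1 on any failure.
 */
int
send_and_abort(const char *host, unsigned short port, const char *request)
{
    struct sockaddr_in  addr;
    struct linger       lg;
    size_t              len = strlen(request);
    int                 fd;

    memset(&addr, 0, sizeof(addr));
    addr.sin_family = AF_INET;
    addr.sin_port = htons(port);

    /* Only dotted-quad IPv4 addresses are accepted in this sketch. */
    if (inet_pton(AF_INET, host, &addr.sin_addr) != 1) {
        return -1;
    }

    fd = socket(AF_INET, SOCK_STREAM, 0);
    if (fd == -1) {
        return -1;
    }

    if (connect(fd, (struct sockaddr *) &addr, sizeof(addr)) == -1) {
        close(fd);
        return -1;
    }

    if (send(fd, request, len, 0) != (ssize_t) len) {
        close(fd);
        return -1;
    }

    /* Assumption: a zero linger timeout turns close() into an abort. */
    lg.l_onoff = 1;
    lg.l_linger = 0;
    if (setsockopt(fd, SOL_SOCKET, SO_LINGER, &lg, sizeof(lg)) == -1) {
        close(fd);
        return -1;
    }

    return close(fd) == 0 ? 0 : -1;
}
```

It could be called in a loop against the server under test, for example send_and_abort("127.0.0.1", 8080, "GET / HTTP/1.0\r\n\r\n"), where the address and port are placeholders.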
> > I have a small patch attached, the fix I did is basically saying that if
> > there is a connection error, there is no point setting
> > as there won't be any activity on the socket, so just terminate it
> > immediately.
> > I could be very wrong in the understanding of the code flow. My patch
> > fixes this and I am not very sure if this is the right fix. Please let me
> > know.
> > I will try to add a testcase to reproduce this in the nginx test
> The patch looks wrong, see above.
> Could you please show a backtrace up to
> ngx_http_terminate_request() with mr->write_event_handler and
> c->error set (i.e. where you think leak happens)?
> You may also want to upgrade to a more recent version, e.g. 1.5.0,
> to make sure the problem you are facing isn't already fixed.
> Maxim Dounin
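
To illustrate the rationale Maxim gives above, here is a simplified, self-contained model of deferred termination. It is not nginx source: every demo_* name is hypothetical, and the single "posted" flag stands in for the posted-requests queue. The point it tries to show is that when a handler is still set, termination only swaps in a terminate handler and posts the request, and the actual close happens later, once the posted request is run and no caller on the stack can still touch the freed request.

```c
#include <stddef.h>

typedef struct demo_request_s  demo_request_t;
typedef void (*demo_handler_pt)(demo_request_t *r);

/* Hypothetical stand-in for a request; not the real ngx_http_request_t. */
struct demo_request_s {
    demo_handler_pt  write_event_handler;  /* non-NULL while a module handles it */
    int              posted;               /* models "request is in the posted queue" */
    int              closed;               /* models "request data has been freed" */
};

/* Stands in for a module's own write handler; it does nothing here. */
void
demo_module_write_handler(demo_request_t *r)
{
    (void) r;
}

/* Runs later from the event loop, when no module code is on the stack. */
void
demo_terminate_handler(demo_request_t *r)
{
    r->closed = 1;
}

/* Models the check discussed in the thread. */
void
demo_terminate(demo_request_t *r)
{
    if (r->write_event_handler != NULL) {
        /* A handler may still be on the stack: defer the close. */
        r->write_event_handler = demo_terminate_handler;
        r->posted = 1;
        return;
    }

    /* Nobody else can be using the request: close it right away. */
    r->closed = 1;
}

/* Models the step that runs posted requests after event handling. */
void
demo_run_posted(demo_request_t *r)
{
    if (r->posted) {
        r->posted = 0;
        r->write_event_handler(r);
    }
}
```

In this model a request stays open only if nothing calls demo_run_posted() after demo_terminate(). Whether something similar happens in the reported leak is not established in this thread, which is why the backtrace matters.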