I have a situation where when using nginx as a proxy, large POST
requests to my upstream server sometimes fail -- for whatever reason
(network congestion, upstream server paused for Java GC, etc.), the
initial writes fill up the socket buffers and so ngx_writev gets an
EAGAIN. So that's pretty normal; I'd expect nginx to handle that, poll
the fd, and retry the I/O when the socket is ready to write again. [In
my case, that's always within a few seconds; if the socket isn't
available within proxy_send_timeout, then I'd expect the request to be
failed.]

Within the low-level parts of the nginx code, I see the framework for
this all in place: ngx_writev() returns NGX_AGAIN to
ngx_linux_sendfile_chain. That calls ngx_chain_update_sent to adjust the
buffer for the amount written and marks the ngx event holder as not
ready. Later, I do see nginx poll the fd; and the event holder gets
marked as ready, but by that time the partially-written data has been
lost and so is never written. Before that time, ngx_http_finalize_request
has been called with a status of NGX_DONE on the call path back up from
ngx_writev(), and the pending data seems also to have been discarded.

Interestingly, the call stack and path here seem to be the same whether
proxy_request_buffering is on or off.

Have I missed something (I must have, right?), or is the partial write
situation just not handled properly at all?

If it is the case that the state required to keep track of the buffer is
not propagated through the code and it would be a big thing to fix, then
a simpler fix would be an option that makes ngx_writev() wait on a
temporary selector until the data can be written (or until
proxy_send_timeout expires). That blocks the worker for that time, but the
worker can really only handle one request at a time anyway, correct? So
it is pretty much indistinguishable from a slightly-faster but still
pretty slow upstream server.