A long time ago I briefly mentioned that I was working on chunked HTTP request body processing.
While implementing it, I ran into several problems.
The main problem is that a chunked HTTP request does not always carry a Content-Length header, so it is impossible to know in advance how many bytes must be read to get the complete body. On a keepalive connection this is problematic, because the header of the next pipelined request may immediately follow the body or trailer of the current request. Assume that we are trying to read the body into preallocated buffers in the most efficient way. The header of the next request may then end up in the request body buffer. This creates an undesirable situation: on the one hand, the chunked body filter must signal the end of the request body; on the other hand, the remaining part of the buffer, which the chunked body filter does not want to consume, must be returned to the HTTP header parser. I've found that this last step, returning the remaining part of the buffer to the HTTP header parser, cannot be done easily because of the complexity of the HTTP header parser. Does anyone have a clue how it could be implemented?
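To make the problem concrete, here is a minimal sketch in Python (not nginx code). The function name decode_chunked and its return convention are my own assumptions for illustration. It decodes a chunked body from a buffer and reports how many bytes it consumed, so that whatever follows (the start of the next pipelined request) can be handed back to the header parser.

```python
def decode_chunked(buf):
    """Decode a chunked body found at the start of buf.

    Returns (body, consumed) when the body and trailer are complete,
    or None when more data is needed. Hypothetical helper, not nginx API.
    """
    pos = 0
    body = bytearray()
    while True:
        eol = buf.find(b"\r\n", pos)
        if eol < 0:
            return None                      # chunk-size line incomplete
        # Drop optional chunk extensions after ';' and parse the hex size.
        size = int(buf[pos:eol].split(b";", 1)[0].strip(), 16)
        pos = eol + 2
        if size == 0:
            # Last chunk: skip trailer fields up to the empty line.
            while True:
                eol = buf.find(b"\r\n", pos)
                if eol < 0:
                    return None              # trailer incomplete
                empty_line = (eol == pos)
                pos = eol + 2
                if empty_line:
                    return bytes(body), pos  # pos = bytes consumed
        if len(buf) < pos + size + 2:
            return None                      # chunk data incomplete
        if buf[pos + size:pos + size + 2] != b"\r\n":
            raise ValueError("chunk data not terminated by CRLF")
        body += buf[pos:pos + size]
        pos += size + 2


if __name__ == "__main__":
    data = b"4\r\nWiki\r\n5\r\npedia\r\n0\r\n\r\nGET /next HTTP/1.1\r\n\r\n"
    body, consumed = decode_chunked(data)
    leftover = data[consumed:]   # belongs to the next request's header
    print(body, leftover)
```

The decoding itself is the easy part. The difficulty described above is what to do with leftover: in this sketch it is just a slice, but in nginx it would have to be fed back into the header parser's own buffer and state.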
The second problem is that the efficiency of request body reception is limited by the connection's recv call, because recv_chain does not take a read limit in bytes as an argument. Therefore, even if Content-Length is given, it is impossible to read the request body into multiple buffers, which would reduce memory consumption. If I grep for recv_chain in nginx's code, I see that it is used only in src/event/ngx_event_pipe.c. It doesn't seem that much would break if an argument were added to recv_chain.
Does anyone see any other problems with adding an argument to recv_chain?
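As a rough illustration of what such a limit would buy, here is a small Python sketch (again not nginx code). The name recv_chain_limited and the limit parameter are hypothetical. It does a scatter read into several buffers but never takes more than limit bytes, so bytes belonging to the next request stay in the socket.

```python
import os


def recv_chain_limited(fd, buffers, limit):
    """Scatter-read at most `limit` bytes from fd into a list of bytearrays.

    Hypothetical helper: it trims the buffer chain to `limit` bytes and then
    issues a single readv(). Returns the number of bytes actually read.
    """
    views = []
    remaining = limit
    for buf in buffers:
        if remaining <= 0:
            break
        take = min(len(buf), remaining)
        views.append(memoryview(buf)[:take])  # writable slice of the buffer
        remaining -= take
    if not views:
        return 0
    return os.readv(fd, views)


if __name__ == "__main__":
    r, w = os.pipe()
    # 8 bytes of body followed by the start of the next request.
    os.write(w, b"BODYBODYGET / HTTP/1.1\r\n")
    bufs = [bytearray(5), bytearray(5)]
    n = recv_chain_limited(r, bufs, 8)   # 8 plays the role of Content-Length
    print(n, bufs, os.read(r, 64))
```

With the limit set to the known body length, the next request's header is left unread for the header parser, even though the buffers had room for more.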
Now, you might ask why I am writing this when someone has already implemented reception of chunked bodies. I've seen agentzh's implementation of a chunked body parser:
It is nice, but I think it neither addresses the pipelined requests problem nor can be used with standard modules, like proxy, because it re-implements the function ngx_http_read_client_request_body.
Could someone comment on that?
I'm really happy with this mailing list. Thank you Igor! Although I haven't
developed an nginx module, I'm eagerly following the development of nginx :)
It seems that there is one module development guide, which was written by
Evan Miller. If there are more, could you please share?
I think starting with an informative post as a first message is useful for the