Processing responses of unbounded sizes from upstream servers

Maxime Henrion mhenrion at appnexus.com
Tue Oct 27 17:04:07 UTC 2015


Hello Maxim,


Thanks for your input, it *is* appreciated.

As someone who has contributed to a large number of open source projects throughout the years, it is however quite annoying to be pointed at the source over and over again.

You seem to think that I've never taken even a quick look at the source code, and that I am sending mails to this list out of pure laziness. As it turns out, I have been reading this codebase continuously since I started this project. I couldn't possibly have gotten as far as I am now without doing that. I have opened and read the SSI module more times than I can count now, and perused a large number of core source files.

Yes, ultimately, the source code is the one and only source of truth, and yes, the source code undeniably contains the answers to all the questions I have asked so far, in some form or another. However, if you want the luxury to "UTSL" people, you would need a codebase that is properly commented first... I have rarely worked with code that contains as few comments as the nginx codebase. That wouldn't be so bad if there were an API guide or something to begin with!

I'm more than happy to be pointed at code, I really am. And guess what, I hate sending e-mails to development mailing lists to get information probably a lot more than you hate answering them (as it seems you do), and would MUCH rather be able to get at what I need by myself. But sorry to say, the current nginx codebase does not constitute proper documentation of its internals for someone who hasn't had prior experience with it.

So yeah, I'm fully aware that trying to develop nginx modules without reading the source code won't work. I'm not doing that though, not even remotely...

Regards,
Maxime

_______________________________________
From: nginx-devel [nginx-devel-bounces at nginx.org] on behalf of Maxim Dounin [mdounin at mdounin.ru]
Sent: Tuesday, October 27, 2015 5:25 PM
To: nginx-devel at nginx.org
Subject: Re: Processing responses of unbounded sizes from upstream servers

Hello!

On Tue, Oct 27, 2015 at 04:00:09PM +0000, Maxime Henrion wrote:

> Hey Sorin!
>
> First and foremost, thanks a lot for your answer.
>
> I started working on implementing a body filter for my needs,
> and am already facing some strange issues. When my body filter
> is called on the subrequest I sent, despite it being a body
> filter, I get a buffer chain containing the whole upstream
> response, including headers. Since this is not a header filter,
> this has gotten me quite confused... For what it's worth, my
> module doesn't even register a header filter. Here is some gdb
> output:
>
>
> Breakpoint 1, ngx_http_dispatcher_body_filter (r=0x5402050, in=0x5403b38) at /home/mhenrion/dispatcher/ngx_http_dispatcher_module.c:207
> 207 {
> (gdb) p *in->buf
> $1 = {pos = 0x5404441 "N", last = 0x5404498 "", file_pos = 0, file_last = 0,
>   start = 0x54043c0 "HTTP/1.1 200 OK\r\n[more data here that I stripped]",
>   end = 0x54083c0 "", tag = 0x7671e0 <ngx_http_proxy_module>, file = 0x0, shadow = 0x5403698, temporary = 1, memory = 0, mmap = 0, recycled = 0, in_file = 0, flush = 0,
>   sync = 0, last_buf = 0, last_in_chain = 0, last_shadow = 1, temp_file = 0, num = 0}
> (gdb)
>
> Unless I'm missing something, this makes it a lot harder for me
> to actually work on the body of those responses, as I would end
> up having to parse the HTTP header to find the part I'm
> interested in.

You are expected to work with data between in->buf->pos and
in->buf->last (assuming a buffer in memory).
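
To illustrate the point about pos and last, here is a minimal,
self-contained sketch. The sketch_buf_t and sketch_chain_t types are
simplified stand-ins for ngx_buf_t and ngx_chain_t (only the four
pointers, none of the flags), and sketch_chain_payload_size() is a
hypothetical helper, not an nginx function. In the gdb output quoted
above, pos is 0x81 bytes past start, so the status line and headers
presumably sit before pos in the same memory block, and the 0x57
bytes between pos and last are what the filter is meant to process.

```c
#include <assert.h>
#include <stddef.h>

/* Simplified stand-ins for ngx_buf_t / ngx_chain_t (assumption:
 * only the pointer fields matter for this illustration). */
typedef struct {
    unsigned char *start;  /* start of the allocated memory block */
    unsigned char *pos;    /* first byte the filter should process */
    unsigned char *last;   /* one past the last byte to process */
    unsigned char *end;    /* end of the allocated memory block */
} sketch_buf_t;

typedef struct sketch_chain_s {
    sketch_buf_t          *buf;
    struct sketch_chain_s *next;
} sketch_chain_t;

/* Hypothetical helper: sum the payload bytes of an in-memory chain,
 * looking only at [pos, last) and never at [start, pos). */
static size_t
sketch_chain_payload_size(const sketch_chain_t *in)
{
    size_t total = 0;

    for (; in != NULL; in = in->next) {
        const sketch_buf_t *b = in->buf;

        /* The window to process always lies inside the block. */
        assert(b->start <= b->pos && b->pos <= b->last
               && b->last <= b->end);

        total += (size_t) (b->last - b->pos);
    }

    return total;
}
```

With a block holding "HTTP/1.1 200 OK\r\n\r\nhello" and pos set just
past the blank line, the helper counts only the five bytes of
"hello", regardless of what start points at.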

> I'm also at a loss as to the semantics of the last_buf and
> last_in_chain flags. The module development guide doesn't
> contain any information about the "last_in_chain" flag; it only
> talks about last_buf as a marker indicating that the buffer
> chain is complete. Intuitively, I would think of "last_in_chain"
> as a marker that we're at the end of the buffer list, but that
> doesn't make a lot of sense since this would be redundant with
> the next pointer being NULL. The ngx_buf.h header is equally
> unhelpful there and I couldn't locate a piece of code that would
> make the semantics clear yet (some modules seem to consider both
> flags as being equal, while others don't).

The last_buf flag marks the last buffer of a request.  The
last_in_chain flag marks the last buffer of a subrequest.  Looking
into ngx_http_send_special() may help to understand things.
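
As a rough illustration of the distinction, the sketch below mirrors
what ngx_http_send_special() appears to do when it emits the final,
empty buffer: a main request gets last_buf, while a subrequest gets
sync and last_in_chain instead. This is a paraphrase from memory, not
the nginx code itself; flag_buf_t, flag_mark_last() and
flag_is_final() are hypothetical names, and the real function also
deals with flushing and post actions, which are left out here.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in carrying only the flags discussed here. */
typedef struct {
    unsigned sync:1;
    unsigned last_buf:1;
    unsigned last_in_chain:1;
} flag_buf_t;

/* Mark the final, empty buffer of a response. Assumption: main
 * requests get last_buf; subrequests get sync + last_in_chain. */
static void
flag_mark_last(flag_buf_t *b, bool is_main_request)
{
    assert(b != NULL);

    if (is_main_request) {
        b->last_buf = 1;
    } else {
        b->sync = 1;
        b->last_in_chain = 1;
    }
}

/* A filter that can run on either kind of request may want to
 * treat both flags as "no more data for this (sub)request". */
static bool
flag_is_final(const flag_buf_t *b)
{
    assert(b != NULL);

    return b->last_buf || b->last_in_chain;
}
```

Under that reading, the "mostly zeroed" buffer with sync and
last_in_chain set is simply the end-of-subrequest marker.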

> Last but not least, is there any documentation as to when
> exactly the output filters are being called? Mine seems to be
> called two times for the same subrequest. Once with the buffer
> that I showed above, and a second time with a buffer that is
> mostly all zero'ed, except for the "sync" and "last_in_chain"
> flags both set to 1. That seems to be some kind of a flush
> request, but again, I don't see any documentation on that.

Use the Source, Luke.

Output filters are called when nginx needs to output something.
It may happen in various places and is expected to happen at least
once if a response has a body.  Grep for ngx_http_output_filter()
for more details.
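
Since a filter can be invoked several times for one response, it
generally has to keep state between calls and cope with buffers that
carry no data at all. The following sketch shows the idea with mock
types; acc_buf_t, acc_chain_t, acc_ctx_t and acc_body_filter() are
hypothetical names, not nginx APIs, and a real filter would keep its
context on the request and pass the chain on to the next filter.

```c
#include <assert.h>
#include <stddef.h>

/* Mock buffer: a data window plus the flags seen in this thread. */
typedef struct {
    unsigned char *pos, *last;
    unsigned       sync:1, last_buf:1, last_in_chain:1;
} acc_buf_t;

typedef struct acc_chain_s {
    acc_buf_t          *buf;
    struct acc_chain_s *next;
} acc_chain_t;

/* State a filter would keep per request between invocations. */
typedef struct {
    size_t   seen;    /* payload bytes observed so far */
    unsigned calls;   /* number of filter invocations */
    unsigned done:1;  /* set once a final buffer was seen */
} acc_ctx_t;

/* Hypothetical body filter: accumulate across calls, tolerate
 * special buffers that only carry flags. */
static void
acc_body_filter(acc_ctx_t *ctx, const acc_chain_t *in)
{
    ctx->calls++;

    for (; in != NULL; in = in->next) {
        const acc_buf_t *b = in->buf;

        if (b->pos != NULL && b->last != NULL) {
            assert(b->last >= b->pos);
            ctx->seen += (size_t) (b->last - b->pos);
        }

        if (b->last_buf || b->last_in_chain) {
            ctx->done = 1;
        }
    }
}
```

Called first with a data buffer and then with an empty buffer that
has sync and last_in_chain set, the context ends up with the byte
count from the first call and the done flag from the second.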

Seriously, trying to code nginx modules without reading its
source code won't work.

--
Maxim Dounin
http://nginx.org/

_______________________________________________
nginx-devel mailing list
nginx-devel at nginx.org
http://mailman.nginx.org/mailman/listinfo/nginx-devel


