various body filter questions

Maxim Dounin mdounin at mdounin.ru
Thu Jul 19 13:52:24 UTC 2012


Hello!

On Thu, Jul 19, 2012 at 10:48:59AM +0100, Robert Lemmen wrote:

> hi everyone,
> 
> I need to run an nginx with a custom body filter, for the sake of
> argument let's say I want to replace "robert" with "bob" in every file
> that ends with ".txt". I played around a bit and looked at the various
> tutorials and documentation on the net. I have managed to get it
> basically working, but am looking for your input to get it "correct":
> 
> a) I create a module context and store it via ngx_http_set_ctx(). how do
> I clean it up? or do I not need to?

Usually you don't need to.  If you really want to for some reason, 
you may call ngx_http_set_ctx() with NULL.

> does everything allocated with
> ngx_pcalloc(r->pool, ...) get freed when the request is sorted?

Yes.  Everything allocated from a pool is freed when the pool is 
destroyed.
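
To illustrate the idea only, here is a minimal standalone sketch of 
pool-style allocation.  The demo_* names are hypothetical stand-ins, 
not the real ngx_pool_t or ngx_pcalloc(); the point is just that 
individual allocations are never freed one by one, only the pool is.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical stand-in for a memory pool; NOT the real ngx_pool_t. */
typedef union demo_block_u {
    union demo_block_u *next;    /* link to the previous allocation */
    max_align_t         align;   /* keeps the payload suitably aligned */
} demo_block_t;

typedef struct {
    demo_block_t *blocks;        /* every allocation made from this pool */
    size_t        count;         /* number of live allocations */
} demo_pool_t;

/* Zeroed allocation tied to the pool, in the spirit of ngx_pcalloc(). */
void *demo_pcalloc(demo_pool_t *pool, size_t size)
{
    demo_block_t *blk = calloc(1, sizeof(demo_block_t) + size);

    if (blk == NULL) {
        return NULL;
    }

    blk->next = pool->blocks;
    pool->blocks = blk;
    pool->count++;

    return blk + 1;              /* payload starts right after the header */
}

/* Destroying the pool releases everything; returns blocks freed. */
size_t demo_destroy_pool(demo_pool_t *pool)
{
    size_t freed = 0;

    while (pool->blocks != NULL) {
        demo_block_t *next = pool->blocks->next;
        free(pool->blocks);
        pool->blocks = next;
        freed++;
    }

    assert(freed == pool->count);
    pool->count = 0;

    return freed;
}
```

In this model a module context allocated with demo_pcalloc() needs no 
cleanup of its own; it goes away with the pool.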

> b) I would like to handle HEAD requests as well, but I can't just do it
> with a filter. In the header filter I don't have the actual data in the
> file, so I can't do the substitutions, so I don't know what the
> content-length will be. what to do? I was thinking doing a GET

The usual approach is to just remove the Content-Length header from 
the response.  Even with GET you can't know the response length 
before you've finished processing the full response, so a module 
that tries to set it has to buffer everything first and just won't 
work with anything but small responses.
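
As a rough standalone sketch of what "removing Content-Length" amounts 
to: the struct below is a hypothetical stand-in for the response 
headers, not the real r->headers_out.  In nginx itself the 
ngx_http_clear_content_length(r) macro does this job and is typically 
called from the header filter.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical, simplified model of the output headers of a request. */
typedef struct {
    long long   content_length_n;   /* -1 means "length unknown" */
    const char *content_length;     /* header value, NULL if not sent */
} demo_headers_out_t;

/* Forget the length: the body size will change while filtering. */
void demo_clear_content_length(demo_headers_out_t *h)
{
    assert(h != NULL);

    h->content_length_n = -1;
    h->content_length = NULL;
}
```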

[...]

> c) in the body filter, I want to read the buffer chain and modify it or
> create a new one. but I don't get all the details of the buffer chain.
> assuming "ngx_chain_t *in" as a parameter of the filter:
> - if in->next != 0 (and so on), does that mean these buffers are parts
>   of the actual file? so e.g. in->buf could hold the first half of the
>   file and in->next->buf the second half?

Not file, response.  Otherwise yes, there may be more than one 
buffer in a chain passed to a body filter.  Moreover, a body filter 
may be called more than once for a single response.  You have to 
handle all buffers in the input chain(s) passed to your body filter.
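
A minimal standalone sketch of that pattern follows.  The demo_* 
types are simplified stand-ins for ngx_buf_t and ngx_chain_t (only 
the fields used here), and demo_ctx_t is a hypothetical per-request 
context that survives between calls.

```c
#include <assert.h>
#include <stddef.h>

/* Simplified stand-ins; the real nginx types have many more fields. */
typedef struct {
    unsigned char *pos;     /* first unprocessed byte */
    unsigned char *last;    /* one past the last byte of data */
} demo_buf_t;

typedef struct demo_chain_s {
    demo_buf_t          *buf;
    struct demo_chain_s *next;
} demo_chain_t;

/* Hypothetical context kept across body filter calls. */
typedef struct {
    size_t calls;           /* how many times the filter ran */
    size_t bytes_seen;      /* total response bytes seen so far */
} demo_ctx_t;

/* One "body filter" call: visit every buffer of the chain passed in. */
void demo_body_filter(demo_ctx_t *ctx, demo_chain_t *in)
{
    demo_chain_t *cl;

    ctx->calls++;

    for (cl = in; cl != NULL; cl = cl->next) {
        assert(cl->buf->pos <= cl->buf->last);
        ctx->bytes_seen += (size_t) (cl->buf->last - cl->buf->pos);
    }
}
```

Anything that must span buffers (a partial match of "robert" at a 
buffer boundary, for example) has to live in the context, since the 
rest of the response may only arrive in a later call.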

> - what are the semantics of the different fields in in->buf? what's the
>   difference between pos/last and start/end? what do the different flags
>   (temporary, memory, mmap, recycled, in_file, flush, sync) mean?

The pos/last pair bounds the memory actually used in a buffer, 
start/end bounds the memory region allocated.  The flags specify 
various buffer properties, try looking at the code for details.
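
The relation between the four pointers can be sketched standalone as 
below.  demo_buf_t is a hypothetical stand-in holding only these 
fields; the helper names are made up for illustration.

```c
#include <assert.h>
#include <stddef.h>

/* Simplified stand-in for an in-memory buffer. */
typedef struct {
    unsigned char *start;   /* beginning of the allocated region */
    unsigned char *end;     /* one past the end of the allocated region */
    unsigned char *pos;     /* beginning of the data in use */
    unsigned char *last;    /* one past the end of the data in use */
} demo_buf_t;

/* Bytes of data currently held: the pos/last window. */
size_t demo_buf_data_size(const demo_buf_t *b)
{
    assert(b->start <= b->pos && b->pos <= b->last && b->last <= b->end);
    return (size_t) (b->last - b->pos);
}

/* Total bytes allocated for the buffer: the start/end region. */
size_t demo_buf_capacity(const demo_buf_t *b)
{
    return (size_t) (b->end - b->start);
}

/* Room left for appending more data after last. */
size_t demo_buf_free_space(const demo_buf_t *b)
{
    return (size_t) (b->end - b->last);
}
```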

> - what's the difference between last_buf, last_in_chain? is it not the
>   same as in->next == 0?

The last_in_chain flag marks the last buffer of a subrequest 
response, last_buf - the last buffer of a main request response.  
It's not the same as in->next == NULL: the latter only ends the 
chain passed in one call and says nothing about the end of the 
response, see above.
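
A small standalone sketch of the difference, using hypothetical 
stand-in types that carry only the two flags: the end of the response 
is detected from the flag, never from the chain running out.

```c
#include <assert.h>
#include <stddef.h>

/* Simplified stand-in buffer: only the two flags discussed here. */
typedef struct {
    unsigned last_buf:1;        /* last buffer of a main request response */
    unsigned last_in_chain:1;   /* last buffer of a subrequest response */
} demo_buf_t;

typedef struct demo_chain_s {
    demo_buf_t          *buf;
    struct demo_chain_s *next;
} demo_chain_t;

/* Returns 1 if this chain carries the final buffer of the response. */
int demo_response_finished(const demo_chain_t *in)
{
    const demo_chain_t *cl;

    for (cl = in; cl != NULL; cl = cl->next) {
        assert(cl->buf != NULL);

        if (cl->buf->last_buf) {
            return 1;
        }
    }

    /* The chain ended (next == NULL), but more calls may follow. */
    return 0;
}
```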

> - it might be easier for me to read the buffers and create new ones,
>   rather than changing the existing ones. do I need to do any sort of
>   cleanup? 

Yes.  If you don't pass the original buffer to the next filter and 
instead copy its contents, you have to mark the buffer as sent by 
moving b->pos to b->last.  Failure to do so will result in a hang 
on large responses, as eventually the underlying module will run out 
of buffers and will wait for previously passed buffers to be sent.
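
A minimal standalone sketch of "copy, then mark as sent".  demo_buf_t 
and demo_copy_and_consume() are hypothetical; a real filter would 
allocate the destination from the request pool and link it into a new 
chain, which is left out here.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Simplified stand-in for an in-memory input buffer. */
typedef struct {
    unsigned char *pos;     /* first byte not yet consumed */
    unsigned char *last;    /* one past the last byte of data */
} demo_buf_t;

/*
 * Copy the unconsumed bytes of src into dst (at most dst_cap bytes)
 * and advance src->pos past what was copied.  Returns bytes copied.
 */
size_t demo_copy_and_consume(demo_buf_t *src, unsigned char *dst,
    size_t dst_cap)
{
    size_t n;

    assert(src->pos <= src->last);

    n = (size_t) (src->last - src->pos);
    if (n > dst_cap) {
        n = dst_cap;
    }

    memcpy(dst, src->pos, n);

    /* When everything fit, pos now equals last: the buffer is "sent". */
    src->pos += n;

    return n;
}
```

Once pos == last the owner of the original buffer can reuse it, which 
is what keeps large responses flowing.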

> - are the buffers always complete? how would I know? I assume that if I
>   download a 10GB file nginx doesn't just copy it into memory...

A single buffer chain passed to a body filter doesn't necessarily 
represent the full response.

The number and size of buffers used to copy files to memory (once 
one of the filters needs it) may be controlled with the 
output_buffers directive, see http://nginx.org/r/output_buffers.

> - how does sendfile work in nginx? if I do a body filter I of course
>   need to make sure sendfile is never used.

As long as buffers don't carry file-based content, sendfile is not 
used.  You don't need to care about this as long as you correctly 
modify the buffer chain(s).

Maxim Dounin



More information about the nginx-devel mailing list