various body filter questions
Maxim Dounin
mdounin at mdounin.ru
Thu Jul 19 13:52:24 UTC 2012
Hello!
On Thu, Jul 19, 2012 at 10:48:59AM +0100, Robert Lemmen wrote:
> hi everyone,
>
> I need to run an nginx with a custom body filter, for the sake of
> argument let's say I want to replace "robert" with "bob" in every file
> that ends with ".txt". I played around a bit and looked at the various
> tutorials and documentation on the net. I have managed to get it
> basically working, but am looking for your input to get it "correct":
>
> a) I create a module context and store it via ngx_http_set_ctx(). how do
> I clean it up? or do I not need to?
Usually you don't need to. If you really want to for some reason,
you may call ngx_http_set_ctx() with NULL.
> does everything allocated with
> ngx_pcalloc(r->pool, ...) get freed when the request is sorted?
Yes. Everything allocated from a pool is freed when the pool is
destroyed.
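To illustrate the idea only: below is a toy pool, not nginx's
implementation (nginx carves allocations out of larger blocks). The
sketch_* names are hypothetical stand-ins. Each allocation is linked
into the pool, so destroying the pool releases all of them at once and
the caller never frees individual objects.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Header placed in front of every allocation; the union keeps the
 * payload that follows it suitably aligned for any object type. */
typedef union sketch_block_u {
    union sketch_block_u *next;
    max_align_t           align;
} sketch_block_t;

typedef struct {
    sketch_block_t *blocks;  /* every allocation made from this pool */
    size_t          live;    /* number of allocations not yet freed */
} sketch_pool_t;

/* Zeroed allocation that is owned by the pool. */
static void *sketch_pcalloc(sketch_pool_t *pool, size_t size)
{
    sketch_block_t *blk;

    assert(pool != NULL);

    blk = calloc(1, sizeof(sketch_block_t) + size);
    if (blk == NULL) {
        return NULL;
    }

    blk->next = pool->blocks;
    pool->blocks = blk;
    pool->live++;

    return (void *) (blk + 1);
}

/* Frees everything that was ever allocated from the pool. */
static void sketch_destroy_pool(sketch_pool_t *pool)
{
    sketch_block_t *blk, *next;

    for (blk = pool->blocks; blk != NULL; blk = next) {
        next = blk->next;
        free(blk);
        pool->live--;
    }

    pool->blocks = NULL;
}
```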
> b) I would like to handle HEAD requests as well, but I can't just do it
> with a filter. In the header filter I don't have the actual data in the
> file, so I can't do the substitutions, so I don't know what the
> content-length will be. what to do? I was thinking doing a GET
The usual approach is to just remove the Content-Length header
from the response. Otherwise, even with GET, you can't forecast
the response length before you've finished processing the full
response, which in turn means your module just won't work with
anything but small responses.
[...]
> c) in the body filter, I want to read the buffer chain and modify it or
> create a new one. but I don't get all the details of the buffer chain.
> assuming "ngx_chain_t *in" as a parameter of the filter:
> - if in->next != 0 (and so on), does that mean these buffers are parts
> of the actual file? so e.g. in->buf could hold the first half of the
> file and in->next->buf the second half?
Not file, response. Otherwise yes, there may be more than one
buffer in a chain passed to a body filter. Moreover, there may be
more than one call to a body filter. You have to handle all
buffers in the input chain(s) passed to your body filter.
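A rough sketch of what that means in practice, using simplified
stand-in types rather than the real ngx_buf_t / ngx_chain_t (all
sketch_* names are hypothetical). The filter walks every link of the
chain it is given and keeps its state in a per-request context, because
it may be called again with the next part of the response.

```c
#include <assert.h>
#include <stddef.h>

/* Stand-in for ngx_buf_t, reduced to the two fields used here. */
typedef struct {
    unsigned char *pos;   /* first byte of data */
    unsigned char *last;  /* one past the last byte of data */
} sketch_buf_t;

/* Stand-in for ngx_chain_t. */
typedef struct sketch_chain_s {
    sketch_buf_t          *buf;
    struct sketch_chain_s *next;
} sketch_chain_t;

/* Hypothetical per-request context that survives between calls. */
typedef struct {
    size_t seen;  /* bytes seen so far, across all calls */
} sketch_ctx_t;

/* Called once per chain; may run many times for one response. */
static void sketch_body_filter(sketch_ctx_t *ctx, sketch_chain_t *in)
{
    sketch_chain_t *cl;

    assert(ctx != NULL);

    for (cl = in; cl != NULL; cl = cl->next) {
        ctx->seen += (size_t) (cl->buf->last - cl->buf->pos);
    }
}
```

With the "robert" example, this is why a match may straddle two buffers
or even two calls: the context has to remember any partial match.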
> - what are the semantics of the different fields in in->buf? what's the
> difference between pos/last and start/end? what do the different flags
> (temporary, memory, mmap, recycled, in_file, flush, sync) mean?
pos/last are the bounds of the memory in use in a buffer;
start/end are the bounds of the allocated memory region. The
various flags specify various buffer properties; try looking at
the code for details.
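As an illustration of the four pointers, here is a small sketch with a
stand-in struct (sketch_buf_t and the helper names are hypothetical,
only the field names follow ngx_buf_t). It assumes a plain in-memory
buffer.

```c
#include <assert.h>
#include <stddef.h>

/* Stand-in for the memory-related fields of ngx_buf_t. */
typedef struct {
    unsigned char *start;  /* beginning of the allocated region */
    unsigned char *end;    /* one past the end of the allocated region */
    unsigned char *pos;    /* first byte of data still to be processed */
    unsigned char *last;   /* one past the last byte of data */
} sketch_buf_t;

/* Bytes of data currently held in the buffer. */
static size_t sketch_buf_data_size(const sketch_buf_t *b)
{
    assert(b->pos <= b->last);
    return (size_t) (b->last - b->pos);
}

/* Total size of the memory region behind the buffer. */
static size_t sketch_buf_capacity(const sketch_buf_t *b)
{
    return (size_t) (b->end - b->start);
}

/* Room left for appending more data after last. */
static size_t sketch_buf_free_space(const sketch_buf_t *b)
{
    return (size_t) (b->end - b->last);
}
```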
> - what's the difference between last_buf, last_in_chain? is it not the
> same as in->next == 0?
The last_in_chain flag marks the last buffer of a subrequest
response, last_buf the last buffer of a main request response.
Neither is the same as in->next == NULL, as the latter means
nothing, see above.
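A minimal sketch of the distinction, again with hypothetical stand-in
types (only the flag name is taken from ngx_buf_t). The end of the
response is detected from the flag, never from the chain simply running
out.

```c
#include <assert.h>
#include <stddef.h>

/* Stand-in for ngx_buf_t, reduced to the two flags discussed. */
typedef struct {
    unsigned last_buf:1;       /* last buffer of a main request response */
    unsigned last_in_chain:1;  /* last buffer of a subrequest response */
} sketch_buf_t;

/* Stand-in for ngx_chain_t. */
typedef struct sketch_chain_s {
    sketch_buf_t          *buf;
    struct sketch_chain_s *next;
} sketch_chain_t;

/* Returns 1 if this chain carries the end of the main response. */
static int sketch_chain_has_last_buf(const sketch_chain_t *in)
{
    const sketch_chain_t *cl;

    for (cl = in; cl != NULL; cl = cl->next) {
        assert(cl->buf != NULL);

        if (cl->buf->last_buf) {
            return 1;
        }
    }

    /* next == NULL only ended this chain; more calls may follow. */
    return 0;
}
```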
> - it might be easier for me to read the buffers and create new ones,
> rather than changing the existing ones. do I need to do any sort of
> cleanup?
Yes. If you don't pass the original buffer to the next filter and
instead copy its contents, you have to mark the buffer as sent by
moving b->pos to b->last. Failure to do so will result in a hang
on large responses, as eventually the underlying module will run
out of buffers and will wait for previously passed buffers to be
sent.
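Roughly, the copy-and-consume step looks like the sketch below. The
types and the function name are hypothetical stand-ins; a real filter
would allocate its output buffers from r->pool rather than take a
caller-supplied array.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Stand-in for ngx_buf_t, reduced to the two fields used here. */
typedef struct {
    unsigned char *pos;   /* first byte of data not yet consumed */
    unsigned char *last;  /* one past the last byte of data */
} sketch_buf_t;

/* Stand-in for ngx_chain_t. */
typedef struct sketch_chain_s {
    sketch_buf_t          *buf;
    struct sketch_chain_s *next;
} sketch_chain_t;

/* Copies all data from the input chain into out (which must be able
 * to hold cap bytes) and marks every input buffer as consumed.
 * Returns the number of bytes copied. */
static size_t sketch_copy_and_consume(sketch_chain_t *in,
    unsigned char *out, size_t cap)
{
    sketch_chain_t *cl;
    size_t          n, total = 0;

    for (cl = in; cl != NULL; cl = cl->next) {
        n = (size_t) (cl->buf->last - cl->buf->pos);
        assert(n <= cap - total);  /* caller guarantees enough room */

        memcpy(out + total, cl->buf->pos, n);
        total += n;

        /* Mark the original buffer as sent so its owner can reuse it. */
        cl->buf->pos = cl->buf->last;
    }

    return total;
}
```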
> - are the buffers always complete? how would I know? I assume that if I
> download a 10GB file nginx doesn't just copy it into memory...
A single buffer chain passed to a body filter doesn't necessarily
represent the full response.
The number and size of buffers used to copy files to memory (once
one of the filters needs it) may be controlled with the
output_buffers directive, see http://nginx.org/r/output_buffers.
> - how does sendfile work in nginx? if I do a body filter I of course
> need to make sure sendfile is never used.
As long as the buffers don't carry file-based content, sendfile is
not used. You shouldn't care about this as long as you correctly
modify the buffer chain(s).
Maxim Dounin