Buffer Reuse in Nginx

Wed Mar 7 13:41:55 UTC 2012

On Tue, Mar 6, 2012 at 5:28 PM, Alexandr Gomoliako <zzz at zzz.org.ua> wrote:
> There is another way: you can skip ngx_http_output_filter() altogether
> and use r->connection directly with c->send/c->recv and your own
> handlers for c->read/c->write.

I believe that would be a last resort for me, as it would give up the
benefits of Nginx modules and create incompatibilities. The solution
should work, though.

> This will be much easier with perl though. I didn't document it very
> well, but I'm willing to help you, if you are going to use this:
> http://zzzcpan.github.com/nginx-perl/Nginx.html#CONNECTION_TAKEOVER

Thanks for the offer, but my project is written in C/C++. Though I'll
look into it and see if I can apply this design pattern in my module.

On Tue, Mar 6, 2012 at 6:26 PM, Maxim Dounin <mdounin at mdounin.ru> wrote:
>> #define ngx_free_chain(pool, cl)
>> Unfortunately this function's name does not reflect what it does
>> exactly. From my understanding it only recycles the ngx_chain_t object
>> by placing the pointer into pool->chain. Even the ngx_buf_t object
>> that cl->buf points to is completely ignored.
>>
>>
>> ngx_chain_t* ngx_alloc_chain_link(ngx_pool_t *pool)
>> This function only allocates the ngx_chain_t object and will try to
>> reuse the chains that were freed by ngx_free_chain(). Because the buf
>> pointer in the chain is ignored it is not safe to assume that the
>> ngx_buf_t object and the buffer data can be reused.
>
> Correct, these two functions deal with chain links, and they
> completely ignore any possible content of the structures.  They
> are basically equivalent to
>
>    ngx_palloc(pool, sizeof(ngx_chain_t));
>    ngx_pfree(pool, cl);

Ok got it. So my interpretation is correct.
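
To convince myself, here is a minimal sketch of the recycling behaviour
described above. The toy_* types and functions are hypothetical
stand-ins that I made up (malloc() replaces the Nginx pool); only the
push/pop logic is meant to mirror ngx_free_chain() and
ngx_alloc_chain_link().

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical stand-ins for ngx_buf_t, ngx_chain_t and ngx_pool_t. */
typedef struct { unsigned char *pos, *last; } toy_buf_t;

typedef struct toy_chain_s  toy_chain_t;
struct toy_chain_s {
    toy_buf_t    *buf;
    toy_chain_t  *next;
};

typedef struct { toy_chain_t *chain; } toy_pool_t;  /* recycled links */

/* Like ngx_free_chain(): push the link onto pool->chain.
 * cl->buf is neither cleared nor released. */
static void
toy_free_chain(toy_pool_t *pool, toy_chain_t *cl)
{
    cl->next = pool->chain;
    pool->chain = cl;
}

/* Like ngx_alloc_chain_link(): reuse a recycled link if there is one,
 * otherwise allocate a new one. */
static toy_chain_t *
toy_alloc_chain_link(toy_pool_t *pool)
{
    toy_chain_t  *cl = pool->chain;

    if (cl) {
        pool->chain = cl->next;
        return cl;               /* cl->buf still holds the old pointer */
    }

    return malloc(sizeof(toy_chain_t));
}

/* Returns 1 if a recycled link comes back with its stale buf pointer. */
static int
toy_recycled_link_keeps_stale_buf(void)
{
    toy_pool_t    pool = { NULL };
    toy_buf_t     b = { NULL, NULL };
    toy_chain_t  *cl, *again;
    int           same;

    cl = toy_alloc_chain_link(&pool);
    assert(cl != NULL);

    cl->buf = &b;
    toy_free_chain(&pool, cl);

    again = toy_alloc_chain_link(&pool);
    same = (again == cl && again->buf == &b);

    free(again);
    return same;
}
```

The point is only that the link handed back the second time still
carries the old buf pointer, so the caller has to assign cl->buf
itself before using it.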

>> ngx_int_t ngx_output_chain(ngx_output_chain_ctx_t *ctx, ngx_chain_t *in)
>> It seems to me that most of the buffer copying magic happens in this
>> function. I tried hard to follow it but still could not fully
>> understand what the function does. As I understand this function would
>> copy the actual buffer data if the ngx_buf_t object has some specific
>> flags set as determined by ngx_output_chain_as_is(). I am not sure
>> whether I should set any flags to instruct ngx_output_chain() to copy
>> all buffer data so that I can safely reuse the buffers that I own.
>
> You shouldn't instruct it to copy anything.
>
> Instead, you should reuse your own buffers as long as they are
> freed, via ngx_chain_update_chains() and friends.  See below.

Alright.

>> typedef void* ngx_buf_tag_t
>> This mysterious tag seems to be the way for me to claim ownership to a
>> buffer by assigning it a unique pointer value. However I could find
>> almost no explanation on how to use this tag field properly. I'd like
>> to know if setting this tag would guarantee that ownership of the
>> buffers I created would never be shared with other modules?
>
> It is used by ngx_chain_update_chains() to match buffers allocated
> by your module.

Ok then this is my interpretation for the buffer tag: There is always
exactly one ngx_buf_t object that points to a particular memory area,
and it is owned by the module that set the tag field. The modules down
the output chain would change the b->pos and b->last pointers and that
change is used by the buffer owner to determine if the buffer is free
to reuse again.

>> void ngx_chain_update_chains(ngx_pool_t *p, ngx_chain_t **free,
>> ngx_chain_t **busy, ngx_chain_t **out, ngx_buf_tag_t tag)
>> I find that this function seems to be performing what I want, and it
>> seems to be called in other modules that have similar buffer reuse
>> mechanism. However I am really confused about the purpose of this
>> function and what it does exactly. From the signature it seems to be
>> determining which buffers are safe to reuse, and then reclaim the free
>> buffers into the **free chain. However on close inspection I found
>> that all it does is to move all tagged buffers at **busy and **out to
>> free, while calling ngx_free_chain() on buffer chains that do not
>> share the same tag. I don't know if the buffers freed by this function
>> are guaranteed safe to be reused, and I don't know what happens with
>> the buffers that have different tags.
>
> Buffers are only moved to **free if they are indeed free, i.e. when
> ngx_buf_size(cl->buf) == 0.  Buffers from **free will be then
> reused either with ngx_chain_get_free_buf() or with your own code.

Ah, now I realize I missed that line of code. So what you mean is that
it is also possible to manually determine whether a buffer is free for
reuse by checking if ngx_buf_size(b) is 0?
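
As a sanity check on that idea, here is a small sketch of the
in-memory case only, where the size is simply b->last - b->pos. The
toy_* names are hypothetical and file-backed buffers are left out
entirely; I am assuming that whoever sends the data advances b->pos as
it goes.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for the in-memory part of ngx_buf_t. */
typedef struct {
    unsigned char  *start, *end;   /* bounds of the memory region */
    unsigned char  *pos, *last;    /* unsent data is [pos, last) */
} toy_buf_t;

/* In-memory case of ngx_buf_size(): bytes still waiting to be sent. */
static ptrdiff_t
toy_buf_size(const toy_buf_t *b)
{
    return b->last - b->pos;
}

/* Pretend a downstream writer sent up to n bytes of the buffer. */
static ptrdiff_t
toy_consume(toy_buf_t *b, ptrdiff_t n)
{
    ptrdiff_t  left = toy_buf_size(b);

    assert(n >= 0);

    if (n > left) {
        n = left;
    }

    b->pos += n;
    return n;
}

/* Returns 1 if the buffer is busy after a partial send and free
 * (size == 0) once everything has been consumed. */
static int
toy_demo_partial_then_full(void)
{
    unsigned char  mem[8];
    toy_buf_t      b = { mem, mem + sizeof(mem), mem, mem + 5 };
    int            busy_after_partial, free_after_full;

    toy_consume(&b, 3);                      /* 2 bytes still unsent */
    busy_after_partial = (toy_buf_size(&b) != 0);

    toy_consume(&b, 10);                     /* clamps to the 2 left */
    free_after_full = (toy_buf_size(&b) == 0);

    return busy_after_partial && free_after_full;
}
```

Note that a buffer that was never filled also has size 0, so this
check only tells me something about buffers I have already passed
down the chain.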

To make sure I have the idea right, here is my interpretation of this
function:

void
ngx_chain_update_chains(ngx_pool_t *p, ngx_chain_t **free, ngx_chain_t **busy,
    ngx_chain_t **out, ngx_buf_tag_t tag)
{
    ngx_chain_t  *cl;

    // Append *out to the end of *busy chain, and empty the *out chain
    if (*busy == NULL) {
        *busy = *out;

    } else {
        for (cl = *busy; cl->next; cl = cl->next) { /* void */ }

        cl->next = *out;
    }

    *out = NULL;

    // For each buffer in the busy chain
    while (*busy) {
        cl = *busy;

        // If the size is not zero, then the buffer is still being used.
        // We assume that the rest of buffers in the chain are still
        // in use as well, so we break and return.
        if (ngx_buf_size(cl->buf) != 0) {
            break;
        }

        // If the buffer is not owned by current module, proceed to
        // the next buffer and free the chain link while ignoring the
        // buffer pointer.
        if (cl->buf->tag != tag) {
            *busy = cl->next;
            ngx_free_chain(p, cl);
            continue;
        }

        // else, the buffer is owned by the current module
        // and the buffer is free for reuse.
        // reset the pos and last pointer.
        cl->buf->pos = cl->buf->start;
        cl->buf->last = cl->buf->start;

        // Put the chain link together with the buffer into the *free chain.
        // The buffer pointer of this *free chain link is guaranteed to
        // be valid and its memory region is safe for reuse.
        *busy = cl->next;
        cl->next = *free;
        *free = cl;
    }
}
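
To see what that loop does to a concrete chain, I wrote the following
sketch. It follows the same control flow but uses hypothetical toy_*
types, has no pool (foreign links are merely unlinked rather than
passed to ngx_free_chain()), and only handles in-memory buffers.

```c
#include <assert.h>
#include <stddef.h>

typedef struct {
    unsigned char  *start, *pos, *last;
    void           *tag;
} toy_buf_t;

typedef struct toy_chain_s  toy_chain_t;
struct toy_chain_s { toy_buf_t *buf; toy_chain_t *next; };

static void
toy_update_chains(toy_chain_t **free_list, toy_chain_t **busy,
    toy_chain_t **out, void *tag)
{
    toy_chain_t  *cl;

    assert(free_list != NULL && busy != NULL && out != NULL);

    /* Append *out to *busy and empty *out. */
    if (*busy == NULL) {
        *busy = *out;
    } else {
        for (cl = *busy; cl->next; cl = cl->next) { /* void */ }
        cl->next = *out;
    }
    *out = NULL;

    while (*busy) {
        cl = *busy;

        if (cl->buf->last != cl->buf->pos) {
            break;                    /* unsent data: stop here */
        }

        *busy = cl->next;

        if (cl->buf->tag != tag) {
            cl->next = NULL;          /* foreign buffer: drop the link */
            continue;
        }

        cl->buf->pos = cl->buf->start;
        cl->buf->last = cl->buf->start;
        cl->next = *free_list;
        *free_list = cl;
    }
}

/* Chain: A (mine, sent) -> B (foreign, sent) -> C (mine, 2 bytes left)
 * -> D (mine, sent). Returns 1 if only A ends up free and C, D stay busy. */
static int
toy_demo_update_chains(void)
{
    static int     mine, theirs;      /* addresses act as unique tags */
    unsigned char  mem[4][4];
    toy_buf_t      a = { mem[0], mem[0] + 4, mem[0] + 4, &mine };
    toy_buf_t      b = { mem[1], mem[1] + 4, mem[1] + 4, &theirs };
    toy_buf_t      c = { mem[2], mem[2] + 2, mem[2] + 4, &mine };
    toy_buf_t      d = { mem[3], mem[3] + 4, mem[3] + 4, &mine };
    toy_chain_t    ld = { &d, NULL }, lc = { &c, &ld };
    toy_chain_t    lb = { &b, &lc }, la = { &a, &lb };
    toy_chain_t   *free_list = NULL, *busy = NULL, *out = &la;

    toy_update_chains(&free_list, &busy, &out, &mine);

    return out == NULL
           && free_list == &la && la.next == NULL && a.pos == a.start
           && busy == &lc && lc.next == &ld && d.pos == mem[3] + 4;
}
```

What I take from this is that D stays in *busy even though it is
already empty, because the loop stops at the first buffer that still
has data. It would only be reclaimed by a later call, after C has
been sent.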

>> ngx_chain_t * ngx_chain_get_free_buf(ngx_pool_t *p, ngx_chain_t **free)
>> I find that this function will return the buffers freed by
>> ngx_chain_update_chains(). Most modules seem to overwrite the data
>> on the obtained buffer without any issue. That makes me wonder if
>> ngx_chain_update_chains() really works.
>
> See above.

From what I understand, ngx_chain_get_free_buf() will always return an
ngx_chain_t object with a valid pointer to an ngx_buf_t object. In
contrast, ngx_alloc_chain_link() only returns the ngx_chain_t object,
and its buf pointer must be set before any use.

Also, if the *free chain contains any chain link, that chain link will
be returned, and the attached buffer will have b->start and b->end
pointing to a memory region that is safe to write. On the other hand,
if *free is empty, ngx_chain_get_free_buf() will only allocate new
ngx_chain_t and ngx_buf_t objects and leave the b->start and b->end
pointers as NULL. As a result, the caller of ngx_chain_get_free_buf()
must check whether cl->buf->start already points to a memory region
and, if it does not, allocate that region itself and assign it to
cl->buf.
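
The caller-side check I have in mind can be sketched as follows. Again
the toy_* names are hypothetical stand-ins, with malloc()/calloc() in
place of the Nginx pool; this is my reading of the behaviour, not the
actual Nginx code.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

typedef struct { unsigned char *start, *end, *pos, *last; } toy_buf_t;

typedef struct toy_chain_s  toy_chain_t;
struct toy_chain_s { toy_buf_t *buf; toy_chain_t *next; };

/* Like ngx_chain_get_free_buf(): pop a link from the free list, or
 * allocate a new link plus a zeroed buffer with no memory attached. */
static toy_chain_t *
toy_get_free_buf(toy_chain_t **free_list)
{
    toy_chain_t  *cl;

    if (*free_list) {
        cl = *free_list;
        *free_list = cl->next;
        cl->next = NULL;
        return cl;
    }

    cl = malloc(sizeof(toy_chain_t));
    if (cl == NULL) {
        return NULL;
    }

    cl->buf = calloc(1, sizeof(toy_buf_t));   /* start == end == NULL */
    if (cl->buf == NULL) {
        free(cl);
        return NULL;
    }

    cl->next = NULL;
    return cl;
}

/* Caller side: attach memory only if the buffer has none yet. */
static int
toy_ensure_memory(toy_buf_t *b, size_t size)
{
    if (b->start == NULL) {
        b->start = malloc(size);
        if (b->start == NULL) {
            return -1;
        }
        b->end = b->start + size;
    }

    b->pos = b->last = b->start;
    return 0;
}

/* Returns 1 if a fresh buffer arrives without memory and a recycled
 * one arrives with the memory attached earlier. */
static int
toy_demo_get_free_buf(void)
{
    toy_chain_t    *free_list = NULL, *cl, *again;
    unsigned char  *mem;
    int             fresh_had_no_memory, reused_same_memory;

    cl = toy_get_free_buf(&free_list);
    assert(cl != NULL);
    fresh_had_no_memory = (cl->buf->start == NULL);

    if (toy_ensure_memory(cl->buf, 64) != 0) {
        free(cl->buf);
        free(cl);
        return -1;
    }
    mem = cl->buf->start;

    /* Hand the link back, as the update step would once it is sent. */
    cl->next = free_list;
    free_list = cl;

    again = toy_get_free_buf(&free_list);
    reused_same_memory = (again == cl && again->buf->start == mem);

    free(again->buf->start);
    free(again->buf);
    free(again);
    return fresh_had_no_memory && reused_same_memory;
}
```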

>> ngx_chain_t * ngx_connection_s::send_chain(ngx_connection_t *c,
>> ngx_chain_t *in, off_t limit)
>> The function pointer at r->connection->send_chain would return the
>> buffer chain that it has not yet sent. I also found that the returned
>> chain is then stored in r->out waiting to be sent next time. So it
>> seems like I can determine if my buffers are safe for reuse by
>> checking if the buffer chains at r->out point to the same buffer data.
>> However I am not sure if solely based on this method is really safe,
>> especially if there are filter modules that retain buffers in their
>> own context.
>
> No, this isn't a correct approach.  Don't touch r->out, use
> ngx_chain_update_chains() instead, it will do needed work for you.

Ok thanks for the warning.

>> ngx_int_t ngx_http_output_filter(ngx_http_request_t *r, ngx_chain_t *in)
>> After so many clues that I stated above, I just wish to know what is
>> really the right way to determine if buffers are safe for reuse after
>> this function, ngx_http_output_filter(), is called. Can I just set
>> buf->tag? Or should I check r->out? Or should I call
>> ngx_chain_get_free_buf()?
>
> You should really use ngx_chain_update_chains().  The basic
> approach is to do something like this (partially stolen from
> chunked filter):
>
>    cl = ngx_chain_get_free_buf(r->pool, &ctx->free);
>    if (cl == NULL) {
>        return NGX_ERROR;
>    }
>
>    if (cl->buf->start == NULL) {
>        /*
>         * allocate memory for a buffer, if you really need one
>         * with associated memory region
>         */
>
>        ...
>    }
>
>    cl->buf->tag = (ngx_buf_tag_t) &ngx_http_chunked_filter_module;
>
>    ...
>
>    rc = ngx_http_output_filter(r, out);
>
>    ngx_chain_update_chains(r->pool, &ctx->free, &ctx->busy, &out,
>                            (ngx_buf_tag_t) &ngx_http_chunked_filter_module);
>
>    ...

Thanks for the boilerplate. Now I more or less understand how to
write this part of the code.

> You may also want to add some extra processing to ensure that no
> more than specified number of buffers will be allocated, handle
> case when you can't allocate more buffers and so on.  Note that
> chunked filter doesn't do this as it doesn't really care (it just
> wants to reuse its own buffers, but doesn't cap number of them),
> you may want to take a look at gzip filter and/or upstream module
> and event pipe code for more complex examples.

I'll take note of that. Thank you so much for your explanation!
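
For my own notes, this is roughly how I imagine a cap on the number of
buffers could look. Everything here is hypothetical (the toy_* names,
the TOY_* return codes, the max_bufs field). I have not yet checked
how the gzip filter or the event pipe code actually do it, so please
treat it as a sketch of the idea only.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

enum { TOY_OK = 0, TOY_ERROR = -1, TOY_AGAIN = -2 };

typedef struct toy_node_s  toy_node_t;
struct toy_node_s { unsigned char *start; toy_node_t *next; };

typedef struct {
    toy_node_t  *free_list;   /* buffers that are safe to reuse */
    int          allocated;   /* buffers created so far */
    int          max_bufs;    /* hard cap on allocated */
    size_t       buf_size;
} toy_ctx_t;

/* Reuse a free buffer first; allocate only while under the cap;
 * otherwise ask the caller to wait until something is released. */
static int
toy_get_buf(toy_ctx_t *ctx, toy_node_t **out)
{
    toy_node_t  *n;

    assert(ctx != NULL && out != NULL);

    if (ctx->free_list) {
        n = ctx->free_list;
        ctx->free_list = n->next;
        n->next = NULL;
        *out = n;
        return TOY_OK;
    }

    if (ctx->allocated >= ctx->max_bufs) {
        return TOY_AGAIN;
    }

    n = malloc(sizeof(toy_node_t));
    if (n == NULL) {
        return TOY_ERROR;
    }

    n->start = malloc(ctx->buf_size);
    if (n->start == NULL) {
        free(n);
        return TOY_ERROR;
    }

    n->next = NULL;
    ctx->allocated++;
    *out = n;
    return TOY_OK;
}

/* Put a drained buffer back on the free list. */
static void
toy_release(toy_ctx_t *ctx, toy_node_t *n)
{
    n->next = ctx->free_list;
    ctx->free_list = n;
}

/* Returns 1 if the third request is refused at a cap of 2 and the
 * fourth succeeds by reusing a released buffer. */
static int
toy_demo_cap(void)
{
    toy_ctx_t    ctx = { NULL, 0, 2, 16 };
    toy_node_t  *a = NULL, *b = NULL, *c = NULL, *d = NULL;
    int          r1, r2, r3, r4 = TOY_ERROR, ok;

    r1 = toy_get_buf(&ctx, &a);
    r2 = toy_get_buf(&ctx, &b);
    r3 = toy_get_buf(&ctx, &c);          /* cap reached, nothing free */

    if (r1 == TOY_OK) {
        toy_release(&ctx, a);
        r4 = toy_get_buf(&ctx, &d);      /* reuses a, no new allocation */
    }

    ok = (r1 == TOY_OK && r2 == TOY_OK && r3 == TOY_AGAIN
          && r4 == TOY_OK && d == a && ctx.allocated == 2);

    if (a) { free(a->start); free(a); }
    if (b) { free(b->start); free(b); }
    return ok;
}
```

In a real module I suppose the "wait" case would mean returning and
trying again after the next ngx_chain_update_chains() call has moved
something into the free chain, but that part is a guess on my side.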

Soares