Proposal: new caching backend for nginx

Aliaksandr Valialkin valyala at gmail.com
Wed Jan 23 17:27:42 UTC 2013


On Wed, Jan 23, 2013 at 5:47 AM, Maxim Dounin <mdounin at mdounin.ru> wrote:


> I don't think that it will fit as a cache store for nginx.  In
> particular, with quick look through sources I don't see any
> interface to store data with size not known in advance, which
> happens often in HTTP world.


Yes, ybc doesn't allow storing data whose size is not known in advance, for
performance and architectural reasons. There are several workarounds for
this problem:
- objects with unknown sizes may be streamed into a temporary location
before storing them in ybc.
- objects with unknown sizes may be cached in ybc using fixed-size chunks,
except for the last chunk, which may be smaller. Here is the
pseudo-code:
store_stream_to_ybc(stream, key, chunk_size) {
  for (n = 0; ; n++) {
    key_for_chunk = get_key_for_chunk(key, n);
    chunk_txn = start_ybc_set_txn(key_for_chunk, chunk_size);
    bytes_copied = copy_stream_to_set_txn(stream, chunk_txn);
    if (bytes_copied == -1) {
      // an error occurred while copying data to chunk_txn.
      rollback_set_txn(chunk_txn);
      return ERROR;
    }
    if (bytes_copied < chunk_size) {
      // the last chunk is reached. Copy it again, since we know its size now.
      last_chunk_txn = start_ybc_set_txn(key_for_chunk, bytes_copied);
      copy_data(last_chunk_txn, chunk_txn, bytes_copied);
      rollback_set_txn(chunk_txn);
      commit_set_txn(last_chunk_txn);
      return SUCCESS;
    }

    // there is more data in the stream.
    commit_set_txn(chunk_txn);
  }
}
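
To show how the chunks written by the pseudo-code above could be read back,
here is a rough C sketch. It does not use the ybc API: toy_set() and
toy_find() are a made-up in-memory store, the "key#n" chunk naming and the
MAX_* limits are assumptions, and the "stream" is simulated by a memory
buffer. The point is only the termination rule: the reader stops at the
first chunk shorter than chunk_size, and treats a missing chunk as a miss
for the whole object.

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>
#include <string.h>

/* Toy in-memory store standing in for the cache. Hypothetical, NOT ybc. */
enum { MAX_ITEMS = 64, MAX_KEY = 64, MAX_CHUNK = 16 };

struct item {
    char key[MAX_KEY];
    size_t size;
    unsigned char data[MAX_CHUNK];
    int used;
};

struct item store[MAX_ITEMS];

struct item *toy_find(const char *key) {
    for (size_t i = 0; i < MAX_ITEMS; i++)
        if (store[i].used && strcmp(store[i].key, key) == 0)
            return &store[i];
    return NULL;
}

int toy_set(const char *key, const void *data, size_t size) {
    if (size > MAX_CHUNK || strlen(key) >= MAX_KEY)
        return -1;
    struct item *it = toy_find(key);           /* overwrite if present */
    for (size_t i = 0; it == NULL && i < MAX_ITEMS; i++)
        if (!store[i].used)
            it = &store[i];                    /* else take a free slot */
    if (it == NULL)
        return -1;
    strcpy(it->key, key);
    if (size > 0)
        memcpy(it->data, data, size);
    it->size = size;
    it->used = 1;
    return 0;
}

/* Writer: full chunks, then one short (possibly empty) last chunk. */
int store_chunked(const char *key, const unsigned char *src, size_t len,
                  size_t chunk_size) {
    char chunk_key[MAX_KEY];
    size_t off = 0;
    if (chunk_size == 0 || chunk_size > MAX_CHUNK)
        return -1;
    for (unsigned n = 0; ; n++) {
        size_t part = len - off < chunk_size ? len - off : chunk_size;
        int w = snprintf(chunk_key, sizeof chunk_key, "%s#%u", key, n);
        if (w < 0 || (size_t)w >= sizeof chunk_key)
            return -1;
        if (toy_set(chunk_key, src + off, part) != 0)
            return -1;
        off += part;
        if (part < chunk_size)
            return 0;                          /* short chunk ends the object */
    }
}

/* Reader: returns the object size, or -1 on a miss or if dst is too small. */
long load_chunked(const char *key, unsigned char *dst, size_t cap,
                  size_t chunk_size) {
    char chunk_key[MAX_KEY];
    size_t total = 0;
    if (chunk_size == 0)
        return -1;
    for (unsigned n = 0; ; n++) {
        int w = snprintf(chunk_key, sizeof chunk_key, "%s#%u", key, n);
        if (w < 0 || (size_t)w >= sizeof chunk_key)
            return -1;
        const struct item *it = toy_find(chunk_key);
        if (it == NULL)
            return -1;                         /* one chunk gone: whole miss */
        assert(it->size <= MAX_CHUNK);
        if (it->size > cap - total)
            return -1;
        memcpy(dst + total, it->data, it->size);
        total += it->size;
        if (it->size < chunk_size)
            return (long)total;                /* last chunk reached */
    }
}
```

Note that with this scheme an object whose size is an exact multiple of
chunk_size ends with an empty chunk, which is what tells the reader where
the object stops.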



> Additionally, it looks like it
> doesn't provide async disk IO support.
>

Ybc works with memory-mapped files. It doesn't use disk I/O directly. Disk
I/O may be triggered if the given memory page is missing from RAM. It's
possible to determine whether a given virtual memory location is resident
in RAM or not: OSes provide special syscalls designed for this case, for
example, mincore(2) on Linux. But I think it's better to rely on the caching
mechanisms provided by the OS for memory-mapped files than to use such
syscalls directly. Ybc may block an nginx worker when reading paged-out
memory, but this should be a rare event if frequently accessed cached
objects fit in RAM.
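
For reference, a residency check with mincore(2) could look roughly like
the C sketch below (Linux only). range_is_resident() is a made-up helper
name, not part of ybc or nginx. It rounds the address down to a page
boundary, since mincore() requires a page-aligned start, and tests the low
bit of each byte in the result vector.

```c
#define _DEFAULT_SOURCE   /* exposes mincore() and MAP_ANONYMOUS in glibc */
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>
#include <sys/mman.h>
#include <unistd.h>

/*
 * Hypothetical helper: returns 1 if every page covering [addr, addr + len)
 * is resident in RAM, 0 if at least one page is not, -1 on error.
 */
int range_is_resident(const void *addr, size_t len) {
    long page = sysconf(_SC_PAGESIZE);
    if (page <= 0 || len == 0)
        return -1;

    /* mincore() needs a page-aligned start address. */
    uintptr_t start = (uintptr_t)addr & ~((uintptr_t)page - 1);
    size_t span = (size_t)((uintptr_t)addr + len - start);
    size_t pages = (span + (size_t)page - 1) / (size_t)page;
    assert(pages > 0);

    unsigned char *vec = malloc(pages);   /* one status byte per page */
    if (vec == NULL)
        return -1;

    int resident = -1;
    if (mincore((void *)start, span, vec) == 0) {
        resident = 1;
        for (size_t i = 0; i < pages; i++) {
            if (!(vec[i] & 1)) {          /* low bit set means resident */
                resident = 0;
                break;
            }
        }
    }
    free(vec);
    return resident;
}
```

The answer is only a snapshot: a page reported as resident may be evicted
before it is actually read, so such a check cannot fully rule out blocking.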

Also, as I understand from http://www.aosabook.org/en/nginx.html , nginx
itself may currently block on disk I/O too:

> One major problem that the developers of nginx will be solving in
> upcoming versions is how to avoid most of the blocking on disk I/O. At
> the moment, if there's not enough storage performance to serve disk
> operations generated by a particular worker, that worker may still
> block on reading/writing from disk.

-- 
Best Regards,

Aliaksandr