HTTP GEO Module and Memcached Module question

Evan Miller emmiller at gmail.com
Wed May 2 09:00:25 MSD 2007


Liang Jin <mywebadmin at ...> writes:

> Hi, Evan. Are you indicating that the upstream module can be used
> along with the memcached module to define a cluster of memcached
> servers? And also use weight for different upstream servers? This
> sounds good.
>
> However, the ip_hash operation seems a bit unclear. How are requests
> distributed among the upstream servers? The algorithm should be known
> to the backend script, so that the backend can make the corresponding
> cache available on the same memcached server. Otherwise, the backend
> would have to make the cache available on all the clustered memcached
> servers in order for the frontend nginx to fetch the cached contents
> on the first try.

The latter description is correct. Requests are distributed independently of
the requested URL, so you'd need to cache files in all of your memcache
servers. Having multiple memcache servers behind nginx is only useful for
distributing load or providing high availability.
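
To make the cluster idea concrete, the configuration would look something
like this (addresses and weights are made up, and I've left out the key
setup and the fallback to your backend; see the NginxMemcachedModule wiki
page mentioned below for the full set of directives):

    upstream memcaches {
        server 10.0.0.1:11211 weight=3;
        server 10.0.0.2:11211;
    }

    location / {
        memcached_pass memcaches;
    }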

>
> Also, I am wondering about customizing the hash table. Is there any way
> to manually set an MD5 algorithm based on the request URL?
>
> Another problem I am facing is the use of cookies. Backend caching is
> done according to an MD5 hash of both the request URL and the available
> cookies. However, nginx on the frontend can only fetch a cached key
> from memcached using the request URL.

You can write a module! Nginx's load-balancing code is modular, so it would
be (relatively) straightforward to write a couple-hundred-line file that adds
your hashing mechanism of choice, using both the URL and the cookies. If you
go this route, you'll want to copy/mimic the ip_hash module:

http://tinyurl.com/ywjs6m
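
To give a rough sense of the core, here is a toy sketch of picking a backend
from a hash of the URI plus the "userid" cookie value. It uses a simple
multiply-and-add hash in the spirit of ip_hash rather than MD5, the function
name and arguments are made up, and a real module would also have to wire
this into the upstream framework's init/get/free peer callbacks:

    static ngx_uint_t
    choose_peer(ngx_str_t *uri, ngx_str_t *userid, ngx_uint_t npeers)
    {
        ngx_uint_t  i, hash;

        hash = 89;

        /* fold the URI bytes into the hash */
        for (i = 0; i < uri->len; i++) {
            hash = hash * 113 + uri->data[i];
        }

        /* then fold in the cookie value */
        for (i = 0; i < userid->len; i++) {
            hash = hash * 113 + userid->data[i];
        }

        return hash % npeers;
    }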

Also take a look at the ngx_http_parse_multi_header_lines function... FYI, this
code will get you access to the "userid" cookie:

   ngx_str_t  cookie_name = ngx_string("userid");
   ngx_str_t  my_userid;

   /* scan the request's Cookie headers for "userid" */
   ngx_http_parse_multi_header_lines(&r->headers_in.cookies, &cookie_name,
                                     &my_userid);

(Ran into a similar problem myself.)

I think ideally, there'd be a high-level configuration syntax for specifying a
hash algorithm and the thing to be hashed, e.g.

    upstream_hash md5 $url$cookie[userid];
    upstream_hash crc32 $ip;

That'd be sweet!

> I would think it is more flexible if nginx could get a key
> (corresponding to the request URL, etc.) from memcached and assign the
> content to an nginx variable. This way the variable could be set by the
> backend to a Unix timestamp, or anything else related to the cached
> content, and additional handling could then be applied to the variable
> (filtering, pattern matching, etc.).

Not a bad idea. I'd love it if memcache had a better way of storing metadata.
The problem now is that if you put metadata in a separate key, it could end
up on a separate server (making things difficult from nginx's perspective),
or the two keys could expire at different times, so the metadata either
vanishes while the main key is still valid or lingers after the main key is
gone and just wastes RAM. The other solution is to pack the metadata into
the value itself; for example, for one application I stuffed a timestamp
into the first four bytes of the value, and the rest was the "real" data.
But then you're on your own when it comes to encoding your metadata and
parsing it later. I guess I just wish there were a standard way to store
sub-keys and values that were guaranteed to reside on the same server as the
main key, and to expire if and only if the main key expired. Storing
metadata is a common problem, and it's silly that there's no common
solution. But that's a debate for the memcached mailing list...
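
In case the four-byte trick is useful to anyone, it amounts to something
like this (a plain-C sketch, not the exact code from that application):

    #include <stdint.h>
    #include <string.h>
    #include <time.h>

    /* Prepend a big-endian Unix timestamp to the payload before storing
     * it in memcached; "out" must have room for len + 4 bytes. */
    static size_t
    pack_value(unsigned char *out, const unsigned char *data, size_t len)
    {
        uint32_t  ts = (uint32_t) time(NULL);

        out[0] = (unsigned char) (ts >> 24);
        out[1] = (unsigned char) (ts >> 16);
        out[2] = (unsigned char) (ts >> 8);
        out[3] = (unsigned char) ts;
        memcpy(out + 4, data, len);

        return len + 4;
    }

    /* Read the timestamp back out of a value fetched from memcached;
     * the "real" data starts at value + 4. */
    static uint32_t
    unpack_timestamp(const unsigned char *value)
    {
        return ((uint32_t) value[0] << 24) | ((uint32_t) value[1] << 16)
             | ((uint32_t) value[2] << 8)  |  (uint32_t) value[3];
    }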

Evan

>
> >
> > The Memcached module already supports the failover mechanism you describe; you
> > can adjust the "memcached_next_upstream", "memcached_send_timeout", and
> > "memcached_read_timeout" to send a request to the next memcached on the list if
> > a request fails. See http://wiki.codemongers.com/NginxMemcachedModule.
> >
> > The improvement I would like to see is storing a last-modified time in the
> > stored item's flag. Although the docs say the flag only stores 16 bits, it
> > actually has room for 32 bits:
> >
> > http://lists.danga.com/pipermail/memcached/2007-April/004071.html
> >
> > That size is perfect for a Unix timestamp. This would let you return a 304 Not
> > Modified response from items in the memcache.
>
> Effectively, nginx would grab this information as the expiration
> information for the cached contents, right?
>

>
> >
> > Evan
>
> -Liang
>
>