consistent hashing using split_clients

Maxim Dounin mdounin at mdounin.ru
Thu Nov 1 09:58:21 UTC 2012


Hello!

On Wed, Oct 31, 2012 at 12:46:09PM -0400, rmalayter wrote:

> Maxim Dounin Wrote:
> > 
> > Percentage values are stored in fixed point with 2 digits after 
> > the point.  Configuration parsing will complain if you try to 
> > specify more digits after the point.
> > 
> > > How many "buckets" does the hash table for split_clients
> > > have (it doesn't seem to be configurable)?
> > 
> > The split_clients algorithm doesn't use buckets, as it's not a 
> > hash table.  Instead, it calculates a hash function of the 
> > original value and selects the resulting value based on the hash 
> > function result.  See http://nginx.org/r/split_clients for 
> > details.
> > 
> 
> So clearly I am down the wrong path here, and split_clients just cannot do
> what I need. I will have to rethink things.
> 
> The 3rd-party ngx_http_consistent_hash module appears to be unmaintained
> and uncommented. It also uses binary search to find an upstream instead of a
> hash table, making it O(log(n)) for each request. My C skills haven't been
> used in anger since about 1997, so updating or maintaining it myself would
> probably be a fruitless exercise.

You may also try the memcached_hash module by Tomash Brechko, as 
available at http://openhack.ru/nginx-patched/wiki/MemcachedHash.  

It features Ketama consistent hashing compatible with 
Cache::Memcached::Fast (memcached client module from the same 
author).  Unfortunately, it's more or less unmaintained too, but I 
think I have patches to bring it up to nginx 0.8.50 at least, and 
it should be trivial to merge it with more recent versions.
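
For readers unfamiliar with the technique, here is a minimal sketch of 
a Ketama-style hash ring.  It is a generic illustration in Python, not 
the memcached_hash module's code, and it is not compatible with 
Cache::Memcached::Fast: the MD5-based point() helper, the Ring class, 
and the default of 160 points per server are all assumptions chosen 
for the example.  Lookup is a binary search over the sorted points, 
that is, the O(log(n)) approach mentioned in the quoted message.

```python
import bisect
import hashlib


def point(label):
    """Map a string to a 32-bit position on the ring (MD5 is an assumption)."""
    digest = hashlib.md5(label.encode("utf-8")).digest()
    return int.from_bytes(digest[:4], "big")


class Ring:
    """Hypothetical consistent-hash ring with several points per server."""

    def __init__(self, servers, points_per_server=160):
        # Each server contributes many points so load spreads evenly.
        self._ring = sorted(
            (point("%s-%d" % (server, i)), server)
            for server in servers
            for i in range(points_per_server)
        )
        self._points = [p for p, _ in self._ring]

    def lookup(self, key):
        """Return the server owning the first point at or after hash(key)."""
        if not self._ring:
            raise ValueError("ring has no servers")
        i = bisect.bisect_left(self._points, point(key))
        if i == len(self._points):
            i = 0  # wrap around to the start of the ring
        return self._ring[i][1]
```

With this layout, removing a server only remaps the keys that server 
owned; keys owned by the remaining servers keep their mapping.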

> Perhaps I will have to fall back to using perl to get a hash bucket for the
> time being. I assume 4096 upstreams is not a problem for nginx given that it
> is used widely by CDNs.

As long as you use split_clients to actually select a hash bucket, 
I see no real difference from using embedded perl. 
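
To make the split_clients behaviour discussed above concrete, here is 
a rough Python sketch of the idea: percentages are kept in fixed point 
with 2 digits after the point, turned into cumulative upper bounds 
over a 32-bit hash space, and the first range containing the hash of 
the input wins.  This is an illustration under stated assumptions, not 
nginx's code: zlib.crc32 stands in for the hash function nginx 
actually uses, and parse_percent(), build_ranges() and select_value() 
are names made up for the example.

```python
import zlib

HASH_SPACE = 2 ** 32  # size of the 32-bit hash space


def parse_percent(text):
    """Parse e.g. '0.5%' into hundredths of a percent (fixed point)."""
    whole, _, frac = text.rstrip("%").partition(".")
    if len(frac) > 2:
        raise ValueError("at most 2 digits after the point: %r" % text)
    return int(whole or "0") * 100 + int(frac.ljust(2, "0"))


def build_ranges(parts):
    """Turn [(percent, value), ...] into cumulative exclusive upper bounds.

    A percent of '*' takes whatever remains of the hash space.
    """
    ranges = []
    total = 0
    for percent_text, value in parts:
        if percent_text == "*":
            upper = HASH_SPACE
        else:
            total += parse_percent(percent_text)
            if total > 10000:
                raise ValueError("percentages add up to more than 100%")
            upper = total * HASH_SPACE // 10000
        ranges.append((upper, value))
    return ranges


def select_value(ranges, key):
    """Hash the key (crc32 is a stand-in) and return the matching value."""
    h = zlib.crc32(key.encode("utf-8")) & 0xFFFFFFFF
    for upper, value in ranges:
        if h < upper:
            return value
    return ""  # no range matched
```

Note that there is no table of buckets here, only a short list of 
ranges that is scanned in order, and that changing any percentage 
moves every boundary after it.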

> hashing module using MurmurHash3:
> http://forum.nginx.org/read.php?29,212712,212739#msg-212739
> 
> I suppose other work took priority. Maybe Igor has some code stashed
> somewhere that just needs testing and polishing.
> 
> If not, it seems that the current "ip_hash" scheme used in nginx could be
> easily adapted to fast consistent hashing by simply
>   -using MurmurHash3 or similar instead of the current simple
> multiply+modulo scheme
>   -allowing arbitrary nginx variables as hash input instead of just the IP
> address during upstream selection
>   -at initialization utilizing a hash table of 4096 or whatever configurable
> number of buckets
>   -fill the hash table by sorting the server array on
> murmurhash3(bucket_number + server_name + server_weight_counter) and taking
> the first server
> 
> Is there a mechanism for sponsoring development along these lines and
> getting it into the official nginx distribution? Consistent hashing is the
> one commonly-used proxy server function that nginx seems to be missing.

The hash module Igor mentioned is still in the TODO, but there is no 
ETA as it's not something frequently asked about.  If you want to 
sponsor the development, please write to the email address listed at 
http://nginx.com/support.html.
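
For reference, the bucket-table scheme proposed in the quoted message 
could be sketched roughly as follows.  This is only one reading of 
that proposal, written in Python rather than as nginx code: MD5 stands 
in for MurmurHash3 (which is not in the Python standard library), and 
h32(), build_table() and pick() are hypothetical names.  Each bucket 
is assigned to the (server, weight counter) pair with the smallest 
hash, so per-request selection is a single table lookup.

```python
import hashlib


def h32(text):
    """32-bit hash of a string (MD5 here is a stand-in for MurmurHash3)."""
    digest = hashlib.md5(text.encode("utf-8")).digest()
    return int.from_bytes(digest[:4], "big")


def build_table(servers, buckets=4096):
    """Precompute bucket -> server.  servers maps name -> integer weight."""
    # A server with weight w gets w candidates, so it wins more buckets.
    candidates = [
        (name, n)
        for name, weight in sorted(servers.items())
        for n in range(weight)
    ]
    table = []
    for bucket in range(buckets):
        # "Sort and take the first" is the same as taking the minimum.
        name, _ = min(
            candidates,
            key=lambda c: h32("%d:%s:%d" % (bucket, c[0], c[1])),
        )
        table.append(name)
    return table


def pick(table, key):
    """O(1) selection: hash the key and index into the table."""
    return table[h32(key) % len(table)]
```

Because every bucket is decided independently, dropping a server only 
reassigns the buckets that server had won; the other buckets keep 
their owner.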

-- 
Maxim Dounin
http://nginx.com/support.html


