consistent hashing using split_clients

Wed Oct 31 14:50:47 UTC 2012

Hello!

On Wed, Oct 31, 2012 at 10:31:12AM -0400, rmalayter wrote:

> I'm looking for a way to do consistent hashing without any 3rd-party modules
> or perl/lua. I came up with the idea of generating a split_clients and list
> of upstreams via script, so we can add/remove backends without blowing out
> the cache on each upstream when a backend server is added, removed or
> otherwise offline.
> 
> What I have looks like the config below. The example only includes 16
> upstreams for clarity, and is generated by sorting by the SHA1 hash of
> server names for each upstream bucket along with the bucket name.
> 
> Unfortunately, to get an even distribution of requests to upstream buckets
> with consistent hashing, I am actually going to need at least 4096
> upstreams, and the corresponding number of entries in split_clients. 
> 
> Will 4096 entries in a single split_clients block pose a performance issue?
> Will split_clients have a distribution problem with a small percentage like
> "0.0244140625%"?

Percentage values are stored in fixed point with 2 digits after 
the point, so the smallest representable step is 0.01%.  
Configuration parsing will complain if you try to specify more 
digits after the point.
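
To make the limit concrete, here is a minimal Python sketch of a
parser with two-digit fixed-point precision.  It only illustrates
the rule described above and is not nginx's actual code; the name
parse_percent is hypothetical.

```python
def parse_percent(text: str) -> int:
    """Parse a value such as '0.5%' into hundredths of a percent.

    Illustrative only: mimics a fixed-point format with 2 digits
    after the point and rejects anything more precise.
    """
    if not text.endswith("%"):
        raise ValueError(f"missing '%' in {text!r}")
    whole, _, frac = text[:-1].partition(".")
    if len(frac) > 2:
        raise ValueError(f"too many digits after the point in {text!r}")
    if not (whole or frac) or not (whole + frac).isdigit():
        raise ValueError(f"not a number: {text!r}")
    # Pad the fraction to exactly two digits: '5' -> '50', '' -> '00'.
    return int(whole or "0") * 100 + int(frac.ljust(2, "0"))


if __name__ == "__main__":
    print(parse_percent("0.02%"))   # 2 hundredths of a percent
    print(parse_percent("50%"))     # 5000 hundredths of a percent
    try:
        parse_percent("0.0244140625%")
    except ValueError as exc:
        print("rejected:", exc)
```

For 4096 equal shares each entry would need 100/4096 = 0.0244140625%,
which cannot be written with two digits.  Rounding down to 0.02%
covers only 4096 * 0.02 = 81.92% of the hash range, and 0.03% would
exceed 100%.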

> How many "buckets" does the hash table for split_clients
> have (it doesn't seem to be configurable)?

The split_clients algorithm doesn't use buckets, as it's not a 
hash table.  Instead, it calculates a hash function of the 
original value and selects the resulting value based on which 
configured percentage range the hash function result falls into.  
See http://nginx.org/r/split_clients for details.
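
The selection described above can be sketched roughly as follows.
This is a simplified Python model, not the module's source:
zlib.crc32 is only a stand-in hash picked because it is in the
standard library, the linear scan is an assumption, and the names
build_ranges and select are hypothetical.

```python
import zlib

HASH_MAX = 0xFFFFFFFF  # range of a 32-bit hash


def build_ranges(parts):
    """Turn (hundredths_of_percent, value) pairs into cumulative bounds.

    Each entry gets an upper bound on the 32-bit hash range that is
    proportional to the running total of the percentages so far.
    """
    ranges = []
    total = 0
    for hundredths, value in parts:
        total += hundredths
        if total > 10000:
            raise ValueError("percentages add up to more than 100%")
        ranges.append((total * HASH_MAX // 10000, value))
    return ranges


def select(ranges, key, default=""):
    """Hash the key and return the value whose range contains the hash."""
    h = zlib.crc32(key.encode("utf-8")) & HASH_MAX  # stand-in hash
    for upper, value in ranges:
        if h < upper:
            return value
    return default  # hash fell past the last configured range


if __name__ == "__main__":
    # Two entries of 0.50% and 2.00%; everything else gets the default.
    ranges = build_ranges([(50, ".one"), (200, ".two")])
    for uri in ("/a.jpg", "/b.jpg", "/c.jpg"):
        print(uri, repr(select(ranges, uri)))
```

In this model there are no buckets to size: an entry's share of
traffic is simply the width of its slice of the hash range.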

[...]

-- 
Maxim Dounin
http://nginx.com/support.html