session persistence with IP hash

Robert Paprocki rpaprocki at fearnothingproductions.net
Thu Jul 28 16:07:00 UTC 2016


Hello,

On Thu, Jul 28, 2016 at 7:32 AM, Brian Pugh <project722 at gmail.com> wrote:

> Yesterday once I got the traffic going to the backend servers from nginx I
> noticed that I was pinned to "backend3", which is last in the order. And
> since I am the one setting this up I am the only user. So I changed up my
> order just to see the effects of calculating a new hash. Instead of:
>
> upstream backend {
> backend1
> backend2
> backend3
> }
>
> I listed them in the order:
>
> upstream backend {
> backend2
> backend3
> backend1
> }
>
> then restarted nginx. At that point my traffic was pinned to backend1.
> This seems a bit odd to me in that it seems to be always choosing the last
> server in the order. Any thoughts on what might be happening and why it did
> not pin me to backend1 the first time and backend2 the second time?
>

This sounds exactly like what should be expected. To better understand,
let's look at a simple example of how hash-based selection _might_ occur (I
say "might" because this is not exactly how Nginx performs its hashing; I'm
simplifying for the sake of example, but it's good enough). We'll make a few
assumptions:

- The hash key in this example is your IP address (we'll use 127.0.0.1 for
simplicity)
- We will assume each backend has the same weight
- Arrays are zero indexed
- Our upstream block looks like this:
    upstream backend {
        backend1
        backend2
        backend3
    }
- We will kindly remember this is a conceptual example and not how Nginx
does things under the hood (but this is clear enough)

Let's say that the whole upstream definition creates an array of servers,
so in your first example you'd have an array of:

upstreams = { "backend1", "backend2", "backend3" }

The key "127.0.0.1" is run through a mathematical hash function to create
an integer, and it will create the same integer every single time it's run.
For our example, let's say that hash("127.0.0.1") equals the number 438653.
In order to select which backend to use, we compute our hash value, 438653,
modulo the number of backends we have in our upstream (modulo is essentially
the remainder after division). So,

438653 % 3 = 2

So we get index 2 from our array. Remember that our conceptual array is
zero-indexed, so we select upstreams[2], or the third element in our array,
which is "backend3". Every time a backend is selected for the key
"127.0.0.1", "backend3" will be used, because that's the result of looking
up upstreams[2]. Now, let's change the upstream block to this:

upstream backend {
backend2
backend3
backend1
}

Assuming nothing else has changed, we run through the same process again.
hash("127.0.0.1") is 438653, and 438653 % 3 is still 2. So, we look up
upstreams[2] (the third element in our array, same as last time), and we
get "backend1". Given this upstream configuration, we will get the same
result every time. In this example, the order in which we define backends
matters, and this explains your results perfectly. You shouldn't expect
that your key will be hashed to the first backend in your upstream block;
you should only expect that the same key will produce the same _relative_
result every time it is hashed.
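
To make the walk-through concrete, here is a minimal Python sketch of the
same conceptual model. It is not how Nginx works internally; the names
pick_backend and example_hash are made up for illustration, and
example_hash simply returns the invented value 438653 for "127.0.0.1"
(falling back to CRC32 for any other key, which is also just an assumption).

```python
import zlib


def example_hash(key):
    # Hypothetical stand-in hash: returns the made-up value used in the
    # text for "127.0.0.1", and CRC32 for any other key.
    if key == "127.0.0.1":
        return 438653
    return zlib.crc32(key.encode("utf-8"))


def pick_backend(key, upstreams, hash_fn=example_hash):
    # Conceptual selection only: hash the key, then take the result
    # modulo the number of backends to get a zero-based index.
    index = hash_fn(key) % len(upstreams)
    return upstreams[index]


first_order = ["backend1", "backend2", "backend3"]
second_order = ["backend2", "backend3", "backend1"]

# Same key, same index (2), but a different name sits at that index.
print(pick_backend("127.0.0.1", first_order))   # backend3
print(pick_backend("127.0.0.1", second_order))  # backend1
```

The index stays at 2 in both cases; only the backend occupying that
position changes, which matches what you observed.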

(And again, to note: Nginx does do things slightly differently. Round-robin
(rr) peers are defined as a linked list, not an array, and lookup is not
strictly hashval % list length; this is a conceptual example _only_.)