[PATCH] Random peer selection for implicit upstream defined by proxy_pass

Anton Jouline juce66 at gmail.com
Tue Aug 21 18:36:49 UTC 2012


Hello, nginx developers,

I am submitting this patch for your review. It changes the behaviour of
peer selection for the case when the upstream is defined implicitly by a
proxy_pass directive, like this:

resolver 172.16.0.23;
set $backend backend.servers.example.com;
proxy_pass http://$backend;
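
For context, here is a minimal sketch of a full configuration using this
pattern (the listen port, resolver address, and host name are just
placeholders):

server {
    listen 80;

    # required, because the name is looked up by nginx itself at request time
    resolver 172.16.0.23;

    location / {
        # using a variable in proxy_pass makes nginx resolve the name with
        # its own resolver on each request, instead of once at startup
        set $backend backend.servers.example.com;
        proxy_pass http://$backend;
    }
}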

It's important to note that this is the case where the
"backend.servers.example.com" name is resolved by the nginx resolver at
request time, instead of by the OS resolver when the server configuration
is read at startup. My change only affects this very specific scenario.
(When "backend.servers.example.com" is resolved at startup, the original
round-robin algorithm is still used.)

Why the change?

Let me explain our use case first. We are using nginx as a load balancer
in front of an array of backend servers. The array is dynamic - it changes
over time: new servers can be launched and join the array, and old ones may
get terminated. The current state of the array is defined by a DNS A-record
which maps "backend.servers.example.com" to a list of IP addresses. This
A-record also changes dynamically as the array itself changes.

The fact that the A-record is dynamic means that we cannot resolve the name
at startup. Therefore, we use the nginx resolver to do the DNS lookup at
request time. This works great, but the problem is selecting which backend
server (peer) to proxy the request to.

It turns out that the round-robin algorithm does not actually work in this
case, because of the way things are implemented in src/http/ngx_http_upstream.c.
The peer array is created and initialized on each request, even when the
resolver returns a cached result for the DNS query. The upstream code has no
knowledge of whether the list of IP addresses has changed or not. So what ends
up happening is this: say "backend.servers.example.com" resolves to 6 IP
addresses. Then the first good one will always be used, and the rest will
never even be looked at.
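
To illustrate the effect, here is a conceptual sketch in plain C (this is
not the actual nginx code, and the names are made up): because the
round-robin state is created together with the peer array on every request,
the cursor never survives from one request to the next.

/* Conceptual sketch only -- not the actual ngx_http_upstream.c code.
 * The round-robin cursor lives in per-request state, so it is reset
 * to the beginning for every request and peer 0 wins every time. */

#include <stdio.h>

typedef struct {
    int npeers;   /* number of addresses the name resolved to */
    int current;  /* round-robin cursor                        */
} rr_state;

static int get_rr_peer(rr_state *s) {
    int chosen = s->current;
    s->current = (s->current + 1) % s->npeers;
    return chosen;
}

int main(void) {
    for (int request = 0; request < 6; request++) {
        rr_state s = { 6, 0 };   /* rebuilt per request, cursor back to 0 */
        printf("request %d -> peer %d\n", request, get_rr_peer(&s));
    }
    return 0;   /* prints "peer 0" six times, matching the debug log below */
}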

It is easy to verify this if you run a debug build of nginx (the nginx.debug
executable), enable "debug" logging in your error_log directive, and just
tail the error log like this:

$ tail /var/log/nginx/error.log | grep current
2012/08/21 12:07:09 [debug] 25252#0: *1 get rr peer, current: 0 -5
2012/08/21 12:07:10 [debug] 25252#0: *1 get rr peer, current: 0 -5
2012/08/21 12:07:11 [debug] 25252#0: *1 get rr peer, current: 0 -5
2012/08/21 12:07:12 [debug] 25252#0: *1 get rr peer, current: 0 -5
2012/08/21 12:07:12 [debug] 25252#0: *1 get rr peer, current: 0 -5
2012/08/21 12:07:13 [debug] 25252#0: *1 get rr peer, current: 0 -5

With my random selection change, the peer is chosen essentially at random
on each request. Of course, random selection does not guarantee an ideal
distribution of the load, but it turns out to work quite well statistically
(a rough sketch of the idea in plain C follows the numbers below). Here are
some numbers from a load-testing run; 6 servers were used in the backend:

Total # of requests: 2,374,398
======================
server 2:  399,342  16.819%
server 3:  397,854  16.756%
server 1:  396,807  16.712%
server 4:  396,660  16.706%
server 5:  393,735  16.582%
server 0:  390,000  16.425%
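
(For comparison, an ideal uniform split across 6 servers would be
1/6 ≈ 16.667% each, so every server above is within about 0.25 percentage
points of the ideal share.)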

In our case, we don't really care whether all backend servers receive exactly
the same number of requests, as long as the load is approximately the same.
The bad situation we were aiming to avoid is one server being overloaded
while the others run idle.
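
Here is a minimal sketch of the idea in plain C (not the actual patch:
pick_random_peer and the simulation are made up for illustration; inside
nginx one would presumably use ngx_random() against the number of resolved
addresses):

#include <stdio.h>
#include <stdlib.h>
#include <time.h>

/* hypothetical helper: choose one of npeers resolved addresses at random */
static int pick_random_peer(int npeers) {
    return (int) (random() % npeers);
}

int main(void) {
    srandom((unsigned) time(NULL));

    /* simulate a large number of requests against 6 backends */
    long hits[6] = {0};
    for (long i = 0; i < 2000000; i++) {
        hits[pick_random_peer(6)]++;
    }
    for (int p = 0; p < 6; p++) {
        printf("server %d: %ld (%.3f%%)\n", p, hits[p], 100.0 * hits[p] / 2000000);
    }
    return 0;
}

A simulation like this should give each server roughly 1/6 of the hits,
which is consistent with the load-test numbers above.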

The patch was originally made for 1.2.2, but I verified that it also
applies without problems to 1.2.3 and 1.3.5, and works the same
way with those newer versions of nginx.

Sorry about such a long message.
Thanks for reading!


Anton.
-------------- next part --------------
A non-text attachment was scrubbed...
Name: nginx-upstream-random.patch
Type: application/octet-stream
Size: 18738 bytes
Desc: not available
URL: <http://mailman.nginx.org/pipermail/nginx-devel/attachments/20120821/56a04ce6/attachment-0001.obj>

