[PATCH] Random peer selection for implicit upstream defined by proxy_pass

Maxim Dounin mdounin at mdounin.ru
Wed Sep 19 15:24:50 UTC 2012


Hello!

On Sat, Sep 08, 2012 at 01:19:49AM -0700, Anton Jouline wrote:

> hello, Maxim.
> 
> Thanks for your response!
> You are right that my patch looks unnecessarily complicated for
> what it actually does. Lots of code duplication too... After reading your
> message I started thinking again of a simpler way to accomplish the same
> task, and i think i found it.
> 
> But first, let me provide feedback for your 2 proposals.

(again, sorry for late reply, still too busy)

> 
> #1 - Enforcing the random order in ngx_http_upstream_create_round_robin_peer()
> is part of the solution, but it's not enough, because of how the peer is
> selected in ngx_upstream_get_peer() function in the same file. So we need
> to modify that function also, but must be careful there, because it is used by
> the regular weighted round robin algorithm, when upstream is defined
> either implicitly by resolved-once proxy_pass, or explicitly by upstream block
> in the configuration file. We don't want to modify the existing round robin
> behaviour for those cases. I believe i found a way to do that (see the new
> patch at the end of the message)

Just enforcing random order of peers as originally suggested 
should be enough here.  The code in ngx_http_upstream_get_peer() 
will then select first one, which will be random one.

See below for some comments on your patch.

> #2 - Introduce round-robin in resolver code. I think that would be good,
> but it does not really solve the issue of poor load distribution,
> because during
> the period of time when the DNS lookup is cached, still only 1 ip address
> would be used out of the list of many. It is true that when TTL expires,
> the next lookup would return a re-ordered list and perhaps a different ip
> would be in the first position, but for the duration of TTL there will still be
> no distribution of requests among the available servers.
> That's why i think #1 is a better alternative.

No, no, no.  I mean - always return names in random order from a 
resolver, even if DNS response is cached.  This should do the same 
as (1) but will cover other cases as well.

> Ok, and now goes the new version of the patch.
> It is much shorter and simpler! No new files needed, just a couple
> of small changes in ngx_http_upstream_round_robin.c.
> Tested successfully with versions 1.2.2, 1.2.3, and 1.3.5
> 
> Thanks.
> 
> Anton.
> 
> 
> ===
> diff -u -r nginx-1.2.2/src/http/ngx_http_upstream_round_robin.c
> nginx-1.2.2-mod/src/http/ngx_http_upstream_round_robin.c
> --- nginx-1.2.2/src/http/ngx_http_upstream_round_robin.c	2012-07-02
> 12:41:13.000000000 -0400
> +++ nginx-1.2.2-mod/src/http/ngx_http_upstream_round_robin.c	2012-09-08
> 03:10:17.000000000 -0400
> @@ -288,7 +288,7 @@
>  {
>      u_char                            *p;
>      size_t                             len;
> -    ngx_uint_t                         i, n;
> +    ngx_uint_t                         i, j, n;
>      struct sockaddr_in                *sin;
>      ngx_http_upstream_rr_peers_t      *peers;
>      ngx_http_upstream_rr_peer_data_t  *rrp;
> @@ -359,8 +359,11 @@
>          }
>      }
> 
> +    /* start at random index */
> +    j = (int) ( (float)(peers->number) * (rand() / (RAND_MAX + 1.0)));
> +
>      rrp->peers = peers;
> -    rrp->current = 0;
> +    rrp->current = j;

This looks overcomplicated.  The same thing can be done with

       rrp->current = ngx_random() % peers->number.

which is much more readable.

> 
>      if (rrp->peers->number <= 8 * sizeof(uintptr_t)) {
>          rrp->tried = &rrp->data;
> @@ -504,7 +507,7 @@
>      time_t                        now;
>      uintptr_t                     m;
>      ngx_int_t                     total;
> -    ngx_uint_t                    i, n;
> +    ngx_uint_t                    i, j, n;
>      ngx_http_upstream_rr_peer_t  *peer, *best;
> 
>      now = ngx_time();
> @@ -514,14 +517,15 @@
> 
>      for (i = 0; i < rrp->peers->number; i++) {
> 
> -        n = i / (8 * sizeof(uintptr_t));
> -        m = (uintptr_t) 1 << i % (8 * sizeof(uintptr_t));
> +        j = (i + rrp->current) % rrp->peers->number;
> +        n = j / (8 * sizeof(uintptr_t));
> +        m = (uintptr_t) 1 << j % (8 * sizeof(uintptr_t));

Note this might affect peer selection in some situations.  If 
there are two peers with identical effective weight, peer 
selection now depends on rrp->current (previous peer tried in a 
normal case), while with previous code selection was stable and 
always used first one.

BTW, you may find more details about peer selection algorithm 
currently used in the r4622 commit log, see here:

http://trac.nginx.org/nginx/changeset/4622/nginx

Maxim Dounin


> 
>          if (rrp->tried[n] & m) {
>              continue;
>          }
> 
> -        peer = &rrp->peers->peer[i];
> +        peer = &rrp->peers->peer[j];
> 
>          if (peer->down) {
>              continue;
> ===
> 
> 
> On Fri, Sep 7, 2012 at 7:45 AM, Maxim Dounin <mdounin at mdounin.ru> wrote:
> > Hello!
> >
> > (sorry for late reply, I was busy with OCSP stapling and slowly
> > catching up with resulting mail backlog now)
> >
> > On Tue, Aug 21, 2012 at 11:36:49AM -0700, Anton Jouline wrote:
> >
> >> hello, Nginx developers,
> >>
> >> submitting this patch for your review. It changes the behaviour of
> >> peer selection
> >> for the case, when the upstream is defined implicitly by proxy_pass
> >> directive like this:
> >>
> >> resolver 172.16.0.23;
> >> set $backend backend.servers.example.com;
> >> proxy_pass http://$backend;
> >
> > [...]
> >
> >> It turns out that the round-robin algorithm actually does not work in
> >> this case,
> >> because of the way things are implemented in src/http/ngx_http_upstream.c.
> >> The peer array is created and initialized on each request, even if a
> >> cached result of
> >> DNS query is returned by resolver. The upstream code has no knowledge of whether
> >> the list of IP addresses changed or not. So what ends up happening is this:
> >> Let's say "backend.servers.example.com" resolves to 6 IP addresses. Then the
> >> first good one will be always used, and the rest will never be even looked at.
> >
> > Surely this is valid problem, and it needs fixing.  Though I don't
> > think we have to introduce another balancer module just for this.
> >
> > I think of two possible ways to resolve this:
> >
> > 1. Enforce random order of server in ngx_http_upstream_create_round_robin_peer().
> > Note it's only used to create upsteams on the fly, so it should be
> > ok (and will lead to the same result as with your patch, though
> > will be simplier).
> >
> > 2. Introduce round-robin in resolver code before we return cached
> > response, much like it will be done by a normal DNS servers.  This
> > should also simplify other possible uses of a resolver.
> >
> > Not sure which one would be better though.
> >
> > Maxim Dounin
> >
> > _______________________________________________
> > nginx-devel mailing list
> > nginx-devel at nginx.org
> > http://mailman.nginx.org/mailman/listinfo/nginx-devel
> 
> _______________________________________________
> nginx-devel mailing list
> nginx-devel at nginx.org
> http://mailman.nginx.org/mailman/listinfo/nginx-devel



More information about the nginx-devel mailing list