least_conn not working for me

Wed Dec 23 21:26:10 UTC 2020

Hello!

On Wed, Dec 23, 2020 at 12:36:16PM -0500, kenneth.s.brooks wrote:

> Thanks for the response.
> 
> I understand what you are saying about the worker processes. We have only a
> single worker process.
> 
> I have 2 upstream servers.
> 
> To validate:
> I am sending a request for a LARGE file. I see it hit server1. Server1 is
> now serving that request for the next couple of minutes.

Note that the claim that server1 is actually still serving the 
request needs some additional verification.  As a web accelerator, 
nginx normally ensures that upstream servers are free to serve 
additional requests as soon as possible, so if the limiting factor 
is the client speed rather than the connection between nginx and 
the upstream server, nginx will happily buffer the response and 
will serve it to the client itself.  In that case there will be no 
active connections to server1, so least_conn will behave much like 
round-robin.

> I send a request for a very tiny file. I see it hit server2. It finishes
> (server1 is still processing request #1)
> I send a request for a very tiny file. I see it hit server1 (even tho it is
> still serving request #1 and server2 is not serving anything)
> 
> I repeat that over and over, and I'll see all the subsequent requests being
> routed to server1, then 2, then 1 then 2.
> If I submit another LARGE file request, if the last request went to server2,
> then now I have 2 LARGE file requests being served by server1.
> 
> If I submit more requests, they all continue to be equally distributed to
> server1 and server2 (even though server1 has 2 active things it is
> serving).

This behaviour corresponds to there being no active connections to 
server1, as might happen if the file is not large enough and is 
instead fully buffered by nginx.
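
For a test, one way to check this is to disable buffering of the 
proxied response, so that the connection to the upstream server 
stays open for as long as the client is downloading.  A minimal 
sketch, assuming an upstream block named "u" (a placeholder, 
adjust it to your configuration):

    location / {
        proxy_pass http://u;

        # for testing only: pass the response to the client as it
        # is received, so the upstream connection stays active
        # while the client is downloading
        proxy_buffering off;
    }

With buffering disabled, the connection to server1 should remain 
active while the large response is being sent, and least_conn 
should then prefer server2 for new requests.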

> Is there some sort of a 'fudge factor' or threshold? That there has to be n
> number of requests that one server is handling more than another server?  I
> wouldn't think so, but I'm at a loss.

No, nothing like this.

Just in case, here is a simple configuration which demonstrates 
how least_conn works (by using limit_rate to slow down responses 
of one of the upstream servers):

    upstream u {
        server 127.0.0.1:8081;
        server 127.0.0.1:8082;
        least_conn;
    }

    server {
        listen 8080;

        location / {
            proxy_pass http://u;
        }
    }

    server {
        listen 8081;
        limit_rate 10;
        return 200 "slow\n";
    }

    server {
        listen 8082;
        return 200 "fast\n";
    }

And a test:

$ for i in 1 2 3 4; do curl -q http://127.0.0.1:8080/ & sleep 0.1; done; sleep 15
fast
fast
fast
slow
[1]   Done                    curl -q http://127.0.0.1:8080/
[2]   Done                    curl -q http://127.0.0.1:8080/
[3]   Done                    curl -q http://127.0.0.1:8080/
[4]   Done                    curl -q http://127.0.0.1:8080/

Note that requests are started with some delay ("sleep 0.1") to 
make sure the fast backend will be able to respond before the next 
request starts.  Note also that only one of the requests is routed 
to the slow backend - all other requests are routed to the fast 
one.  That is, least_conn works as expected.

-- 
Maxim Dounin
http://mdounin.ru/