max clients != worker processes * worker connections
mdounin at mdounin.ru
Tue Oct 11 15:38:10 UTC 2011
On Tue, Oct 11, 2011 at 11:02:34AM -0400, davidcie wrote:
> I realise that it may have been hammered on quite a few times already,
> but I cannot understand the results I'm getting. I have a rather basic
> setup on Ubuntu Server 11.04, where nginx (run as root) spawns 2
> worker_processes and serves a basic HTML page. Each of these processes
> should have its worker_connections set to 8192. There's no limit_zone
> defined. worker_rlimit_nofile is set to 65536, keepalive_timeout 5.
> Verifying max load, I run ab from another server on the subnet. It works
> fine with "ab -kc 8000 -n 16000 http://10.1.1.10/". However, when I do
> "ab -kc 10000 -n 16000 http://10.1.1.10/", ab shows about 3486 failed
> requests in its results (Length: 1743, Exceptions: 1743), while nginx's
> error.log features numerous "2011/10/11 16:49:24 [alert] 12081#0: 8192
> worker_connections are not enough" errors.
> Testing a number of settings, it seems that there's a close to 1-1
> relationship between worker_connections and the maximum concurrency
> parameter to ab that doesn't produce errors. I tried setting
> worker_processes to some high number (like 16), but it seems to have no
> effect whatsoever.
> Can you please let me know why this setup might not be serving the
> "promised" ;-) worker_processes * worker_connections connections? Is it
> possible that new connections are not being evenly distributed to the
> two processes? Apologies if this is some basic error on our side, we're
> still learing (and admiring) nginx and are more used to IIS!
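For reference, the setup described above corresponds to a configuration
roughly like this (a sketch reconstructed from the description; exact
file layout and other directives are assumptions):

```nginx
worker_processes      2;
worker_rlimit_nofile  65536;

events {
    worker_connections  8192;
}

http {
    keepalive_timeout  5;

    server {
        listen  80;
        root    /var/www/html;   # serving a basic HTML page
    }
}
```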
With a small HTML page it's very likely that one worker process will
be able to hold the accept mutex exclusively. Upon approaching its
worker_connections limit it will stop using the accept mutex, but it
may take some time (up to accept_mutex_delay, 500ms by default) for
other workers to come into play. Additionally, the worker will
still try to accept() connections until it sees that another worker
has locked the accept mutex (and this may take some additional time).
With real load you should get much better distribution of client
connections between workers. In extreme use cases like the above
you may try using

    accept_mutex off;

to achieve better connection distribution between workers.
This implies some CPU overhead though, especially when using many
worker processes (the OS will wake up every worker on each new
connection instead of only the one holding the accept mutex).
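As a sketch, the suggestion above goes in the events block (values
here are illustrative, not recommendations):

```nginx
events {
    worker_connections  8192;

    # Disable the accept mutex: every worker is woken on each new
    # connection, so connections spread across workers at the cost
    # of some extra CPU (the "thundering herd" wakeups).
    accept_mutex  off;

    # Alternatively, keep the mutex but shorten the handover delay:
    # accept_mutex_delay  100ms;
}
```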
More information about the accept_mutex and accept_mutex_delay
directives can be found in the nginx documentation.