accept_mutex causes nginx worker balance problem
Maxim Dounin
mdounin at mdounin.ru
Mon Aug 4 14:57:51 UTC 2014
Hello!
On Sun, Aug 03, 2014 at 10:47:26PM -0400, xinghua_hi wrote:
> hello,
>
> I still can't understand why accept_mutex causes disbalance. In the code
> below, multiple workers will try to get the mutex, and the question is: why
> can one worker always get the mutex? I tested many times and found that one
> worker always accepts many more new connections than the others. Thanks very much.
Only the worker that holds the accept mutex will try to accept new
connections. Other workers will only process events they already
have, or try to grab the accept mutex again after a 500ms timeout
(accept_mutex_delay[1]) if there are no other events to handle.
Consider a short test on an otherwise idle server, like the one you are
running, with many connections established during a small period of
time. Assume there are 2 workers:
- worker A holds the accept mutex, worker B waits for the 500ms timeout
  doing nothing;
- in a short period of time 1000 connections come in;
- worker A is woken up by the kernel and accepts a connection;
- worker A goes back to the kernel to wait for more data; since
  worker B is in the kernel waiting for the 500ms timeout, the accept
  mutex is again locked by A;
- worker A is woken up again, and the above repeats multiple times.
More or less this continues until worker B wakes up after 500ms and
tries to lock the accept mutex. If it is lucky and this happens
while worker A is doing something, it will be able to lock the
accept mutex. That is, further connections will be accepted by
worker B. If worker B isn't lucky, then worker A will keep accepting
connections for longer. For short tests this may mean that all
connections will be accepted by a single worker. (And things will
be even worse if multi_accept[2] is used.)
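
The two-worker scenario above can be sketched as a toy simulation. This is only an illustrative model, not nginx code: the function name simulate_burst, the evenly spaced arrivals, and the waiter_lucky flag (standing in for "the waiting worker happens to retry while the holder is busy") are all assumptions made for the example.

```python
def simulate_burst(n_conns=1000, burst_ms=100.0, delay_ms=500.0, waiter_lucky=True):
    """Toy model of two workers sharing an accept mutex.

    Worker A starts out holding the mutex. The other worker retries the
    mutex only every delay_ms milliseconds (hypothetical stand-in for
    accept_mutex_delay). Returns (accepted_by_A, accepted_by_B).
    """
    accepted = {"A": 0, "B": 0}
    holder = "A"
    next_retry = delay_ms  # time at which the waiting worker retries the mutex
    for i in range(n_conns):
        t = i * burst_ms / n_conns  # arrivals spread evenly over the burst
        while t >= next_retry:
            if waiter_lucky:
                # The waiter gets the mutex; the old holder becomes the waiter.
                holder = "B" if holder == "A" else "A"
            next_retry += delay_ms
        accepted[holder] += 1
    return accepted["A"], accepted["B"]


if __name__ == "__main__":
    # Burst shorter than the retry delay: worker B never gets a chance.
    print(simulate_burst(1000, burst_ms=100.0))
    # Longer burst with a lucky waiter: the mutex changes hands every 500ms.
    print(simulate_burst(1000, burst_ms=2000.0))
    # Longer burst with an unlucky waiter: worker A keeps everything.
    print(simulate_burst(1000, burst_ms=2000.0, waiter_lucky=False))
```

In this model a 100ms burst of 1000 connections ends up entirely on worker A, because the burst is over before worker B's first retry at 500ms.
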
On a normally loaded server the above situation isn't likely to
happen, as all workers are periodically woken up by the kernel and
will try to lock the accept mutex when going back to the kernel. Thus
connections are distributed among all workers more or less evenly.
In short tests, though, accept_mutex can easily cause disbalance as
described above.
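
For contrast, here is a rough sketch of the loaded-server case, again as a toy model rather than nginx code. It assumes each worker is woken at random (exponentially distributed) intervals by events on connections it already has, and that whichever worker most recently went back to the kernel holds the accept mutex when a new connection arrives. The function name simulate_loaded and all parameters are hypothetical.

```python
import random


def simulate_loaded(n_conns=10000, n_workers=4, mean_wakeup_ms=1.0, seed=0):
    """Toy model: every worker is frequently woken by its own events.

    One new connection arrives per millisecond. The worker that most
    recently passed through its event loop is assumed to hold the accept
    mutex. Returns the number of connections accepted by each worker.
    """
    rng = random.Random(seed)
    rate = 1.0 / mean_wakeup_ms
    # Next time each worker is woken up by the kernel (assumed exponential gaps).
    next_wakeup = [rng.expovariate(rate) for _ in range(n_workers)]
    holder = 0
    accepted = [0] * n_workers
    for i in range(n_conns):
        t = float(i)
        # Replay wakeups up to time t in chronological order; the last
        # worker through the loop ends up holding the mutex.
        while min(next_wakeup) <= t:
            w = next_wakeup.index(min(next_wakeup))
            holder = w
            next_wakeup[w] += rng.expovariate(rate)
        accepted[holder] += 1
    return accepted


if __name__ == "__main__":
    counts = simulate_loaded()
    for worker, count in enumerate(counts):
        print(f"worker {worker}: {count} connections")
```

Under these assumptions the mutex changes hands often enough that each worker ends up with a roughly equal share, which is the behaviour described for a normally loaded server.
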
[1] http://nginx.org/r/accept_mutex_delay
[2] http://nginx.org/r/multi_accept
--
Maxim Dounin
http://nginx.org/