[Patch] possible mutex starvation issue affects all nginx Linux versions.

Maxim Dounin mdounin at mdounin.ru
Mon Dec 2 13:39:07 UTC 2013


Hello!

On Mon, Dec 02, 2013 at 07:15:52AM -0500, itpp2012 wrote:

> Here is a patch for a possible mutex starvation issue which affects all
> nginx Linux versions.
> Already solved for Windows since nginx 1.5.7.1 Caterpillar.
> Can be reproduced when nginx reloads the config & worker holding mutex dies
> or hangs.
> 
> Fixed by Vittorio Francesco Digilio, commercially sponsered solution by
> ITPP.
> target source mainline 1.5.8 - 30-11-2013.
> src/event/ngx_event_accept.c, line 402 was correctly found (starvation fix)
> but missed in:
> src/event/ngx_event.c, line 259-260, 1 line added;
> 255 if (ngx_posted_accept_events) {
> 256 ngx_event_process_posted(cycle, &ngx_posted_accept_events);
> 257 }
> 258
> 259 if (ngx_accept_mutex_held) {
> --+ ngx_accept_mutex_held=0;
> 260 ngx_shmtx_unlock(&ngx_accept_mutex);
> 261 }
> 262
> 263 if (delta) {
> 264 ngx_event_expire_timers();
> 265 }

This patch is wrong.  The ngx_accept_mutex_held don't need to be 
unset here.  The ngx_accept_mutex_held is used as an idicator that on 
previous iteration the lock was held by the process, and unsetting 
it will result in incorrect behaviour.

> Also applies to; src/os/win32/ngx_process_cycle.c
> line 507-508, 3 lines added;
> 504 if (ngx_processes[n].handle != h) {
> 505 continue;
> 506 }
> 507
> --+ if(*ngx_accept_mutex.lock==ngx_processes[n].pid) {
> --+ *ngx_accept_mutex.lock=0;
> --+ }
> 508 if (GetExitCodeProcess(h, &code) == 0) {
> 509 ngx_log_error(NGX_LOG_ALERT, cycle->log, ngx_errno,
> 510 "GetExitCodeProcess(%P) failed",
> 511 ngx_processes[n].pid);
> 512 }

This patch is also wrong.  In the current state of the win32 
version it is not assumed that accept mutex is used at all.  But 
if it is, correct aproach to unlock shared memory mutexes on 
abnormal process termination is to port (or move to 
platform-independed place) the ngx_unlock_mutexes() function from 
src/os/unix/ngx_process.c.  It correctly uses 
ngx_shmtx_force_unlock() to properly unlock shared memory mutexes 
using atomic operations.

Note well that unlocking shared memory mutexes on abnormal 
process termination is an emergency mechanism.  If it actually 
happens, it indicate that something is really wrong in other 
places of the system.

Please also consider reading 
http://nginx.org/en/docs/contributing_changes.html.

-- 
Maxim Dounin
http://nginx.org/en/donation.html



More information about the nginx mailing list