Fix for ngx_unlock() and thus race conditions in AIO mechanism
rmind at noxt.eu
Mon Apr 18 21:40:59 UTC 2016
Some background: enabling AIO threads can result in "stuck" connections,
permanently waiting for handler callback (in the system these connections
appear in a CLOSE_WAIT state). This happens due to race condition(s) in
the AIO thread pool code. Since the window is tiny, it may take hundreds
of thousands of requests on a many-core machine to trigger a single race,
but it is present in the real world: on some systems we accumulate over
a hundred of them a day.
Fix: add a compiler barrier to ngx_unlock(); since it is a macro, it is
subject to the compiler reordering optimisations. Consider the following
fragment of the ngx_thread_pool_cycle() function:
ngx_spinlock(&ngx_thread_pool_done_lock, 1, 2048);
*ngx_thread_pool_done.last = task;
ngx_thread_pool_done.last = &task->next;
On nginx 1.9.14 RHEL 6 build (nginx-1.9.14-1.el6.ngx.x86_64.rpm), you can
see the following assembly code:
438a05: bf 78 f6 70 00 mov $0x70f678,%edi
438a1c: e8 df a6 fe ff callq 423100 <ngx_spinlock>
438a21: 48 8b 05 60 6c 2d 00 mov 0x2d6c60(%rip),%rax
438a28: 48 c7 05 45 6c 2d 00 movq $0x0,0x2d6c45(%rip) # <--
438a2f: 00 00 00 00
438a33: bf b0 8a 43 00 mov $0x438ab0,%edi
438a38: 48 89 2d 49 6c 2d 00 mov %rbp,0x2d6c49(%rip)
438a3f: 48 89 28 mov %rbp,(%rax)
438a42: ff 15 08 72 2d 00 callq *0x2d7208(%rip)
You can see that at 0x438a28 it performs "*lock = 0" before inserting the
task into the queue. This can race with ngx_thread_pool_handler(), where
ngx_thread_pool_done list is reset before the "task" is inserted (that is,
tasks just get lost).
On x86/amd64, the unlock operation only needs a *compiler* barrier, it does
not need a *memory* barrier. I used ngx_memory_barrier() in the patch, but
you can optimise x86/amd64 to use __asm __volatile("":::"memory").
The patch is attached. Even better fix would be to use pthread_spin_lock
or just pthread_mutex_lock. Why roll out your own one?
-------------- next part --------------
An embedded and charset-unspecified text was scrubbed...
More information about the nginx-devel