RFC on C Library Safety in Nginx Module Callback

Sinan Kaya Okaya at kernel.org
Fri Jun 14 01:08:56 UTC 2019


I would like to hear opinions on a surprising issue we hit while
developing an nginx module.

During its startup process, our module starts nginx but also does a lot
of other work, some of which involves glibc's implementation of
timer_create, a POSIX function.

https://code.woboq.org/userspace/glibc/sysdeps/unix/sysv/linux/timer_create.c.html
https://code.woboq.org/userspace/glibc/sysdeps/unix/sysv/linux/timer_routines.c.html#__active_timer_sigev_thread_lock


Looking at the glibc source code for timer_create, we can see that it
takes a global mutex named __active_timer_sigev_thread_lock. If nginx
happens to call fork() while another thread in the module code is inside
timer_create and holds that mutex, the child gets a copy of the process
memory in which the mutex is still marked as locked. The thread that
owned it does not exist in the newly forked process, so the mutex stays
owned by a non-existent thread forever and can never be acquired there.
The GDB trace below shows the result.
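
A minimal sketch of this failure mode, outside of nginx and glibc's timer
code. The names lib_lock, holder and child_sees_lock_held are hypothetical,
and lib_lock only stands in for a library-internal lock such as
__active_timer_sigev_thread_lock. The child uses pthread_mutex_trylock so
that the demonstration reports the stuck lock instead of hanging on it:

```c
#include <assert.h>
#include <errno.h>
#include <pthread.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Hypothetical stand-in for a lock hidden inside a library. */
static pthread_mutex_t lib_lock = PTHREAD_MUTEX_INITIALIZER;
static int ready_pipe[2];    /* holder -> main: "the lock is held" */
static int release_pipe[2];  /* main -> holder: "you may unlock"   */

static void *holder(void *arg)
{
    char c = 'x';
    ssize_t n;

    (void)arg;
    pthread_mutex_lock(&lib_lock);
    n = write(ready_pipe[1], &c, 1);        /* tell main the lock is held */
    if (n == 1)
        n = read(release_pipe[0], &c, 1);   /* wait until main has forked */
    (void)n;
    pthread_mutex_unlock(&lib_lock);
    return NULL;
}

/* Returns 1 if the forked child found the lock still held, 0 if it could
 * take the lock, and -1 on a setup failure. */
int child_sees_lock_held(void)
{
    pthread_t tid;
    char c = 0;
    int status, result = -1;
    pid_t pid;

    if (pipe(ready_pipe) != 0 || pipe(release_pipe) != 0)
        return -1;
    if (pthread_create(&tid, NULL, holder, NULL) != 0)
        return -1;
    if (read(ready_pipe[0], &c, 1) != 1)    /* wait until the lock is held */
        return -1;

    pid = fork();
    if (pid == 0) {
        /* Only the forking thread exists in the child. The holder thread
         * is gone, but the locked state of lib_lock was copied. */
        _exit(pthread_mutex_trylock(&lib_lock) == EBUSY ? 1 : 0);
    }

    /* Parent: let the holder thread unlock and finish. */
    if (write(release_pipe[1], &c, 1) != 1)
        return -1;
    pthread_join(tid, NULL);

    if (pid > 0 && waitpid(pid, &status, 0) == pid && WIFEXITED(status))
        result = WEXITSTATUS(status);

    close(ready_pipe[0]);
    close(ready_pipe[1]);
    close(release_pipe[0]);
    close(release_pipe[1]);
    return result;
}
```

On Linux with glibc this is expected to return 1. A blocking
pthread_mutex_lock in the child would instead wait forever, which matches
the __lll_lock_wait frame at the top of the trace.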


This thread is blocked in timer_create. It is trying to lock the global
pthread_mutex_t named __active_timer_sigev_thread_lock.

(gdb) bt

#0  0x0000ff02aab1634c in __lll_lock_wait (
    futex=futex@entry=0xff02aaaee2c8 <__active_timer_sigev_thread_lock>,
    private=0) at /usr/src/debug/glibc/2.27-r0/git/nptl/lowlevellock.c:46

#1  0x0000ff02aab0f648 in __GI___pthread_mutex_lock (
    mutex=mutex@entry=0xff02aaaee2c8 <__active_timer_sigev_thread_lock>)
    at /usr/src/debug/glibc/2.27-r0/git/nptl/pthread_mutex_lock.c:78

#2  0x0000ff02aaadbb20 in timer_create (clock_id=<optimized out>,
    evp=0xff02a9e181a8, timerid=0xff02a41e43a0)
    at /usr/src/debug/glibc/2.27-r0/git/sysdeps/unix/sysv/linux/timer_create.c:159

...

#16 0x0000000000695db0 in ngx_thread_pool_cycle (data=0xff02a0027700)
    at /usr/src/debug/nginx-staticdev/1.14.2-r0/nginx-1.14.2/src/core/ngx_thread_pool.c:342
#17 0x0000ff02aab0cf78 in start_thread (arg=0xff02a8e160d6)
    at /usr/src/debug/glibc/2.27-r0/git/nptl/pthread_create.c:463

#18 0x0000ff02aa2efe2c in thread_start ()
    at ../sysdeps/unix/sysv/linux/aarch64/clone.S:78


Judging by the article linked below, the large majority of C library
functions may be unsafe when called inside an nginx module callback.

One way to work around the issue is to use the exec() family of
functions, so that each child process starts from a fresh process image
rather than an inherited copy of the parent's memory.
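
A minimal sketch of the fork-then-exec pattern described above. The helper
name run_fresh_child is hypothetical and nothing here is an nginx API. The
child calls only execv and _exit, both of which are async-signal-safe, so
it never touches a lock inherited from the parent:

```c
#include <assert.h>
#include <errno.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Hypothetical helper: run the program at `path` in a child process and
 * return its exit status, or -1 on failure. */
int run_fresh_child(const char *path, char *const argv[])
{
    int status;
    pid_t pid, w;

    assert(path != NULL && argv != NULL);

    pid = fork();
    if (pid < 0)
        return -1;

    if (pid == 0) {
        /* Replace the copied memory image, including any locks that were
         * held at fork() time, with a fresh program. */
        execv(path, argv);
        _exit(127);  /* reached only if execv failed */
    }

    /* Parent: wait for the child, retrying if interrupted by a signal. */
    do {
        w = waitpid(pid, &status, 0);
    } while (w < 0 && errno == EINTR);

    if (w != pid || !WIFEXITED(status))
        return -1;
    return WEXITSTATUS(status);
}
```

For example, calling run_fresh_child("/bin/sh", argv) with argv set to
{ "sh", "-c", "exit 7", NULL } should return 7. Whether this pattern can
be applied depends on who performs the fork: here nginx itself forks, so
the module may not control that step.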

https://www.linuxprogrammingblog.com/threads-and-fork-think-twice-before-using-them

"Library functions

Problem with mutexes and critical code sections implies another
non-obvious issue. It's theoretically possible to write your code
executed in threads so that you are sure it's safe to call fork when
such threads run but in practice there is one big problem: library
functions. You're never sure if a library function you are using doesn't
use global data. Even if it is thread safe, it may be achieved using
mutexes internally. You are never sure. Even system library functions
that are thread-safe may use locks internally."

