RFC on C Library Safety in Nginx Module Callback

Ranier Vf ranier.vf at gmail.com
Fri Jun 14 12:51:02 UTC 2019


Hi,
Maybe this helps:
http://www.doublersolutions.com/docs/dce/osfdocs/htmls/develop/appdev/Appde193.htm

"One solution to the problem of calling *fork( )* in a multithreaded
environment exists. (Note that this method will not work for server
application code or any other application code that is invoked by a
callback from a library.) Before an application performs a *fork( )*
followed by something other than *exec( )*, it must cancel all of the other
threads. After it joins the canceled threads, it can safely *fork( )*
because it is the only thread in existence. This means that libraries that
create threads must establish cancel handlers that propagate the cancel to
the created threads and join them. The application should save enough state
so that the threads can be recreated and restarted after the *fork( )*
processing completes. "
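
To make the quoted recipe concrete, here is a minimal sketch of "cancel, join, fork, recreate". It is only an illustration under stated assumptions: worker_main, start_workers, stop_workers, fork_quiesced and NUM_WORKERS are hypothetical names, the workers block in sleep() (a cancellation point), and the child's exit code 7 is arbitrary. As the quote itself notes, this does not help when the threads belong to a library you do not control.

```c
#include <pthread.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

#define NUM_WORKERS 2   /* hypothetical pool size */

static pthread_t workers[NUM_WORKERS];

/* Hypothetical worker: loops forever, blocking in a cancellation point. */
static void *worker_main(void *arg)
{
    (void)arg;
    for (;;) {
        sleep(1);   /* sleep() is a cancellation point */
    }
    return NULL;
}

static int start_workers(void)
{
    for (int i = 0; i < NUM_WORKERS; i++) {
        if (pthread_create(&workers[i], NULL, worker_main, NULL) != 0)
            return -1;
    }
    return 0;
}

/* Cancel every worker and wait until it is really gone. */
static void stop_workers(void)
{
    for (int i = 0; i < NUM_WORKERS; i++) {
        pthread_cancel(workers[i]);
        pthread_join(workers[i], NULL);
    }
}

/*
 * Fork only while the calling thread is the only one left, then
 * recreate the workers in the parent.
 * Returns the child's exit status, or -1 on error.
 */
static int fork_quiesced(void)
{
    int status = 0;

    stop_workers();

    pid_t pid = fork();
    if (pid == 0) {
        /* Child: no other thread existed at fork time, so no lock
         * was copied while held by a thread that is now missing. */
        _exit(7);
    }

    /* Parent (or failed fork): bring the workers back. */
    if (start_workers() != 0)
        return -1;
    if (pid < 0)
        return -1;
    if (waitpid(pid, &status, 0) < 0)
        return -1;
    return WIFEXITED(status) ? WEXITSTATUS(status) : -1;
}
```

A caller would run start_workers() once at startup and use fork_quiesced() wherever it would otherwise call fork() directly.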

Best regards,
Ranier Vilela

On Thu, Jun 13, 2019 at 22:09, Sinan Kaya <Okaya at kernel.org> wrote:

> I wanted to hear opinions on a surprising observation we made while
> developing an nginx module.
>
> During the nginx module's startup process, it starts up nginx but also
> does a lot of other work, some of which involves using glibc's
> implementation of timer_create(), a POSIX function.
>
>
> https://code.woboq.org/userspace/glibc/sysdeps/unix/sysv/linux/timer_create.c.html
>
> https://code.woboq.org/userspace/glibc/sysdeps/unix/sysv/linux/timer_routines.c.html#__active_timer_sigev_thread_lock
>
>
> Looking at the glibc source code for timer_create, we can see it has a
> global mutex named __active_timer_sigev_thread_lock.  If nginx happens
> to call fork() while the rest of the nginx module code is calling
> timer_create, the fork() will make a copy of global memory which
> includes the global mutex as being owned by some other thread. In the
> newly-forked process, that thread will not exist, and the mutex will
> always be owned by a non-existent thread. I'm pasting in a GDB trace
> showing this below the rest of my explanatory text.
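
The effect described above can be reproduced in a few lines. This is a sketch, not the glibc code: demo_lock is a hypothetical stand-in for __active_timer_sigev_thread_lock, and holder_main and child_sees_lock_held are made-up names. The child uses pthread_mutex_trylock() instead of pthread_mutex_lock() so the demo reports the stale lock rather than hanging. Strictly speaking, POSIX only allows async-signal-safe calls in the child of a multithreaded process, so treat the trylock as a glibc/Linux illustration only.

```c
#include <errno.h>
#include <pthread.h>
#include <semaphore.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Hypothetical stand-in for a library's global mutex. */
static pthread_mutex_t demo_lock = PTHREAD_MUTEX_INITIALIZER;

static sem_t holder_ready;    /* posted once the holder owns demo_lock */
static sem_t holder_release;  /* posted to let the holder unlock */

/* Second thread: takes the lock and keeps it across the fork(). */
static void *holder_main(void *arg)
{
    (void)arg;
    pthread_mutex_lock(&demo_lock);
    sem_post(&holder_ready);
    sem_wait(&holder_release);
    pthread_mutex_unlock(&demo_lock);
    return NULL;
}

/*
 * Returns 1 if the child found the mutex still locked (EBUSY),
 * 0 if the child could take it, -1 on error.
 */
static int child_sees_lock_held(void)
{
    pthread_t holder;
    int status = 0;

    if (sem_init(&holder_ready, 0, 0) != 0 ||
        sem_init(&holder_release, 0, 0) != 0)
        return -1;
    if (pthread_create(&holder, NULL, holder_main, NULL) != 0)
        return -1;

    sem_wait(&holder_ready);   /* demo_lock is now owned by the holder */

    pid_t pid = fork();
    if (pid == 0) {
        /* Child: the holder thread was not copied, but the locked
         * state of demo_lock was. Nobody here will ever unlock it. */
        int rc = pthread_mutex_trylock(&demo_lock);
        _exit(rc == EBUSY ? 1 : 0);
    }

    /* Parent: release the holder and clean up. */
    sem_post(&holder_release);
    pthread_join(holder, NULL);
    if (pid < 0)
        return -1;
    if (waitpid(pid, &status, 0) < 0)
        return -1;
    return WIFEXITED(status) ? WEXITSTATUS(status) : -1;
}
```

With a plain pthread_mutex_lock() in the child, the call would block forever, which matches the backtrace further down.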
>
>
> This thread is blocked in timer_create. It is trying to acquire a
> global pthread_mutex_t named __active_timer_sigev_thread_lock.
>
> (gdb) bt
>
> #0  0x0000ff02aab1634c in __lll_lock_wait (
>     futex=futex at entry=0xff02aaaee2c8 <__active_timer_sigev_thread_lock>,
>     private=0) at /usr/src/debug/glibc/2.27-r0/git/nptl/lowlevellock.c:46
>
> #1  0x0000ff02aab0f648 in __GI___pthread_mutex_lock (
>     mutex=mutex at entry=0xff02aaaee2c8 <__active_timer_sigev_thread_lock>)
>     at /usr/src/debug/glibc/2.27-r0/git/nptl/pthread_mutex_lock.c:78
>
> #2  0x0000ff02aaadbb20 in timer_create (clock_id=<optimized out>,
>     evp=0xff02a9e181a8, timerid=0xff02a41e43a0)
>     at /usr/src/debug/glibc/2.27-r0/git/sysdeps/unix/sysv/linux/timer_create.c:159
>
> ...
>
> #16 0x0000000000695db0 in ngx_thread_pool_cycle (data=0xff02a0027700)
>     at /usr/src/debug/nginx-staticdev/1.14.2-r0/nginx-1.14.2/src/core/ngx_thread_pool.c:342
> #17 0x0000ff02aab0cf78 in start_thread (arg=0xff02a8e160d6)
>     at /usr/src/debug/glibc/2.27-r0/git/nptl/pthread_create.c:463
>
> #18 0x0000ff02aa2efe2c in thread_start ()
>     at ../sysdeps/unix/sysv/linux/aarch64/clone.S:78
>
>
> According to the link below, the majority of C library functions are
> unsafe to call inside the nginx callback once fork() is involved.
>
> One way to work around the issue is to use the exec() functions, so
> that each child process starts from a fresh process image rather than
> inheriting a copy of the parent's memory.
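
A minimal sketch of that workaround, assuming the work can be moved into a separate helper executable. The name run_helper and the helper program are hypothetical, and this is not how nginx itself spawns its workers. The important part is that the child calls nothing but execv() and _exit() after fork(), both of which are async-signal-safe, so no inherited lock is ever touched.

```c
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/*
 * Fork and immediately exec the program at `path` (hypothetical helper).
 * Returns the helper's exit status, or -1 on error.
 */
static int run_helper(const char *path, char *const argv[])
{
    int status = 0;

    pid_t pid = fork();
    if (pid < 0)
        return -1;

    if (pid == 0) {
        /* Child: replace the copied process image at once. The new
         * image has fresh libc state, so locks that were held in the
         * parent at fork time no longer matter. */
        execv(path, argv);
        _exit(127);   /* reached only if execv() failed */
    }

    if (waitpid(pid, &status, 0) < 0)
        return -1;
    return WIFEXITED(status) ? WEXITSTATUS(status) : -1;
}
```

For example, passing "/bin/sh" with the arguments "sh", "-c", "exit 3" would make run_helper() return 3. posix_spawn() covers the same fork-then-exec pattern in a single call.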
>
>
> https://www.linuxprogrammingblog.com/threads-and-fork-think-twice-before-using-them
>
> "Library functions
>
> Problem with mutexes and critical code sections implies another
> non-obvious issue. It's theoretically possible to write your code
> executed in threads so that you are sure it's safe to call fork when
> such threads run but in practice there is one big problem: library
> functions. You're never sure if a library function you are using doesn't
> use global data. Even if it is thread safe, it may be achieved using
> mutexes internally. You are never sure. Even system library functions
> that are thread-safe may use locks internally."