SO_REUSEPORT

Mathew Heard mat999 at gmail.com
Fri May 3 23:02:20 UTC 2019


Not spinning since then, but that's when that worker (from the old binary)
was spawned. It's an old worker spinning.

Unfortunately there aren't any debug symbols.

GDB:
(gdb) bt
#0  0x00007ff842edd016 in ?? ()
#1  0x0000000040d9ab70 in ?? ()
#2  0x4096580000000000 in ?? ()
#3  0x4064000000000000 in ?? ()
#4  0x00000009413dcda0 in ?? ()
#5  0x41f975d000000006 in ?? ()
#6  0x000000004190c148 in ?? ()
#7  0x00000000413e4580 in ?? ()
#8  0x40bb6b7840f961c8 in ?? ()
#9  0x0000000000464000 in ?? ()
#10 0x00007ffea78d5370 in ?? ()
#11 0x0000000000000000 in ?? ()
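
Without symbols, one way to narrow down a frame like #0 is to check which
memory mapping holds that address. Below is a minimal sketch, assuming the
usual Linux /proc/<pid>/maps line format. The function name find_mapping and
the sample maps text are made up for illustration and are not taken from the
affected host.

```python
def find_mapping(maps_text, addr):
    """Return (start, end, perms, path) for the mapping containing addr, else None.

    maps_text is the content of /proc/<pid>/maps; path is "" for anonymous
    mappings, which have no pathname column.
    """
    for line in maps_text.splitlines():
        fields = line.split(None, 5)
        if len(fields) < 5:
            continue  # skip blank or malformed lines
        start_s, end_s = fields[0].split("-")
        start, end = int(start_s, 16), int(end_s, 16)
        if start <= addr < end:
            path = fields[5].strip() if len(fields) > 5 else ""
            return start, end, fields[1], path
    return None


# Hypothetical sample data, only to show the input format.
SAMPLE_MAPS = """\
00400000-00500000 r-xp 00000000 08:01 1234 /usr/sbin/nginx
7f0000000000-7f0000010000 r-xp 00000000 08:01 5678 /lib/example.so
7f0000020000-7f0000030000 rw-p 00000000 00:00 0
"""

if __name__ == "__main__":
    print(find_mapping(SAMPLE_MAPS, 0x7F0000000123))
```

On the real host the input would be the content of /proc/988/maps and the
address 0x00007ff842edd016 from frame #0. A result with a library path points
at that library; an empty path means the address is in an anonymous mapping.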

strace:
--- SIGALRM {si_signo=SIGALRM, si_code=SI_KERNEL} ---
rt_sigreturn()                          = 1094569376
--- SIGALRM {si_signo=SIGALRM, si_code=SI_KERNEL} ---
rt_sigreturn()                          = 1094569376
--- SIGALRM {si_signo=SIGALRM, si_code=SI_KERNEL} ---
rt_sigreturn()                          = 1094569376
--- SIGALRM {si_signo=SIGALRM, si_code=SI_KERNEL} ---
rt_sigreturn()                          = 1094569376
--- SIGALRM {si_signo=SIGALRM, si_code=SI_KERNEL} ---
rt_sigreturn()                          = 1094569376
--- SIGALRM {si_signo=SIGALRM, si_code=SI_KERNEL} ---
rt_sigreturn()                          = 1094569376
--- SIGALRM {si_signo=SIGALRM, si_code=SI_KERNEL} ---
rt_sigreturn()                          = 1094569376
--- SIGALRM {si_signo=SIGALRM, si_code=SI_KERNEL} ---
rt_sigreturn()                          = 1094569376
--- SIGALRM {si_signo=SIGALRM, si_code=SI_KERNEL} ---
rt_sigreturn()                          = 1094569376


It is a reduced version of OpenResty (fewer additional modules), so
third-party module interference is possible. Is there anything in
particular regarding modules that I should check for?

My first step has been trying to work out how to reproduce it on demand.
Just triggering binary reloads is not enough; something has to happen
between them, and I'm not yet sure what.
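
As a sanity check on the "131740:46" CPU time in the ps output quoted below,
here is a minimal sketch of the conversion. It assumes ps prints cumulative
CPU time in that column as minutes:seconds; the function name
ps_time_to_days is made up for illustration.

```python
def ps_time_to_days(cpu_time):
    """Convert a ps 'MMM:SS' cumulative CPU time string to days."""
    minutes, seconds = cpu_time.split(":")
    total_seconds = int(minutes) * 60 + int(seconds)
    return total_seconds / 86400  # 86400 seconds per day


if __name__ == "__main__":
    print(round(ps_time_to_days("131740:46"), 1))
```

That works out to roughly 91.5 days of CPU time, which matches the "about 90
days" reading and a worker started on Jan30 running at close to 100% CPU.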

On Sat, May 4, 2019 at 8:52 AM Maxim Dounin <mdounin at mdounin.ru> wrote:

> Hello!
>
> On Thu, May 02, 2019 at 08:51:41PM +1000, Mathew Heard wrote:
>
> > Got a little bit further and confirmed this is definitely to do with the
> > binary upgrade.
> >
> > www-data   988 99.9  0.7 365124 122784 ?       R    Jan30 131740:46 nginx: worker process
> > root      2800  0.0  1.0 361828 165044 ?       Ss   Jan05  27:54 nginx: master process /usr/sbin/nginx -g daemon on; master_process on;
> >
> > 2800   is nginx.old, also (nginx/1.15.8) as we did 2 builds with slightly
> > different compile options.
> >
> > The processes do not respond to nice kill signals, only a -9 was able to
> > kill it.
>
> So, as previously suggested, there is another problem elsewhere.
>
> And, if I'm reading "131740:46" correctly, the worker was eating
> CPU for about 90 days now.  Looking into debugger where it spins
> might be helpful.
>
> Also make sure to take a look at the exact compile options (as
> shown by "nginx.old -V"), as well as any patches and/or modules
> loaded, and the configuration used.
>
> --
> Maxim Dounin
> http://mdounin.ru/
> _______________________________________________
> nginx-devel mailing list
> nginx-devel at nginx.org
> http://mailman.nginx.org/mailman/listinfo/nginx-devel
>