Hangup in epoll

Igor Sysoev igor at sysoev.ru
Tue Sep 7 12:38:14 MSD 2010


On Tue, Sep 07, 2010 at 10:25:30AM +0200, jhauser wrote:

> Hello, with recent versions of nginx (0.8.47 - 0.8.49) we had hangups
> of all nginx processes, which could not even be killed. This happened
> during low-traffic hours. In the kernel logs we found:
> 
> 
> Sep  7 05:02:00 www06 kernel: Unable to handle kernel NULL pointer
> dereference at virtual address 00000020
> Sep  7 05:02:00 www06 kernel:  printing eip:
> Sep  7 05:02:00 www06 kernel: c0256a3d
> Sep  7 05:02:00 www06 kernel: *pde = 32919001
> Sep  7 05:02:00 www06 kernel: Oops: 0000 [#1]
> Sep  7 05:02:00 www06 kernel: SMP
> Sep  7 05:02:00 www06 kernel: last sysfs file:
> /devices/pci0000:00/0000:00:0a.0/0000:02:02.1/irq
> Sep  7 05:02:00 www06 kernel: Modules linked in: nls_utf8 nfs lockd
> nfs_acl sunrpc cpufreq_ondemand cpufreq_userspace ipt_TCPMSS
> cpufreq_powersave xt_limit xt_tcpudp xt_state powernow_k8 ipt_LOG
> ipt_recent freq_table iptable_nat ip_nat ip_conntrack nfnetlink
> iptable_filter ip_tables x_tables ipv6 dock
> button battery ac xfs_quota xfs loop dm_mod i2c_amd756 ide_cd ohci_hcd
> ehci_hcd cdrom i2c_core mptctl hw_random usbcore tg3 ext3 jbd edd fan
> thermal processor sg mptspi mptscsih mptbase scsi_transport_spi
> amd74xx sd_mod scsi_mod
> ide_disk ide_core
> Sep  7 05:02:00 www06 kernel: CPU:    1
> Sep  7 05:02:00 www06 kernel: EIP:    0060:[<c0256a3d>]    Not tainted VLI
> Sep  7 05:02:00 www06 kernel: EFLAGS: 00210246   (2.6.16.60-0.66.1-bigsmp #1)
> Sep  7 05:02:00 www06 kernel: EIP is at sock_poll+0x9/0xe
> Sep  7 05:02:00 www06 kernel: eax: f447f980   ebx: 00000000   ecx:
> 00000000   edx: e936f900
> Sep  7 05:02:00 www06 kernel: esi: cd8f9e2c   edi: f5cb4540   ebp:
> cd8f9e00   esp: e8b0ff60
> Sep  7 05:02:00 www06 kernel: ds: 007b   es: 007b   ss: 0068
> Sep  7 05:02:00 www06 kernel: Process nginx (pid: 17585,
> threadinfo=e8b0e000 task=c968ed10)
> Sep  7 05:02:00 www06 kernel: Stack: <0>00000200 c018ced6 083d1b88
> 00000000 defa3080 7fffffff f10cd21c c94fe5c0
> Sep  7 05:02:00 www06 kernel:        00000000 00000000 c0181a69
> ccbd5380 f45f3740 c0167047 e8b0ff98 e8b0ff98
> Sep  7 05:02:00 www06 kernel:        f5cb4550 f5cb4550 00000008
> ffffffff 00000003 e8b0e000 c0103dcb 00000008
> Sep  7 05:02:00 www06 kernel: Call Trace:
> Sep  7 05:02:00 www06 kernel:  [<c018ced6>] sys_epoll_wait+0x246/0x3f8
> Sep  7 05:02:00 www06 kernel:  [<c0181a69>] mntput_no_expire+0x13/0x76
> Sep  7 05:02:00 www06 kernel:  [<c0167047>] filp_close+0x4e/0x54
> Sep  7 05:02:00 www06 kernel:  [<c0103dcb>] sysenter_past_esp+0x54/0x79
> Sep  7 05:02:00 www06 kernel: Code: d8 89 43 10 0f 20 e0 89 43 14 5b
> c3 b8 20 53 42 c0 e9 bd ff ff ff b8 01 00 00 00 c3 b8 fa ff ff ff c3
> 53 89 d1 8b 50 78 8b 5a 08 <ff> 53 20 5b c3 53 89 d1 8b 50 78 8b 5a 08
> ff 53 40 5b c3 53 8b
> 
> The system is SUSE Linux Enterprise Server 10 (i586),
> Kernel Linux version 2.6.16.60-0.66.1-bigsmp (geeko at buildhost) (gcc
> version 4.1.2 20070115 (SUSE Linux)) #1 SMP Fri May 28 12:10:21 UTC
> 2010
> nginx version 0.8.49
> We have now switched to the poll event module for the time being, as
> this is a low-traffic site.
> 
> The hangup happened twice before, but only sporadically, and could not
> be correlated with specific requests. Is there any more information we
> could provide? Also, a hint on how to avoid a full reboot of the
> machine in such situations would be appreciated.

As I understand it, this is not a "hangup of all nginx processes, which
could not even be killed", but a kernel crash. This is not an nginx
issue; it is a Linux kernel bug.
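
For reference, the workaround mentioned above (switching from epoll to
poll) is selected with the "use" directive in the events block of
nginx.conf. A minimal sketch, assuming an otherwise default events
block; any other directives already in that block stay as they are:

    events {
        # assumption: "use" was not set before, so nginx chose epoll itself
        use poll;
    }

Once the kernel is fixed, removing the "use" line lets nginx choose the
most efficient method for the platform again.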


-- 
Igor Sysoev
http://sysoev.ru/en/


