Hangup in epoll
Igor Sysoev
igor at sysoev.ru
Tue Sep 7 12:38:14 MSD 2010
On Tue, Sep 07, 2010 at 10:25:30AM +0200, jhauser wrote:
> Hello, with recent versions of nginx (0.8.47 - 0.8.49) we had hangups
> of all nginx processes, which could not even be killed. This happened
> during low-traffic hours. In the kernel logs we found:
>
>
> Sep 7 05:02:00 www06 kernel: Unable to handle kernel NULL pointer
> dereference at virtual address 00000020
> Sep 7 05:02:00 www06 kernel: printing eip:
> Sep 7 05:02:00 www06 kernel: c0256a3d
> Sep 7 05:02:00 www06 kernel: *pde = 32919001
> Sep 7 05:02:00 www06 kernel: Oops: 0000 [#1]
> Sep 7 05:02:00 www06 kernel: SMP
> Sep 7 05:02:00 www06 kernel: last sysfs file:
> /devices/pci0000:00/0000:00:0a.0/0000:02:02.1/irq
> Sep 7 05:02:00 www06 kernel: Modules linked in: nls_utf8 nfs lockd
> nfs_acl sunrpc cpufreq_ondemand cpufreq_userspace ipt_TCPMSS
> cpufreq_powersave xt_limit xt_tcpudp xt_state powernow_k8 ipt_LOG
> ipt_recent freq_table iptable_nat ip_nat ip_conntrack nfnetlink
> iptable_filter ip_tables x_tables ipv6 dock button battery ac
> xfs_quota xfs loop dm_mod i2c_amd756 ide_cd ohci_hcd ehci_hcd cdrom
> i2c_core mptctl hw_random usbcore tg3 ext3 jbd edd fan thermal
> processor sg mptspi mptscsih mptbase scsi_transport_spi amd74xx
> sd_mod scsi_mod ide_disk ide_core
> Sep 7 05:02:00 www06 kernel: CPU: 1
> Sep 7 05:02:00 www06 kernel: EIP: 0060:[<c0256a3d>] Not tainted VLI
> Sep 7 05:02:00 www06 kernel: EFLAGS: 00210246 (2.6.16.60-0.66.1-bigsmp #1)
> Sep 7 05:02:00 www06 kernel: EIP is at sock_poll+0x9/0xe
> Sep 7 05:02:00 www06 kernel: eax: f447f980 ebx: 00000000 ecx:
> 00000000 edx: e936f900
> Sep 7 05:02:00 www06 kernel: esi: cd8f9e2c edi: f5cb4540 ebp:
> cd8f9e00 esp: e8b0ff60
> Sep 7 05:02:00 www06 kernel: ds: 007b es: 007b ss: 0068
> Sep 7 05:02:00 www06 kernel: Process nginx (pid: 17585,
> threadinfo=e8b0e000 task=c968ed10)
> Sep 7 05:02:00 www06 kernel: Stack: <0>00000200 c018ced6 083d1b88
> 00000000 defa3080 7fffffff f10cd21c c94fe5c0
> Sep 7 05:02:00 www06 kernel: 00000000 00000000 c0181a69
> ccbd5380 f45f3740 c0167047 e8b0ff98 e8b0ff98
> Sep 7 05:02:00 www06 kernel: f5cb4550 f5cb4550 00000008
> ffffffff 00000003 e8b0e000 c0103dcb 00000008
> Sep 7 05:02:00 www06 kernel: Call Trace:
> Sep 7 05:02:00 www06 kernel: [<c018ced6>] sys_epoll_wait+0x246/0x3f8
> Sep 7 05:02:00 www06 kernel: [<c0181a69>] mntput_no_expire+0x13/0x76
> Sep 7 05:02:00 www06 kernel: [<c0167047>] filp_close+0x4e/0x54
> Sep 7 05:02:00 www06 kernel: [<c0103dcb>] sysenter_past_esp+0x54/0x79
> Sep 7 05:02:00 www06 kernel: Code: d8 89 43 10 0f 20 e0 89 43 14 5b
> c3 b8 20 53 42 c0 e9 bd ff ff ff b8 01 00 00 00 c3 b8 fa ff ff ff c3
> 53 89 d1 8b 50 78 8b 5a 08 <ff> 53 20 5b c3 53 89 d1 8b 50 78 8b 5a 08
> ff 53 40 5b c3 53 8b
>
> The system is SUSE Linux Enterprise Server 10 (i586),
> Kernel Linux version 2.6.16.60-0.66.1-bigsmp (geeko at buildhost) (gcc
> version 4.1.2 20070115 (SUSE Linux)) #1 SMP Fri May 28 12:10:21 UTC
> 2010
> nginx version: 0.8.49
> We have now switched to the poll event module for the time being, as
> this is a low-traffic site.
>
> The hangup had happened twice before, but only sporadically, and could
> not be correlated with specific requests. Is there any more information
> we could provide? Also, a hint on how to avoid a full reboot of the
> machine in such situations would be appreciated.
As I understand it, this is not a hangup of all nginx processes that
cannot be killed, but a kernel crash. This is not an nginx issue; it is
a Linux kernel bug.
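
The oops itself shows where the fault occurs. The sketch below is only an
illustration and not part of the original report: it takes the "Code:" line
and the register dump from the log above, decodes the single instruction at
the faulting EIP (marked <ff>), and computes the address it reads. It handles
only this one instruction form, and the names CODE, REG_DUMP, faulting_bytes
and decode_indirect_call are made up for the example.

```python
# Register order used by the x86 ModRM byte in 32-bit mode.
REGS = ["eax", "ecx", "edx", "ebx", "esp", "ebp", "esi", "edi"]

# Copied from the oops above: the "Code:" line and part of the register dump.
CODE = (
    "d8 89 43 10 0f 20 e0 89 43 14 5b c3 b8 20 53 42 c0 e9 bd ff ff ff "
    "b8 01 00 00 00 c3 b8 fa ff ff ff c3 53 89 d1 8b 50 78 8b 5a 08 "
    "<ff> 53 20 5b c3 53 89 d1 8b 50 78 8b 5a 08 ff 53 40 5b c3 53 8b"
)
REG_DUMP = {"eax": 0xF447F980, "ebx": 0x00000000,
            "ecx": 0x00000000, "edx": 0xE936F900}


def faulting_bytes(code):
    """Return the bytes starting at the <xx> marker, i.e. at the faulting EIP."""
    tokens = code.split()
    start = next(i for i, t in enumerate(tokens) if t.startswith("<"))
    return [int(t.strip("<>"), 16) for t in tokens[start:]]


def decode_indirect_call(b):
    """Decode only `ff /2` with an 8-bit displacement: call *disp8(%reg)."""
    opcode, modrm, disp = b[0], b[1], b[2]
    mod, reg, rm = modrm >> 6, (modrm >> 3) & 7, modrm & 7
    # rm == 4 would mean a SIB byte follows, which this sketch does not handle.
    if opcode != 0xFF or reg != 2 or mod != 1 or rm == 4:
        raise ValueError("not a call *disp8(%reg) instruction")
    if disp >= 0x80:  # disp8 is a signed byte
        disp -= 0x100
    return REGS[rm], disp


base, disp = decode_indirect_call(faulting_bytes(CODE))
fault_addr = (REG_DUMP[base] + disp) & 0xFFFFFFFF
print(f"call *{disp:#x}(%{base}) -> reads address {fault_addr:#010x}")
# prints: call *0x20(%ebx) -> reads address 0x00000020
```

ebx is zero in the register dump, so the indirect call inside sock_poll
fetches its target from address 0x20, which is exactly the "NULL pointer
dereference at virtual address 00000020" reported in the first line of the
oops. The fault is raised in kernel code called from sys_epoll_wait, not in
nginx.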
--
Igor Sysoev
http://sysoev.ru/en/