<div dir="ltr"><div dir="ltr"><div><div><div><div><div><div><div>Hi all,<br><br></div>I
was hoping someone might have an idea here. I have a number of nginx
instances doing load balancing behind AWS Network Load Balancers (TCP),
which seem to support only TCP health checks.<br><br></div>Recently a few have
stopped responding (frozen). They still accept TCP connections
from the NLB, so the health check never fails, but they
cannot actually process requests, and you cannot even SSH into the
machine. A reboot is required, and it takes longer than normal.<br><br></div>I
think the failure is related to a disk issue, since the only errors in
the logs were about the disk (kernel log below).<br><br></div></div>Ideally
if nginx or the OS fails, it would be better if the port just closed.
I've considered writing a small daemon that checks nginx locally over HTTP
and keeps a port open only while everything is OK.<br></div><div><br></div><div>These machines had been running for months without any issues until now.</div><div><br></div>Does anyone have an idea?<br><br></div>Thanks!<br><br>----<br><pre class="gmail-m_-2817378367315308525gmail-B-G">[4161960.544106] INFO: task jbd2/xvda1-8:271 blocked for more than 120 seconds.
[4161960.551035] Not tainted 4.4.0-1022-aws #31-Ubuntu
[4161960.556118] "echo 0 > /proc/sys/kernel/hung_task_<wbr>timeout_secs" disables this message.
[4161960.562846] INFO: task monit:13224 blocked for more than 120 seconds.
[4161960.567394] Not tainted 4.4.0-1022-aws #31-Ubuntu
[4161960.571120] "echo 0 > /proc/sys/kernel/hung_task_<wbr>timeout_secs" disables this message.
[4162080.576076] INFO: task dhclient:696 blocked for more than 120 seconds.
[4162080.579596] Not tainted 4.4.0-1022-aws #31-Ubuntu
[4162080.582355] "echo 0 > /proc/sys/kernel/hung_task_<wbr>timeout_secs" disables this message.
[4162080.586470] INFO: task monit:13224 blocked for more than 120 seconds.
[4162080.589847] Not tainted 4.4.0-1022-aws #31-Ubuntu
[4162080.592654] "echo 0 > /proc/sys/kernel/hung_task_<wbr>timeout_secs" disables this message.
[4162200.596100] INFO: task jbd2/xvda1-8:271 blocked for more than 120 seconds.
[4162200.599646] Not tainted 4.4.0-1022-aws #31-Ubuntu
[4162200.602422] "echo 0 > /proc/sys/kernel/hung_task_<wbr>timeout_secs" disables this message.
[4162200.606423] INFO: task dhclient:696 blocked for more than 120 seconds.
[4162200.610118] Not tainted 4.4.0-1022-aws #31-Ubuntu
[4162200.613093] "echo 0 > /proc/sys/kernel/hung_task_<wbr>timeout_secs" disables this message.
[4162200.617889] INFO: task monit:13224 blocked for more than 120 seconds.
[4162200.621641] Not tainted 4.4.0-1022-aws #31-Ubuntu
[4162200.624506] "echo 0 > /proc/sys/kernel/hung_task_<wbr>timeout_secs" disables this message.
[4162244.551431] systemd[1]: Failed to start Journal Service.
[4162320.628099] INFO: task jbd2/xvda1-8:271 blocked for more than 120 seconds.
[4162320.631942] Not tainted 4.4.0-1022-aws #31-Ubuntu
[4162320.635012] "echo 0 > /proc/sys/kernel/hung_task_<wbr>timeout_secs" disables this message.
[4162320.639647] INFO: task dhclient:696 blocked for more than 120 seconds.
[4162320.643241] Not tainted 4.4.0-1022-aws #31-Ubuntu
[4162320.646233] "echo 0 > /proc/sys/kernel/hung_task_<wbr>timeout_secs" disables this message.
[4162320.650712] INFO: task monit:13224 blocked for more than 120 seconds.
[4162320.654190] Not tainted 4.4.0-1022-aws #31-Ubuntu
[4162320.657183] "echo 0 > /proc/sys/kernel/hung_task_<wbr>timeout_secs" disables this message.
[4162334.801390] systemd[1]: Failed to start Journal Service.
[4162425.051503] systemd[1]: Failed to start Journal Service.
[4162515.301393] systemd[1]: Failed to start Journal Service.
</pre></div></div>
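<div>For reference, the small local-check daemon mentioned above might look roughly like the sketch below. It is only an illustration under assumptions: the names <code>backend_healthy</code>, <code>PortGate</code>, <code>step</code> and <code>run</code>, the check URL, and the gate port 8081 are all hypothetical, and the NLB health check would have to be pointed at the gate port instead of the nginx port. One caveat: the kernel completes TCP handshakes on a listening socket even when the owning process is blocked, so this only helps while the daemon itself can still run and close the port.</div>

```python
import http.client
import socket
import time
import urllib.request


def backend_healthy(url, timeout=2.0):
    """Return True if a local HTTP GET succeeds within the timeout."""
    try:
        with urllib.request.urlopen(url, timeout=timeout) as resp:
            return 200 <= resp.status < 400
    except (OSError, http.client.HTTPException):
        # Covers refused connections, timeouts, and HTTP 4xx/5xx (HTTPError).
        return False


class PortGate:
    """A TCP listener that exists only while the backend is healthy."""

    def __init__(self, port, host="0.0.0.0"):
        self.host = host
        self.port = port
        self.sock = None

    def open(self):
        if self.sock is not None:
            return
        s = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
        s.setsockopt(socket.SOL_SOCKET, socket.SO_REUSEADDR, 1)
        s.bind((self.host, self.port))
        s.listen(16)
        s.setblocking(False)
        self.sock = s

    def drain(self):
        """Accept and drop pending health-check connections."""
        if self.sock is None:
            return
        while True:
            try:
                conn, _ = self.sock.accept()
            except BlockingIOError:
                return
            conn.close()

    def close(self):
        if self.sock is not None:
            self.sock.close()
            self.sock = None


def step(url, gate):
    """One check: open the gate if healthy, close it otherwise."""
    if backend_healthy(url):
        gate.open()
        gate.drain()
    else:
        gate.close()


def run(url, gate, interval=5.0):
    """Check forever; the URL, port, and interval are assumptions to tune."""
    while True:
        step(url, gate)
        time.sleep(interval)
```

<div>It would be started with something like <code>run("http://127.0.0.1/", PortGate(8081))</code>, with both values being placeholders. Closing the listener, rather than just refusing to accept, is what makes the NLB's TCP check see a refused connection.</div>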