Nginx as Load Balancer Connection Issues

gtuhl nginx-forum at nginx.us
Fri Jan 6 21:49:16 UTC 2012


We have a box running nginx and two boxes running apache.  The apache
boxes are configured as an upstream for nginx.  

The nginx box has a public IP, and then it talks to the upstream apaches
using the private network (same switch).  We are sustaining a couple
hundred requests/sec.
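
To make the topology concrete, here is a minimal sketch of the kind of
upstream setup described above.  The upstream name and the 10.0.0.x
addresses are placeholders for illustration, not the real values from our
config, and the max_fails/fail_timeout values shown are simply the
documented defaults.  The fragment would sit inside the http block.

```nginx
# Illustrative sketch only: "apache_backend" and the 10.0.0.x addresses
# are hypothetical placeholders, not values from the actual setup.
upstream apache_backend {
    # max_fails=1 and fail_timeout=10s are the documented defaults.
    # A server that fails max_fails times within fail_timeout is treated
    # as unavailable for the following fail_timeout period.
    server 10.0.0.11:80 max_fails=1 fail_timeout=10s;
    server 10.0.0.12:80 max_fails=1 fail_timeout=10s;
}

server {
    listen 80;

    location / {
        # Requests arriving on the public side are proxied to the
        # Apache boxes over the private network.
        proxy_pass http://apache_backend;
    }
}
```

As I understand it, with only two servers and values like these, one
failed attempt against each server inside the same window is enough for
nginx to consider both unavailable, which is when "no live upstreams"
gets logged.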

We've had several incidents where nginx marks the upstreams as down,
causing the "no live upstreams" message in the error log and end users
seeing 502 errors.  When this happens the machines are barely being
used: single-digit load averages on 16-core boxes.

Initially we were seeing a ton of "connect() failed (110: Connection
timed out)" errors, about one every couple of seconds.  I added these to
sysctl.conf and that seemed to solve the problem:

net.ipv4.tcp_syncookies = 1
net.ipv4.tcp_fin_timeout = 20
net.ipv4.tcp_max_syn_backlog = 20480
net.core.netdev_max_backlog = 4096
net.ipv4.tcp_max_tw_buckets = 400000
net.core.somaxconn = 4096

Now things generally run fine, but every once in a while we get a huge
burst of "upstream prematurely closed connection while reading response
header from upstream" errors followed by a "no live upstreams".  Again,
there is no apparent load on the machines involved.  These bursts only
last a minute or so.  We also still get an occasional "connect() failed
(110: Connection timed out)", but those are far less frequent, perhaps 1
or 2 per hour.

Does anyone have recommendations for tuning the networking side to
improve the situation here?  These are some of the nginx.conf settings we
have in place; I have removed the ones that don't seem related to the
issue:

worker_processes  4;
worker_rlimit_nofile 30000;
events {
    worker_connections  4096;
    # multi_accept on;
    use epoll;
}
http {
    client_max_body_size 200m;

    proxy_read_timeout 600s;
    proxy_send_timeout 600s;
    proxy_connect_timeout 60s;

    proxy_buffer_size 128k;
    proxy_buffers 4 128k;

    keepalive_timeout  0;
    tcp_nodelay        on;
}

Happy to provide any other details.  This is the "ulimit -a" output on
all boxes:

core file size          (blocks, -c) 0
data seg size           (kbytes, -d) unlimited
scheduling priority             (-e) 20
file size               (blocks, -f) unlimited
pending signals                 (-i) 16382
max locked memory       (kbytes, -l) 64
max memory size         (kbytes, -m) unlimited
open files                      (-n) 300000
pipe size            (512 bytes, -p) 8
POSIX message queues     (bytes, -q) 819200
real-time priority              (-r) 0
stack size              (kbytes, -s) 8192
cpu time               (seconds, -t) unlimited
max user processes              (-u) unlimited
virtual memory          (kbytes, -v) unlimited
file locks                      (-x) unlimited

Posted at Nginx Forum: http://forum.nginx.org/read.php?2,220894,220894#msg-220894


