client wrongly closed keepalive connection problem

Mon Nov 28 15:17:07 UTC 2011

Hi all, we are a group of college students maintaining a Linux distribution
mirror. We are not very experienced and have run into the following
problem.
Our college has a PXE system that lets any computer boot a Linux system or
install one over Ethernet. Right now most users cannot install, because
most package downloads time out.
To track down the problem, we used lftp to mirror directories from our server.
On this directory (our mirror has the same content):
http://ftp.debian.org/debian/pool/main/a/abiword/
lftp prints:

<--- Accept-Ranges: bytes
<--- 
---> HEAD /debian/pool/main/a/abiword/abiword_2.8.2.orig.tar.gz HTTP/1.1
---> Host: mirrors.ustc.edu.cn
---> User-Agent: lftp/4.0.6
---> Accept: */*
---> Connection: keep-alive
---> 
<--- HTTP/1.1 200 OK
**** recv: <input-socket>: Connection reset by peer
---- Closing HTTP connection

Sometimes the error is instead
"send:<output-socket>: Connection reset by peer"

and the nginx error.log shows:
"[info] 8864#0: *1661715 client xxxx closed keepalive connection"

After mirroring with lftp many times, we found that the server only starts
to send data back after around 100 files in this directory have been
requested. After that, the reset error shows up about once every 15-17
files received, until the server starts sending the bigger files in this
directory.
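
A minimal sketch of how such counts could be extracted from a saved lftp
debug log, in case it helps anyone check a similar trace. It assumes Python 3
and relies only on the "--->" request lines and the "Connection reset by peer"
message as they appear in the excerpt above; the function name is made up for
illustration.

```python
import re

# lftp debug lines that start a request, e.g. "---> HEAD /path HTTP/1.1".
# Header lines such as "---> Host: ..." do not match.
REQUEST_RE = re.compile(r"^---> (?:HEAD|GET) ")
RESET_MARK = "Connection reset by peer"


def requests_between_resets(log_lines):
    """Return the number of requests sent before each reset, in order."""
    counts = []
    current = 0
    for line in log_lines:
        if REQUEST_RE.match(line):
            current += 1
        elif RESET_MARK in line:
            # A reset ends the current run of requests.
            counts.append(current)
            current = 0
    return counts
```

Called on an open log file, e.g. requests_between_resets(open("lftp-debug.log"))
with a hypothetical file name, it returns one count per reset.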

At first we suspected the cause was the large number of small files at the
start of this directory, but when we moved the directory to another host
and mirrored it from there, everything worked fine.
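
To compare two hosts with the same request pattern but without lftp,
something like the following could be used. This is only a sketch, assuming
Python 3; probe_keepalive is a hypothetical helper, not the tool we used, and
it sends one request at a time, which may differ from how lftp issues its
requests.

```python
import http.client


def probe_keepalive(conn, paths):
    """Send HEAD requests for `paths` over one keep-alive connection.

    `conn` is an http.client.HTTPConnection (or a compatible object).
    Returns (ok, failed): how many requests got a response and how many
    failed because the connection was reset, closed, or timed out.
    """
    ok = failed = 0
    for path in paths:
        try:
            conn.request("HEAD", path, headers={"Connection": "keep-alive"})
            response = conn.getresponse()
            response.read()  # finish the response so the connection can be reused
            ok += 1
        except (OSError, http.client.HTTPException):
            failed += 1
            conn.close()  # HTTPConnection reconnects on the next request()
    conn.close()
    return ok, failed
```

For example: probe_keepalive(http.client.HTTPConnection("mirrors.ustc.edu.cn",
timeout=10), paths), where paths is a list of URL paths under
/debian/pool/main/a/abiword/.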

Below is our nginx configuration, with the less important parts omitted:

user www-data;
worker_processes  16;
worker_cpu_affinity  xxxxx;  # actual mask omitted
worker_rlimit_nofile 65535;
events {
    use epoll;
    worker_connections  10000;
    #multi_accept off;
}
http {
    include       /etc/nginx/mime.types;
    access_log    /var/log/nginx/access.log;
    error_log  /var/log/nginx/error.log info;

    sendfile        on;
    tcp_nopush     on;

    keepalive_timeout  30;
    tcp_nodelay        off;  # we also tried tcp_nopush off with tcp_nodelay on
    limit_zone conn_per_ip $binary_remote_addr 10m;
}

server {
    root /srv/www/;
    autoindex on;
    limit_conn conn_per_ip 20;

    ## limit speed for MSIE UAs
    if ($http_user_agent ~ "MSIE" ) { 
        set $limit_rate 10k;
    } 
}

netstat -ant | wc -l gives 4000-5000.
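
The total alone does not say which TCP states those connections are in. A
minimal sketch of a per-state breakdown, assuming Python 3 and the usual Linux
netstat -ant layout where the state is the sixth column; count_tcp_states is a
hypothetical helper, not something we ran on the server.

```python
from collections import Counter


def count_tcp_states(netstat_output):
    """Count TCP connections per state in the text printed by `netstat -ant`."""
    states = Counter()
    for line in netstat_output.splitlines():
        fields = line.split()
        # Data rows start with "tcp" or "tcp6"; the two header lines are skipped.
        if len(fields) >= 6 and fields[0].startswith("tcp"):
            states[fields[5]] += 1
    return states
```

The input can be captured with subprocess.run(["netstat", "-ant"],
capture_output=True, text=True).stdout and passed straight to the function.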

sysctl.conf:
net.ipv4.tcp_max_syn_backlog = 65536
net.core.netdev_max_backlog = 32768
net.core.somaxconn = 32768
net.core.wmem_default = 8388608
net.core.rmem_default = 8388608
net.core.rmem_max = 16777216
net.core.wmem_max = 16777216
net.ipv4.tcp_timestamps = 0
net.ipv4.tcp_synack_retries = 2
net.ipv4.tcp_syn_retries = 2
net.ipv4.tcp_tw_recycle = 1
net.ipv4.tcp_tw_reuse = 1
net.ipv4.tcp_mem = 94500000 915000000 927000000
net.ipv4.tcp_max_orphans = 3276800

nginx versions we have tried: 1.0.5, 1.0.10, 0.7.67
Server system: Debian squeeze.

So far we have found that limit_req might solve the problem, but is this
the right way to fix it? We are concerned because it will restrict
performance.

Thank you for your help.

Posted at Nginx Forum: http://forum.nginx.org/read.php?2,219220,219220#msg-219220