nginx not responding to some SYN packets

Igor Sysoev is at rambler-co.ru
Tue Feb 10 10:02:09 MSK 2009


On Mon, Feb 09, 2009 at 01:51:12PM -0800, Radek Burkat wrote:

> I have been looking at this issue to try to understand the interactions and
> this is what I have found out so far.
> I can reproduce the issue by decreasing net.ipv4.tcp_max_syn_backlog
> and setting net.ipv4.tcp_syncookies = 0.  It makes sense that when the load
> gets a little high, the system does not process accept() fast enough and
> the backlog builds up.  When the backlog fills up, the SYN/ACK to the
> initial SYN is not issued.
> It would seem also that enabling syncookies uses some other mechanism, as I
> do not see the dropped connections when I have net.ipv4.tcp_syncookies = 1
> and the backlog set to some small value like net.ipv4.tcp_max_syn_backlog = 10.
> From what I have read so far, syncookies go into action when you hit a
> certain high-water mark on your backlog buffer.  Raising the backlog value
> and/or enabling syncookies seems like a good idea.
> You can look at how many connections are in the backlog with netstat:
> netstat -aln | grep -c SYN_RECV
> 146
> The problem here is that this gives you an instantaneous count of the
> backlog.  As you can imagine this backlog is getting filled and cleared at a
> high rate. It would be nice to know when this actually gets full or even
> what the max has been over a period of time.  Is there a stat like this in
> net statistics?
> This information seems like it is quite important in determining the
> performance of your server latency.
> 
> In nginx there is the optional backlog variable in the listen configuration.
>  I imagine that this maps directly to the listen() system call.  Though I see
> that the nginx backlog default is -1. How does that translate to the system
> parameter, which requires an int above 0?
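On the question above about a cumulative statistic: on Linux the kernel does
keep since-boot counters for listen-queue overflows. "netstat -s" reports
them, and the raw values live in /proc/net/netstat on the TcpExt lines as
ListenOverflows and ListenDrops. A minimal Python sketch of reading that
format follows; the sample text and its numbers are illustrative, not output
from the machine discussed in this thread:

```python
# Parse TcpExt counters from text in the /proc/net/netstat format:
# one header line of counter names and one line of values per protocol tag.
def parse_tcp_ext(text):
    """Return a dict mapping TcpExt counter name -> integer value."""
    lines = [l for l in text.splitlines() if l.startswith("TcpExt:")]
    names = lines[0].split()[1:]                    # skip the "TcpExt:" tag
    values = [int(v) for v in lines[1].split()[1:]]
    return dict(zip(names, values))

# Illustrative sample in the /proc/net/netstat layout (values made up):
sample = (
    "TcpExt: SyncookiesSent SyncookiesRecv ListenOverflows ListenDrops\n"
    "TcpExt: 3 1 146 152\n"
)

counters = parse_tcp_ext(sample)
print(counters["ListenOverflows"], counters["ListenDrops"])  # prints: 146 152

# On a live Linux box, read the real file instead:
# counters = parse_tcp_ext(open("/proc/net/netstat").read())
```

Because these counters only ever increase, sampling them periodically tells
you whether the queue has overflowed at all, which an instantaneous SYN_RECV
count cannot.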

Yes, backlog may affect this. However, it seems that you are running a very
old nginx, as this was fixed in 2007:

Changes with nginx 0.6.7                                         15 Aug 2007
Changes with nginx 0.5.32                                        24 Sep 2007

    *) Bugfix: now nginx uses default listen backlog value 511 on all 
       platforms except FreeBSD.
       Thanks to Jiang Hong.

Now -1 is used on FreeBSD only and it means the maximum available backlog.
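On versions with that fix you can still set the queue length explicitly with
the backlog parameter of the listen directive; a sketch with an illustrative
value (the kernel may silently cap the effective queue, e.g. via
net.core.somaxconn on Linux):

```nginx
# Illustrative value; the OS caps what listen(2) actually grants.
listen 80 backlog=1024;
```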

Besides, net.ipv4.tcp_max_syn_backlog = 10 is too small.

Also, when the kernel does not have enough backlog, it usually responds with
an RST packet, if this is not disabled.
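On the kernel side, the related knobs look roughly like this; the values are
illustrative, in the same /etc/sysctl.conf form as the Varnish settings
quoted below:

```
# Half-open (SYN_RECV) queue; 10 is far too small for a busy server
net.ipv4.tcp_max_syn_backlog = 4096
# Cap on the accept-queue length passed to listen(2)
net.core.somaxconn = 1024
# Fall back to syncookies instead of ignoring SYNs when the queue overflows
net.ipv4.tcp_syncookies = 1
```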


> Radek
> 
> 
> 
> On Wed, Feb 4, 2009 at 12:29 AM, Olivier B. <nginx.list at daevel.fr> wrote:
> 
> > Hello,
> >
> > have you tried to disable syncookies and/or adjust them ?
> > On the varnish documentation (
> > http://varnish.projects.linpro.no/wiki/Performance ) there is at least
> > this setup (in /etc/sysctl.conf) :
> >
> >   net.ipv4.tcp_syncookies = 0
> >   net.ipv4.tcp_max_orphans = 262144
> >   net.ipv4.tcp_max_syn_backlog = 262144
> >   net.ipv4.tcp_synack_retries = 2
> >   net.ipv4.tcp_syn_retries = 2
> >
> >
> > and what about netfilter connection tracking ?
> >
> > Olivier
> >
> > Radek Burkat wrote:
> >
> >> Thanks Igor, for eliminating one variable for me. I noticed that when I
> >> use a single worker and set its affinity to the same core as the ethernet
> >> interface interrupt, I decrease the frequency of the issue but do not
> >> eliminate it.  Not sure why.
> >> Are there any kernel/network debugging tools which reach a little deeper
> >> than tcpdump?
> >>
> >> Radek
> >>
> >> On Tue, Feb 3, 2009 at 10:11 PM, Igor Sysoev <is at rambler-co.ru <mailto:
> >> is at rambler-co.ru>> wrote:
> >>
> >>    On Tue, Feb 03, 2009 at 10:07:27PM -0800, Radek Burkat wrote:
> >>
> >>    > Have a machine running the latest devel version nginx-0.7.33
> >>    (tried 0.6.35
> >>    > with same results) for serving small (less than 10K) images and
> >>    am seeing on
> >>    > tcpdump that some SYN packets are not responded to right
> >>    away. The browser
> >>    > does retransmit these image requests every second and on the 2nd
> >>    or 3rd
> >>    > resent SYN, I finally start seeing an ACK, and the images load.
> >>    > It is very nondeterministic as to when it happens, and I can only
> >>    reproduce it
> >>    > some of the time.  When it does occur the outcome is a page with
> >>    some images
> >>    > loaded and others (whose SYN packets are not ACKed) are not
> >>    loaded.....a few
> >>    > seconds later they load.
> >>    > Typically the system has ~2000 active connections, most in keep
> >>    alive.  The
> >>    > load is around 100-200 req/sec.
> >>    >
> >>    > I have tried all sorts of settings and configurations suggested
> >>    in the
> >>    > mailing list, but I still don't have a solution for this issue.
> >>     from 1 to 4
> >>    > workers, changing the connection counts, different event
> >>    handlers, kernel
> >>    > buffers, etc.
> >>    > It just seems so anecdotal to change a bunch of settings
> >>    without being
> >>    > able to see what is happening internally.
> >>    > I'd like to be able to debug a little deeper to find out what is
> >>    happening
> >>    > to these packets.
> >>    >
> >>    > How would I go about debugging the cause of this?  Is it
> >>    the interface
> >>    > driver, kernel, or nginx?  What kind of tools and debugging
> >>    options can I
> >>    > try next?
> >>
> >>    nginx has no relation to the TCP handshake (SYN, SYN/ACK, ACK);
> >>    this is a kernel or driver issue.
> >>
> >>    > Thanks, Radek
> >>    >
> >>    >
> >>    > System Details
> >>    > model name : Intel(R) Xeon(R) CPU X3210 @ 2.13GHz
> >>    > Linux n06 2.6.18-92.1.22.el5 #1 SMP Tue Dec 16 12:03:43 EST 2008
> >>    i686 i686
> >>    > i386 GNU/Linux
> >>    > eth0
> >>    > Advertised auto-negotiation: Yes
> >>    > Speed: 1000Mb/s
> >>    > Duplex: Full
> >>    > Port: Twisted Pair
> >>    > PHYAD: 1
> >>    > Transceiver: internal
> >>    > Auto-negotiation: on
> >>    > Supports Wake-on: g
> >>    > Wake-on: d
> >>    > Current message level: 0x000000ff (255)
> >>    > Link detected: yes
> >>    > driver: tg3
> >>    > version: 3.86
> >>    > firmware-version: 5721-v3.61, ASFIPMI v6.21
> >>    > bus-info: 0000:05:00.0
> >>    >
> >>    > avg-cpu: %user %nice %system %iowait %steal %idle
> >>    > 0.10 0.00 0.20 2.43 0.00 97.27
> >>    > Device: rrqm/s wrqm/s r/s w/s rsec/s wsec/s avgrq-sz avgqu-sz
> >>    await svctm
> >>    > %util
> >>    > sda 0.00 0.00 27.40 0.00 443.20 0.00 16.18 0.10 3.50 3.32 9.10
> >>    > no overruns or errors on interface.
> >>
> >>    --
> >>    Igor Sysoev
> >>    http://sysoev.ru/en/
> >>
> >>
> >>
> >
> >

-- 
Igor Sysoev
http://sysoev.ru/en/
