nginx not responding to some SYN packets
Igor Sysoev
is at rambler-co.ru
Tue Feb 10 10:02:09 MSK 2009
On Mon, Feb 09, 2009 at 01:51:12PM -0800, Radek Burkat wrote:
> I have been looking at this issue to try to understand the interactions and
> this is what I have found out so far.
> I can reproduce the issue by decreasing the net.ipv4.tcp_max_syn_backlog
> and setting net.ipv4.tcp_syncookies = 0. It makes sense that when the load
> gets a little high the system does not process the accept fast enough and
> the backlog builds up. When the backlog fills up the ACK to the initial SYN
> is not issued.
> It would seem also that enabling syncookies uses some other mechanism as I
> do not see the dropped connections when I have net.ipv4.tcp_syncookies = 1
> and back log to some small value like net.ipv4.tcp_max_syn_backlog =10.
> From what I read so far is that syncookies goes into action when you hit a
> certain high water mark on your backlog buffer. Raising the backlog value
> and/or enabling syncookies seems like a good idea.
> You can look at how many connection are in the backlog with net stat.
> netstat -aln | grep -c SYN_RECV
> 146
> The problem here is that this gives you and instantaneous count of the
> backlog. As you can imagine this backlog is getting filled and cleared at a
> high rate. It would be nice to know when this actually gets full or even
> what the max has been over a period of time. Is there a stat like this in
> net statistics?
> This information seems like it is quite important in determining the
> performance of your server latency.
>
> In nginx there is the optional backlog variable in the listen configuration.
> I imagine that this maps directly to the system call listen. Though I see
> that the nginx backlog default is -1. How does that translate to the system
> parameter which requires an in above 0.
Yes, backlog may affect this. However, it seems that you run very old
nginx as it had been fixed in 2007:
Changes with nginx 0.6.7 15 Aug 2007
Changes with nginx 0.5.32 24 Sep 2007
*) Bugfix: now nginx uses default listen backlog value 511 on all
platforms except FreeBSD.
Thanks to Jiang Hong.
Now -1 is used on FreeBSD only and it means the maximum available backlog.
Besides, net.ipv4.tcp_max_syn_backlog = 10 is too small.
Also, when kernel has no enough backlog, it usually response with
RST packet, if this is not disabled.
> Radek
>
>
>
> On Wed, Feb 4, 2009 at 12:29 AM, Olivier B. <nginx.list at daevel.fr> wrote:
>
> > Hello,
> >
> > have you tried to disable syncookies and/or adjust them ?
> > On the varnish documentation (
> > http://varnish.projects.linpro.no/wiki/Performance ) there is at least
> > this setup (in /etc/sysctl.conf) :
> >
> > net.ipv4.tcp_syncookies = 0
> > net.ipv4.tcp_max_orphans = 262144
> > net.ipv4.tcp_max_syn_backlog = 262144
> > net.ipv4.tcp_synack_retries = 2
> > net.ipv4.tcp_syn_retries = 2
> >
> >
> > and what about netfilter connection tracking ?
> >
> > Olivier
> >
> > Radek Burkat a ?crit :
> >
> >> Thanks Igor, for eliminating one variable for me. I noticed that when I
> >> use a single worker and set the affinity to the same core as the ethernet
> >> interface interrupt I decrease the frequency of the issue but do not
> >> eliminate it. Not sure
> >> Are there any kernel/network debugging tools which reach a little deeper
> >> than tcpdump?
> >>
> >> Radek
> >>
> >> On Tue, Feb 3, 2009 at 10:11 PM, Igor Sysoev <is at rambler-co.ru <mailto:
> >> is at rambler-co.ru>> wrote:
> >>
> >> On Tue, Feb 03, 2009 at 10:07:27PM -0800, Radek Burkat wrote:
> >>
> >> > Have a machine running the latest devel version nginx-0.7.33
> >> (tried 0.6.35
> >> > with same results) for serving small (less than 10K images) and
> >> am seeing on
> >> > tcpdump that some SYN packets are not responded to right
> >> away.The browser
> >> > does retransmit these image requests every second and on the 2nd
> >> or 3rd
> >> > resent SYN, I finally start seeing and ACK, and the images load.
> >> > It is very indeterministic as to when it happens and can only
> >> reproduce it
> >> > some of the time. When it does occur the outcome is a page with
> >> some images
> >> > loaded and others (whose SYN packets are not ACKs) are not
> >> loaded.....a few
> >> > seconds later they load.
> >> > Typically the system has ~2000 active connections, most in keep
> >> alive. The
> >> > load is around 100-200 req/sec.
> >> >
> >> > I have tries all sorts of settings and configurations suggested
> >> in the
> >> > maillist but I still dont have the solution for this issue.
> >> from 1 to 4
> >> > workers, changing the connection counts, different even
> >> handlers, kernel
> >> > buffers, etc.
> >> > It just seems so anecdotal to just change a bunch of settings
> >> without being
> >> > able to what is happening internally.
> >> > I'd like to be able to debug a little deeper to find out what is
> >> happening
> >> > to these packets.
> >> >
> >> > How would I go about debugging what is the cause of it. Is it
> >> the interface
> >> > driver, kernel, or nginx? What kind of tools and debugging
> >> options can I
> >> > try next?
> >>
> >> nginx has not any relation to TCP handshake (SYN, SYN/ACK, ACK),
> >> this is kernel or driver issue.
> >>
> >> > Thanks, Radek
> >> >
> >> >
> >> > System Details
> >> > model name : Intel(R) Xeon(R) CPU X3210 @ 2.13GHz
> >> > Linux n06 2.6.18-92.1.22.el5 #1 SMP Tue Dec 16 12:03:43 EST 2008
> >> i686 i686
> >> > i386 GNU/Linux
> >> > eth0
> >> > Advertised auto-negotiation: Yes
> >> > Speed: 1000Mb/s
> >> > Duplex: Full
> >> > Port: Twisted Pair
> >> > PHYAD: 1
> >> > Transceiver: internal
> >> > Auto-negotiation: on
> >> > Supports Wake-on: g
> >> > Wake-on: d
> >> > Current message level: 0x000000ff (255)
> >> > Link detected: yes
> >> > driver: tg3
> >> > version: 3.86
> >> > firmware-version: 5721-v3.61, ASFIPMI v6.21
> >> > bus-info: 0000:05:00.0
> >> >
> >> > avg-cpu: %user %nice %system %iowait %steal %idle
> >> > 0.10 0.00 0.20 2.43 0.00 97.27
> >> > Device: rrqm/s wrqm/s r/s w/s rsec/s wsec/s avgrq-sz avgqu-sz
> >> await svctm
> >> > %util
> >> > sda 0.00 0.00 27.40 0.00 443.20 0.00 16.18 0.10 3.50 3.32 9.10
> >> > no overruns or errors on interface.
> >>
> >> --
> >> Igor Sysoev
> >> http://sysoev.ru/en/
> >>
> >>
> >>
> >
> >
--
Igor Sysoev
http://sysoev.ru/en/
More information about the nginx
mailing list