nginx not responding to some SYN packets

Radek Burkat radek at pinkbike.com
Tue Feb 10 00:51:12 MSK 2009


I have been looking at this issue to try to understand the interactions and
this is what I have found out so far.
I can reproduce the issue by decreasing net.ipv4.tcp_max_syn_backlog and
setting net.ipv4.tcp_syncookies = 0.  It makes sense that when the load gets
a little high the system does not process the accepts fast enough and the
backlog builds up.  When the backlog fills up, the SYN/ACK reply to the
initial SYN is not sent.
It also seems that enabling syncookies uses some other mechanism, as I do
not see the dropped connections when I have net.ipv4.tcp_syncookies = 1 and
the backlog set to some small value like net.ipv4.tcp_max_syn_backlog = 10.
From what I have read so far, syncookies go into action when you hit a
certain high-water mark on your backlog buffer.  Raising the backlog value
and/or enabling syncookies seems like a good idea.
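
Before changing either setting it helps to record what the box is running
with right now.  A minimal sketch in Python, assuming a Linux host where
sysctls are exposed as files under /proc/sys; the helper names (sysctl_path,
read_sysctl, report) are made up for illustration and are not part of any
tool mentioned in this thread:

```python
import os

# The two settings discussed above.
SETTINGS = (
    "net.ipv4.tcp_syncookies",
    "net.ipv4.tcp_max_syn_backlog",
)


def sysctl_path(name, root="/proc/sys"):
    """Map a dotted sysctl name to its file under root (hypothetical helper)."""
    return os.path.join(root, *name.split("."))


def read_sysctl(name, root="/proc/sys"):
    """Return the current value of one sysctl as a stripped string."""
    with open(sysctl_path(name, root)) as f:
        return f.read().strip()


def report(names=SETTINGS, root="/proc/sys"):
    """Return 'name = value' lines, the same form /etc/sysctl.conf uses."""
    return ["%s = %s" % (name, read_sysctl(name, root)) for name in names]
```

Calling print("\n".join(report())) on the server gives lines that can be
pasted into a mail or compared before and after a change.  The root argument
exists only so the functions can be pointed at a test directory.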
You can look at how many connections are in the backlog with netstat:
netstat -aln | grep -c SYN_RECV
146
The problem here is that this gives you an instantaneous count of the
backlog.  As you can imagine, this backlog is getting filled and cleared at a
high rate.  It would be nice to know when it actually gets full, or even
what the maximum has been over a period of time.  Is there a stat like this
in the network statistics?
This information seems quite important in determining the latency of your
server.
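
Two things can be sketched for the question above, both hedged.  First, the
SYN_RECV count can be sampled repeatedly and the peak kept; this assumes the
Linux /proc/net/tcp layout, where the fourth column is the socket state in
hex and 03 is SYN_RECV.  Second, /proc/net/netstat carries cumulative TcpExt
counters such as ListenOverflows and ListenDrops; whether those counters
cover the SYN backlog case on this particular kernel is not verified here.
All function names are invented for the example.

```python
import time


def count_syn_recv(proc_net_tcp_text):
    """Count sockets in state 03 (SYN_RECV) in /proc/net/tcp text."""
    count = 0
    for line in proc_net_tcp_text.splitlines()[1:]:  # skip the header row
        fields = line.split()
        if len(fields) > 3 and fields[3] == "03":
            count += 1
    return count


def peak_syn_recv(read_text, samples, interval=0.1):
    """Sample the SYN_RECV count `samples` times and return the largest.

    read_text is any callable returning the current /proc/net/tcp text.
    Spikes shorter than `interval` can still be missed.
    """
    peak = 0
    for _ in range(samples):
        peak = max(peak, count_syn_recv(read_text()))
        time.sleep(interval)
    return peak


def parse_proc_netstat(text):
    """Parse /proc/net/netstat text into {'TcpExt': {name: int, ...}, ...}.

    The file is laid out as pairs of lines: a row of counter names,
    then a row of values, both starting with the same 'Prefix:' token.
    """
    rows = [line.split() for line in text.splitlines() if line.strip()]
    tables = {}
    for names, values in zip(rows[0::2], rows[1::2]):
        prefix = names[0].rstrip(":")
        tables[prefix] = dict(zip(names[1:], (int(v) for v in values[1:])))
    return tables
```

For example, peak_syn_recv(lambda: open("/proc/net/tcp").read(), samples=600)
watches IPv4 sockets for about a minute, and
parse_proc_netstat(open("/proc/net/netstat").read())["TcpExt"] gives the
counters, which only make sense as a difference between two readings.
IPv6 sockets live in /proc/net/tcp6 and are not counted by this sketch.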

In nginx there is the optional backlog parameter in the listen directive.
I imagine that this maps directly to the listen system call.  However, I see
that the nginx backlog default is -1.  How does that translate to the system
call parameter, which requires an int above 0?
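
A rough model of what may happen to the value, written in Python.  It rests
on two assumptions: that Linux silently truncates a listen() backlog larger
than net.core.somaxconn (as the listen(2) man page describes), and that the
kernel makes that comparison on the value as an unsigned 32-bit int, so a
negative backlog looks huge and ends up at the maximum.  The function name
and the somaxconn value of 128 are illustrative only; this is not nginx
code.

```python
def effective_backlog(requested, somaxconn):
    """Model the backlog the kernel would use for listen(fd, requested).

    Assumption: the value is compared as an unsigned 32-bit integer and
    capped at somaxconn, so -1 behaves like "as large as allowed".
    """
    as_unsigned = requested & 0xFFFFFFFF  # reinterpret a C int as unsigned
    return min(as_unsigned, somaxconn)


# Illustrative values only; 128 stands in for net.core.somaxconn.
for requested in (-1, 10, 511):
    print(requested, "->", effective_backlog(requested, 128))
```

Under these assumptions a backlog of -1 and a backlog of 511 both end up at
the somaxconn limit, while a small value such as 10 is used as given.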


Radek



On Wed, Feb 4, 2009 at 12:29 AM, Olivier B. <nginx.list at daevel.fr> wrote:

> Hello,
>
> have you tried disabling syncookies and/or adjusting them?
> On the varnish documentation (
> http://varnish.projects.linpro.no/wiki/Performance ) there is at least
> this setup (in /etc/sysctl.conf) :
>
>   net.ipv4.tcp_syncookies = 0
>   net.ipv4.tcp_max_orphans = 262144
>   net.ipv4.tcp_max_syn_backlog = 262144
>   net.ipv4.tcp_synack_retries = 2
>   net.ipv4.tcp_syn_retries = 2
>
>
> and what about netfilter connection tracking?
>
> Olivier
>
> Radek Burkat wrote:
>
>> Thanks Igor, for eliminating one variable for me. I noticed that when I
>> use a single worker and set the affinity to the same core as the ethernet
>> interface interrupt I decrease the frequency of the issue but do not
>> eliminate it.  Not sure
>> Are there any kernel/network debugging tools which reach a little deeper
>> than tcpdump?
>>
>> Radek
>>
>> On Tue, Feb 3, 2009 at 10:11 PM, Igor Sysoev <is at rambler-co.ru> wrote:
>>
>>    On Tue, Feb 03, 2009 at 10:07:27PM -0800, Radek Burkat wrote:
>>
>>    > I have a machine running the latest devel version, nginx-0.7.33
>>    > (tried 0.6.35 with the same results), serving small images (less
>>    > than 10K), and I am seeing in tcpdump that some SYN packets are not
>>    > responded to right away.  The browser retransmits these image
>>    > requests every second, and on the 2nd or 3rd resent SYN I finally
>>    > start seeing an ACK, and the images load.
>>    > It is very nondeterministic as to when it happens, and I can only
>>    > reproduce it some of the time.  When it does occur the outcome is a
>>    > page with some images loaded and others (whose SYN packets are not
>>    > ACKed) not loaded; a few seconds later they load.
>>    > Typically the system has ~2000 active connections, most in
>>    > keepalive.  The load is around 100-200 req/sec.
>>    >
>>    > I have tried all sorts of settings and configurations suggested in
>>    > the mailing list but I still don't have a solution for this issue:
>>    > from 1 to 4 workers, changing the connection counts, different event
>>    > handlers, kernel buffers, etc.
>>    > It just seems so anecdotal to change a bunch of settings without
>>    > being able to see what is happening internally.
>>    > I'd like to be able to debug a little deeper to find out what is
>>    > happening to these packets.
>>    >
>>    > How would I go about debugging the cause of it?  Is it the interface
>>    > driver, the kernel, or nginx?  What kind of tools and debugging
>>    > options can I try next?
>>
>>    nginx has no relation to the TCP handshake (SYN, SYN/ACK, ACK);
>>    this is a kernel or driver issue.
>>
>>    > Thanks, Radek
>>    >
>>    >
>>    > System Details
>>    > model name : Intel(R) Xeon(R) CPU X3210 @ 2.13GHz
>>    > Linux n06 2.6.18-92.1.22.el5 #1 SMP Tue Dec 16 12:03:43 EST 2008 i686 i686 i386 GNU/Linux
>>    > eth0
>>    > Advertised auto-negotiation: Yes
>>    > Speed: 1000Mb/s
>>    > Duplex: Full
>>    > Port: Twisted Pair
>>    > PHYAD: 1
>>    > Transceiver: internal
>>    > Auto-negotiation: on
>>    > Supports Wake-on: g
>>    > Wake-on: d
>>    > Current message level: 0x000000ff (255)
>>    > Link detected: yes
>>    > driver: tg3
>>    > version: 3.86
>>    > firmware-version: 5721-v3.61, ASFIPMI v6.21
>>    > bus-info: 0000:05:00.0
>>    >
>>    > avg-cpu:  %user  %nice  %system  %iowait  %steal  %idle
>>    >            0.10   0.00     0.20     2.43    0.00  97.27
>>    > Device:  rrqm/s  wrqm/s    r/s   w/s  rsec/s  wsec/s  avgrq-sz  avgqu-sz  await  svctm  %util
>>    > sda        0.00    0.00  27.40  0.00  443.20    0.00     16.18      0.10   3.50   3.32   9.10
>>    > no overruns or errors on interface.
>>
>>    --
>>    Igor Sysoev
>>    http://sysoev.ru/en/
>>
>>
>>
>
>