Weird issue between backend and upstream (0 bytes)

Guzmán Brasó guzman.braso at gmail.com
Thu Aug 9 13:24:51 UTC 2012


Hi...

Just made a dump from the backend's perspective and analyzed it with
Wireshark to see what was in those bytes; I was pretty sure those bytes
were some error from the app with the key to fixing it... but no, those
15776 bytes are always just the first 15776 bytes of each response, and
every response is different.

According to tcpdump everything flows OK; then, after an ordinary ACK from
the upstream while the backend is sending data, the upstream immediately
sends the FIN flag, and immediately after that a group of RSTs arrives.
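
For reference, this is the kind of tcpdump filter that isolates those
segments (a sketch: eth0 is an assumption, the host and port are the ones
from the debug log further down; it needs root):

```shell
# Show only FIN/RST segments between the balancer and backend 1.
# eth0 is an assumption; host/port are taken from the debug log below.
# -n skips DNS lookups, -S prints absolute sequence numbers so the FIN
# and the RSTs that follow it are easy to line up.
tcpdump -n -S -i eth0 \
  'host 176.31.64.205 and port 8059 and tcp[tcpflags] & (tcp-fin|tcp-rst) != 0'
```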

So, clearly there's no backend issue here; whatever happens is on the
upstream side, and it now looks more like an OS problem than an nginx
fault.

Does anyone have any idea why the OS might be closing and resetting
nginx's connections?

OS: Debian Lenny Virtualized inside XenServer (Citrix free version)
Virtual cores: 2 ( Load: 0.10)
RAM: 2GB (Currently: 44mb free / 1.3GB os cached, 310MB buffers)

The server currently shows no load, and the problem occurs on 20%-30% of
the traffic to both backends, but only for one specific URL. This main
nginx is doing load balancing for more domains, some of them with a lot of
traffic, and there's not a single status-200, 0-byte request in the logs
of the other domains.

Some network stats:
netstat -anp|awk '$1~/^tcp/ { print $6 }'|sort -n|uniq -c
      2 CLOSING
    164 ESTABLISHED
     21 FIN_WAIT1
    150 FIN_WAIT2
      1 LAST_ACK
      9 LISTEN
      4 SYN_RECV
      1 SYN_SENT
    293 TIME_WAIT
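
In case anyone wants to try the counting pipeline without live sockets,
here it is fed with a few invented netstat-style lines (sample data is
made up, purely for illustration):

```shell
# Feed the state-counting pipeline some invented netstat-style lines so
# it runs anywhere; $6 is the connection-state column.
printf '%s\n' \
  'tcp 0 0 10.0.0.1:80 10.0.0.2:5001 ESTABLISHED' \
  'tcp 0 0 10.0.0.1:80 10.0.0.3:5002 TIME_WAIT' \
  'tcp 0 0 10.0.0.1:80 10.0.0.4:5003 TIME_WAIT' \
  | awk '$1~/^tcp/ { print $6 }' | sort | uniq -c
# prints one line per state with its count (1 ESTABLISHED, 2 TIME_WAIT)
```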

No custom configuration inside /etc/sysctl.conf...
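
For completeness, these are the kernel TCP knobs worth checking first (a
sketch reading /proc directly; the parameter list is only a guess at what
could be relevant to early FIN/RST behaviour):

```shell
# Print a few TCP tunables that could plausibly be involved; the guard
# skips any entry that doesn't exist on a given kernel.
for f in tcp_fin_timeout tcp_tw_recycle tcp_tw_reuse tcp_max_orphans; do
    [ -r "/proc/sys/net/ipv4/$f" ] && echo "$f = $(cat /proc/sys/net/ipv4/$f)" || :
done
```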

I'll gladly submit more data from the faulty nginx server if anyone needs
it, just ask.

Thanks for any hint, I'm really out of ideas.

On Fri, Aug 3, 2012 at 2:48 PM, Guzmán Brasó <guzman.braso at gmail.com> wrote:

> Hi list!
>
> I have a weird issue that I've been banging my head against without
> being able to understand what's going on.
>
> Setup is as follow:
> - 1 nginx 0.7.67 doing load balance to two backends.
> - backend 1 with Nginx 1.2.2 and php-fpm 5.3
> - backend 2 with Nginx 0.7.67 and php-fpm 5.3
>
> Some, and only some, requests are logged in the upstream with status 200
> and 0 bytes returned.
> The same request in the backend log shows status 200 and ALWAYS the same
> number of bytes, which differs between the backends.
> When this happens on a request that went to backend 1: the upstream logs
> 200 0 bytes, the backend logs 200 15776 bytes.
> When this happens on a request that went to backend 2: the upstream logs
> 200 0 bytes, the backend logs 200 15670 bytes.
>
> I've tried without luck to reproduce the problem, so I decided to start
> debugging all requests to this site to try to understand why nginx is
> returning empty responses.
>
> This is what I see in upstream when error happens:
> (...)
> 2012/08/03 13:25:32 [debug] 1546#0: *823 http proxy header: "Accept: */*"
> 2012/08/03 13:25:32 [debug] 1546#0: *823 http proxy header: "User-Agent:
> AdsBot-Google (+http://www.google.com/adsbot.html)"
> 2012/08/03 13:25:32 [debug] 1546#0: *823 http proxy header:
> "Accept-Encoding: gzip,deflate"
> 2012/08/03 13:25:32 [debug] 1546#0: *823 http proxy header:
> 2012/08/03 13:25:32 [debug] 1546#0: *823 http cleanup add: 0000000001A3DF10
> 2012/08/03 13:25:32 [debug] 1546#0: *823 get rr peer, try: 2
> 2012/08/03 13:25:32 [debug] 1546#0: *823 get rr peer, current: 0 8
> 2012/08/03 13:25:32 [debug] 1546#0: *823 socket 149
> 2012/08/03 13:25:32 [debug] 1546#0: *823 epoll add connection: fd:149
> ev:80000005
> 2012/08/03 13:25:32 [debug] 1546#0: *823 connect to 176.31.64.205:8059,
> fd:149 #824
> 2012/08/03 13:25:32 [debug] 1546#0: *823 http upstream connect: -2
> 2012/08/03 13:25:32 [debug] 1546#0: *823 event timer add: 149:
> 30000:1344011162971
> 2012/08/03 13:25:32 [debug] 1546#0: *823 http run request:
> "/s/miracle+noodle?"
> 2012/08/03 13:25:32 [debug] 1546#0: *823 http upstream check client, write
> event:1, "/s/miracle+noodle"
> 2012/08/03 13:25:32 [debug] 1546#0: *823 http upstream recv(): -1 (11:
> Resource temporarily unavailable)
> 2012/08/03 13:25:32 [debug] 1546#0: *823 http upstream request:
> "/s/miracle+noodle?"
>  2012/08/03 13:25:32 [debug] 1546#0: *823 http upstream send request
> handler
> 2012/08/03 13:25:32 [debug] 1546#0: *823 http upstream send request
> 2012/08/03 13:25:32 [debug] 1546#0: *823 chain writer buf fl:1 s:358
> 2012/08/03 13:25:32 [debug] 1546#0: *823 chain writer in: 0000000001A3DF48
> 2012/08/03 13:25:32 [debug] 1546#0: *823 writev: 358
> 2012/08/03 13:25:32 [debug] 1546#0: *823 chain writer out: 0000000000000000
> 2012/08/03 13:25:32 [debug] 1546#0: *823 event timer del: 149:
> 1344011162971
> 2012/08/03 13:25:32 [debug] 1546#0: *823 event timer add: 149:
> 120000:1344011252972
> 2012/08/03 13:25:33 [debug] 1546#0: *823 http run request:
> "/s/miracle+noodle?"
> 2012/08/03 13:25:33 [debug] 1546#0: *823 http upstream check client, write
> event:0, "/s/miracle+noodle"
> 2012/08/03 13:25:33 [debug] 1546#0: *823 http upstream recv(): 0 (11:
> Resource temporarily unavailable)
> 2012/08/03 13:25:33 [debug] 1546#0: *823 http run request:
> "/s/miracle+noodle?"
> 2012/08/03 13:25:33 [debug] 1546#0: *823 http upstream check client, write
> event:1, "/s/miracle+noodle"
> 2012/08/03 13:25:45 [debug] 1546#0: *823 http upstream request:
> "/s/miracle+noodle?"
> 2012/08/03 13:25:45 [debug] 1546#0: *823 http upstream process header
> 2012/08/03 13:25:45 [debug] 1546#0: *823 malloc: 0000000001B8DC30:16384
> 2012/08/03 13:25:45 [debug] 1546#0: *823 recv: fd:149 4344 of 16296
> 2012/08/03 13:25:45 [debug] 1546#0: *823 http proxy status 200 "200 OK"
> 2012/08/03 13:25:45 [debug] 1546#0: *823 http proxy header: "Server: nginx"
> 2012/08/03 13:25:45 [debug] 1546#0: *823 http proxy header: "Date: Fri, 03
> Aug 2012 16:24:26 GMT"
> 2012/08/03 13:25:45 [debug] 1546#0: *823 http proxy header: "Content-Type:
> text/html"
> 2012/08/03 13:25:45 [debug] 1546#0: *823 http proxy header: "Connection:
> close"
> 2012/08/03 13:25:45 [debug] 1546#0: *823 http proxy header: "X-Powered-By:
> PHP/5.3.15-1~dotdeb.0"
> 2012/08/03 13:25:45 [debug] 1546#0: *823 http proxy header done
> 2012/08/03 13:25:45 [debug] 1546#0: *823 HTTP/1.1 200 OKM
> 2012/08/03 13:25:45 [debug] 1546#0: *823 finalize http upstream request: -1
> 2012/08/03 13:25:45 [debug] 1546#0: *823 finalize http proxy request
> 2012/08/03 13:25:45 [debug] 1546#0: *823 free rr peer 2 0
> 2012/08/03 13:25:45 [debug] 1546#0: *823 close http upstream connection:
> 149
> 2012/08/03 13:25:45 [debug] 1546#0: *823 event timer del: 149:
> 1344011252972
> 2012/08/03 13:25:45 [debug] 1546#0: *823 http upstream cache fd: 0
> 2012/08/03 13:25:45 [debug] 1546#0: *823 http file cache free
> 2012/08/03 13:25:45 [debug] 1546#0: *823 http finalize request: -1,
> "/s/miracle+noodle?" 1
> 2012/08/03 13:25:45 [debug] 1546#0: *823 http close request
> 2012/08/03 13:25:45 [debug] 1546#0: *823 http log handler
> 2012/08/03 13:25:45 [debug] 1546#0: *823 run cleanup: 0000000001A3DC40
> 2012/08/03 13:25:45 [debug] 1546#0: *823 run cleanup: 0000000001850E58
> 2012/08/03 13:25:45 [debug] 1546#0: *823 free: 0000000001B8DC30
> 2012/08/03 13:25:45 [debug] 1546#0: *823 free: 00000000018501C0, unused: 1
> 2012/08/03 13:25:45 [debug] 1546#0: *823 free: 0000000001A3D890, unused:
> 757
> 2012/08/03 13:25:45 [debug] 1546#0: *823 close http connection: 148
> 2012/08/03 13:25:45 [debug] 1546#0: *823 free: 00000000019B7C20
> 2012/08/03 13:25:45 [debug] 1546#0: *823 free: 0000000001B2BEF0
> 2012/08/03 13:25:45 [debug] 1546#0: *823 free: 0000000001B3DC00, unused: 8
> 2012/08/03 13:25:45 [debug] 1546#0: *823 free: 00000000019F55B0, unused:
> 112
>
> The above debug log on the upstream seems to say that the backend closed
> the connection after the headers, am I right?
>
> However, the backend debug log of another affected request, which always
> logs the same number of bytes returned (which is pretty weird), shows a
> different story...
> After many lines of...
> (...)
> 2012/08/03 13:56:17 [debug] 1519#0: *221 pipe recv chain: -2
> 2012/08/03 13:56:17 [debug] 1519#0: *221 pipe buf busy s:1 t:1 f:0
> 000000000109F1E0, pos 00000000010A2FD0, size: 49752 file: 0, size: 0
> 2012/08/03 13:56:17 [debug] 1519#0: *221 pipe buf busy s:1 t:1 f:0
> 000000000109F1E0, pos 00000000010AF230, size: 65456 file: 0, size: 0
> 2012/08/03 13:56:17 [debug] 1519#0: *221 pipe buf free s:0 t:1 f:0
> 00000000010BF1F0, pos 00000000010BF1F0, size: 80 file: 0, size: 0
> 2012/08/03 13:56:17 [debug] 1519#0: *221 pipe length: -1
> 2012/08/03 13:56:17 [debug] 1519#0: *221 pipe write downstream: 0
> 2012/08/03 13:56:17 [debug] 1519#0: *221 pipe read upstream: 0
> 2012/08/03 13:56:17 [debug] 1519#0: *221 pipe buf busy s:1 t:1 f:0
> 000000000109F1E0, pos 00000000010A2FD0, size: 49752 file: 0, size: 0
> 2012/08/03 13:56:17 [debug] 1519#0: *221 pipe buf busy s:1 t:1 f:0
> 000000000109F1E0, pos 00000000010AF230, size: 65456 file: 0, size: 0
> 2012/08/03 13:56:17 [debug] 1519#0: *221 pipe buf free s:0 t:1 f:0
> 00000000010BF1F0, pos 00000000010BF1F0, size: 80 file: 0, size: 0
> 2012/08/03 13:56:17 [debug] 1519#0: *221 pipe length: -1
> (...)
>
> Suddenly this comes:
> 2012/08/03 13:56:17 [debug] 1519#0: *221 pipe recv chain: -2
> 2012/08/03 13:56:17 [debug] 1519#0: *221 pipe buf busy s:1 t:1 f:0
> 000000000109F1E0, pos 00000000010A2FD0, size: 49752 file: 0, size: 0
> 2012/08/03 13:56:17 [debug] 1519#0: *221 pipe buf busy s:1 t:1 f:0
> 000000000109F1E0, pos 00000000010AF230, size: 65456 file: 0, size: 0
> 2012/08/03 13:56:17 [debug] 1519#0: *221 pipe buf free s:0 t:1 f:0
> 00000000010BF1F0, pos 00000000010BF1F0, size: 49232 file: 0, size: 0
> 2012/08/03 13:56:17 [debug] 1519#0: *221 pipe length: -1
> 2012/08/03 13:56:17 [debug] 1519#0: *221 pipe write downstream: 0
> 2012/08/03 13:56:17 [debug] 1519#0: *221 pipe read upstream: 0
> 2012/08/03 13:56:17 [debug] 1519#0: *221 pipe buf busy s:1 t:1 f:0
> 000000000109F1E0, pos 00000000010A2FD0, size: 49752 file: 0, size: 0
> 2012/08/03 13:56:17 [debug] 1519#0: *221 pipe buf busy s:1 t:1 f:0
> 000000000109F1E0, pos 00000000010AF230, size: 65456 file: 0, size: 0
> 2012/08/03 13:56:17 [debug] 1519#0: *221 pipe buf free s:0 t:1 f:0
> 00000000010BF1F0, pos 00000000010BF1F0, size: 49232 file: 0, size: 0
> 2012/08/03 13:56:17 [debug] 1519#0: *221 pipe length: -1
> 2012/08/03 13:56:17 [debug] 1519#0: *221 event timer: 13, old:
> 1344013157360, new: 1344013157362
> 2012/08/03 13:56:17 [debug] 1519#0: *221 event timer: 12, old:
> 1344013037361, new: 1344013037362
> 2012/08/03 13:56:17 [debug] 1519#0: *221 post event 000000000106B360
> 2012/08/03 13:56:17 [debug] 1519#0: *221 delete posted event
> 000000000106B360
> 2012/08/03 13:56:17 [debug] 1519#0: *221 http run request:
> "/s/new+balance+1500?"
> 2012/08/03 13:56:17 [debug] 1519#0: *221 http upstream check client, write
> event:0, "/s/new+balance+1500"
> 2012/08/03 13:56:17 [debug] 1519#0: *221 http upstream recv(): 0 (11:
> Resource temporarily unavailable)
> 2012/08/03 13:56:17 [debug] 1519#0: *221 finalize http upstream request:
> 499
> 2012/08/03 13:56:17 [debug] 1519#0: *221 finalize http fastcgi request
> 2012/08/03 13:56:17 [debug] 1519#0: *221 free rr peer 1 0
> 2012/08/03 13:56:17 [debug] 1519#0: *221 close http upstream connection: 13
> 2012/08/03 13:56:17 [debug] 1519#0: *221 free: 0000000000FF7CD0, unused: 48
> 2012/08/03 13:56:17 [debug] 1519#0: *221 event timer del: 13: 1344013157360
> 2012/08/03 13:56:17 [debug] 1519#0: *221 reusable connection: 0
> 2012/08/03 13:56:17 [debug] 1519#0: *221 http upstream temp fd: -1
> 2012/08/03 13:56:17 [debug] 1519#0: *221 http output filter
> "/s/new+balance+1500?"
> 2012/08/03 13:56:17 [debug] 1519#0: *221 http copy filter:
> "/s/new+balance+1500?"
> 2012/08/03 13:56:17 [debug] 1519#0: *221 image filter
> 2012/08/03 13:56:17 [debug] 1519#0: *221 xslt filter body
> 2012/08/03 13:56:17 [debug] 1519#0: *221 http postpone filter
> "/s/new+balance+1500?" 0000000000FEDA18
> 2012/08/03 13:56:17 [debug] 1519#0: *221 http copy filter: -1
> "/s/new+balance+1500?"
> 2012/08/03 13:56:17 [debug] 1519#0: *221 http finalize request: -1,
> "/s/new+balance+1500?" a:1, c:1
>
> So this means the backend saw the upstream close the connection before
> it was allowed to return all the data:
> 2012/08/03 13:56:17 [debug] 1519#0: *221 finalize http upstream request:
> 499
>
> If that's the case, why does it save this to the logfile:
> [03/Aug/2012:13:56:17 -0300] "GET /s/new+balance+1500 HTTP/1.0" 200 15776
> "-" "AdsBot-Google (+http://www.google.com/adsbot.html)" - Cache: - 200
> 24.535
>
> I think the problem may be in the upstream, as this weird behavior
> happens with both backends and they use different nginx versions, but I'm
> really out of answers right now with this issue.
>
> Any idea? Hint? Today, out of 8K requests, 2779 returned 0 bytes on the
> upstream and 15776 bytes on the backend...
>
> Thank you!!
>
> Guzmán
>
>
>
> --
> Guzmán Brasó Núñez
> Senior Perl Developer / Sysadmin
> Web: http://guzman.braso.info
> Mobile: +598 98 674020
>




