upstream timeouts I can not explain
ruz at sports.ru
Mon Jan 9 18:56:54 UTC 2017
On Mon, Jan 9, 2017 at 6:14 PM, Maxim Dounin <mdounin at mdounin.ru> wrote:
> On Sun, Jan 08, 2017 at 05:04:47PM +0300, Руслан Закиров wrote:
> > Hello,
> > nginx 1.10.2  running on FreeBSD 10.3-RELEASE-p7
> > We have quite a lot of "upstream timed out" errors. For example:
> > 2017/01/08 16:20:54 [error] 82154#0: *494223064 upstream timed out (60:
> > Operation timed out) while connecting to upstream, client: 192.168.1.43,
> > server: q0.sports.ru, request: "POST /pub?topic=docs HTTP/1.1",
> upstream: "
> > http://192.168.1.206:4151/pub?topic=docs", host: "q0.sports.ru"
> > Here is tcpdump:
> > 16:20:53.218916 IP myself-fe.28957 > q0-2.4151: Flags [S], seq
> > win 65535, options [mss 1460,nop,wscale 6,sackOK,TS val 787079162 ecr 0],
> > length 0
> > 16:20:53.219721 IP q0-2.4151 > myself-fe.28957: Flags [S.], seq
> > ack 2479999412, win 28960, options [mss 1460,sackOK,TS val 352522447 ecr
> > 787079162,nop,wscale 7], length 0
> > 16:20:53.219742 IP myself-fe.28957 > q0-2.4151: Flags [.], ack 1, win
> > options [nop,nop,TS val 787079163 ecr 352522447], length 0
> > 16:20:54.572712 IP myself-fe.28957 > q0-2.4151: Flags [F.], seq 1, ack 1,
> > win 1040, options [nop,nop,TS val 787080516 ecr 352522447], length 0
> > 16:20:54.573173 IP q0-2.4151 > myself-fe.28957: Flags [F.], seq 1, ack 2,
> > win 227, options [nop,nop,TS val 352522786 ecr 787080516], length 0
> > 16:20:54.573195 IP myself-fe.28957 > q0-2.4151: Flags [.], ack 2, win
> > options [nop,nop,TS val 787080516 ecr 352522786], length 0
> > What I don't see here is a packet with HTTP request.
> The "upstream timeout ... while connecting to upstream" suggests
> that nginx wasn't able to see the connect event.
Ok, spent whole day learning and investigating.
> Any idea where I should look next?
> Some things to consider:
> - Make sure you are looking at tcpdump on the nginx host, and
> there are no firewalls on the host to interfere with.
These were tcpdumps from nginx host. I have dump from other end and they
are symmetrical. We have proxy_connect_timeout at 300ms at the top level of
the config. When we first started to investigate it we increased timeout to
this location. An hour ago I increased it to 5 seconds and finally couldn't
the problem with a simple "bomber" script.
>From dumps you can see that connection was established within 10ms. What
can stop nginx from receiving the event for more than a second?
This happens on all served domains as pretty much everywhere connect
timeout is 300ms. If I tail -F error log, count this error occurrences
grouped by second then I see 1-3 seconds spikes: silence or <5 errors for
20-40 seconds then ~200 errors in a few seconds. Is there anything that may
block events processing nginx for quite a while?
- Try collecting nginx debug logs, they should have enough
> information to see if the event was reported by the kernel or
As you can see from above it is reported by the kernel, but with quite big
delay. Not sure if delay is in kernel or nginx, no delay on the network.
> Is it some known issue?
> Maxim Dounin
> nginx mailing list
> nginx at nginx.org
Руководитель отдела разработки веб-сервисов
+7(916) 597-92-69, ruz @ <http://www.sports.ru/>
-------------- next part --------------
An HTML attachment was scrubbed...
More information about the nginx