Strange $upstream_response_time latency spikes with reverse proxy
mdounin at mdounin.ru
Tue Mar 19 14:19:29 UTC 2013
On Mon, Mar 18, 2013 at 02:19:26PM -0700, Jay Oster wrote:
> On Sun, Mar 17, 2013 at 4:42 AM, Maxim Dounin <mdounin at mdounin.ru> wrote:
> > On "these hosts"? Note that listen queue aka backlog size is
> > configured in _applications_ which call listen(). At a host level
> > you may only configure somaxconn, which is maximum allowed listen
> > queue size (but an application may still use anything lower, even
> > just 1).
> "These hosts" means we have a lot of servers in production right now, and
> they all exhibit the same issue. It hasn't been a showstopper, but it's
> been occurring for as long as anyone can remember. The total number of
> upstream servers on a typical day is 6 machines (each running 3 service
> processes), and hosts running nginx account for another 4 machines. All of
> these are Ubuntu 12.04 64-bit VMs running on AWS EC2 m3.xlarge instance
> I was under the impression that /proc/sys/net/ipv4/tcp_max_syn_backlog was
> for configuring the maximum queue size on the host. It's set to 1024, here,
> and increasing the number doesn't change the frequency of the missed
> /proc/sys/net/core/somaxconn is set to 500,000
As far as I understand, tcp_max_syn_backlog configures a global
cumulative limit for all listening sockets, while somaxconn limits
the backlog of a single listening socket. If either of the two is
too small, you'll see SYN packets dropped.
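
To make the relation between the two settings and the application's
own listen() argument concrete, here is a minimal Python sketch. It
assumes a Linux host where both values are readable under /proc/sys,
and it relies on the fact that the kernel silently caps the backlog
passed to listen() at somaxconn. The helper names read_sysctl and
effective_backlog are made up for illustration.

```python
from pathlib import Path


def read_sysctl(path):
    """Return the integer stored in a /proc/sys file, or None if unreadable."""
    try:
        return int(Path(path).read_text().split()[0])
    except (OSError, ValueError, IndexError):
        return None


def effective_backlog(requested, somaxconn):
    """Backlog one listening socket ends up with.

    The application asks for `requested` in listen(); the kernel caps
    it at somaxconn. If somaxconn is unknown, assume no cap applies.
    """
    if somaxconn is None:
        return requested
    return min(requested, somaxconn)


if __name__ == "__main__":
    somaxconn = read_sysctl("/proc/sys/net/core/somaxconn")
    syn_backlog = read_sysctl("/proc/sys/net/ipv4/tcp_max_syn_backlog")
    print("somaxconn:", somaxconn)
    print("tcp_max_syn_backlog:", syn_backlog)
    # A backend that calls listen(128) still gets only 128, however
    # large somaxconn is on the host.
    print("effective backlog for listen(128):", effective_backlog(128, somaxconn))
```

With somaxconn at 500,000 the cap is not the limiting factor; the
value the backend itself passes to listen() is.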
> > Make sure to check actual listen queue sizes used on listen
> > sockets involved. On Linux (you are using Linux, right?) this
> > should be possible with "ss -nlt" (or "netstat -nlt").
> According to `ss -nlt`, send-q on these ports is set to 128. And recv-q on
> all ports is 0. I don't know what this means for recv-q, use default? And
> would default be 1024?
In "ss -nlt" output, the send-q column displays the listen queue
size for listen sockets. The number 128 here means you have a
listen queue for 128 connections only. You should tune your
backends to use bigger listen queues; 128 is certainly too small
for the concurrency of 5000 you use in your tests.
(The recv-q column should indicate the current number of
connections in the listen queue.)
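
If you want to check many listeners at once, something like the
following Python sketch may help. It reads text in the shape of
"ss -nlt" output, treating recv-q as the current queue length and
send-q as the configured queue size, as described above. The sample
text is hypothetical and not taken from a real host, and the
function names parse_ss_listen and undersized are made up for
illustration.

```python
# Hypothetical sample in the shape of `ss -nlt` output.
SAMPLE = """\
State   Recv-Q  Send-Q  Local Address:Port  Peer Address:Port
LISTEN  0       128     0.0.0.0:8080        0.0.0.0:*
LISTEN  3       1024    0.0.0.0:80          0.0.0.0:*
"""


def parse_ss_listen(text):
    """Return (local_address, current_queue, queue_size) per LISTEN line."""
    rows = []
    for line in text.splitlines():
        fields = line.split()
        # Skip the header and anything that is not a listening socket.
        if len(fields) < 4 or fields[0] != "LISTEN":
            continue
        rows.append((fields[3], int(fields[1]), int(fields[2])))
    return rows


def undersized(rows, needed):
    """Local addresses whose listen queue size is below `needed`."""
    return [local for local, _current, size in rows if size < needed]


if __name__ == "__main__":
    for local in undersized(parse_ss_listen(SAMPLE), 1000):
        print("listen queue too small on", local)
```

On a real host you would feed it the output of "ss -nlt" instead of
the sample text.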
> But according to `netstat -nlt` both queues are 0?
This means that netstat isn't showing listen queue sizes on your
host. It looks like many Linux systems still always display 0 for