Strange $upstream_response_time latency spikes with reverse proxy

Maxim Dounin mdounin at mdounin.ru
Tue Mar 19 14:19:29 UTC 2013


Hello!

On Mon, Mar 18, 2013 at 02:19:26PM -0700, Jay Oster wrote:

> On Sun, Mar 17, 2013 at 4:42 AM, Maxim Dounin <mdounin at mdounin.ru> wrote:
> 
> > On "these hosts"?  Note that listen queue aka backlog size is
> > configured in _applications_ which call listen().  At a host level
> > you may only configure somaxconn, which is maximum allowed listen
> > queue size (but an application may still use anything lower, even
> > just 1).
> >
> 
> "These hosts" means we have a lot of servers in production right now, and
> they all exhibit the same issue. It hasn't been a showstopper, but it's
> been occurring for as long as anyone can remember. The total number of
> upstream servers on a typical day is 6 machines (each running 3 service
> processes), and hosts running nginx account for another 4 machines. All of
> these are Ubuntu 12.04 64-bit VMs running on AWS EC2 m3.xlarge instance
> types.
> 
> I was under the impression that /proc/sys/net/ipv4/tcp_max_syn_backlog was
> for configuring the maximum queue size on the host. It's set to 1024, here,
> and increasing the number doesn't change the frequency of the missed
> packets.
> 
> /proc/sys/net/core/somaxconn is set to 500,000

As far as I understand, tcp_max_syn_backlog configures a global 
cumulative limit for all listening sockets, while somaxconn limits 
the backlog of a single listening socket.  If either of the two is 
too small, you'll see SYN packets dropped.
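
To illustrate how the host-level cap and the application's own 
request interact, here is a minimal Python sketch.  It relies on the 
documented Linux behaviour that listen() silently truncates a backlog 
larger than somaxconn.  The helper names read_sysctl and 
effective_backlog are made up for this example and are not part of 
any library.

```python
from pathlib import Path

SOMAXCONN_PATH = "/proc/sys/net/core/somaxconn"
SYN_BACKLOG_PATH = "/proc/sys/net/ipv4/tcp_max_syn_backlog"


def read_sysctl(path, default=None):
    """Return the integer stored in a /proc/sys file, or `default`
    if the file is missing or unreadable (e.g. on a non-Linux host)."""
    try:
        return int(Path(path).read_text().split()[0])
    except (OSError, ValueError, IndexError):
        return default


def effective_backlog(requested, somaxconn):
    """Linux caps the backlog passed to listen() at somaxconn, so the
    queue actually used is the smaller of the two values."""
    return min(requested, somaxconn)
```

With the numbers from this thread, effective_backlog(128, 500000) 
gives 128: a large somaxconn doesn't help while the application 
itself only asks for 128.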

> > Make sure to check actual listen queue sizes used on listen
> > sockets involved.  On Linux (you are using Linux, right?) this
> > should be possible with "ss -nlt" (or "netstat -nlt").
> 
> 
> According to `ss -nlt`, send-q on these ports is set to 128. And recv-q on
> all ports is 0. I don't know what this means for recv-q, use default? And
> would default be 1024?

In "ss -nlt" output, the send-q column is used to display the listen 
queue size for listen sockets.  The number 128 here means you have a 
listen queue for only 128 connections.  You should tune your backends 
to use bigger listen queues; 128 is certainly too small for the 
concurrency of 5000 you use in your tests.
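
How the listen queue is raised depends on the backend software, which 
isn't stated in this thread.  As a sketch only, assuming a backend 
written in Python with the standard socket module, the listen queue 
size is the argument passed to listen().  The function name 
make_listener and the value 8192 are illustrative choices.

```python
import socket


def make_listener(host="127.0.0.1", port=0, backlog=8192):
    """Create a TCP listening socket with an explicit listen queue size.

    `backlog` is only a request: on Linux the kernel caps it at
    net.core.somaxconn.  port=0 asks the OS for any free port.
    """
    sock = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    sock.setsockopt(socket.SOL_SOCKET, socket.SO_REUSEADDR, 1)
    sock.bind((host, port))
    sock.listen(backlog)  # this argument is what ss reports as send-q
    return sock
```

Many servers expose the same setting as a configuration option 
instead, so check the documentation of whatever the backends run.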

(The recv-q column should indicate the current number of connections 
in the listen queue.)
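
The reading of the two columns described above can be sketched as a 
small parser for "ss -nlt" output.  This is an illustration under 
assumptions: the column order (State, Recv-Q, Send-Q, Local 
Address:Port) is taken from common ss versions, the function names 
are hypothetical, and SAMPLE is made-up input, not output from the 
hosts in this thread.

```python
import subprocess


def parse_listen_queues(ss_output):
    """Return (local_address, current_queue, queue_limit) for every
    LISTEN line: recv-q is the current number of queued connections,
    send-q is the configured listen queue size."""
    rows = []
    for line in ss_output.splitlines():
        fields = line.split()
        if len(fields) < 4 or fields[0] != "LISTEN":
            continue  # skips the header and any non-listen lines
        rows.append((fields[3], int(fields[1]), int(fields[2])))
    return rows


def smaller_than(rows, threshold):
    """List sockets whose listen queue size is below `threshold`
    (a rough check, e.g. against the expected concurrency)."""
    return [local for local, _current, limit in rows if limit < threshold]


def run_ss():
    """Run `ss -nlt` on the local host and return its text output."""
    return subprocess.run(["ss", "-nlt"], capture_output=True,
                          text=True, check=True).stdout


# Made-up sample input for illustration only.
SAMPLE = """\
State   Recv-Q  Send-Q  Local Address:Port  Peer Address:Port
LISTEN  0       128     *:8080              *:*
LISTEN  0       8192    *:80                *:*
"""
```

For example, smaller_than(parse_listen_queues(SAMPLE), 5000) reports 
only the socket with a listen queue of 128; on a real host, pass the 
result of run_ss() instead of SAMPLE.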

> But according to `netstat -nlt` both queues are 0?

This means that netstat isn't showing listen queue sizes on your 
host.  It looks like many Linux systems still always display 0 for 
listen sockets.

-- 
Maxim Dounin
http://nginx.org/en/donation.html


