Identifying "Writing" connections in status stub
nginx-ml at acheronmedia.hr
Sun Jul 30 13:31:51 UTC 2017
On 2017-07-30 13:30, Peter Booth wrote:
> It appears that you have a lot of data that could help in this
> How frequently is the status page being queried? Does every status
> datapoint get recorded
> or is munin showing some rolled up rrd data?
The nginx status page is queried every 5 minutes (the default Munin
polling interval), and Munin stores the raw metrics in an RRD database. But
Munin is not important in this issue. I get the same values if I query the status page
> If you open the status page in a browser do the numbers report match
> what you see with netstat?
# netstat -n | grep -E "tcp4|tcp6" | grep ESTABLISHED | wc -l \
&& echo "----------------------------" \
&& fetch -qo - http://10.0.0.4/nginx_status
Active connections: 89
server accepts handled requests
669843 669843 3158515
Reading: 0 Writing: 22 Waiting: 82
And I ran it a few times with several minutes in between, the above is
just an example from the last run. This is inside the nginx jail, so
grepping tcp4|tcp6 shows only connections to the nginx server.
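As far as I know, stub_status only exposes aggregate counters and does not say which connection is in which state, so the closest rough approximation is to group the same netstat output by remote address. This is a minimal sketch, assuming FreeBSD's `netstat -n` layout (foreign address in the fifth column, written as `address.port`); the function name `top_remote_ips` is hypothetical.

```shell
# Hypothetical helper: reads `netstat -n` output on stdin and prints the
# number of ESTABLISHED TCP connections per remote address, busiest first.
# Assumes the FreeBSD column layout: proto recv-q send-q local foreign state.
top_remote_ips() {
    awk '$1 ~ /^tcp[46]$/ && $6 == "ESTABLISHED" {
             addr = $5
             sub(/\.[0-9]+$/, "", addr)   # drop the trailing .port
             n[addr]++
         }
         END { for (a in n) printf "%d %s\n", n[a], a }' | sort -rn
}
```

Usage would be `netstat -n | top_remote_ips`. It shows who holds connections, not which of them nginx counts as "Writing".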
Now, the part I don't quite understand is whether Active = Reading +
Writing + Waiting. The above certainly doesn't seem to suggest so.
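To make the comparison explicit, the status output can be reduced to the two numbers in question. This is a minimal sketch, assuming the stub_status text layout shown above; the function name `check_status_sum` is hypothetical.

```shell
# Hypothetical helper: reads stub_status text on stdin and prints the
# Active count next to the sum Reading + Writing + Waiting.
check_status_sum() {
    awk '
        /^Active connections:/ { active = $3 }
        /^Reading:/ { reading = $2; writing = $4; waiting = $6 }
        END {
            sum = reading + writing + waiting
            printf "active=%d sum=%d diff=%d\n", active, sum, sum - active
        }
    '
}
```

Usage would be `fetch -qo - http://10.0.0.4/nginx_status | check_status_sum`. For the sample above the sum is 0 + 22 + 82 = 104 against 89 active, a difference of 15.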
> Do you have a hypothesis that explains
> why the graph could jump back to 12/13, rather than spend a few days
> increasing linearly in the way it did from
> the 18th to the 23rd?
Bots crawling the sites, pacing themselves over a longer time frame so
there's no correlation to daily sinusoid caused by live visitors. We do
have a lot of resources on all those sites to crawl through. They're all
real estate agency sites, and there are tens of thousands of pages with
hundreds of thousands of images. And looking at the logs, there are quite a
number of requests from bots (that are decent enough to say they're bots).
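A rough way to quantify the self-identified bots is to count requests per user agent. This is a minimal sketch, assuming the access log uses the standard "combined" format (user agent as the last quoted field); the function name `count_bot_agents` and the keyword list are illustrative only.

```shell
# Hypothetical helper: reads access log lines (combined format assumed) on
# stdin and prints a request count per user agent whose string contains
# "bot", "crawl" or "spider" (case-insensitive), busiest first.
count_bot_agents() {
    awk -F'"' 'tolower($6) ~ /bot|crawl|spider/ { n[$6]++ }
               END { for (ua in n) printf "%d %s\n", n[ua], ua }' | sort -rn
}
```

Usage would be `count_bot_agents < access.log`, with the path depending on the local setup. Bots that do not announce themselves in the user agent are not counted.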
We've deviated a bit into assuming this is a bug or some unexpected
behavior (my fault for suggesting it in the beginning). That's why all I
wanted to do was to check which IPs are those that nginx considers
"Writing" to. The only reason this caught my attention was the apparently
"flat" appearance of Writing, but now thinking about bots, this could be
> How long was nginx down for? If you graph only the “writing”
> variable for just 23rd July does the length of
> time that the # of writing connections is thought to be 0 make sense?
It was only restarted. It appears the "offending" connections started
showing up less than an hour later.
> I wonder whether what you are seeing could be a side-effect of the
> server being in a FreeBSD jail?
I doubt it. I used to see this when the server was on Debian Jessie, but
it was much less noticeable. Then again, back then we had much less
traffic and much less content.
> Do any of the other nginx sites in other jails exhibit the same
There is only one instance of nginx running on the server. Individual
sites are only running php-fpm or uwsgi in their jails.
> In FreeBSD jails is there an equivalent of Dom0 in a Xen hypervisor? A
> parent or root OS?
FreeBSD jails are OS-level virtualization. It's basically similar to
containers on Linux but with more isolation (it's not just namespacing).
> If so, do you see all connections on all jails when you log into it? I'm
> wondering if you are hitting some ulimit or
> resource shortage on the host as a whole?
I don't think it's that, as limits are far above the current demands for
traffic, and there's nothing logged about potential resource exhaustion.
Thanks for helping me figure this out.