Unexpected behaviour "aio threads" option

Bart Warmerdam bartw at xs4all.nl
Tue Nov 24 14:10:12 UTC 2015


Hello,

On a system with a load of about 500-600 URI/sec I see some unexpected
behaviour when using the "aio threads" option in the configuration.

System setup:
The system runs on RHEL6.6 with 3 workers running nginx 1.9.6 with
thread support. Content is cached and populated from a proxied upstream.
The cache location is a tmpfs file system with more than enough space
at all times. Proxy buffer size is 8k. The output buffers are at the
default (no config item, so 2 32k). Keepalive timeout is 75s. Sendfile
is enabled.

Seen behaviour:
On the WAF in front of this system I see occasional hangs on resources
(mainly larger files such as js, jpeg, ..). The WAF log shows that the
WAF waits for the transfer to complete until nginx closes the
connection at the keepalive timeout of 75s. In the nginx access.log I
see the entry served from cache (upstream server '-') with the correct
content length. In the tcpdump I see that the response to this request
contains a Content-Length header with the correct length and a server
time header more than 1 minute older than the tcpdump timestamp (all
servers are ntp-connected). The served jpeg is half-way through its
cache lifetime at that time, and earlier requests for it were served
from cache without incomplete transfers. In the tcpdump the jpeg file
starts to differ from the original after 32168 bytes and misses 8192
bytes, after which the remaining content is served (identical to the
original). From the tcpdump I can extract the file, which is missing
8192 bytes.
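
For anyone wanting to check a capture in the same way, the comparison
described above could be sketched roughly as follows. This is only an
illustration in Python: the function name find_missing_gap is
hypothetical, and it assumes that exactly one contiguous range of bytes
is missing and that the rest of the file is unchanged.

```python
from typing import Optional, Tuple


def find_missing_gap(original: bytes, received: bytes) -> Optional[Tuple[int, int]]:
    """Return (offset, length) of a single contiguous range that is
    present in `original` but absent from `received`, or None if the
    difference cannot be explained that way.

    Assumption: at most one gap, no other corruption.
    """
    missing = len(original) - len(received)
    if missing <= 0:
        # Nothing is missing (or the received file is longer).
        return None

    # Length of the common prefix of both files.
    prefix = 0
    while prefix < len(received) and original[prefix] == received[prefix]:
        prefix += 1

    # After skipping `missing` bytes in the original, the remainder
    # must match the rest of the received file exactly.
    if received[prefix:] == original[prefix + missing:]:
        return prefix, missing
    return None
```

The two inputs would be the bytes of the original file and of the file
extracted from the tcpdump. If the bytes just before and just after the
gap happen to be equal, the reported offset is the latest position that
is consistent with the data.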

We also have a dump in which the same behaviour was seen during the
proxied call. The upstream call is started to get a jpeg from the
origin. After a few packets the data is sent to the WAF. The complete
upstream file is retrieved (the tcpdump confirms that the jpeg is
complete and correctly retrieved), but not all of the data is sent to
the listening socket to the WAF.


If I change the setup to "aio on" or "aio off" this behaviour is not
seen. This is the only change in the configuration between the tests.
It looks like this behaviour only affects bigger files. I have not seen
this effect on small files or proxied responses.


Does anyone have the same experience with this option? And what is the
best way to proceed in tracing this?

Regards,

B.



More information about the nginx-devel mailing list