Kernel stall while testing high-speed HTTPS traffic.
Ben Greear
greearb at candelatech.com
Thu May 28 19:26:55 UTC 2015
We are seeing problems with nginx (mostly) locking up the server when
running high loads of HTTPS traffic.
In this scenario we had nginx configured to
bind to eth3, but our ssh sessions on eth0 were frozen during this condition as well.
The system recovers by itself after a few minutes. (The load generation would
have stopped after a minute or two of lockup, and that may be what lets things
recover.)
We tested different kernels (4.0.4+, 4.0.0+, 3.17.8+ with local patches,
and stock 3.14.27-100.fc19.x86_64, all with the same results), different NICs (Intel 10G, Intel 40G),
and Apache as the web server.
Apache can sustain about 10.8Gbps of HTTPS traffic and shows no
instability/lockups. nginx maxes out at 2.2Gbps (until it locks up the machine).
Some kernel splats indicated that some tasks writing to the file system
journal were blocked for more than 180 seconds, but they recover, so it is not
a hard lock. The system should not be doing any heavy disk access,
since we have 32GB RAM. Swap shows no usage.
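One way to see which tasks are stuck during a stall is to list everything in
uninterruptible sleep (state D) from /proc. The following is only a minimal
sketch, not something we ran on the test box; the helper names parse_stat and
blocked_tasks are made up for illustration, and it assumes a Linux /proc layout.

```python
import os

def parse_stat(line):
    """Parse one /proc/<pid>/stat line into (pid, comm, state).

    The comm field is wrapped in parentheses and may itself contain
    spaces or parentheses, so split on the last ')'.
    """
    head, _, tail = line.rpartition(")")
    pid_str, _, comm = head.partition("(")
    state = tail.split()[0]
    return int(pid_str), comm, state

def blocked_tasks(proc_root="/proc"):
    """Return (pid, comm) for tasks in uninterruptible sleep (state D)."""
    found = []
    for entry in os.listdir(proc_root):
        if not entry.isdigit():
            continue  # skip non-process entries such as "meminfo"
        try:
            with open(os.path.join(proc_root, entry, "stat")) as f:
                pid, comm, state = parse_stat(f.read())
        except (OSError, ValueError, IndexError):
            continue  # task exited (or was unreadable) while scanning
        if state == "D":
            found.append((pid, comm))
    return found

if __name__ == "__main__":
    for pid, comm in blocked_tasks():
        print(pid, comm)
```

Running this in a loop from a console that is still responsive might show
whether the nginx workers, the journal thread, or something else is blocked.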
=== Scenario ===
The load-testing box has a direct eth3->eth3 connection over a 10Gbps port.
Curl clients using https, keepalive, requesting a 1MB file:
1000 clients @ 0.25 req/sec = 243 req/sec, 2.2Gbps tx, load 8.3
400 clients @ 0.65 req/sec = 260 req/sec, 2.2Gbps tx, load 9.2
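As a rough sanity check of these numbers, the request rates can be converted
to payload bandwidth. This sketch assumes "1MB" means 2**20 bytes and counts
only response bodies, so TLS, TCP, and HTTP header overhead are not included;
payload_gbps is a hypothetical helper name.

```python
MIB = 1024 * 1024  # assumption: "1MB" means 2**20 bytes

def payload_gbps(req_per_sec, body_bytes=MIB):
    """Response-body bytes per second, expressed in Gbit/s (10**9 bits)."""
    return req_per_sec * body_bytes * 8 / 1e9

# (clients, configured req/sec per client, measured total req/sec)
for clients, per_client, measured in ((1000, 0.25, 243), (400, 0.65, 260)):
    nominal = clients * per_client
    print(f"{clients} clients: nominal {nominal:.0f} req/sec, "
          f"measured {measured} req/sec, "
          f"payload {payload_gbps(measured):.2f} Gbps")
```

Under those assumptions the payload alone comes to roughly 2.0 to 2.2 Gbps,
which is in the same range as the reported tx rate.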
=== Environment ===
processor : 7
vendor_id : GenuineIntel
cpu family : 6
model : 63
model name : Intel(R) Xeon(R) CPU E5-1630 v3 @ 3.70GHz
> free
             total       used       free     shared    buffers     cached
Mem:      32840296    1394884   31445412          0     132792     632068
-/+ buffers/cache:     630024   32210272
Swap:     16457724          0   16457724
> cat /etc/issue
Fedora release 19 (Schrödinger’s Cat)
Kernel \r on an \m (\l)
# uname -a
Linux e5-1630-v3-qc 3.14.27-100.fc19.x86_64 #1 SMP Wed Dec 17 19:36:34 UTC 2014 x86_64 x86_64 x86_64 GNU/Linux
# /usr/local/lanforge/nginx/sbin/nginx -v
nginx version: nginx/1.9.1
We are running a small patch to allow nginx to bind to a particular interface. We
tried with this option disabled, and saw the same trouble. The exact
source is found below:
https://github.com/greearb/nginx/commits/master
We are compiling nginx with these options:
./configure --prefix=/usr/local/lanforge/nginx/ --with-http_ssl_module --with-ipv6 --without-http_rewrite_module
=== Nginx Config ===
worker_processes auto;
worker_rlimit_nofile 100000;
error_log logs/eth3_error.log;
pid /home/lanforge/vr_conf/nginx_eth3.pid;
events {
    use epoll;
    worker_connections 8096;
    multi_accept on;
}
http {
    include /usr/local/lanforge/nginx/conf/mime.types;
    default_type application/octet-stream;
    access_log off;
    sendfile on;
    directio 1m;
    disable_symlinks on;
    gzip off;
    tcp_nopush on;
    tcp_nodelay on;
    open_file_cache max=1000 inactive=10s;
    open_file_cache_valid 600s;
    open_file_cache_min_uses 2000;
    open_file_cache_errors off;
    etag off;
    server {
        listen 1.1.1.1:80 so_keepalive=on bind_dev=eth3;
        server_name nginx.local nginx web.local web;
        location / {
            root /var/www/html;
            index index.html index.htm;
        }
        error_page 500 502 503 504 /50x.html;
        location = /50x.html {
            root html;
        }
    }
    server {
        listen 1.1.1.1:443 so_keepalive=on ssl bind_dev=eth3;
        server_name nginx.local nginx web.local web;
        ssl_certificate /usr/local/lanforge/apache.crt;
        ssl_certificate_key /usr/local/lanforge/apache.key;
        location / {
            root /var/www/html;
            index index.html index.htm;
        }
        error_page 500 502 503 504 /50x.html;
        location = /50x.html {
            root html;
        }
    }
}
Any help or suggestions are appreciated.
Thanks,
Ben
--
Ben Greear <greearb at candelatech.com>
Candela Technologies Inc http://www.candelatech.com