Kernel stall while testing high-speed HTTPS traffic.
Ben Greear
greearb at candelatech.com
Thu May 28 22:24:23 UTC 2015
Some additional info was requested:
[root at e5-1630-v3-qc lanforge]# openssl engine -tt
(rdrand) Intel RDRAND engine
[ available ]
(dynamic) Dynamic engine loading support
[ unavailable ]
[root at e5-1630-v3-qc lanforge]# openssl version
OpenSSL 1.0.1e-fips 11 Feb 2013
[root at e5-1630-v3-qc lanforge]# openssl speed -multi ^C
# NOTE: My CPU supports AES-NI instructions...do I need to do anything
# special to enable that with nginx, or should it be working by default?
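
On the AES-NI question, one quick check is whether the kernel reports the "aes" CPU flag in /proc/cpuinfo. The Python sketch below does only that. The helper name cpu_has_flag is made up for illustration, and a positive result says nothing about whether nginx or this OpenSSL build actually uses the instructions.

```python
def cpu_has_flag(cpuinfo_text, flag="aes"):
    """Return True if any 'flags' line in /proc/cpuinfo-style text lists `flag`."""
    for line in cpuinfo_text.splitlines():
        key, sep, value = line.partition(":")
        # Compare whole whitespace-separated tokens, so "aes" does not
        # accidentally match a longer flag name that merely contains it.
        if sep and key.strip() == "flags" and flag in value.split():
            return True
    return False


if __name__ == "__main__":
    # Assumes a Linux host, where /proc/cpuinfo exists.
    with open("/proc/cpuinfo") as f:
        print("aes flag present:", cpu_has_flag(f.read()))
```

As a rough follow-up, comparing "openssl speed aes-128-cbc" against "openssl speed -evp aes-128-cbc" is a common way to see whether the accelerated code path is being taken.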
[root at e5-1630-v3-qc lanforge]# openssl speed -multi 4 rsa2048 ecdsap256
Forked child 0
Forked child 1
Forked child 2
Forked child 3
+DTP:2048:private:rsa:10
+DTP:2048:private:rsa:10
+DTP:2048:private:rsa:10
+DTP:2048:private:rsa:10
+R1:10253:2048:10.00
+DTP:2048:public:rsa:10
+R1:10345:2048:10.00
+DTP:2048:public:rsa:10
+R1:5385:2048:10.00
+DTP:2048:public:rsa:10
+R1:5387:2048:10.00
+DTP:2048:public:rsa:10
+R2:334855:2048:10.00
+R2:336207:2048:10.00
+DTP:256:sign:ecdsa:10
+DTP:256:sign:ecdsa:10
+R2:185283:2048:10.00
+R2:185265:2048:10.00
+DTP:256:sign:ecdsa:10
+DTP:256:sign:ecdsa:10
+R5:115623:256:10.00
+R5:116966:256:10.00
+DTP:256:verify:ecdsa:10
+DTP:256:verify:ecdsa:10
+R5:64033:256:10.00
+R5:64223:256:10.00
+DTP:256:verify:ecdsa:10
+DTP:256:verify:ecdsa:10
+R6:29783:256:10.00
+R6:30572:256:10.00
Got: +F2:2:2048:0.000967:0.000030 from 0
Got: +F4:3:256:0.000085:0.000327 from 0
+R6:15179:256:10.00
+R6:15196:256:10.00
Got: +F2:2:2048:0.001857:0.000054 from 1
Got: +F4:3:256:0.000156:0.000658 from 1
Got: +F2:2:2048:0.000975:0.000030 from 2
Got: +F4:3:256:0.000086:0.000336 from 2
Got: +F2:2:2048:0.001856:0.000054 from 3
Got: +F4:3:256:0.000156:0.000659 from 3
OpenSSL 1.0.1e-fips 11 Feb 2013
built on: Thu Oct 16 11:09:39 UTC 2014
options:bn(64,64) md2(int) rc4(16x,int) des(idx,cisc,16,int) aes(partial) idea(int) blowfish(idx)
compiler: gcc -fPIC -DOPENSSL_PIC -DZLIB -DOPENSSL_THREADS -D_REENTRANT -DDSO_DLFCN -DHAVE_DLFCN_H -DKRB5_MIT -m64 -DL_ENDIAN -DTERMIO -Wall -O2 -g -pipe -Wall
-Wp,-D_FORTIFY_SOURCE=2 -fexceptions -fstack-protector --param=ssp-buffer-size=4 -grecord-gcc-switches -m64 -mtune=generic -Wa,--noexecstack -DPURIFY
-DOPENSSL_IA32_SSE2 -DOPENSSL_BN_ASM_MONT -DOPENSSL_BN_ASM_MONT5 -DOPENSSL_BN_ASM_GF2m -DSHA1_ASM -DSHA256_ASM -DSHA512_ASM -DMD5_ASM -DAES_ASM -DVPAES_ASM
-DBSAES_ASM -DWHIRLPOOL_ASM -DGHASH_ASM
                  sign    verify    sign/s verify/s
rsa 2048 bits 0.000319s 0.000010s   3137.1 103703.7
                              sign    verify    sign/s verify/s
 256 bit ecdsa (nistp256)   0.0000s   0.0001s  36213.1   9071.5
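
The aggregate sign/s and verify/s figures appear to be the sum of the per-child rates, where each child's rate is the reciprocal of the seconds-per-operation it reports on its "Got:" line. The Python sketch below recomputes the totals from the lines above. Reading +F2 as the RSA result and +F4 as the ECDSA result is inferred from the key sizes in this output, and the function name aggregate is just for illustration.

```python
import re
from collections import defaultdict

# "Got:" lines copied from the openssl speed -multi 4 output above.
GOT_LINES = """\
Got: +F2:2:2048:0.000967:0.000030 from 0
Got: +F4:3:256:0.000085:0.000327 from 0
Got: +F2:2:2048:0.001857:0.000054 from 1
Got: +F4:3:256:0.000156:0.000658 from 1
Got: +F2:2:2048:0.000975:0.000030 from 2
Got: +F4:3:256:0.000086:0.000336 from 2
Got: +F2:2:2048:0.001856:0.000054 from 3
Got: +F4:3:256:0.000156:0.000659 from 3
"""

# Fields: result kind, index, key bits, seconds per sign, seconds per verify, child.
PATTERN = re.compile(r"Got: \+(F2|F4):\d+:(\d+):([\d.]+):([\d.]+) from (\d+)")


def aggregate(text):
    """Sum per-child operations per second for each (kind, bits) pair."""
    totals = defaultdict(lambda: [0.0, 0.0])
    for match in PATTERN.finditer(text):
        kind, bits, sign_s, verify_s, _child = match.groups()
        entry = totals[(kind, int(bits))]
        entry[0] += 1.0 / float(sign_s)    # signs per second for this child
        entry[1] += 1.0 / float(verify_s)  # verifies per second for this child
    return {key: tuple(value) for key, value in totals.items()}


if __name__ == "__main__":
    for (kind, bits), (sign, verify) in sorted(aggregate(GOT_LINES).items()):
        print(f"{kind} {bits} bits: {sign:.1f} sign/s, {verify:.1f} verify/s")
```

The per-child lines also show children 1 and 3 taking roughly twice as long per operation as children 0 and 2. The output alone does not say why.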
# NOTE on the below ldd info: the /home/lanforge/libssl.so.10 and libcrypto.so.10 are
# just copies of the same files from /usr/lib64/
[root at e5-1630-v3-qc lanforge]# ldd /usr/local/lanforge/nginx/sbin/nginx
linux-vdso.so.1 => (0x00007fff5d7fe000)
libpthread.so.0 => /lib64/libpthread.so.0 (0x0000003a9d800000)
libcrypt.so.1 => /lib64/libcrypt.so.1 (0x0000003aae000000)
libssl.so.10 => /home/lanforge/libssl.so.10 (0x00000033fe000000)
libcrypto.so.10 => /home/lanforge/libcrypto.so.10 (0x00000033f8000000)
libdl.so.2 => /lib64/libdl.so.2 (0x0000003a9d400000)
libz.so.1 => /lib64/libz.so.1 (0x0000003a9dc00000)
libc.so.6 => /lib64/libc.so.6 (0x0000003a9d000000)
/lib64/ld-linux-x86-64.so.2 (0x0000003a9c800000)
libfreebl3.so => /lib64/libfreebl3.so (0x0000003aac800000)
libgssapi_krb5.so.2 => /lib64/libgssapi_krb5.so.2 (0x0000003aae400000)
libkrb5.so.3 => /lib64/libkrb5.so.3 (0x0000003ab2400000)
libcom_err.so.2 => /lib64/libcom_err.so.2 (0x0000003aadc00000)
libk5crypto.so.3 => /lib64/libk5crypto.so.3 (0x0000003aae800000)
libkrb5support.so.0 => /lib64/libkrb5support.so.0 (0x0000003ab1800000)
libkeyutils.so.1 => /lib64/libkeyutils.so.1 (0x0000003aaf400000)
libresolv.so.2 => /lib64/libresolv.so.2 (0x0000003a9f400000)
libselinux.so.1 => /lib64/libselinux.so.1 (0x0000003a9e800000)
libpcre.so.1 => /lib64/libpcre.so.1 (0x0000003a9e400000)
[root at e5-1630-v3-qc lanforge]# lspci|grep -F Eth
02:00.0 Ethernet controller: Intel Corporation 82599ES 10-Gigabit SFI/SFP+ Network Connection (rev 01)
02:00.1 Ethernet controller: Intel Corporation 82599ES 10-Gigabit SFI/SFP+ Network Connection (rev 01)
03:00.0 Ethernet controller: Intel Corporation Ethernet Controller XL710 for 40GbE QSFP+ (rev 01)
03:00.1 Ethernet controller: Intel Corporation Ethernet Controller XL710 for 40GbE QSFP+ (rev 01)
07:00.0 Ethernet controller: Intel Corporation I350 Gigabit Network Connection (rev 01)
07:00.1 Ethernet controller: Intel Corporation I350 Gigabit Network Connection (rev 01)
I am using in-kernel drivers, and I am quite sure it is not a NIC issue, since this
same system can sustain 10.8Gbps of HTTPS traffic served by Apache, and the 40G NICs can
sustain 20+Gbps of UDP traffic. So I skipped the NIC stats that were requested. If they
really are needed, I can gather that info.
Thanks,
Ben
On 05/28/2015 12:26 PM, Ben Greear wrote:
> We are seeing problems with nginx (mostly) locking up the server when
> running high loads of HTTPS traffic.
>
> In this scenario nginx was configured to bind to eth3, but our ssh
> sessions on eth0 froze during this condition as well.
> The system recovers on its own after a few minutes (the load generation would
> have stopped after a minute or two of lockup, which may be what lets things
> recover).
>
> We tested different kernels (4.0.4+, 4.0.0+, 3.17.8+ with local patches,
> and stock 3.14.27-100.fc19.x86_64, all with the same results), different NICs
> (Intel 10G, Intel 40G), and Apache as the web server for comparison.
>
> Apache can sustain about 10.8Gbps of HTTPS traffic and shows no
> instability/lockups. nginx maxes out at 2.2Gbps (until it locks up the machine).
>
> Some kernel splats indicated that writes to the file system journal
> were blocked for more than 180 seconds, but they recover, so it is not
> a hard lock. The system should not be doing any heavy disk access,
> since we have 32GB of RAM. Swap shows no usage.
>
> === Scenario ===
> The load-testing box is connected directly, eth3 to eth3, over a 10Gbps port.
>
> Curl clients using https, keepalive, requesting a 1MB file:
> 1000 clients @ 0.25 req/sec = 243 req/sec, 2.2Gbps tx, load 8.3
> 400 clients @ 0.65 req/sec = 260 req/sec, 2.2Gbps tx, load 9.2
>
>
>
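
As a rough cross-check of the scenario figures quoted above, the Python sketch below computes the nominal request rate (clients times per-client rate) and the payload bandwidth implied by the stated request rate. It assumes that "1MB" means 2**20 bytes and counts file payload only, so TLS, TCP/IP, and Ethernet overhead are not included. The function names are made up for illustration.

```python
MIB = 1024 * 1024  # assumption: the "1MB" test file is 2**20 bytes


def nominal_req_per_sec(clients, per_client_rate):
    """Request rate if every client hits its configured rate exactly."""
    return clients * per_client_rate


def payload_gbps(req_per_sec, body_bytes=MIB):
    """Payload-only bandwidth in Gbps (10**9 bits per second)."""
    return req_per_sec * body_bytes * 8 / 1e9


if __name__ == "__main__":
    # (clients, per-client req/sec, req/sec as stated in the scenario)
    for clients, rate, stated in [(1000, 0.25, 243), (400, 0.65, 260)]:
        nominal = nominal_req_per_sec(clients, rate)
        print(f"{clients} clients: nominal {nominal:.0f} req/sec, "
              f"stated {stated} req/sec, "
              f"payload {payload_gbps(stated):.2f} Gbps")
```

Under these assumptions the payload works out to about 2.04 Gbps and 2.18 Gbps, both somewhat under the 2.2Gbps tx figure, which is at least consistent with protocol overhead making up the difference. The first line's nominal rate is 250 req/sec against the stated 243, which may simply be a measured value.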
> === Environment ===
> processor : 7
> vendor_id : GenuineIntel
> cpu family : 6
> model : 63
> model name : Intel(R) Xeon(R) CPU E5-1630 v3 @ 3.70GHz
>
>> free
>              total       used       free     shared    buffers     cached
> Mem:      32840296    1394884   31445412          0     132792     632068
> -/+ buffers/cache:     630024   32210272
> Swap:     16457724          0   16457724
>
>> cat /etc/issue
> Fedora release 19 (Schrödinger’s Cat)
> Kernel \r on an \m (\l)
>
> # uname -a
> Linux e5-1630-v3-qc 3.14.27-100.fc19.x86_64 #1 SMP Wed Dec 17 19:36:34 UTC 2014 x86_64 x86_64 x86_64 GNU/Linux
>
> # /usr/local/lanforge/nginx/sbin/nginx -v
> nginx version: nginx/1.9.1
>
> We are running a small patch that allows nginx to bind to a particular
> interface. We tried with this option disabled, and that causes the same
> trouble. The exact source is found below:
>
> https://github.com/greearb/nginx/commits/master
>
> We are compiling nginx with these options:
>
> ./configure --prefix=/usr/local/lanforge/nginx/ --with-http_ssl_module --with-ipv6 --without-http_rewrite_module
>
> === Nginx Config ===
>
> worker_processes auto;
> worker_rlimit_nofile 100000;
> error_log logs/eth3_error.log;
> pid /home/lanforge/vr_conf/nginx_eth3.pid;
> events {
>     use epoll;
>     worker_connections 8096;
>     multi_accept on;
> }
> http {
>     include /usr/local/lanforge/nginx/conf/mime.types;
>     default_type application/octet-stream;
>     access_log off;
>     sendfile on;
>     directio 1m;
>     disable_symlinks on;
>     gzip off;
>     tcp_nopush on;
>     tcp_nodelay on;
>
>     open_file_cache max=1000 inactive=10s;
>     open_file_cache_valid 600s;
>     open_file_cache_min_uses 2000;
>     open_file_cache_errors off;
>     etag off;
>
>     server {
>         listen 1.1.1.1:80 so_keepalive=on bind_dev=eth3;
>         server_name nginx.local nginx web.local web;
>
>         location / {
>             root /var/www/html;
>             index index.html index.htm;
>         }
>         error_page 500 502 503 504 /50x.html;
>         location = /50x.html {
>             root html;
>         }
>     }
>     server {
>         listen 1.1.1.1:443 so_keepalive=on ssl bind_dev=eth3;
>         server_name nginx.local nginx web.local web;
>         ssl_certificate /usr/local/lanforge/apache.crt;
>         ssl_certificate_key /usr/local/lanforge/apache.key;
>         location / {
>             root /var/www/html;
>             index index.html index.htm;
>         }
>         error_page 500 502 503 504 /50x.html;
>         location = /50x.html {
>             root html;
>         }
>     }
> }
>
>
> Any help or suggestions are appreciated.
>
> Thanks,
> Ben
>
--
Ben Greear <greearb at candelatech.com>
Candela Technologies Inc http://www.candelatech.com