Quick performance deterioration when No. of clients increases

Nikolaos Milas nmilas at noa.gr
Wed Oct 16 10:32:55 UTC 2013


On 14/10/2013 5:47 PM, Toni Mueller wrote:

> did you investigate disk I/O?

Hi again,

Thanks for your suggestions (see below on that).

In the meantime, we have increased CPU power to 4 cores and the behavior 
of the server is much better.

I found that php-fpm was the bottleneck because the microcache was NOT 
being used: most pages were returning codes 303 and 502, and these 
return codes are not included in fastcgi_cache_valid by default. When I 
set:

    fastcgi_cache_valid 200 301 302 303 502 3s;

then I saw immediate performance gains, and the Unix load average 
dropped to almost 0 (from 100, not a typo) during load.

I used iostat during a load test and I didn't see any serious stress on 
I/O. The worst (max load) recorded entry is:

==========================================================================================================
avg-cpu:  %user   %nice %system %iowait  %steal   %idle
          85.43    0.00   12.96    0.38    0.00    1.23

Device:  rrqm/s  wrqm/s    r/s     w/s  rsec/s   wsec/s avgrq-sz avgqu-sz  await  svctm  %util
vda        0.00  136.50   0.00   21.20    0.00  1260.00    59.43     1.15  54.25   3.92   8.30
dm-0       0.00    0.00   0.00  157.50    0.00  1260.00     8.00    13.39  85.04   0.53   8.29
dm-1       0.00    0.00   0.00    0.00    0.00     0.00     0.00     0.00   0.00   0.00   0.00
==========================================================================================================

Can you see a serious problem here? (I am not an expert, but, judging 
from what I've read on the Internet, it should not be bad.)
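
As a quick consistency check on the iostat figures above, here is a minimal sketch that recomputes the average request size (avgrq-sz) and the write throughput from the other columns. It assumes iostat counts 512-byte sectors; the function names are illustrative only.

```python
# Figures copied from the iostat sample above (devices vda and dm-0).
SECTOR_BYTES = 512  # assumption: iostat reports 512-byte sectors


def avg_request_sectors(r_per_s, w_per_s, rsec_per_s, wsec_per_s):
    """Average request size in sectors, i.e. what iostat calls avgrq-sz."""
    requests = r_per_s + w_per_s
    if requests == 0:
        return 0.0
    return (rsec_per_s + wsec_per_s) / requests


def write_kib_per_s(wsec_per_s):
    """Write throughput in KiB/s derived from sectors written per second."""
    return wsec_per_s * SECTOR_BYTES / 1024


vda_avgrq = avg_request_sectors(0.00, 21.20, 0.00, 1260.00)
dm0_avgrq = avg_request_sectors(0.00, 157.50, 0.00, 1260.00)

print(round(vda_avgrq, 2))       # 59.43, matches the vda row
print(round(dm0_avgrq, 2))       # 8.0, matches the dm-0 row
print(write_kib_per_s(1260.00))  # 630.0 KiB/s written
```

Under that assumption the disk is writing roughly 630 KiB/s at about 8% utilization, which agrees with the low %iowait figure.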

Now my problem is that performance seems to be capped at around 1200 
req/sec (which is not too bad anyway), although CPU and memory remain 
ample throughout the test. Increasing the load beyond that (I am using 
tsung for load testing) only results in more "error_connect_emfile" 
errors.

See the attached results of a test: 100 users arriving per second for 5 
minutes (with max 10000 users), each of them hitting the homepage 100 
times. Details of the test are at the bottom of this mail.
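
As a rough sanity check on this scenario, the sketch below estimates the offered request rate and the number of simultaneous users with Little's law (N = arrival rate x time in system). It assumes a mean think time of 1 second, ignores server response time (so the session length is only a lower bound), and treats maxusers as a simple cap on simultaneous users; the function names are illustrative only.

```python
def offered_request_rate(arrival_rate, requests_per_session):
    """Requests per second the scenario asks for once it reaches steady state."""
    return arrival_rate * requests_per_session


def steady_state_users(arrival_rate, session_seconds, max_users=None):
    """Little's law: simultaneous users = arrival rate * time each user stays."""
    users = arrival_rate * session_seconds
    if max_users is not None:
        users = min(users, max_users)
    return users


ARRIVAL_RATE = 100         # users per second (arrivalrate in tsung.xml)
REQUESTS_PER_SESSION = 100 # loop from 1 to 100
MEAN_THINK_SECONDS = 1     # assumption: thinktime value is the mean
MAX_USERS = 10000          # maxusers, treated here as a plain cap

# Lower bound on session length: think time only, zero response time.
min_session_seconds = REQUESTS_PER_SESSION * MEAN_THINK_SECONDS

print(offered_request_rate(ARRIVAL_RATE, REQUESTS_PER_SESSION))
print(steady_state_users(ARRIVAL_RATE, min_session_seconds, MAX_USERS))
```

Under these assumptions the scenario asks for about 10000 req/sec and keeps up to 10000 users connected at once, so the load generator itself needs roughly one socket (file descriptor) per user.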

My research suggests that this is a result of file descriptor 
exhaustion; however, I could not find the root cause. The following 
seem OK:

# cat /proc/sys/fs/file-max
592940
# ulimit -n
200000
# ulimit -Hn
200000
# ulimit -Sn
200000
# grep nofile /etc/security/limits.conf
* - nofile 200000

Could you please guide me on how to resolve this issue? What is the real 
bottleneck here, and how can I overcome it?

My config remains as initially posted (it can also be seen here: 
https://www.ruby-forum.com/topic/4417776), the only difference being 
"worker_processes 4" (since we now have 4 CPU cores).

Please advise.

============================= tsung.xml <start> =============================

<?xml version="1.0"?>
<!DOCTYPE tsung SYSTEM "/usr/share/tsung/tsung-1.0.dtd">

<tsung loglevel="debug" dumptraffic="false" version="1.0">

<clients>
<client host="localhost" use_controller_vm="true" maxusers="10000"/>
</clients>

<servers>
<server host="www.example.com" port="80" type="tcp"></server>
</servers>

<load duration="5" unit="minute">
<arrivalphase phase="1" duration="5" unit="minute">
<users arrivalrate="100" unit="second"/>
</arrivalphase>
</load>

<sessions>
<session probability="100" name="hit_en_homepage" type="ts_http">
<for from="1" to="100" var="i">
<request><http url='/' version='1.1' method='GET'></http></request>
<thinktime random='true' value='1'/>
</for>
</session>
</sessions>

</tsung>

============================== tsung.xml <end> ==============================

Thanks and Regards,
Nick

-------------- next part --------------
A non-text attachment was scrubbed...
Name: graphes-Perfs-rate_tn.png
Type: image/png
Size: 3023 bytes
Desc: not available
URL: <http://mailman.nginx.org/pipermail/nginx/attachments/20131016/64ab8551/attachment-0005.png>
-------------- next part --------------
A non-text attachment was scrubbed...
Name: graphes-Users_Arrival-rate_tn.png
Type: image/png
Size: 3530 bytes
Desc: not available
URL: <http://mailman.nginx.org/pipermail/nginx/attachments/20131016/64ab8551/attachment-0006.png>
-------------- next part --------------
A non-text attachment was scrubbed...
Name: graphes-Users-simultaneous_tn.png
Type: image/png
Size: 2924 bytes
Desc: not available
URL: <http://mailman.nginx.org/pipermail/nginx/attachments/20131016/64ab8551/attachment-0007.png>
-------------- next part --------------
A non-text attachment was scrubbed...
Name: graphes-Errors-rate_tn.png
Type: image/png
Size: 3370 bytes
Desc: not available
URL: <http://mailman.nginx.org/pipermail/nginx/attachments/20131016/64ab8551/attachment-0008.png>
-------------- next part --------------
A non-text attachment was scrubbed...
Name: graphes-Perfs-mean_tn.png
Type: image/png
Size: 3223 bytes
Desc: not available
URL: <http://mailman.nginx.org/pipermail/nginx/attachments/20131016/64ab8551/attachment-0009.png>

