high "Load Average"
Sessna
nginx-forum at nginx.us
Tue Mar 16 18:53:28 MSK 2010
Stefan Parvu Wrote:
-------------------------------------------------------
> I meant nicstat. Works on Linux, Solaris.
> Probable worth of considering porting this to FreeBSD too.
here are latest results from netstat and nicstat tools
netstat -ant | grep ESTABLISHED | egrep ':80\>' | wc -l
1157
We can say 1157 HTTP sessions are established (and may be kept alive)
nicstat
Time Int rKB/s wKB/s rPk/s wPk/s rAvs wAvs %Util Sat
10:24:02 lo 0.33 0.33 290.8 290.8 1.16 1.16 0.00 0.00
10:24:02 eth0 0.29 0.79 1183.8 1448.1 0.25 0.56 0.00 0.00
Time Int rKB/s wKB/s rPk/s wPk/s rAvs wAvs %Util Sat
10:24:03 lo 153.7 153.7 338.6 338.6 464.7 464.7 0.00 0.00
10:24:03 eth0 1070.9 768.7 3599.4 3987.9 304.7 197.4 0.00 0.00
Time Int rKB/s wKB/s rPk/s wPk/s rAvs wAvs %Util Sat
10:24:04 lo 65.22 65.22 191.9 191.9 348.1 348.1 0.00 0.00
10:24:04 eth0 932.5 691.3 3250.8 3437.7 293.7 205.9 0.00 0.00
Time Int rKB/s wKB/s rPk/s wPk/s rAvs wAvs %Util Sat
10:24:05 lo 522.5 522.5 687.5 687.5 778.2 778.2 0.00 0.00
10:24:05 eth0 1075.6 988.8 3758.0 4308.4 293.1 235.0 0.00 0.00
Time Int rKB/s wKB/s rPk/s wPk/s rAvs wAvs %Util Sat
10:24:06 lo 235.0 235.0 551.9 551.9 436.0 436.0 0.00 0.00
10:24:06 eth0 855.4 857.6 3476.3 3761.2 252.0 233.5 0.00 0.00
Time Int rKB/s wKB/s rPk/s wPk/s rAvs wAvs %Util Sat
10:24:07 lo 135.3 135.3 351.9 351.9 393.6 393.6 0.00 0.00
10:24:07 eth0 1021.0 1025.4 3970.6 4591.4 263.3 228.7 0.00 0.00
Time Int rKB/s wKB/s rPk/s wPk/s rAvs wAvs %Util Sat
10:24:08 lo 168.1 168.1 492.3 492.3 349.6 349.6 0.00 0.00
10:24:08 eth0 793.6 864.8 3340.2 3875.5 243.3 228.5 0.00 0.00
Time Int rKB/s wKB/s rPk/s wPk/s rAvs wAvs %Util Sat
10:24:09 lo 32.88 32.88 78.97 78.97 426.4 426.4 0.00 0.00
10:24:09 eth0 551.1 462.1 2416.2 2461.1 233.6 192.3 0.00 0.00
> Good is to have latest patches applied etc. To me
> this is a good candidate for DTrace (Solaris) or SystemTap (linux).
getting acquainted with SystemTap, looks like not "5 minutes to learn" system
Cliff Wells Wrote:
-------------------------------------------------------
I've arranged answers regarding to questions grouped by "theme"
> Well, all it means is that the bottleneck lies outside PHP.
> I wouldn't read too much else into it.
> It just means that 99% of your slowness is elsewhere.
> I cannot possibly fathom how the length of T doesn't matter.
> To me that is probably the single most important data point to investigate.
> The entire response time directly depends on T and you say it "doesn't matter"?
> Why do you think machine A is slow?
> Machine A waits for machine B and you blame machine A.
> I don't see how you arrive at this conclusion.
> To me it seems it could be either machine (or the connection between the machines)
> and more testing needs to be done to arrive at such a conclusion.
Combined answer on all these questions. Looks like we are talking about not the same thing in here. You are talking about response time for network request issued by user. It really consists of several stages of processing in here. Like
1. HTTP request handled by nginx
2. data processing by PHP (provided via CGI interface with php-fpm in my case)
3. external DB request processing (aka T in our discussion)
4. response to user
and in this case I totally agree with you - time T does matter and directly influences response time.
What I am talking about is a little bit different. In peak hours response time degrades significantly, but is still more or less acceptable, but what is unacceptable is that machine A slows down and replies for external actions (like SSH login, VPN connection) very slowly. For example, I sometimes even can't establish VPN connection to it due to timeouts. (there is openvpn server running on it). That's why I am talking about "slow machine A" and blame it. That's why I am worried about "uninterruptible sleep" processes and thinking about scheduling lag
> I'd be curious to see the output of iotop on your
> MySQL server as well.
here it is:
Total DISK READ: 0.00 B/s | Total DISK WRITE: 0.00 B/s
TID PRIO USER DISK READ DISK WRITE SWAPIN IO> COMMAND
23144 be/4 mysql 0.00 B/s 19.10 K/s 0.00 % 0.00 % mysqld --basedir=/usr~/lib/mysql/mysql.sock
880 be/4 mysql 0.00 B/s 11.46 K/s 0.00 % 0.00 % mysqld --basedir=/usr~/lib/mysql/mysql.sock
1247 be/4 mysql 0.00 B/s 11.46 K/s 0.00 % 0.00 % mysqld --basedir=/usr~/lib/mysql/mysql.sock
1287 be/4 mysql 0.00 B/s 53.49 K/s 0.00 % 0.00 % mysqld --basedir=/usr~/lib/mysql/mysql.sock
1305 be/4 mysql 0.00 B/s 22.92 K/s 0.00 % 0.00 % mysqld --basedir=/usr~/lib/mysql/mysql.sock
7929 be/4 mysql 0.00 B/s 15.28 K/s 0.00 % 0.00 % mysqld --basedir=/usr~/lib/mysql/mysql.sock
30493 be/4 mysql 0.00 B/s 15.28 K/s 0.00 % 0.00 % mysqld --basedir=/usr~/lib/mysql/mysql.sock
10222 be/4 mysql 0.00 B/s 7.64 K/s 0.00 % 0.00 % mysqld --basedir=/usr~/lib/mysql/mysql.sock
1 be/4 root 0.00 B/s 0.00 B/s 0.00 % 0.00 % init [3]
2 be/3 root 0.00 B/s 0.00 B/s 0.00 % 0.00 %
3 rt/3 root 0.00 B/s 0.00 B/s 0.00 % 0.00 %
and another sample
Total DISK READ: 0.00 B/s | Total DISK WRITE: 107.53 K/s
TID PRIO USER DISK READ DISK WRITE SWAPIN IO> COMMAND
709 be/3 root 0.00 B/s 55.62 K/s 0.00 % 45.85 %
4129 be/4 mysql 0.00 B/s 3.71 K/s 0.00 % 0.00 % mysqld --basedir=/usr~/lib/mysql/mysql.sock
4147 be/4 mysql 0.00 B/s 3.71 K/s 0.00 % 0.00 % mysqld --basedir=/usr~/lib/mysql/mysql.sock
4233 be/4 mysql 0.00 B/s 7.42 K/s 0.00 % 0.00 % mysqld --basedir=/usr~/lib/mysql/mysql.sock
4273 be/4 mysql 0.00 B/s 7.42 K/s 0.00 % 0.00 % mysqld --basedir=/usr~/lib/mysql/mysql.sock
6587 be/4 mysql 0.00 B/s 14.83 K/s 0.00 % 0.00 % mysqld --basedir=/usr~/lib/mysql/mysql.sock
15008 be/4 mysql 0.00 B/s 3.71 K/s 0.00 % 0.00 % mysqld --basedir=/usr~/lib/mysql/mysql.sock
7449 be/4 mysql 0.00 B/s 11.12 K/s 0.00 % 0.00 % mysqld --basedir=/usr~/lib/mysql/mysql.sock
looks like nothing special for me. MySQL is running, not much I/O
>> May be it is something related to task scheduling? Is big number of sleeping process impacts performance and/or slows down scheduler?
> Not since kernel 2.6, AFAIK. How many processes are we talking about here?
nginx worker_processes 4
php-pfm max_children 1000 total
and system processes of course, but nothing much
> Maybe. Or maybe you've simply disguised the
> problem by throwing more processes at it. How many PHP processes are you
> running? Can you provide your php-fpm parameters? Also, what's an
> approximation of your requests per second during these peak times?
2 pools with 500 php-fpm children each and something like 1000-2000 concurrent HTTP sessions
> As an aside, one thing that you mentioned earlier that I was wondering
> about: what is writing to the local disk at 3.27MB/s (from iotop output)?
Total DISK READ: 0.00 B/s | Total DISK WRITE: 3.56 M/s
TID PRIO USER DISK READ DISK WRITE> SWAPIN IO COMMAND
32722 be/4 nginx 0.00 B/s 207.89 K/s 0.00 % 0.00 % nginx: worker process
22705 be/4 nobody 0.00 B/s 107.30 K/s 0.00 % 0.00 % php-cgi --fpm --fpm-config /etc/php-fp
22235 be/4 nobody 0.00 B/s 83.83 K/s 0.00 % 0.00 % php-cgi --fpm --fpm-config /etc/php-fp
21203 be/4 nobody 0.00 B/s 57.00 K/s 0.00 % 0.00 % php-cgi --fpm --fpm-config /etc/php-fp
32719 be/4 nginx 0.00 B/s 50.30 K/s 0.00 % 55.19 % nginx: worker process
32718 be/4 nginx 0.00 B/s 10.06 K/s 0.00 % 0.00 % nginx: worker process
21195 be/4 nobody 0.00 B/s 3.35 K/s 0.00 % 0.00 % php-cgi --fpm --fpm-config /etc/php-fp
21327 be/4 nobody 0.00 B/s 3.35 K/s 0.00 % 0.00 % php-cgi --fpm --fpm-config /etc/php-fp
22189 be/4 nobody 0.00 B/s 3.35 K/s 0.00 % 0.00 % php-cgi --fpm --fpm-config /etc/php-fp
php-cgi and nginx are writing. php is writing some processed data and logs. Anyway, is 3.5 M/s is not a big deal for SCSI disk, isn't it?
By the way here: http://www.mysqlperformanceblog.com/2006/12/04/using-loadavg-for-performance-optimization/ is couple of words about slowing down systems with active network IO.
Best regards,
Sessna
Posted at Nginx Forum: http://forum.nginx.org/read.php?2,63176,64417#msg-64417
More information about the nginx
mailing list