high "Load Average"
sparvu at systemdatarecorder.org
Tue Mar 16 19:35:10 MSK 2010
> Time      Int    rKB/s  wKB/s   rPk/s   wPk/s  rAvs  wAvs  %Util  Sat
> 10:24:02  lo      0.33   0.33   290.8   290.8  1.16  1.16   0.00  0.00
> 10:24:02  eth0    0.29   0.79  1183.8  1448.1  0.25  0.56   0.00  0.00
Make sure you run enicstat, since %Util and Sat are always 0 on Linux
if you don't:
"Added a script, enicstat, which uses ethtool to get speeds and duplex modes for all interfaces, then calls nicstat with an appropriate -S value."
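
To make the quoted description concrete, here is a rough Python sketch of the same idea. It is not the actual enicstat script. It assumes ethtool prints "Speed: <n>Mb/s" and "Duplex: Full|Half" lines, and that nicstat -S takes a comma-separated list in the int:mbps[fd|hd] form. The function names and the sample ethtool text are made up for illustration.

```python
import re

def parse_ethtool(text):
    """Return (speed_mbps, duplex_suffix) from `ethtool <iface>` output, or None."""
    speed = re.search(r"Speed:\s*(\d+)\s*Mb/s", text)
    duplex = re.search(r"Duplex:\s*(Full|Half)", text)
    if not speed or not duplex:
        # Typically a link that is down ("Speed: Unknown!"); skip it.
        return None
    suffix = "fd" if duplex.group(1) == "Full" else "hd"
    return int(speed.group(1)), suffix

def build_speed_arg(ethtool_outputs):
    """Build a value for `nicstat -S` from a {interface: ethtool text} mapping."""
    parts = []
    for iface, text in sorted(ethtool_outputs.items()):
        parsed = parse_ethtool(text)
        if parsed is not None:
            mbps, suffix = parsed
            parts.append(f"{iface}:{mbps}{suffix}")
    return ",".join(parts)

# Illustrative input only; real text would come from running `ethtool eth0` etc.
SAMPLE = {
    "eth0": "Settings for eth0:\n\tSpeed: 1000Mb/s\n\tDuplex: Full\n",
    "eth1": "Settings for eth1:\n\tSpeed: Unknown!\n\tDuplex: Unknown! (255)\n",
}

if __name__ == "__main__":
    print("nicstat -S", build_speed_arg(SAMPLE))
```

With the link speed known, nicstat has a denominator for %Util, which is why the columns stay at 0.00 without it.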
> What I am talking about is a little bit different. In peak hours response time degrades significantly, but is still more or less acceptable, but what is unacceptable is that machine A slows down and replies for external actions (like SSH login, VPN connection) very slowly. For example, I sometimes even can't establish VPN connection to it due to timeouts. (there is openvpn server running on it). That's why I am talking about "slow machine A" and blame it. That's why I am worried about "uninterruptible sleep" processes and thinking about scheduling lag
You are describing a system slowdown caused by your current
workload. It could have a number of causes, some of them in the
kernel. Analysing what is going on with SystemTap will most likely
help. That's why I keep telling people DTrace is like a gold mine,
or a step into the future. Since you are not on Solaris, you need to
start looking into SystemTap. If possible, have a box with Solaris
or FreeBSD alongside, running this workload, and check with the DTraceToolkit.
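
Before reaching for a tracer, a quick first look at the "uninterruptible sleep" processes mentioned in the quote can be had from /proc. The sketch below is only an illustration, not a replacement for SystemTap or DTrace. It assumes a Linux /proc where the field after the parenthesised command name in /proc/<pid>/stat is the process state, with "D" meaning uninterruptible sleep. The function names are made up.

```python
import os

def parse_stat(line):
    """Split a /proc/<pid>/stat line into (pid, comm, state)."""
    # comm is wrapped in parentheses and may itself contain spaces or ')',
    # so locate the first '(' and the last ')'.
    lpar = line.index("(")
    rpar = line.rindex(")")
    pid = int(line[:lpar])
    comm = line[lpar + 1:rpar]
    state = line[rpar + 1:].split()[0]
    return pid, comm, state

def uninterruptible(proc="/proc"):
    """Return [(pid, comm)] for processes currently in state D."""
    found = []
    if not os.path.isdir(proc):
        return found
    for entry in os.listdir(proc):
        if not entry.isdigit():
            continue
        try:
            with open(os.path.join(proc, entry, "stat")) as fh:
                pid, comm, state = parse_stat(fh.read())
        except (OSError, ValueError, IndexError):
            continue  # the process exited or the line was unreadable
        if state == "D":
            found.append((pid, comm))
    return found

if __name__ == "__main__":
    for pid, comm in uninterruptible():
        print(pid, comm)
```

Running this a few times during peak hours shows which processes keep turning up in state D. It does not show what they are blocked on, which is where SystemTap comes in.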