Strange $upstream_response_time latency spikes with reverse proxy
Maxim Dounin
mdounin at mdounin.ru
Fri Mar 15 08:20:59 UTC 2013
Hello!
On Thu, Mar 14, 2013 at 07:07:20PM -0700, Jay Oster wrote:
[...]
> The access log has 10,000 lines total (i.e. two of these tests with 5,000
> concurrent connections), and when I sort by upstream_response_time, I get a
> log with the first 140 lines having about 1s on the upstream_response_time,
> and the remaining 9,860 lines show 700ms and less. Here's a snippet showing
> the strangeness, starting with line numbers:
>
>
> 1: 127.0.0.1 - - [14/Mar/2013:17:37:21 -0700] "GET /time/0 HTTP/1.0"
> 200 19 "-" "ApacheBench/2.3" 1.027 1.026 234 83
> 2: 127.0.0.1 - - [14/Mar/2013:17:37:21 -0700] "GET /time/0 HTTP/1.0"
> 200 19 "-" "ApacheBench/2.3" 1.027 1.026 234 83
> 3: 127.0.0.1 - - [14/Mar/2013:17:37:21 -0700] "GET /time/0 HTTP/1.0"
> 200 19 "-" "ApacheBench/2.3" 1.026 1.025 234 83
> ...
> 138: 127.0.0.1 - - [14/Mar/2013:17:57:18 -0700] "GET /time/0 HTTP/1.0"
> 200 19 "-" "ApacheBench/2.3" 1.000 0.999 234 81
> 139: 127.0.0.1 - - [14/Mar/2013:17:57:18 -0700] "GET /time/0 HTTP/1.0"
> 200 19 "-" "ApacheBench/2.3" 0.999 0.999 234 81
> 140: 127.0.0.1 - - [14/Mar/2013:17:57:18 -0700] "GET /time/0 HTTP/1.0"
> 200 19 "-" "ApacheBench/2.3" 0.999 0.999 234 81
> 141: 127.0.0.1 - - [14/Mar/2013:17:37:21 -0700] "GET /time/0 HTTP/1.0"
> 200 19 "-" "ApacheBench/2.3" 0.708 0.568 234 83
> 142: 127.0.0.1 - - [14/Mar/2013:17:37:21 -0700] "GET /time/0 HTTP/1.0"
> 200 19 "-" "ApacheBench/2.3" 0.708 0.568 234 83
> 143: 127.0.0.1 - - [14/Mar/2013:17:37:21 -0700] "GET /time/0 HTTP/1.0"
> 200 19 "-" "ApacheBench/2.3" 0.708 0.568 234 83
> ...
> 9998: 127.0.0.1 - - [14/Mar/2013:17:57:16 -0700] "GET /time/0 HTTP/1.0"
> 200 19 "-" "ApacheBench/2.3" 0.142 0.005 234 81
> 9999: 127.0.0.1 - - [14/Mar/2013:17:57:16 -0700] "GET /time/0 HTTP/1.0"
> 200 19 "-" "ApacheBench/2.3" 0.142 0.005 234 81
> 10000: 127.0.0.1 - - [14/Mar/2013:17:57:16 -0700] "GET /time/0 HTTP/1.0"
> 200 19 "-" "ApacheBench/2.3" 0.122 0.002 234 81
>
>
>
> The upstream_response_time difference between line 140 and 141 is nearly
> 500ms! The total request_time also displays an interesting gap of almost
> 300ms. What's going on here?
I would suggest there is packet loss and retransmits for some
reason. Try tcpdump'ing traffic between nginx and the backends to
see what is going on in detail.
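
Before capturing traffic, it may help to confirm where the slow
cluster starts. The following is only a rough sketch, not part of the
original report. It assumes each log entry sits on one line and ends
with four numeric fields, the first two being $request_time and
$upstream_response_time (inferred from the quoted sample; the actual
log_format was not posted). The names upstream_times and largest_gap
are made up for illustration.

```python
import re

# Assumption: a log line ends with four numeric fields, the first two being
# $request_time and $upstream_response_time (inferred from the quoted sample).
TAIL = re.compile(r'(\d+\.\d+) (\d+\.\d+) (\d+) (\d+)\s*$')


def upstream_times(lines):
    """Return upstream response times in seconds, slowest first."""
    times = []
    for line in lines:
        match = TAIL.search(line)
        if match:  # silently skip lines that do not fit the assumed layout
            times.append(float(match.group(2)))
    return sorted(times, reverse=True)


def largest_gap(times):
    """Return (gap, slower, faster) for the biggest drop between neighbours.

    Expects the list sorted slowest first, as upstream_times() returns it.
    """
    best = (0.0, None, None)
    for slower, faster in zip(times, times[1:]):
        gap = slower - faster
        if gap > best[0]:
            best = (gap, slower, faster)
    return best
```

Passing an open access log file to upstream_times() and the result to
largest_gap() should show where the sorted times split. The requests
above the split are the ones worth matching against the tcpdump
capture.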
--
Maxim Dounin
http://nginx.org/en/donation.html