Strange behavior on proxy cache at high load spike
abr3
nginx-forum at forum.nginx.org
Mon May 4 21:07:32 UTC 2020
Hi,
this has been bugging me for some time now. I have nginx 1.16.0 with the
proxy cache configured as follows:
proxy_cache_path /dev/shm/nginx_cache levels=1:2
keys_zone=proxy:1024m max_size=1024m inactive=60m;
proxy_temp_path /dev/shm/nginx_proxy_tmp;
proxy_cache_use_stale updating;
proxy_cache_lock on;
proxy_cache_lock_timeout 30s;
Most of the time everything works as expected. One peculiarity of the
deployment is that expected spikes in requests (end clients updating daily
data) hit a few locations. Response size varies between 1M and 1.5M,
non-gzipped. Log snippet from such a spike:
[2020-05-03T00:00:44] "GET /api/34/guide?date=2020-05-03 HTTP/1.0" 200
445984 cache: HIT request time: 50.211 sec
[2020-05-03T00:00:44] "GET /api/34/guide?date=2020-05-03 HTTP/1.0" 200
780472 cache: HIT request time: 52.891 sec
[2020-05-03T00:00:44] "GET /api/34/guide?date=2020-05-03 HTTP/1.0" 200
85432 cache: HIT request time: 33.284 sec
[2020-05-03T00:00:44] "GET /api/34/guide?date=2020-05-03 HTTP/1.0" 200
57920 cache: HIT request time: 34.957 sec
[2020-05-03T00:00:44] "GET /api/34/guide?date=2020-05-03 HTTP/1.0" 200
401096 cache: HIT request time: 49.991 sec
[2020-05-03T00:00:44] "GET /api/34/guide?date=2020-05-03 HTTP/1.0" 200
244712 cache: HIT request time: 48.412 sec
[2020-05-03T00:00:44] "GET /api/34/guide?date=2020-05-03 HTTP/1.0" 200
101360 cache: HIT request time: 34.955 sec
[2020-05-03T00:00:44] "GET /api/34/guide?date=2020-05-03 HTTP/1.0" 200
102808 cache: HIT request time: 34.753 sec
...
[2020-03-24T00:02:16] "GET /api/34/guide?date=2020-05-03 HTTP/1.0" 200
1526025 cache: HIT request time: 48.671 sec
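The short responses in the snippet above can be picked out mechanically. Here is a minimal Python sketch, assuming the log layout shown in the snippet. The regular expression, the helper names parse_entries and find_short_hits, and the use of 1526025 bytes as the expected full size are illustrative assumptions, not part of the nginx setup:

```python
import re

# Assumed layout, taken from the log snippet above:
# [timestamp] "request" status bytes cache: STATUS request time: N sec
# \s+ also matches the line wrap introduced by the mail client.
LOG_ENTRY = re.compile(
    r'\[(?P<ts>[^\]]+)\]\s+"(?P<request>[^"]*)"\s+(?P<status>\d{3})\s+'
    r'(?P<bytes>\d+)\s+cache:\s+(?P<cache>\S+)\s+'
    r'request time:\s+(?P<secs>[\d.]+)\s+sec'
)


def parse_entries(text):
    """Return one dict per log entry found in text."""
    entries = []
    for m in LOG_ENTRY.finditer(text):
        entries.append({
            "ts": m.group("ts"),
            "request": m.group("request"),
            "status": int(m.group("status")),
            "bytes": int(m.group("bytes")),
            "cache": m.group("cache"),
            "secs": float(m.group("secs")),
        })
    return entries


def find_short_hits(text, expected_bytes):
    """Cache HITs that sent fewer bytes than the expected full size."""
    return [
        e for e in parse_entries(text)
        if e["cache"] == "HIT" and e["bytes"] < expected_bytes
    ]


# Three entries copied from the snippet above.
SAMPLE = '''
[2020-05-03T00:00:44] "GET /api/34/guide?date=2020-05-03 HTTP/1.0" 200
445984 cache: HIT request time: 50.211 sec
[2020-05-03T00:00:44] "GET /api/34/guide?date=2020-05-03 HTTP/1.0" 200
85432 cache: HIT request time: 33.284 sec
[2020-03-24T00:02:16] "GET /api/34/guide?date=2020-05-03 HTTP/1.0" 200
1526025 cache: HIT request time: 48.671 sec
'''

EXPECTED_BYTES = 1526025  # size the response 'stabilizes' at in this example

for entry in find_short_hits(SAMPLE, EXPECTED_BYTES):
    print(entry["ts"], entry["bytes"], entry["secs"])
```

Running this over the full access log for the spike window would show how many HITs were truncated and how long each one took.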
Monitoring the cache location with du shows a maximum of 1.1G, for example:
1.1G /dev/shm/nginx_cache
0 /dev/shm/nginx_proxy_tmp
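For comparing the cache directory against max_size=1024m during a spike, a rough Python sketch follows. It sums apparent file sizes, so it can differ slightly from du, which counts allocated blocks. The helper names dir_size_bytes and over_limit are hypothetical; the path and the limit come from the configuration above:

```python
import os

MAX_SIZE_BYTES = 1024 * 1024 * 1024  # max_size=1024m from proxy_cache_path


def dir_size_bytes(root):
    """Sum the apparent size of all regular files below root."""
    total = 0
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            try:
                total += os.lstat(os.path.join(dirpath, name)).st_size
            except FileNotFoundError:
                # Files may be removed while the walk is in progress.
                continue
    return total


def over_limit(root, limit=MAX_SIZE_BYTES):
    """True if the directory currently holds more bytes than limit."""
    return dir_size_bytes(root) > limit


if __name__ == "__main__":
    cache_dir = "/dev/shm/nginx_cache"  # path from the configuration above
    size = dir_size_bytes(cache_dir)
    print(cache_dir, size, "over max_size" if size > MAX_SIZE_BYTES else "within max_size")
```

Sampling this every few seconds around the spike would show whether the directory stays above the configured limit while the short responses are served.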
After 2 minutes the response 'stabilizes' at the correct size (1526025 in
this example). The problem is also amplified because clients validate the
response and retry progressively if it is corrupted.
There are no unusual lines in the error log or in the Linux (CentOS)
messages. There is also no cache 'updating', just hits (I guess this rules
out an issue with the upstream servers). Is it possible that we have an
issue with reading cached entries from /dev/shm during peak times?
I would kindly ask for hints on where to start looking and debugging.
Big thanks in advance
Posted at Nginx Forum: https://forum.nginx.org/read.php?2,287951,287951#msg-287951