Nginx proxy cache purge process does not clean up items fast enough for new elements
Maxim Dounin
mdounin at mdounin.ru
Wed Oct 16 12:56:49 UTC 2019
Hello!
On Wed, Oct 16, 2019 at 05:24:01AM -0400, sachin.shetty at gmail.com wrote:
> We have a nginx fronting our object storage which caches large objects.
> Objects are as large as 100GB. The nginx cache max size is set to about
> 3.5TB.
>
> When there is a surge of large object requests and disk quickly fills up,
> nginx runs into out of disk space error. I was expecting the cache manager
> to purge items based on LRU and make room for the new elements, but that
> does not happen.
>
> I can reproduce the problem with a simple test case:
>
> Config:
>
> proxy_cache_path /tmp/cache levels=1:2 keys_zone=cache_one:256m inactive=2d
> max_size=16G use_temp_path=off;
>
> Test:
>
> Run a request to download a file of 15GB, it is served correctly and
> stored in cache.
> Run a second request to download a different file of 10GB, it will fail
> with something like this:
>
> 2019/10/04 11:49:08 [crit] 20206#20206: *21 pwritev()
> "/tmp/cache/9/fa/a301d42ca6e5d4188c38ecf56aa3afa9.0000000001" has written
> only 221184 of 229376 while reading upstream, client: 127.0.0.1, server:
> eos_cache_filer, request: "GET...
> 2019/10/04 12:07:29 [crit] 21201#21201: *487 pwrite()
> "/tmp/cache/9/fa/a301d42ca6e5d4188c38ecf56aa3afa9.0000000002" failed (28: No
> space left on device) while reading upstream, client: 127.0.0.1, server:
> eos_cache_filer, request:
>
> Can I tune some cache_manager parameters to make this work? Is there a way
> to disable buffering in such case - ideally download should not fail, it
> should just disable caching and buffering.
The cache manager runs in parallel with the worker processes that
fill up the cache. Further, with "max_size=" it only starts to
clean things up once the max_size limit is reached. Hence the total
size of the cache can temporarily exceed the configured max_size.

It is recommended to keep max_size smaller than the actual disk
space available, and to keep the difference large enough for at
least 10 seconds of filling up the cache (10 seconds is how long
the cache manager will sleep if it has nothing to do), preferably
more. In particular, the difference should be larger than the
maximum size of a single cache item; otherwise adding one cache
item can fail even though the max_size limit has not been reached
yet. This is probably what happens in your case.

Note well that temporary files, regardless of whether you use
"use_temp_path=off" or not, are not counted toward the cache size.
You have to reserve some space for temporary files as well.
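
As a rough sketch of the sizing advice above (not a tested
recommendation): the volume size and the largest-object size in the
comments are assumptions made up for illustration, and only the
directive and its parameters come from the original config.

```nginx
# Assumption (hypothetical numbers): /tmp/cache is on a dedicated 64G
# volume and the largest object ever cached is about 15G.
#
# max_size=32G leaves roughly 32G of headroom on the volume:
#   - more than the largest single cache item (about 15G), and
#   - extra room for temporary files of in-flight downloads, which are
#     not counted toward max_size even with use_temp_path=off.
proxy_cache_path /tmp/cache levels=1:2 keys_zone=cache_one:256m inactive=2d
                 max_size=32G use_temp_path=off;
```

With several large downloads in flight at once, the headroom would
need to grow accordingly, since each one occupies temporary space
until it completes.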
--
Maxim Dounin
http://mdounin.ru/