Using request URI path to store cached files instead of MD5 hash-based path
Lucas Rolff
lucas at lucasrolff.com
Fri Oct 6 13:24:45 UTC 2017
Hi,
> Is it possible to change this behaviour through configuration to cache the files using the request URI path itself, say, under the host-name directory under the proxy_cache_path.
No, it’s not possible to do that with proxy_cache. You can, however, do it with proxy_store ( http://nginx.org/en/docs/http/ngx_http_proxy_module.html#proxy_store ).
> I think such a direct way of defining cached file paths would help in finding or locating specific content in cache
You can already find the path of a cached file by calculating the md5 hash of its cache key yourself; it’s rather easy.
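As a rough sketch of that calculation (not nginx's own code): the file name is the hex md5 of the cache key, and each number in the levels parameter takes that many characters from the end of the hash as a directory name. The function name cache_file_path and the example key are made up for illustration. The key assumes the default proxy_cache_key of $scheme$proxy_host$request_uri and that the proxy_pass host is www.example.com; if you set your own proxy_cache_key, hash that string instead.

```python
import hashlib
from pathlib import PurePosixPath


def cache_file_path(cache_root, cache_key, levels="1:2"):
    """Return the expected on-disk path for a cache key (illustrative helper)."""
    # The file name is the 32-character hex md5 of the cache key.
    digest = hashlib.md5(cache_key.encode("utf-8")).hexdigest()

    # Each level consumes that many characters from the end of the hash:
    # levels=1:2 gives <last char>/<the two chars before it>/<full hash>.
    parts = []
    end = len(digest)
    for width in (int(n) for n in levels.split(":") if n):
        parts.append(digest[end - width:end])
        end -= width

    return str(PurePosixPath(cache_root, *parts, digest))


# Assumed key: default proxy_cache_key with scheme "http" and the
# proxy_pass host "www.example.com" (both are assumptions for this example).
print(cache_file_path("/tmp/cache1", "httpwww.example.com/movies/file1.mp4"))
```

Once you have the path you can inspect or delete that single file, which is the manual way of locating or purging one entry.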
> Also, it would be helpful in purging content from cache, even using wild-carded expressions.
You can easily purge entries from the cache, just not with wildcard expressions; for that you’d need NGINX Plus.
> However, I seem to be missing the key benefit of why files are stored based on MD5 hash based paths
One of the benefits I can think of is the fact that you only deal with hexadecimal characters (0-9 and a-f). Using plain ASCII characters ensures compatibility with every system, and it’s lightweight since you only have to deal with a small set of characters. If you used $request_uri instead, as an example, you’d have to use UTF-8 or similar, which makes lookups a lot heavier and could cause compatibility issues with certain characters. And since $request_uri includes query strings as well, you’d end up with very weird filenames.
At the same time, it wouldn’t surprise me if it’s a lot more efficient for nginx to have a consistent filename length when indexing data. You know that every file name on the filesystem will be 32 characters long, you know exactly how much memory each entry takes, and you don’t run into the problem where people have a request URI of a few hundred or even thousands of characters and possibly tens or hundreds of sub-directories.
I’m pretty sure nginx uses an md5 hash because it has a lot of benefits over storing files the way proxy_store currently does. Maybe Maxim or someone else with extensive knowledge of the codebase and its design decisions can briefly share why.
Best Regards,
Lucas Rolff
On 05/10/2017, 13.29, "nginx on behalf of rnmx18" <nginx-bounces at nginx.org on behalf of nginx-forum at forum.nginx.org> wrote:
Hi,
If proxy caching is enabled, NGINX is saving the files under subdirectories
of the proxy_cache_path, based on the MD5 hash of the cache-key and the
levels parameter value.
Is it possible to change this behaviour through configuration to cache the
files using the request URI path itself, say, under the host-name directory
under the proxy_cache_path.
For example, if the proxy_cache_path is /tmp/cache1 and the request is
http://www.example.com/movies/file1.mp4, then can the file get cached as
/tmp/cache1/www.example.com/movies/file1.mp4
I think such a direct way of defining cached file paths would help in
finding or locating specific content in cache. Also, it would be helpful in
purging content from cache, even using wild-carded expressions.
However, I seem to be missing the key benefit of why files are stored based
on MD5 hash based paths.
Could someone explain the reason for using MD5 hash based file paths?
Also, with vanilla-NGINX, if there is no configurable way to use direct
request URI paths, is there any external module which could help me to get
this functionality?
Thanks
Rajesh
Posted at Nginx Forum: https://forum.nginx.org/read.php?2,276700,276700#msg-276700
_______________________________________________
nginx mailing list
nginx at nginx.org
http://mailman.nginx.org/mailman/listinfo/nginx