Using request URI path to store cached files instead of MD5 hash-based path

Lucas Rolff lucas at lucasrolff.com
Fri Oct 6 13:24:45 UTC 2017


Hi,

> Is it possible to change this behaviour through configuration to cache the files using the request URI path itself, say, under the host-name directory under the proxy_cache_path.

No, that’s not possible with proxy_cache. You can, however, do it with proxy_store ( http://nginx.org/en/docs/http/ngx_http_proxy_module.html#proxy_store ).

> I think such a direct way of defining cached file paths would help in finding or locating specific content in cache

You can already find cached file paths by calculating the MD5 hash of the cache key yourself; it’s rather easy.
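
As a rough sketch of that calculation in Python: this assumes the default proxy_cache_key ($scheme$proxy_host$request_uri), levels=1:2, and the /tmp/cache1 path from the question. The function name cache_file_path is made up for illustration, and the key shown only holds if $proxy_host is www.example.com in your setup.

```python
import hashlib
import posixpath


def cache_file_path(cache_root, cache_key, levels=(1, 2)):
    """Return the on-disk path for a cache key (hypothetical helper)."""
    # The file name is the hex MD5 digest of the cache key.
    digest = hashlib.md5(cache_key.encode("utf-8")).hexdigest()
    # Each level directory is taken from the END of the digest:
    # levels=1:2 -> last 1 character, then the 2 characters before it.
    parts = []
    end = len(digest)
    for width in levels:
        parts.append(digest[end - width:end])
        end -= width
    return posixpath.join(cache_root, *parts, digest)


if __name__ == "__main__":
    # Assumed key: scheme "http" + proxy_host + request URI, concatenated.
    key = "http" + "www.example.com" + "/movies/file1.mp4"
    print(cache_file_path("/tmp/cache1", key))
```

If you set a custom proxy_cache_key, the string passed in has to be built the same way as that directive builds it.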

> Also, it would be helpful in purging content from cache, even using wild-carded expressions. 

You can easily purge the cache, just not with wildcard expressions; for that you’d need the commercial version, NGINX Plus.
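
If you want something wildcard-like without Plus, one workaround is to scan the cache directory yourself. The sketch below is only an illustration, not an nginx feature: it relies on each cache file carrying a "KEY: ..." line near the top of its header, and the function name find_cached_keys and the 4096-byte read size are assumptions of mine. It only lists matches; whether to delete them is left to the caller.

```python
import fnmatch
import os


def find_cached_keys(cache_root, pattern, header_bytes=4096):
    """Yield (path, key) for cache files whose KEY line matches a glob."""
    prefix = b"\nKEY: "
    for dirpath, _dirs, files in os.walk(cache_root):
        for name in files:
            path = os.path.join(dirpath, name)
            try:
                with open(path, "rb") as fh:
                    head = fh.read(header_bytes)
            except OSError:
                continue  # file may have been removed while scanning
            marker = head.find(prefix)
            if marker == -1:
                continue  # not a cache file, or key beyond header_bytes
            start = marker + len(prefix)
            end = head.find(b"\n", start)
            if end == -1:
                continue
            key = head[start:end].decode("utf-8", errors="replace")
            if fnmatch.fnmatchcase(key, pattern):
                yield path, key


if __name__ == "__main__":
    # Assumed default-style keys: scheme + proxy_host + request URI.
    for path, key in find_cached_keys("/tmp/cache1",
                                      "httpwww.example.com/movies/*"):
        print(path, key)
```

Calling os.remove(path) on each match would then drop those entries from disk. Keep in mind that this walks every file, so it gets slow on large caches.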

> However, I seem to be missing the key benefit of why files are stored based on MD5 hash based paths

One benefit I can think of is the fact that you only deal with hexadecimal characters (0-9 and a-f). Using plain ASCII characters ensures compatibility with every system, and it’s lightweight since you only have to deal with a small set of characters. If you used $request_uri instead, for example, you’d have to handle UTF-8 or similar, which makes lookups a lot heavier, and there could be compatibility issues with certain characters. And since $request_uri includes the query string as well, you’d end up with very weird filenames.

At the same time, it wouldn’t surprise me if it’s a lot more efficient for nginx to have a consistent filename length when indexing data: you know that every filename on the filesystem will be 32 characters long, you know exactly how much memory each entry takes, and you don’t run into the problem where people have a request URI of a few hundred or even thousands of characters and possibly tens or hundreds of sub-directories.
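
To illustrate the fixed-length point, here is a small Python sketch. The keys are made-up examples and cache_file_name is a hypothetical helper; it only shows that the MD5 hex digest keeps the same length and character set however long or odd the key is.

```python
import hashlib

HEX_CHARS = set("0123456789abcdef")


def cache_file_name(cache_key):
    """Hex MD5 digest of a cache key (hypothetical helper)."""
    return hashlib.md5(cache_key.encode("utf-8")).hexdigest()


# A short key and a very long one with non-ASCII path segments and a
# large query string (both invented for this example).
short_key = "httpwww.example.com/a"
long_key = ("httpwww.example.com/"
            + "/".join(["très-long-segment"] * 200)
            + "?q=" + "x" * 2000)

if __name__ == "__main__":
    for key in (short_key, long_key):
        name = cache_file_name(key)
        # Prints key length, file name length, and whether the name
        # consists only of lowercase hex characters.
        print(len(key), len(name), set(name) <= HEX_CHARS)
```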

I’m pretty sure the nginx developers decided on an MD5 hash because it has a lot of benefits over storing files the way proxy_store currently does. Maybe Maxim or someone else with extensive knowledge of the codebase and its design decisions can briefly share why.

Best Regards,
Lucas Rolff

On 05/10/2017, 13.29, "nginx on behalf of rnmx18" <nginx-bounces at nginx.org on behalf of nginx-forum at forum.nginx.org> wrote:

    Hi,
    
    If proxy caching is enabled, NGINX is saving the files under subdirectories
    of the proxy_cache_path, based on the MD5 hash of the cache-key and the
    levels parameter value.
    
    Is it possible to change this behaviour through configuration to cache the
    files using the request URI path itself, say, under the host-name directory
    under the proxy_cache_path.
    
    For example, if the proxy_cache_path is /tmp/cache1 and the request is
    http://www.example.com/movies/file1.mp4, then can the file get cached as
    /tmp/cache1/www.example.com/movies/file1.mp4
    
    I think such a direct way of defining cached file paths would help in
    finding or locating specific content in cache. Also, it would be helpful in
    purging content from cache, even using wild-carded expressions. 
    
    However, I seem to be missing the key benefit of why files are stored based
    on MD5 hash based paths.
    
    Could someone explain the reason for using MD5 hash based file paths? 
    
    Also, with vanilla-NGINX, if there is no configurable way to use direct
    request URI paths, is there any external module which could help me to get
    this functionality?
    
    Thanks
    Rajesh
    
    Posted at Nginx Forum: https://forum.nginx.org/read.php?2,276700,276700#msg-276700