How can the number of parallel/redundant open streams/temp_files be controlled/limited?

Paul Schlie schlie at comcast.net
Tue Jul 1 12:44:47 UTC 2014


As a response from the backend apparently is not cached until it has first been completely read into a temp_file (which for a large file may require hundreds if not thousands of MB to be transferred), there appears to be no "cache node" formed to "lock" against or to serve "stale" responses from; so until that first cache node is usably created, proxy_cache_lock has nothing to lock requests to?

The code does not appear to form a "cache node" for the designated cache_key until the transfer of the requested item has completed, as you've noted?

For the scheme to work, a lockable cache node would need to be formed immediately upon the first request with a unique cache_key, rather than waiting until the transfer of the requested item into a temp_file is complete; otherwise multiple redundant active streams between nginx and the backend server may be opened, each most likely transferring the same data needlessly, which is what proxy_cache_lock was seemingly introduced to prevent (but doesn't)?
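
(For reference, a minimal sketch of the kind of configuration being discussed, assuming a placeholder cache path, zone name and backend; the key and timeout values shown are just the documented defaults:)

    # placeholder cache path and zone name, not from any actual setup
    proxy_cache_path /var/cache/nginx levels=1:2 keys_zone=cache_zone:10m max_size=10g;

    server {
        location / {
            proxy_pass http://backend;    # placeholder upstream

            proxy_cache     cache_zone;
            proxy_cache_key $scheme$proxy_host$request_uri;   # the default key

            # Only one request at a time may populate a given new cache
            # element; the rest wait up to proxy_cache_lock_timeout
            # (default 5s) and are then passed to the backend anyway.
            proxy_cache_lock         on;
            proxy_cache_lock_timeout 5s;

            # Serve the stale copy while a single request refreshes an
            # expired cache entry.
            proxy_cache_use_stale updating;
        }
    }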

On Jul 1, 2014, at 7:01 AM, Maxim Dounin <mdounin at mdounin.ru> wrote:

> Hello!
> 
> On Mon, Jun 30, 2014 at 11:10:52PM -0400, Paul Schlie wrote:
> 
>> Regarding:
>> 
>>> In http, responses are not guaranteed to be the same.  Each 
>>> response can be unique, and you can't assume responses have to be 
>>> identical even if their URLs match.
>> 
>> Yes, but "potentially unique" does not imply that, once a first valid OK or valid
>> partial response has been received, it will be productive to keep opening further
>> such channels unless the first becomes unresponsive; doing so will most likely be
>> counterproductive, only wasting limited resources on redundant channels, which is
>> seemingly why proxy_cache_lock was introduced, as you initially suggested.
> 
> Again: responses are not guaranteed to be the same, and unless 
> you are using cache (and hence proxy_cache_key and various header 
> checks to ensure responses are at least interchangeable), the only 
> thing you can do is to proxy requests one by one.
> 
> If you are using cache, then there is proxy_cache_key to identify 
> a resource requested, and proxy_cache_lock to prevent multiple 
> parallel requests to populate the same cache node (and 
> "proxy_cache_use_stale updating" to prevent multiple requests when 
> updating a cache node).
> 
> In theory, cache code can be improved (compared to what we 
> currently have) to introduce sending of a response being loaded 
> into a cache to multiple clients.  I.e., stop waiting for a cache 
> lock once we've got the response headers, and stream the response 
> body being loaded to all clients waiting for it.  This should/can 
> help when loading large files into a cache, when waiting with 
> proxy_cache_lock for a complete response isn't cheap.  In 
> practice, introducing such code isn't cheap either, and it's not 
> about using other names for temporary files.
> 
> -- 
> Maxim Dounin
> http://nginx.org/
> 


