[PATCH 1 of 2] Cache: don't update cache if revalidated response is not cacheable

Maxim Dounin mdounin at mdounin.ru
Thu Dec 18 19:29:46 UTC 2014


Hello!

On Tue, Dec 16, 2014 at 05:21:52PM -0800, Piotr Sikora wrote:

> Hey Maxim,
> 
> > I don't really understand the logic you suggest.
> >
> > Previous behaviour is as follows:
> >
> > - We don't check if the response is cacheable or not, but rather
> >   rely on the fact that it's already in the cache (hence it, at
> >   least was, cacheable).
> >
> > This kind of agree with what nginx normally does for cache items -
> > cacheability is determined when we are placing a response into the
> > cache, and not changed after this even if settings are changed.
> 
> Which is strange, because the 304 response is used to signal validity
> and cacheability of the original (cached) response, so ignoring that
> feels wrong.

The cacheability of the response was already signalled previously.  
And a 304 response can't really change it.  Consider, for example, 
that there were no validation request at all - and it happens some 
time later.  So the only thing that really changes is validity.

> Also, right now, validity, but not cacheability of the 304 response
> takes priority over cached response, which means that in most cases,
> the decision is based partially on the 304 and partially on the cached
> response.

That's in line with the RFC2616: it allows 304 to supply new 
Cache-Control and Expires headers if they are different from what 
was previously sent.  So we use validity from the 304 response, 
then from the cached response, then from proxy_cache_valid.

What is probably missing here is not using valid_sec if 
u->cacheable is set.

> One of the reasons for this change is to avoid requests with
> Authorization header and/or responses with Set-Cookie header
> revalidating cached responses for anonymous users.
> 
> Another, more technical reason, is that this is the only place in the
> code where r->cache->valid_sec isn't guarded by u->cacheable, which is
> protection assumed by rest of the code that doesn't clear this value
> on errors, just sets u->cacheable to 0, so ignoring it can result in
> odd validity values that wouldn't be applied originally on cache miss.

I agree, see above.

> > With the suggested change, you do the following:
> >
> > - If the 304 response is cacheable and there is a validity known in
> >   advance (from Expires, Cache-Control or X-Accel-Expires), the
> >   response will be cached.
> >
> > - If the 304 response is cacheable and the cached response is
> >   cacheable according to current settings, the response will be
> >   cached.
> 
> Correct, so the only difference between existing and proposed behavior
> is that in case 304 response (which, according to RFC7232, must
> include all cache-controlling headers, other than the specific to
> nginx X-Accel-Expires) isn't cacheable, nginx won't update the
> validity of the cached item. I don't see anything wrong with this
> logic.
> 
> > This doesn't look consistent.  In particular,
> >
> > - If the cached response is not cacheable according to current
> >   settings, it's still can be cached if the 304 response can be
> >   cached and have a validity explicitly set.
> >
> > Doesn't look like a good change to me.
> 
> I don't see how this is different from the current behavior, which is
> less strict than the proposed one.
> 
> Maybe look at the diff ignoring white spaces? This patch just adds
> requirement for 304 to be cacheable, that's it.

The problem is consistency of the behaviour in equivalent cases.

E.g., consider the following:

- A response with Set-Cookie cached with 
  "proxy_ignore_headers Set-Cookie" in the configuration.  The 
  response contains "Cache-Control: max-age=100".

- Now proxy_ignore_headers is removed from the configuration.  We 
  do a request with If-Modified-Since, 304 is returned:

  (a) If the 304 contains "Cache-Control: max-age=100", the 
      response will be cached for 100 more seconds.

  (b) If the 304 does not contain "Cache-Control: max-age=100",
      the response won't be updated.

On the other hand, (a) and (b) are equivalent per RFC2616.

> >> +            time_t  now;
> >
> > Please do not put declarations inside blocks, with the exception
> > of dedicated conditionally compiled blocks.
> 
> I only wrote it like that to be consistent with rest of the code,
> where "now" and/or "valid" are defined inside u->cacheable conditional
> blocks. See: ngx_http_upstream.c lines 2553 and 3763.

Understood, but please don't.  These two cases are more or less 
dedicated conditionally compiled blocks (the only if() in the 
conditionally complied code in 2553, the inner and the only block 
with local variables in 3763).  Adding more than one place with 
local variables inside an #if is certainly a bad idea.

-- 
Maxim Dounin
http://nginx.org/



More information about the nginx-devel mailing list