[ANNOUNCE] gunzip filter module 0.3

Maxim Dounin mdounin at mdounin.ru
Wed Mar 24 17:20:37 MSK 2010


On Wed, Mar 24, 2010 at 07:48:13AM -0500, Ryan Malayter wrote:

> On Tue, Mar 23, 2010 at 8:46 AM, Maxim Dounin <mdounin at mdounin.ru> wrote:
> > I believe proxy_cache must cache the response as it was received
> > from upstream.  It is not its business to compress or change
> > anything; there are output filters to do changes.
> Transformation of content by proxies and caches is specifically
> allowed in the HTTP specs, unless a "Cache-Control: no-transform"
> directive is present.
> Or were you referring to the nginx architecture/code specifically? If
> that is so, why is it not the business of proxy_cache to transform
> content (it already manipulates headers out of necessity)?

Yes, I'm referring to nginx code.  The main problem with the upstream
module is that it already does too many things.  Teaching it to do
things which may (and should) be done elsewhere is a really bad
idea.

> Re-applying the same output filter repeatedly is wasteful and
> increases latency. If Igor is worried about the impact updating HTTP
> date strings more than once per second, surely avoiding thousands of
> loops through a gzip filter is an optimization that would be smiled
> upon?

While re-gzipping is indeed costly, just extracting pre-gzipped
content from gzip_cache isn't much different from extracting
pre-gzipped content from proxy_cache.  On the other hand
gzip_cache will be able to use pre-gzipped content for much more 

> Even Microsoft gets this specific part right (static content is cached
> in its compressed state in IIS, and can use a different compression
> ratio from dynamic content).

Yes, and gzip_cache will allow us to do the same thing.  And it
won't be tightly coupled with proxy.

Though right now one may pre-gzip static files and use gzip_static, I
believe a transparent gzip_cache will be better.
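
To illustrate the pre-gzip step: gzip_static looks for a file with
".gz" appended to the requested file name, so something has to create
those files ahead of time.  A minimal sketch in Python, assuming a
hypothetical helper named pregzip (not part of nginx or any nginx
tooling):

```python
import gzip
import shutil
from pathlib import Path


def pregzip(path, level=9):
    """Write a gzipped copy of `path` next to it as `<name>.gz`.

    Hypothetical helper for illustration only; the original file is
    left in place so uncompressed requests can still be served.
    """
    src = Path(path)
    dst = src.with_name(src.name + ".gz")
    with open(src, "rb") as fin, gzip.open(dst, "wb", compresslevel=level) as fout:
        shutil.copyfileobj(fin, fout)
    return dst
```

Since this runs once per file rather than once per request, the
highest compression level is affordable here.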

> > On the other hand it is believed to be a good idea to implement
> > cache support in the gzip filter.  I.e. the gzip filter will cache
> > gzipped content and will send it to the client instead of
> > re-compressing it.  And it's actually in Igor's plans AFAIK, but
> > most likely not in the near-term plans.
> Integrating the compression with the "retrieval" portion of the cache
> code would allow for the use of high compression ratios for long-lived
> objects, as well as prevent duplication of data on disk. Also, any
> caching mechanism is going to need the same quantity of settings and
> infrastructure as proxy_cache already has, so there would be a lot of
> unnecessary code duplication if the mechanism was separate from
> proxy_cache. But it would be more general (for nginx) to have a
> separate standalone gzip_cache module.

The cache infrastructure was designed with many consumers in mind
(e.g. the slowfs_cache module by Piotr Sikora uses it), so the
infrastructure is mostly present.  What we have to do is teach
gzip to save its responses to the cache.
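
The idea of a gzip filter that saves its responses can be sketched
roughly as follows.  This is a toy model in Python under stated
assumptions, not nginx code: the class name GzipCache, the in-memory
dict, and the string cache key are all hypothetical stand-ins for the
real cache infrastructure.

```python
import gzip


class GzipCache:
    """Toy stand-in for a caching gzip filter (hypothetical, not nginx code)."""

    def __init__(self, level=6):
        self.level = level
        self.store = {}        # cache key -> gzipped bytes
        self.compressions = 0  # how many times we actually compressed

    def filter(self, key, body):
        """Return gzipped `body`, compressing only on a cache miss."""
        cached = self.store.get(key)
        if cached is not None:
            return cached  # hit: reuse the earlier compressed output
        self.compressions += 1
        gz = gzip.compress(body, compresslevel=self.level)
        self.store[key] = gz
        return gz
```

In this sketch the filter does not care where the body came from
(proxy, static file, anything else), which is the point of keeping
the cache in the gzip filter rather than in proxy.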

Maxim Dounin
