Potential Bug: Gzipping preventing HTTP range requests

Maxim Dounin mdounin at mdounin.ru
Sun Jun 26 01:30:55 MSD 2011


Hello!

On Sat, Jun 25, 2011 at 08:22:21AM +0400, Igor Sysoev wrote:

> On Fri, Jun 24, 2011 at 07:04:04PM -0400, Ensiferous wrote:
> > Yeah it's a weird situation. As a user I would probably expect that the
> > range applied to the actual content served, before it was compressed. So
> > that if I request 100 bytes when everything is transferred and
> > decompressed I have 100 bytes worth of content.
> 
> A "Content-Length" header of a gzipped response corresponds to length
> of the gzipped data. I'm not sure if this is specified in the RFC, but this
> is de facto behaviour. So since range is associated with the
> "Content-Length" header it should work with already gzipped body,
> so Apache 2.3.8 does it right:

Yes, as long as Content-Encoding is used (not Transfer-Encoding), 
ranges must be interpreted on the compressed content.

http://tools.ietf.org/html/rfc2616#section-14.13

   The Content-Length entity-header field indicates the size of the
   entity-body, in decimal number of OCTETs...

http://tools.ietf.org/html/rfc2616#section-14.35.1

   Byte range specifications in HTTP apply to the sequence of bytes in
   the entity-body (not necessarily the same as the message-body).

http://tools.ietf.org/html/rfc2616#section-4.3

   The message-body (if any) of an HTTP message is used to carry the
   entity-body associated with the request or response. The message-body
   differs from the entity-body only when a transfer-coding has been
   applied, as indicated by the Transfer-Encoding header field (section
   14.41).

And, just for completeness, the HTTP message syntax is:

        generic-message = start-line
                          *(message-header CRLF)
                          CRLF
                          [ message-body ]

(http://tools.ietf.org/html/rfc2616#section-4.1)
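
To make the consequence of these definitions concrete, here is a 
minimal Python sketch of a range applied to an already gzipped 
entity-body.  The function name serve_range, the sample data and 
the header dictionary are assumptions made up for illustration; 
this is not how nginx or Apache implement it.

```python
import gzip

def serve_range(entity_body: bytes, first: int, last: int):
    """Answer a single 'bytes=first-last' range (hypothetical helper).

    entity_body is the content AFTER the content-coding was applied,
    so the offsets count compressed octets.
    """
    total = len(entity_body)
    if first >= total or first > last:
        # Unsatisfiable range
        return 416, {"Content-Range": f"bytes */{total}"}, b""
    last = min(last, total - 1)
    part = entity_body[first:last + 1]
    headers = {
        "Content-Encoding": "gzip",
        "Content-Range": f"bytes {first}-{last}/{total}",
        "Content-Length": str(len(part)),
    }
    return 206, headers, part

# Sample data, invented for this sketch
original = b"".join(b"line %d of the sample document\n" % i for i in range(2000))
entity_body = gzip.compress(original, mtime=0)

status, headers, part = serve_range(entity_body, 0, 99)
print(status, headers["Content-Range"], len(part))
```

The 100 octets returned are the first 100 octets of the gzip 
stream, not 100 octets of the original text.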

[...]

> Note also that it's impossible to ungzip a part of a response if you
> do not have the preceding parts from the very start.

This applies to many other types of data as well.
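
For the gzip case, the effect is easy to reproduce.  The following 
is a small Python sketch using the standard zlib module; the 
helper name decodes_alone and the sample data are assumptions for 
illustration only.

```python
import gzip
import zlib

def decodes_alone(fragment: bytes) -> bool:
    """True if zlib accepts the fragment as the start of a gzip stream."""
    decoder = zlib.decompressobj(31)  # wbits=31: expect a gzip header
    try:
        decoder.decompress(fragment)
    except zlib.error:
        return False
    return True

# Sample data, invented for this sketch
original = b"".join(b"line %d of the sample document\n" % i for i in range(2000))
compressed = gzip.compress(original, mtime=0)

prefix = compressed[:200]      # what "bytes=0-199" would return
middle = compressed[200:400]   # what "bytes=200-399" would return

print("prefix usable alone:", decodes_alone(prefix))
print("middle usable alone:", decodes_alone(middle))
```

A client can still use the middle part, but only by appending it 
to the parts it already has and decoding from the very start.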

The main problem with Content-Encoding and ranges is that one 
must somehow be able to reproduce exactly the same entity-body 
(or at least make sure cache validators change whenever the 
entity-body changes).  This is not trivial when you compress on 
the fly with possibly different compression options.
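
A rough Python sketch of the problem: the same content compressed 
at two levels gives two different entity-bodies, so byte offsets 
taken from one do not fit the other.  Deriving the validator from 
a hash of the entity-body (the hypothetical etag_for below) is 
just one assumed way to make the validator follow the bytes; it is 
not something nginx is claimed to do.

```python
import gzip
import hashlib

def etag_for(entity_body: bytes) -> str:
    """Hypothetical strong validator: a hash of the exact octets served."""
    return '"' + hashlib.sha256(entity_body).hexdigest()[:16] + '"'

# Sample data, invented for this sketch
original = b"".join(b"line %d of the sample document\n" % i for i in range(2000))

fast = gzip.compress(original, compresslevel=1, mtime=0)
best = gzip.compress(original, compresslevel=9, mtime=0)

print(len(fast), etag_for(fast))
print(len(best), etag_for(best))
print("same octets:", fast == best)
```

Both variants decode to the same content, but a range fetched from 
one cannot be combined with a range fetched from the other.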

I personally think that moving towards Transfer-Encoding 
would be a good step for "on the fly" compression.  But browser 
support does not seem to be there at all.
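
With a transfer-coding the order is reversed: the range selects 
octets of the unencoded entity-body, and compression is applied 
only to what goes on the wire.  A minimal Python sketch, assuming 
a made-up helper range_then_transfer_code and leaving out the 
chunked framing a real message would need:

```python
import gzip

def range_then_transfer_code(entity_body: bytes, first: int, last: int):
    """Select the range first, then compress it (hypothetical helper)."""
    total = len(entity_body)
    last = min(last, total - 1)
    part = entity_body[first:last + 1]          # offsets in original octets
    coded_part = gzip.compress(part, mtime=0)   # per-message transfer-coding
    headers = {
        "Transfer-Encoding": "gzip, chunked",   # chunk framing not shown here
        "Content-Range": f"bytes {first}-{last}/{total}",
    }
    return headers, coded_part

# Sample data, invented for this sketch
original = b"".join(b"line %d of the sample document\n" % i for i in range(2000))

headers, coded_part = range_then_transfer_code(original, 0, 99)
received = gzip.decompress(coded_part)
print(headers["Content-Range"], len(received))
```

Here a request for 100 bytes yields 100 bytes of content after 
decoding, and the entity-body no longer depends on the compression 
options.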

Maxim Dounin


