[ANNOUNCE] gunzip filter module 0.3
Maxim Dounin
mdounin at mdounin.ru
Tue Apr 20 06:15:46 MSD 2010
Hello!
On Mon, Apr 19, 2010 at 05:15:16PM -0400, theromis1 wrote:
> Perfect Max,
>
> understood your style of module, right now I'm working hard to
> deploy it just with small hacks.
>
> Actually we don't need to do unzipping always, we need unzip
> only for 200 upstream responses and only for text/html answers
> for reducing load on server. Looks like better to have
> coordination with your way of development, so I need small
> instructions how better to do it, and I'll send my patch for it.
>
> --- /home/roman/work/ngx_http_gunzip_filter_module-0.3/ngx_http_gunzip_filter_module.c 2010-03-22 11:11:16.000000000 -0700
> +++ ngx_http_gunzip_filter_module.c 2010-04-16 16:37:01.000000000 -0700
> @@ -132,6 +132,7 @@
> if (!conf->enable
> || r->headers_out.content_encoding == NULL
> || r->headers_out.content_encoding->value.len != 4
> + || r->upstream->state->status != 200
This is obviously wrong.
1. Nobody promised r->upstream is set. Expect coredumps on
static requests and/or internal error responses.
2. Unzipping only responses with status 200 isn't going to work
if the client doesn't support gzip at all.
If your module happens to process only 200 responses - well, it
should be considered to be "module request" and coded as such.
Alternatively there may be some settings to request "gunzip
always" only for particular responses, but I tend to think it's
overkill.
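To illustrate point 1 only: a minimal sketch of a defensive status
check. The sketch_* types are hypothetical, heavily simplified
stand-ins for the nginx request structures, not the real
definitions, and the helper name is an assumption. It does not
address point 2.

```c
#include <stddef.h>

/* Hypothetical stand-ins; the real nginx structures are far larger. */
typedef struct {
    unsigned  status;
} sketch_upstream_state_t;

typedef struct {
    sketch_upstream_state_t  *state;
} sketch_upstream_t;

typedef struct {
    sketch_upstream_t  *upstream;   /* NULL when no upstream is involved */
} sketch_request_t;

/*
 * Return 1 only if the request has an upstream, the upstream has a
 * state, and that state carries the given status. Every pointer is
 * checked before it is dereferenced.
 */
static int
sketch_upstream_status_is(const sketch_request_t *r, unsigned status)
{
    return r->upstream != NULL
           && r->upstream->state != NULL
           && r->upstream->state->status == status;
}
```

With a check like this, a request served without an upstream simply
fails the test instead of dereferencing a NULL pointer.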
> || ngx_strncasecmp(r->headers_out.content_encoding->value.data,
> (u_char *) "gzip", 4) != 0)
> {
> @@ -142,6 +143,9 @@
>
> r->gzip_vary = 1;
>
> + r->gzip_tested = 1;
> + r->gzip_ok = 1;
> +
No, you shouldn't modify nginx's idea of whether the client
supports gzip. Instead, you should bypass the whole detection
logic if you need to gunzip regardless of the client's support.
And your code suggests that further tests will assume the client
supports gzip, while some clients don't. This may lead to weird
results if you have the gzip filter enabled.
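A rough sketch of what "bypass the detection" could look like, as
opposed to faking its result. Everything here is hypothetical: the
sketch_* names, the gunzip_always flag, and the accepts_gzip field
are assumptions for illustration, and sketch_gzip_ok() is only a
simplified stand-in for the real client detection.

```c
#include <stdbool.h>

/* Hypothetical, simplified request state. */
typedef struct {
    bool  gzip_tested;    /* has client support been evaluated? */
    bool  gzip_ok;        /* verdict of that evaluation */
    bool  accepts_gzip;   /* assumption: what the client really supports */
} sketch_request_t;

/* Simplified stand-in for the detection; it records its own verdict. */
static bool
sketch_gzip_ok(sketch_request_t *r)
{
    if (!r->gzip_tested) {
        r->gzip_tested = true;
        r->gzip_ok = r->accepts_gzip;
    }

    return r->gzip_ok;
}

/* Return true when the response has to be gunzipped. */
static bool
sketch_need_gunzip(sketch_request_t *r, bool gunzip_always)
{
    if (gunzip_always) {
        /* Bypass detection; gzip_tested and gzip_ok stay untouched. */
        return true;
    }

    return !sketch_gzip_ok(r);
}
```

The point of the design is that later filters still see an honest
gzip_ok value, whatever the gunzip decision was.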
> if (!r->gzip_tested) {
> if (ngx_http_gzip_ok(r) == NGX_OK) {
> return ngx_http_next_header_filter(r);
> @@ -315,7 +319,7 @@
> ctx->zstream.opaque = ctx;
>
> /* windowBits +16 to decode gzip, zlib 1.2.0.4+ */
> - rc = inflateInit2(&ctx->zstream, MAX_WBITS + 16);
> + rc = inflateInit2(&ctx->zstream, MAX_WBITS + 32); // yahoo looks weird with previous init
+32 enables automatic header detection, so zlib streams get
decoded as well, which isn't what is expected with gzip
content-encoding; zlib streams are content-encoding deflate. And
there are differences.
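For reference, zlib.h documents windowBits + 16 as "gzip only" and
windowBits + 32 as automatic zlib/gzip header detection. The two
containers can be told apart by their first two bytes; the sketch
below does this without calling zlib. The sketch_* names are
hypothetical, and the checks follow the header layouts in RFC 1950
(zlib) and RFC 1952 (gzip).

```c
#include <stddef.h>

typedef enum {
    SKETCH_FMT_UNKNOWN,
    SKETCH_FMT_GZIP,
    SKETCH_FMT_ZLIB
} sketch_fmt_t;

/* Classify a compressed stream by its first two bytes. */
static sketch_fmt_t
sketch_classify(const unsigned char *p, size_t len)
{
    if (len < 2) {
        return SKETCH_FMT_UNKNOWN;
    }

    /* gzip (RFC 1952): magic bytes ID1 = 0x1f, ID2 = 0x8b */
    if (p[0] == 0x1f && p[1] == 0x8b) {
        return SKETCH_FMT_GZIP;
    }

    /*
     * zlib (RFC 1950): low nibble of CMF is CM = 8 (deflate), high
     * nibble CINFO is at most 7, and CMF * 256 + FLG is a multiple
     * of 31.
     */
    if ((p[0] & 0x0f) == 8
        && (p[0] >> 4) <= 7
        && ((p[0] << 8) | p[1]) % 31 == 0)
    {
        return SKETCH_FMT_ZLIB;
    }

    return SKETCH_FMT_UNKNOWN;
}
```

A typical zlib stream starts with 78 9c, a gzip one with 1f 8b.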
>
> if (rc != Z_OK) {
> ngx_log_error(NGX_LOG_ALERT, r->connection->log, 0,
>
> If not apply r->upstream->state->status != 200 in headers
> processing I'm getting a lot of errors in log, one of it is
> http://yandex.ru/yandsearch?text=sunken , which sends 302
> redirect url with gzipped content, I've tried to fix it, but
> found just error in zlib, when I've stored dumped data and used
> 'gzip -d' on it all decompressed fine, and I've got normal HTML.
> How better to debug it? What advice you can give me?
They return incorrect data in reply:
00000000 1f 8b 08 00 00 00 00 00 00 03 02 00 00 00 ff ff |................|
00000010 1f 8b 08 00 00 00 00 00 00 03 2d 8e bb 0e 82 40 |..........-....@|
00000020 10 45 7b be 62 a4 b0 d3 51 28 1d d6 44 c1 68 e2 |.E{.b...Q(..D.h.|
00000030 ab 58 0b cb 95 1d b3 46 58 08 2c 46 fe 5e 1e 76 |.X.....FX.,F.^.v|
00000040 33 73 ee e4 5c 9a c4 97 ad bc 5f 13 d8 cb d3 11 |3s..\....._.....|
00000050 ae b7 cd f1 b0 05 7f 86 78 48 e4 0e 31 96 f1 48 |........xH..1..H|
00000060 82 f9 02 31 39 fb c2 23 e3 f2 4c 90 61 a5 bb c5 |...19..#..L.a...|
00000070 bd 5c c6 22 5c 04 b0 2b 1a ab 09 c7 83 47 38 04 |.\."\..+.....G8.|
00000080 e8 51 e8 b6 ff 59 8a 3f ef 26 8f 4a 21 0d 83 2e |.Q...Y.?.&.J!...|
00000090 d2 26 67 eb c0 a8 1a f2 e2 c3 1a 48 81 a9 f8 19 |.&g........H....|
000000a0 f9 d8 2a ab 6b 56 55 6a d6 8e bf 2e aa 1b fb 66 |..*.kVUj.......f|
000000b0 3b 55 79 b9 ca aa 28 58 86 be 30 5c 31 a1 12 73 |;Uy...(X..0\1..s|
000000c0 c2 b2 37 0e ae ce d0 f7 f3 7e 75 a4 7e 57 da 00 |..7......~u.~W..|
000000d0 00 00 |..|
000000d2
The first 16 bytes are an incomplete/broken gzip member. The
correct one is at offset 0x10 (and it indeed may be decoded to
valid html).
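When debugging a dump like this, candidate member boundaries can be
located by scanning for the gzip header bytes. A minimal sketch
follows; the function name is hypothetical. It matches only ID1,
ID2 and CM (1f 8b 08), so a hit inside compressed data may be a
false positive and should be treated as a candidate, not a proof.

```c
#include <stddef.h>

/*
 * Return the offset of the first gzip member header candidate
 * (0x1f 0x8b 0x08) at or after `from`, or -1 if there is none.
 */
static long
sketch_find_gzip_member(const unsigned char *buf, size_t len, size_t from)
{
    size_t  i;

    for (i = from; i + 3 <= len; i++) {
        if (buf[i] == 0x1f && buf[i + 1] == 0x8b && buf[i + 2] == 0x08) {
            return (long) i;
        }
    }

    return -1;
}
```

Run over the dump above, a scan from offset 0 finds the broken
member at 0, and a scan from offset 1 finds the next candidate at
16 (0x10).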
It's interesting how they achieved this. Hey, anybody from Yandex
here? Comments?
Maxim Dounin