nginx doesn't handle different URL encodings well

helen nginx-forum at nginx.us
Thu Oct 21 06:32:58 MSD 2010


On Wed, 20 Oct 2010 21:57:32 -0400, helen  wrote:

> On Wed, 20 Oct 2010 21:23:46 -0400, Pierre-Marie Baty  wrote:
>
>> When the URL is Latin-1 encoded, the request sent is: GET
>> /%e9t%e9-2008.jpg ----> nginx resolves this to "été-2008.jpg", the
>> file is served, OK
>> When the URL is UTF-8 encoded, the request sent is: GET
>> /%C3%A9t%C3%A9-2008.jpg ----> nginx resolves this to
>> "Ã©tÃ©-2008.jpg", and the file is not served. (file not found)
>
> I only spent about 5 minutes looking for this, so I could be totally
> wrong:
>
> In 0.8.53, src/http/ngx_http_parse.c:1220 appears to be the start of
> the relevant code.  On a quick scan, it looks like the percent-decoding
> is hardcoded.  (case sw_quoted, followed by case sw_quoted_second,
> inside a switch loop)

Sorry to reply to my own post, but it looks like I was wrong; that code
appears to be only where the %xx escapes are decoded (duh).  I am still
following the chain to where the result is passed to the OS, and I don't
have time to look further right now.
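
For what it's worth, here is a rough Python sketch of what a two-state
%xx decoder does with the two requests above.  This is not nginx's code;
the function name percent_decode and the state names are my own, only
loosely modelled on the sw_quoted / sw_quoted_second cases.  It just
shows that plain percent-decoding yields two different byte strings for
the same visible name.  If those bytes are then handed to the OS as-is
(an assumption, I have not traced that part yet), only the request whose
bytes match the on-disk file name would be found.

```python
def percent_decode(path: str) -> bytes:
    """Decode %xx escapes into raw bytes (illustrative sketch only)."""
    out = bytearray()
    state = "usual"          # hypothetical state names, not nginx's
    hi = 0
    for ch in path:
        if state == "usual":
            if ch == "%":
                state = "quoted"          # expect first hex digit next
            else:
                out += ch.encode("ascii")
        elif state == "quoted":
            hi = int(ch, 16)              # first hex digit (high nibble)
            state = "quoted_second"
        else:
            out.append((hi << 4) | int(ch, 16))  # second hex digit
            state = "usual"
    return bytes(out)


latin1 = percent_decode("/%e9t%e9-2008.jpg")
utf8 = percent_decode("/%C3%A9t%C3%A9-2008.jpg")

print(latin1)                   # raw bytes from the Latin-1 request
print(utf8)                     # raw bytes from the UTF-8 request
print(utf8.decode("latin-1"))   # the UTF-8 bytes viewed as Latin-1
```

The decoder itself has no notion of character set; the two results
differ simply because the client sent different bytes.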

helen

Posted at Nginx Forum: http://forum.nginx.org/read.php?2,142950,142966#msg-142966



