nginx doesn't handle different URL encodings well

Pierre-Marie Baty baty.pm at hotmail.fr
Thu Oct 21 05:23:46 MSD 2010


Hello Igor, hello all,
 
Congratulations on your fantastic and neatly programmed web server. It's a pleasure to use.
 
I have a problem with nginx not serving files with accented characters in their names when the submitted URL is UTF-8 encoded.
 
Here is my nginx.conf: http://nginx.pastebin.com/aB7XRLM3
It's a home web server, used primarily to serve things like holiday photos.
 
For example, I have a file called "été-2008.jpg" on my web server. When I request http://myserver/été-2008.jpg, the file is either served correctly or not, depending on whether the "Always send URLs as UTF-8" checkbox is checked in the Internet Explorer advanced options.
 
When the URL is Latin-1 encoded, the request sent is: GET /%e9t%e9-2008.jpg ----> nginx resolves this to "été-2008.jpg" and the file is served. OK.
When the URL is UTF-8 encoded, the request sent is: GET /%C3%A9t%C3%A9-2008.jpg ----> nginx resolves this to "été-2008.jpg", but the file is not served (file not found).
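
To make the difference between the two requests concrete, here is a small Python sketch (an illustration only, not nginx code). It percent-decodes both request paths with the standard library and compares the resulting bytes. The variable names are made up for the example.

```python
from urllib.parse import unquote_to_bytes

# The two request paths quoted above (same file name, different encodings).
latin1_request = "/%e9t%e9-2008.jpg"
utf8_request = "/%C3%A9t%C3%A9-2008.jpg"

# Percent-decoding yields raw bytes; no charset is implied at this stage.
latin1_bytes = unquote_to_bytes(latin1_request)
utf8_bytes = unquote_to_bytes(utf8_request)

print(latin1_bytes)  # b'/\xe9t\xe9-2008.jpg'
print(utf8_bytes)    # b'/\xc3\xa9t\xc3\xa9-2008.jpg'

# Both spell the same text once decoded with the matching charset...
same_text = latin1_bytes.decode("latin-1") == utf8_bytes.decode("utf-8")
# ...but the byte sequences differ, so a byte-for-byte file lookup
# can only match one of them.
same_bytes = latin1_bytes == utf8_bytes
print(same_text, same_bytes)  # True False
```

So both requests name the same file as text, but only one of them can match a file name that is stored as one particular byte sequence.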
 
Shouldn't a fallback mechanism be implemented, so that when a file isn't found after a URL has been decoded, a second attempt is made with another encoding? I believe two RFCs are involved: RFC 2396 and RFC 3986 (information given by PiotrSikora on IRC). IMO, nginx shouldn't assume that the URLs it receives always follow the same RFC. As far as I know, Apache resolves this ambiguity. Maybe it has that sort of fallback mechanism.
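
As a rough sketch of the kind of fallback I have in mind (Python for illustration only; this is not how nginx or Apache is implemented, and the function names `candidate_names` and `resolve` are hypothetical): try the decoded bytes as-is first, and if no file matches, reinterpret them as UTF-8 and retry with the Latin-1 bytes. The Latin-1 default and the set standing in for the file system are assumptions of the example.

```python
from urllib.parse import unquote_to_bytes


def candidate_names(request_path, fallback_charset="latin-1"):
    """Return the byte strings to try, in order, for one request path."""
    raw = unquote_to_bytes(request_path)
    candidates = [raw]  # first try: the bytes exactly as sent
    try:
        # Second try: treat the bytes as UTF-8 and re-encode them
        # in the fallback charset (assumed here to be Latin-1).
        transcoded = raw.decode("utf-8").encode(fallback_charset)
    except (UnicodeDecodeError, UnicodeEncodeError):
        transcoded = None  # not valid UTF-8, or not representable
    if transcoded is not None and transcoded != raw:
        candidates.append(transcoded)
    return candidates


def resolve(request_path, existing_files):
    """Return the first candidate present in existing_files, else None."""
    for name in candidate_names(request_path):
        if name in existing_files:
            return name
    return None


# A set of byte strings stands in for the file system in this example.
files = {b"/\xe9t\xe9-2008.jpg"}
print(resolve("/%e9t%e9-2008.jpg", files) is not None)        # True
print(resolve("/%C3%A9t%C3%A9-2008.jpg", files) is not None)  # True
```

The second attempt is only made when the first lookup fails, so requests that already match would not pay any extra cost.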
 
Thanks to the IRC channel members who pointed me towards this mailing list. I look forward to your reply so that I know what to do :)
 
Spasibo (thank you)!
 
-- 
Pierre-Marie Baty
  		 	   		  