URL encoding issue

Maxim Dounin mdounin at mdounin.ru
Wed Feb 16 17:18:59 MSK 2011


Hello!

On Wed, Feb 16, 2011 at 08:41:14AM -0500, acaron wrote:

[...]

> I've recently purchased a business license for http://yuml.me/.  I've
> installed it using Mongrel and nginx uses the HTTP reverse-proxy to
> delegate requests.  The setup "works" in that I can access the different
> pages.  However, all UML diagram URLs are broken and return a "502 --
> bad gateway" message.
> 
> Entries in the nginx log look like (note the URL in **encoded** form): 
> 
> xxx.xxx.xxx.xxx - - [16/Feb/2011:07:50:28 -0500] "GET
> /diagram/scruffy/class/%5BC\ustomer%5D%2B1-%3E*%5BOrder%5D%2C%20%5BOrder%5D%2B%2B1-items%20%3E*%5BLineItem%\5D%2C%20%5BOrder%5D-0..1%3E%5BPaymentMethod%5D
> HTTP/1.1" 502 173 "-" "Mozilla/5\.0 (Windows; U; Windows NT 6.1; en-US)
> AppleWebKit/534.13 (KHTML, like Gecko) C\hrome/9.0.597.98
> Safari/534.13"
> 
> And the mongrel log for the same query looks like (note the URL in
> **decoded** form):
> 
> Wed Feb 16 07:50:28 -0500 2011: HTTP parse error, malformed request
> (127.0.0.1)\: #
> Wed Feb 16 07:50:28 -0500 2011: REQUEST DATA: "GET
> /diagram/scruffy/class/[Cust\omer]+1->*[Order],%20[Order]++1-items%20>*[LineItem],%20[Order]-0..1>[PaymentMe\thod]
> HTTP/1.0\r\nHost: yuml.eaddrinuse.ca\r\nConnection: close\r\nAccept:
> appl\ication/xml,application/xhtml+xml,text/html;q=0.9,text/plain;q=0.8,image/png,*/\*;q=0.5\r\nUser-Agent:
> Mozilla/5.0 (Windows; U; Windows NT 6.1; en-US) AppleWeb\Kit/534.13
> (KHTML, like Gecko) Chrome/9.0.597.98 Safari/534.13\r\nAccept-Encodi\ng:
> gzip,deflate,sdch\r\nAccept-Language: en-US,en;q=0.8\r\nAccept-Charset:
> ISO\-8859-1,utf-8;q=0.7,*;q=0.3\r\n\r\n"
> 
> Here the entire configuration for this virtual host (note the proxy_pass
> directive):

[...]

>   }
>   location / {
>     #proxy_set_header X-Real-IP $remote_addr;
>     #proxy_set_header X-Forwarded-For $proxy_add_x_forwarded_for;
>     proxy_set_header Host $http_host;
>     proxy_redirect false;
>     proxy_pass http://127.0.0.1:9001/;
>   }
> }
> 
> As you can see I'm not using any modification whatsoever on the URL
> before passing it to the proxy server.  Yet, nginx passes the request
> using a remote URL in decoded URL form (thus, a rightfully invalid URL).
>  According to Maxim's post, I understand this should not be the case.

Actually, you modify uri: by using proxy_pass with uri component 
(note trailing "/").  Thus nginx decodes uri, substitues uri part 
matched by location ("/") with uri component in your proxy_pass 
(another "/") and re-encodes uri.

To make sure nginx will pass uri as is - use proxy_pass without 
uri component, i.e.

    proxy_pass http://127.0.0.1:9001;

(note no trailing "/")

> Is there any documentation supporting the official behavior adopted by
> nginx in a reverse proxy + special characters in URL situation?  I
> believe this is a (major) bug in the HttpProxy module, as it basically
> prevents use of nginx in front of any application that uses special
> characters in URLs.  Does anyone know how to handle this issue?

The actual problem is that nginx indeed doesn't encode some chars 
not allowed unencoded per RFC 2396 (and newer RFC 3986), most 
notably "<" and ">".  This doesn't affect HTTP operation, but 
strict checkers may (rightfully) complain.

One of the known problems it causes is with MS Exchange web 
interface (with symptoms similar to yours).

I believe I've posted patch to make uri escaping strictly 
RFC3986-complaint a while ago, probably it should be resurrected.  
See above for a workaround.

Maxim Dounin



More information about the nginx mailing list