proxy_pass is double-encoding some pre-encoded URIs

mike mike503 at gmail.com
Thu May 22 06:01:32 MSD 2008


Slightly OT, but you do know there is an "apt-cacher" utility that does
exactly this for you? :)


On 5/21/08, Joey Korkames <lists at ruby-forum.com> wrote:
> Hello, just wanted to start by saying that nginx is my favorite server
> for my personal projects - what an awesome piece of work. This is my
> first bug/help request.
>
> I've been using proxy_store as a "mirror on demand" for serving APT
> packages to debian machines. Occasionally a package will have a tilde
> ("~") in the file name and the proxy_pass's GET to the upstream server
> will fail.
>
> Looking through nginx's debug logs and tcpdumps, it seems APT makes the
> initial GET with the URI already percent-encoded, but the URI is
> encoded a second time when proxy_pass issues the GET request to the
> upstream server, so "%7e" goes out as "%257e".
>
> My proxy_store config:
>
> location /apt-cache/debian/lenny {
>         root /var/www/spawn.llnw.com/htdocs/proxy_store;
>         recursive_error_pages on;
>         error_page 404 = /apt-fetch-easynews$request_uri;
> }
>
> location /apt-fetch-easynews {
>          internal;
>          rewrite /apt-fetch-easynews/apt-cache/([^/]*)/([^/]*)(.*)
> /linux/debian$3 break;
>
>          recursive_error_pages on;
>          proxy_intercept_errors on;
>          proxy_connect_timeout 6;
>          proxy_read_timeout 20;
>          proxy_next_upstream error timeout invalid_header http_500
> http_503 http_404;
>          proxy_pass http://debian.mirrors.easynews.com;
>
>          proxy_store /var/www/default/htdocs/proxy_store/$request_uri;
>          proxy_store_access user:rw group:rw all:r;
>
>          error_page 404 503 504 = /apt-fetch-kernelorg$request_uri;
> #failover to kernel.org
> }
>
> For URI:
> http://localhost/apt-cache/debian/lenny/pool/main/b/binutils/binutils_2.18.1~cvs20080103-4+b1_amd64.deb
>
> GET from client:
>
> 2008/05/22 01:32:42 [debug] 7400#0: *1 http request line: "GET
> /apt-cache/debian/lenny/pool/main/b/binutils/binutils_2.18.1%7ecvs20080103-4+b1_amd64.deb
> HTTP/1.1"
> 2008/05/22 01:32:42 [debug] 7400#0: *1 http uri:
> "/apt-cache/debian/lenny/pool/main/b/binutils/binutils_2.18.1~cvs20080103-4+b1_amd64.deb"
> 2008/05/22 01:32:42 [debug] 7400#0: *1 http args: ""
> 2008/05/22 01:32:42 [debug] 7400#0: *1 http exten: "deb"
> 2008/05/22 01:32:42 [debug] 7400#0: *1 http process request header line
> 2008/05/22 01:32:42 [debug] 7400#0: *1 http header: "Host: localhost"
> 2008/05/22 01:32:42 [debug] 7400#0: *1 http header: "Connection:
> keep-alive"
> 2008/05/22 01:32:42 [debug] 7400#0: *1 http header: "User-Agent: Debian
> APT-HTTP/1.3 (0.7.11)"
> 2008/05/22 01:32:42 [debug] 7400#0: *1 http header done
>
> ....
>
> GET to upstream server:
> 2008/05/22 01:32:42 [debug] 7400#0: *1 http proxy header: "User-Agent:
> Debian APT-HTTP/1.3 (0.7.11)"
> 2008/05/22 01:32:42 [debug] 7400#0: *1 http proxy header:
> "GET
> /linux/debian/pool/main/b/binutils/binutils_2.18.1%257ecvs20080103-4+b1_amd64.deb
> HTTP/1.0
> Host: localhost
> Connection: close
> User-Agent: Debian APT-HTTP/1.3 (0.7.11)
>
> "
>
> ...
> 404 from upstream server:
> 2008/05/22 01:32:43 [debug] 7400#0: epoll: fd:12 ev:0005
> d:00002AAAAAAC5290
> 2008/05/22 01:32:43 [debug] 7400#0: *1 http upstream process header
> 2008/05/22 01:32:43 [debug] 7400#0: *1 malloc: 00000000006ABE90:4096
> 2008/05/22 01:32:43 [debug] 7400#0: *1 recv: fd:12 440 of 4096
> 2008/05/22 01:32:43 [debug] 7400#0: *1 http proxy status 404 "404 Not
> Found"
> 2008/05/22 01:32:43 [debug] 7400#0: *1 http proxy header: "Date: Thu, 22
> May 2008 01:32:42 GMT"
> 2008/05/22 01:32:43 [debug] 7400#0: *1 http proxy header: "Server:
> Apache"
> 2008/05/22 01:32:43 [debug] 7400#0: *1 http proxy header:
> "Content-Length: 276"
> 2008/05/22 01:32:43 [debug] 7400#0: *1 http proxy header: "Connection:
> close"
> 2008/05/22 01:32:43 [debug] 7400#0: *1 http proxy header: "Content-Type:
> text/html; charset=iso-8859-1"
> 2008/05/22 01:32:43 [debug] 7400#0: *1 http proxy header done
> 2008/05/22 01:32:43 [debug] 7400#0: *1 finalize http upstream request:
> 404
> 2008/05/22 01:32:43 [debug] 7400#0: *1 finalize http proxy request
> 2008/05/22 01:32:43 [debug] 7400#0: *1 free rr peer 1 0
> 2008/05/22 01:32:43 [debug] 7400#0: *1 close http upstream connection:
> 12
> 2008/05/22 01:32:43 [debug] 7400#0: *1 event timer del: 12:
> 1211419982557
>
> The same transaction as seen through tcpdump:
>
> 01:36:36.253662 IP 127.0.0.1.60417 > 69.16.168.244.80: P 1:242(241) ack
> 1 win 92 <nop,nop,timestamp 1058195635 33813862>
> E..%W]@. at ....o.+E......P....c|vV...\N......
> ?......fGET
> /linux/debian/pool/main/b/binutils/binutils_2.18.1%257ecvs20080103-4+b1_amd64.deb
> HTTP/1.0
> Host: localhost
> Connection: close
> User-Agent: Debian APT-HTTP/1.3 (0.7.11)
>
> 01:36:36.324114 IP 69.16.168.244.80 > 127.0.0.1.60417: P 1:441(440) ack
> 242 win 54 <nop,nop,timestamp 33813870 1058195635>
> E....?@.7.q-E....o.+.P..c|vV.......6.......
> ...n?...HTTP/1.1 404 Not Found
> Date: Thu, 22 May 2008 01:36:36 GMT
> Server: Apache
> Content-Length: 276
> Connection: close
> Content-Type: text/html; charset=iso-8859-1
>
> <!DOCTYPE HTML PUBLIC "-//IETF//DTD HTML 2.0//EN">
> <html><head>
> <title>404 Not Found</title>
> </head><body>
> <h1>Not Found</h1>
> <p>The requested URL
> /linux/debian/pool/main/b/binutils/binutils_2.18.1%7ecvs20080103-4+b1_amd64.deb
> was not found on this server.</p>
> </body></html>
>
> If you take the URI and fix the double-encoding by hand...
> http://69.16.168.244/linux/debian/pool/main/b/binutils/binutils_2.18.1%257ecvs20080103-4+b1_amd64.deb
> "%25" -> "%"
> http://69.16.168.244/linux/debian/pool/main/b/binutils/binutils_2.18.1%7ecvs20080103-4+b1_amd64.deb
> ...the once-encoded URI works.
>
> I realize this can be considered an apt-get bug, but some browsers out
> there may pre-encode "unreserved" special characters in their URIs
> (http://www.ietf.org/rfc/rfc2396.txt see: sect 2.3), as apt-get is
> doing.
>
> Nginx does seem to know when to decode the original URI and save it in
> decoded form in all of the logs - can this same logic be used by
> proxy_pass to determine whether it should encode a GET request or not to
> the upstream server?
>
> joey
> --
> Posted via http://www.ruby-forum.com/.
>
>
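
For anyone following along, the "%7e" -> "%257e" behaviour in the logs above
can be reproduced outside nginx. The sketch below is only an illustration in
Python, not nginx's actual escaping code; the function names encode_blindly
and normalize_then_encode are hypothetical, and the path is the one from the
upstream request in the debug log.

```python
from urllib.parse import quote, unquote

# Path as it reaches the proxy: APT has already percent-encoded the tilde.
CLIENT_PATH = (
    "/linux/debian/pool/main/b/binutils/"
    "binutils_2.18.1%7ecvs20080103-4+b1_amd64.deb"
)


def encode_blindly(path: str) -> str:
    """Escape the path without decoding it first.

    '%' is not in the safe set, so an existing '%7e' becomes '%257e'.
    """
    return quote(path, safe="/+")


def normalize_then_encode(path: str) -> str:
    """Decode once, then escape once, so existing escapes are not doubled."""
    return quote(unquote(path), safe="/+~")


if __name__ == "__main__":
    print(encode_blindly(CLIENT_PATH))
    print(normalize_then_encode(CLIENT_PATH))
```

The first call yields the double-encoded path seen in the tcpdump; the second
yields the path with a literal "~", which matches the decoded "http uri" line
in the debug log. Whether proxy_pass should work from the decoded URI is the
question Joey raises; this only shows the mechanism.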