weird anchor bug

Igor Sysoev is at rambler-co.ru
Sat Oct 28 14:14:57 MSD 2006


On Fri, 27 Oct 2006, Jonathan Dance wrote:

>> > Because this bug could affect a large number of backends
>> > (cgi/fastcgi/proxy), nginx should remove the anchor part of the URL
>> > before passing it on to any other service.
>> 
>> That sounds like something someone should do in a config file, to
>> match certain versions of IE
>> 
>> # is invalid... it should be encoded and sent as %23 -- but I can
>> see the possibility of breaking more browsers by stripping it than
>> by leaving it in and just having a user-defined browser match regex
>> for IE
>
> Yes, it is invalid, and since it is invalid, the theoretically correct
> action should be a 400 Bad Request, but since this is a production IE
> bug, that would definitely break things. If the # was re-interpreted
> as %23, it would cause a 404 error. It's also noteworthy that you get
> a 404 if you are requesting a local file with nginx (I assume it looks
> for the file "#bar" instead of the index for a request to "/#bar").
> The only real solution is to strip off that portion of the URL.
>
> FWIW, Apache2.2 seems to strip off the anchor to handle this case.
> That certainly doesn't make me right; nginx isn't Apache. (My test was
> simply testing a production site I know runs Apache2.2 + Mongrel +
> Rails and I didn't get an error.)
>
> You're right, it can be a config file thing - I will add stripping of
> anchors regardless of browser (because it should never be sent) - but
> unless *everyone* has it in their config file I think it's still a
> problem - people will blame nginx when they run across this.

Thank you. The attached patch deletes "#fragment" from all nginx internals
except $request_line and $request_uri (the unparsed URIs). It seems that
Apache has done the same thing since 1.3b2.
Despite its name (patch-0.4.11.1.txt), the patch can be applied to
the most recent nginx versions.
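
For illustration only, the intended effect on a parsed URI can be sketched
outside nginx roughly as follows. This is a standalone sketch, not code from
the patch: split_uri() and split_uri_t are hypothetical names that do not
exist in nginx, and the sketch ignores the escaping and "complex URI"
handling that the real parser performs. It only shows the rule itself: cut
the request target at the first '#', then split what remains at the first
'?', leaving the raw input untouched.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical illustration type; this is not an nginx structure. */
typedef struct {
    const char  *path;      /* points into the raw request target */
    size_t       path_len;
    const char  *args;      /* NULL when there is no query string */
    size_t       args_len;
} split_uri_t;

/*
 * Split a raw request target into path and query string, discarding
 * everything from the first '#' onward.  The raw buffer is not modified,
 * so the caller can still log or forward the unparsed form.
 */
static void
split_uri(const char *raw, size_t len, split_uri_t *out)
{
    const char  *end = raw + len;
    const char  *hash = memchr(raw, '#', len);
    const char  *q;

    if (hash != NULL) {
        end = hash;                 /* drop "#fragment" */
    }

    /* A '?' that appears after '#' belongs to the fragment; skip it. */
    q = memchr(raw, '?', (size_t) (end - raw));

    out->path = raw;

    if (q != NULL) {
        out->path_len = (size_t) (q - raw);
        out->args = q + 1;
        out->args_len = (size_t) (end - (q + 1));

    } else {
        out->path_len = (size_t) (end - raw);
        out->args = NULL;
        out->args_len = 0;
    }
}
```

With this rule a request for "/#bar" yields the path "/" with no arguments,
so the index is served instead of a lookup for a file named "#bar".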


Igor Sysoev
http://sysoev.ru/en/
-------------- next part --------------
Index: src/http/ngx_http_parse.c
===================================================================
--- src/http/ngx_http_parse.c	(revision 139)
+++ src/http/ngx_http_parse.c	(working copy)
@@ -282,6 +282,10 @@
                 r->args_start = p + 1;
                 state = sw_uri;
                 break;
+            case '#':
+                r->complex_uri = 1;
+                state = sw_uri;
+                break;
             case '+':
                 r->plus_in_uri = 1;
                 break;
@@ -341,6 +345,10 @@
                 r->args_start = p + 1;
                 state = sw_uri;
                 break;
+            case '#':
+                r->complex_uri = 1;
+                state = sw_uri;
+                break;
             case '+':
                 r->plus_in_uri = 1;
                 break;
@@ -366,6 +374,9 @@
                 r->uri_end = p;
                 r->http_minor = 9;
                 goto done;
+            case '#':
+                r->complex_uri = 1;
+                break;
             case '\0':
                 r->zero_in_uri = 1;
                 break;
@@ -822,6 +833,8 @@
                 break;
             case '?':
                 r->args_start = p;
+                goto args;
+            case '#':
                 goto done;
             case '.':
                 r->uri_ext = u + 1;
@@ -853,6 +866,8 @@
                 break;
             case '?':
                 r->args_start = p;
+                goto args;
+            case '#':
                 goto done;
             case '+':
                 r->plus_in_uri = 1;
@@ -883,6 +898,8 @@
                 break;
             case '?':
                 r->args_start = p;
+                goto args;
+            case '#':
                 goto done;
             case '+':
                 r->plus_in_uri = 1;
@@ -915,6 +932,8 @@
                 break;
             case '?':
                 r->args_start = p;
+                goto args;
+            case '#':
                 goto done;
 #if (NGX_WIN32)
             case '.':
@@ -958,6 +977,8 @@
                 break;
             case '?':
                 r->args_start = p;
+                goto args;
+            case '#':
                 goto done;
             case '+':
                 r->plus_in_uri = 1;
@@ -1001,7 +1022,11 @@
                     break;
                 }
 
-                if (ch == '\0') {
+                if (ch == '#') {
+                    *u++ = ch;
+                    ch = *p++;
+
+                } else if (ch == '\0') {
                     r->zero_in_uri = 1;
                 }
 
@@ -1041,6 +1066,31 @@
     r->uri_ext = NULL;
 
     return NGX_OK;
+
+args:
+
+    while (p < r->uri_end) {
+        if (*p++ != '#') {
+            continue;
+        }
+
+        r->args.len = p - 1 - r->args_start;
+        r->args.data = r->args_start;
+        r->args_start = NULL;
+
+        break;
+    }
+
+    r->uri.len = u - r->uri.data;
+
+    if (r->uri_ext) {
+        r->exten.len = u - r->uri_ext;
+        r->exten.data = r->uri_ext;
+    }
+
+    r->uri_ext = NULL;
+
+    return NGX_OK;
 }
 
 

