preventing rewrite loops with "index"

Dennis J. dennisml at conversis.de
Thu Jan 28 06:15:14 MSK 2010


On 01/25/2010 11:57 AM, Maxim Dounin wrote:
> Hello!
>
> On Mon, Jan 25, 2010 at 08:58:18AM +0200, Marcus Clyne wrote:
>
>> Hi,
>>
>> Maxim Dounin wrote:
>>> Hello!
>>>
>>> On Sun, Jan 24, 2010 at 10:45:32PM +0100, Piotr Sikora wrote:
>>>
>>>>> 2. Note "internal" in location /users/.  It means "only visible
>>>>> for internal redirects", so even user called "users" should be
>>>>> correctly processed by the first location.
>>>> Actually, this isn't true. Any attempt to access internal location
>>>> results in 404 response.
>>>>
>>>> You can verify this with very simple configuration:
>>>>
>>>> server {
>>>>    listen 8000;
>>>>    location / { return 500; }
>>>>    location /x { internal; return 500; }
>>>> }
>>>>
>>>> Accessing /x will result in 404 response.
>> The example is obviously correct, but it doesn't truly explain the
>> reason for getting the 404 for accessing /users/xxx URLs (even
>> though the result is almost the same).  The reason is to do with the
>> order that locations are handled, specifically that ^~ locations are
>> handled before ~* and ~ ones, and if they match, then the regex ones
>> aren't tested.  If you try to access the URL /users/xxx, it will
>> therefore match the second location given by ^~, and return 404
>> because it's an internal location.  Therefore, trying to access
>> anything under a user named 'users' will fail (though the URL /users
>> on its own is ok, because that will match the regex location and not
>> the ^~ location).
>
> It's somewhat obvious.
>
>>
>> Using location /users in the original locations will result in an
>> internal server error, because the regex will be caught before the
>> /users location each time the URL is checked, creating an infinite
>> loop.
>
> By "original" you mean the config I suggested to Dennis J?  No, as
> the first rewrite will add '/' to it, and on the next iteration it
> will be caught by /users/.
>
> The problem will arise with directory redirects though
> (/username/dir ->  /username/dir/), as they will use paths after
> rewrites, and this isn't what we need here.  When a user has dir in
> its htdocs, we need a redirect "/user/dir" ->  "/user/dir/", but
> the config will issue a "/users/u/s/e/users/dir/" one.
>
> From the above I think that using alias will be better.  In
> 0.8.* this may be done with named captures and nested locations,
> like this:
>
>     location ~* ^/(?<name>(?<n1>[a-z])(?<n2>[a-z0-9])(?<n3>[a-z0-9])[^/]*)(?<p>/.*)?$ {
>         alias /tmp/users/$n1/$n2/$n3/$name/htdocs$p;
>
>         location ~ \.(mpg|zip|avi)$ {
>             valid_referers localhost none blocked;
>             if ($invalid_referer) {
>                 return 403;
>             }
>         }
>     }

I have now gone with something like the above, but I'm still running into a
snag when setting up FastCGI, which apparently has to do with the fact that
$document_root is not set up properly when "alias" is used. This is my
config so far:

location ~* ^/(?<name>(?<n1>[a-z])(?<n2>[a-z0-9])(?<n3>[a-z0-9])[^/]*)(?<p>/.*)?$ {
     alias /web/users/$n1/$n2/$n3/$name/htdocs$p;

     if (-f /web/users/$n1/$n2/$n3/$name/user/.disable-member) {
         return 405;
     }

     location ~ \.(zip|mpg|avi)$ {
         valid_referers none www.testdomain.com;
         if ($invalid_referer) {
             return 403;
         }
     }

     location ~ \.php$ {
         set $phphost 127.0.0.1:9000;
         fastcgi_pass   $phphost;
         fastcgi_index  index.php;
         fastcgi_param  DOCUMENT_ROOT /web/users/$n1/$n2/$n3/$name/htdocs;
         fastcgi_param  SCRIPT_FILENAME  $document_root$fastcgi_script_name;
         fastcgi_param  PATH_INFO  $fastcgi_script_name;
         include        fastcgi_params;
     }
}

Notice how I have to set DOCUMENT_ROOT and SCRIPT_FILENAME in order to get
this working. What is strange is that $document_root is
"/web/users/t/e/s/test/htdocs/index.php" (the alias?) and
$fastcgi_script_name is "/test/index.php", yet when I call a script that
displays $_SERVER, SCRIPT_FILENAME shows up as
"/web/users/t/e/s/test/htdocs/index.php". That is what I want, but it is not
what the definition above should produce. Given the values of the variables,
I would expect "/web/users/t/e/s/test/htdocs/index.php/test/index.php".

Also, even if these are right, I still get ORIG_SCRIPT_FILENAME as
"/web/users/t/e/s/test/htdocs/index.php/test/index.php", PATH_TRANSLATED as
"/web/users/t/e/s/test/htdocs/test/index.php" and PHP_SELF as
"/test/index.php/test/index.php", none of which look right.
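
One variant that might be worth trying is sketched below. It is untested and
rests on an assumption: that with "alias" in a regex location,
$request_filename already resolves to the full path of the aliased file, so
nothing has to be concatenated onto $document_root. The block would replace
the nested PHP location inside the outer regex location; all names and paths
are the ones from the config above.

```nginx
# Sketch only, not verified against this setup.
# Assumption: $request_filename is the aliased file path, e.g.
# /web/users/t/e/s/test/htdocs/index.php for the URI /test/index.php.
location ~ \.php$ {
    set $phphost 127.0.0.1:9000;
    fastcgi_pass   $phphost;
    fastcgi_index  index.php;
    fastcgi_param  DOCUMENT_ROOT    /web/users/$n1/$n2/$n3/$name/htdocs;
    # Use the resolved file path instead of $document_root$fastcgi_script_name.
    fastcgi_param  SCRIPT_FILENAME  $request_filename;
    # PATH_INFO is left out in this sketch: $fastcgi_script_name holds the
    # script URI, so passing it as PATH_INFO may be what doubles PHP_SELF.
    include        fastcgi_params;
}
```

If the assumption holds, SCRIPT_FILENAME should arrive at PHP as a single
file path, and the doubled ORIG_SCRIPT_FILENAME and PHP_SELF values may go
away as well.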

Any ideas why the variables look messed up like this?

Regards,
   Dennis


