preventing rewrite loops with "index"

Maxim Dounin mdounin at mdounin.ru
Mon Jan 25 13:57:21 MSK 2010


Hello!

On Mon, Jan 25, 2010 at 08:58:18AM +0200, Marcus Clyne wrote:

> Hi,
> 
> Maxim Dounin wrote:
> >Hello!
> >
> >On Sun, Jan 24, 2010 at 10:45:32PM +0100, Piotr Sikora wrote:
> >
> >>>2. Note "internal" in location /users/.  It means "only visible
> >>>for internal redirects", so even user called "users" should be
> >>>correctly processed by the first location.
> >>Actually, this isn't true. Any attempt to access internal location
> >>results in 404 response.
> >>
> >>You can verify this with very simple configuration:
> >>
> >>server {
> >>   listen 8000;
> >>   location / { return 500; }
> >>   location /x { internal; return 500; }
> >>}
> >>
> >>Accessing /x will result in 404 response.
> The example is obviously correct, but it doesn't truly explain the
> reason for getting the 404 for accessing /users/xxx URLs (even
> though the result is almost the same).  The reason has to do with the
> order in which locations are handled, specifically that ^~ locations
> are handled before ~* and ~ ones, and if they match, then the regex
> ones aren't tested.  If you try to access the URL /users/xxx, it will
> therefore match the second location given by ^~, and return 404
> because it's an internal location.  Therefore, trying to access
> anything under a user named 'users' will fail (though the URL /users
> on its own is ok, because that will match the regex location and not
> the ^~ location).

It's somewhat obvious.
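
A minimal sketch of that ordering, for reference.  The port and the 
simplified user regex are assumptions for illustration, not the 
config from earlier in the thread:

   server {
       listen 8000;

       # Prefix location.  With ^~, if this is the longest matching
       # prefix, regex locations are not tested at all.
       location ^~ /users/ {
           internal;
       }

       # Simplified stand-in for the per-user regex location.
       location ~* ^/[a-z][a-z0-9]+ {
           return 204;
       }
   }

   # GET /users/xxx -> 404 (^~ /users/ wins, and it is internal)
   # GET /users     -> 204 (too short for the /users/ prefix, regex used)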

> 
> Using location /users in the original locations will result in an
> internal server error, because the regex will be caught before the
> /users location each time the URL is checked, creating an infinite
> loop.

By "original" you mean the config I suggested to Dennis J?  No, as 
the first rewrite will add '/' to it, and on the next iteration it 
will be caught by /users/.

The problem will arise with directory redirects though 
(/username/dir -> /username/dir/), as they will use paths after 
rewrites, and this isn't what we need here.  When a user has dir in 
its htdocs, we need the redirect "/user/dir" -> "/user/dir/", but 
the config will issue "/users/u/s/e/user/dir/" instead.

From the above I think that using alias will be better, as the 
request URI then stays as the client sent it.  In 0.8.* this may be 
done with named captures and nested locations, like this:

   location ~* ^/(?<name>(?<n1>[a-z])(?<n2>[a-z0-9])(?<n3>[a-z0-9])[^/]*)(?<p>/.*)?$ {
       alias /tmp/users/$n1/$n2/$n3/$name/htdocs$p;

       location ~ \.(mpg|zip|avi)$ {
           valid_referers localhost none blocked;
           if ($invalid_referer) {
               return 403;
           }
       }
   }
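
As a rough trace, assuming a hypothetical user "maxim" (not a name 
from the thread), the captures above should work out like this:

   # GET /maxim/pics/a.jpg
   #   $name = maxim, $n1 = m, $n2 = a, $n3 = x, $p = /pics/a.jpg
   #   file:  /tmp/users/m/a/x/maxim/htdocs/pics/a.jpg
   #
   # GET /maxim/video.avi
   #   same captures with $p = /video.avi; the nested location also
   #   matches, so the referer check applies
   #
   # GET /maxim/pics (a directory)
   #   the URI was never rewritten, so the redirect should go to
   #   /maxim/pics/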

In older versions one has to create separate locations for normal 
files and ones which need special processing, e.g.

   location ~* ^/(([a-z])([a-z0-9])([a-z0-9])[^/]*)(/.*\.(mpg|zip|avi))$ {
       alias /tmp/users/$2/$3/$4/$1/htdocs$5;
       valid_referers localhost none blocked;
       if ($invalid_referer) {
           return 403;
       }
   }

   location ~* ^/(([a-z])([a-z0-9])([a-z0-9])[^/]*)(/.*)?$ {
       alias /tmp/users/$2/$3/$4/$1/htdocs$5;
   }
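
Note that the order of these two locations matters: regex locations 
are tested in the order they appear in the config, and the second 
regex would also match media files.  A rough trace, with a 
hypothetical user "maxim" (not a name from the thread):

   # GET /maxim/video.avi  -> first location
   #   $1 = maxim, $2 = m, $3 = a, $4 = x, $5 = /video.avi
   #   file:  /tmp/users/m/a/x/maxim/htdocs/video.avi (referer checked)
   #
   # GET /maxim/index.html -> first regex fails, second location
   #   $5 = /index.html
   #   file:  /tmp/users/m/a/x/maxim/htdocs/index.html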

> >True, I was wrong here.  Actually I wasn't sure and that's why I
> >used "should".  :)
> >
> >>This is a bug and it's somewhere on my TODO list.
> >
> >Strictly - it's not a bug, it's just how internal locations work
> >now.  But I agree it's probably a good idea to change semantics
> >and make them just invisible for external requests.
> I was under the impression that the way internal requests currently
> work was a consciously-chosen decision, and was considered a
> feature.  It's a useful one IMHO.  Surely if you want to make a
> location fully 'invisible' (i.e. both internally and externally),
> you can just add the directive 'return 404;' to the location.

No, "invisible" != "one which returns 404".  The idea is 
that internal locations should be ignored during matching of 
external requests, letting other (non-internal) locations match 
the request instead.
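
With Piotr's test config from above, the difference would look 
roughly like this (the second outcome is the proposed behaviour, not 
what nginx does now):

   server {
       listen 8000;
       location / { return 500; }
       location /x { internal; return 500; }
   }

   # GET /x, current semantics:  404, /x matches and rejects the request
   # GET /x, proposed semantics: /x is skipped, location / handles it (500)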

Maxim Dounin


