preventing rewrite loops with "index"
Maxim Dounin
mdounin at mdounin.ru
Mon Jan 25 13:57:21 MSK 2010
Hello!
On Mon, Jan 25, 2010 at 08:58:18AM +0200, Marcus Clyne wrote:
> Hi,
>
> Maxim Dounin wrote:
> >Hello!
> >
> >On Sun, Jan 24, 2010 at 10:45:32PM +0100, Piotr Sikora wrote:
> >
> >>>2. Note "internal" in location /users/. It means "only visible
> >>>for internal redirects", so even user called "users" should be
> >>>correctly processed by the first location.
> >>Actually, this isn't true. Any attempt to access internal location
> >>results in 404 response.
> >>
> >>You can verify this with very simple configuration:
> >>
> >>server {
> >> listen 8000;
> >> location / { return 500; }
> >> location /x { internal; return 500; }
> >>}
> >>
> >>Accessing /x will result in 404 response.
> The example is obviously correct, but it doesn't truly explain the
> reason for getting the 404 for accessing /users/xxx URLs (even
> though the result is almost the same). The reason is to do with the
> order that locations are handled, specifically that ^~ locations are
> handled before ~* and ~ ones, and if they match, then the regex ones
> aren't tested. If you try to access the URL /users/xxx, it will
> therefore match the second location given by ^~, and return 404
> because it's an internal location. Therefore, trying to access
> anything under a user named 'users' will fail (though the URL /users
> on its own is ok, because that will match the regex location and not
> the ^~ location).
It's somewhat obvious.
>
> Using location /users in the original locations will result in an
> internal server error, because the regex will be caught before the
> /users location each time the URL is checked, creating an infinite
> loop.
By "original" you mean the config I suggested to Dennis J? No, as
the first rewrite will add '/' to it, and on the next iteration it
will be caught by /users/.
A problem will arise with directory redirects though
(/username/dir -> /username/dir/), as they will use paths after
rewrites, and this isn't what we need here. When a user has a dir in
their htdocs, we need the redirect "/user/dir" -> "/user/dir/", but
the config will issue a "/users/u/s/e/users/dir/" one.
From the above I think that using alias will be better. In
0.8.* this may be done with named captures and nested locations,
like this:
    location ~* ^/(?<name>(?<n1>[a-z])(?<n2>[a-z0-9])(?<n3>[a-z0-9])[^/]*)(?<p>/.*)?$ {
        alias /tmp/users/$n1/$n2/$n3/$name/htdocs$p;

        location ~ \.(mpg|zip|avi)$ {
            valid_referers localhost none blocked;
            if ($invalid_referer) {
                return 403;
            }
        }
    }
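
To illustrate how the captures map a request onto the filesystem, here
is the outer location again, annotated for a made-up request. The user
name "johndoe" and the file "video/clip.avi" are hypothetical and serve
only as an example; the nested referer check is omitted for brevity.

```nginx
# Hypothetical request: GET /johndoe/video/clip.avi
#   $name = johndoe
#   $n1   = j
#   $n2   = o
#   $n3   = h
#   $p    = /video/clip.avi
# File served: /tmp/users/j/o/h/johndoe/htdocs/video/clip.avi
location ~* ^/(?<name>(?<n1>[a-z])(?<n2>[a-z0-9])(?<n3>[a-z0-9])[^/]*)(?<p>/.*)?$ {
    alias /tmp/users/$n1/$n2/$n3/$name/htdocs$p;
}
```

Because alias works on the original URI, nothing is rewritten, which
should leave the rewritten path out of the directory redirects
discussed above.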
In older versions one has to create separate locations for normal
files and ones which need special processing, e.g.:
    location ~* ^/(([a-z])([a-z0-9])([a-z0-9])[^/]*)(/.*\.(mpg|zip|avi))?$ {
        alias /tmp/users/$2/$3/$4/$1/htdocs$5;
        valid_referers localhost none blocked;
        if ($invalid_referer) {
            return 403;
        }
    }

    location ~* ^/(([a-z])([a-z0-9])([a-z0-9])[^/]*)(/.*)?$ {
        alias /tmp/users/$2/$3/$4/$1/htdocs$5;
    }
> >True, I was wrong here. Actually I wasn't sure and that's why I
> >used "should". :)
> >
> >>This is a bug and it's somewhere on my TODO list.
> >
> >Strictly speaking, it's not a bug, it's just how internal locations
> >work now. But I agree it's probably a good idea to change the
> >semantics and make them just invisible for external requests.
> I was under the impression that the way internal requests currently
> work was a consciously-chosen decision, and was considered a
> feature. It's a useful one IMHO. Surely if you want to make a
> location fully 'invisible' (i.e. both internally and externally),
> you can just add the directive 'return 404;' to the location.
No, "invisible" != "one which returns 404". The idea is
that internal locations should be ignored during matching of
external requests, letting other (non-internal) locations match
the request instead.
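
As a rough sketch of the difference, here is Piotr's test config from
above with both behaviours written out as comments. The "invisible"
case is only the idea being discussed, not something nginx does now.

```nginx
server {
    listen 8000;

    location /  { return 500; }
    location /x { internal; return 500; }
}

# External request GET /x
#   current behaviour:     /x is selected, then rejected as internal -> 404
#   "invisible" semantics: /x would be skipped during matching, so
#                          location / would handle the request -> 500
```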
Maxim Dounin