preventing rewrite loops with "index"

Maxim Dounin mdounin at mdounin.ru
Sat Jan 23 08:52:33 MSK 2010


Hello!

On Sat, Jan 23, 2010 at 05:19:07AM +0100, Dennis J. wrote:

> On 01/23/2010 04:59 AM, Maxim Dounin wrote:
> >Hello!
> >
> >On Sat, Jan 23, 2010 at 02:47:12AM +0100, Dennis J. wrote:
> >
> >>On 01/22/2010 04:01 PM, Igor Sysoev wrote:
> >>>On Fri, Jan 22, 2010 at 03:06:47PM +0100, Dennis J. wrote:
> >>>
> >>>>Hi,
> >>>>So with my first rewrite issue solved I now move closer towards the real
> >>>>configuration and run into a problem with the index directive.
> >>>>
> >>>>My location looks like this:
> >>>>
> >>>>location ~* ^/(([A-Za-z])([A-Za-z0-9])([A-Za-z0-9])[^/]*)(/.*)?$ {
> >>>>      root /web;
> >>>>      set $site_path /users/$2/$3/$4/$1/htdocs;
> >>>>      set $real_uri $5;
> >>>>      rewrite .* $site_path$real_uri break;
> >>>>}
> >>>>
> >>>>When I request "/test/index.html" the location matches and gets properly
> >>>>rewritten into a hashed form "/users/t/e/s/test/index.html". Then the root
> >>>>gets prefixed, resulting in the path "/web/users/t/e/s/test/index.html", which
> >>>>gets correctly delivered by nginx. So far so good.
> >>>>
> >>>>The problem happens when I request "/test/" instead which should deliver
> >>>>the same index.html through the index directive. That doesn't happen though.
> >>>>
> >>>>Looking at the log what seems to happen is that nginx sees that
> >>>>"/web/users/t/e/s/test/" is a directory and issues a new request with the
> >>>>uri "/web/users/t/e/s/test/index.html". This however matches the above
> >>>>location again resulting in another rewrite that ends with a completely
> >>>>broken path and a 404.
> >>>>
> >>>>How can I get correct index processing for the first, correctly
> >>>>rewritten path without triggering another round of location processing
> >>>>that messes things up?
> >>>
> >>>  location ~* ^/(([A-Za-z])([A-Za-z0-9])([A-Za-z0-9])[^/]*)(/.*)?$ {
> >>>      alias  /web/users/$2/$3/$4/$1/htdocs$5;
> >>>  }
> >>
> >>This works as intended, thanks!
> >>When I try to add a referrer check though I run into trouble. Adding
> >>the following after the alias directive makes nginx return a 404
> >>instead of index.html:
> >>
> >>             if ($request_uri ~ zip) {
> >>             }
> >>
> >>The log says that nginx cannot find the file
> >>"/web/users/////htdocs". When I change the ~ into a = then nginx
> >>returns index.html correctly.
> >>What I'm trying to get at is something similar to this:
> >>
> >>valid_referers none www.mydomain.com;
> >>if ($request_uri ~* \.(mpg|zip|avi)$) {
> >>   if ($invalid_referer) {
> >>     return 405;
> >>   }
> >>}
> >>
> >>I noticed that nested if's are not possible so I'm not sure how to
> >>handle such a case where multiple conditions have to be satisfied
> >>(name must match and $invalid_referer must be set). But right now
> >>I'm wondering why changing the "=" into "~" above suddenly results
> >>in a 404 and the captured variables all being empty.
> >
> >With "=" condition is false.  And - no, there is no surprise here.
> >See here for some more details:
> 
> With ~ the condition is false too; after all, I'm calling
> "/test/index.html". But if "if" is generally buggy then I guess that
> might be the problem.

No, then "if" isn't the thing to blame.  And, after all, it's not 
*generally* buggy, it's *specifically* buggy.  :)

In this case you just overwrote the captures from the location 
regexp with another regexp match (alias is evaluated after the 
rewrite directives, including your "if" with a regexp).  The 
solution is to use named captures, available in nginx 0.8.25+, or 
an explicit "set" to save the captures.
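
As a minimal, untested sketch of the named-captures variant: the 
capture names $site, $c1, $c2, $c3 and $rest are made up for 
illustration, and the empty "if" is just the test case quoted 
above.  Something like this:

```nginx
location ~* ^/(?<site>(?<c1>[a-z])(?<c2>[a-z0-9])(?<c3>[a-z0-9])[^/]*)(?<rest>/.*)?$ {
    # The regexp in "if" below replaces the positional captures
    # ($1..$5) before alias is evaluated, but it defines no named
    # groups, so the named captures from the location survive.
    alias /web/users/$c1/$c2/$c3/$site/htdocs$rest;

    if ($request_uri ~ zip) {
    }
}
```

This assumes nginx 0.8.25+ built against a PCRE version that 
understands the (?<name>...) syntax.  The explicit "set" variant 
follows the same idea: copy $1..$5 into your own variables before 
any other regexp runs.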

> >http://wiki.nginx.org/IfIsEvil
> 
> So how do I accomplish what I'm trying to do above with nginx?
> For Apache this would look something like this:
> 
> RewriteCond %{HTTP_REFERER} !www.mydomain.com
> RewriteCond %{HTTP_REFERER} !^$
> RewriteRule \.(mpg|zip|avi)$ - [F]
> 
> What is the equivalent in nginx?

Normally this translates to:

    location ~ \.(mpg|zip|avi)$ {
        valid_referers ...

        if ($invalid_referer) {
            return 403;
        }
    }

As you need this together with an already complex location, I 
believe a better approach is to use a separate rewrite as in your 
original message, but protect the destination to avoid your 
original problem.

Something like this should work:

   location ~* ^/(([a-z])([a-z0-9])([a-z0-9])[^/]*)(/.*)?$ {
       set $path /$2/$3/$4/$1/htdocs$5;
       rewrite ^ /users$path last;
   }

   location ^~ /users/ {
       internal;
       root /web;

       location ~ \.(mpg|zip|avi)$ {
           valid_referers ...
           if ($invalid_referer) {
               return 403;
           }
       }
   }

Some key points:

1. Note "^~" in location /users/.  It means "do not apply regexp 
locations", so your location with rewrite won't be triggered 
again.

2. Note "internal" in location /users/.  It means "only visible 
for internal redirects", so even a user called "users" should be 
correctly processed by the first location.

Maxim Dounin


