How does 'locate' work?

Maxim Dounin mdounin at mdounin.ru
Thu Oct 22 03:35:16 MSD 2009


Hello!

On Wed, Oct 21, 2009 at 04:50:38PM -0400, GAZ082 wrote:

> Hi! I want to protect a directory and its content. The directory is located in the server in the dir:
> 
> /var/www/site.com/public/documents/
> 
> So, i have:
> 
>     location /public/documents/ {
>         root /var/www/site.com/;
>         auth_basic            "Access restricted.";
>         auth_basic_user_file   /private/pass;
> }

This seems to be correct and should work.  But see below.

> The thing is, i've been toying with the location and root options, with and without ^ and seems that i can not frigging protect the directory. So, can someone explain me with practical examples, how does the location and root parameters work? I read the wiki and can't get it to work properly.

Basic concepts:

1. nginx uses the request URI to select the location that will 
process the request.

2. Only one location is used to process a request.  Don't expect 
different locations to be magically combined.

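3. There are normal locations ("location /uri/") and regexp 
locations ("location ~ regex").  Normal locations use prefix 
matching and the most specific (longest) match wins; regex 
locations are checked in order and the first match wins.  A 
matching regex location takes precedence over the prefix match 
unless the prefix location is marked with "^~".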

4. For everything that needs regexp matching nginx uses PCRE.  See 
man pcre for details.

5. nginx writes an error_log for a reason; looking there may help 
a lot.
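
As a rough illustration of points 1-3, here is a sketch of how 
requests would be assigned to locations.  The paths and file names 
are made up for this example:

    server {
        location / {
            # prefix location, matches everything
        }

        location /images/ {
            # prefix location, more specific than "/"
        }

        location ~ \.(gif|jpg)$ {
            # regex location
        }
    }

    # /index.html       -> "location /"
    # /images/logo.png  -> "location /images/" (longest prefix)
    # /images/logo.gif  -> "location ~ \.(gif|jpg)$" (a matching
    #                      regex is preferred over the prefix match)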

Practical examples:

1. Files in /path/to/root/uri/:

    server {
        root /path/to/root;

        location /uri/ {
            # correct root is inherited, no need to set explicitly 
            # here; files will be accessed as <root> + <uri>

            ...
        }
    }

2. Files in /path/to/root/uri and /path/to/another/root/uri2:

    server {
        root /path/to/root;

        location /uri/ {
            # correct root is inherited, no need to set explicitly 
            # here

            ...
        }

        location /uri2/ {
            root /path/to/another/root;

            ...
        }
    }

3. Files in /something/unrelated, should be accessible via /uri/:

    server {
        location /uri/ {
            # <root> + <uri> gives extra '/uri/', so we have to 
            # use alias instead; alias replaces part matched by 
            # location, for /uri/file it will be <alias> + "file"

            alias /something/unrelated/;
        }
    }
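
To make the difference between root and alias concrete, here is a 
sketch of how request URIs would map to files in each case.  The 
"/other/" location and the file names are hypothetical, used only 
for illustration:

    server {
        root /path/to/root;

        location /uri/ {
            # root: <root> + <uri>
            # /uri/file.txt -> /path/to/root/uri/file.txt
        }

        location /other/ {
            # alias: <alias> + <uri with the location prefix removed>
            # /other/file.txt -> /something/unrelated/file.txt
            alias /something/unrelated/;
        }
    }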

4. Somebody comes along and adds a regexp location to example 1:

    server {
        root /path/to/root;

        location /uri/ {
            ... something ...
        }

        location ~ \.php$ {
            fastcgi_pass ...
        }
    }

Here the fun begins.  As long as you have nothing in "something", 
this will work.  But once you have an essential part of your 
configuration there (e.g. auth_basic), it will work for normal 
files, but not for php.

The reason is that for php files nginx will use the configuration 
from "location ~ \.php$" instead of "something".  Remember - only 
*one* location, no magic?  While this seems obvious in examples like

    location /path/to/ {
        # ... configuration A
    }

    location /path/to/something/ {
        # ... configuration B
    }

it turns out to cause lots of confusion in configurations with 
regexps.
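
For instance, with example 4 filled in as below, the two requests 
in the trailing comments end up in different locations.  The realm, 
password file path, backend address and page names are placeholders 
chosen for this sketch:

    server {
        root /path/to/root;

        location /uri/ {
            auth_basic           "Restricted";
            auth_basic_user_file /path/to/htpasswd;
        }

        location ~ \.php$ {
            fastcgi_pass 127.0.0.1:9000;
        }
    }

    # /uri/page.html -> "location /uri/", password is required
    # /uri/page.php  -> "location ~ \.php$", no auth_basic applies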

Possible solutions include the "^~" location modifier (obvious 
mnemonic: no regex), which makes "location /uri/" win over regexp 
locations:

    server {
        root /path/to/root;

        location ^~ /uri/ {
            auth_basic ...
        }

        location ~ \.php$ {
            fastcgi_pass ...
        }
    }

This will check authorization on anything under /uri/.  But 
obviously this won't allow php files in the /uri/ subdir to be 
passed to the backend.  If you want to do both, you have to add 
another "combined" location, e.g.:

    server {
        root /path/to/root;

        location /uri/ {
            # note: allow regex locations, we will catch them 
            # individually

            auth_basic ...
        }

        location ~ ^/uri/.*\.php$ {
            # make sure php location won't override our auth_basic

            auth_basic ...
            fastcgi_pass ...
        }

        location ~ \.php$ {
            fastcgi_pass ...
        }
    }

But this is really fragile.  As soon as a new regex location 
appears, you'll have to add another combined one to preserve 
auth_basic.  A safer approach is to use nested locations, i.e.:

    server {
        root /path/to/root;

        location ^~ /uri/ {
            # no regexps, please

            auth_basic ...

            # nested location to handle php

            location ~ \.php$ {
                fastcgi_pass ...
            }
        }

        location ~ \.php$ {
            fastcgi_pass ...
        }
    }

Note that nested locations may still have some inheritance issues 
and are therefore not documented.  Still, it's probably the best 
available approach if you use regexp locations.  The only approach 
that is obviously better is not to use regexp locations at all.

If you are still reading, you are probably wondering: how is this 
related to the original question?

The answer is simple - you can't tell whether "location /uri/" 
will be matched without looking at the other locations in the same 
server block, especially regexp locations.  So the config snippet 
you provided is fundamentally incomplete; it may or may not work 
depending on the rest of your config.
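
Applied to the snippet from the question, a minimal sketch could 
look like the following.  It assumes that nothing under 
/public/documents/ needs to be handled by a regexp location; if 
something does, a nested location would be needed instead:

    server {
        root /var/www/site.com;

        location ^~ /public/documents/ {
            # "^~" keeps regexp locations from taking over;
            # files are read from <root> + <uri>, i.e. from
            # /var/www/site.com/public/documents/

            auth_basic           "Access restricted.";
            auth_basic_user_file /private/pass;
        }
    }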

Maxim Dounin




