if-clause garbles variable content

Maxim Dounin mdounin at mdounin.ru
Sun Dec 20 00:56:54 MSK 2009


Hello!

On Sat, Dec 19, 2009 at 03:33:10PM -0500, marius wrote:

> I'd like to use a map in order to prevent various spiders from indexing documents, but I'm running into strange issues, which I can't explain (nginx 0.7.62 or 0.7.64):
> 
> 
> root /tmp/test;
> 
> map $lookup $test_blacklist {
>   default "";
>   /aabbccddeeff/aabbccddeeff/2009/A/Pdf/20090101.pdf 1;
> }
> 
> location ~* ^/aabbccddeeff(/.*) {
>   set $lookup "";
>   if ($http_user_agent ~ "Googlebot|Slurp|msnbot") {
>     # force lowercase "aabbccddeeff"
>     set $lookup /aabbccddeeff$1;
>   }
> 
>   if ($test_blacklist != "") {
>     return 410;
>   }
> 
>   charset iso-8859-1;
>   alias  /tmp/test/aabbccddeeff$1;
> }
> 
> 
> When requesting a file from that map, the open() log points to a garbled path "/aabbccddeeffcddeeff" rather than the requested "/aabbccddeeff/aabbccddeeff":

This is somewhat expected: the regex match in your if() overwrote 
the captures from the location regex, so $1 no longer holds what 
the location matched by the time alias uses it.
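
To make the order of events concrete, here is your location again 
with comments added.  The comments are only my annotation of the 
explanation above; the directives themselves are unchanged:

    location ~* ^/aabbccddeeff(/.*) {
        # the location regex has matched; $1 holds its capture
        set $lookup "";
        if ($http_user_agent ~ "Googlebot|Slurp|msnbot") {
            # a second regex match has just run and replaced the
            # capture state left by the location regex, so this $1
            # is already unreliable
            set $lookup /aabbccddeeff$1;
        }
        ...
        # $1 is expanded here as well, after the if() regex has
        # been executed
        alias  /tmp/test/aabbccddeeff$1;
    }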

You should either use named captures as supported in nginx 
0.8.25+, like this:

    location ~* ^/aabbccddeeff(?<file>/.*) {
        ...
        alias /tmp/test/aabbccddeeff$file;
    }
 
or save capture results before executing another regexp, e.g.

    location ~* ^/aabbccddeeff(/.*) {
        set $file $1;
        ...
        alias /tmp/test/aabbccddeeff$file;
    }


[...]

> What's more, if the user-agent isn't matched, the server sends a 301 redirect with an appended "/" rather than delivering the file:
> 
> HTTP/1.1 301 Moved Permanently
> Location: http://test/aabbccddeeff/aabbccddeeff/2009/A/Pdf/20090101.pdf/

This one isn't expected, but it seems to be just another chapter 
in the "if is evil" saga.  In this particular case the alias 
directive isn't correctly inherited into the implicit location 
created by if(), and this screws things up.

Am I right in assuming that you need "aabbccddeeff" to be matched 
case-insensitively while it is in lower case on the filesystem, and 
that's why you use alias instead of root?  Try something like this:

    # lowercase
    rewrite ^(?i)/aabbccddeeff/(.*) /aabbccddeeff/$1;

    location /aabbccddeeff/ {
        set $lookup "";
        if ($http_user_agent ~ "Googlebot|Slurp|msnbot") {
            set $lookup $uri;
        }
        if ($test_blacklist != "") {
            return 410;
        }
        root /tmp/test;
    }

> On the other hand, if I omit the $http_user_agent test, the server behaves as expected, by either delivering the file or returning a 410 status, depending on the content of $lookup.
> 
> Am I doing something wrong?

The only safe things to do inside if() in location are

1. rewrite ... last;

2. return ...;

Using anything else is asking for trouble.
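
As a rough sketch of a configuration that stays within that rule, 
the user-agent test can be moved out of if() and into a map, so 
that the only thing left inside if() is a return.  This is not 
something the 0.7.x versions discussed here can do: it assumes a 
newer nginx where map accepts regular expressions and a 
combination of variables as its source (0.9.6 or later, as far as 
I know).  The variable name $is_spider is made up for 
illustration, and the whole block is untested:

    # hypothetical helper variable: "1" for the listed spiders
    map $http_user_agent $is_spider {
        default "";
        ~Googlebot|Slurp|msnbot 1;
    }

    # the blacklist is keyed on the spider flag plus the
    # (already lowercased) URI
    map "$is_spider:$uri" $test_blacklist {
        default "";
        "1:/aabbccddeeff/aabbccddeeff/2009/A/Pdf/20090101.pdf" 1;
    }

    # lowercase
    rewrite ^(?i)/aabbccddeeff/(.*) /aabbccddeeff/$1;

    location /aabbccddeeff/ {
        # only "return" is used inside if()
        if ($test_blacklist) {
            return 410;
        }
        root /tmp/test;
    }

Here no variable is set inside if(), and neither alias nor 
captures are involved, so the two problems described above should 
not come up.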

Maxim Dounin


