Problem with rewrite last giving an HTTP 301?

toto2008 nginx-forum at nginx.us
Wed Sep 22 02:10:05 MSD 2010


Hello,

I'm having a weird problem with my website.
In my nginx conf, I have this rule:
rewrite ^/robots\.txt$ /cms/robotstxt.php last;

and a location to handle PHP files:
location ~ \.php$ {
    fastcgi_pass   127.0.0.1:9000;
    fastcgi_index  index.php;
    fastcgi_param  SCRIPT_FILENAME  $document_root$fastcgi_script_name;
    include fastcgi_params;
}
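
For context, a minimal sketch of how these two pieces might sit
together. It assumes both live in the same server block; the
server_name and root values are hypothetical placeholders:

server {
    listen       80;
    server_name  example.com;        # hypothetical placeholder
    root         /var/www/example;   # hypothetical placeholder

    # Map the public URL to the PHP script; "last" stops rewrite
    # processing and starts a new location search with the changed URI.
    rewrite ^/robots\.txt$ /cms/robotstxt.php last;

    # The rewritten URI ends in .php, so it is handled here.
    location ~ \.php$ {
        fastcgi_pass   127.0.0.1:9000;
        fastcgi_index  index.php;
        fastcgi_param  SCRIPT_FILENAME  $document_root$fastcgi_script_name;
        include fastcgi_params;
    }
}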

When I visit robots.txt, I get the expected result (a dynamically
generated robots.txt). If I check the nginx log file, I get a 200 HTTP
response, as expected:
IPADDRESS - - [21/Sep/2010:16:22:57 +0200] "GET /robots.txt HTTP/1.0"
200 231 "-" "Mozilla/4.0 (compatible; MSIE 8.0; Windows NT 5.1;
Trident/4.0; .NET CLR 1.1.4322; .NET CLR 2.0.50727; .NET CLR
3.0.4506.2152; .NET CLR 3.5.30729; .NET4.0C; .NET4.0E)"
If I check with Fiddler (an HTTP debugger), everything is also OK: I
get my robots.txt file with a 200 HTTP code.

According to the logs, the 200 code is returned most of the time.

Now, here is the problem: sometimes, for reasons that I don't know,
nginx sends a 301 code instead of the 200 code. Here's an example of
Googlebot visiting my website:
66.249.65.168 - - [21/Sep/2010:12:02:09 +0200] "GET /robots.txt
HTTP/1.0" 301 178 "-" "Googlebot-Image/1.0"
66.249.65.166 - - [21/Sep/2010:12:02:09 +0200] "GET /cms/robotstxt.php
HTTP/1.0" 200 231 "-" "Googlebot-Image/1.0"

Of course, I don't want bots or people to visit this /cms/robotstxt.php
page directly...
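
For illustration, nginx's internal directive is one way such a
restriction could be written. A minimal sketch, assuming the script is
only ever meant to be reached through the rewrite; it is untested here
and would not by itself explain the 301 responses:

# Exact-match location, checked before the regex \.php$ location.
location = /cms/robotstxt.php {
    # Direct client requests get a 404; requests whose URI was
    # changed by a rewrite directive still reach the script.
    internal;
    fastcgi_pass   127.0.0.1:9000;
    fastcgi_param  SCRIPT_FILENAME  $document_root$fastcgi_script_name;
    include fastcgi_params;
}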

Tonight, for the first time, I also found this problem with another
rewrite rule. Again, I have a rule like:
rewrite "^/([0-9]+)-([a-z0-9-]*)-([a-z]{2})$"
"/cms/pages.php?id=$1;title=$2;language=$3" last;

and normally I get a 200 HTTP code when I visit one of my pages.
But I discovered some weird log entries from the Yandex bot:
95.108.151.244 - - [21/Sep/2010:21:20:56 +0200] "GET /123-mypage-fr
HTTP/1.0" 301 178 "-" "Mozilla/5.0 (compatible; YandexBot/3.0;
MirrorDetector; +http://yandex.com/bots)"


I've also tried to fetch the robots.txt file from Google Webmaster
Tools, but I got a correct 200 HTTP response.

Does anybody know what the problem could be? I have no idea how I could
reproduce these strange results with my browser, wget, or any other
tool.


I'm using nginx 0.7.67, the version packaged with Debian Squeeze
(32-bit).

Thanks.

Posted at Nginx Forum: http://forum.nginx.org/read.php?2,132723,132723#msg-132723



