Having issues with nginx / root captures (0.7.53)
Igor Sysoev
is at rambler-co.ru
Fri May 1 10:41:18 MSD 2009
On Thu, Apr 30, 2009 at 11:16:42PM -0700, Michael Shadle wrote:
> sorry, that was supposed to be bar.com - i just messed up substituting
>
> 2009/4/30 Igor Sysoev <is at rambler-co.ru>:
>
> > First, "~^foo(.*?)\.bar\.ssgisp\.com$" will never match "foo2.mike.bar.com".
> > Second, "~^foo(.*?)\.bar\.com$" will capture "2.mike" with "foo2.mike.bar.com".
>
> You're right though, something in the files is messing with my matching.
>
> What is it in this file that is setting up some sort of capture?
>
> For some reason this turns a
>
> foo123.mike.bar.com into /home/mike/web/foo, not foo123
>
> does -any- regular expression mess with the regexps
>
> location ^~ /robots.txt {
> auth_basic off;
> root /etc/nginx/robots;
> break;
You do not need "break" here. It is a waste of CPU cycles.
> }
>
> if ($http_user_agent ~* googlebot) {
> return 404;
> break;
> }
>
> if ($http_user_agent ~* looksmart) {
> return 404;
> break;
> }
>
> if ($http_user_agent ~* crawl) {
> return 404;
> break;
> }
>
> if ($http_user_agent ~* robot) {
> return 404;
> break;
> }
>
> if ($http_user_agent ~* findlinks) {
> return 404;
> break;
> }
>
> if ($http_user_agent ~* infoseek) {
> return 404;
> break;
> }
>
> if ($http_user_agent ~* search) {
> return 404;
> break;
> }
The "break" after "return" costs nothing, but it is useless.
Also, it's better to combine all the checks into a single regex; it will run
much faster:
if ($http_user_agent ~* "googlebot|looksmart|...") {
return 404;
}
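
As a rough sketch of the combined check written out in full, assuming the
seven user-agent substrings in the quoted configuration are the complete
list (that list is taken from the quoted text, not from a tested setup):

```nginx
# One case-insensitive regex replaces the seven separate "if" blocks.
# The alternatives are the substrings from the quoted configuration.
if ($http_user_agent ~* "googlebot|looksmart|crawl|robot|findlinks|infoseek|search") {
    # "return" ends rewrite processing on its own, so no "break" follows it.
    return 404;
}
```

A single alternation is evaluated once per request instead of running up to
seven separate regex matches against the same header.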
--
Igor Sysoev
http://sysoev.ru/en/