Stopping bots and wget in nginx
Eden Li
eden at mojiti.com
Tue Dec 18 23:09:59 MSK 2007
wget (and many other user agents) respect robots.txt if you place it
at /robots.txt:
http://www.robotstxt.org/orig.html
http://en.wikipedia.org/wiki/Robots.txt
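For example, a minimal /robots.txt that asks every compliant crawler to stay
off the whole site (a sketch; "Disallow: /" means "do not fetch anything"):

    User-agent: *
    Disallow: /

wget honors robots.txt by default; it only ignores it when told to with
-e robots=off.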
Of course, malicious agents will ignore it and keep scraping your site.
Bots like these are hard to block outright, because they can send the same
headers a browser would (for example, wget --user-agent="Mozilla/5.0" makes
its requests look like a browser's), which makes them difficult to
distinguish from normal user traffic.
On 12/18/07, Fabio Coatti <cova at ferrara.linux.it> wrote:
> On Tuesday 18 December 2007, Alexis Torres Garnica wrote:
> > Hi guys, I am new to the list. Is there a way to stop or block bot
> > access and wget on an nginx web server? thanks
> >
> > att: alex
>
> If by "block bots" you mean "block requests based on User-Agent", you can do
> this by setting up something like this:
>
> if ($http_user_agent ~ libwww-perl) {
>     return 400;
> }
>
>
> (just an example, of course)
>
>
> --
> Fabio "Cova" Coatti http://members.ferrara.linux.it/cova
> Ferrara Linux Users Group http://ferrara.linux.it
> GnuPG fp:9765 A5B6 6843 17BC A646 BE8C FA56 373A 5374 C703
> Old SysOps never die... they simply forget their password.
>
>
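Building on Fabio's snippet, a case-insensitive match (~*) can catch several
agents with one rule. A sketch only; the agent list and the 403 status are
arbitrary choices, not a recommendation:

    # inside a server { } block
    if ($http_user_agent ~* (libwww-perl|wget|curl)) {
        return 403;
    }

Keep in mind that any User-Agent check is trivially bypassed by an agent
that sends a browser-like string, as noted above.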