Blocking Bad bots

Anoop Alias anoopalias01 at gmail.com
Mon Nov 14 15:40:46 UTC 2016


I had asked the same question once and did not get a to-the-point response.

So here is what I infer:

The if causes nginx to check the User-Agent header of each request against
the list of patterns you have configured and return a 403 if a match is found.

So every request pays the cost of that if processing.

mod_security etc. do something similar, running a check on each request, so
in that sense this is fine if you are willing to trade some speed for the
user-agent checking. But you are definitely making nginx slower and consuming
more resources by adding the if there, and it gets worse as the list grows.
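
For comparison, here is a rough, untested sketch of the map approach that is
mentioned in the reply below. The bot patterns and the $block_bot variable
name are only examples:

    # http{} context: classify the User-Agent once per request.
    map $http_user_agent $block_bot {
        default                                                  0;
        "~*(zealbot|MJ12bot|AhrefsBot|sogou|PaperLiBot|uipbot)"  1;
        "~*(DotBot|GetIntent|Cliqzbot|YandexBot|Nutch)"          1;
    }

    server {
        listen 80;
        server_name example.com;

        # Drop flagged requests. 444 closes the connection without a
        # response; use "return 403;" if you prefer to send a status.
        if ($block_bot) {
            return 444;
        }
    }

The map variable is evaluated on demand and the patterns live in one place,
so growing the list does not clutter every server block, although each
request is still matched against the regexes.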



On Mon, Nov 14, 2016 at 9:00 PM, <lists at lazygranch.com> wrote:

> You can block some of those bots at the firewall permanently.
>
> I use the nginx map feature in a similar manner, but I don't know if map
> is more efficient than your code. I started out blocking much as your
> scheme does, but the map feature reads more clearly to me in the conf file.
>
> Majestic and Sogou sure are annoying. For what I block, I use 444 rather
> than 403. (And yes, I know that destroys the matter/anti-matter mix of the
> universe, so don't lecture me.) I then eyeball the 444 hits periodically,
> using a script to pull the 444 requests out of the access.log file. I have
> another script to get just the IP addresses from access.log.
>
> For the search engines like Majestic and Sogou, which don't seem to have
> an IP space you can look up via BGP tools, I take the IP used and add it to
> my firewall blocking table. I can go weeks before a new IP gets used.
>
>   Original Message
> From: debilish99
> Sent: Monday, November 14, 2016 7:04 AM
> To: nginx at nginx.org
> Reply To: nginx at nginx.org
> Subject: Blocking Bad bots
>
> Hello,
>
> I have a server with several domains. In the configuration file of each
> domain I have a block like this to block bad bots.
>
> if ($http_user_agent ~* (zealbot|MJ12bot|AhrefsBot|sogou|PaperLiBot|uipbot|DotBot|GetIntent|Cliqzbot|YandexBot|Nutch|TurnitinBot|IndeedBot)) {
>     return 403;
> }
>
> This works fine.
>
> The question is: if I increase the list of bad bots to 1000, for example,
> would this be a speed problem, given that nginx checks every request that
> arrives?
>
> I have domains that can have 500,000 hits daily and up to 20,000 hits.
>
> Thank you all.
>
> Greetings.
>
> Posted at Nginx Forum: https://forum.nginx.org/read.php?2,270930,270930#msg-270930
>
> _______________________________________________
> nginx mailing list
> nginx at nginx.org
> http://mailman.nginx.org/mailman/listinfo/nginx
>



-- 
*Anoop P Alias*