rate limit with good bot IPs whitelisted

Oleksandr V. Typlyns'kyi wangsamp at gmail.com
Sat Nov 22 15:33:57 UTC 2014


Yesterday Nov 21, 2014 at 20:07 neubyr wrote:

> I am trying to figure out if there is any way to rate limit all traffic
> except Googlebot, msnbot, yandex and baidu bots. Here is what I have
> started with:
> 
>   # Whitelisted IPs
>   geo $rate_limit_ip {
>       default $binary_remote_addr;
>       127.0.0.1 "";
>       10.0.0.0/8 "";
>   }
> 
>   # Rate limit
>   limit_req_zone $rate_limit_ip zone=publix:10m rate=10r/s;

 This will not work as you expect: geo does not support variables in
 values, so $binary_remote_addr would be taken as a literal string
 rather than the client address. You need something like this:
 geo $whitelist {
     default 0;
     127.0.0.1 1;
     ...
 }
 map $whitelist $rate_limit_ip {
     default $binary_remote_addr;
     1       "";
 }
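
 Put together, the whole setup might look like the sketch below. The
 zone name and rate are taken from your config. The 192.0.2.0/24 entry
 is only a placeholder for a real crawler range, and the burst value is
 an assumption to adjust. limit_req does not account requests with an
 empty key, which is why whitelisted clients map to "".

 # http context
 geo $whitelist {
     default        0;
     127.0.0.1      1;
     10.0.0.0/8     1;
     192.0.2.0/24   1;    # placeholder, replace with a real bot range
 }
 map $whitelist $rate_limit_ip {
     default $binary_remote_addr;
     1       "";          # empty key: request is not rate limited
 }
 limit_req_zone $rate_limit_ip zone=publix:10m rate=10r/s;

 server {
     location / {
         limit_req zone=publix burst=20;    # burst=20 is an assumed value
     }
 }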

> I can add googlebot, msnbot, yandex and baidu IP ranges manually to the
> whitelist, but that will make the lookup table big. I am not sure whether
> this approach will work for high traffic, around 1200 requests/second
> distributed across 20 nginx hosts. Any ideas on such a setup will be
> really helpful.

  nginx parses this data and loads it into an in-memory radix tree at
  startup, so a large list should not make per-request lookups slow.
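
  If the list grows long, one option is to keep it in a separate file
  and pull it in with the include parameter of geo. A minimal sketch,
  where the file name is hypothetical:

  geo $whitelist {
      default 0;
      # hypothetical file containing lines such as "192.0.2.0/24 1;"
      include conf/bot-whitelist.conf;
  }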

> Also, can such host lookups be done in real-time for every request? I am
> guessing that may not be efficient for each request, but I was wondering if
> there are any solutions.

  All variables are evaluated at the time they are used in a request.

-- 
WNGS-RIPE


