rate limit with good bot IPs whitelisted
neubyr
neubyr at gmail.com
Sat Nov 22 17:42:45 UTC 2014
Thank you Oleksandr!!
On Sat, Nov 22, 2014 at 7:33 AM, Oleksandr V. Typlyns'kyi <
wangsamp at gmail.com> wrote:
> Yesterday Nov 21, 2014 at 20:07 neubyr wrote:
>
> > I am trying to figure out if there is any way to rate limit all traffic
> > except Googlebot, msnbot, yandex and baidu bots. Here is what I have
> > started with:
> >
> > # Whitelisted IPs
> > geo $rate_limit_ip {
> > default $binary_remote_addr;
> > 127.0.0.1 "";
> > 10.0.0.0/8 "";
> > }
> >
> > # Rate limit
> > limit_req_zone $rate_limit_ip zone=publix:10m rate=10r/s;
>
> It will not work as you expect.
> Geo does not support variables in values.
> You need something like this:
> geo $whitelist {
> default 0;
> 127.0.0.1 1;
> ...
> }
> map $whitelist $rate_limit_ip {
> default $binary_remote_addr;
> 1 "";
> }
>
>
I am not sure how, but it seems to work with only the geo block defining
the IP addresses. I can see HTTP 503 on the client side and also 'limiting
requests, excess: 10.033 by zone' in the error logs. Nginx version:
nginx/1.6.0. This is the block I am using:
geo $rate_limit_ip {
default $binary_remote_addr;
127.0.0.1 1;
10.0.0.0/8 1;
}
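
For comparison, here is a minimal sketch of the geo-plus-map layout suggested above, combined with the limit_req_zone line from the original post. Since geo does not expand variables in values, the geo-only block probably hands limit_req_zone the literal text "$binary_remote_addr" as one shared key for all non-whitelisted clients (and "1" as another shared key for whitelisted ones), which may be why 503s still appear. The server/location layout and the burst value are assumptions for illustration, not taken from the thread.

```nginx
# Everything below is assumed to live in the http {} context.

# 1 = whitelisted (exempt from rate limiting), 0 = everyone else.
geo $whitelist {
    default      0;
    127.0.0.1    1;
    10.0.0.0/8   1;
}

# map can use a variable as a value, geo cannot.
# An empty key means the request is not accounted by limit_req.
map $whitelist $rate_limit_ip {
    default  $binary_remote_addr;
    1        "";
}

# Same zone definition as in the original post.
limit_req_zone $rate_limit_ip zone=publix:10m rate=10r/s;

server {
    location / {
        # burst=20 is an illustrative value only.
        limit_req zone=publix burst=20;
    }
}
```

With this layout each non-whitelisted client address should get its own bucket, while whitelisted addresses yield an empty key and are skipped by the limiter.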
> > I can add googlebot, msnbot, yandex and baidu IP ranges manually to the
> > whitelist, but that will make lookup table big. I am not sure whether
> > this approach will work for high traffic like - 1200 requests/second
> > distributed across 20 nginx hosts. Any ideas on such setup will be
> > really helpful.
>
> Nginx parses and loads this data into radix tree in memory on startup.
>
> > Also, can such host lookups be done in real-time for every request? I am
> > guessing that may not be efficient for each request, but I was wondering
> > if there are any solutions.
>
> All variables are evaluated when they are used in request.
>
>
I was wondering if a hostname lookup for the remote IP can be done before
rate-limiting it. For example, I don't want to block IPs coming from
baidu.com. Can I do such an IP-to-hostname lookup before rate-limiting? Will
it be efficient, or what are the other options?
Thanks again for the detailed reply.
- N