Token bucket to limit bots and site grabbers
mdounin at mdounin.ru
Mon Feb 15 14:33:58 MSK 2010
On Mon, Feb 15, 2010 at 11:34:16AM +0100, Tobia Conforto wrote:
> Is there any module I can use to limit or deny access to bots
> and site grabbers, based on the long-term request rate?
> I'm thinking of a token bucket with a timeframe of hours or
> days, where a legitimate user will only download, say, 50 pages
> (images and css excluded) per day, from a single ip address.
> Bots will obviously try and grab more content than that. Even if
> they set a long delay between requests, the overall number of
> requests per day will be much higher than that of a legitimate
> user.
> limit_req is not what I'm looking for, because it has a short
> timeframe of seconds or minutes, and because this kind of limit
> requires a token bucket, not a leaky bucket.
To turn limit_req into token bucket it's enough to specify
It should be relatively easy to extend supported time frames, too.
Not as easy as just adding another line of configuration parsing,
but I believe it's something that should be done.
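
For illustration, here is a rough sketch of how limit_req might be
configured to behave like a token bucket. It assumes the parameters in
question are "burst" and "nodelay"; that is an assumption, not something
stated above. The zone name "perip", the 10m zone size and burst=50 are
made-up values, and this is a fragment rather than a complete nginx.conf.

```nginx
# Sketch only: zone name, zone size and burst value are illustrative.
http {
    # One state entry per client address. The rate is given in
    # requests per minute here; r/s and r/m are the documented units.
    limit_req_zone $binary_remote_addr zone=perip:10m rate=1r/m;

    server {
        location / {
            # burst permits up to 50 requests above the configured rate.
            # nodelay serves them immediately instead of queueing them,
            # so a client can spend its allowance at once and then
            # regains roughly one request per minute.
            limit_req zone=perip burst=50 nodelay;
        }
    }
}
```

Note that 1r/m still refills 1440 requests per day, so this does not
express a budget of 50 pages per day. That gap is the time frame
limitation discussed above.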