Token bucket to limit bots and site grabbers
tobia.conforto at gmail.com
Mon Feb 15 13:34:16 MSK 2010
Is there any module I can use to limit or deny access to bots and site grabbers, based on the long-term request rate?
I'm thinking of a token bucket with a timeframe of hours or days, where a legitimate user will only download, say, 50 pages (images and CSS excluded) per day from a single IP address. Bots will obviously try to grab more content than that: even if they set a long delay between requests, their total number of requests per day will be much higher than that of a legitimate user.
limit_req is not what I'm looking for, because it has a short timeframe of seconds or minutes, and because this kind of limit requires a token bucket, not a leaky bucket.
Is there anything available, or should I write my own module?
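To make the idea concrete, here is a minimal Python sketch of the behaviour I have in mind. It is not nginx module code, only an illustration of the algorithm. The class name TokenBucket, the allow() method, the in-memory dict and the injectable clock are all hypothetical, and the defaults (50 tokens per 86400 seconds) are just the example figures from above.

```python
import time


class TokenBucket:
    """Hypothetical per-IP token bucket with a long refill period.

    Each client starts with `capacity` tokens. Tokens are refilled
    evenly so that an empty bucket is full again after `period` seconds.
    """

    def __init__(self, capacity=50, period=86400.0, clock=time.monotonic):
        self.capacity = float(capacity)
        self.rate = capacity / period   # tokens added per second
        self.clock = clock              # injectable so the logic can be tested
        self.buckets = {}               # ip -> (tokens, time of last request)

    def allow(self, ip):
        """Return True and spend one token if `ip` may fetch another page."""
        now = self.clock()
        tokens, last = self.buckets.get(ip, (self.capacity, now))
        # Refill for the time elapsed since the last request, capped at capacity.
        tokens = min(self.capacity, tokens + (now - last) * self.rate)
        allowed = tokens >= 1.0
        if allowed:
            tokens -= 1.0
        self.buckets[ip] = (tokens, now)
        return allowed
```

Only page requests would be passed to allow(); images and CSS would be skipped before the check. A real module would also need to expire idle entries, since this sketch keeps one entry per IP address forever.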