Token bucket to limit bots and site grabbers

Tobia Conforto tobia.conforto at gmail.com
Mon Feb 15 13:34:16 MSK 2010


Hello

Is there any module I can use to limit or deny access to bots and site grabbers, based on the long-term request rate?

I'm thinking of a token bucket with a timeframe of hours or days, where a legitimate user will only download, say, 50 pages (images and CSS excluded) per day from a single IP address. Bots will obviously try to grab more content than that. Even if they set a long delay between requests, their overall number of requests per day will be much higher than that of a legitimate user.
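To make the idea concrete, here is a minimal sketch of such a per-address token bucket in Python. It is not an nginx module and not an existing API: the class names, the 50-page capacity, the one-day refill period and the in-memory dictionary are all assumptions chosen for illustration. A real deployment would also need shared storage and eviction of idle addresses.

```python
import time


class TokenBucket:
    """One bucket: holds up to `capacity` tokens, refilled evenly over `period` seconds."""

    def __init__(self, capacity=50, period=86400.0, now=0.0):
        self.capacity = float(capacity)
        self.rate = capacity / period      # tokens regained per second
        self.tokens = float(capacity)      # a new client starts with a full bucket
        self.updated = now

    def allow(self, now):
        # Credit the tokens earned since the last request, capped at capacity.
        elapsed = max(0.0, now - self.updated)
        self.tokens = min(self.capacity, self.tokens + elapsed * self.rate)
        self.updated = now
        if self.tokens >= 1.0:
            self.tokens -= 1.0             # each page request costs one token
            return True
        return False


class PerAddressLimiter:
    """Hypothetical wrapper keeping one bucket per client address (in memory only)."""

    def __init__(self, capacity=50, period=86400.0):
        self.capacity = capacity
        self.period = period
        self.buckets = {}

    def allow(self, address, now=None):
        if now is None:
            now = time.monotonic()
        bucket = self.buckets.get(address)
        if bucket is None:
            bucket = TokenBucket(self.capacity, self.period, now)
            self.buckets[address] = bucket
        return bucket.allow(now)


if __name__ == "__main__":
    limiter = PerAddressLimiter()
    # A slow grabber: one page every 10 minutes for a day (144 requests).
    served = sum(limiter.allow("192.0.2.1", now=i * 600.0) for i in range(144))
    print("served", served, "of 144")
```

Note that with a full starting bucket, a client can be served up to the capacity plus whatever is refilled during the first day; only the long-run average is held to 50 pages per day. Starting new addresses with a smaller balance would tighten that.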

limit_req is not what I'm looking for, because it has a short timeframe of seconds or minutes, and because this kind of limit requires a token bucket, not a leaky bucket.
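The difference between the two can be sketched as follows. This is only an illustration under stated assumptions: the leaky bucket is modelled as a strict meter with no burst allowance, which is a simplification and is not meant to reproduce limit_req exactly, and the function names and figures are made up for the example.

```python
def token_bucket_admits(arrivals, capacity, rate):
    """Count admitted requests; unused allowance accumulates up to `capacity`."""
    tokens, last, admitted = float(capacity), 0.0, 0
    for t in arrivals:
        tokens = min(capacity, tokens + (t - last) * rate)
        last = t
        if tokens >= 1.0:
            tokens -= 1.0
            admitted += 1
    return admitted


def strict_meter_admits(arrivals, rate):
    """Count admitted requests when consecutive ones must be at least 1/rate apart."""
    interval = 1.0 / rate
    next_allowed, admitted = 0.0, 0
    for t in arrivals:
        if t >= next_allowed:
            next_allowed = t + interval
            admitted += 1
    return admitted


DAY = 86400.0
RATE = 50 / DAY                          # 50 pages per day, in requests per second
burst = [float(i) for i in range(50)]    # a reader opening 50 pages, one per second

if __name__ == "__main__":
    print("token bucket:", token_bucket_admits(burst, capacity=50, rate=RATE))
    print("strict meter:", strict_meter_admits(burst, rate=RATE))
```

With these numbers the token bucket admits all 50 requests of the burst, while the strict meter admits only the first, since it spaces requests about 29 minutes apart. A human reading session looks like the burst, which is why the daily allowance has to be spendable all at once.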

Is there anything available, or should I write my own module?

Tobia

