Serve *only* from cache for particular user-agents
Maxim Dounin
mdounin at mdounin.ru
Fri Feb 21 15:47:47 UTC 2014
Hello!
On Fri, Feb 21, 2014 at 10:25:58AM -0500, rge3 wrote:
> I haven't found any ideas for this and thought I might ask here. We have a
> fairly straightforward proxy_cache setup with a proxy_pass backend. We
> cache documents for different lengths of time or go to the backend for what's
> missing. My problem is we're getting overrun with bot and spider requests.
> MSN in particular started hitting us exceptionally hard yesterday and
> started bringing our backend servers down. Because they're crawling the
> site from end to end our cache is missing a lot of those pages and nginx has
> to pass the request on through.
>
> I'm looking for a way to match on User-Agent and say that if it matches
> certain bots to *only* serve out of proxy_cache. Ideally I'd like the logic
> to be: if it's in the cache, serve it. If it's not, then return some 4xx
> error. But in the case of those user-agents, *don't* go to the backend.
> Only give them cache. My first thought was something like...
>
> if ($http_user_agent ~* msn-bot) {
>     proxy_pass http://devnull;
> }
>
> by making a bogus backend. But in nginx 1.4.3 (that's what we're running) I
> get
> nginx: [emerg] "proxy_pass" directive is not allowed here
>
> Does anyone have another idea?
The message suggests you are trying to write the snippet above at
server{} level. Moving things into a location should do the
trick.
Please make sure to read http://wiki.nginx.org/IfIsEvil though.
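A minimal sketch of that placement, reusing the "devnull" upstream name from the question. The cache zone name, cache path, backend address, and the local port used for the bogus backend are assumptions for illustration only, not the poster's actual setup:

```nginx
# Goes inside http{}. Names, paths and addresses are illustrative assumptions.
proxy_cache_path /var/cache/nginx keys_zone=one:10m;

upstream backend { server 127.0.0.1:8080; }   # the real application servers
upstream devnull { server 127.0.0.1:8081; }   # bogus backend for matched bots

server {
    listen 127.0.0.1:8081;
    return 404;          # a cache miss from a matched bot ends here
}

server {
    listen 80;

    location / {
        proxy_cache     one;
        # The default key contains $proxy_host, which would differ between
        # "backend" and "devnull"; use a key that both targets share.
        proxy_cache_key $scheme$host$request_uri;
        proxy_pass      http://backend;

        # "proxy_pass" is accepted in "if" only when the "if" is in a location
        if ($http_user_agent ~* msn-bot) {
            proxy_pass http://devnull;
        }
    }
}
```

With this layout a cached page should be served to everyone, while a miss from a matching user-agent goes to the local 404 server instead of the real backend. If proxy_cache_valid is configured to cover 404 (or "any") responses, the bogus backend's replies would be cached under the shared key too, so that is worth checking before relying on it.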
--
Maxim Dounin
http://nginx.org/