Serve *only* from cache for particular user-agents

Maxim Dounin mdounin at mdounin.ru
Fri Feb 21 15:47:47 UTC 2014


Hello!

On Fri, Feb 21, 2014 at 10:25:58AM -0500, rge3 wrote:

> I haven't found any ideas for this and thought I might ask here.  We have a
> fairly straightforward proxy_cache setup with a proxy_pass backend.  We
> cache documents for different lengths of time or go to the backend for what's
> missing.  My problem is we're getting overrun with bot and spider requests. 
> MSN in particular started hitting us exceptionally hard yesterday and
> started bringing our backend servers down.  Because they're crawling the
> site from end to end our cache is missing a lot of those pages and nginx has
> to pass the request on through.
> 
> I'm looking for a way to match on User-Agent and say that if it matches
> certain bots to *only* serve out of proxy_cache.  Ideally I'd like the logic
> to be:  if it's in the cache, serve it.  If it's not, then return some 4xx
> error.  But in the case of those user-agents, *don't* go to the backend. 
> Only give them cache.  My first thought was something like...
> 
> if ($http_user_agent ~* msn-bot) {
>     proxy_pass http://devnull;
> }
> 
> by making a bogus backend.  But in nginx 1.4.3 (that's what we're running) I
> get
> nginx: [emerg] "proxy_pass" directive is not allowed here
> 
> Does anyone have another idea?

The message suggests you are trying to write the snippet above at 
server{} level.  Moving things into a location should do the 
trick.
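
Something along these lines, as an untested sketch rather than a 
drop-in config.  The upstream names, the zone name "one", the cache 
path, and the ports are all made up for illustration.  Note that the 
default proxy_cache_key includes $proxy_host, which differs between 
the two proxy_pass targets, so the sketch assumes a key without it; 
otherwise the bots would never see entries cached via the real 
backend.

```nginx
# Untested sketch; meant to live inside http{}.
# "backend", "devnull", zone "one", the path and the ports are
# hypothetical names chosen for this example.
proxy_cache_path /var/cache/nginx keys_zone=one:10m;

upstream backend { server 127.0.0.1:8080; }
upstream devnull { server 127.0.0.1:8081; }

# Bogus backend: answers 404 to everything and never touches
# the real servers.
server {
    listen 127.0.0.1:8081;
    return 404;
}

server {
    listen 80;

    location / {
        proxy_cache       one;
        proxy_cache_valid 200 10m;   # only 200s are cached here

        # The default key contains $proxy_host; use one that is
        # the same for both proxy_pass targets below.
        proxy_cache_key   $scheme$host$request_uri;

        proxy_pass http://backend;

        # Allowed here because this "if" is inside a location.
        if ($http_user_agent ~* msn-bot) {
            proxy_pass http://devnull;
        }
    }
}
```

With this, a matching user-agent gets cached pages when they exist 
and the 404 from the bogus backend otherwise.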

Please make sure to read http://wiki.nginx.org/IfIsEvil though.

-- 
Maxim Dounin
http://nginx.org/


