Geo blocking, but allowing google index robot to pass-thru

Steve Holdoway steve at greengecko.co.nz
Mon Feb 18 21:10:56 UTC 2013


On Mon, 2013-02-18 at 14:00 +0200, Pekka.Panula at sofor.fi wrote:
> Hi 
> 
> I have a site where I want to geo-block all but one country, but
> still allow Google, and perhaps some other indexing bots, to index it.
> 
> What sort of configuration is needed to detect Googlebot and let it
> pass through? An example configuration would be nice. Is checking the
> user-agent the only good way?
> 
> ______________________________________________________________________
> 
> Pekka Panula | Jatkuvat Palvelut  | Direct +358 10 235 9232  |
> Pekka.Panula at sofor.fi 
> Sofor Oy | www.sofor.fi| Takakaarre 3 | PL 51 | FIN-62200 Kauhava 
> tel.  +358 10 235 90 | fax  +358 10 235 9100 
> 
Here's some code I use for a similar setup...

map $geoip_country_code $external_redirects {
	default	'Block';

	US	http://www.example.com;	
}

#Whitelist crawlers
map $http_user_agent $crawler {
	default	0;

	~*(AdsBot-Google|Googlebot-Mobile|Googlebot-Image|Mediapartners-Google|bingbot|Feedfetcher-Google|Googlebot|Yahoo!\ Slurp|msnbot|msnbot-media|YahooCacheSystem)	1;
}
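# You'll also probably need to override for specific IP addresses too...
# geo values are literal strings (variables are not expanded in them), so
# the IP list gets its own flag and is merged with $crawler in a map.
geo $whitelisted_ip {
	default			0;

	# localhost
	127.0.0.0/8		1;

	# CloudFlare
	204.93.240.0/24		1;
	204.93.177.0/24		1;
	199.27.128.0/21		1;
	173.245.48.0/20		1;
	103.21.244.0/22		1;
	103.22.200.0/22		1;
	103.31.4.0/22		1;
	141.101.64.0/18		1;
	108.162.192.0/18	1;
	190.93.240.0/20		1;
	188.114.96.0/20		1;
	197.234.240.0/22	1;
	198.41.128.0/17		1;
}

# $whitelisted is 1 for a listed address or a recognised crawler, else 0.
map $whitelisted_ip$crawler $whitelisted {
	default	1;
	00	0;
}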


You can then use your own logic with the values of $external_redirects
and $whitelisted to control redirections.
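
One possible way to wire these together is sketched below. It is an
illustration only, not the configuration actually in use: the GeoIP
database path, the $deny_request variable, the document root and the
choice of returning 403 are all assumptions, and it presumes that
$whitelisted evaluates to 0 or 1 and that $external_redirects is
'Block' for disallowed countries.

```nginx
# http context. $geoip_country_code comes from the GeoIP module;
# the database path here is an assumption, adjust it for your system.
geoip_country /usr/share/GeoIP/GeoIP.dat;

# Hypothetical helper: 1 only when the client is not whitelisted
# AND its country maps to 'Block'.
map "$whitelisted:$external_redirects" $deny_request {
	default		0;
	"0:Block"	1;
}

server {
	listen		80;
	server_name	www.example.com;

	# Assumed policy: refuse blocked visitors outright.
	if ($deny_request) {
		return 403;
	}

	location / {
		root	/var/www/example;	# assumed document root
	}
}
```

Doing the combination in a map keeps the server block down to a single
if, and a redirect could be substituted for the 403 if that suits the
site better.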

hth,


Steve

-- 
Steve Holdoway BSc(Hons) MIITP 
http://www.greengecko.co.nz
Skype: sholdowa