<div dir="ltr">Some time ago I wrote this <a href="https://github.com/wandenberg/nginx-trusted-proxy-resolver-module">module</a> to check whether a request really comes through the Google Proxy. It does a reverse DNS lookup on the client address, resolves the returned hostname again, and compares the results to validate the access.<div>You can do something similar.</div></div><div class="gmail_extra"><br><div class="gmail_quote">On Sun, Sep 25, 2016 at 11:58 PM, <a href="mailto:lists@lazygranch.com">lists@lazygranch.com</a> <span dir="ltr"><<a href="mailto:lists@lazygranch.com" target="_blank">lists@lazygranch.com</a>></span> wrote:<br><blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex">I got a spoofed googlebot hit. It was easy to detect since there were<br>
probably a hundred requests that triggered my hacker detection map<br>
scheme. Only two requests received a 200 return and both were harmless.<br>
<br>
200 118.193.176.53 - - [25/Sep/2016:17:45:23 +0000] "GET / HTTP/1.1" 847 "-" "Mozilla/5.0 (compatible; Googlebot/2.1; +<a href="http://www.google.com/bot.html" rel="noreferrer" target="_blank">http://www.google.com/bot.<wbr>html</a>)" "-"<br>
<br>
For the fake googlebot:<br>
# host 118.193.176.53<br>
Host 53.176.193.118.in-addr.arpa not found: 3(NXDOMAIN)<br>
<br>
For a real googlebot:<br>
# host 66.249.69.184<br>
184.69.249.66.in-addr.arpa domain name pointer <a href="http://crawl-66-249-69-184.googlebot.com" rel="noreferrer" target="_blank">crawl-66-249-69-184.googlebot.<wbr>com</a>.<br>
<br>
IP2location shows it is a Chinese ISP:<br>
<a href="http://www.ip2location.com/118.193.176.53" rel="noreferrer" target="_blank">http://www.<wbr>ip2location.com/118.193.176.53</a><br>
<br>
Nginx has a reverse DNS module:<br>
<a href="https://github.com/flant/nginx-http-rdns" rel="noreferrer" target="_blank">https://github.com/flant/<wbr>nginx-http-rdns</a><br>
I see it has a 10.1 issue:<br>
<a href="https://github.com/flant/nginx-http-rdns/issues/8" rel="noreferrer" target="_blank">https://github.com/flant/<wbr>nginx-http-rdns/issues/8</a><br>
<br>
Presuming this bug gets fixed, does anyone have code to verify<br>
googlebots? Or some other method?<br>
<br>
______________________________<wbr>_________________<br>
nginx mailing list<br>
<a href="mailto:nginx@nginx.org">nginx@nginx.org</a><br>
<a href="http://mailman.nginx.org/mailman/listinfo/nginx" rel="noreferrer" target="_blank">http://mailman.nginx.org/<wbr>mailman/listinfo/nginx</a><br>
</blockquote></div><br></div>
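<div dir="ltr">For reference, here is a minimal sketch of the reverse DNS + forward DNS check in Python. It is not the code from either module. The function name <code>is_verified_googlebot</code>, the injectable lookup parameters, and the list of accepted hostname suffixes are my own assumptions for illustration, so adjust the suffixes to whatever Google currently documents.</div>

```python
import socket

# Assumption: hostname suffixes accepted as genuine Google crawlers.
# The thread only shows "googlebot.com"; "google.com" is added as a guess.
GOOGLE_SUFFIXES = (".googlebot.com", ".google.com")


def _reverse(ip):
    """Reverse DNS (PTR) lookup: IP address -> hostname."""
    return socket.gethostbyaddr(ip)[0]


def _forward(hostname):
    """Forward DNS lookup: hostname -> set of IP address strings."""
    return {info[4][0] for info in socket.getaddrinfo(hostname, None)}


def is_verified_googlebot(ip, reverse_lookup=None, forward_lookup=None):
    """Return True only if the PTR name is a Google hostname AND that
    hostname resolves back to the same IP (forward-confirmed reverse DNS).

    The lookup functions are injectable (hypothetical parameters) so the
    logic can be exercised without touching the network.
    """
    reverse_lookup = reverse_lookup or _reverse
    forward_lookup = forward_lookup or _forward
    try:
        hostname = reverse_lookup(ip).rstrip(".").lower()
    except OSError:
        # No PTR record (e.g. NXDOMAIN), as with the fake bot in the thread.
        return False
    if not hostname.endswith(GOOGLE_SUFFIXES):
        return False
    try:
        # A spoofer can publish any PTR name for their own IP, so the
        # forward lookup must map the name back to the original address.
        return ip in forward_lookup(hostname)
    except OSError:
        return False
```

<div dir="ltr">Calling <code>is_verified_googlebot("66.249.69.184")</code> with the default lookups performs real DNS queries, so in production you would want to cache the results instead of resolving on every request.</div>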