<div dir="ltr">That IP resolves to <a href="http://rate-limited-proxy-72-14-199-18.google.com">rate-limited-proxy-72-14-199-18.google.com</a> - this is not the Google search crawler, which is why it ignores your robots.txt. No one seems to know for sure what the rate-limited-proxy IPs are used for. They may represent random Chrome users using Google's data-saving feature, which would explain the varying user agents you see. Either way, they are probably best left unblocked, since each one could represent many end-user IPs. There may be an X-Forwarded-For header you could look at.<div><br></div><div>The Google search crawler will resolve to an IP like <a href="http://crawl-66-249-64-213.googlebot.com">crawl-66-249-64-213.googlebot.com</a>.</div></div><br><div class="gmail_quote"><div dir="ltr">On Mon, Jun 11, 2018 at 5:05 PM Francis Daly <<a href="mailto:francis@daoine.org">francis@daoine.org</a>> wrote:<br></div><blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex">On Thu, Jun 07, 2018 at 07:57:43PM -0400, shiz wrote:<br>
<br>
Hi there,<br>
<br>
> Recently, Google has started spidering my website and in addition to normal<br>
> pages, appended "&" to all urls, even the pages excluded by robots.txt<br>
> <br>
> e.g. page.php?page=aaa -> page.php?page=aaa&<br>
> <br>
> Any idea how to redirect/rewrite this?<br>
<br>
Untested, but:<br>
<br>
if ($args ~ "&$") { return 400; }<br>
<br>
should handle all requests whose query string ends in the trailing "&" you report.<br>
<br>
You may prefer a different response code.<br>
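For context (equally untested), the check might sit in a server block like this; the server_name and the choice of 444 here are illustrative assumptions, not something you reported:<br>

```nginx
server {
    listen 80;
    server_name example.com;  # hypothetical host

    # $args is the raw query string; "&$" anchors on its final character,
    # so this matches page.php?page=aaa& but not page.php?page=aaa
    if ($args ~ "&$") {
        return 444;  # nginx-specific: close the connection with no response
    }

    location / {
        # ... normal handling ...
    }
}
```

(A server-level "if" whose only action is "return" is one of the safe uses of "if".)<br>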
<br>
Good luck with it,<br>
<br>
f<br>
-- <br>
Francis Daly <a href="mailto:francis@daoine.org" target="_blank">francis@daoine.org</a><br>
_______________________________________________<br>
nginx mailing list<br>
<a href="mailto:nginx@nginx.org" target="_blank">nginx@nginx.org</a><br>
<a href="http://mailman.nginx.org/mailman/listinfo/nginx" rel="noreferrer" target="_blank">http://mailman.nginx.org/mailman/listinfo/nginx</a><br>
</blockquote></div>
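On the X-Forwarded-For idea: an untested sketch of logging that header next to the connecting address, so you can see whether the rate-limited-proxy hosts pass along an end-user IP. The log_format name and file path are made up for illustration; $http_x_forwarded_for is the standard nginx variable for the header.

```nginx
http {
    # Record X-Forwarded-For alongside the connecting IP and user agent.
    log_format xff_check '$remote_addr xff="$http_x_forwarded_for" '
                         '"$request" $status ua="$http_user_agent"';

    server {
        # Hypothetical log path; adjust to your layout.
        access_log /var/log/nginx/xff_check.log xff_check;
    }
}
```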