<div dir="ltr">That IP resolves to <a href="http://rate-limited-proxy-72-14-199-18.google.com">rate-limited-proxy-72-14-199-18.google.com</a> - this is not the Google search crawler, which is why it ignores your robots.txt. No one seems to know for sure what the rate-limited-proxy IPs are used for. They may represent random Chrome users using Google's data-saving feature, which would explain the varying user agents you see. Either way, they are probably best left unblocked, since each one could represent many end-user IPs. There may be an X-Forwarded-For header you could look at.<div><br></div><div>The Google search crawler will resolve to an IP like <a href="http://crawl-66-249-64-213.googlebot.com">crawl-66-249-64-213.googlebot.com</a>.</div></div><br><div class="gmail_quote"><div dir="ltr">On Mon, Jun 11, 2018 at 5:05 PM Francis Daly <<a href="mailto:francis@daoine.org">francis@daoine.org</a>> wrote:<br></div><blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex">On Thu, Jun 07, 2018 at 07:57:43PM -0400, shiz wrote:<br>
<br>
Hi there,<br>
<br>
> Recently, Google has started spidering my website and in addition to normal<br>
> pages, appended "&" to all urls, even the pages excluded by robots.txt<br>
> <br>
> e.g. page.php?page=aaa -> page.php?page=aaa&<br>
> <br>
> Any idea how to redirect/rewrite this?<br>
<br>
Untested, but:<br>
<br>
if ($args ~ "&$") { return 400; }<br>
<br>
should handle all requests whose query string ends in the trailing "&" you report.<br>
<br>
You may prefer a different response code.<br>
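For context (equally untested), the check might sit in a server block like this; the server_name and the choice of 444 here are illustrative assumptions, not something you reported:<br>

```nginx
server {
    listen 80;
    server_name example.com;  # hypothetical host

    # $args is the raw query string; "&$" anchors on its final character,
    # so this matches page.php?page=aaa& but not page.php?page=aaa
    if ($args ~ "&$") {
        return 444;  # nginx-specific: close the connection with no response
    }

    location / {
        # ... normal handling ...
    }
}
```

(A server-level "if" whose only action is "return" is one of the safe uses of "if".)<br>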
<br>
Good luck with it,<br>
<br>
f<br>
-- <br>
Francis Daly <a href="mailto:francis@daoine.org" target="_blank">francis@daoine.org</a><br>
_______________________________________________<br>
nginx mailing list<br>
<a href="mailto:nginx@nginx.org" target="_blank">nginx@nginx.org</a><br>
<a href="http://mailman.nginx.org/mailman/listinfo/nginx" rel="noreferrer" target="_blank">http://mailman.nginx.org/mailman/listinfo/nginx</a><br>
</blockquote></div>
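On the X-Forwarded-For idea: an untested sketch of logging that header next to the connecting address, so you can see whether the rate-limited-proxy hosts pass along an end-user IP. The log_format name and file path are made up for illustration; $http_x_forwarded_for is the standard nginx variable for the header.

```nginx
http {
    # Record X-Forwarded-For alongside the connecting IP and user agent.
    log_format xff_check '$remote_addr xff="$http_x_forwarded_for" '
                         '"$request" $status ua="$http_user_agent"';

    server {
        # Hypothetical log path; adjust to your layout.
        access_log /var/log/nginx/xff_check.log xff_check;
    }
}
```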