nginx serving wrong proxy content, static assets not affected
ekortright at ewtn.com
Tue Jan 3 20:54:48 UTC 2023
Both Rails and nginx are running in Docker containers. I am using nginx 1.23.1 running in a Docker image built from the official Debian Docker image (I only added certbot for TLS certificate processing). nginx connects to each proxy via HTTP using the internal service name defined in the Docker network by each docker-compose.yml file.
For some reason, some time after it starts, nginx gets confused and begins serving content from the wrong proxy. For instance, requesting a page from aaa.com returns the Rails content for bbb.com; requesting bbb.com returns content from ccc.com; and requesting ccc.com returns content from aaa.com. This problem only affects the proxy content; the static assets for aaa.com are returned as expected (and so on for all the other sites).
This is not a problem of nginx simply returning content from the wrong site. If you request a page from aaa.com, the dynamic content (HTML markup) comes from bbb.com, so the markup contains references to assets from bbb.com. Since those URLs are relative, the assets are requested from aaa.com and return 404 Not Found because they do not exist in aaa.com. If you change the URL manually and request them from bbb.com, they are returned with no problem, so it’s not that nginx can’t resolve the host name, just that the proxy content is being routed incorrectly.
Also, I don’t believe this is a problem with Docker. When nginx gets confused, I can run a shell in any Docker container and connect directly to Rails (by running a cURL command pointing to the proxy URL configured for each site in nginx), and I get the correct content every time. It is only when I request it through nginx that the content comes from a different site than the one requested.
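The manual check described above (direct request to Rails versus the same request through nginx) could be scripted roughly as follows. This is only a sketch under assumptions: the backend URLs in SITES (such as http://aaa_web:3000) are hypothetical placeholders for the proxy URLs configured in nginx, and the page title is assumed to differ between sites so it can serve as a fingerprint.

```python
import re
import urllib.request

# Hypothetical mapping: public host name -> internal Rails URL on the Docker network.
SITES = {
    "aaa.com": "http://aaa_web:3000",
    "bbb.com": "http://bbb_web:3000",
    "ccc.com": "http://ccc_web:3000",
}

def fetch(url, timeout=10):
    """Return the response body of a GET request as text."""
    with urllib.request.urlopen(url, timeout=timeout) as resp:
        return resp.read().decode("utf-8", errors="replace")

def page_title(html):
    """Extract the <title> text, or None if the page has no title."""
    m = re.search(r"<title[^>]*>(.*?)</title>", html, re.IGNORECASE | re.DOTALL)
    return m.group(1).strip() if m else None

def find_misrouted(sites, fetch=fetch):
    """Return {host: (title_via_nginx, title_direct)} for hosts whose titles differ."""
    mismatches = {}
    for host, backend in sites.items():
        # The public URL goes through nginx; the backend URL bypasses it.
        via_nginx = page_title(fetch("https://" + host + "/"))
        direct = page_title(fetch(backend + "/"))
        if via_nginx != direct:
            mismatches[host] = (via_nginx, direct)
    return mismatches
```

Calling find_misrouted(SITES) from a shell inside one of the containers would return an empty dict while routing is correct, and one entry per affected host once nginx starts returning another site's markup.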
I have no idea what triggers this behavior. Once it happens, the only thing that corrects it is restarting nginx. After a restart the server functions as expected again until the problem recurs, which could take minutes, hours, or days. Since I am using this setup on several production servers, at first I created a cron job to restart nginx every day, then every hour, and finally I decided to poll the sites on each server every five minutes, so that if the responses don’t look right I can restart nginx without having users experience a lengthy interruption.
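For illustration, the five-minute poll could look something like the sketch below. It is not the actual script in use: the marker strings in EXPECTED and the container name "nginx" are assumptions, and treating an unreachable site the same as a wrong response is a design choice that may not suit every setup.

```python
import subprocess
import urllib.request

# Hypothetical markers: a string that only appears in each site's own HTML.
EXPECTED = {
    "https://aaa.com/": "AAA Site",
    "https://bbb.com/": "BBB Site",
    "https://ccc.com/": "CCC Site",
}

def fetch(url, timeout=10):
    """Return the response body of a GET request as text."""
    with urllib.request.urlopen(url, timeout=timeout) as resp:
        return resp.read().decode("utf-8", errors="replace")

def wrong_sites(expected, fetch=fetch):
    """Return the URLs whose response lacks the expected marker.

    A URL that cannot be fetched at all is also reported.
    """
    bad = []
    for url, marker in expected.items():
        try:
            body = fetch(url)
        except OSError:  # URLError, HTTPError and socket timeouts are OSError subclasses
            bad.append(url)
            continue
        if marker not in body:
            bad.append(url)
    return bad

def restart_nginx():
    # Assumes the nginx container is named "nginx"; adjust to the real name.
    subprocess.run(["docker", "restart", "nginx"], check=True)

def check_once(expected, fetch=fetch, restart=restart_nginx):
    """Poll every site once and restart nginx if any response looks wrong."""
    bad = wrong_sites(expected, fetch)
    if bad:
        restart()
    return bad
```

A small script that calls check_once(EXPECTED) can then be scheduled from cron every five minutes. Logging the returned list with a timestamp would also record when the misrouting starts, which may help correlate it with other events on the host.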
I have confirmed that this problem occurs on more than one server (although on one server I have only observed it once). I have also set up a staging server that is as close to one of the production servers as possible, but so far the problem has not occurred there (since this staging server does not get any traffic, the problem may never surface if it is triggered by a particular kind of incoming request).
I have asked a question at Server Fault (https://serverfault.com/questions/1117412/nginx-serving-content-from-wrong-proxy), where I have posted sanitized versions of the nginx configuration for two sample sites. I can post the same configuration here, or any other configuration that might help diagnose this. I have never seen nginx behave like this before (I have used it to host multiple Rails sites for years without any problems, though never before in combination with Docker).
Any suggestions as to what to look for or what to try would be most appreciated.
EWTN Online Services
(205) 332-4835 (cell)