Nginx reverse proxy crash when dns unavailable
Maxim Dounin
mdounin at mdounin.ru
Fri Oct 23 04:13:30 MSD 2009
Hello!
On Thu, Oct 22, 2009 at 03:53:32PM -0400, masom wrote:
> But shouldn't nginx start anyway if the end point is not responding and just try to reach it anyway?
>
> I can't really see why it would need to stop or crash when either the endpoint (apache) or the dns system is unavailable.
>
> Yes it should display 5xx errors saying the endpoint is unreachable (dns or server failure / not-responding) but nginx should not "lock up" after 1 bad answer.
>
> Current problem:
>
> unit starts
> dhcp kicks in
> nginx gets started before dhcp process is completed
> nginx realizes that content.dev.local is not reachable (dns settings are not yet set by dhcp)
> nginx exits
> Browser on unit starts, says address is unreachable (as nginx did not start).
>
>
> Shouldn't nginx just attempt to connect to the end point as requests are coming in?
Probably I didn't explain it well enough.
When nginx has something it can attempt to connect to, it will
happily work. But if name resolution fails during configuration
parsing, it simply doesn't have an ip to connect to.
When you write in the config something like
    location /pass-to-backend/ {
        proxy_pass http://backend;
    }
the hostname "backend" is resolved during config parsing via the
standard function gethostbyname(). This function is blocking and
therefore can't be used during request processing in nginx workers,
as it would block all clients for an unknown period of time. So this
function is used only during config parsing: the hostname "backend"
is resolved to ip address[es], and later, during request processing,
these addresses are used without further DNS lookups.
If "backend" can't be resolved during config parsing, there are
basically two options:
1. Work as is, always returning 502 when a user tries to access a uri
that should be proxied. We have no ip to connect() to, remember?
2. Refuse to start, assuming the administrator will fix the problem
and start us normally.
Option (1) is probably better in situations where you have an
improperly configured system, without any reliability implemented,
that has to start unattended at any cost and do at least
something.
But it's not really wise to do (1) in a normal situation. It will
basically start the service in a broken and almost undetectable state.
Consider that it's part of a big cluster: a new node comes up and
seems to work, but for some requests it returns errors for no
apparent reason. It's an administrative nightmare.
On the other hand, during reconfiguration, configuration testing,
binary upgrade and other attended operations the only sensible
thing to do is certainly (2). You wrote a hostname in the config that
can't be resolved - it's just a configuration error.
Note well: there is quite a different mode of proxy_pass
operation, proxy_pass with variables, which may use nginx's
internal async resolver. In this mode nginx won't try to
resolve hostnames during configuration parsing, and nginx will
start perfectly even when dns isn't available. But this
a) requires additional configuration (you have to configure the ip of
your DNS server via the resolver directive);
b) is much more resource consuming;
c) relies on the internal nginx resolver, which is known to have
problems, at least in the stable branch.
Therefore I can't recommend using it in production.
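For illustration only, a minimal sketch of the proxy_pass-with-variables
mode described above might look like the following. The resolver address
192.0.2.53 and the variable name $backend_host are assumptions made up
for this example, not values from the original setup; the same caveats
(a), (b) and (c) apply.

```nginx
# Assumed DNS server ip; replace with the ip of your own DNS server.
resolver 192.0.2.53;

location /pass-to-backend/ {
    # Hypothetical variable holding the backend hostname. Because
    # proxy_pass below contains a variable, "backend" is looked up by
    # the internal resolver at request time, not during config parsing.
    set $backend_host backend;
    proxy_pass http://$backend_host;
}
```

With a configuration along these lines nginx should start even when dns
isn't available; requests to this location would then fail with an
error until the name can be resolved.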
> The solution we are considering is a hosts file that would always point to a static ip for the content server, but that would be a bit of a management problem as we are deploying in several different locations with different networks.
I don't really understand why you can't just impose the correct
prerequisites before starting nginx. It's not really hard to wait
until the network comes up.
Maxim Dounin