UDP load balancer does not scale
ajmalahd
nginx-forum at forum.nginx.org
Tue May 16 07:10:47 UTC 2017
Hi,

I am trying to set up a UDP load balancer using Nginx. Initially, I
configured 4 upstream servers with two server processes running on each of
them. This setup gave a throughput of around 24,000 queries per second when
tested with dnsperf. When I try to add two more upstream servers, the
throughput does not increase as expected; in fact, it deteriorates to
around 5,000 queries per second, with the following errors:
[warn] 5943#0: *10433175 upstream server temporarily disabled while proxying
connection, udp client: xxx.xxx.xxx.29, server: 0.0.0.0:53, upstream:
"xxx.xxx.xxx.224:53", bytes from/to client:80/0, bytes from/to
upstream:0/80
[error] 5943#0: *10085077 no live upstreams while connecting to upstream,
udp client: xxx.xxx.xxx.224, server: 0.0.0.0:53, upstream: "dns_upstreams",
bytes from/to client:80/0, bytes from/to upstream:0/0
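For reference, the dnsperf invocation I use looks roughly like the
following (the target address, query file, and client count here are
placeholders rather than my exact values):

    dnsperf -s <load-balancer-ip> -p 53 -d queries.txt -c 10 -l 60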
My understanding is that these errors appear when Nginx doesn't receive
responses from an upstream in time, and the upstream is then temporarily
marked as unavailable. I used to get these errors even with 4 upstream
servers, but the problem was resolved after I added the max_fails and
fail_timeout settings shown in the configuration below:
user nginx;
worker_processes 4;
worker_rlimit_nofile 65535;

load_module "/usr/lib64/nginx/modules/ngx_stream_module.so";

error_log /var/log/nginx/error.log warn;
pid /var/run/nginx.pid;

events {
    worker_connections 10240;
}

stream {
    upstream dns_upstreams {
        server xxx.xxx.xxx.0:53 max_fails=2000 fail_timeout=30s;
        server xxx.xxx.xxx.0:6363 max_fails=2000 fail_timeout=0s;
        server xxx.xxx.xxx.187:53 max_fails=2000 fail_timeout=30s;
        server xxx.xxx.xxx.187:6363 max_fails=2000 fail_timeout=30s;
        server xxx.xxx.xxx.183:53 max_fails=2000 fail_timeout=30s;
        server xxx.xxx.xxx.183:6363 max_fails=2000 fail_timeout=30s;
        server xxx.xxx.xxx.212:53 max_fails=2000 fail_timeout=30s;
        server xxx.xxx.xxx.212:6363 max_fails=2000 fail_timeout=30s;
    }

    server {
        listen 53 udp;
        proxy_pass dns_upstreams;
        proxy_timeout 1s;
        proxy_responses 1;
    }
}
Even though this configuration works fine with 4 upstream servers, it
doesn't help when I increase the number of servers.
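One variation I am considering, to rule out the passive health checks
themselves as the cause, is disabling failure accounting entirely; as far
as I understand from the nginx docs, max_fails=0 does that. A minimal
sketch, showing only two of the servers:

    upstream dns_upstreams {
        # max_fails=0 disables failure accounting, so nginx never marks
        # these servers as temporarily unavailable
        server xxx.xxx.xxx.187:53 max_fails=0;
        server xxx.xxx.xxx.187:6363 max_fails=0;
        # ... remaining servers configured the same way
    }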
The Nginx server has plenty of memory and CPU capacity to spare, both with
4 upstream servers and with 6. The dnsperf client is not the bottleneck
here either, because it can generate a much higher load in a different
setup. Also, each individual upstream server can serve a bit more than
5,000 requests per second on its own.
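I also wonder whether the listening side, rather than the upstreams, could
be the limit, since all 4 workers share a single UDP listening socket. If I
understand the docs correctly, the reuseport parameter (nginx 1.9.1+ on a
kernel with SO_REUSEPORT support, Linux 3.9+) gives each worker its own
listening socket. A sketch of the server block I would test, keeping the
rest of my configuration unchanged:

    server {
        # reuseport creates one listening socket per worker process,
        # letting the kernel distribute incoming UDP packets between them
        listen 53 udp reuseport;
        proxy_pass dns_upstreams;
        proxy_timeout 1s;
        proxy_responses 1;
    }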
I am trying to get some hints about why I am seeing more upstream failures,
and eventually total unavailability, when I add more servers. If anybody
has faced a similar issue in the past and can give me some pointers to
solve it, that would be of great help.
Thanks,
Ajmal
Posted at Nginx Forum: https://forum.nginx.org/read.php?2,274257,274257#msg-274257