Fwd: [ module ] Add http upstream keep alive timeout parameter
Maxim Dounin
mdounin at mdounin.ru
Mon Nov 13 19:49:46 UTC 2017
Hello!
On Sun, Nov 12, 2017 at 11:25:20PM +1100, Wei Xu wrote:
> We are running Nginx and upstream on the same machine using Docker, so
> there's no firewall.
Note that this isn't usually true. Docker uses iptables implicitly,
and unless you have specifically checked your iptables configuration,
most likely you are using a firewall.
> I did a test locally and captured the network packets.
>
> For the normal requests, the upstream sends a [FIN, ACK] to nginx after the
> keep-alive timeout (500 ms), nginx sends a [FIN, ACK] back, and then the
> upstream sends an [ACK] to close the connection completely.
[...]
> For a more detailed description of the test process, you can refer to my
> post at:
> https://theantway.com/2017/11/analyze-connection-reset-error-in-nginx-upstream-with-keep-alive-enabled/
The test demonstrates that it is indeed possible to trigger the
problem in question. Unfortunately, it doesn't prove that what you
observed in production is the same issue.

While it is more or less clear that the race condition in question
is real, it seems to be very unlikely with typical workloads. And
even when it is triggered, in most cases nginx handles it well
enough, retrying the request as configured by proxy_next_upstream.
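Just for reference, that retry behaviour looks like this in the
configuration (a sketch only, using the upstream block from the test
configuration below; the "error timeout" value shown is the stock
default of the proxy module):

    location / {
        proxy_pass http://u;

        # on an error while working with the upstream (e.g., the cached
        # connection was already closed by the backend), nginx retries
        # the request; "error timeout" is the default value
        proxy_next_upstream error timeout;
    }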
Nevertheless, thank you for the detailed testing. A simple test case
that reliably demonstrates the race is appreciated, and I was able
to reduce it to your client script and nginx with the following
trivial configuration:
    upstream u {
        server 127.0.0.1:8082;

        keepalive 10;
    }

    server {
        listen 8080;

        location / {
            proxy_pass http://u;
            proxy_http_version 1.1;
            proxy_set_header Connection "";
        }
    }

    server {
        listen 8082;

        keepalive_timeout 500ms;

        location / {
            return 200 "ok\n";
        }
    }
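For completeness, a rough equivalent of such a client (an assumption
on my side, not your exact script): it opens a new connection to nginx
for each request and paces the requests at about the backend's 500ms
keepalive timeout, so that nginx tends to reuse the cached upstream
connection just as the backend is closing it:

    /* hypothetical test client, not the original script */

    #include <string.h>
    #include <unistd.h>
    #include <netdb.h>
    #include <sys/socket.h>

    static const char req[] =
        "GET / HTTP/1.1\r\nHost: localhost\r\nConnection: close\r\n\r\n";

    int main(void)
    {
        char             buf[4096];
        struct addrinfo  hints, *res;
        int              i, s;

        memset(&hints, 0, sizeof(hints));
        hints.ai_family = AF_INET;
        hints.ai_socktype = SOCK_STREAM;

        if (getaddrinfo("127.0.0.1", "8080", &hints, &res) != 0) {
            return 1;
        }

        for (i = 0; i < 10000; i++) {
            s = socket(res->ai_family, res->ai_socktype, res->ai_protocol);
            if (s == -1) {
                break;
            }

            if (connect(s, res->ai_addr, res->ai_addrlen) == 0) {
                (void) write(s, req, sizeof(req) - 1);

                /* drain the response; on a lost race which nginx fails
                 * to recover from, a 502 arrives here, and nginx's error
                 * log shows "Connection reset by peer" for the upstream */
                while (read(s, buf, sizeof(buf)) > 0) { /* void */ }
            }

            close(s);

            usleep(500000);    /* ~500ms, matching the backend timeout */
        }

        freeaddrinfo(res);
        return 0;
    }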
> To fix the issue, I tried to add a timeout for kept-alive upstream
> connections, and you can check the patch at:
> https://github.com/weixu365/nginx/blob/docker-1.13.6/docker/stretch/patches/01-http-upstream-keepalive-timeout.patch
>
> The patch is what I am currently testing with, and I can create a different
> format if you need.
The patch looks good enough for testing, though there are various
minor issues; notably, it tests the timeout against NGX_CONF_UNSET_MSEC
at runtime, and it uses the wrong type for the timeout while parsing
the configuration (time_t instead of ngx_msec_t).
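To illustrate (a sketch only, with a shortened hypothetical struct;
see the patch below for the real code): the field should be declared
as ngx_msec_t so that ngx_conf_set_msec_slot() parses into the right
type, and the default should be applied once at init time:

    typedef struct {
        ngx_msec_t  timeout;    /* milliseconds, filled in by
                                   ngx_conf_set_msec_slot() */
    } example_srv_conf_t;       /* hypothetical, shortened name */

    /* in create_conf(): mark the value as not yet set */
    conf->timeout = NGX_CONF_UNSET_MSEC;

    /* at init time: apply the default once, so there is no need to
     * test for NGX_CONF_UNSET_MSEC at runtime on each request */
    ngx_conf_init_msec_value(conf->timeout, 60000);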
Also, I tend to think that a separate keepalive_timeout directive
will be easier to use, and we probably want to introduce some
default value for it.
Please take a look and check whether the following patch works for you:
# HG changeset patch
# User Maxim Dounin <mdounin at mdounin.ru>
# Date 1510601341 -10800
# Mon Nov 13 22:29:01 2017 +0300
# Node ID 9ba0a577601b7c1b714eb088bc0b0d21c6354699
# Parent 6f592a42570898e1539d2e0b86017f32bbf665c8
Upstream keepalive: keepalive_timeout directive.
The directive configures the maximum time a connection can be kept in
the cache. By configuring a time smaller than the corresponding timeout
on the backend side, one can avoid the race between the backend closing
a connection and nginx trying to use the same connection to send a
request at the same time.
diff --git a/src/http/modules/ngx_http_upstream_keepalive_module.c b/src/http/modules/ngx_http_upstream_keepalive_module.c
--- a/src/http/modules/ngx_http_upstream_keepalive_module.c
+++ b/src/http/modules/ngx_http_upstream_keepalive_module.c
@@ -12,6 +12,7 @@
 
 typedef struct {
     ngx_uint_t                         max_cached;
+    ngx_msec_t                         timeout;
 
     ngx_queue_t                        cache;
     ngx_queue_t                        free;
@@ -84,6 +85,13 @@ static ngx_command_t  ngx_http_upstream_
       0,
       NULL },
 
+    { ngx_string("keepalive_timeout"),
+      NGX_HTTP_UPS_CONF|NGX_CONF_TAKE1,
+      ngx_conf_set_msec_slot,
+      NGX_HTTP_SRV_CONF_OFFSET,
+      offsetof(ngx_http_upstream_keepalive_srv_conf_t, timeout),
+      NULL },
+
       ngx_null_command
 };
 
@@ -141,6 +149,8 @@ ngx_http_upstream_init_keepalive(ngx_con
 
     us->peer.init = ngx_http_upstream_init_keepalive_peer;
 
+    ngx_conf_init_msec_value(kcf->timeout, 60000);
+
     /* allocate cache items and add to free queue */
 
     cached = ngx_pcalloc(cf->pool,
@@ -261,6 +271,10 @@ found:
     c->write->log = pc->log;
     c->pool->log = pc->log;
 
+    if (c->read->timer_set) {
+        ngx_del_timer(c->read);
+    }
+
     pc->connection = c;
     pc->cached = 1;
 
@@ -339,9 +353,8 @@ ngx_http_upstream_free_keepalive_peer(ng
 
     pc->connection = NULL;
 
-    if (c->read->timer_set) {
-        ngx_del_timer(c->read);
-    }
+    ngx_add_timer(c->read, kp->conf->timeout);
+
     if (c->write->timer_set) {
         ngx_del_timer(c->write);
     }
@@ -392,7 +405,7 @@ ngx_http_upstream_keepalive_close_handle
 
     c = ev->data;
 
-    if (c->close) {
+    if (c->close || c->read->timedout) {
         goto close;
     }
 
@@ -485,6 +498,8 @@ ngx_http_upstream_keepalive_create_conf(
      *     conf->max_cached = 0;
      */
 
+    conf->timeout = NGX_CONF_UNSET_MSEC;
+
     return conf;
 }
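Just in case, with the patch the directive is used in the upstream
block, e.g. (the values are arbitrary; the point is to keep nginx's
timeout below the one configured on the backend):

    upstream u {
        server 127.0.0.1:8082;

        keepalive 10;

        # must be smaller than the backend's keepalive timeout;
        # defaults to 60s if not set
        keepalive_timeout 50s;
    }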
--
Maxim Dounin
http://mdounin.ru/