Fwd: [ module ] Add http upstream keep alive timeout parameter
Wei Xu
weixu365 at gmail.com
Tue Nov 14 03:03:04 UTC 2017
Hi,
Really nice, much simpler than my patch. It's great to have a default
timeout value. Thanks for your time.
On Tue, Nov 14, 2017 at 6:49 AM, Maxim Dounin <mdounin at mdounin.ru> wrote:
> Hello!
>
> On Sun, Nov 12, 2017 at 11:25:20PM +1100, Wei Xu wrote:
>
> > We are running Nginx and upstream on the same machine using docker, so
> > there's no firewall.
>
> Note that this isn't usually true. Docker uses iptables
> implicitly, and unless you have specifically checked your iptables
> configuration, you are most likely using a firewall.
>
> > I did a test locally and captured the network packages.
> >
> > For the normal requests, the upstream sends a [FIN, ACK] to nginx after
> > the keep-alive timeout (500 ms), nginx sends a [FIN, ACK] back, and then
> > the upstream sends an [ACK] to close the connection completely.
>
> [...]
>
> > For a more detailed description of the test process, see my post at:
> > https://theantway.com/2017/11/analyze-connection-reset-error-in-nginx-upstream-with-keep-alive-enabled/
>
> The test demonstrates that it is indeed possible to trigger the
> problem in question. Unfortunately, it doesn't prove that what you
> observed in production is the same issue.
>
> While it is more or less clear that the race condition in question
> is real, it seems to be very unlikely with typical workloads. And
> even when triggered, nginx in most cases handles it well enough by
> re-trying the request per proxy_next_upstream.
>
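The retry behaviour referred to above is controlled by the
proxy_next_upstream directive. A minimal sketch of the relevant
location block; the "error timeout" value shown is nginx's documented
default rather than something taken from this thread, and the upstream
name "u" matches the test configuration quoted below:

    location / {
        proxy_pass http://u;
        proxy_http_version 1.1;
        proxy_set_header Connection "";

        # On a connection error or timeout while talking to the selected
        # server, retry the request on the next server of the upstream
        # group. This default retry is what usually hides the keepalive
        # race from clients.
        proxy_next_upstream error timeout;
    }
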
> Nevertheless, thank you for the detailed testing. A simple test case
> that reliably demonstrates the race is appreciated, and I was able
> to reduce it to your client script and nginx with the following
> trivial configuration:
>
>     upstream u {
>         server 127.0.0.1:8082;
>         keepalive 10;
>     }
>
>     server {
>         listen 8080;
>
>         location / {
>             proxy_pass http://u;
>             proxy_http_version 1.1;
>             proxy_set_header Connection "";
>         }
>     }
>
>     server {
>         listen 8082;
>
>         keepalive_timeout 500ms;
>
>         location / {
>             return 200 ok\n;
>         }
>     }
>
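To spell out why this configuration reproduces the race: the upstream
block keeps up to 10 idle connections to 127.0.0.1:8082 cached with no
expiry, while the backend server on port 8082 closes idle keepalive
connections after 500ms. A request that re-uses a cached connection at
roughly the 500ms mark can be written just as the backend sends its
FIN, producing the connection-reset error discussed above.
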
> > To fix the issue, I tried to add a timeout for kept-alive upstream
> > connections, and you can check the patch at:
> > https://github.com/weixu365/nginx/blob/docker-1.13.6/docker/stretch/patches/01-http-upstream-keepalive-timeout.patch
> >
> > The patch is the one I use for my current testing, and I can prepare
> > it in a different format if you need.
>
> The patch looks good enough for testing, though there are various
> minor issues: notably, it tests the timeout against NGX_CONF_UNSET_MSEC
> at runtime, and it uses the wrong type for the timeout while parsing
> (time_t instead of ngx_msec_t).
>
> Also I tend to think that using a separate keepalive_timeout
> directive should be easier, and we probably want to introduce some
> default value for it.
>
> Please take a look if the following patch works for you:
>
> # HG changeset patch
> # User Maxim Dounin <mdounin at mdounin.ru>
> # Date 1510601341 -10800
> # Mon Nov 13 22:29:01 2017 +0300
> # Node ID 9ba0a577601b7c1b714eb088bc0b0d21c6354699
> # Parent 6f592a42570898e1539d2e0b86017f32bbf665c8
> Upstream keepalive: keepalive_timeout directive.
>
> The directive configures the maximum time a connection can be kept in
> the cache. By configuring a time which is smaller than the
> corresponding timeout on the backend side, one can avoid the race
> between the backend closing a connection and nginx trying to use the
> same connection to send a request at the same time.
>
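As a usage illustration (not part of the patch itself): the new
directive is set inside an upstream{} block, with a value below the
backend's own keepalive timeout. A minimal sketch, assuming the 500ms
backend from the test configuration above; the 300ms value is an
arbitrary example:

    upstream u {
        server 127.0.0.1:8082;
        keepalive 10;

        # Drop cached idle connections well before the backend's 500ms
        # keepalive timeout, so nginx never re-uses a connection the
        # backend is about to close.
        keepalive_timeout 300ms;
    }

Mechanically, the patch arms a read timer when a connection is placed
into the cache, deletes the timer when the connection is re-used, and
closes the cached connection when the timer fires; an unset timeout
defaults to 60 seconds via ngx_conf_init_msec_value().
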
> diff --git a/src/http/modules/ngx_http_upstream_keepalive_module.c b/src/http/modules/ngx_http_upstream_keepalive_module.c
> --- a/src/http/modules/ngx_http_upstream_keepalive_module.c
> +++ b/src/http/modules/ngx_http_upstream_keepalive_module.c
> @@ -12,6 +12,7 @@
>
>  typedef struct {
>      ngx_uint_t                         max_cached;
> +    ngx_msec_t                         timeout;
>
>      ngx_queue_t                        cache;
>      ngx_queue_t                        free;
> @@ -84,6 +85,13 @@ static ngx_command_t  ngx_http_upstream_
>        0,
>        NULL },
>
> +    { ngx_string("keepalive_timeout"),
> +      NGX_HTTP_UPS_CONF|NGX_CONF_TAKE1,
> +      ngx_conf_set_msec_slot,
> +      NGX_HTTP_SRV_CONF_OFFSET,
> +      offsetof(ngx_http_upstream_keepalive_srv_conf_t, timeout),
> +      NULL },
> +
>      ngx_null_command
>  };
>
> @@ -141,6 +149,8 @@ ngx_http_upstream_init_keepalive(ngx_con
>
>      us->peer.init = ngx_http_upstream_init_keepalive_peer;
>
> +    ngx_conf_init_msec_value(kcf->timeout, 60000);
> +
>      /* allocate cache items and add to free queue */
>
>      cached = ngx_pcalloc(cf->pool,
> @@ -261,6 +271,10 @@ found:
>      c->write->log = pc->log;
>      c->pool->log = pc->log;
>
> +    if (c->read->timer_set) {
> +        ngx_del_timer(c->read);
> +    }
> +
>      pc->connection = c;
>      pc->cached = 1;
>
> @@ -339,9 +353,8 @@ ngx_http_upstream_free_keepalive_peer(ng
>
>      pc->connection = NULL;
>
> -    if (c->read->timer_set) {
> -        ngx_del_timer(c->read);
> -    }
> +    ngx_add_timer(c->read, kp->conf->timeout);
> +
>      if (c->write->timer_set) {
>          ngx_del_timer(c->write);
>      }
> @@ -392,7 +405,7 @@ ngx_http_upstream_keepalive_close_handle
>
>      c = ev->data;
>
> -    if (c->close) {
> +    if (c->close || c->read->timedout) {
>          goto close;
>      }
>
> @@ -485,6 +498,8 @@ ngx_http_upstream_keepalive_create_conf(
>       * conf->max_cached = 0;
>       */
>
> +    conf->timeout = NGX_CONF_UNSET_MSEC;
> +
>      return conf;
>  }
>
>
> --
> Maxim Dounin
> http://mdounin.ru/