[PATCH] HTTP: trigger lingering close when keepalive connection will be closed

Fri Jan 27 12:11:34 UTC 2023

Hi,

> 2023年1月27日 12:01，Maxim Dounin <mdounin at mdounin.ru> 写道：
> 
> Hello!
> 
> On Wed, Jan 25, 2023 at 12:49:10PM +0800, Miao Wang wrote:
> 
>>> 2023年1月25日 10:17，Maxim Dounin <mdounin at mdounin.ru> 写道：
>>> 
>>> On Mon, Jan 23, 2023 at 07:01:16PM +0800, Miao Wang wrote:
>>> 
>>>>> 2023年1月23日 12:05，Maxim Dounin <mdounin at mdounin.ru> 写道：
>>>>> 
>>>>> On Wed, Jan 18, 2023 at 11:28:52PM +0800, Miao Wang wrote:
>>>>> 
>>>>>> # HG changeset patch
>>>>>> # User Miao Wang <shankerwangmiao at gmail.com>
>>>>>> # Date 1674055068 -28800
>>>>>> #      Wed Jan 18 23:17:48 2023 +0800
>>>>>> # Node ID 73aa64bd29f3dec9e43e97560d6b5a07cdf40063
>>>>>> # Parent  07b0bee87f32be91a33210bc06973e07c4c1dac9
>>>>>> HTTP: trigger lingering close when keepalive connection will be closed
>>>>>> 
>>>>>> When finalizing a request, if the request is not keepalive but
>>>>>> its connection has served more than one request, then the connection
>>>>>> has been a keepalive connection previously and this connection will
>>>>>> be closed after this response. In this condition, it is likely that
>>>>>> there are pipelined requests following this request, which we should
>>>>>> ignore. As a result, lingering close is necessary in this case.
>>>>>> 
>>>>>> Without this patch, nginx (with its default configuration) will send
>>>>>> out TCP RST when there are more pipelined requests. The symptom is
>>>>>> obvious when nginx is serving a debian repository and apt is
>>>>>> downloading massive of packages. See [1]. It becomes more obvious
>>>>>> when `keepalive_requests` is lower or nginx is under a relative
>>>>>> higher load, and it disappears when specifying
>>>>>> `lingering_close always`.
>>>>>> 
>>>>>> [1]: https://bugs.debian.org/cgi-bin/bugreport.cgi?bug=973861#10
>>>>>> 
>>>>>> diff -r 07b0bee87f32 -r 73aa64bd29f3 src/http/ngx_http_request.c
>>>>>> --- a/src/http/ngx_http_request.c Wed Dec 21 14:53:27 2022 +0300
>>>>>> +++ b/src/http/ngx_http_request.c Wed Jan 18 23:17:48 2023 +0800
>>>>>> @@ -2749,6 +2749,10 @@
>>>>>>       return;
>>>>>>   }
>>>>>> 
>>>>>> +    if (!r->keepalive && r->connection->requests > 1) {
>>>>>> +        r->lingering_close = 1;
>>>>>> +    }
>>>>>> +
>>>>>>   if (clcf->lingering_close == NGX_HTTP_LINGERING_ALWAYS
>>>>>>       || (clcf->lingering_close == NGX_HTTP_LINGERING_ON
>>>>>>           && (r->lingering_close
>>>>> 
>>>>> Thanks for the patch and the link to the Debian bug report.
>>>>> 
>>>>> Lingering close implies noticeable additional resource usage: even 
>>>>> if nothing happens on the connection, it will be kept open for 
>>>>> lingering_timeout, which is 5 seconds by default.  Given that 
>>>>> pipelining is not used by most of the clients, forcing lingering 
>>>>> close for all clients which are using keepalive does not look like 
>>>>> a good solution to me.
>>>>> 
>>>>> In general, nginx tries hard to determine if any additional data 
>>>>> are expected on the connection, and uses lingering close if there 
>>>>> is a good chance there will be some, but avoids lingering close by 
>>>>> default if additional data are unlikely.  If this logic does not 
>>>>> work for some reason, lingering close can be explicitly requested 
>>>>> with "lingering_close always;".
>>>> 
>>>> That's true since the symptom I described can be worked around with
>>>> that option.
>>>> 
>>>>> 
>>>>> In particular, note the "r->header_in->pos < r->header_in->last" 
>>>>> and "r->connection->read->ready" checks - these are expected to 
>>>>> catch connections with additional pipelined requests (see revision
>>>>> 3981:77604e9a1ed8).  And from the log provided in the report it 
>>>>> looks like it works most of the time - there are more than 6k HTTP 
>>>>> requests, and 60+ connections.  But sometimes it fails - there are 
>>>>> two RST errors logged (and one "Undetermined Error", which looks 
>>>>> like a bug in apt, but might be related).
>>>>> 
>>>>> It looks like when apt is downloading many resources, it does not 
>>>>> send all the requests at once (or in batches), but instead tries 
>>>>> to maintain a constant "depth", a number of pipelined requests in 
>>>>> flight.  This essentially means that after reading of a response 
>>>>> it sends an additional request.
>>>> 
>>>> That's right. From a traffic dump, I can see apt first sends one
>>>> request, and after receiving the response, it will send out 10
>>>> more requests, and maintain a depth of 10, since by default
>>>> Acquire::http::Pipeline-Depth is 10.
>>>> 
>>>>> 
>>>>> I see at least two possible cases which can result in nginx not 
>>>>> using lingering close with such a load:
>>>>> 
>>>>> 1.  If a response where keepalive_requests is reached happens to 
>>>>> be the last request in the r->header_in buffer (so the 
>>>>> "r->header_in->pos < r->header_in->last" won't be effective), and 
>>>>> there is a chance that nginx wasn't yet got an event from kernel 
>>>>> about additional data (and therefore "r->connection->read->ready" 
>>>>> will not be set).  As such, nginx won't use lingering close, and 
>>>>> might close connection with unread data in the socket buffer, 
>>>>> resulting in RST.
>>>>> 
>>>>> 2.  Similarly, if nginx happens to be faster than apt, and socket 
>>>>> buffers are large enough, it might sent all the responses, 
>>>>> including the last one with "Connection: close", and close the 
>>>>> connection (since there are no pending pipelined requests at the 
>>>>> moment) even before an additional request is sent by apt.  When 
>>>>> later apt will send an additional request after reading some of 
>>>>> the responses, it will send the request to already closed 
>>>>> connection, again resulting in RST.
>>>> 
>>>> Actually, comparing the debug log and the pcap, nginx calls
>>>> close() after writing the last response. However, at that time,
>>>> that response is not fully transmitted to the client and there
>>>> seems to be more requests not processed in the kernel buffer.
>>>> Thus close() triggers an immediate RST.
>>> 
>>> Thanks for the details.  This looks more like the first case, and 
>>> probably can be addressed by improving likelihood of detecting the 
>>> read event.
>>> 
>>> Could you please test if the patch below fixes the particular 
>>> issue you are seeing?  It is somewhat unrelated, but it might be
>>> a good enough solution (and is more or less equivalent to 
>>> checking r->pipeline).
>>> 
>>>>> It would be interesting to see more details, such as tcpdump 
>>>>> and/or nginx debug logs, to find out what actually goes on here.
>>>> 
>>>> The tcpdump and debug logs are too large to send in this mail list.
>>>> I wonder if I can directly email it to you.
>>> 
>>> Feel free to, my email should accept up to 100M messages.  
>>> Alternatively, a good solution might be to make the files 
>>> available for download and post a link here.
>>> 
>>>>> Overall, given how apt uses pipelining, I tend to think that at 
>>>>> least (2) is unavoidable and can happen with certain sizes of the 
>>>>> responses.
>>>>> 
>>>>> A good enough solution might be check for r->pipeline, which is 
>>>>> set by nginx as long as it reads a pipelined request.  It might 
>>>>> not be enough though, since r->pipeline is only set for requests 
>>>>> seen by nginx as pipelined, which might not be true for the last 
>>>>> request.
>>>>> 
>>>>> A more complete solution might be to introduce something like 
>>>>> c->pipeline flag and use lingering close if any pipelined requests 
>>>>> were seen on the connection.
>>> 
>>> The following patch reworks handling of pipelined requests by 
>>> postponing them to the next event loop iteration.  It is expected 
>>> make it more likely for nginx to know there are any additional 
>>> unread data in the socket buffer (and right now is mostly 
>>> equivalent to checking r->pipeline, since c->read->ready is always 
>>> set for pipelined requests):
>>> 
>>> # HG changeset patch
>>> # User Maxim Dounin <mdounin at mdounin.ru>
>>> # Date 1674610218 -10800
>>> #      Wed Jan 25 04:30:18 2023 +0300
>>> # Node ID 8cfd22c325a3db370b9e45aa6f897ff7bc8222f3
>>> # Parent  c7e103acb409f0352cb73997c053b3bdbb8dd5db
>>> Reworked pipelined requests to use posted next events.
>>> 
>>> This is expected to improve handling of pipelined requests in a number
>>> of ways, including:
>>> 
>>> 1) It will make a room for additional requests from other clients,
>>> reducing worker monopolization by a single client.
>>> 
>>> 2) The c->read->ready flag will be set, so nginx will either read the
>>> additional data, or will use lingering close.  This is expected to help
>>> with clients using pipelining with some constant depth, such as apt[1][2].
>>> 
>>> The ngx_event_move_posted_next() was modified to make it possible to
>>> post read events on connections with kqueue.  Previously, it used to
>>> set ev->available to -1, potentially overwriting a valid positive value
>>> provided by kqueue, so ngx_unix_recv() and ngx_readv_chain() will stop
>>> reading from the socket before reading all the data available.
>>> 
>>> Note that currently ngx_event_move_posted_next() will always set
>>> the ev->ready flag.  While this is expected behaviour for the ev->available
>>> use case (where ev->ready is explicitly cleared), this is not needed for
>>> pipelining.  For pipelining, this will result in extra unneeded read()
>>> syscall after processing of all pipelined requests, and there might be
>>> a room for improvement here.
>>> 
>>> [1] https://bugs.debian.org/cgi-bin/bugreport.cgi?bug=973861#10
>>> [2] https://mailman.nginx.org/pipermail/nginx-devel/2023-January/ZA2SP5SJU55LHEBCJMFDB2AZVELRLTHI.html
>>> 
>>> diff --git a/src/event/ngx_event_posted.c b/src/event/ngx_event_posted.c
>>> --- a/src/event/ngx_event_posted.c
>>> +++ b/src/event/ngx_event_posted.c
>>> @@ -51,8 +51,10 @@ ngx_event_move_posted_next(ngx_cycle_t *
>>>        ngx_log_debug1(NGX_LOG_DEBUG_EVENT, cycle->log, 0,
>>>                      "posted next event %p", ev);
>>> 
>>> -        ev->ready = 1;
>>> -        ev->available = -1;
>>> +        if (!ev->ready) {
>>> +            ev->ready = 1;
>>> +            ev->available = -1;
>>> +        }
>>>    }
>>> 
>>>    ngx_queue_add(&ngx_posted_events, &ngx_posted_next_events);
>>> diff --git a/src/http/ngx_http_request.c b/src/http/ngx_http_request.c
>>> --- a/src/http/ngx_http_request.c
>>> +++ b/src/http/ngx_http_request.c
>>> @@ -3129,7 +3129,7 @@ ngx_http_set_keepalive(ngx_http_request_
>>>        }
>>> 
>>>        rev->handler = ngx_http_process_request_line;
>>> -        ngx_post_event(rev, &ngx_posted_events);
>>> +        ngx_post_event(rev, &ngx_posted_next_events);
>>>        return;
>>>    }
>>> 
>> 
>> I can confirm that the symptom disappears after applying this patch
> 
> Thanks for testing, and thanks for the logs and traffic dumps 
> provided.
> 
> From the logs you've provided it looks like the patch should help 
> with 1 out of 5 errors: the last request in the connection with 
> this error was pipelined, and therefore with the patch it is 
> expected to use lingering close.
> 
> In 4 other cases the last requests are handled without pipelining: 
> once the request arrives, ngx_http_keepalive_handler() is called, 
> and nginx reads the request and responds to it, closing the 
> connection.  The next request arrives at some time later, and the 
> OS responds with RST.
> 
> E.g., the request to bzip2_1.0.8-4_amd64.deb (which corresponds to 
> the second error as seen by apt, while downloading 
> libpython3.9-stdlib_3.9.2-1_amd64.deb; connection *26, with client 
> port 40110) arrives at 14:08:36.495858, nginx responds to it, as 
> seen in nginx logs, and then closes the connection.  The next 
> request arrives at 14:08:36.496977 and immediately responded with 
> RST.
> 
> The patch does not change handling of these 4 cases, except may 
> be some mostly unrelated event handling changes, so the timing 
> might be slightly different.  As such, I would expect the errors 
> to be still reproducible with the patch, but probably somewhat 
> less often.
> 
> Could you please re-check to see if the symptoms still happen with 
> the patch, at least occasionally?

You are totally right about that. I tested the previous patch
for more times and can see the RST occasionally with less frequency.

> 
> Overall, after looking into logs and tcpdump you've provided I 
> tend to think that the only working fix would be to introduce 
> c->pipeline flag, and force lingering close if there were any 
> pipelined requests on the connection.
> 
> Below is the patch which implements this approach.  Review and 
> testing appreciated.  It can be used either separately or with the 
> previously provided patch to use posted next events.

I've tested the new patch for several times and can confirm the
problem is solved. Also, I added a new variable exposing the
r->connection->pipeline and can confirm that the flag works
as intended. The flag won't be set for normal keep-alive requests,
and will only be set for pipelined requests.

Thanks again for looking into this issue.

> 
> # HG changeset patch
> # User Maxim Dounin <mdounin at mdounin.ru>
> # Date 1674790916 -10800
> #      Fri Jan 27 06:41:56 2023 +0300
> # Node ID 784d0fa0b5a0796561642a5a64dc4e9e07592852
> # Parent  4eb1383f6432b034630e6de18739b817f6565c8c
> Lingering close for connections with pipelined requests.
> 
> This is expected to help with clients using pipelining with some constant
> depth, such as apt[1][2].
> 
> When downloading many resources, apt uses pipelining with some constant
> depth, a number of requests in flight[1][2].  This essentially means that
> after receiving a response it sends an additional request to the server,
> and this can result in requests arriving to the server at any time.  Further,
> additional requests are sent one-by-one, and can be easily seen as such
> (neither as pipelined, nor followed by pipelined requests).
> 
> The only safe approach to close such connections (for example, when
> keepalive_requests is reached) is with lingering.  To do so, now nginx
> monitors if pipelining was used on the connection, and if it was, closes
> the connection with lingering.
> 
> [1] https://bugs.debian.org/cgi-bin/bugreport.cgi?bug=973861#10
> [2] https://mailman.nginx.org/pipermail/nginx-devel/2023-January/ZA2SP5SJU55LHEBCJMFDB2AZVELRLTHI.html
> 
> diff --git a/src/core/ngx_connection.h b/src/core/ngx_connection.h
> --- a/src/core/ngx_connection.h
> +++ b/src/core/ngx_connection.h
> @@ -172,6 +172,7 @@ struct ngx_connection_s {
>     unsigned            timedout:1;
>     unsigned            error:1;
>     unsigned            destroyed:1;
> +    unsigned            pipeline:1;
> 
>     unsigned            idle:1;
>     unsigned            reusable:1;
> diff --git a/src/http/ngx_http_request.c b/src/http/ngx_http_request.c
> --- a/src/http/ngx_http_request.c
> +++ b/src/http/ngx_http_request.c
> @@ -2753,7 +2753,8 @@ ngx_http_finalize_connection(ngx_http_re
>         || (clcf->lingering_close == NGX_HTTP_LINGERING_ON
>             && (r->lingering_close
>                 || r->header_in->pos < r->header_in->last
> -                || r->connection->read->ready)))
> +                || r->connection->read->ready
> +                || r->connection->pipeline)))
>     {
>         ngx_http_set_lingering_close(r->connection);
>         return;
> @@ -3123,6 +3124,7 @@ ngx_http_set_keepalive(ngx_http_request_
> 
>         c->sent = 0;
>         c->destroyed = 0;
> +        c->pipeline = 1;
> 
>         if (rev->timer_set) {
>             ngx_del_timer(rev);
> 
> -- 
> Maxim Dounin
> http://mdounin.ru/

Cheers,

Miao Wang