[PATCH] HTTP: trigger lingering close when keepalive connection will be closed
Miao Wang
shankerwangmiao at gmail.com
Wed Jan 25 04:49:10 UTC 2023
Hi,
> 2023年1月25日 10:17,Maxim Dounin <mdounin at mdounin.ru> 写道:
>
> Hello!
>
> On Mon, Jan 23, 2023 at 07:01:16PM +0800, Miao Wang wrote:
>
>>> 2023年1月23日 12:05,Maxim Dounin <mdounin at mdounin.ru> 写道:
>>>
>>> On Wed, Jan 18, 2023 at 11:28:52PM +0800, Miao Wang wrote:
>>>
>>>> # HG changeset patch
>>>> # User Miao Wang <shankerwangmiao at gmail.com>
>>>> # Date 1674055068 -28800
>>>> # Wed Jan 18 23:17:48 2023 +0800
>>>> # Node ID 73aa64bd29f3dec9e43e97560d6b5a07cdf40063
>>>> # Parent 07b0bee87f32be91a33210bc06973e07c4c1dac9
>>>> HTTP: trigger lingering close when keepalive connection will be closed
>>>>
>>>> When finalizing a request, if the request is not keepalive but
>>>> its connection has served more than one request, the connection
>>>> was previously a keepalive connection and will be closed after
>>>> this response. In that case, it is likely that pipelined requests
>>>> follow this request, and they should be read and discarded. As a
>>>> result, lingering close is necessary in this case.
>>>>
>>>> Without this patch, nginx (with its default configuration) will
>>>> send out a TCP RST when more pipelined requests arrive. The
>>>> symptom is obvious when nginx is serving a Debian repository and
>>>> apt is downloading a large number of packages. See [1]. It becomes
>>>> more pronounced when `keepalive_requests` is lower or nginx is
>>>> under relatively high load, and it disappears when specifying
>>>> `lingering_close always`.
>>>>
>>>> [1]: https://bugs.debian.org/cgi-bin/bugreport.cgi?bug=973861#10
>>>>
>>>> diff -r 07b0bee87f32 -r 73aa64bd29f3 src/http/ngx_http_request.c
>>>> --- a/src/http/ngx_http_request.c Wed Dec 21 14:53:27 2022 +0300
>>>> +++ b/src/http/ngx_http_request.c Wed Jan 18 23:17:48 2023 +0800
>>>> @@ -2749,6 +2749,10 @@
>>>> return;
>>>> }
>>>>
>>>> + if (!r->keepalive && r->connection->requests > 1) {
>>>> + r->lingering_close = 1;
>>>> + }
>>>> +
>>>> if (clcf->lingering_close == NGX_HTTP_LINGERING_ALWAYS
>>>> || (clcf->lingering_close == NGX_HTTP_LINGERING_ON
>>>> && (r->lingering_close
>>>
>>> Thanks for the patch and the link to the Debian bug report.
>>>
>>> Lingering close implies noticeable additional resource usage: even
>>> if nothing happens on the connection, it will be kept open for
>>> lingering_timeout, which is 5 seconds by default. Given that
>>> pipelining is not used by most of the clients, forcing lingering
>>> close for all clients which are using keepalive does not look like
>>> a good solution to me.
>>>
>>> In general, nginx tries hard to determine if any additional data
>>> are expected on the connection, and uses lingering close if there
>>> is a good chance there will be some, but avoids lingering close by
>>> default if additional data are unlikely. If this logic does not
>>> work for some reason, lingering close can be explicitly requested
>>> with "lingering_close always;".
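[For readers unfamiliar with the mechanism: at the socket level, lingering close means the server stops sending, then keeps reading and discarding whatever the client still has in flight before calling close(), so unread data does not turn the close into a TCP RST. A minimal stand-alone sketch; the function name and timeout handling are ours, not nginx's:]

```c
#include <sys/socket.h>
#include <unistd.h>
#include <poll.h>

/* Drain and close a connection "lingering close" style: stop sending,
 * then read and discard incoming data until EOF or a timeout, so that
 * close() does not hit unread bytes in the receive buffer. */
ssize_t lingering_close_drain(int fd, int timeout_ms)
{
    char           buf[4096];
    ssize_t        n, drained = 0;
    struct pollfd  pfd = { .fd = fd, .events = POLLIN };

    /* we are done sending; tell the peer */
    shutdown(fd, SHUT_WR);

    /* discard whatever is still in flight */
    for ( ;; ) {
        if (poll(&pfd, 1, timeout_ms) <= 0) {
            break;                      /* timeout or error: give up */
        }

        n = read(fd, buf, sizeof(buf));
        if (n <= 0) {
            break;                      /* EOF or error */
        }

        drained += n;
    }

    close(fd);
    return drained;
}
```

[With "lingering_close on;" (the default) nginx only pays the waiting cost when it suspects more data will arrive; "lingering_close always;" forces this draining for every connection.]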
>>
>> That's true since the symptom I described can be worked around with
>> that option.
>>
>>>
>>> In particular, note the "r->header_in->pos < r->header_in->last"
>>> and "r->connection->read->ready" checks - these are expected to
>>> catch connections with additional pipelined requests (see revision
>>> 3981:77604e9a1ed8). And from the log provided in the report it
>>> looks like it works most of the time - there are more than 6k HTTP
>>> requests, and 60+ connections. But sometimes it fails - there are
>>> two RST errors logged (and one "Undetermined Error", which looks
>>> like a bug in apt, but might be related).
>>>
>>> It looks like when apt is downloading many resources, it does not
>>> send all the requests at once (or in batches), but instead tries
>>> to maintain a constant "depth", a number of pipelined requests in
>>> flight. This essentially means that after reading a response it
>>> sends an additional request.
>>
>> That's right. From a traffic dump, I can see apt first sends one
>> request, and after receiving the response, it will send out 10
>> more requests, and maintain a depth of 10, since by default
>> Acquire::http::Pipeline-Depth is 10.
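[The consequence of this constant-depth behaviour can be seen with a toy model (ours, not apt's actual code): the client starts with one request and, after each response it reads, tops the pipeline back up to `depth` requests in flight:]

```c
/* Toy model of a constant-depth pipelining client.  Returns how many
 * requests have been sent by the time `responses` responses have been
 * read, for a given pipeline depth and total number of requests. */
int sent_after_responses(int responses, int depth, int total)
{
    int  sent = 1;      /* the client starts with a single request */
    int  read;

    for (read = 0; read < responses; read++) {
        /* one more response consumed; top the pipeline back up */
        while (sent - (read + 1) < depth && sent < total) {
            sent++;
        }
    }

    return sent;
}
```

[With a depth of 10, after the first response the client has sent 11 requests, and after the k-th response it has sent min(total, k + 10). So when the server emits its last permitted response (e.g. at the keepalive_requests limit), up to 10 requests may still be on the wire or yet to be sent.]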
>>
>>>
>>> I see at least two possible cases which can result in nginx not
>>> using lingering close with such a load:
>>>
>>> 1. If the request where keepalive_requests is reached happens to
>>> be the last one in the r->header_in buffer (so the
>>> "r->header_in->pos < r->header_in->last" check won't be effective),
>>> and nginx hasn't yet received an event from the kernel about
>>> additional data (so "r->connection->read->ready" will not be set),
>>> nginx won't use lingering close and might close the connection
>>> with unread data in the socket buffer, resulting in an RST.
>>>
>>> 2. Similarly, if nginx happens to be faster than apt and socket
>>> buffers are large enough, it might send all the responses,
>>> including the last one with "Connection: close", and close the
>>> connection (since there are no pending pipelined requests at that
>>> moment) even before an additional request is sent by apt. When
>>> apt later sends an additional request after reading some of the
>>> responses, it will write to an already closed connection, again
>>> resulting in an RST.
>>
>> Actually, comparing the debug log and the pcap, nginx calls
>> close() after writing the last response. However, at that time,
>> that response has not been fully transmitted to the client, and
>> there seem to be more requests left unprocessed in the kernel
>> buffer. Thus close() triggers an immediate RST.
>
> Thanks for the details. This looks more like the first case, and
> probably can be addressed by improving the likelihood of detecting the
> read event.
>
> Could you please test if the patch below fixes the particular
> issue you are seeing? It is somewhat unrelated, but it might be
> a good enough solution (and is more or less equivalent to
> checking r->pipeline).
>
>>> It would be interesting to see more details, such as tcpdump
>>> and/or nginx debug logs, to find out what actually goes on here.
>>
>> The tcpdump and debug logs are too large to send to this mailing
>> list. I wonder if I can email them to you directly.
>
> Feel free to, my email should accept up to 100M messages.
> Alternatively, a good solution might be to make the files
> available for download and post a link here.
>
>>> Overall, given how apt uses pipelining, I tend to think that at
>>> least (2) is unavoidable and can happen with certain sizes of the
>>> responses.
>>>
>>> A good enough solution might be to check for r->pipeline, which
>>> is set by nginx whenever it reads a pipelined request. It might
>>> not be enough though, since r->pipeline is only set for requests
>>> seen by nginx as pipelined, which might not be true for the last
>>> request.
>>>
>>> A more complete solution might be to introduce something like
>>> c->pipeline flag and use lingering close if any pipelined requests
>>> were seen on the connection.
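[The check being discussed could then take roughly the following shape. This is only an illustration with stand-in types: the real nginx structures are far richer, and c->pipeline is the hypothetical per-connection flag proposed above:]

```c
/* Stand-in connection state for illustration only; field names mirror
 * the conditions discussed in the thread, not real nginx types. */
typedef struct {
    unsigned  lingering:1;    /* r->lingering_close                  */
    unsigned  buffered:1;     /* r->header_in->pos < header_in->last */
    unsigned  read_ready:1;   /* r->connection->read->ready          */
    unsigned  pipeline:1;     /* hypothetical c->pipeline flag       */
} conn_state_t;

/* Returns 1 if, with "lingering_close on;", the connection should be
 * drained before closing. */
int needs_lingering_close(const conn_state_t *c)
{
    return c->lingering       /* the request asked for it            */
        || c->buffered        /* unparsed bytes were already read    */
        || c->read_ready      /* the kernel reports more to read     */
        || c->pipeline;       /* pipelining was seen on this conn    */
}
```

[A connection that ever carried a pipelined request would then always be drained, regardless of whether the final request happened to look pipelined to nginx.]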
>
> The following patch reworks handling of pipelined requests by
> postponing them to the next event loop iteration. It is expected
> to make it more likely that nginx knows about any additional
> unread data in the socket buffer (and right now it is mostly
> equivalent to checking r->pipeline, since c->read->ready is always
> set for pipelined requests):
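[The effect of posting to the "next" queue can be illustrated with a toy event loop (simplified types, ours; real nginx uses intrusive queues): events posted to the next-iteration queue are only promoted to the run queue after the kernel has been polled again, so readiness flags are fresh by the time the handler runs:]

```c
#include <stddef.h>

#define TOY_MAX  16

typedef struct {
    void (*handler)(void *data);
    void  *data;
} toy_event_t;

typedef struct {
    toy_event_t  posted[TOY_MAX];       /* run during this iteration */
    toy_event_t  posted_next[TOY_MAX];  /* run during the next one   */
    size_t       n_posted, n_next;
} toy_loop_t;

/* schedule an event for the *next* event-loop iteration */
void toy_post_next(toy_loop_t *l, void (*h)(void *), void *d)
{
    l->posted_next[l->n_next++] = (toy_event_t) { h, d };
}

/* One event-loop iteration: (the kernel would be polled here,
 * refreshing readiness flags,) then "next" events are promoted and
 * everything posted is run.  A handler scheduled with toy_post_next()
 * therefore never runs before the kernel has been consulted again. */
void toy_loop_iter(toy_loop_t *l)
{
    size_t  i;

    for (i = 0; i < l->n_next; i++) {
        l->posted[l->n_posted++] = l->posted_next[i];
    }
    l->n_next = 0;

    for (i = 0; i < l->n_posted; i++) {
        l->posted[i].handler(l->posted[i].data);
    }
    l->n_posted = 0;
}

/* tiny counting handler for demonstration */
static int toy_runs;
static void toy_count(void *data) { (void) data; toy_runs++; }
```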
>
> # HG changeset patch
> # User Maxim Dounin <mdounin at mdounin.ru>
> # Date 1674610218 -10800
> # Wed Jan 25 04:30:18 2023 +0300
> # Node ID 8cfd22c325a3db370b9e45aa6f897ff7bc8222f3
> # Parent c7e103acb409f0352cb73997c053b3bdbb8dd5db
> Reworked pipelined requests to use posted next events.
>
> This is expected to improve handling of pipelined requests in a number
> of ways, including:
>
> 1) It will make room for additional requests from other clients,
> reducing worker monopolization by a single client.
>
> 2) The c->read->ready flag will be set, so nginx will either read the
> additional data, or will use lingering close. This is expected to help
> with clients using pipelining with some constant depth, such as apt[1][2].
>
> The ngx_event_move_posted_next() function was modified to make it
> possible to post read events on connections with kqueue. Previously,
> it used to set ev->available to -1, potentially overwriting a valid
> positive value provided by kqueue, so that ngx_unix_recv() and
> ngx_readv_chain() would stop reading from the socket before all the
> available data were read.
>
> Note that currently ngx_event_move_posted_next() will always set
> the ev->ready flag. While this is expected behaviour for the
> ev->available use case (where ev->ready is explicitly cleared), it is
> not needed for pipelining. For pipelining, it will result in an extra
> unneeded read() syscall after processing of all pipelined requests,
> and there might be room for improvement here.
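[The guard added by the patch can be mirrored in a tiny stand-alone form (toy event type, ours, not the real ngx_event_t): ev->available is only reset when the event was not already ready, so a positive byte count reported by kqueue survives:]

```c
/* Toy mirror of the patched ngx_event_move_posted_next() logic. */
typedef struct {
    unsigned  ready:1;
    int       available;    /* bytes reported by kqueue, or -1 */
} toy_ev_t;

void toy_move_posted_next(toy_ev_t *ev)
{
    if (!ev->ready) {
        /* event was posted without kernel readiness: mark it ready
         * and let the read functions loop until EAGAIN */
        ev->ready = 1;
        ev->available = -1;
    }

    /* if the event was already ready (e.g. kqueue delivered it with a
     * byte count), ev->available is left untouched */
}
```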
>
> [1] https://bugs.debian.org/cgi-bin/bugreport.cgi?bug=973861#10
> [2] https://mailman.nginx.org/pipermail/nginx-devel/2023-January/ZA2SP5SJU55LHEBCJMFDB2AZVELRLTHI.html
>
> diff --git a/src/event/ngx_event_posted.c b/src/event/ngx_event_posted.c
> --- a/src/event/ngx_event_posted.c
> +++ b/src/event/ngx_event_posted.c
> @@ -51,8 +51,10 @@ ngx_event_move_posted_next(ngx_cycle_t *
> ngx_log_debug1(NGX_LOG_DEBUG_EVENT, cycle->log, 0,
> "posted next event %p", ev);
>
> - ev->ready = 1;
> - ev->available = -1;
> + if (!ev->ready) {
> + ev->ready = 1;
> + ev->available = -1;
> + }
> }
>
> ngx_queue_add(&ngx_posted_events, &ngx_posted_next_events);
> diff --git a/src/http/ngx_http_request.c b/src/http/ngx_http_request.c
> --- a/src/http/ngx_http_request.c
> +++ b/src/http/ngx_http_request.c
> @@ -3129,7 +3129,7 @@ ngx_http_set_keepalive(ngx_http_request_
> }
>
> rev->handler = ngx_http_process_request_line;
> - ngx_post_event(rev, &ngx_posted_events);
> + ngx_post_event(rev, &ngx_posted_next_events);
> return;
> }
>
>
> --
> Maxim Dounin
> http://mdounin.ru/
> _______________________________________________
> nginx-devel mailing list
> nginx-devel at nginx.org
> https://mailman.nginx.org/mailman/listinfo/nginx-devel
I can confirm that the symptom disappears after applying this patch.
Cheers,
Miao Wang