[PATCH 0 of 6] QUIC PATH_CHALLENGE-related series
Sergey Kandaurov
pluknet at nginx.com
Mon Dec 11 14:45:45 UTC 2023
> On 11 Dec 2023, at 17:49, Roman Arutyunyan <arut at nginx.com> wrote:
>
> Hi,
>
> On Mon, Dec 11, 2023 at 01:40:06PM +0400, Sergey Kandaurov wrote:
>>
>>> On 30 Nov 2023, at 15:05, Roman Arutyunyan <arut at nginx.com> wrote:
>>>
>>> Hi,
>>>
>>> A number of patches discussed previously.
>>>
>>
>> A couple more patches to follow.
>>
>> Last two weeks I spent pondering if we have to migrate to per-path
>> congestion control and RTT estimator (as inspired by Alibaba patch
>> https://mailman.nginx.org/pipermail/nginx-devel/2023-January/PNDV4MEDZPN5QJ6RZ2FQCWA7NGQ2ILSX.html),
>> in order to alleviate severe congestion controller bugs in in-flight
>> accounting that may result in connection stalls due to a type underflow.
>> Previously existed, they become more visible within this series (that
>> makes PATH_CHALLENGE frames congestion aware), such that any migration
>> may end up in connection stalls, easily seen in tests.
>>
>> After discussion with Roman, I came to conclusion that it's possible
>> to fix those bugs without touching a single context.
>>
>> Other than that below are various optimization I happened to see
>> while testing in-flight accounting during connection migration.
>>
>> # HG changeset patch
>> # User Sergey Kandaurov <pluknet at nginx.com>
>> # Date 1702282940 -14400
>> # Mon Dec 11 12:22:20 2023 +0400
>> # Node ID 34311bcfd27d3cc420771b9349108e839d6a532e
>> # Parent 3518c40102b2a0f4a5be00fef00becdff70921cb
>> QUIC: reset RTT estimator for the new path.
>>
>> RTT is a property of the path, it must be reset on confirming a peer's
>> ownership of its new address.
>>
>> diff --git a/src/event/quic/ngx_event_quic.c b/src/event/quic/ngx_event_quic.c
>> --- a/src/event/quic/ngx_event_quic.c
>> +++ b/src/event/quic/ngx_event_quic.c
>> @@ -260,14 +260,7 @@ ngx_quic_new_connection(ngx_connection_t
>>
>> ngx_queue_init(&qc->free_frames);
>>
>> - qc->avg_rtt = NGX_QUIC_INITIAL_RTT;
>> - qc->rttvar = NGX_QUIC_INITIAL_RTT / 2;
>> - qc->min_rtt = NGX_TIMER_INFINITE;
>> - qc->first_rtt = NGX_TIMER_INFINITE;
>> -
>> - /*
>> - * qc->latest_rtt = 0
>> - */
>> + ngx_quic_init_rtt(qc);
>>
>> qc->pto.log = c->log;
>> qc->pto.data = c;
>> diff --git a/src/event/quic/ngx_event_quic_connection.h b/src/event/quic/ngx_event_quic_connection.h
>> --- a/src/event/quic/ngx_event_quic_connection.h
>> +++ b/src/event/quic/ngx_event_quic_connection.h
>> @@ -65,6 +65,13 @@ typedef struct ngx_quic_keys_s ng
>>
>> #define ngx_quic_get_socket(c) ((ngx_quic_socket_t *)((c)->udp))
>>
>> +#define ngx_quic_init_rtt(qc) \
>> + (qc)->avg_rtt = NGX_QUIC_INITIAL_RTT; \
>> + (qc)->rttvar = NGX_QUIC_INITIAL_RTT / 2; \
>> + (qc)->min_rtt = NGX_TIMER_INFINITE; \
>> + (qc)->first_rtt = NGX_TIMER_INFINITE; \
>> + (qc)->latest_rtt = 0;
>> +
>>
>> typedef enum {
>> NGX_QUIC_PATH_IDLE = 0,
>> diff --git a/src/event/quic/ngx_event_quic_migration.c b/src/event/quic/ngx_event_quic_migration.c
>> --- a/src/event/quic/ngx_event_quic_migration.c
>> +++ b/src/event/quic/ngx_event_quic_migration.c
>> @@ -181,6 +181,8 @@ valid:
>> 14720));
>> qc->congestion.ssthresh = (size_t) -1;
>> qc->congestion.recovery_start = ngx_current_msec;
>> +
>> + ngx_quic_init_rtt(qc);
>> }
>>
>> path->validated = 1;
>
> Looks fine.
>
>> # HG changeset patch
>> # User Sergey Kandaurov <pluknet at nginx.com>
>> # Date 1702285561 -14400
>> # Mon Dec 11 13:06:01 2023 +0400
>> # Node ID 2e747e7b203e9b62455fc6b8457bd10879a88bec
>> # Parent 34311bcfd27d3cc420771b9349108e839d6a532e
>> QUIC: path aware in-flight bytes accounting.
>>
>> On-packet acknowledgement is made path aware, as per RFC 9000, Section 9.4:
>> Packets sent on the old path MUST NOT contribute to congestion control
>> or RTT estimation for the new path.
>>
>> To make it possible over a single congestion control context, the first packet
>> to be sent after the new path has been validated, which includes resetting the
>> congestion controller and RTT estimator, is now remembered in the connection.
>> Packets sent previously, such as on the old path, are not taken into account.
>>
>> diff --git a/src/event/quic/ngx_event_quic_ack.c b/src/event/quic/ngx_event_quic_ack.c
>> --- a/src/event/quic/ngx_event_quic_ack.c
>> +++ b/src/event/quic/ngx_event_quic_ack.c
>> @@ -325,6 +325,10 @@ ngx_quic_congestion_ack(ngx_connection_t
>> qc = ngx_quic_get_connection(c);
>> cg = &qc->congestion;
>>
>> + if (f->pnum < qc->rst_pnum) {
>> + return;
>> + }
>
> While this condition looks simple, there's something more complex going on.
> The value for pnum corresponds to the first application level packet that is
> send over the new path. Here however there's no check for packet level because
> other levels can only show up while rst_pnum is zero. I think this should be
> explained in the commit log.
Thanks, added this para at the end for clarity:
Note that although the packet number is saved per-connection, the added checks
affect application level packets only. For non-application level packets,
which are only processed prior to the handshake is complete, the remembered
packet number remains set to zero.
>
>> +
>> blocked = (cg->in_flight >= cg->window) ? 1 : 0;
>>
>> cg->in_flight -= f->plen;
>> @@ -667,6 +671,10 @@ ngx_quic_congestion_lost(ngx_connection_
>> qc = ngx_quic_get_connection(c);
>> cg = &qc->congestion;
>>
>> + if (f->pnum < qc->rst_pnum) {
>> + return;
>> + }
>
> The above applies here as well.
>
>> +
>> blocked = (cg->in_flight >= cg->window) ? 1 : 0;
>>
>> cg->in_flight -= f->plen;
>> diff --git a/src/event/quic/ngx_event_quic_connection.h b/src/event/quic/ngx_event_quic_connection.h
>> --- a/src/event/quic/ngx_event_quic_connection.h
>> +++ b/src/event/quic/ngx_event_quic_connection.h
>> @@ -266,6 +266,8 @@ struct ngx_quic_connection_s {
>> ngx_quic_streams_t streams;
>> ngx_quic_congestion_t congestion;
>>
>> + uint64_t rst_pnum; /* first on validated path */
>> +
>> off_t received;
>>
>> ngx_uint_t error;
>> diff --git a/src/event/quic/ngx_event_quic_migration.c b/src/event/quic/ngx_event_quic_migration.c
>> --- a/src/event/quic/ngx_event_quic_migration.c
>> +++ b/src/event/quic/ngx_event_quic_migration.c
>> @@ -110,6 +110,7 @@ ngx_quic_handle_path_response_frame(ngx_
>> ngx_uint_t rst;
>> ngx_queue_t *q;
>> ngx_quic_path_t *path, *prev;
>> + ngx_quic_send_ctx_t *ctx;
>> ngx_quic_connection_t *qc;
>>
>> qc = ngx_quic_get_connection(c);
>> @@ -174,6 +175,11 @@ valid:
>> }
>>
>> if (rst) {
>> + /* prevent old path packets contribution to congestion control */
>> +
>> + ctx = ngx_quic_get_send_ctx(qc, ssl_encryption_application);
>> + qc->rst_pnum = ctx->pnum;
>> +
>> ngx_memzero(&qc->congestion, sizeof(ngx_quic_congestion_t));
>>
>> qc->congestion.window = ngx_min(10 * qc->tp.max_udp_payload_size,
>
> Here we call a reset function for rtt below, but there's no such function for
> congestion. As discussed in private, this will be addresses later while
> improving congestion algorithm.
>
> The patch looks ok.
>
>> # HG changeset patch
>> # User Sergey Kandaurov <pluknet at nginx.com>
>> # Date 1702287236 -14400
>> # Mon Dec 11 13:33:56 2023 +0400
>> # Node ID 561791598a9610e79ea4bc24788ff887d032b3b3
>> # Parent 2e747e7b203e9b62455fc6b8457bd10879a88bec
>> QUIC: abandoned PMTU discovery on the old path.
>>
>> It is seemingly a useless work to probe MTU on the old path, which may be gone.
>> This also goes in line with RFC 9000, Section 8.2.4, Failed Path Validation,
>> which prescribes to abandon old path validation if switching to a new path.
>>
>> diff --git a/src/event/quic/ngx_event_quic_migration.c b/src/event/quic/ngx_event_quic_migration.c
>> --- a/src/event/quic/ngx_event_quic_migration.c
>> +++ b/src/event/quic/ngx_event_quic_migration.c
>> @@ -497,6 +497,7 @@ ngx_quic_handle_migration(ngx_connection
>> }
>> }
>>
>> + qc->path->state = NGX_QUIC_PATH_IDLE;
>> qc->path->tag = NGX_QUIC_PATH_BACKUP;
>> ngx_quic_path_dbg(c, "is now backup", qc->path);
>
> When new path validation fails, we return to the backup path with potentially
> unfinished MTU discovery. We need to restart MTU discovery in this case by
> calling ngx_quic_discover_path_mtu(). If the discovery was finished, the
> function will do nothing. A separate case is when the path was validated, but
> initial MTU (1200) was not verified due to amplification limit. This should be
> addressed as well.
While restarting PMTUD sounds doable, restoring MTU path validation
may need more work. I think we can postpone this patch, since it
is not directly related to this series.
>
>> # HG changeset patch
>> # User Sergey Kandaurov <pluknet at nginx.com>
>> # Date 1702287237 -14400
>> # Mon Dec 11 13:33:57 2023 +0400
>> # Node ID 2f65a579a235d08902e47cf796db3c1bc1ca8790
>> # Parent 561791598a9610e79ea4bc24788ff887d032b3b3
>> QUIC: do not arm PTO timer for path validation.
>>
>> As per RFC 9000, Section 9.4, a separate timer is set when a PATH_CHALLENGE
>> is sent. Further, PATH_CHALLENGE frames have a distinct retransmission policy.
>> Thus, it makes no sense to arm the PTO timer for PATH_CHALLENGE, which may
>> result in spurious PTO events.
>>
>> diff --git a/src/event/quic/ngx_event_quic_ack.c b/src/event/quic/ngx_event_quic_ack.c
>> --- a/src/event/quic/ngx_event_quic_ack.c
>> +++ b/src/event/quic/ngx_event_quic_ack.c
>> @@ -752,7 +752,23 @@ ngx_quic_set_lost_timer(ngx_connection_t
>> }
>> }
>>
>> - q = ngx_queue_last(&ctx->sent);
>> + /* PATH_CHALLENGE timer is separate (see RFC 9000, 9.4) */
>> +
>> + for (q = ngx_queue_last(&ctx->sent);
>> + q != ngx_queue_sentinel(&ctx->sent);
>> + q = ngx_queue_prev(q))
>> + {
>> + f = ngx_queue_data(q, ngx_quic_frame_t, queue);
>> +
>> + if (f->type != NGX_QUIC_FT_PATH_CHALLENGE) {
>> + break;
>> + }
>> + }
>> +
>> + if (q == ngx_queue_sentinel(&ctx->sent)) {
>> + continue;
>> + }
>> +
>> f = ngx_queue_data(q, ngx_quic_frame_t, queue);
>> w = (ngx_msec_int_t)
>> (f->send_time + (ngx_quic_pto(c, ctx) << qc->pto_count) - now);
>> # HG changeset patch
>> # User Sergey Kandaurov <pluknet at nginx.com>
>> # Date 1702287238 -14400
>> # Mon Dec 11 13:33:58 2023 +0400
>> # Node ID 4e1706b406978b8546f68a72e873c8579b8bc1d5
>> # Parent 2f65a579a235d08902e47cf796db3c1bc1ca8790
>> QUIC: do not consider PATH_CHALLENGE acknowledgment an RTT sample.
>>
>> Since PATH_CHALLENGE frames use a separate retransmission timer (other than
>> PTO), it makes no sense to use PATH_CHALLENGE ACKs for RTT samples as well.
>> This extends RFC 9002, Section 6.2.2 to ACK frames. Previously, through
>> influence on RTT estimator, it could be possible to affect PTO duration,
>> which is not used for resending PATH_CHALLENGE frames.
>>
>> diff --git a/src/event/quic/ngx_event_quic_ack.c b/src/event/quic/ngx_event_quic_ack.c
>> --- a/src/event/quic/ngx_event_quic_ack.c
>> +++ b/src/event/quic/ngx_event_quic_ack.c
>> @@ -264,7 +264,9 @@ ngx_quic_handle_ack_frame_range(ngx_conn
>> break;
>> }
>>
>> - if (f->pnum == max) {
>> + /* PATH_CHALLENGE timer is separate (see RFC 9000, 9.4) */
>> +
>> + if (f->pnum == max && f->type != NGX_QUIC_FT_PATH_CHALLENGE) {
>> st->max_pn = f->send_time;
>> }
>
> The last two patches will be addressed separately.
--
Sergey Kandaurov
More information about the nginx-devel
mailing list