[PATCH 0 of 6] QUIC PATH_CHALLENGE-related series

Sergey Kandaurov pluknet at nginx.com
Mon Dec 11 14:45:45 UTC 2023


> On 11 Dec 2023, at 17:49, Roman Arutyunyan <arut at nginx.com> wrote:
> 
> Hi,
> 
> On Mon, Dec 11, 2023 at 01:40:06PM +0400, Sergey Kandaurov wrote:
>> 
>>> On 30 Nov 2023, at 15:05, Roman Arutyunyan <arut at nginx.com> wrote:
>>> 
>>> Hi,
>>> 
>>> A number of patches discussed previously.
>>> 
>> 
>> A couple more patches to follow.
>> 
>> Last two weeks I spent pondering if we have to migrate to per-path
>> congestion control and RTT estimator (as inspired by Alibaba patch
>> https://mailman.nginx.org/pipermail/nginx-devel/2023-January/PNDV4MEDZPN5QJ6RZ2FQCWA7NGQ2ILSX.html),
>> in order to alleviate severe congestion controller bugs in in-flight
>> accounting that may result in connection stalls due to a type underflow.
>> Previously existed, they become more visible within this series (that
>> makes PATH_CHALLENGE frames congestion aware), such that any migration
>> may end up in connection stalls, easily seen in tests.
>> 
>> After discussion with Roman, I came to conclusion that it's possible
>> to fix those bugs without touching a single context.
>> 
>> Other than that below are various optimization I happened to see
>> while testing in-flight accounting during connection migration.
>> 
>> # HG changeset patch
>> # User Sergey Kandaurov <pluknet at nginx.com>
>> # Date 1702282940 -14400
>> #      Mon Dec 11 12:22:20 2023 +0400
>> # Node ID 34311bcfd27d3cc420771b9349108e839d6a532e
>> # Parent  3518c40102b2a0f4a5be00fef00becdff70921cb
>> QUIC: reset RTT estimator for the new path.
>> 
>> RTT is a property of the path, it must be reset on confirming a peer's
>> ownership of its new address.
>> 
>> diff --git a/src/event/quic/ngx_event_quic.c b/src/event/quic/ngx_event_quic.c
>> --- a/src/event/quic/ngx_event_quic.c
>> +++ b/src/event/quic/ngx_event_quic.c
>> @@ -260,14 +260,7 @@ ngx_quic_new_connection(ngx_connection_t
>> 
>>     ngx_queue_init(&qc->free_frames);
>> 
>> -    qc->avg_rtt = NGX_QUIC_INITIAL_RTT;
>> -    qc->rttvar = NGX_QUIC_INITIAL_RTT / 2;
>> -    qc->min_rtt = NGX_TIMER_INFINITE;
>> -    qc->first_rtt = NGX_TIMER_INFINITE;
>> -
>> -    /*
>> -     * qc->latest_rtt = 0
>> -     */
>> +    ngx_quic_init_rtt(qc);
>> 
>>     qc->pto.log = c->log;
>>     qc->pto.data = c;
>> diff --git a/src/event/quic/ngx_event_quic_connection.h b/src/event/quic/ngx_event_quic_connection.h
>> --- a/src/event/quic/ngx_event_quic_connection.h
>> +++ b/src/event/quic/ngx_event_quic_connection.h
>> @@ -65,6 +65,13 @@ typedef struct ngx_quic_keys_s        ng
>> 
>> #define ngx_quic_get_socket(c)               ((ngx_quic_socket_t *)((c)->udp))
>> 
>> +#define ngx_quic_init_rtt(qc)                                                 \
>> +    (qc)->avg_rtt = NGX_QUIC_INITIAL_RTT;                                     \
>> +    (qc)->rttvar = NGX_QUIC_INITIAL_RTT / 2;                                  \
>> +    (qc)->min_rtt = NGX_TIMER_INFINITE;                                       \
>> +    (qc)->first_rtt = NGX_TIMER_INFINITE;                                     \
>> +    (qc)->latest_rtt = 0;
>> +
>> 
>> typedef enum {
>>     NGX_QUIC_PATH_IDLE = 0,
>> diff --git a/src/event/quic/ngx_event_quic_migration.c b/src/event/quic/ngx_event_quic_migration.c
>> --- a/src/event/quic/ngx_event_quic_migration.c
>> +++ b/src/event/quic/ngx_event_quic_migration.c
>> @@ -181,6 +181,8 @@ valid:
>>                                            14720));
>>         qc->congestion.ssthresh = (size_t) -1;
>>         qc->congestion.recovery_start = ngx_current_msec;
>> +
>> +        ngx_quic_init_rtt(qc);
>>     }
>> 
>>     path->validated = 1;
> 
> Looks fine.
> 
>> # HG changeset patch
>> # User Sergey Kandaurov <pluknet at nginx.com>
>> # Date 1702285561 -14400
>> #      Mon Dec 11 13:06:01 2023 +0400
>> # Node ID 2e747e7b203e9b62455fc6b8457bd10879a88bec
>> # Parent  34311bcfd27d3cc420771b9349108e839d6a532e
>> QUIC: path aware in-flight bytes accounting.
>> 
>> On-packet acknowledgement is made path aware, as per RFC 9000, Section 9.4:
>>    Packets sent on the old path MUST NOT contribute to congestion control
>>    or RTT estimation for the new path.
>> 
>> To make it possible over a single congestion control context, the first packet
>> to be sent after the new path has been validated, which includes resetting the
>> congestion controller and RTT estimator, is now remembered in the connection.
>> Packets sent previously, such as on the old path, are not taken into account.
>> 
>> diff --git a/src/event/quic/ngx_event_quic_ack.c b/src/event/quic/ngx_event_quic_ack.c
>> --- a/src/event/quic/ngx_event_quic_ack.c
>> +++ b/src/event/quic/ngx_event_quic_ack.c
>> @@ -325,6 +325,10 @@ ngx_quic_congestion_ack(ngx_connection_t
>>     qc = ngx_quic_get_connection(c);
>>     cg = &qc->congestion;
>> 
>> +    if (f->pnum < qc->rst_pnum) {
>> +        return;
>> +    }
> 
> While this condition looks simple, there's something more complex going on.
> The value for pnum corresponds to the first application level packet that is
> send over the new path.  Here however there's no check for packet level because
> other levels can only show up while rst_pnum is zero.  I think this should be
> explained in the commit log.

Thanks, added this para at the end for clarity:

Note that although the packet number is saved per-connection, the added checks
affect application level packets only.  For non-application level packets,
which are only processed prior to the handshake is complete, the remembered
packet number remains set to zero.

> 
>> +
>>     blocked = (cg->in_flight >= cg->window) ? 1 : 0;
>> 
>>     cg->in_flight -= f->plen;
>> @@ -667,6 +671,10 @@ ngx_quic_congestion_lost(ngx_connection_
>>     qc = ngx_quic_get_connection(c);
>>     cg = &qc->congestion;
>> 
>> +    if (f->pnum < qc->rst_pnum) {
>> +        return;
>> +    }
> 
> The above applies here as well.
> 
>> +
>>     blocked = (cg->in_flight >= cg->window) ? 1 : 0;
>> 
>>     cg->in_flight -= f->plen;
>> diff --git a/src/event/quic/ngx_event_quic_connection.h b/src/event/quic/ngx_event_quic_connection.h
>> --- a/src/event/quic/ngx_event_quic_connection.h
>> +++ b/src/event/quic/ngx_event_quic_connection.h
>> @@ -266,6 +266,8 @@ struct ngx_quic_connection_s {
>>     ngx_quic_streams_t                streams;
>>     ngx_quic_congestion_t             congestion;
>> 
>> +    uint64_t                          rst_pnum;    /* first on validated path */
>> +
>>     off_t                             received;
>> 
>>     ngx_uint_t                        error;
>> diff --git a/src/event/quic/ngx_event_quic_migration.c b/src/event/quic/ngx_event_quic_migration.c
>> --- a/src/event/quic/ngx_event_quic_migration.c
>> +++ b/src/event/quic/ngx_event_quic_migration.c
>> @@ -110,6 +110,7 @@ ngx_quic_handle_path_response_frame(ngx_
>>     ngx_uint_t              rst;
>>     ngx_queue_t            *q;
>>     ngx_quic_path_t        *path, *prev;
>> +    ngx_quic_send_ctx_t    *ctx;
>>     ngx_quic_connection_t  *qc;
>> 
>>     qc = ngx_quic_get_connection(c);
>> @@ -174,6 +175,11 @@ valid:
>>     }
>> 
>>     if (rst) {
>> +        /* prevent old path packets contribution to congestion control */
>> +
>> +        ctx = ngx_quic_get_send_ctx(qc, ssl_encryption_application);
>> +        qc->rst_pnum = ctx->pnum;
>> +
>>         ngx_memzero(&qc->congestion, sizeof(ngx_quic_congestion_t));
>> 
>>         qc->congestion.window = ngx_min(10 * qc->tp.max_udp_payload_size,
> 
> Here we call a reset function for rtt below, but there's no such function for
> congestion.  As discussed in private, this will be addresses later while
> improving congestion algorithm.
> 
> The patch looks ok.
> 
>> # HG changeset patch
>> # User Sergey Kandaurov <pluknet at nginx.com>
>> # Date 1702287236 -14400
>> #      Mon Dec 11 13:33:56 2023 +0400
>> # Node ID 561791598a9610e79ea4bc24788ff887d032b3b3
>> # Parent  2e747e7b203e9b62455fc6b8457bd10879a88bec
>> QUIC: abandoned PMTU discovery on the old path.
>> 
>> It is seemingly a useless work to probe MTU on the old path, which may be gone.
>> This also goes in line with RFC 9000, Section 8.2.4, Failed Path Validation,
>> which prescribes to abandon old path validation if switching to a new path.
>> 
>> diff --git a/src/event/quic/ngx_event_quic_migration.c b/src/event/quic/ngx_event_quic_migration.c
>> --- a/src/event/quic/ngx_event_quic_migration.c
>> +++ b/src/event/quic/ngx_event_quic_migration.c
>> @@ -497,6 +497,7 @@ ngx_quic_handle_migration(ngx_connection
>>             }
>>         }
>> 
>> +        qc->path->state = NGX_QUIC_PATH_IDLE;
>>         qc->path->tag = NGX_QUIC_PATH_BACKUP;
>>         ngx_quic_path_dbg(c, "is now backup", qc->path);
> 
> When new path validation fails, we return to the backup path with potentially
> unfinished MTU discovery.  We need to restart MTU discovery in this case by
> calling ngx_quic_discover_path_mtu().  If the discovery was finished, the
> function will do nothing.  A separate case is when the path was validated, but
> initial MTU (1200) was not verified due to amplification limit.  This should be
> addressed as well.

While restarting PMTUD sounds doable, restoring MTU path validation
may need more work.  I think we can postpone this patch, since it
is not directly related to this series.

> 
>> # HG changeset patch
>> # User Sergey Kandaurov <pluknet at nginx.com>
>> # Date 1702287237 -14400
>> #      Mon Dec 11 13:33:57 2023 +0400
>> # Node ID 2f65a579a235d08902e47cf796db3c1bc1ca8790
>> # Parent  561791598a9610e79ea4bc24788ff887d032b3b3
>> QUIC: do not arm PTO timer for path validation.
>> 
>> As per RFC 9000, Section 9.4, a separate timer is set when a PATH_CHALLENGE
>> is sent.  Further, PATH_CHALLENGE frames have a distinct retransmission policy.
>> Thus, it makes no sense to arm the PTO timer for PATH_CHALLENGE, which may
>> result in spurious PTO events.
>> 
>> diff --git a/src/event/quic/ngx_event_quic_ack.c b/src/event/quic/ngx_event_quic_ack.c
>> --- a/src/event/quic/ngx_event_quic_ack.c
>> +++ b/src/event/quic/ngx_event_quic_ack.c
>> @@ -752,7 +752,23 @@ ngx_quic_set_lost_timer(ngx_connection_t
>>             }
>>         }
>> 
>> -        q = ngx_queue_last(&ctx->sent);
>> +        /* PATH_CHALLENGE timer is separate (see RFC 9000, 9.4) */
>> +
>> +        for (q = ngx_queue_last(&ctx->sent);
>> +             q != ngx_queue_sentinel(&ctx->sent);
>> +             q = ngx_queue_prev(q))
>> +        {
>> +            f = ngx_queue_data(q, ngx_quic_frame_t, queue);
>> +
>> +            if (f->type != NGX_QUIC_FT_PATH_CHALLENGE) {
>> +                break;
>> +            }
>> +        }
>> +
>> +        if (q == ngx_queue_sentinel(&ctx->sent)) {
>> +            continue;
>> +        }
>> +
>>         f = ngx_queue_data(q, ngx_quic_frame_t, queue);
>>         w = (ngx_msec_int_t)
>>                 (f->send_time + (ngx_quic_pto(c, ctx) << qc->pto_count) - now);
>> # HG changeset patch
>> # User Sergey Kandaurov <pluknet at nginx.com>
>> # Date 1702287238 -14400
>> #      Mon Dec 11 13:33:58 2023 +0400
>> # Node ID 4e1706b406978b8546f68a72e873c8579b8bc1d5
>> # Parent  2f65a579a235d08902e47cf796db3c1bc1ca8790
>> QUIC: do not consider PATH_CHALLENGE acknowledgment an RTT sample.
>> 
>> Since PATH_CHALLENGE frames use a separate retransmission timer (other than
>> PTO), it makes no sense to use PATH_CHALLENGE ACKs for RTT samples as well.
>> This extends RFC 9002, Section 6.2.2 to ACK frames.  Previously, through
>> influence on RTT estimator, it could be possible to affect PTO duration,
>> which is not used for resending PATH_CHALLENGE frames.
>> 
>> diff --git a/src/event/quic/ngx_event_quic_ack.c b/src/event/quic/ngx_event_quic_ack.c
>> --- a/src/event/quic/ngx_event_quic_ack.c
>> +++ b/src/event/quic/ngx_event_quic_ack.c
>> @@ -264,7 +264,9 @@ ngx_quic_handle_ack_frame_range(ngx_conn
>>                 break;
>>             }
>> 
>> -            if (f->pnum == max) {
>> +            /* PATH_CHALLENGE timer is separate (see RFC 9000, 9.4) */
>> +
>> +            if (f->pnum == max && f->type != NGX_QUIC_FT_PATH_CHALLENGE) {
>>                 st->max_pn = f->send_time;
>>             }
> 
> The last two patches will be addressed separately.

-- 
Sergey Kandaurov


More information about the nginx-devel mailing list