[PATCH 1 of 3] QUIC: changed path validation timeout

Sergey Kandaurov pluknet at nginx.com
Tue May 9 10:11:43 UTC 2023


> On 3 May 2023, at 17:00, Roman Arutyunyan <arut at nginx.com> wrote:
> 
> Hi,
> 
> On Mon, Apr 24, 2023 at 04:15:21PM +0400, Sergey Kandaurov wrote:
>> 
>> 
>>> On 28 Mar 2023, at 18:51, Roman Arutyunyan <arut at nginx.com> wrote:
>>> 
>>> # HG changeset patch
>>> # User Roman Arutyunyan <arut at nginx.com>
>>> # Date 1679925333 -14400
>>> #      Mon Mar 27 17:55:33 2023 +0400
>>> # Branch quic
>>> # Node ID f76e83412133085a6c82fce2c3e15b2c34a6e959
>>> # Parent  5fd628b89bb7fb5c95afa1dc914385f7ab79f6a3
>>> QUIC: changed path validation timeout.
>>> 
>>> Path validation packets containing PATH_CHALLENGE frames are sent separately
>>> from regular frame queue, because of the need to use a decicated path and
>>> pad the packets.  The packets are also resent separately from the regular
>>> probe/lost detection mechanism.  A path validation packet is resent 3 times,
>>> each time after PTO expiration.  Assuming constant PTO, the overall maximum
>>> waiting time is 3 * PTO.  According to RFC 9000, 8.2.4. Failed Path Validation,
>>> the following value is recommended as a validation timeout:
>>> 
>>> A value of three times the larger of the current PTO
>>> or the PTO for the new path (using kInitialRtt, as
>>> defined in [QUIC-RECOVERY]) is RECOMMENDED.
>>> 
>>> The change adds PTO of the new path to the equation as the lower bound.
>>> Also, max_ack_delay is now always accounted for, unlike previously, when
>>> it was only used when there are packets in flight.  As mentioned before,
>>> PACH_CHALLENGE is not considered in-flight by nginx since it's processed
>>> separately, but technically it is.
>> 
>> I don't like an idea to make a separate function to calculate
>> time for path validation retransmits.  It looks like an existing
>> function could be reused.
>> 
>> I tend to think checking for inflight packets in ngx_quic_pto()
>> isn't correct at the first place.  The condition comes from
>> the GetPtoTimeAndSpace example in 9002, A.8:
>> 
>> : GetPtoTimeAndSpace():
>> :   duration = (smoothed_rtt + max(4 * rttvar, kGranularity))
>> :       * (2 ^ pto_count)
>> :   // Anti-deadlock PTO starts from the current time
>> :   if (no ack-eliciting packets in flight):
>> :     assert(!PeerCompletedAddressValidation())
>> :     if (has handshake keys):
>> :       return (now() + duration), Handshake
>> :     else:
>> :       return (now() + duration), Initial
>> :   <..>
>> :   return pto_timeout, pto_space
>> 
>> But PeerCompletedAddressValidation is always true for the server.
>> The above anti-deadlock measure seems to only make sense for a client
>> when it has no new data to send, but forced to send something to rise
>> an anti-amplification limit for the server.  This thought is supported
>> by commentaries in places of GetPtoTimeAndSpace use.
>> 
>> Removing the condition from ngx_quic_pto() makes possible to unify
>> the function to use it for both regular PTO and path validation.
>> 
>> Next is to make retransmits similar to a new connection establishment.
>> Per RFC 9000, 8.2.1:
>> : An endpoint SHOULD NOT probe a new path with packets containing a
>> : PATH_CHALLENGE frame more frequently than it would send an Initial packet.
>> 
>> I think we can improve path validation to use a separate backoff,
>> path->tries can be used to base a backoff upon it.
>> 
>> Since PATH_CHALLENGE are resent separately from the regular probe/lost
>> detection mechanism, this needs to be moved out from ngx_quic_pto().
>> 
>> This makes the following series based on your patch.
>> We could set an overall maximum waiting time of 3 * PTO and test it
>> in pv handler in addition to the check for NGX_QUIC_PATH_RETRIES.
> 
> Jftr, discussed all of the above in person.  Agreed to implement that.
> 
>> [..]
>> # HG changeset patch
>> # User Sergey Kandaurov <pluknet at nginx.com>
>> # Date 1682338151 -14400
>> #      Mon Apr 24 16:09:11 2023 +0400
>> # Branch quic
>> # Node ID 808fe808e276496a9b026690c141201720744ab3
>> # Parent  f49aba6e3fb54843d3e3bd5df26dbb45f5d3d687
>> QUIC: separated path validation retransmit backoff.
>> 
>> Path validation packets containing PATH_CHALLENGE frames are sent separately
>> from regular frame queue, because of the need to use a decicated path and pad
>> the packets.  The packets are sent periodically, separately from the regular
>> probe/lost detection mechanism.  A path validation packet is resent up to 3
>> times, each time after PTO expiration, with increasing per-path PTO backoff.
>> 
>> diff --git a/src/event/quic/ngx_event_quic_ack.c b/src/event/quic/ngx_event_quic_ack.c
>> --- a/src/event/quic/ngx_event_quic_ack.c
>> +++ b/src/event/quic/ngx_event_quic_ack.c
>> @@ -736,7 +736,8 @@ ngx_quic_set_lost_timer(ngx_connection_t
>> 
>>         q = ngx_queue_last(&ctx->sent);
>>         f = ngx_queue_data(q, ngx_quic_frame_t, queue);
>> -        w = (ngx_msec_int_t) (f->last + ngx_quic_pto(c, ctx) - now);
>> +        w = (ngx_msec_int_t) (f->last + (ngx_quic_pto(c, ctx) << qc->pto_count)
>> +                              - now);
>> 
>>         if (w < 0) {
>>             w = 0;
>> @@ -785,10 +786,9 @@ ngx_quic_pto(ngx_connection_t *c, ngx_qu
>> 
>>     duration = qc->avg_rtt;
>>     duration += ngx_max(4 * qc->rttvar, NGX_QUIC_TIME_GRANULARITY);
>> -    duration <<= qc->pto_count;
>> 
>>     if (ctx->level == ssl_encryption_application && c->ssl->handshaked) {
>> -        duration += qc->ctp.max_ack_delay << qc->pto_count;
>> +        duration += qc->ctp.max_ack_delay;
>>     }
>> 
>>     return duration;
>> @@ -846,7 +846,9 @@ ngx_quic_pto_handler(ngx_event_t *ev)
>>             continue;
>>         }
>> 
>> -        if ((ngx_msec_int_t) (f->last + ngx_quic_pto(c, ctx) - now) > 0) {
>> +        if ((ngx_msec_int_t) (f->last + (ngx_quic_pto(c, ctx) << qc->pto_count)
>> +                              - now) > 0)
>> +        {
>>             continue;
>>         }
>> 
>> diff --git a/src/event/quic/ngx_event_quic_migration.c b/src/event/quic/ngx_event_quic_migration.c
>> --- a/src/event/quic/ngx_event_quic_migration.c
>> +++ b/src/event/quic/ngx_event_quic_migration.c
>> @@ -496,6 +496,7 @@ ngx_quic_validate_path(ngx_connection_t 
>>                    "quic initiated validation of path seq:%uL", path->seqnum);
>> 
>>     path->validating = 1;
>> +    path->tries = 0;
>> 
>>     if (RAND_bytes(path->challenge1, 8) != 1) {
>>         return NGX_ERROR;
>> @@ -513,7 +514,6 @@ ngx_quic_validate_path(ngx_connection_t 
>>     pto = ngx_quic_pto(c, ctx);
>> 
>>     path->expires = ngx_current_msec + pto;
>> -    path->tries = NGX_QUIC_PATH_RETRIES;
>> 
>>     if (!qc->path_validation.timer_set) {
>>         ngx_add_timer(&qc->path_validation, pto);
>> @@ -578,7 +578,6 @@ ngx_quic_path_validation_handler(ngx_eve
>>     qc = ngx_quic_get_connection(c);
>> 
>>     ctx = ngx_quic_get_send_ctx(qc, ssl_encryption_application);
>> -    pto = ngx_quic_pto(c, ctx);
>> 
>>     next = -1;
>>     now = ngx_current_msec;
>> @@ -605,7 +604,9 @@ ngx_quic_path_validation_handler(ngx_eve
>>             continue;
>>         }
>> 
>> -        if (--path->tries) {
>> +        if (++path->tries < NGX_QUIC_PATH_RETRIES) {
>> +            pto = ngx_quic_pto(c, ctx) << path->tries;
> 
> Here we schedule a timer for 2 * PTO or 4 * PTO.  Technically, our path
> validation code allows several paths to be validated at the same time.  If that
> happens, 2 * PTO and 4 * PTO will be too much for the first attempt for the new
> path, where the timeout is just PTO.  As a result, the last path may wait 2x
> or 4x longer than needed.

Agree it needs to be addressed.

Below is the first glance to set the next available pv timer:
1) in ngx_quic_validate_path(), a new path may own a shorter timer
than already established, especially after several timeouts
Previously, new path validation could be scheduled late.
2) in ngx_quic_handle_path_response_frame(), a validated path
might left a spurious timer, while nothing more to validate
or remaining paths have more distant timer
This one is less severe, just extra work added.

Unlike the two above, pv handler has a more complex
logic I decided not to touch it.

# HG changeset patch
# User Sergey Kandaurov <pluknet at nginx.com>
# Date 1683626923 -14400
#      Tue May 09 14:08:43 2023 +0400
# Branch quic
# Node ID 90f3e839532d899b09967cb2db3b3de30484c484
# Parent  c9a6960c548501b5234ca7f7ff5ae23fad75b023
QUIC: reschedule path validation on path insertion/removal.

Two issues fixed:
- new path validation could be scheduled late
- a validated path could leave a spurious timer

diff --git a/src/event/quic/ngx_event_quic_migration.c b/src/event/quic/ngx_event_quic_migration.c
--- a/src/event/quic/ngx_event_quic_migration.c
+++ b/src/event/quic/ngx_event_quic_migration.c
@@ -16,6 +16,7 @@ static ngx_int_t ngx_quic_validate_path(
     ngx_quic_path_t *path);
 static ngx_int_t ngx_quic_send_path_challenge(ngx_connection_t *c,
     ngx_quic_path_t *path);
+static ngx_msec_int_t ngx_quic_next_pv_timer(ngx_connection_t *c);
 static ngx_quic_path_t *ngx_quic_get_path(ngx_connection_t *c, ngx_uint_t tag);
 
 
@@ -78,6 +79,7 @@ ngx_quic_handle_path_response_frame(ngx_
 {
     ngx_uint_t              rst;
     ngx_queue_t            *q;
+    ngx_msec_int_t          next;
     ngx_quic_path_t        *path, *prev;
     ngx_quic_connection_t  *qc;
 
@@ -169,6 +171,17 @@ valid:
     path->validating = 0;
     path->limited = 0;
 
+    /* reschedule if validated path owned next timer */
+
+    next = ngx_quic_next_pv_timer(c);
+
+    if (next == -1) {
+        ngx_del_timer(&qc->path_validation);
+
+    } else if (next > (ngx_msec_int_t) (path->expires - ngx_current_msec)) {
+        ngx_add_timer(&qc->path_validation, next);
+    }
+
     return NGX_OK;
 }
 
@@ -486,7 +499,7 @@ ngx_quic_handle_migration(ngx_connection
 static ngx_int_t
 ngx_quic_validate_path(ngx_connection_t *c, ngx_quic_path_t *path)
 {
-    ngx_msec_t              pto;
+    ngx_msec_t              pto, next;
     ngx_quic_send_ctx_t    *ctx;
     ngx_quic_connection_t  *qc;
 
@@ -517,6 +530,15 @@ ngx_quic_validate_path(ngx_connection_t 
 
     if (!qc->path_validation.timer_set) {
         ngx_add_timer(&qc->path_validation, pto);
+        return NGX_OK;
+    }
+
+    /* reschedule if new path owns next timer */
+
+    next = ngx_quic_next_pv_timer(c);
+
+    if (next == pto) {
+        ngx_add_timer(&qc->path_validation, next);
     }
 
     return NGX_OK;
@@ -563,6 +585,41 @@ ngx_quic_send_path_challenge(ngx_connect
 }
 
 
+static ngx_msec_int_t
+ngx_quic_next_pv_timer(ngx_connection_t *c)
+{
+    ngx_msec_t              now;
+    ngx_queue_t            *q;
+    ngx_msec_int_t          left, next;
+    ngx_quic_path_t        *path;
+    ngx_quic_connection_t  *qc;
+
+    qc = ngx_quic_get_connection(c);
+
+    now = ngx_current_msec;
+    next = -1;
+
+    for (q = ngx_queue_head(&qc->paths);
+         q != ngx_queue_sentinel(&qc->paths);
+         q = ngx_queue_next(q))
+    {
+        path = ngx_queue_data(q, ngx_quic_path_t, queue);
+
+        if (!path->validating) {
+            continue;
+        }
+
+        left = path->expires - now;
+
+        if (next == -1 || left < next) {
+            next = left;
+        }
+    }
+
+    return next;
+}
+
+
 void
 ngx_quic_path_validation_handler(ngx_event_t *ev)
 {


>  Luckily, this is not possible, because when we start
> validation of a path, the validation for the old path is stopped.

As internally discussed, it is still possible on apparent migration,
where path validation starts on a previous active and now active paths.

>  This
> allows us to simplify the PATH_RESPONSE and timeout handlers to only handle
> qc->path.  I suggest to add a patch for this.

Hence, this wont work if client validates backup path (which is not qc->path).

>  It will also affect PMTUD,
> since the same handler will be responsible for MTU packets.  Also, it's
> probably a good idea to suspend PMTUD for the old path when we switch to a
> new one.  This will happend naturally if we support only one path in those
> handlers.
> 

-- 
Sergey Kandaurov


More information about the nginx-devel mailing list