[PATCH 1 of 6] QUIC: ignore server address while looking up a connection

Roman Arutyunyan arut at nginx.com
Thu Jan 19 14:01:13 UTC 2023


Hi,

On Thu, Dec 29, 2022 at 05:13:49PM +0400, Sergey Kandaurov wrote:

[..]

> # HG changeset patch
> # User Yu Zhu <lishu.zy at alibaba-inc.com>
> # Date 1672317960 -14400
> #      Thu Dec 29 16:46:00 2022 +0400
> # Branch quic
> # Node ID 46e738a5c0ab9d98577b21e72447cc9a6e3e9784
> # Parent  91ad1abfb2850f952bccb607e4c5843854576a09
> QUIC: moved rtt and congestion control to ngx_quic_path_t.
> 
> As per RFC 9002, section 6. Loss Detection:
> 
>    Loss detection is separate per packet number space, unlike RTT measurement
>    and congestion control, because RTT and congestion control are properties
>    of the path, whereas loss detection also relies upon key availability.

RFC 9000 (Section 9.4.) is less certain about the congestion control part:

    While multiple paths might be used during connection migration,
    a single congestion control context and a single loss recovery
    context (as described in [QUIC-RECOVERY]) could be adequate.
    For instance, an endpoint might delay switching to a new congestion
    control context until it is confirmed that an old path is no longer
    needed (such as the case described in Section 9.3.3).

Loss recovery context is path-independent before and after the change, so maybe
congestion control should remain where it is now.

> No functional changes.

I'm not sure about this.

> diff --git a/src/event/quic/ngx_event_quic.c b/src/event/quic/ngx_event_quic.c
> --- a/src/event/quic/ngx_event_quic.c
> +++ b/src/event/quic/ngx_event_quic.c
> @@ -263,15 +263,6 @@ ngx_quic_new_connection(ngx_connection_t
>  
>      ngx_queue_init(&qc->free_frames);
>  
> -    qc->avg_rtt = NGX_QUIC_INITIAL_RTT;
> -    qc->rttvar = NGX_QUIC_INITIAL_RTT / 2;
> -    qc->min_rtt = NGX_TIMER_INFINITE;
> -    qc->first_rtt = NGX_TIMER_INFINITE;
> -
> -    /*
> -     * qc->latest_rtt = 0
> -     */
> -
>      qc->pto.log = c->log;
>      qc->pto.data = c;
>      qc->pto.handler = ngx_quic_pto_handler;
> @@ -311,12 +302,6 @@ ngx_quic_new_connection(ngx_connection_t
>      qc->streams.client_max_streams_uni = qc->tp.initial_max_streams_uni;
>      qc->streams.client_max_streams_bidi = qc->tp.initial_max_streams_bidi;
>  
> -    qc->congestion.window = ngx_min(10 * qc->tp.max_udp_payload_size,
> -                                    ngx_max(2 * qc->tp.max_udp_payload_size,
> -                                            14720));
> -    qc->congestion.ssthresh = (size_t) -1;
> -    qc->congestion.recovery_start = ngx_current_msec;
> -
>      if (pkt->validated && pkt->retried) {
>          qc->tp.retry_scid.len = pkt->dcid.len;
>          qc->tp.retry_scid.data = ngx_pstrdup(c->pool, &pkt->dcid);
> diff --git a/src/event/quic/ngx_event_quic_ack.c b/src/event/quic/ngx_event_quic_ack.c
> --- a/src/event/quic/ngx_event_quic_ack.c
> +++ b/src/event/quic/ngx_event_quic_ack.c
> @@ -29,7 +29,7 @@ typedef struct {
>  } ngx_quic_ack_stat_t;
>  
>  
> -static ngx_inline ngx_msec_t ngx_quic_lost_threshold(ngx_quic_connection_t *qc);
> +static ngx_inline ngx_msec_t ngx_quic_lost_threshold(ngx_quic_path_t *path);
>  static void ngx_quic_rtt_sample(ngx_connection_t *c, ngx_quic_ack_frame_t *ack,
>      enum ssl_encryption_level_t level, ngx_msec_t send_time);
>  static ngx_int_t ngx_quic_handle_ack_frame_range(ngx_connection_t *c,
> @@ -48,11 +48,11 @@ static void ngx_quic_lost_handler(ngx_ev
>  
>  /* RFC 9002, 6.1.2. Time Threshold: kTimeThreshold, kGranularity */
>  static ngx_inline ngx_msec_t
> -ngx_quic_lost_threshold(ngx_quic_connection_t *qc)
> +ngx_quic_lost_threshold(ngx_quic_path_t *path)
>  {
>      ngx_msec_t  thr;
>  
> -    thr = ngx_max(qc->latest_rtt, qc->avg_rtt);
> +    thr = ngx_max(path->latest_rtt, path->avg_rtt);
>      thr += thr >> 3;
>  
>      return ngx_max(thr, NGX_QUIC_TIME_GRANULARITY);
> @@ -179,21 +179,23 @@ ngx_quic_rtt_sample(ngx_connection_t *c,
>      enum ssl_encryption_level_t level, ngx_msec_t send_time)
>  {
>      ngx_msec_t              latest_rtt, ack_delay, adjusted_rtt, rttvar_sample;
> +    ngx_quic_path_t        *path;
>      ngx_quic_connection_t  *qc;
>  
>      qc = ngx_quic_get_connection(c);
> +    path = qc->path;

Technically this is not correct.  The correct path for update is pkt->path
from ngx_quic_handle_ack_frame() which may differ from qc->path.
TBH accessing qc->path seems suspicious to me in most cases since pkt->path is
always a better choice, except where there's no pkt in current context.

However, even after fixing that, the result is still incorrect since we
need to track rtt for the path the initial packet was sent over.  It's also
desirable for the ACK to be received on the same path IMHO.  This is
partially addressed by the next patch (not for rtt though, but for congestion).

I suggest that we pass pkt->path to this function and update the next patch
to address the remaining issues.

>      latest_rtt = ngx_current_msec - send_time;
> -    qc->latest_rtt = latest_rtt;
> +    path->latest_rtt = latest_rtt;
>  
> -    if (qc->min_rtt == NGX_TIMER_INFINITE) {
> -        qc->min_rtt = latest_rtt;
> -        qc->avg_rtt = latest_rtt;
> -        qc->rttvar = latest_rtt / 2;
> -        qc->first_rtt = ngx_current_msec;
> +    if (path->min_rtt == NGX_TIMER_INFINITE) {
> +        path->min_rtt = latest_rtt;
> +        path->avg_rtt = latest_rtt;
> +        path->rttvar = latest_rtt / 2;
> +        path->first_rtt = ngx_current_msec;
>  
>      } else {
> -        qc->min_rtt = ngx_min(qc->min_rtt, latest_rtt);
> +        path->min_rtt = ngx_min(path->min_rtt, latest_rtt);
>  
>          ack_delay = (ack->delay << qc->ctp.ack_delay_exponent) / 1000;
>  
> @@ -203,18 +205,19 @@ ngx_quic_rtt_sample(ngx_connection_t *c,
>  
>          adjusted_rtt = latest_rtt;
>  
> -        if (qc->min_rtt + ack_delay < latest_rtt) {
> +        if (path->min_rtt + ack_delay < latest_rtt) {
>              adjusted_rtt -= ack_delay;
>          }
>  
> -        qc->avg_rtt += (adjusted_rtt >> 3) - (qc->avg_rtt >> 3);
> -        rttvar_sample = ngx_abs((ngx_msec_int_t) (qc->avg_rtt - adjusted_rtt));
> -        qc->rttvar += (rttvar_sample >> 2) - (qc->rttvar >> 2);
> +        path->avg_rtt += (adjusted_rtt >> 3) - (path->avg_rtt >> 3);
> +        rttvar_sample = ngx_abs((ngx_msec_int_t)
> +                                (path->avg_rtt - adjusted_rtt));
> +        path->rttvar += (rttvar_sample >> 2) - (path->rttvar >> 2);
>      }
>  
>      ngx_log_debug4(NGX_LOG_DEBUG_EVENT, c->log, 0,
>                     "quic rtt sample latest:%M min:%M avg:%M var:%M",
> -                   latest_rtt, qc->min_rtt, qc->avg_rtt, qc->rttvar);
> +                   latest_rtt, path->min_rtt, path->avg_rtt, path->rttvar);
>  }
>  
>  
> @@ -317,7 +320,7 @@ ngx_quic_congestion_ack(ngx_connection_t
>      }
>  
>      qc = ngx_quic_get_connection(c);
> -    cg = &qc->congestion;
> +    cg = &qc->path->congestion;

We need to find the right path here, but the next patch fixes this.

>      blocked = (cg->in_flight >= cg->window) ? 1 : 0;
>  
> @@ -428,13 +431,15 @@ ngx_quic_detect_lost(ngx_connection_t *c
>      ngx_uint_t              i, nlost;
>      ngx_msec_t              now, wait, thr, oldest, newest;
>      ngx_queue_t            *q;
> +    ngx_quic_path_t        *path;
>      ngx_quic_frame_t       *start;
>      ngx_quic_send_ctx_t    *ctx;
>      ngx_quic_connection_t  *qc;
>  
>      qc = ngx_quic_get_connection(c);
> +    path = qc->path;

Not sure what to do here, but it's obvious that we can have packets sent over
different paths in ctx->sent with different "thr" and "first_rtt".
Probably we need to find the right path for each packet in loop.  This may
delay followup packets which could otherwise be resent earlier though.

Or just ignore all that and use qc->path, since everything else is so much
harder and the effect from switching to qc->path for all packets may be just
a small shift in waiting time before resend.

>      now = ngx_current_msec;
> -    thr = ngx_quic_lost_threshold(qc);
> +    thr = ngx_quic_lost_threshold(path);
>  
>      /* send time of lost packets across all send contexts */
>      oldest = NGX_TIMER_INFINITE;
> @@ -471,7 +476,7 @@ ngx_quic_detect_lost(ngx_connection_t *c
>                  break;
>              }
>  
> -            if (start->last > qc->first_rtt) {
> +            if (start->last > path->first_rtt) {
>  
>                  if (oldest == NGX_TIMER_INFINITE || start->last < oldest) {
>                      oldest = start->last;
> @@ -519,8 +524,8 @@ ngx_quic_pcg_duration(ngx_connection_t *
>  
>      qc = ngx_quic_get_connection(c);
>  
> -    duration = qc->avg_rtt;
> -    duration += ngx_max(4 * qc->rttvar, NGX_QUIC_TIME_GRANULARITY);
> +    duration = qc->path->avg_rtt;
> +    duration += ngx_max(4 * qc->path->rttvar, NGX_QUIC_TIME_GRANULARITY);
>      duration += qc->ctp.max_ack_delay;
>      duration *= NGX_QUIC_PERSISTENT_CONGESTION_THR;
>  
> @@ -535,7 +540,7 @@ ngx_quic_persistent_congestion(ngx_conne
>      ngx_quic_connection_t  *qc;
>  
>      qc = ngx_quic_get_connection(c);
> -    cg = &qc->congestion;
> +    cg = &qc->path->congestion;
>  
>      cg->recovery_start = ngx_current_msec;
>      cg->window = qc->tp.max_udp_payload_size * 2;
> @@ -656,7 +661,7 @@ ngx_quic_congestion_lost(ngx_connection_
>      }
>  
>      qc = ngx_quic_get_connection(c);
> -    cg = &qc->congestion;
> +    cg = &qc->path->congestion;

We need to find the right path here for the frame.  The third patch addresses
this, but I suggest another solution, see below.

>      blocked = (cg->in_flight >= cg->window) ? 1 : 0;
>  
> @@ -721,7 +726,8 @@ ngx_quic_set_lost_timer(ngx_connection_t
>          if (ctx->largest_ack != NGX_QUIC_UNSET_PN) {
>              q = ngx_queue_head(&ctx->sent);
>              f = ngx_queue_data(q, ngx_quic_frame_t, queue);
> -            w = (ngx_msec_int_t) (f->last + ngx_quic_lost_threshold(qc) - now);
> +            w = (ngx_msec_int_t)
> +                           (f->last + ngx_quic_lost_threshold(qc->path) - now);
>  
>              if (f->pnum <= ctx->largest_ack) {
>                  if (w < 0 || ctx->largest_ack - f->pnum >= NGX_QUIC_PKT_THR) {
> @@ -777,17 +783,19 @@ ngx_msec_t
>  ngx_quic_pto(ngx_connection_t *c, ngx_quic_send_ctx_t *ctx)
>  {
>      ngx_msec_t              duration;
> +    ngx_quic_path_t        *path;
>      ngx_quic_connection_t  *qc;
>  
>      qc = ngx_quic_get_connection(c);
> +    path = qc->path;
>  
>      /* RFC 9002, Appendix A.8.  Setting the Loss Detection Timer */
> -    duration = qc->avg_rtt;
> +    duration = path->avg_rtt;
>  
> -    duration += ngx_max(4 * qc->rttvar, NGX_QUIC_TIME_GRANULARITY);
> +    duration += ngx_max(4 * path->rttvar, NGX_QUIC_TIME_GRANULARITY);
>      duration <<= qc->pto_count;
>  
> -    if (qc->congestion.in_flight == 0) { /* no in-flight packets */
> +    if (path->congestion.in_flight == 0) { /* no in-flight packets */
>          return duration;
>      }
>  
> diff --git a/src/event/quic/ngx_event_quic_connection.h b/src/event/quic/ngx_event_quic_connection.h
> --- a/src/event/quic/ngx_event_quic_connection.h
> +++ b/src/event/quic/ngx_event_quic_connection.h
> @@ -80,6 +80,14 @@ struct ngx_quic_server_id_s {
>  };
>  
>  
> +typedef struct {
> +    size_t                            in_flight;
> +    size_t                            window;
> +    size_t                            ssthresh;
> +    ngx_msec_t                        recovery_start;
> +} ngx_quic_congestion_t;
> +
> +
>  struct ngx_quic_path_s {
>      ngx_queue_t                       queue;
>      struct sockaddr                  *sockaddr;
> @@ -96,6 +104,15 @@ struct ngx_quic_path_s {
>      uint64_t                          seqnum;
>      ngx_str_t                         addr_text;
>      u_char                            text[NGX_SOCKADDR_STRLEN];
> +
> +    ngx_msec_t                        first_rtt;
> +    ngx_msec_t                        latest_rtt;
> +    ngx_msec_t                        avg_rtt;
> +    ngx_msec_t                        min_rtt;
> +    ngx_msec_t                        rttvar;
> +
> +    ngx_quic_congestion_t             congestion;
> +
>      unsigned                          validated:1;
>      unsigned                          validating:1;
>      unsigned                          limited:1;
> @@ -143,14 +160,6 @@ typedef struct {
>  } ngx_quic_streams_t;
>  
>  
> -typedef struct {
> -    size_t                            in_flight;
> -    size_t                            window;
> -    size_t                            ssthresh;
> -    ngx_msec_t                        recovery_start;
> -} ngx_quic_congestion_t;
> -
> -
>  /*
>   * RFC 9000, 12.3.  Packet Numbers
>   *
> @@ -218,12 +227,6 @@ struct ngx_quic_connection_s {
>      ngx_event_t                       path_validation;
>      ngx_msec_t                        last_cc;
>  
> -    ngx_msec_t                        first_rtt;
> -    ngx_msec_t                        latest_rtt;
> -    ngx_msec_t                        avg_rtt;
> -    ngx_msec_t                        min_rtt;
> -    ngx_msec_t                        rttvar;
> -
>      ngx_uint_t                        pto_count;
>  
>      ngx_queue_t                       free_frames;
> @@ -237,7 +240,6 @@ struct ngx_quic_connection_s {
>  #endif
>  
>      ngx_quic_streams_t                streams;
> -    ngx_quic_congestion_t             congestion;
>  
>      off_t                             received;
>  
> diff --git a/src/event/quic/ngx_event_quic_migration.c b/src/event/quic/ngx_event_quic_migration.c
> --- a/src/event/quic/ngx_event_quic_migration.c
> +++ b/src/event/quic/ngx_event_quic_migration.c
> @@ -135,17 +135,26 @@ valid:
>          {
>              /* address did not change */
>              rst = 0;
> +
> +            path->avg_rtt = prev->avg_rtt;
> +            path->rttvar = prev->rttvar;
> +            path->min_rtt = prev->min_rtt;
> +            path->first_rtt = prev->first_rtt;
> +            path->latest_rtt = prev->latest_rtt;

What if we introduce a new struct ngx_quic_rtt_t similar to
ngx_quic_congestion_t?

> +
> +            ngx_memcpy(&path->congestion, &prev->congestion,
> +                       sizeof(ngx_quic_congestion_t));
>          }
>      }
>  
>      if (rst) {
> -        ngx_memzero(&qc->congestion, sizeof(ngx_quic_congestion_t));
> +        ngx_memzero(&path->congestion, sizeof(ngx_quic_congestion_t));
>  
> -        qc->congestion.window = ngx_min(10 * qc->tp.max_udp_payload_size,
> -                                   ngx_max(2 * qc->tp.max_udp_payload_size,
> -                                           14720));
> -        qc->congestion.ssthresh = (size_t) -1;
> -        qc->congestion.recovery_start = ngx_current_msec;
> +        path->congestion.window = ngx_min(10 * qc->tp.max_udp_payload_size,
> +                                       ngx_max(2 * qc->tp.max_udp_payload_size,
> +                                               14720));
> +        path->congestion.ssthresh = (size_t) -1;
> +        path->congestion.recovery_start = ngx_current_msec;

We need to reset rtt as well.

  On confirming a peer's ownership of its new address, an endpoint MUST
  immediately reset the congestion controller and **round-trip time**
  estimator for the new path to initial values

We can do it in a separate patch.  Also, I feel like we need separate functions
for initializing rtt and congestion.  The congestion initializer was present in
the initial version of the patch as one of congestion methods.  I think even
though we have no methods now, the function still makes sense.

>      }
>  
>      /*
> @@ -217,6 +226,21 @@ ngx_quic_new_path(ngx_connection_t *c,
>      path->addr_text.len = ngx_sock_ntop(sockaddr, socklen, path->text,
>                                          NGX_SOCKADDR_STRLEN, 1);
>  
> +    path->avg_rtt = NGX_QUIC_INITIAL_RTT;
> +    path->rttvar = NGX_QUIC_INITIAL_RTT / 2;
> +    path->min_rtt = NGX_TIMER_INFINITE;
> +    path->first_rtt = NGX_TIMER_INFINITE;
> +
> +    /*
> +     * path->latest_rtt = 0
> +     */
> +
> +    path->congestion.window = ngx_min(10 * qc->tp.max_udp_payload_size,
> +                                      ngx_max(2 * qc->tp.max_udp_payload_size,
> +                                              14720));
> +    path->congestion.ssthresh = (size_t) -1;
> +    path->congestion.recovery_start = ngx_current_msec;
> +
>      ngx_log_debug2(NGX_LOG_DEBUG_EVENT, c->log, 0,
>                     "quic path seq:%uL created addr:%V",
>                     path->seqnum, &path->addr_text);
> diff --git a/src/event/quic/ngx_event_quic_output.c b/src/event/quic/ngx_event_quic_output.c
> --- a/src/event/quic/ngx_event_quic_output.c
> +++ b/src/event/quic/ngx_event_quic_output.c
> @@ -87,7 +87,7 @@ ngx_quic_output(ngx_connection_t *c)
>      c->log->action = "sending frames";
>  
>      qc = ngx_quic_get_connection(c);
> -    cg = &qc->congestion;
> +    cg = &qc->path->congestion;
>  
>      in_flight = cg->in_flight;
>  
> @@ -135,8 +135,8 @@ ngx_quic_create_datagrams(ngx_connection
>      static u_char           dst[NGX_QUIC_MAX_UDP_PAYLOAD_SIZE];
>  
>      qc = ngx_quic_get_connection(c);
> -    cg = &qc->congestion;
>      path = qc->path;
> +    cg = &path->congestion;
>  
>      while (cg->in_flight < cg->window) {
>  
> @@ -222,8 +222,7 @@ ngx_quic_commit_send(ngx_connection_t *c
>      ngx_quic_connection_t  *qc;
>  
>      qc = ngx_quic_get_connection(c);
> -
> -    cg = &qc->congestion;
> +    cg = &qc->path->congestion;
>  
>      while (!ngx_queue_empty(&ctx->sending)) {
>  
> @@ -336,8 +335,8 @@ ngx_quic_create_segments(ngx_connection_
>      static u_char           dst[NGX_QUIC_MAX_UDP_SEGMENT_BUF];
>  
>      qc = ngx_quic_get_connection(c);
> -    cg = &qc->congestion;
>      path = qc->path;
> +    cg = &path->congestion;
>  
>      ctx = ngx_quic_get_send_ctx(qc, ssl_encryption_application);
>  
> # HG changeset patch
> # User Sergey Kandaurov <pluknet at nginx.com>
> # Date 1672317973 -14400
> #      Thu Dec 29 16:46:13 2022 +0400
> # Branch quic
> # Node ID 0c8d81ada23c23f079db5c9c7f60d0cd3768555a
> # Parent  46e738a5c0ab9d98577b21e72447cc9a6e3e9784
> QUIC: path aware in-flight bytes accounting.
> 
> On packet acknowledgement is made path aware, as per RFC 9000, section 9.4:
>     Packets sent on the old path MUST NOT contribute to congestion control
>     or RTT estimation for the new path.
> 
> Previously, the in-flight counter could be decremented for the wrong path.
> If the active path was switched on connection migration with in-flight
> contributing packets, the acknowledgement received after the congestion
> controller is reset resulted in the counter underflow on the new path.
> 
> diff --git a/src/event/quic/ngx_event_quic_ack.c b/src/event/quic/ngx_event_quic_ack.c
> --- a/src/event/quic/ngx_event_quic_ack.c
> +++ b/src/event/quic/ngx_event_quic_ack.c
> @@ -312,6 +312,8 @@ ngx_quic_congestion_ack(ngx_connection_t
>  {
>      ngx_uint_t              blocked;
>      ngx_msec_t              timer;
> +    ngx_queue_t            *q;
> +    ngx_quic_path_t        *path;
>      ngx_quic_congestion_t  *cg;
>      ngx_quic_connection_t  *qc;
>  
> @@ -320,7 +322,27 @@ ngx_quic_congestion_ack(ngx_connection_t
>      }
>  
>      qc = ngx_quic_get_connection(c);
> -    cg = &qc->path->congestion;
> +
> +#if (NGX_SUPPRESS_WARN)
> +    path = NULL;
> +#endif
> +
> +    for (q = ngx_queue_head(&qc->paths);
> +         q != ngx_queue_sentinel(&qc->paths);
> +         q = ngx_queue_next(q))
> +    {
> +        path = ngx_queue_data(q, ngx_quic_path_t, queue);
> +
> +        if (path == f->path) {
> +            break;
> +        }
> +    }
> +
> +    if (q == ngx_queue_sentinel(&qc->paths)) {
> +        return;
> +    }
> +
> +    cg = &path->congestion;
>  
>      blocked = (cg->in_flight >= cg->window) ? 1 : 0;
>  
> diff --git a/src/event/quic/ngx_event_quic_output.c b/src/event/quic/ngx_event_quic_output.c
> --- a/src/event/quic/ngx_event_quic_output.c
> +++ b/src/event/quic/ngx_event_quic_output.c
> @@ -234,6 +234,7 @@ ngx_quic_commit_send(ngx_connection_t *c
>          if (f->pkt_need_ack && !qc->closing) {
>              ngx_queue_insert_tail(&ctx->sent, q);
>  
> +            f->path = qc->path;
>              cg->in_flight += f->plen;
>  
>          } else {
> diff --git a/src/event/quic/ngx_event_quic_transport.h b/src/event/quic/ngx_event_quic_transport.h
> --- a/src/event/quic/ngx_event_quic_transport.h
> +++ b/src/event/quic/ngx_event_quic_transport.h
> @@ -265,6 +265,7 @@ struct ngx_quic_frame_s {
>      ngx_uint_t                                  type;
>      enum ssl_encryption_level_t                 level;
>      ngx_queue_t                                 queue;
> +    ngx_quic_path_t                            *path;

This does not look reliable to me since paths may be deleted and then reused.
I suggest that we use path->seqnum for this purpose.  Currently it's only used
for debugging.  A similar change is needed in ngx_quic_congestion_lost().

>      uint64_t                                    pnum;
>      size_t                                      plen;
>      ngx_msec_t                                  first;
> # HG changeset patch
> # User Sergey Kandaurov <pluknet at nginx.com>
> # Date 1672317976 -14400
> #      Thu Dec 29 16:46:16 2022 +0400
> # Branch quic
> # Node ID c450d58ba0108311c6e8dfa3e1ef15b267e12c9a
> # Parent  0c8d81ada23c23f079db5c9c7f60d0cd3768555a
> QUIC: avoid sending in-flight packets on a not yet validated path.
> 
> Sending in-flight packets on a not yet validated path followed by confirming
> a peer's ownership of its new address and the congestion controller reset, as
> per RFC 9000, section 9.4, resulted in the lost accouting of in-flight bytes
> and the bytes counter underflow on subsequent acknowledgement.
> In practice, this occurred with NEW_CONNECTION_ID sent in response to peer's
> RETIRE_CONNECTION_ID, which is acknowledged after the new path is validated.
> 
> Since we change the address to send to in response to highest-numbered packets,
> this measure should be sufficiently safe as an interim solution.

IMHO it's better to track which packets contributed to in_flight to make sure
we only subtract them.  When confirming the ownership and resetting the
congestion, we can increment path seqnum to make sure the old packets will not
be subtracted later.

> diff --git a/src/event/quic/ngx_event_quic_output.c b/src/event/quic/ngx_event_quic_output.c
> --- a/src/event/quic/ngx_event_quic_output.c
> +++ b/src/event/quic/ngx_event_quic_output.c
> @@ -153,6 +153,10 @@ ngx_quic_create_datagrams(ngx_connection
>  
>              ctx = &qc->send_ctx[i];
>  
> +            if (ctx->level == ssl_encryption_application && !path->validated) {
> +                break;
> +            }
> +
>              preserved_pnum[i] = ctx->pnum;
>  
>              if (ngx_quic_generate_ack(c, ctx) != NGX_OK) {
> # HG changeset patch
> # User Sergey Kandaurov <pluknet at nginx.com>
> # Date 1672317981 -14400
> #      Thu Dec 29 16:46:21 2022 +0400
> # Branch quic
> # Node ID 1afa9e4c5dd7d2ff1b5f9b4fe521b429fc44e78c
> # Parent  c450d58ba0108311c6e8dfa3e1ef15b267e12c9a
> QUIC: updating the local sockaddr when receiving a QUIC packet.
> 
> In addition to saving the local sockaddr when establishing a new connection,
> this change updates the connection local sockaddr on the next received packet.
> The cached value will be used to set the property of a new path, which aims
> to be different when using a preferred address.

This and the following patches implement preferred address which I suggest we
put on hold for now.  First, my patch that enabled preferred address feature
within a single listener is no longer considered for commit.  Second, we should
make sure the demand is there for a limited feature like this.  Alternatively,
we can implement full server context migration from a handshake QUIC server to
a worker QUIC server, subject to demand.

[..]

--
Roman Arutyunyan


More information about the nginx-devel mailing list