[patch] quic PTO counter fixes

Vladimir Homutov vl at inspert.ru
Thu Nov 9 15:14:08 UTC 2023


> On Thu, Oct 26, 2023 at 03:08:55AM +0400, Sergey Kandaurov wrote:
> > # HG changeset patch
> > # User Vladimir Khomutov <vl at wbsrv.ru>
> > # Date 1697031803 -10800
> > #      Wed Oct 11 16:43:23 2023 +0300
> > # Node ID 9ba2840e88f62343b3bd794e43900781dab43686
> > # Parent  1f188102fbd944df797e8710f70cccee76164add
> > QUIC: fixed handling of PTO counter.
> >
> > The RFC 9002 clearly says in "6.2. Probe Timeout":
> >     ...
> >     As with loss detection, the PTO is per packet number space.
> >     That is, a PTO value is computed per packet number space.
> >
> > Despite that, current code is using per-connection PTO counter.
> > For example, this may lead to situation when packet loss at handshake
> > level will affect PTO calculation for initial packets, preventing
> > send of new probes.
>
> Although PTO value is per packet number space, PTO backoff is not,
> see "6.2.1 Computing PTO":
>
> : When ack-eliciting packets in multiple packet number spaces are in flight, the
> : timer MUST be set to the earlier value of the Initial and Handshake packet
> : number spaces.

And I read this fragment as:
    - there are multiple timer values (i.e. per packet number space)
      (and value is pto * backoff)
    - we have to choose the earliest value

The ngx_quic_pto() has nothing that depends on packet number space
(with the minor exception that we add max_ack_delay at application level
 after the handshake)
So pto_count is the only thing that can make timer values to be
different in different packet number spaces.

>
> But:
>
> : When a PTO timer expires, the PTO backoff MUST be increased <..>
>
> : This exponential reduction in the sender's rate is important because consecutive
> : PTOs might be caused by loss of packets or acknowledgments due to severe
> : congestion.  Even when there are ack-eliciting packets in flight in multiple
> : packet number spaces, the exponential increase in PTO occurs across all spaces
> : to prevent excess load on the network.  For example, a timeout in the Initial
> : packet number space doubles the length of the timeout in the Handshake packet
> : number space.

yes, this really looks like contradiction.
At least I don't understand how it is possible to have PTO value
different by packet number space given the way we calculate it.

> Even if that would be proven otherwise, I don't think the description
> provides detailed explanation.  It describes a pretty specific use case,
> when both Initial and Handshake packet number spaces have in-flight packets
> with different PTO timeout (i.e. different f->last).  Typically they are
> sent coalesced (e.g. CRYPTO frames for ServerHello and (at least)
> EncryptedExtensions TLS messages).
> In interop tests, though, it might be different: such packets may be
> sent separately, with Handshake packet thus having a later PTO timeout.
> If such, PTO timer will first fire for the Initial packet, then for Handshake,
> which will result in PTO backoff accumulated for each packet:
>
>  t1: <- Initial (lost)
>  t2: <- Handshake (lost)
> t1': pto(t1) timeout
>      <- Initial (pto_count=1)
> t2': pto(t2) timeout
>      <- Handshake (pto_count=2)
> t1'': pto(t1') timeout
>      <- Initial (pto_count=3)
>
> So, I would supplement the description with the phrase that that's
> fair typically with uncoalesced packets seen in interop tests, and
> that the same is true vice verse with packet loss at initial packet
> number space affecting PTO backoff in handshake packet number space.
>
> But see above about PTO backoff increase across all spaces.

I tend to think that it is better to leave things as is.
maybe RFC needs some better wording in this case.

I've checked ngtcp2 and msquic and it it looks like both
handle pto counter per-connection too;
(see pto_count in ngtcp2 and QUIC_LOSS_DETECTION.ProbeCount in msquic)


> > Additionally, one case of successful ACK receiving was missing:
> > PING frames are not stored in the ctx->sent queue, thus PTO was not
> > reset when corresponding packets were acknowledged.
>
> See below.
>
> >
> > diff --git a/src/event/quic/ngx_event_quic.c b/src/event/quic/ngx_event_quic.c
> > --- a/src/event/quic/ngx_event_quic.c
> > +++ b/src/event/quic/ngx_event_quic.c
> > @@ -1088,8 +1088,6 @@ ngx_quic_discard_ctx(ngx_connection_t *c
> >
> >      ngx_quic_keys_discard(qc->keys, level);
> >
> > -    qc->pto_count = 0;
> > -
> >      ctx = ngx_quic_get_send_ctx(qc, level);
> >
> >      ngx_quic_free_buffer(c, &ctx->crypto);
> > @@ -1120,6 +1118,7 @@ ngx_quic_discard_ctx(ngx_connection_t *c
> >      }
> >
> >      ctx->send_ack = 0;
> > +    ctx->pto_count = 0;
> >
> >      ngx_quic_set_lost_timer(c);
> >  }
> > diff --git a/src/event/quic/ngx_event_quic_ack.c b/src/event/quic/ngx_event_quic_ack.c
> > --- a/src/event/quic/ngx_event_quic_ack.c
> > +++ b/src/event/quic/ngx_event_quic_ack.c
> > @@ -286,8 +286,12 @@ ngx_quic_handle_ack_frame_range(ngx_conn
> >      if (!found) {
> >
> >          if (max < ctx->pnum) {
> > -            /* duplicate ACK or ACK for non-ack-eliciting frame */
> > -            return NGX_OK;
> > +            /*
> > +             * - ACK for frames not in sent queue (i.e. PING)
> > +             * - duplicate ACK
> > +             * - ACK for non-ack-eliciting frame
> > +             */
> > +            goto done;
> >          }
> >
> >          ngx_log_error(NGX_LOG_INFO, c->log, 0,
> > @@ -300,11 +304,13 @@ ngx_quic_handle_ack_frame_range(ngx_conn
> >          return NGX_ERROR;
> >      }
> >
> > +done:
> > +
> >      if (!qc->push.timer_set) {
> >          ngx_post_event(&qc->push, &ngx_posted_events);
> >      }
> >
> > -    qc->pto_count = 0;
> > +    ctx->pto_count = 0;
>
> This part of the change to reset pto_count
> for duplicate ACK or ACK for non-ack-eliciting frame
> contradicts the OnAckReceived example in RFC 9002,
> although I didn't found a format text in the RFC itself:
>
> OnAckReceived(ack, pn_space):
>   ...
>   // DetectAndRemoveAckedPackets finds packets that are newly
>   // acknowledged and removes them from sent_packets.
>   newly_acked_packets =
>       DetectAndRemoveAckedPackets(ack, pn_space)
>   // Nothing to do if there are no newly acked packets.
>   if (newly_acked_packets.empty()):
>     return
>
>   // Update the RTT if the largest acknowledged is newly acked
>   // and at least one ack-eliciting was newly acked.
>   ...
>
>   // Reset pto_count ...
>
> From which it follows that pto_count is reset
> (and RTT updated) for newly ack'ed packets only.

yes, agreed.
the main issue is tracking of in-flight PING frames.
>
> I think the better fix would be to properly track in-flight PING frames.
> Moreover, the current behaviour of not tracking PING frames in ctx->sent
> prevents from a properly calculated PTO timeout: each time it is calculated
> against the original packet (with increasingly receding time to the past)
> that triggered the first PTO timeout, which doesn't result in exponentially
> increased PTO period as expected, but rather some bogus value.

Agree. I've created a patch that fixes this.

-------------- next part --------------
A non-text attachment was scrubbed...
Name: ping_frames_inflight.diff
Type: text/x-diff
Size: 3496 bytes
Desc: not available
URL: <http://mailman.nginx.org/pipermail/nginx-devel/attachments/20231109/af549575/attachment.bin>


More information about the nginx-devel mailing list