Optimizing NGINX TLS Time To First Byte (TTTFB)

Thu Dec 19 00:50:00 UTC 2013

On 2013-12-19 01:04, Ilya Grigorik wrote:

> ...and we're looking at ~1360 bytes.. Which is close to what you're seeing in your testing. 

Yes, and since I haven't deployed IPv6 yet, I could save another 20
bytes (the IPv4 header is 20 bytes, versus 40 for IPv6).

> and minimizes impact of packet reordering and packet loss.

I remember reading (I believe it was in your excellent book! ;)) that
upon packet loss the whole TLS record is held up: it cannot be decrypted
and verified until every TCP segment carrying it has arrived. Not cool
if the TLS record is large and spans many segments. So that's indeed a
good reason to keep TLS records small and preferably within the size of
a single TCP segment.
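
As a rough sketch of that per-segment budget (not taken from the
thread): the helper below subtracts IP, TCP and TLS framing from the
MTU. The function name, the 12 bytes of TCP options (timestamps) and
the 40 bytes of per-record MAC/padding overhead are assumptions chosen
for illustration; the real overhead depends on the negotiated cipher
suite.

```python
# Header sizes without IPv4 options or IPv6 extension headers.
IP_HEADER = {4: 20, 6: 40}
TCP_HEADER = 20            # TCP header without options
TLS_RECORD_HEADER = 5      # content type (1) + version (2) + length (2)


def max_record_payload(mtu=1500, ip_version=4, tcp_options=12, tls_overhead=40):
    """Largest TLS payload whose record still fits in one TCP segment.

    tcp_options and tls_overhead are assumed values, not measurements.
    """
    segment_payload = mtu - IP_HEADER[ip_version] - TCP_HEADER - tcp_options
    return segment_payload - TLS_RECORD_HEADER - tls_overhead


v4 = max_record_payload(ip_version=4)
v6 = max_record_payload(ip_version=6)
print(v4, v6, v4 - v6)  # 1403 1383 20
```

Whatever overhead values are plugged in, the IPv4/IPv6 difference stays
at 20 bytes, which is the saving mentioned above.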

> FWIW, for these exact reasons the Google frontend servers have been using TLS record = TCP segment for a few years now... So there is good precedent to using this as a default. 

Yeah, about that. Google's implementation looks very nice. I keep
looking at it in Wireshark and wonder if there is a way that I could
replicate their implementation with my limited knowledge. It probably
requires tuning of the underlying application as well? Google uses a
1470-byte frame size (14 bytes header plus 1456 bytes payload), with
the TLS record fixed at ~1411 bytes. Not sure whether an MTU of 1470 /
MSS of 1430 is of any benefit for TLS communication.
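
A back-of-the-envelope check of those frame numbers, as a sketch only.
It assumes the 1470 bytes seen in Wireshark include a 14-byte Ethernet
header (no VLAN tag, FCS not counted) and an IPv4 header without
options; the function name is made up for this example.

```python
ETHERNET_HEADER = 14   # dst MAC + src MAC + EtherType
IPV4_HEADER = 20       # no IP options
TCP_HEADER = 20        # no TCP options


def tcp_payload_from_frame(frame_len, tcp_options=0):
    """TCP payload bytes available in one Ethernet frame of frame_len bytes."""
    ip_packet = frame_len - ETHERNET_HEADER
    return ip_packet - IPV4_HEADER - TCP_HEADER - tcp_options


print(tcp_payload_from_frame(1470))      # 1416 (no TCP options)
print(tcp_payload_from_frame(1470, 12))  # 1404 (with TCP timestamps)
```

Under that reading the IP packet is 1456 bytes, leaving 1416 bytes of
TCP payload. A ~1411-byte record plus the 5-byte TLS record header
would add up to exactly 1416, though whether the ~1411 figure already
includes that header is not known here.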

They optimized the stack to almost always _exactly_ fit a TLS record
into the available space of a TCP segment. If I look at one of my sites,
https://www.zeitgeist.se, with standard MTU/MSS, and the TLS record size
fixed to 1370 bytes + overhead, Nginx happily fills the remaining
space in the TCP segment with the start of a second TLS record, the
rest of which then spills over into the next TCP segment. I played
around with TCP_CORK (tcp_nopush), but it didn't seem to make any
difference.
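
The spill-over described above can be sketched with a small
simulation. It assumes records are written back to back into the TCP
byte stream and that every segment is filled completely; the 1400-byte
on-the-wire record size (1370 + an assumed 30 bytes of overhead) and
the 1448-byte segment payload are illustrative values, not
measurements.

```python
def segments_touched(record_wire_len, segment_payload, n_records):
    """Return (first_segment, last_segment) for each back-to-back record."""
    spans = []
    offset = 0  # position of the record's first byte in the TCP stream
    for _ in range(n_records):
        first = offset // segment_payload
        last = (offset + record_wire_len - 1) // segment_payload
        spans.append((first, last))
        offset += record_wire_len
    return spans


# Record slightly smaller than the segment: later records straddle segments.
misaligned = segments_touched(1400, 1448, 4)
print(misaligned)  # [(0, 0), (0, 1), (1, 2), (2, 3)]

# Record exactly as large as the segment payload: one record per segment.
aligned = segments_touched(1448, 1448, 4)
print(aligned)     # [(0, 0), (1, 1), (2, 2), (3, 3)]
```

Only when the on-the-wire record length equals the segment payload does
every record stay inside a single segment, which is what the Google
frontends appear to achieve.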

> That said, small records do incur overhead due to extra framing, plus more CPU cycles (more MACs and framing processing). So, in some instances, if you're delivering large streams (e.g. video), you may want to use larger records... Exposing record size as a configurable option would address this. 

Absolutely. Earlier I said Google uses a 1470-byte frame size, but that
is not true everywhere, for example when streaming from YouTube. There
they use the standard MTU and large TLS records that span several TCP
segments. So, like you said, it's important to look at the application
you're trying to optimize. +1 for the configurable TLS record size
option. To pick up from the code Maxim just posted, perhaps the record
size could even be altered dynamically within location blocks (to
specify different record sizes for large and small streams).
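
To put a rough number on the framing cost of small records for large
streams, here is a minimal sketch. The 45 bytes of per-record overhead
(5-byte header plus an assumed 40 bytes of MAC and padding) and the
function names are assumptions for illustration; 16384 bytes is the
maximum TLS record payload.

```python
import math

PER_RECORD_OVERHEAD = 45  # assumed: 5-byte header + ~40 bytes MAC/padding


def records_needed(stream_bytes, record_payload):
    """Number of TLS records needed to carry stream_bytes of payload."""
    return math.ceil(stream_bytes / record_payload)


def framing_overhead(stream_bytes, record_payload):
    """Total extra bytes added by TLS record framing for the stream."""
    return records_needed(stream_bytes, record_payload) * PER_RECORD_OVERHEAD


stream = 10 * 1024 * 1024  # a 10 MiB response, e.g. a video chunk
for payload in (1370, 16384):
    n = records_needed(stream, payload)
    extra = framing_overhead(stream, payload)
    print(f"{payload:>5}-byte records: {n} records, {extra} bytes overhead")
```

Each record also costs one MAC computation, so for a stream like this
the small-record setting means roughly twelve times as many MACs as
full-size records.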