[PATCH 11 of 11] SSL: automatic rotation of session ticket keys
Maxim Dounin
mdounin at mdounin.ru
Wed Sep 28 18:37:37 UTC 2022
Hello!
On Mon, Sep 26, 2022 at 02:17:18PM +0400, Sergey Kandaurov wrote:
> > On 17 Sep 2022, at 01:08, Maxim Dounin <mdounin at mdounin.ru> wrote:
> >
> > On Thu, Sep 15, 2022 at 09:50:24AM +0400, Sergey Kandaurov wrote:
> >
> >>> On 26 Aug 2022, at 07:01, Maxim Dounin <mdounin at mdounin.ru> wrote:
> >>>
> >>> # HG changeset patch
> >>> # User Maxim Dounin <mdounin at mdounin.ru>
> >>> # Date 1661481958 -10800
> >>> # Fri Aug 26 05:45:58 2022 +0300
> >>> # Node ID 5c26fe5f6ab0bf4c0d18cae8f6f6483348243d4b
> >>> # Parent 2487bf5766f79c813b3397b3bb897424c3590445
> >>> SSL: automatic rotation of session ticket keys.
> >>>
> >>> As long as ssl_session_cache in shared memory is configured, session ticket
> >>> keys are now automatically generated in shared memory, and rotated
> >>> periodically.
> >>
> >> It looks odd how session cache is (ab)used to store ticket keys.
> >> I understand that it is comfortable to reuse existing shared zone
> >> (and not to touch configuration) but probably a more correct way
> >> is to create a separate zone to store keys?
> >> Something pretty much the same as ssl_session_cache syntax:
> >> ssl_session_ticket_key file | shared:name:size
> >>
> >> What do you think?
> >
> > I don't think it is a good idea to introduce additional shared
> > zones here, especially given that it only needs about 160 bytes of
> > shared memory. It really complicates things for users for no real
> > reason.
> >
> > Rather, I was thinking about using nginx_shared_zone, which is
> > always available, but it is rather special and will require
> > additional integration code. Further, as a single contention
> > point it might be a problem, at least unless lock avoidance is
> > also implemented (see below). And a single key for all servers
> > might not be the best solution either. On the other hand,
> > ssl_session_cache in shared memory is readily available and
> > logically related, and I see no reasons not to use it to rotate
> > ticket keys if it's configured.
>
> I've been pondering for a while and tend to agree to go with
> the ssl_session_cache route for its simplicity.
> I share the dislike to introduce new syntax, while
> nginx_shared_zone looks unrelated enough to put SSL bits in.
I don't think that nginx_shared_zone is unrelated, it's a basic
shared zone used for multiple nginx-wide tasks, including mostly
module-specific ones, such as stub_status. I don't see why it
shouldn't contain some keying material used for SSL ticket keys.
But using it certainly would require some additional work, notably
a) mutex initialization for proper locking (since there is no
shared pool in the zone) and b) some key derivation logic, as we
probably want to avoid using identical keys for all server blocks.
> > If there will be understanding that we need to rotate ticket keys
> > in all cases, even without ssl_session_cache in shared memory
> > being configured, the idea of using nginx_shared_zone might be
> > revisited.
>
> Well, an obvious downside is the need to enable a session cache
> which might be undesirable in certain memory constrained systems
> if you intend to resume only with session tickets. Even though
> the impact is reduced by keeping shared memory zone to the minimum,
> (re)using session cache to store only ticket keys nonetheless
> requires to allocate memory shared zone of at least 8 page sizes.
> OTOH, it quite uncommon to have clients that use tickets only.
I rather see ticket key rotation as a free feature you'll now get
if you have ssl_session_cache enabled. If there will be enough
demand for ticket key rotation without ssl_session_cache enabled,
using nginx_shared_zone might be revisited, see above.
[...]
> >>> +
> >>> + keys = SSL_CTX_get_ex_data(ssl_ctx, ngx_ssl_ticket_keys_index);
> >>> + if (keys == NULL) {
> >>> + return NGX_OK;
> >>> + }
> >>> +
> >>> + key = keys->elts;
> >>> +
> >>> + if (!key[0].shared) {
> >>> + return NGX_OK;
> >>> + }
> >>> +
> >>> + now = ngx_time();
> >>> + expire = now + SSL_CTX_get_timeout(ssl_ctx);
> >>> +
> >>> + shm_zone = SSL_CTX_get_ex_data(ssl_ctx, ngx_ssl_session_cache_index);
> >>> +
> >>> + cache = shm_zone->data;
> >>> + shpool = (ngx_slab_pool_t *) shm_zone->shm.addr;
> >>> +
> >>> + ngx_shmtx_lock(&shpool->mutex);
> >>> +
> >>> + key = cache->ticket_keys;
> >>> +
> >>> + if (key[0].expire == 0) {
> >>> +
> >>> + /* initialize the current key */
> >>> +
> >>> + if (RAND_bytes(buf, 80) != 1) {
> >>> + ngx_ssl_error(NGX_LOG_ALERT, log, 0, "RAND_bytes() failed");
> >>> + ngx_shmtx_unlock(&shpool->mutex);
> >>> + return NGX_ERROR;
> >>> + }
> >>> +
> >>> + key->shared = 1;
> >>> + key->expire = expire;
> >>> + key->size = 80;
> >>> + ngx_memcpy(key->name, buf, 16);
> >>> + ngx_memcpy(key->hmac_key, buf + 16, 32);
> >>> + ngx_memcpy(key->aes_key, buf + 48, 32);
> >>> +
> >>> + ngx_explicit_memzero(&buf, 80);
> >>> +
> >>> + ngx_log_debug2(NGX_LOG_DEBUG_EVENT, log, 0,
> >>> + "ssl ticket key: \"%*xs\"",
> >>> + (size_t) 16, key->name);
> >>> + }
> >>> +
> >>> + if (key[1].expire < now) {
> >>> +
> >>> + /*
> >>> + * if the previous key is no longer needed (or not initialized),
> >>> + * replace it with the current key and generate new current key
> >>> + */
> >>> +
> >>> + key[1] = key[0];
> >>> +
> >>> + if (RAND_bytes(buf, 80) != 1) {
> >>> + ngx_ssl_error(NGX_LOG_ALERT, log, 0, "RAND_bytes() failed");
> >>> + ngx_shmtx_unlock(&shpool->mutex);
> >>> + return NGX_ERROR;
> >>> + }
> >>> +
> >>> + key->shared = 1;
> >>> + key->expire = expire;
> >>> + key->size = 80;
> >>> + ngx_memcpy(key->name, buf, 16);
> >>> + ngx_memcpy(key->hmac_key, buf + 16, 32);
> >>> + ngx_memcpy(key->aes_key, buf + 48, 32);
> >>> +
> >>> + ngx_explicit_memzero(&buf, 80);
> >>> +
> >>> + ngx_log_debug2(NGX_LOG_DEBUG_EVENT, log, 0,
> >>> + "ssl ticket key: \"%*xs\"",
> >>> + (size_t) 16, key->name);
> >>> + }
> >>
> >> I wonder if cpu intensive ops can be moved from under the lock.
> >> At least, buf[] can be removed if write random data directly
> >> to ngx_ssl_ticket_key_t. This eliminates 3 memory copies
> >> (likely optimized by compiler to a single) and explicit write.
> >
> > Micro-optimizations are not something important here: new keys are
> > generated once per SSL session timeout, that is, once per 300
> > seconds by default. As such, the code is kept as close as
> > possible to the ticket keys reading in
> > ngx_ssl_session_ticket_keys(), and not dependant on the
> > ngx_ssl_ticket_key_t structure layout.
>
> Ok, that makes sense.
>
> >
> > What can be interesting here is to avoid locking at all unless we
> > are going to update keys (change expiration of the current key, or
> > switch to the next key).
> >
> > To do so without races, the code has to maintains 3 keys: current,
> > previous, and next. If a worker will switch to the next key
> > earlier, other workers will still be able to decrypt new tickets,
> > since they will be encrypted with the next key.
> >
> > But with the SSL session cache in shared memory we have to lock
> > shared zone anyway to create sessions, and this was never seen to
> > be a performance bottleneck.
>
> Good to know this is not a hot path.
> Nonetheless, it is good to preserve a status-quo regarding
> the locking and/or keep to the minimum its coverage/impact.
>
> For the record, with this patch and ticket keys rotation enabled,
> each ticket operation resembles existing locking in session cache:
> - 1 lock per new ticket encryption
> - 1 lock per each ticket decryption + 1 lock for possible renew
> In TLSv1.3 this is a bit worse:
> - 2 locks, each for issuing 2 new tickets
> - 2 locks, each for ticket decryption + obligatory ticket renew
Note that previously (before the 1st patch from this series) with
ssl_session_cache enabled and TLSv1.3 this used to result in 2
sessions being created and both saved to the session cache, with
corresponding 2 locks.
> With the locking patch, locking acquirement reduces at least to:
> - 1 lock total for decryption and renew, as that happens in 1 sec.
> - 1 lock for issuing 2 new tickets in TLSv1.3, the same reasons
>
> In loaded scenarios, working with tickets requires obtaining
> at most 1 lock per 1 sec. per each worker process.
Correct.
> And by the way, while reviewing this patch, I noticed that
> OpenSSL doesn't allow a client to gracefully renew TLSv1.2 session
> when the client receives a new session ticket in resumed sessions.
> In practice, it is visible when client resumes a not yet expired
> session encrypted with not a fresh ticket key (after rotation),
> which results in sending a new session ticket.
> See ssl_update_cache() for the !s->hit condition.
> In the opposite, BoringSSL always allows to renew TLSv1.2 sessions.
You mean on the client side? Yes, it looks like
ngx_ssl_new_client_session() won't be called for such a new
session ticket, and updated ticket will be never saved. This
might need to be worked around.
This should be safe with the key rotation logic introduced in this
patch though, given that the previous key is preserved till the
last ticket encrypted with it is expected to expire.
One of the possible solutions might be to avoid re-encryption of
tickets with the new key, as the old key is anyway expected to be
available till the session expires.
> Additionally, BoringSSL view a bit differs on a session freshness.
> Unlike in OpenSSL, BoringSSL treats sessions as fresh only if
> the session's time of issuance + timeout points to the future.
> In the opposite, OpenSSL resumes sessions expiring in the current second.
> See math in ssl_get_prev_session() vs ssl_session_is_time_valid().
> This is an insignificant detail, though.
Current code preserves the previous key till key expiration is in
the past, and this should be long enough for all cases.
> > Given that, and the added code
> > complexity, I've dropped the patch which avoids locking, at least
> > for now.
> >
> > This might worth revisiting though. Patch provided below for
> > reference.
> >
>
> I see no reasons not to commit it.
> Though, it might make sense to let the dust settle down a bit
> and commit the locking patch later.
Well, if it's already reviewed I see no reasons to delay the
commit.
[...]
--
Maxim Dounin
http://mdounin.ru/
More information about the nginx-devel
mailing list