[PATCH] QUIC: better sockaddr initialization
Alejandro Colomar
alx.manpages at gmail.com
Sun May 21 23:06:53 UTC 2023
Hello!
On 5/21/23 23:22, Maxim Dounin wrote:
>> While the data being written was correctly written via memcpy(3),
>> you wouldn't be allowed to access it later as anything that is
>> not 'struct sockaddr'. For example, the following is a
>> strict-aliasing violation:
>>
>> struct s { int a; int b; };
>> struct t { int a; };
>> union u { struct s s; struct t t; };
>>
>> struct s x = {42, 42};
>> union u y;
>> int z;
>>
>> memcpy(&y.t, &x, sizeof(x)); // This is fine
>>
>> // We created an object of type 'struct t' in the union.
>> // Unions allow aliasing, so we're allowed to reinterpret
>> // that object as a 'struct s' via the other member.
>>
>> z = y.s.a; // This is fine.
>>
>> // But we're not allowed to reinterpret bytes that are
>> // officially uninitialized (even though we know they are
>> // initialized).
>>
>> z = y.s.b; // UB here.
>>
>> The reason for the UB is that the compiler is free to assume
>> that since you wrote to the struct t member, the write can't
>> possibly write to the second member of the struct (even if
>> the size passed to memcpy(3) is larger than that). In other
>> words, the compiler may assume that anything past
>> sizeof(struct t) is uninitialized.
>
> You haven't written to the struct t member, you wrote to the address
> using memcpy(). There is a difference, see C99 (or C11, whichever
> you prefer), 6.5 Expressions.
I assume you refer to this:
C11::6.5p6:
< The effective type of an object for an access to its stored
< value is the declared type of the object, if any.87) If a value
< is stored into an object having no declared type through an
< lvalue having a type that is not a character type, then the type
< of the lvalue becomes the effective type of the object for that
< access and for subsequent accesses that do not modify the stored
< value. If a value is copied into an object having no declared
< type using memcpy or memmove, or is copied as an array of
< character type, then the effective type of the modified object
< for that access and for subsequent accesses that do not modify
< the value is the effective type of the object from which the
< value is copied, if it has one. For all other accesses to an
< object having no declared type, the effective type of the object
< is simply the type of the lvalue used for the access.
<
[...]
<
< 87) Allocated objects have no declared type.
Let's break it into sentences:
< 87) Allocated objects have no declared type.
malloc(3)'d memory has no declared type; objects defined by a
declaration (local, global, or static variables) do have one.
< The effective type of an object for an access to its stored
< value is the declared type of the object, if any.87)
'y.t' has a declared type of 'struct t', so its effective type is
also 'struct t'.
< If a value is copied into an object having no declared
< type using memcpy or memmove, [...]
This doesn't apply, since 'y.t' has a declared type. memcpy(3)
can't change the effective type in this case.
I think my example has UB (not 100% sure, though). However, I
notice now that it's slightly different from the case in nginx:
nginx wasn't using a named variable (or at least it's not
obvious in ngx_quic_recvmsg(), since it just receives a
pointer), and it's probably using malloc(3)'d memory instead, so
the memory didn't have a declared type. There could be UB only
if a local temporary variable had been used somewhere higher up
the call chain. That's something that a static analyzer (and a
human reviewer too) will have a hard time checking, though.
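To make the allocated-storage case concrete, here is a minimal
sketch reusing the types from my example. The names fill() and
read_b_from_allocated() are made up for illustration and are not
nginx code; fill() only mimics a callee that receives a pointer to
the smaller type. As far as I can tell, this is well defined under
C11 6.5p6, because the destination has no declared type.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

struct s { int a; int b; };
struct t { int a; };
union u { struct s s; struct t t; };

static_assert(sizeof(struct s) <= sizeof(union u),
              "the allocation must be large enough for 'struct s'");

/* Hypothetical callee: it only sees a 'struct t *', much like code
 * that is handed a 'struct sockaddr *'. */
static void
fill(struct t *dst, const struct s *src)
{
    /* Copies sizeof(struct s) bytes; dst is never dereferenced
     * as a 'struct t'. */
    memcpy(dst, src, sizeof(*src));
}

/* Hypothetical caller: the storage comes from malloc(3). */
int
read_b_from_allocated(int a, int b)
{
    struct s   x = { a, b };
    void      *p;
    int        z;

    p = malloc(sizeof(union u));    /* allocated: no declared type */
    if (p == NULL) {
        abort();
    }

    fill(p, &x);    /* effective type becomes that of x: 'struct s' */

    z = ((struct s *) p)->b;        /* accessed as 'struct s' */

    free(p);
    return z;
}
```

If 'p' pointed to a declared 'struct t' object instead, the same
memcpy(3) would write past that object, which is the dubious case
discussed above.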
<https://software.codidact.com/posts/288138>
>
>> Also, writing past an
>> object is very dubious, even via memcpy(3), even if you know
>> that the storage is there (thanks to the union). It's just
>> safer writing to the union itself, or to the field that has
>> the correct object type.
>
> And that's why the patch. While it is correct to write to the
> memory with any pointer,
Only if the memory is malloc(3)'d memory. But yes, that's
probably the case in the patch. If local/global variables are
involved, that's not allowed.
> using the union itself is more obvious
> and causes less confusion.
Yup; thanks for improving that.
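For a declared union object, a minimal sketch of the two safer
forms, again with the types from my example (the function names
are hypothetical, chosen only for illustration): copy into the
union itself, or store through the member that has the correct
type. I believe both are fine, since no write goes past the
object named as the destination.

```c
#include <assert.h>
#include <string.h>

struct s { int a; int b; };
struct t { int a; };
union u { struct s s; struct t t; };

/* Copy into the union itself, then read the larger member. */
int
read_b_via_union(int a, int b)
{
    struct s  x = { a, b };
    union u   y;

    static_assert(sizeof(x) <= sizeof(y),
                  "the union must be able to hold 'struct s'");

    /* The destination is the whole union, so the copy stays
     * within the destination object. */
    memcpy(&y, &x, sizeof(x));

    return y.s.b;
}

/* Store through the member of the correct type, then reinterpret
 * through the other member, which unions allow. */
int
read_a_via_other_member(int a, int b)
{
    union u  y;

    y.s = (struct s) { a, b };

    return y.t.a;
}
```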
Thanks,
Alex
>
> [...]
>
--
<http://www.alejandro-colomar.es/>
GPG key fingerprint: A9348594CE31283A826FBDD8D57633D441E25BB5