[PATCH] QUIC: better sockaddr initialization

Alejandro Colomar alx.manpages at gmail.com
Sun May 21 23:06:53 UTC 2023


Hello!

On 5/21/23 23:22, Maxim Dounin wrote:
>> While the data being written was correctly written via memcpy(3),
>> you wouldn't be allowed to access it later as anything that is
>> not 'struct sockaddr'.  For example, the following is a
>> strict-aliasing violation:
>>
>> struct s { int       a;  int       b; };
>> struct t { int       a;               };
>> union u  { struct s  s;  struct t  t; };
>>
>> struct s  x = {42, 42};
>> union u   y;
>> int       z;
>>
>> memcpy(&y.t, &x, sizeof(x));  // This is fine
>>
>> // We created an object of type 'struct t' in the union.
>> // Unions allow aliasing, so we're allowed to reinterpret
>> // that object as a 'struct s' via the other member.
>>
>> z = y.s.a;  // This is fine.
>>
>> // But we're not allowed to reinterpret bytes that are
>> // officially uninitialized (even though we know they are
>> // initialized).
>>
>> z = y.s.b;  // UB here.
>>
>> The reason for the UB is that the compiler is free to assume
>> that since you wrote to the struct t member, the write can't
>> possibly write to the second member of the struct (even if
>> the size passed to memcpy(3) is larger than that).  In other
>> words, the compiler may assume that anything past
>> sizeof(struct t) is uninitialized.
> 
> You haven't written to the struct t member, you wrote to the address
> using memcpy().  There is a difference, see C99 (or C11, whichever 
> you prefer), 6.5 Expressions.

I assume you refer to this:

C11::6.5p6:
< The effective type of an object for an access to its stored
< value is the declared type of the object, if any.87)  If a value
< is stored into an object having no declared type through an
< lvalue having a type that is not a character type, then the type
< of the lvalue becomes the effective type of the object for that
< access and for subsequent accesses that do not modify the stored
< value.  If a value is copied into an object having no declared
< type using memcpy or memmove, or is copied as an array of
< character type, then the effective type of the modified object
< for that access and for subsequent accesses that do not modify
< the value is the effective type of the object from which the
< value is copied, if it has one.  For all other accesses to an
< object having no declared type, the effective type of the object
< is simply the type of the lvalue used for the access.
<
[...]
<
< 87) Allocated objects have no declared type. 


Let's break it into sentences:

< 87) Allocated objects have no declared type. 

malloc(3)'d memory has no declared type; all other memory has a
declared type.

< The effective type of an object for an access to its stored
< value is the declared type of the object, if any.87)

'y.t' has a declared type of 'struct t', so its effective type is
also 'struct t'.

< If a value is copied into an object having no declared
< type using memcpy or memmove, [...]

It doesn't apply, since 'y.t' has a declared type.  memcpy(3)
can't change the effective type in this case.
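
For contrast, here is a minimal sketch of the case where that
sentence does apply: the destination is allocated storage, which
has no declared type, so memcpy(3) gives it the effective type of
the source object.  The function name read_b_from_allocated() is
made up for this illustration; it is not taken from nginx.

```c
#include <stdlib.h>
#include <string.h>

struct s { int a; int b; };

/*
 * Hypothetical helper: copy a 'struct s' into malloc(3)'d storage
 * and read the second member back.  Returns -1 if allocation fails.
 */
int
read_b_from_allocated(void)
{
    struct s   x = {42, 7};
    struct s  *p;
    int        b;

    p = malloc(sizeof(*p));
    if (p == NULL) {
        return -1;
    }

    /* Allocated storage has no declared type, so after this copy
     * its effective type is that of the source, 'struct s'. */
    memcpy(p, &x, sizeof(x));

    /* The lvalue type matches the effective type. */
    b = p->b;

    free(p);

    return b;
}
```

Here the read of 'p->b' goes through an lvalue whose type matches
the effective type, which is what C11::6.5p6 asks for.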


I think my example has UB (not 100% sure, though).  However, I
notice now that it's slightly different from the code in nginx:
nginx wasn't using a named variable (or at least that's not
obvious in ngx_quic_recvmsg(), since it just receives a
pointer), and it's probably using malloc(3)'d memory instead, so
the memory didn't have a declared type.  There could be UB only
if a local temporary variable had been used at some point higher
up in the call chain.  That is something a static analyzer (and
a human reviewer too) will have a hard time checking, though.

<https://software.codidact.com/posts/288138>


> 
>> Also, writing past an
>> object is very dubious, even via memcpy(3), even if you know
>> that the storage is there (thanks to the union).  It's just
>> safer writing to the union itself, or to the field that has
>> the correct object type.
> 
> And that's why the patch.  While it is correct to write to the 
> memory with any pointer,

Only if the memory is malloc(3)'d memory.  But yes, that's
probably the case in the patch.  If local or global variables
are involved, that's not allowed.

> using the union itself is more obvious 
> and causes less confusion.

Yup; thanks for improving that.
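
To illustrate what "using the union itself" can look like, here
is a rough sketch.  The names addr_u and addr_store() are
hypothetical and only meant to resemble a socket address union;
this is not the code from the patch.

```c
#include <string.h>
#include <sys/socket.h>
#include <netinet/in.h>

/* Hypothetical union of socket address types (an assumption made
 * for this example, not nginx's actual definition). */
typedef union {
    struct sockaddr      sockaddr;
    struct sockaddr_in   sockaddr_in;
    struct sockaddr_in6  sockaddr_in6;
} addr_u;

/*
 * Copy 'len' bytes of a socket address into the union object as a
 * whole, rather than through its smallest member.  The length is
 * clamped to the size of the union; the number of bytes copied is
 * returned.
 */
size_t
addr_store(addr_u *dst, const void *src, size_t len)
{
    if (len > sizeof(*dst)) {
        len = sizeof(*dst);
    }

    /* Zero the whole union first so no stale bytes remain. */
    memset(dst, 0, sizeof(*dst));

    /* The destination is the union itself, so the write never
     * extends past the object named in the call. */
    memcpy(dst, src, len);

    return len;
}
```

The copy is bounded by sizeof(addr_u), so the reader doesn't need
to reason about storage that lies beyond a smaller member.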

Thanks,
Alex

> 
> [...]
> 

-- 
<http://www.alejandro-colomar.es/>
GPG key fingerprint: A9348594CE31283A826FBDD8D57633D441E25BB5