[PATCH] QUIC: better sockaddr initialization

Maxim Dounin mdounin at mdounin.ru
Mon May 22 02:35:53 UTC 2023


Hello!

On Mon, May 22, 2023 at 01:06:53AM +0200, Alejandro Colomar wrote:

> Hello!
> 
> On 5/21/23 23:22, Maxim Dounin wrote:
> >> While the data being written was correctly written via memcpy(3),
> >> you wouldn't be allowed to access it later as anything that is
> >> not 'struct sockaddr'.  For example, the following is a
> >> strict-aliasing violation:
> >>
> >> struct s { int       a;  int       b; };
> >> struct t { int       a;               };
> >> union u  { struct s  s;  struct t  t; };
> >>
> >> struct s  x = {42, 42};
> >> union u   y;
> >> int       z;
> >>
> >> memcpy(&y.t, &x, sizeof(x));  // This is fine
> >>
> >> // We created an object of type 'struct t' in the union.
> >> // Unions allow aliasing, so we're allowed to reinterpret
> >> // that object as a 'struct s' via the other member.
> >>
> >> z = y.s.a;  // This is fine.
> >>
> >> // But we're not allowed to reinterpret bytes that are
> >> // officially uninitialized (even though we know they are
> >> // initialized).
> >>
> >> z = y.s.b;  // UB here.
> >>
> >> The reason for the UB is that the compiler is free to assume
> >> that since you wrote to the struct t member, the write can't
> >> possibly write to the second member of the struct (even if
> >> the size passed to memcpy(3) is larger than that).  In other
> >> words, the compiler may assume that anything past
> >> sizeof(struct t) is uninitialized.
> > 
> > You haven't written to the struct t member, you wrote to the address 
> > using memcpy().  There is a difference, see C99 (or C11, whichever 
> > you prefer), 6.5 Expressions.
> 
> I assume you refer to this:
> 
> C11::6.5p6:
> < The effective type of an object for an access to its stored
> < value is the declared type of the object, if any.87)  If a value
> < is stored into an object having no declared type through an
> < lvalue having a type that is not a character type, then the type
> < of the lvalue becomes the effective type of the object for that
> < access and for subsequent accesses that do not modify the stored
> < value.  If a value is copied into an object having no declared
> < type using memcpy or memmove, or is copied as an array of
> < character type, then the effective type of the modified object
> < for that access and for subsequent accesses that do not modify
> < the value is the effective type of the object from which the
> < value is copied, if it has one.  For all other accesses to an
> < object having no declared type, the effective type of the object
> < is simply the type of the lvalue used for the access.
> <
> [...]
> <
> < 87) Allocated objects have no declared type. 
> 
> 
> Let's break it into sentences:
> 
> < 87) Allocated objects have no declared type. 
> 
> malloc(3)d memory has no declared type; all other memory has a
> declared type.
> 
> < The effective type of an object for an access to its stored
> < value is the declared type of the object, if any.87)
> 
> 'y.t' has a declared type of 'struct t', so its effective type is
> also 'struct t'.
> 
> < If a value is copied into an object having no declared
> < type using memcpy or memmove, [...]
> 
> It doesn't apply, since 'y.t' has a declared type.  memcpy(3)
> can't change the effective type in this case.
> 
> 
> I think my example has UB (not 100% sure, though).  However, I
> notice now that it's slightly different from the one in nginx,
> since nginx wasn't using a named variable (or at least it's not
> obvious in ngx_quic_recvmsg(), since it's just receiving a
> pointer), and it's instead probably using malloc(3)'d memory, so
> the memory didn't have a declared type.  There could be UB only
> if a local temporary variable had been used at some point higher
> up in the call chain.  This is something that a static analyzer
> will have a hard time checking (and a human reviewer too),
> though.
> 
> <https://software.codidact.com/posts/288138>

As you've correctly noticed, your example is different from the 
nginx code, since it uses a local variable (and not an allocated 
object, like nginx does).

The next question you might consider is: what difference does it 
make, if any?  In particular, you might want to consider what 
memcpy() actually does.  As per the quoted paragraph, it does two 
things:

1. Modifies the object pointed to by (void *) &y.t;

2. Changes the effective type of the modified object for 
   subsequent accesses (to the effective type of the object from 
   which the value is copied) if there is no declared type.

For your example, (2) is not relevant, since there is a 
declared type.  So the remaining part is (1).
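
For contrast, here is a minimal sketch of the case where (2) does 
apply, namely storage with no declared type.  The helper name 
read_b_from_allocated_copy() is hypothetical and is not taken from 
nginx; the struct is the one from the quoted example.  It is meant 
only as an illustration of the quoted C11 6.5p6 wording, not as a 
ruling on it:

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

struct s { int a; int b; };

/*
 * Hypothetical helper (not from nginx): copy a 'struct s' into
 * allocated storage with memcpy() and read member 'b' back.
 */
int
read_b_from_allocated_copy(int a, int b)
{
    struct s   x = {a, b};
    struct s  *p;
    int        z;

    /* Allocated storage has no declared type (C11 footnote 87). */
    p = malloc(sizeof(*p));
    assert(p != NULL);  /* sketch only; real code should handle failure */

    /*
     * Per C11 6.5p6, the copy gives the allocated object the
     * effective type of the source object, that is, 'struct s'.
     */
    memcpy(p, &x, sizeof(x));

    /* Reading through a 'struct s' lvalue matches that effective type. */
    z = p->b;

    free(p);
    return z;
}
```

Calling read_b_from_allocated_copy(1, 42) is expected to return 42.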

What does memcpy() actually modify, and what is it allowed to 
modify?  What makes you think that the compiler is free to assume 
that you've written only to the y.t member, and not just to some 
bytes in the y object?
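
One way to look at the bytes in question without going through 
either union member is to copy them back out with memcpy().  The 
sketch below reuses the types from the quoted example; the helper 
name copied_bytes_match() is hypothetical.  It only shows that all 
sizeof(struct s) bytes end up in the y object, and it does not by 
itself settle whether the later access via y.s.b is defined:

```c
#include <assert.h>
#include <string.h>

struct s { int a; int b; };
struct t { int a; };
union u  { struct s s; struct t t; };

/*
 * Hypothetical helper: returns 1 if every member copied into the
 * union via the address of y.t can be recovered from the y object.
 */
int
copied_bytes_match(int a, int b)
{
    struct s  x = {a, b};
    struct s  back;
    union u   y;

    /* The union is at least as large as its largest member. */
    assert(sizeof(y) >= sizeof(x));

    /* Same call as in the quoted example: the destination is &y.t. */
    memcpy(&y.t, &x, sizeof(x));

    /*
     * Copy the bytes of the y object back out.  No lvalue of type
     * 'struct s' or 'struct t' is used to access y here.
     */
    memcpy(&back, &y, sizeof(back));

    return back.a == a && back.b == b;
}
```

On mainstream compilers copied_bytes_match(42, 42) is expected to 
return 1.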

[...]

-- 
Maxim Dounin
http://mdounin.ru/


More information about the nginx-devel mailing list