[PATCH 0 of 4] Avoid dead store elimination in GCC 11+

Alejandro Colomar alx.manpages at gmail.com
Tue Apr 18 14:16:20 UTC 2023


Hello Vladim,

On 4/18/23 11:50, Vadim Fedorenko via nginx-devel wrote:
> On 18.04.2023 02:54, Maxim Dounin wrote:
>> Hello!
>>
>> On Tue, Apr 18, 2023 at 02:07:06AM +0300, Vadim Fedorenko via nginx-devel wrote:
>>
>>> GCC version 11 and newer use more aggressive way to eliminate dead stores
>>> which ends up removing ngx_memzero() calls in several places. Such optimization
>>> affects calculations of md5 and sha1 implemented internally in nginx. The
>>> effect could be easily observed by adding a random data to buffer array in
>>> md5_init() or sha1_init() functions. With this simple modifications the result
>>> of the hash computation will be different each time even though the provided
>>> data to hash is not changed.
>>
>> If calculations of md5 and sha1 are affected, this means that the
>> stores in question are not dead, and they shouldn't be eliminated
>> in the first place.  From your description this looks like a bug
>> in the compiler in question.
> 
> Yeah, these ngx_memzero()s must not be dead, but according to the standart they
> are. In md5_final() the function is called this way: 
> ngx_memzero(&ctx->buffer[used], free - 8);
> That means that a new variable of type 'char *' is created with the life time

Correction: no variable is being created, but rather an object.

> scoped to the call to ngx_memzero(). As the result of of the function is ignored 
> explicitly, no other parameters are passed by pointer, and the variable is not 
> accessed anywhere else, the whole call can be optimized out.
> 
>> Alternatively, this can be a bug in nginx code which makes the
>> compiler think that it can eliminate these ngx_memzero() calls - for
>> example, GCC is known to do such things if it sees an undefined
>> behaviour in the code.
> 
> There is no undefined behavior unfortunately, everything in this place is well 
> defined.

No.  UB is being invoked somewhere, or this couldn't possibly happen.

If this object being created by ngx_memzero() only lives during this short time,
then why does it affect any other code?  It seems to be some aliasing error,
since some other code seems to be reading that "unused" memory.

Maybe the UB is far from this code, and fixing it may not be trivial, but Maxim
is right, in that using explicit_bzero(3) or equivalent is just hiding the bug
instead of fixing it.  Since GCC 11 is relatively old and tested, I'm inclined
to think the bug is in nginx.  Maybe if nginx specified -fno-strict-aliasing,
this UB issue could be resolved; you could try.

Cheers,
Alex

> 
>> You may want to elaborate more on how to reproduce this, and, if
>> possible, how to build a minimal test case which demonstrates the
>> problem.
> 
> Sure, let's elaborate a bit. To reproduce the bug you can simply apply the diff:
> 
> diff --git a/src/core/ngx_md5.c b/src/core/ngx_md5.c
> index c25d0025d..67cc06438 100644
> --- a/src/core/ngx_md5.c
> +++ b/src/core/ngx_md5.c
> @@ -24,6 +24,7 @@ ngx_md5_init(ngx_md5_t *ctx)
>       ctx->d = 0x10325476;
> 
>       ctx->bytes = 0;
> +    getrandom(ctx->buffer, 64, 0);
>   }
> 
> 
> This code will emulate the garbage for the stack-allocated 'ngx_md5_t md5;' in 
> ngx_http_file_cache_create_key when nginx is running under the load. Then you 
> can use simple configuration:
> 
> upstream test_001_origin {
>    server 127.0.0.1:8000;
> }
> 
> proxy_cache_path /var/cache/nginx/test-001 keys_zone=test_001:10m max_size=5g 
> inactive=24h levels=1:2 use_temp_path=off;
> 
> server {
>    listen 127.0.0.1:8000;
> 
>    location = /plain {
>      return 200;
>    }
> 
> }
> 
> server {
>    listen 127.0.0.1:80;
> 
>    location /oplain {
>       proxy_cache test_001;
>       proxy_cache_key /oplain;
>       proxy_pass http://test_001_origin/plain/;
>    }
> }
> 
> 
> Every time you call 'curl http://127.0.0.1/oplain'  a new cache file will be 
> created, but the md5sum of the file will be the same, meaining that the key 
> stored in the file is absolutely the same.
> 
>>> Changing the code to use current implementation
>>> of ngx_explicit_memzero() doesn't help because of link-time optimizations
>>> enabled in RHEL 9 and derivatives. Glibc 2.34 found in RHEL 9 provides
>>> explicit_bzero() function which should be used to avoid such optimization.
>>> ngx_explicit_memzero() is changed to use explicit_bzero() if possible.
>>
>> The ngx_explicit_memzero() function is to be used when zeroed data
>> are indeed not used afterwards, for example, to make sure
>> passwords are actually eliminated from memory.  It shouldn't be
>> used instead of a real ngx_memzero() call - doing so might hide
>> the problem, which is either in the compiler or in nginx, but
>> won't fix it.
> 
> In this case the nginx code should be fixed to avoid partial memory fillings, 
> but such change will come with performance penalty, especially on the CPUs 
> without proper `REP MOVSB/MOVSD/MOVSQ` implementation. Controlled usage of 
> explicit zeroing is much better is this case.
> 
>> As for using explicit_bzero() for it, we've looked into various
>> OS-specific solutions, though there are too many variants out
>> there, so it was decided that having our own simple implementation
>> is a better way to go.  If it doesn't work in the particular
>> setup, it might make sense to adjust our implementation - but
>> given the above, it might be the same issue which causes the
>> original problem.
> 
> Unfortunately, the memory barrier trick is not working anymore for linker-time 
> optimizations. Linker has better information about whether the stored 
> information is used again or not. And it will remove memset in such 
> implementation, and it will definitely affected security-related code you 
> mentioned above. explicit_bzero() function is available in well-loved *BSD 
> systems now and is a proper way to do cleaning of the artifacts, doesn't matter 
> which implementation is used in the specific system.
> _______________________________________________
> nginx-devel mailing list
> nginx-devel at nginx.org
> https://mailman.nginx.org/mailman/listinfo/nginx-devel

-- 
<http://www.alejandro-colomar.es/>
GPG key fingerprint: A9348594CE31283A826FBDD8D57633D441E25BB5
-------------- next part --------------
A non-text attachment was scrubbed...
Name: OpenPGP_signature
Type: application/pgp-signature
Size: 833 bytes
Desc: OpenPGP digital signature
URL: <http://mailman.nginx.org/pipermail/nginx-devel/attachments/20230418/9b062ae4/attachment.bin>


More information about the nginx-devel mailing list