[PATCH 0 of 4] Avoid dead store elimination in GCC 11+

Vadim Fedorenko vadim.fedorenko at cdnnow.ru
Tue Apr 18 09:50:01 UTC 2023


On 18.04.2023 02:54, Maxim Dounin wrote:
> Hello!
> 
> On Tue, Apr 18, 2023 at 02:07:06AM +0300, Vadim Fedorenko via nginx-devel wrote:
> 
>> GCC version 11 and newer use more aggressive way to eliminate dead stores
>> which ends up removing ngx_memzero() calls in several places. Such optimization
>> affects calculations of md5 and sha1 implemented internally in nginx. The
>> effect could be easily observed by adding a random data to buffer array in
>> md5_init() or sha1_init() functions. With this simple modifications the result
>> of the hash computation will be different each time even though the provided
>> data to hash is not changed.
> 
> If calculations of md5 and sha1 are affected, this means that the
> stores in question are not dead, and they shouldn't be eliminated
> in the first place.  From your description this looks like a bug
> in the compiler in question.

Yeah, these ngx_memzero()s must not be dead, but according to the standart they
are. In md5_final() the function is called this way: 
ngx_memzero(&ctx->buffer[used], free - 8);
That means that a new variable of type 'char *' is created with the life time
scoped to the call to ngx_memzero(). As the result of of the function is ignored 
explicitly, no other parameters are passed by pointer, and the variable is not 
accessed anywhere else, the whole call can be optimized out.

> Alternatively, this can be a bug in nginx code which makes the
> compiler think that it can eliminate these ngx_memzero() calls - for
> example, GCC is known to do such things if it sees an undefined
> behaviour in the code.

There is no undefined behavior unfortunately, everything in this place is well 
defined.

> You may want to elaborate more on how to reproduce this, and, if
> possible, how to build a minimal test case which demonstrates the
> problem.

Sure, let's elaborate a bit. To reproduce the bug you can simply apply the diff:

diff --git a/src/core/ngx_md5.c b/src/core/ngx_md5.c
index c25d0025d..67cc06438 100644
--- a/src/core/ngx_md5.c
+++ b/src/core/ngx_md5.c
@@ -24,6 +24,7 @@ ngx_md5_init(ngx_md5_t *ctx)
      ctx->d = 0x10325476;

      ctx->bytes = 0;
+    getrandom(ctx->buffer, 64, 0);
  }


This code will emulate the garbage for the stack-allocated 'ngx_md5_t md5;' in 
ngx_http_file_cache_create_key when nginx is running under the load. Then you 
can use simple configuration:

upstream test_001_origin {
   server 127.0.0.1:8000;
}

proxy_cache_path /var/cache/nginx/test-001 keys_zone=test_001:10m max_size=5g 
inactive=24h levels=1:2 use_temp_path=off;

server {
   listen 127.0.0.1:8000;

   location = /plain {
     return 200;
   }

}

server {
   listen 127.0.0.1:80;

   location /oplain {
      proxy_cache test_001;
      proxy_cache_key /oplain;
      proxy_pass http://test_001_origin/plain/;
   }
}


Every time you call 'curl http://127.0.0.1/oplain'  a new cache file will be 
created, but the md5sum of the file will be the same, meaining that the key 
stored in the file is absolutely the same.

>> Changing the code to use current implementation
>> of ngx_explicit_memzero() doesn't help because of link-time optimizations
>> enabled in RHEL 9 and derivatives. Glibc 2.34 found in RHEL 9 provides
>> explicit_bzero() function which should be used to avoid such optimization.
>> ngx_explicit_memzero() is changed to use explicit_bzero() if possible.
> 
> The ngx_explicit_memzero() function is to be used when zeroed data
> are indeed not used afterwards, for example, to make sure
> passwords are actually eliminated from memory.  It shouldn't be
> used instead of a real ngx_memzero() call - doing so might hide
> the problem, which is either in the compiler or in nginx, but
> won't fix it.

In this case the nginx code should be fixed to avoid partial memory fillings, 
but such change will come with performance penalty, especially on the CPUs 
without proper `REP MOVSB/MOVSD/MOVSQ` implementation. Controlled usage of 
explicit zeroing is much better is this case.

> As for using explicit_bzero() for it, we've looked into various
> OS-specific solutions, though there are too many variants out
> there, so it was decided that having our own simple implementation
> is a better way to go.  If it doesn't work in the particular
> setup, it might make sense to adjust our implementation - but
> given the above, it might be the same issue which causes the
> original problem.

Unfortunately, the memory barrier trick is not working anymore for linker-time 
optimizations. Linker has better information about whether the stored 
information is used again or not. And it will remove memset in such 
implementation, and it will definitely affected security-related code you 
mentioned above. explicit_bzero() function is available in well-loved *BSD 
systems now and is a proper way to do cleaning of the artifacts, doesn't matter 
which implementation is used in the specific system.


More information about the nginx-devel mailing list