Core: Avoid memcpy from NULL
Ben Kallus
benjamin.p.kallus.gr at dartmouth.edu
Wed Dec 13 16:09:28 UTC 2023
Nginx executes numerous `memcpy`s from NULL during normal execution.
`memcpy`ing to or from NULL is undefined behavior. Accordingly, some
compilers (gcc -O2) make optimizations that assume `memcpy` arguments
are not NULL. Nginx with UBSan crashes during startup due to this
issue.
Consider the following function:
```C
#include <string.h>
int f(int i) {
char a[] = {'a'};
void *src = i ? a : NULL;
char dst[1];
memcpy(dst, src, 0);
return src == NULL;
}
```
Here's what gcc13.2 -O2 -fno-builtin will do to it:
```asm
f:
sub rsp, 24
xor eax, eax
test edi, edi
lea rsi, [rsp+14]
lea rdi, [rsp+15]
mov BYTE PTR [rsp+14], 97
cmove rsi, rax
xor edx, edx
call memcpy
xor eax, eax
add rsp, 24
ret
```
Note that `f` always returns 0, regardless of the value of `i`.
Feel free to try for yourself at https://gcc.godbolt.org/z/zfvnMMsds
The reasoning here is that since memcpy from NULL is UB, the optimizer
is free to assume that `src` is non-null. You might consider this to
be a problem with the compiler, or the C standard, and I might agree.
Regardless, relying on UB is inherently un-portable, and requires
maintenance to ensure that new compiler releases don't break existing
assumptions about the behavior of undefined operations.
The following patch adds a check to `ngx_memcpy` and `ngx_cpymem` that
makes 0-length memcpy explicitly a noop. Since all memcpying from NULL
in Nginx uses n==0, this should be sufficient to avoid UB.
It would be more efficient to instead add a check to every call to
ngx_memcpy and ngx_cpymem that might be used with src==NULL, but in
the discussion of a previous patch that proposed such a change, a more
straightforward and tidy solution was desired.
It may also be worth considering adding checks for NULL memset,
memmove, etc. I think this is not necessary unless it is demonstrated
that Nginx actually executes such undefined calls.
# HG changeset patch
# User Ben Kallus <benjamin.p.kallus.gr at dartmouth.edu>
# Date 1702406466 18000
# Tue Dec 12 13:41:06 2023 -0500
# Node ID d270203d4ecf77cc14a2652c727e236afc659f4a
# Parent a6f79f044de58b594563ac03139cd5e2e6a81bdb
Add NULL check to ngx_memcpy and ngx_cpymem to satisfy UBSan.
diff -r a6f79f044de5 -r d270203d4ecf src/core/ngx_string.c
--- a/src/core/ngx_string.c Wed Nov 29 10:58:21 2023 +0400
+++ b/src/core/ngx_string.c Tue Dec 12 13:41:06 2023 -0500
@@ -2098,6 +2098,10 @@
ngx_debug_point();
}
+ if (n == 0) {
+ return dst;
+ }
+
return memcpy(dst, src, n);
}
diff -r a6f79f044de5 -r d270203d4ecf src/core/ngx_string.h
--- a/src/core/ngx_string.h Wed Nov 29 10:58:21 2023 +0400
+++ b/src/core/ngx_string.h Tue Dec 12 13:41:06 2023 -0500
@@ -103,8 +103,9 @@
* gcc3 compiles memcpy(d, s, 4) to the inline "mov"es.
* icc8 compile memcpy(d, s, 4) to the inline "mov"es or XMM moves.
*/
-#define ngx_memcpy(dst, src, n) (void) memcpy(dst, src, n)
-#define ngx_cpymem(dst, src, n) (((u_char *) memcpy(dst, src, n)) + (n))
+#define ngx_memcpy(dst, src, n) (void) ((n) == 0 ? (dst) : memcpy(dst, src, n))
+#define ngx_cpymem(dst, src, n) \
+ ((u_char *) ((n) == 0 ? (dst) : memcpy(dst, src, n)) + (n))
#endif
diff -r a6f79f044de5 -r d270203d4ecf src/http/v2/ngx_http_v2.c
--- a/src/http/v2/ngx_http_v2.c Wed Nov 29 10:58:21 2023 +0400
+++ b/src/http/v2/ngx_http_v2.c Tue Dec 12 13:41:06 2023 -0500
@@ -3998,9 +3998,7 @@
n = size;
}
- if (n > 0) {
- rb->buf->last = ngx_cpymem(rb->buf->last, pos, n);
- }
+ rb->buf->last = ngx_cpymem(rb->buf->last, pos, n);
ngx_log_debug1(NGX_LOG_DEBUG_HTTP, fc->log, 0,
"http2 request body recv %uz", n);
More information about the nginx-devel
mailing list