Core: Avoid memcpy from NULL

Ben Kallus benjamin.p.kallus.gr at dartmouth.edu
Wed Dec 13 16:09:28 UTC 2023


Nginx calls `memcpy` with a NULL source many times during normal
execution. Passing NULL to `memcpy` as either source or destination
is undefined behavior, even when the length is 0. Accordingly, some
compilers (e.g. gcc at -O2) optimize on the assumption that `memcpy`
arguments are not NULL. Nginx built with UBSan crashes during startup
because of this issue.

Consider the following function:
```C
#include <string.h>

int f(int i) {
    char a[] = {'a'};
    void *src = i ? a : NULL;
    char dst[1];
    memcpy(dst, src, 0);
    return src == NULL;
}
```
Here's what gcc 13.2 with -O2 -fno-builtin compiles it to:
```asm
f:
        sub     rsp, 24
        xor     eax, eax
        test    edi, edi
        lea     rsi, [rsp+14]
        lea     rdi, [rsp+15]
        mov     BYTE PTR [rsp+14], 97
        cmove   rsi, rax
        xor     edx, edx
        call    memcpy
        xor     eax, eax
        add     rsp, 24
        ret
```
Note that `f` always returns 0, regardless of the value of `i`: the
`xor eax, eax` after the call replaces the `src == NULL` comparison
with a constant.

Feel free to try for yourself at https://gcc.godbolt.org/z/zfvnMMsds

The reasoning here is that since `memcpy` from NULL is UB, the
optimizer is free to assume that `src` is non-null. You might consider
this to be a problem with the compiler, or the C standard, and I might
agree. Regardless, relying on UB is inherently non-portable, and it
requires ongoing maintenance to ensure that new compiler releases
don't break existing assumptions about the behavior of undefined
operations.

The following patch adds a check to `ngx_memcpy` and `ngx_cpymem` that
makes a 0-length memcpy explicitly a no-op. Since every memcpy from
NULL in Nginx uses n==0, this should be sufficient to avoid the UB.
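
For illustration, the same guard can be written as an ordinary
function. This is only a sketch: `copy_bytes` is a hypothetical name,
not something in Nginx or in the patch. It mirrors `ngx_cpymem` by
returning the position just past the copied bytes.
```C
#include <assert.h>
#include <stddef.h>
#include <string.h>

/*
 * Hypothetical helper (not part of the patch): copy n bytes from src
 * to dst and return dst + n, like ngx_cpymem.  memcpy is never called
 * when n == 0, so src may be NULL in that case.
 */
static unsigned char *
copy_bytes(unsigned char *dst, const void *src, size_t n)
{
    if (n == 0) {
        return dst;
    }

    /* A non-empty copy still needs valid pointers. */
    assert(dst != NULL && src != NULL);

    return (unsigned char *) memcpy(dst, src, n) + n;
}
```
A function evaluates `n` only once, whereas the macros in the patch
expand it more than once, so a macro argument with side effects would
behave differently between the two forms.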

It would be more efficient to instead add a check only at the call
sites of `ngx_memcpy` and `ngx_cpymem` that might be reached with
src==NULL, but in the discussion of a previous patch that proposed
such a change, a more straightforward and tidy solution was desired.

It may also be worth considering similar checks for `memset`,
`memmove`, etc. called with NULL. I think this is not necessary unless
it is demonstrated that Nginx actually executes such undefined calls.
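
Should such calls turn up, the same n == 0 guard would probably carry
over directly. A minimal sketch, assuming hypothetical wrappers named
`set_bytes` and `move_bytes` (neither exists in Nginx or in this
patch):
```C
#include <stddef.h>
#include <string.h>

/* Hypothetical wrapper: memset that is a no-op when n == 0. */
static void *
set_bytes(void *dst, int c, size_t n)
{
    return n == 0 ? dst : memset(dst, c, n);
}

/* Hypothetical wrapper: memmove that is a no-op when n == 0. */
static void *
move_bytes(void *dst, const void *src, size_t n)
{
    return n == 0 ? dst : memmove(dst, src, n);
}
```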

# HG changeset patch
# User Ben Kallus <benjamin.p.kallus.gr at dartmouth.edu>
# Date 1702406466 18000
#      Tue Dec 12 13:41:06 2023 -0500
# Node ID d270203d4ecf77cc14a2652c727e236afc659f4a
# Parent  a6f79f044de58b594563ac03139cd5e2e6a81bdb
Add NULL check to ngx_memcpy and ngx_cpymem to satisfy UBSan.

diff -r a6f79f044de5 -r d270203d4ecf src/core/ngx_string.c
--- a/src/core/ngx_string.c     Wed Nov 29 10:58:21 2023 +0400
+++ b/src/core/ngx_string.c     Tue Dec 12 13:41:06 2023 -0500
@@ -2098,6 +2098,10 @@
         ngx_debug_point();
     }

+    if (n == 0) {
+        return dst;
+    }
+
     return memcpy(dst, src, n);
 }

diff -r a6f79f044de5 -r d270203d4ecf src/core/ngx_string.h
--- a/src/core/ngx_string.h     Wed Nov 29 10:58:21 2023 +0400
+++ b/src/core/ngx_string.h     Tue Dec 12 13:41:06 2023 -0500
@@ -103,8 +103,9 @@
  * gcc3 compiles memcpy(d, s, 4) to the inline "mov"es.
  * icc8 compile memcpy(d, s, 4) to the inline "mov"es or XMM moves.
  */
-#define ngx_memcpy(dst, src, n)   (void) memcpy(dst, src, n)
-#define ngx_cpymem(dst, src, n)   (((u_char *) memcpy(dst, src, n)) + (n))
+#define ngx_memcpy(dst, src, n) (void) ((n) == 0 ? (dst) : memcpy(dst, src, n))
+#define ngx_cpymem(dst, src, n)                                              \
+    ((u_char *) ((n) == 0 ? (dst) : memcpy(dst, src, n)) + (n))

 #endif

diff -r a6f79f044de5 -r d270203d4ecf src/http/v2/ngx_http_v2.c
--- a/src/http/v2/ngx_http_v2.c Wed Nov 29 10:58:21 2023 +0400
+++ b/src/http/v2/ngx_http_v2.c Tue Dec 12 13:41:06 2023 -0500
@@ -3998,9 +3998,7 @@
                 n = size;
             }

-            if (n > 0) {
-                rb->buf->last = ngx_cpymem(rb->buf->last, pos, n);
-            }
+            rb->buf->last = ngx_cpymem(rb->buf->last, pos, n);

             ngx_log_debug1(NGX_LOG_DEBUG_HTTP, fc->log, 0,
                            "http2 request body recv %uz", n);

