Core: Avoid memcpy from NULL

Maxim Dounin mdounin at mdounin.ru
Fri Dec 15 02:41:36 UTC 2023


Hello!

On Wed, Dec 13, 2023 at 11:09:28AM -0500, Ben Kallus wrote:

> Nginx executes numerous `memcpy`s from NULL during normal execution.
> `memcpy`ing to or from NULL is undefined behavior. Accordingly, some
> compilers (gcc -O2) make optimizations that assume `memcpy` arguments
> are not NULL. Nginx with UBSan crashes during startup due to this
> issue.
> 
> Consider the following function:
> ```C
> #include <string.h>
> 
> int f(int i) {
>     char a[] = {'a'};
>     void *src = i ? a : NULL;
>     char dst[1];
>     memcpy(dst, src, 0);
>     return src == NULL;
> }
> ```
> Here's what gcc13.2 -O2 -fno-builtin will do to it:
> ```asm
> f:
>         sub     rsp, 24
>         xor     eax, eax
>         test    edi, edi
>         lea     rsi, [rsp+14]
>         lea     rdi, [rsp+15]
>         mov     BYTE PTR [rsp+14], 97
>         cmove   rsi, rax
>         xor     edx, edx
>         call    memcpy
>         xor     eax, eax
>         add     rsp, 24
>         ret
> ```
> Note that `f` always returns 0, regardless of the value of `i`.
> 
> Feel free to try for yourself at https://gcc.godbolt.org/z/zfvnMMsds
> 
> The reasoning here is that since memcpy from NULL is UB, the optimizer
> is free to assume that `src` is non-null. You might consider this to
> be a problem with the compiler, or the C standard, and I might agree.
> Regardless, relying on UB is inherently un-portable, and requires
> maintenance to ensure that new compiler releases don't break existing
> assumptions about the behavior of undefined operations.
> 
> The following patch adds a check to `ngx_memcpy` and `ngx_cpymem` that
> makes 0-length memcpy explicitly a noop. Since all memcpying from NULL
> in Nginx uses n==0, this should be sufficient to avoid UB.
> 
> It would be more efficient to instead add a check to every call to
> ngx_memcpy and ngx_cpymem that might be used with src==NULL, but in
> the discussion of a previous patch that proposed such a change, a more
> straightforward and tidy solution was desired.
> It may also be worth considering adding checks for NULL memset,
> memmove, etc. I think this is not necessary unless it is demonstrated
> that Nginx actually executes such undefined calls.
> 
> # HG changeset patch
> # User Ben Kallus <benjamin.p.kallus.gr at dartmouth.edu>
> # Date 1702406466 18000
> #      Tue Dec 12 13:41:06 2023 -0500
> # Node ID d270203d4ecf77cc14a2652c727e236afc659f4a
> # Parent  a6f79f044de58b594563ac03139cd5e2e6a81bdb
> Add NULL check to ngx_memcpy and ngx_cpymem to satisfy UBSan.
> 
> diff -r a6f79f044de5 -r d270203d4ecf src/core/ngx_string.c
> --- a/src/core/ngx_string.c     Wed Nov 29 10:58:21 2023 +0400
> +++ b/src/core/ngx_string.c     Tue Dec 12 13:41:06 2023 -0500
> @@ -2098,6 +2098,10 @@
>          ngx_debug_point();
>      }
> 
> +    if (n == 0) {
> +        return dst;
> +    }
> +
>      return memcpy(dst, src, n);
>  }
> 
> diff -r a6f79f044de5 -r d270203d4ecf src/core/ngx_string.h
> --- a/src/core/ngx_string.h     Wed Nov 29 10:58:21 2023 +0400
> +++ b/src/core/ngx_string.h     Tue Dec 12 13:41:06 2023 -0500
> @@ -103,8 +103,9 @@
>   * gcc3 compiles memcpy(d, s, 4) to the inline "mov"es.
>   * icc8 compile memcpy(d, s, 4) to the inline "mov"es or XMM moves.
>   */
> -#define ngx_memcpy(dst, src, n)   (void) memcpy(dst, src, n)
> -#define ngx_cpymem(dst, src, n)   (((u_char *) memcpy(dst, src, n)) + (n))
> +#define ngx_memcpy(dst, src, n) (void) ((n) == 0 ? (dst) : memcpy(dst, src, n))
> +#define ngx_cpymem(dst, src, n)                                              \
> +    ((u_char *) ((n) == 0 ? (dst) : memcpy(dst, src, n)) + (n))
> 
>  #endif
> 
> diff -r a6f79f044de5 -r d270203d4ecf src/http/v2/ngx_http_v2.c
> --- a/src/http/v2/ngx_http_v2.c Wed Nov 29 10:58:21 2023 +0400
> +++ b/src/http/v2/ngx_http_v2.c Tue Dec 12 13:41:06 2023 -0500
> @@ -3998,9 +3998,7 @@
>                  n = size;
>              }
> 
> -            if (n > 0) {
> -                rb->buf->last = ngx_cpymem(rb->buf->last, pos, n);
> -            }
> +            rb->buf->last = ngx_cpymem(rb->buf->last, pos, n);
> 
>              ngx_log_debug1(NGX_LOG_DEBUG_HTTP, fc->log, 0,
>                             "http2 request body recv %uz", n);

For the record, I've already provided some feedback to Ben in the 
ticket here:

https://trac.nginx.org/nginx/ticket/2570

And pointed to the existing thread here:

https://mailman.nginx.org/pipermail/nginx-devel/2023-October/PX7VH5A273NLUGSYC7DR2AZRU75CIQ3Q.html
https://mailman.nginx.org/pipermail/nginx-devel/2023-December/DCGUEGEFS6TSVIWNEWUEZO3FZMR6ESYZ.html

Hope this helps.

-- 
Maxim Dounin
http://mdounin.ru/


More information about the nginx-devel mailing list