Core: Avoid memcpy from NULL
Dipl. Ing. Sergey Brester
serg.brester at sebres.de
Fri Dec 15 14:46:19 UTC 2023
Enclosed few thoughts to the subject:
- since it is very rare situation that one needs only a memcpy without
to know whether previous alloc may fail
(e. g. some of pointers were NULL), me too thinks that the caller
should be responsible for the check.
So I would not extend ngx_memcpy or ngx_cpymem in that way.
- rewrite of `ngx_memcpy` define like here:
```
+ #define ngx_memcpy(dst, src, n) (void) ((n) == 0 ? (dst) :
memcpy(dst, src, n))
```
may introduce a regression or compat issues, e. g. fully functioning
codes like that may become broken hereafter:
```
ngx_memcpy(dst, src, ++len); // because n would be incremented twice
in the macro now
```
Sure, `ngx_cpymem` has also the same issue, but it had that already
before the "fix"...
Anyway, I'm always against of such macros (no matter with or without
check it would be better as an inline function instead).
My conclusion:
a fix of affected places invoking `ngx_memcpy` and `ngx_cpymem`, and
possibly an assert in `ngx_memcpy`
and `ngx_cpymem` would be fully enough, in my opinion.
Regards,
Sergey.
On 15.12.2023 03:41, Maxim Dounin wrote:
> Hello!
>
> On Wed, Dec 13, 2023 at 11:09:28AM -0500, Ben Kallus wrote:
>
> Nginx executes numerous `memcpy`s from NULL during normal
> execution.
> `memcpy`ing to or from NULL is undefined behavior. Accordingly,
> some
> compilers (gcc -O2) make optimizations that assume `memcpy`
> arguments
> are not NULL. Nginx with UBSan crashes during startup due to this
> issue.
>
> Consider the following function:
> ```C
> #include <string.h>
>
> int f(int i) {
> char a[] = {'a'};
> void *src = i ? a : NULL;
> char dst[1];
> memcpy(dst, src, 0);
> return src == NULL;
> }
> ```
> Here's what gcc13.2 -O2 -fno-builtin will do to it:
> ```asm
> f:
> sub rsp, 24
> xor eax, eax
> test edi, edi
> lea rsi, [rsp+14]
> lea rdi, [rsp+15]
> mov BYTE PTR [rsp+14], 97
> cmove rsi, rax
> xor edx, edx
> call memcpy
> xor eax, eax
> add rsp, 24
> ret
> ```
> Note that `f` always returns 0, regardless of the value of `i`.
>
> Feel free to try for yourself at
> https://gcc.godbolt.org/z/zfvnMMsds
>
> The reasoning here is that since memcpy from NULL is UB, the
> optimizer
> is free to assume that `src` is non-null. You might consider this
> to
> be a problem with the compiler, or the C standard, and I might
> agree.
> Regardless, relying on UB is inherently un-portable, and requires
> maintenance to ensure that new compiler releases don't break
> existing
> assumptions about the behavior of undefined operations.
>
> The following patch adds a check to `ngx_memcpy` and `ngx_cpymem`
> that
> makes 0-length memcpy explicitly a noop. Since all memcpying from
> NULL
> in Nginx uses n==0, this should be sufficient to avoid UB.
>
> It would be more efficient to instead add a check to every call to
> ngx_memcpy and ngx_cpymem that might be used with src==NULL, but in
> the discussion of a previous patch that proposed such a change, a
> more
> straightforward and tidy solution was desired.
> It may also be worth considering adding checks for NULL memset,
> memmove, etc. I think this is not necessary unless it is
> demonstrated
> that Nginx actually executes such undefined calls.
>
> # HG changeset patch
> # User Ben Kallus <benjamin.p.kallus.gr at dartmouth.edu>
> # Date 1702406466 18000
> # Tue Dec 12 13:41:06 2023 -0500
> # Node ID d270203d4ecf77cc14a2652c727e236afc659f4a
> # Parent a6f79f044de58b594563ac03139cd5e2e6a81bdb
> Add NULL check to ngx_memcpy and ngx_cpymem to satisfy UBSan.
>
> diff -r a6f79f044de5 -r d270203d4ecf src/core/ngx_string.c
> --- a/src/core/ngx_string.c Wed Nov 29 10:58:21 2023 +0400
> +++ b/src/core/ngx_string.c Tue Dec 12 13:41:06 2023 -0500
> @@ -2098,6 +2098,10 @@
> ngx_debug_point();
> }
>
> + if (n == 0) {
> + return dst;
> + }
> +
> return memcpy(dst, src, n);
> }
>
> diff -r a6f79f044de5 -r d270203d4ecf src/core/ngx_string.h
> --- a/src/core/ngx_string.h Wed Nov 29 10:58:21 2023 +0400
> +++ b/src/core/ngx_string.h Tue Dec 12 13:41:06 2023 -0500
> @@ -103,8 +103,9 @@
> * gcc3 compiles memcpy(d, s, 4) to the inline "mov"es.
> * icc8 compile memcpy(d, s, 4) to the inline "mov"es or XMM
> moves.
> */
> -#define ngx_memcpy(dst, src, n) (void) memcpy(dst, src, n)
> -#define ngx_cpymem(dst, src, n) (((u_char *) memcpy(dst, src,
> n)) + (n))
> +#define ngx_memcpy(dst, src, n) (void) ((n) == 0 ? (dst) :
> memcpy(dst, src, n))
> +#define ngx_cpymem(dst, src, n)
> \
> + ((u_char *) ((n) == 0 ? (dst) : memcpy(dst, src, n)) + (n))
>
> #endif
>
> diff -r a6f79f044de5 -r d270203d4ecf src/http/v2/ngx_http_v2.c
> --- a/src/http/v2/ngx_http_v2.c Wed Nov 29 10:58:21 2023 +0400
> +++ b/src/http/v2/ngx_http_v2.c Tue Dec 12 13:41:06 2023 -0500
> @@ -3998,9 +3998,7 @@
> n = size;
> }
>
> - if (n > 0) {
> - rb->buf->last = ngx_cpymem(rb->buf->last, pos, n);
> - }
> + rb->buf->last = ngx_cpymem(rb->buf->last, pos, n);
>
> ngx_log_debug1(NGX_LOG_DEBUG_HTTP, fc->log, 0,
> "http2 request body recv %uz", n);
>
>
> For the record, I've already provided some feedback to Ben in the
> ticket here:
>
> https://trac.nginx.org/nginx/ticket/2570
>
> And pointed to the existing thread here:
>
> https://mailman.nginx.org/pipermail/nginx-devel/2023-October/PX7VH5A273NLUGSYC7DR2AZRU75CIQ3Q.html
> https://mailman.nginx.org/pipermail/nginx-devel/2023-December/DCGUEGEFS6TSVIWNEWUEZO3FZMR6ESYZ.html
>
> Hope this helps.
>
> --
> Maxim Dounin
> http://mdounin.ru/
> _______________________________________________
> nginx-devel mailing list
> nginx-devel at nginx.org
> https://mailman.nginx.org/mailman/listinfo/nginx-devel
>
More information about the nginx-devel
mailing list