Core: Avoid memcpy from NULL
Ben Kallus
benjamin.p.kallus.gr at dartmouth.edu
Fri Dec 15 16:28:46 UTC 2023
> - rewrite of `ngx_memcpy` define like here:
> ```
> + #define ngx_memcpy(dst, src, n) (void) ((n) == 0 ? (dst) : memcpy(dst, src, n))
> ```
> may introduce a regression or compat issues, e. g. fully functioning codes like that may become broken hereafter:
> ```
> ngx_memcpy(dst, src, ++len); // because n would be incremented twice in the macro now
> ```
> Sure, `ngx_cpymem` has also the same issue, but it had that already before the "fix"...
> Anyway, I'm always against of such macros (no matter with or without check it would be better as an inline function instead).
I totally agree. I'm not a fan of function-like macros either; I
introduced those extra evaluations of n and dst only because I saw
that cpymem already valuates n twice. I'd be happy to update these to
functions (explicitly inlined or not) in the final changeset.
> - since it is very rare situation that one needs only a memcpy without to know whether previous alloc may fail
> (e. g. some of pointers were NULL), me too thinks that the caller should be responsible for the check.
> So I would not extend ngx_memcpy or ngx_cpymem in that way.
I am inclined to agree with you on this point, but Maxim previously wrote this:
> Trying to check length everywhere is just ugly and unreadable.
Which I also think is reasonable.
Does anyone else have a position on this?
-Ben
On 12/15/23, Dipl. Ing. Sergey Brester via nginx-devel
<nginx-devel at nginx.org> wrote:
> Enclosed few thoughts to the subject:
>
> - since it is very rare situation that one needs only a memcpy without
> to know whether previous alloc may fail
> (e. g. some of pointers were NULL), me too thinks that the caller
> should be responsible for the check.
> So I would not extend ngx_memcpy or ngx_cpymem in that way.
>
> - rewrite of `ngx_memcpy` define like here:
> ```
> + #define ngx_memcpy(dst, src, n) (void) ((n) == 0 ? (dst) :
> memcpy(dst, src, n))
> ```
> may introduce a regression or compat issues, e. g. fully functioning
> codes like that may become broken hereafter:
> ```
> ngx_memcpy(dst, src, ++len); // because n would be incremented twice
> in the macro now
> ```
> Sure, `ngx_cpymem` has also the same issue, but it had that already
> before the "fix"...
> Anyway, I'm always against of such macros (no matter with or without
> check it would be better as an inline function instead).
>
> My conclusion:
> a fix of affected places invoking `ngx_memcpy` and `ngx_cpymem`, and
> possibly an assert in `ngx_memcpy`
> and `ngx_cpymem` would be fully enough, in my opinion.
>
> Regards,
> Sergey.
>
> On 15.12.2023 03:41, Maxim Dounin wrote:
>
>> Hello!
>>
>> On Wed, Dec 13, 2023 at 11:09:28AM -0500, Ben Kallus wrote:
>>
>> Nginx executes numerous `memcpy`s from NULL during normal
>> execution.
>> `memcpy`ing to or from NULL is undefined behavior. Accordingly,
>> some
>> compilers (gcc -O2) make optimizations that assume `memcpy`
>> arguments
>> are not NULL. Nginx with UBSan crashes during startup due to this
>> issue.
>>
>> Consider the following function:
>> ```C
>> #include <string.h>
>>
>> int f(int i) {
>> char a[] = {'a'};
>> void *src = i ? a : NULL;
>> char dst[1];
>> memcpy(dst, src, 0);
>> return src == NULL;
>> }
>> ```
>> Here's what gcc13.2 -O2 -fno-builtin will do to it:
>> ```asm
>> f:
>> sub rsp, 24
>> xor eax, eax
>> test edi, edi
>> lea rsi, [rsp+14]
>> lea rdi, [rsp+15]
>> mov BYTE PTR [rsp+14], 97
>> cmove rsi, rax
>> xor edx, edx
>> call memcpy
>> xor eax, eax
>> add rsp, 24
>> ret
>> ```
>> Note that `f` always returns 0, regardless of the value of `i`.
>>
>> Feel free to try for yourself at
>> https://gcc.godbolt.org/z/zfvnMMsds
>>
>> The reasoning here is that since memcpy from NULL is UB, the
>> optimizer
>> is free to assume that `src` is non-null. You might consider this
>> to
>> be a problem with the compiler, or the C standard, and I might
>> agree.
>> Regardless, relying on UB is inherently un-portable, and requires
>> maintenance to ensure that new compiler releases don't break
>> existing
>> assumptions about the behavior of undefined operations.
>>
>> The following patch adds a check to `ngx_memcpy` and `ngx_cpymem`
>> that
>> makes 0-length memcpy explicitly a noop. Since all memcpying from
>> NULL
>> in Nginx uses n==0, this should be sufficient to avoid UB.
>>
>> It would be more efficient to instead add a check to every call to
>> ngx_memcpy and ngx_cpymem that might be used with src==NULL, but in
>> the discussion of a previous patch that proposed such a change, a
>> more
>> straightforward and tidy solution was desired.
>> It may also be worth considering adding checks for NULL memset,
>> memmove, etc. I think this is not necessary unless it is
>> demonstrated
>> that Nginx actually executes such undefined calls.
>>
>> # HG changeset patch
>> # User Ben Kallus <benjamin.p.kallus.gr at dartmouth.edu>
>> # Date 1702406466 18000
>> # Tue Dec 12 13:41:06 2023 -0500
>> # Node ID d270203d4ecf77cc14a2652c727e236afc659f4a
>> # Parent a6f79f044de58b594563ac03139cd5e2e6a81bdb
>> Add NULL check to ngx_memcpy and ngx_cpymem to satisfy UBSan.
>>
>> diff -r a6f79f044de5 -r d270203d4ecf src/core/ngx_string.c
>> --- a/src/core/ngx_string.c Wed Nov 29 10:58:21 2023 +0400
>> +++ b/src/core/ngx_string.c Tue Dec 12 13:41:06 2023 -0500
>> @@ -2098,6 +2098,10 @@
>> ngx_debug_point();
>> }
>>
>> + if (n == 0) {
>> + return dst;
>> + }
>> +
>> return memcpy(dst, src, n);
>> }
>>
>> diff -r a6f79f044de5 -r d270203d4ecf src/core/ngx_string.h
>> --- a/src/core/ngx_string.h Wed Nov 29 10:58:21 2023 +0400
>> +++ b/src/core/ngx_string.h Tue Dec 12 13:41:06 2023 -0500
>> @@ -103,8 +103,9 @@
>> * gcc3 compiles memcpy(d, s, 4) to the inline "mov"es.
>> * icc8 compile memcpy(d, s, 4) to the inline "mov"es or XMM
>> moves.
>> */
>> -#define ngx_memcpy(dst, src, n) (void) memcpy(dst, src, n)
>> -#define ngx_cpymem(dst, src, n) (((u_char *) memcpy(dst, src,
>> n)) + (n))
>> +#define ngx_memcpy(dst, src, n) (void) ((n) == 0 ? (dst) :
>> memcpy(dst, src, n))
>> +#define ngx_cpymem(dst, src, n)
>> \
>> + ((u_char *) ((n) == 0 ? (dst) : memcpy(dst, src, n)) + (n))
>>
>> #endif
>>
>> diff -r a6f79f044de5 -r d270203d4ecf src/http/v2/ngx_http_v2.c
>> --- a/src/http/v2/ngx_http_v2.c Wed Nov 29 10:58:21 2023 +0400
>> +++ b/src/http/v2/ngx_http_v2.c Tue Dec 12 13:41:06 2023 -0500
>> @@ -3998,9 +3998,7 @@
>> n = size;
>> }
>>
>> - if (n > 0) {
>> - rb->buf->last = ngx_cpymem(rb->buf->last, pos, n);
>> - }
>> + rb->buf->last = ngx_cpymem(rb->buf->last, pos, n);
>>
>> ngx_log_debug1(NGX_LOG_DEBUG_HTTP, fc->log, 0,
>> "http2 request body recv %uz", n);
>>
>>
>> For the record, I've already provided some feedback to Ben in the
>> ticket here:
>>
>> https://trac.nginx.org/nginx/ticket/2570
>>
>> And pointed to the existing thread here:
>>
>> https://mailman.nginx.org/pipermail/nginx-devel/2023-October/PX7VH5A273NLUGSYC7DR2AZRU75CIQ3Q.html
>> https://mailman.nginx.org/pipermail/nginx-devel/2023-December/DCGUEGEFS6TSVIWNEWUEZO3FZMR6ESYZ.html
>>
>> Hope this helps.
>>
>> --
>> Maxim Dounin
>> http://mdounin.ru/
>> _______________________________________________
>> nginx-devel mailing list
>> nginx-devel at nginx.org
>> https://mailman.nginx.org/mailman/listinfo/nginx-devel
>>
> _______________________________________________
> nginx-devel mailing list
> nginx-devel at nginx.org
> https://mailman.nginx.org/mailman/listinfo/nginx-devel
>
More information about the nginx-devel
mailing list