inline
Alejandro Colomar
alx.manpages at gmail.com
Wed Jun 1 23:49:01 UTC 2022
Hi,
We define `nxt_inline` as:
$ grepc nxt_inline
./src/nxt_clang.h:11:
#define nxt_inline static inline __attribute__((always_inline))
I'm concerned that we may be inlining too much.
See for example the following function calls:
$ grep -rnA1 nxt_slow_path | grep -B1 nxt_thread_get_tid
src/nxt_thread.c:247: if (nxt_slow_path(thr->tid == 0)) {
src/nxt_thread.c-248- thr->tid = nxt_thread_get_tid();
--
src/nxt_thread.c:257: if (nxt_slow_path(thr->tid == 0)) {
src/nxt_thread.c-258- thr->tid = nxt_thread_get_tid();
Looks good, but then see the definition of nxt_thread_get_tid() (it has
several difinitions, one for each system, but I'll copy here one that is
pretty big (NXT_AIX)):
$ grepc nxt_thread_get_tid
[...]
./src/nxt_thread_id.h:107:
nxt_inline nxt_tid_t
nxt_thread_get_tid(void)
{
int err, size;
pthread_t pt;
struct __pthrdsinfo ti;
size = 0;
pt = pthread_self();
err = pthread_getthrds_np(&pt, PTHRDSINFO_QUERY_TID, &ti,
sizeof(struct __pthrdsinfo), NULL, size);
if (nxt_fast_path(err == 0)) {
return ti.__pi_tid;
}
nxt_main_log_alert("pthread_getthrds_np(PTHRDSINFO_QUERY_TID)
failed %E",
err);
return 0;
}
[...]
We're inlining a *huge* function in some cases where we know it's a slow
path. We could do better by marking the function as inline (without
[[gnu::always_inline]]), and let the compiler decide that it doesn't
want to inline when it's in an `nxt_slow_path()`.
My proposal is to split the inlining to two cases: `static inline` (for
.c files) and C99 `inline` (for .h files), and have `nxt_always_inline`
as a separate attribute that we can add selectively depending on each
function (i.e., not adding it blindly to all functions).
For a deep explanation of the different `inline` models there are, see
<https://www.greenend.org.uk/rjk/tech/inline.html>.
Let me summarize:
`static inline`: This should only be used in .c files. Basically,
`static` should only be applied to functions local to a translation
unit, as in non-inline static functions. The problem of using `static
inline` in headers is that the compiler emits duplicated function code
in every translation unit (code bloat), for when the function is not
inlined. There's an exception, which is when it is used together with
`[[gnu::always_inline]]`, which we do (I guess for this reason), but in
that case `static` is redundant (`inline [[gnu::always_inline]]` does
the same thing).
`inline`: We should use the C99 model. The GNU `inline` model is
broken, and anyway, compilers default to C99 inline if the C version
required in the command line is >= C99. We should use the
`-fno-gnu89-inline` compiler flag if available to make sure the compiler
doesn't fallback to GNU inline mode. Anyway, sine we use other C99
features, this should be fine. So, following the C99 inline model, we
should put inline functions in headers, and then one `extern`
declaration for each inline function in a .c file.
`[[gnu::always_inline]]` (aka `__attribute__((__always_inline__))`): We
should use this only when we are very sure that we want to *always*
inline (i.e., oneliners; nxt_var_is_const() would be a great candidate).
I'd also like to have some benchmark to test that none of these changes
reduce performance, since changes to the inlining could trigger
important performance changes (for good or for bad, we'll see). Do we
have such a thing?
Cheers,
Alex
--
Alejandro Colomar
<http://www.alejandro-colomar.es/>
-------------- next part --------------
A non-text attachment was scrubbed...
Name: OpenPGP_signature
Type: application/pgp-signature
Size: 833 bytes
Desc: OpenPGP digital signature
URL: <http://mailman.nginx.org/pipermail/unit/attachments/20220602/fefb6140/attachment.bin>
More information about the unit
mailing list