[PATCH] Configure: Fix cacheline size for aarch64 platforms
debayang.qdt
debayang.qdt at qualcommdatacenter.com
Mon Dec 11 16:52:32 UTC 2017
Hello!
On Sat, Dec 09, 2017 at 02:05:04PM +0000, debayang.qdt wrote:
> Hello!
>
> On Fri, Dec 08, 2017 at 06:43:15PM +0000, debayang.qdt wrote:
>
> > Hello,
> >
> > 64 bytes should be the minimum size.
> >> Any links to support this claim?
>
> AFAIK several aarch64 server vendors - centriq/thunderx/xgene/A53/A57.. implementations - uses either 64B/128B cache lines .
> So keeping it at 64 bytes may be ok.
> Also in linux kernel :
> https://elixir.free-electrons.com/linux/v4.14.4/source/arch/arm64/incl
> ude/asm/cache.h#L73
This seems to use either CTR_EL0 or 128, and not even mentions 64.
Anyway, I have no objections to 64, as it is the cache line size on the aarch64 boxes I have access to.
> > However, A53 is a small segment of arm64 based servers .
> > Multiple aarch64 based servers uses 128 byte cache lines.
> > For e.g Centriq 2400 aarch64 server from Qualcomm Datacenter Technologies uses 128 bytes cache line.
> AFAIK, using a smaller cache line size is mostly fine and imply almost no performance difference in nginx. On ther other hand, using larger cache line size can easily result in multiple memory access where just one should be enough, resulting in suboptimal performance. That's why we generally assume 32 unless we know better.
>
> > To be very accurate , we may read the dcache line size from CTR_EL0 register.
> > Let me know what you think.
>
> >> Given the level of popularity of aarch64 servers, I would rather not. I would prefer to rely on something architecture-independent like sysconf(_SC_LEVEL1_DCACHE_LINESIZE), though it looks like it doesn't work on aarch64 yet, see https://bugzilla.redhat.com/show_bug.cgi?id=1190638.
>
> The aarch64 support is recently added in the glibc trunk .
> https://sourceware.org/git/?p=glibc.git;a=commit;h=6d58ce5e5072945d44f
> 2dba83ad16cd6febd056c The implementation above reads the CTR_EL0 to
> get the cache information .
>
> So , as you mentioned - it may make sense to use the sysconf to determine the cache line at runtime . In case it's not available on the system (value would be 0), we can fall back to the compile time defaults .
> Below patch in nginx for the same.
>
>
> # HG changeset patch
> # User Debayan Ghosh <debayang.qdt at qualcommdatacenter.com>
> # Date 1512811864 0
> # Sat Dec 09 09:31:04 2017 +0000
> # Node ID 217d0ed8730190a986c079c300169e10b90b4a82
> # Parent c71aed1f6f88b8c5d56a4360ca4521a62a4c17ce
> Nginx: Use sysconf to determine cacheline size
There is no need for the "Nginx" prefix.
Trailing dot is missing.
>
> Using sysconf(_SC_LEVEL1_DCACHE_LINESIZE) to determine cache line size on the system.
> This will make it architecture independent.
> Falls back to using arch specific compile time defaults if above not supported.
Please keep lines under 80 chars.
>
> diff -r c71aed1f6f88 -r 217d0ed87301 auto/unix
> --- a/auto/unix Tue Nov 28 15:19:59 2017 +0000
> +++ b/auto/unix Sat Dec 09 09:31:04 2017 +0000
> @@ -963,6 +963,15 @@
> ngx_feature_test="sysconf(_SC_NPROCESSORS_ONLN)"
> . auto/feature
>
> +ngx_feature="sysconf(_SC_LEVEL1_DCACHE_LINESIZE)"
> +ngx_feature_name="NGX_HAVE_LEVEL1_DCACHE_LINESIZE"
> +ngx_feature_run=no
> +ngx_feature_incs=
> +ngx_feature_path=
> +ngx_feature_libs=
> +ngx_feature_test="sysconf(_SC_LEVEL1_DCACHE_LINESIZE)"
> +. auto/feature
> +
>
> ngx_feature="openat(), fstatat()"
> ngx_feature_name="NGX_HAVE_OPENAT"
Style: there should be two empty lines before the feature test.
> diff -r c71aed1f6f88 -r 217d0ed87301 src/os/unix/ngx_posix_init.c
> --- a/src/os/unix/ngx_posix_init.c Tue Nov 28 15:19:59 2017 +0000
> +++ b/src/os/unix/ngx_posix_init.c Sat Dec 09 09:31:04 2017 +0000
> @@ -48,8 +48,17 @@
> }
>
> ngx_pagesize = getpagesize();
> +#if (NGX_HAVE_LEVEL1_DCACHE_LINESIZE)
> + ngx_int_t cpu_cache_line = sysconf(_SC_LEVEL1_DCACHE_LINESIZE);
> + /* Below workaround to use default values on some environments
> + where the cache line size yet not available */
> + ngx_cacheline_size = (cpu_cache_line == 0) ?
> + NGX_CPU_CACHE_LINE : cpu_cache_line; #else
> + /* Use default cache line for the architecture */
> ngx_cacheline_size = NGX_CPU_CACHE_LINE;
> -
> +#endif
> +
> for (n = ngx_pagesize; n >>= 1; ngx_pagesize_shift++) { /* void
> */ }
>
> #if (NGX_HAVE_SC_NPROCESSORS_ONLN)
>>In no particular order:
>>- There are multiple style issues, including:
>> - Variables should be declared at the function start, or, in case of
a conditional compilation, at the start of a block in #if.
>> - Multi-line comments should look like, like in style(9):
/*
* Multi-line comments look like this. Make them real
* sentences. Fill them so they look like real paragraphs.
*/
>> - I would rather suggest to leave the default as is, and introduce
a separate block with sysconf() somewhere after it. This will
result in much more readable code.
>> - There should be error handling, sysconf() can return -1.
>> - This needs some thinking about about priorities between
sysconf(_SC_LEVEL1_DCACHE_LINESIZE) and ngx_cpuinfo(). We've
seen cases when ngx_cpuinfo() detects wrong cache line size in
virtualized environments (https://trac.nginx.org/nginx/ticket/352),
so we may want to consider using sysconf(_SC_LEVEL1_DCACHE_LINESIZE)
with higher priority (not sure though).
In general it may be better to rely on the arch independent code unless the specific tunings in ngx_cpuinfo()
is known to give a noticeable performance impact.
Below two patches taking care of the comments and giving priority to sysconf.
Let me know if any further comments.
# HG changeset patch
# User Debayan Ghosh <debayang.qdt at qualcommdatacenter.com>
# Date 1513004735 0
# Mon Dec 11 15:05:35 2017 +0000
# Node ID b765307e9f516c396da24019724f82c2c8c38677
# Parent d3235149d17f7745d3ac246a6cdcc81a56698f7b
Configure: Set default cacheline size to 64 for aarch64 platforms.
diff -r d3235149d17f -r b765307e9f51 auto/os/conf
--- a/auto/os/conf Thu Dec 07 17:09:36 2017 +0300
+++ b/auto/os/conf Mon Dec 11 15:05:35 2017 +0000
@@ -110,6 +110,11 @@
NGX_MACH_CACHE_LINE=64
;;
+ aarch64 )
+ have=NGX_ALIGNMENT value=16 . auto/define
+ NGX_MACH_CACHE_LINE=64
+ ;;
+
*)
have=NGX_ALIGNMENT value=16 . auto/define
NGX_MACH_CACHE_LINE=32
# HG changeset patch
# User Debayan Ghosh <debayang.qdt at qualcommdatacenter.com>
# Date 1513009691 0
# Mon Dec 11 16:28:11 2017 +0000
# Node ID 090538eb7d973d8cc690f8f413ed5c7b5d3338b8
# Parent b765307e9f516c396da24019724f82c2c8c38677
Use sysconf to determine cacheline size at runtime.
Determine cacheline size at runtime if supported
using sysconf(_SC_LEVEL1_DCACHE_LINESIZE). In case not supported,
fallback to ngx_cpuinfo() or compile time defaults.
diff -r b765307e9f51 -r 090538eb7d97 auto/unix
--- a/auto/unix Mon Dec 11 15:05:35 2017 +0000
+++ b/auto/unix Mon Dec 11 16:28:11 2017 +0000
@@ -964,6 +964,16 @@
. auto/feature
+ngx_feature="sysconf(_SC_LEVEL1_DCACHE_LINESIZE)"
+ngx_feature_name="NGX_HAVE_LEVEL1_DCACHE_LINESIZE"
+ngx_feature_run=no
+ngx_feature_incs=
+ngx_feature_path=
+ngx_feature_libs=
+ngx_feature_test="sysconf(_SC_LEVEL1_DCACHE_LINESIZE)"
+. auto/feature
+
+
ngx_feature="openat(), fstatat()"
ngx_feature_name="NGX_HAVE_OPENAT"
ngx_feature_run=no
diff -r b765307e9f51 -r 090538eb7d97 src/os/unix/ngx_posix_init.c
--- a/src/os/unix/ngx_posix_init.c Mon Dec 11 15:05:35 2017 +0000
+++ b/src/os/unix/ngx_posix_init.c Mon Dec 11 16:28:11 2017 +0000
@@ -36,6 +36,9 @@
{
ngx_time_t *tp;
ngx_uint_t n;
+#if (NGX_HAVE_LEVEL1_DCACHE_LINESIZE)
+ ngx_int_t cpu_cache_line;
+#endif
#if (NGX_HAVE_OS_SPECIFIC_INIT)
if (ngx_os_specific_init(log) != NGX_OK) { @@ -64,6 +67,17 @@
ngx_cpuinfo();
+#if (NGX_HAVE_LEVEL1_DCACHE_LINESIZE)
+ /*
+ * Fetch cacheline from system configuration if available.
+ * This is given priority over ngx_cpuinfo().
+ */
+ cpu_cache_line = sysconf(_SC_LEVEL1_DCACHE_LINESIZE);
+ if(cpu_cache_line > 0) {
+ ngx_cacheline_size = cpu_cache_line;
+ }
+#endif
+
if (getrlimit(RLIMIT_NOFILE, &rlmt) == -1) {
ngx_log_error(NGX_LOG_ALERT, log, errno,
"getrlimit(RLIMIT_NOFILE) failed");
-- Debayan Ghosh
More information about the nginx-devel
mailing list