[PATCH] optimization of Intel processor cacheline calculation
Maxim Dounin
mdounin at mdounin.ru
Sat May 5 21:41:04 UTC 2012
Hello!
On Sat, May 05, 2012 at 08:10:30PM +0800, Simon Liu wrote:
> Hello!
>
> cacheline calculation is hardcode in ngx_cpuinfo, this will make mistake in
> some intel processor. example cache line is 64 byte in sandy bridge,
> its family code is 0110 and model no is 1010 or 1101(in this document
> http://www.intel.com/content/www/us/en/processors/processor-identification-cpuid-instruction-note.html).
> but code is this in ngx_cpuinfo:
>
> /* Pentium Pro, II, III */
> case 6:
> ngx_cacheline_size = 32;
>
> model = ((cpu[0] & 0xf0000) >> 8) | (cpu[0] & 0xf0);
>
> if (model >= 0xd0) {
> /* Intel Core, Core 2, Atom */
> ngx_cacheline_size = 64;
> }
>
> break;
>
> if model no is 1010 , ngx_cacheline_size will be 32, and so this is wrong.
Note the model variable in the above code includes extended model
field as well, and for sandy bridge it will be 0x2a0 (extended
model 0010, model 0101). Thus cache line size is correctly
detected as 64.
> Below is a patch(for nginx trunk) fix this problem, and use cpuid(2) solve
> hardcode。
>
> Index: src/core/ngx_cpuinfo.c
> ===================================================================
> --- src/core/ngx_cpuinfo.c (revision 4615)
> +++ src/core/ngx_cpuinfo.c (working copy)
> @@ -12,9 +12,93 @@
> #if (( __i386__ || __amd64__ ) && ( __GNUC__ || __INTEL_COMPILER ))
>
>
> +#define NGX_CACHE_LVL_1_DATA 1
> +#define NGX_CACHE_LVL_2 2
> +#define NGX_CACHE_LVL_3 3
> +#define NGX_CACHE_PREFETCHING 4
> +
> +
> +typedef struct ngx_cache_table {
> + u_char descriptor;
> + u_char type;
> + ngx_uint_t size;
> +} ngx_cache_table_t;
> +
> +
> static ngx_inline void ngx_cpuid(uint32_t i, uint32_t *buf);
>
>
> +static ngx_cache_table_t cache_table[] = {
> + { 0x0a, NGX_CACHE_LVL_1_DATA, 32 }, /* 32 byte line size */
> + { 0x0c, NGX_CACHE_LVL_1_DATA, 32 }, /* 32 byte line size */
> + { 0x0d, NGX_CACHE_LVL_1_DATA, 64 }, /* 64 byte line size */
> + { 0x0e, NGX_CACHE_LVL_1_DATA, 64 }, /* 64 byte line size */
[...]
I don't really think we need full intel cache descriptor decoding.
It's rather huge and I suspect it might cause more harm than good,
especially in virtualized environment.
And if we decide we need one, we probably want something simplier
(i.e. we don't care about cache levels and so on, and this
information is clearly not needed here).
Maxim Dounin
More information about the nginx-devel
mailing list