[PATCH] optimization of Intel processor cacheline calculation

Maxim Dounin mdounin at mdounin.ru
Sat May 5 21:41:04 UTC 2012


Hello!

On Sat, May 05, 2012 at 08:10:30PM +0800, Simon Liu wrote:

> Hello!
> 
> cacheline calculation is hardcode in ngx_cpuinfo, this will make mistake in
> some  intel processor. example  cache line is 64 byte in  sandy bridge,
>  its family code is 0110 and model no is 1010 or 1101(in this document
> http://www.intel.com/content/www/us/en/processors/processor-identification-cpuid-instruction-note.html).
>  but code is this in ngx_cpuinfo:
> 
>         /* Pentium Pro, II, III */
>         case 6:
>             ngx_cacheline_size = 32;
> 
>             model = ((cpu[0] & 0xf0000) >> 8) | (cpu[0] & 0xf0);
> 
>             if (model >= 0xd0) {
>                 /* Intel Core, Core 2, Atom */
>                 ngx_cacheline_size = 64;
>             }
> 
>             break;
> 
> if model no is 1010 ,  ngx_cacheline_size will be 32, and so this is wrong.

Note the model variable in the above code includes extended model 
field as well, and for sandy bridge it will be 0x2a0 (extended 
model 0010, model 0101).  Thus cache line size is correctly 
detected as 64.

> Below is a patch(for nginx trunk) fix this problem, and use cpuid(2) solve
> hardcode。
> 
> Index: src/core/ngx_cpuinfo.c
> ===================================================================
> --- src/core/ngx_cpuinfo.c (revision 4615)
> +++ src/core/ngx_cpuinfo.c (working copy)
> @@ -12,9 +12,93 @@
>  #if (( __i386__ || __amd64__ ) && ( __GNUC__ || __INTEL_COMPILER ))
> 
> 
> +#define NGX_CACHE_LVL_1_DATA            1
> +#define NGX_CACHE_LVL_2                 2
> +#define NGX_CACHE_LVL_3                 3
> +#define NGX_CACHE_PREFETCHING           4
> +
> +
> +typedef struct ngx_cache_table {
> +    u_char            descriptor;
> +    u_char            type;
> +    ngx_uint_t        size;
> +} ngx_cache_table_t;
> +
> +
>  static ngx_inline void ngx_cpuid(uint32_t i, uint32_t *buf);
> 
> 
> +static ngx_cache_table_t  cache_table[] = {
> +    { 0x0a, NGX_CACHE_LVL_1_DATA, 32 },  /* 32 byte line size */
> +    { 0x0c, NGX_CACHE_LVL_1_DATA, 32 },  /* 32 byte line size */
> +    { 0x0d, NGX_CACHE_LVL_1_DATA, 64 },  /* 64 byte line size */
> +    { 0x0e, NGX_CACHE_LVL_1_DATA, 64 },  /* 64 byte line size */

[...]

I don't really think we need full intel cache descriptor decoding.  
It's rather huge and I suspect it might cause more harm than good, 
especially in virtualized environment.

And if we decide we need one, we probably want something simplier 
(i.e. we don't care about cache levels and so on, and this 
information is clearly not needed here).

Maxim Dounin



More information about the nginx-devel mailing list