Nginx with big geo maps affects whole system performance on reload

SannisDev nginx-forum at forum.nginx.org
Tue Aug 3 13:13:07 UTC 2021


I'm running Nginx 1.20.1 on CentOS 8.4.2105 with quite large geo maps,
covering country/ISP/connection for both IPv4 and IPv6, with a raw size of
about 80 MB.
Hardware: 2x Intel(R) Xeon(R) Silver 4116 CPU @ 2.10GHz, 64 GB RAM

During an nginx testconf (nginx -t) or reload, the latency of the already
running nginx increases: the 99th percentile grows from 10 ms to seconds.

After some investigation I found that the hottest functions during
testconf/reload are (sudo perf record -F 999999 /local/nginx/sbin/nginx -t):
```
  24.92%  nginx    nginx-1.20.1-cf9ac595_20210611_just_upgrade  [.] ngx_conf_read_token
  14.84%  nginx    nginx-1.20.1-cf9ac595_20210611_just_upgrade  [.] ngx_radix128tree_insert
  13.16%  nginx    nginx-1.20.1-cf9ac595_20210611_just_upgrade  [.] ngx_http_geo
   6.19%  nginx    [kernel.kallsyms]                            [k] inet_csk_bind_conflict
   5.69%  nginx    nginx-1.20.1-cf9ac595_20210611_just_upgrade  [.] ngx_radix32tree_insert
   5.40%  nginx    nginx-1.20.1-cf9ac595_20210611_just_upgrade  [.] ngx_ptocidr
```

Also, perf stat shows a high percentage of cache misses (sudo perf stat -d
/local/nginx/sbin/nginx -t):
```
 Performance counter stats for '/local/nginx/sbin/nginx -t':

          7,119.99 msec task-clock                #    0.999 CPUs utilized
               360      context-switches          #    0.051 K/sec
                 1      cpu-migrations            #    0.000 K/sec
           289,666      page-faults               #    0.041 M/sec
    16,938,524,088      cycles                    #    2.379 GHz                        (62.49%)
    26,748,500,575      instructions              #    1.58  insn per cycle             (74.99%)
     7,737,642,525      branches                  # 1086.749 M/sec                      (74.99%)
        35,255,267      branch-misses             #    0.46% of all branches            (75.00%)
     5,309,612,140      L1-dcache-loads           #  745.733 M/sec                      (75.01%)
        91,351,291      L1-dcache-load-misses     #    1.72% of all L1-dcache accesses  (75.01%)
        11,559,050      LLC-loads                 #    1.623 M/sec                      (50.00%)
         7,370,770      LLC-load-misses           #   63.77% of all LL-cache accesses   (49.99%)

       7.125064873 seconds time elapsed

       5.849075000 seconds user
       1.212134000 seconds sys
```

I suppose that these cache misses are the root cause of the system slowdown
on testconf/reload.
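
As a sanity check on the figures quoted from perf stat, the ratios can be
recomputed from the raw counters. This is only a minimal sketch in Python:
the variable names are made up for illustration, the values are copied from
the perf stat output in this post, and since perf multiplexed the counters
(the percentages in parentheses), the results are approximate.

```python
# Raw counter values copied from the perf stat output in this post.
cycles = 16_938_524_088
instructions = 26_748_500_575
l1_loads = 5_309_612_140
l1_load_misses = 91_351_291
llc_loads = 11_559_050
llc_load_misses = 7_370_770

# Instructions per cycle, as reported on the "instructions" line.
ipc = instructions / cycles

# Miss ratios in percent, each relative to its own cache level.
l1_miss_pct = 100.0 * l1_load_misses / l1_loads
llc_miss_pct = 100.0 * llc_load_misses / llc_loads

# LLC misses relative to all L1 loads (not printed by perf stat itself).
llc_miss_per_l1_load_pct = 100.0 * llc_load_misses / l1_loads

print(f"IPC:                      {ipc:.2f}")
print(f"L1-dcache miss ratio:     {l1_miss_pct:.2f}%")
print(f"LLC miss ratio:           {llc_miss_pct:.2f}%")
print(f"LLC misses per L1 load:   {llc_miss_per_l1_load_pct:.3f}%")
```

The LLC figure is relative to LLC loads only, so it says that most of the
loads reaching the last-level cache miss it, not that most of all loads do.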

Has anyone run into the same problem, or does anyone have suggestions on how
to improve this situation?

Posted at Nginx Forum: https://forum.nginx.org/read.php?2,292129,292129#msg-292129


