Debugging CPU usage in Nginx

Brad Patton brad at wordkeeper.com
Tue Aug 13 18:54:51 UTC 2024


Thanks so much!  That was really helpful!  Using that, I was able to determine that something in Brotli is causing the issue.  I've made some progress toward solving it, but it isn't fully fixed yet.  I have a question, but first let me show what I've found so far.

When I run "perf top" on this server, I see a lot of activity from libbrotli.  It attributed about 50% of the activity to libbrotli's CreateBackwardReferencesNH5 function.

But we also noticed that it was loading Brotli version 1.0.9 even though we had built it against 1.1.0.  It looks like there was a slight bug in our linking which caused it to fall back to the OS version.  I went ahead and fixed that so it uses 1.1.0 like it's supposed to, and that caused the activity to drop to about 25%!  It's a huge drop!
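One way to double-check which Brotli library the binary actually resolves at load time is to look at the `ldd` output for the Nginx binary (or for the Brotli module's .so file, if it is built as a dynamic module).  The sketch below is only an illustration: the function names are made up for this example, the /usr/sbin/nginx path in the usage note is an assumption, and it will report nothing if Brotli was linked statically.

```python
import re
import subprocess

# Matches ldd lines for Brotli libraries, for example:
#   libbrotlienc.so.1 => /lib64/libbrotlienc.so.1 (0x00007f...)
# The trailing load address is optional so "=> not found" is captured too.
_LDD_BROTLI = re.compile(
    r"^\s*(libbrotli\S*)\s+=>\s+(.+?)(?:\s+\(0x[0-9a-fA-F]+\))?\s*$"
)


def brotli_libs(ldd_output):
    """Return {soname: resolved path} for each libbrotli line in ldd output."""
    libs = {}
    for line in ldd_output.splitlines():
        match = _LDD_BROTLI.match(line)
        if match:
            libs[match.group(1)] = match.group(2)
    return libs


def linked_brotli(binary_path):
    """Run ldd on binary_path (hypothetical helper) and parse its output."""
    result = subprocess.run(
        ["ldd", binary_path], capture_output=True, text=True, check=True
    )
    return brotli_libs(result.stdout)
```

Calling `linked_brotli("/usr/sbin/nginx")` returns a dict mapping each libbrotli soname to the file the dynamic loader picked, which makes it easy to see whether the OS copy or the custom build is in use.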

But it's definitely still not where it should be.  For example, if I completely disable Brotli in the Nginx config and let it fall back to Gzip, the 25% activity from Brotli is replaced by about 4% activity from "deflate_medium" (pretty sure that is Gzip).

Of course, Brotli is supposed to be faster than Gzip so something is still wrong here.  Any thoughts on what could be causing Brotli to consume so much CPU power?
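One thing that may be worth ruling out is the compression level, since the CPU cost of a compressor depends heavily on the level it runs at.  It could be worth checking the `brotli_comp_level` and `gzip_comp_level` values in effect before comparing the two.  The sketch below only illustrates the general effect: it uses Python's zlib as a stand-in (Brotli is not in the standard library), the payload is synthetic, and the function name is made up for this example, so the numbers say nothing about this server specifically.

```python
import time
import zlib


def cpu_seconds_per_level(payload, levels=(1, 6, 9), repeats=10):
    """Return {level: (cpu_seconds, compressed_size)} for zlib at each level."""
    results = {}
    for level in levels:
        start = time.process_time()  # CPU time of this process, not wall time
        for _ in range(repeats):
            compressed = zlib.compress(payload, level)
        elapsed = time.process_time() - start
        results[level] = (elapsed, len(compressed))
    return results


# Synthetic HTML-like payload; real responses will behave differently.
payload = b"".join(
    b"<li class='item-%d'>row %d</li>\n" % (i, i * 7919 % 1000)
    for i in range(5000)
)

for level, (seconds, size) in cpu_seconds_per_level(payload).items():
    print(f"level {level}: {seconds:.3f}s CPU, {size} bytes")
```

Running the same comparison with the levels actually configured for each compressor gives a fairer picture than comparing the two algorithms at whatever levels they happen to be set to.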

________________________________
From: nginx <nginx-bounces at nginx.org> on behalf of Clima Gabriel <clima.gabrielphoto at gmail.com>
Sent: Monday, August 12, 2024 11:32 PM
To: nginx at nginx.org <nginx at nginx.org>
Subject: Re: Debugging CPU usage in Nginx


If your Nginx is compiled with debug symbols, you may see some useful info in `perf top`.

On Mon, Aug 12, 2024, 6:24 PM Brad Patton <brad at wordkeeper.com> wrote:
Hi all,

About two months ago, the CPU usage on one of our servers started going crazy.  All of a sudden, Nginx itself started using about 6x the CPU power to serve requests, with no increase in traffic at all.  It's a dramatic difference, and it's directly attributable to Nginx, not PHP, MariaDB, or anything else.  I can show you a lot of graphs and data showing the problem if needed, so let me know if you would like to see more.  🙂

The only OS updates during the week prior to the issue starting were to PHP.  We're currently running Nginx version 1.27.0 and we were on 1.25.5 when it started.  In addition to those updates, we've run several test builds where we disabled certain modules just randomly trying to isolate the issue.

I was starting to think that the issue had something to do with our configuration, even though that didn't change at the start of the issue either, but then I ran a test in a Fedora 40 container and the problem went away.  Unfortunately, I could only run that test temporarily, but the results were clear.  The exact same build with the exact same configuration works fine in Fedora 40 but takes about 6x the CPU power to serve the same requests in CentOS Stream 9.

That makes me think there is either some bug in CentOS Stream 9 or a bug in Nginx that only shows up on CentOS Stream 9.  Does anyone have any thoughts on how to find out exactly what in Nginx is causing the much higher CPU usage?

If not, would you like to see more information about the problem?  If so, what?

Thanks!
Brad
