Debugging Nginx Memory Spikes on Production Servers

Lance Dockins lance at wordkeeper.com
Thu Sep 21 13:50:31 UTC 2023


Thank you, Dmitry.

One question before I describe what we are doing with NJS. I did read about the VM handling process before switching from Lua to NJS and it sounded very practical, but my current understanding is that there could be multiple VMs instantiated for a single request. A js_set, js_content, and js_header_filter directive that all apply to a single request, for example, would instantiate 3 VMs. And if you needed to set multiple variables with js_set, each one would add to that number of VMs. My original understanding was that those VMs would be destroyed once they exited, so even if you had multiple VMs instantiated per request, the memory impact would not be cumulative within a single request. Is that understanding correct? Or are you saying that each VM accumulates more and more memory until the entire request completes?

As far as how we’re using NJS, we’re mostly using it for header filters, internal redirection, and access control. So there really shouldn’t be a threat to memory in most instances unless we’re not just dealing with a single request memory leak inside of a VM but also a memory leak that involves every VM that NJS instantiates just accumulating memory until the request completes.

Right now, my working theory about what is most likely to be creating the memory spikes has to do with POST body analysis. Unfortunately, some of the requests that I have to deal with are POSTs that have to either be denied access or routed differently depending on the contents of the POST body. Unfortunately, these same routes can vary in the size of the POST body and I have no control over how any of that works because the way it works is controlled by third parties. One of those third parties has significant market share on the internet so we can’t really avoid dealing with it.

In any case, before we switched to NJS, we were using Lua to do the same things and that gave us the advantage of doing both memory cleanup if needed and also doing easy analysis of POST body args. I was able to do this sort of thing with Lua before:
local post_args, post_err = ngx.req.get_post_args()
if post_args and post_args.arg_name == something then
    -- deny or reroute the request here
end

But in NJS, there’s no such POST body utility so I had to write my own. The code that I use to parse out the POST body works for both URL encoded POST bodies and multipart POST bodies, but it has to read the entire POST into a variable before I can use it. For small POSTs, that’s not a problem. For larger POSTs that contain a big attachment, it would be. Ultimately, I only care about the string key/value pairs for my purposes (not file attachments) so I was hoping to discard attachment data while parsing the body. I think that that is actually how Lua’s version of this works too. So my next thought was that I could use a Buffer and fs.readSync to read the POST body in buffer frames to keep memory minimal, so that I could discard any file attachments from the POST body and just evaluate the key/value data that uses simple strings. But from what you’re saying, it sounds like there’s basically no difference between fs.readSync w/ a Buffer and fs.readFileSync in terms of actual memory use. So either way, with a large POST body, you’d be steamrolling the memory use in a single Nginx worker. When I had to deal with stuff like this in Lua, I’d just run collectgarbage() to clean up memory and it seemed to work fine. But then I also wasn’t having to parse out the POST body myself in Lua either.
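
For reference, the buffered approach described above could look roughly like the sketch below. It is written against Node-style fs and Buffer calls (fs.openSync, fs.readSync, fs.closeSync), which recent NJS versions mirror, but it has not been tested under NJS. The function name, its parameters, and the default sizes are all hypothetical. Whether this actually lowers peak memory in NJS is exactly the open question, since allocations are reportedly not freed until the request ends.

```javascript
const fs = require('fs');

// Hypothetical helper: scans a multipart/form-data body stored in a file and
// returns only the plain string fields, skipping any part that has a filename.
function readMultipartFields(path, boundary, chunkSize, maxValue) {
    const delim = Buffer.from('\r\n--' + boundary);
    const headerEnd = Buffer.from('\r\n\r\n');
    const chunk = Buffer.alloc(chunkSize || 4096);   // reused read frame
    const limit = maxValue || 8192;                  // assumed cap per field value
    const fields = {};
    const fd = fs.openSync(path, 'r');

    // Start with CRLF so the first boundary looks like every later one, and
    // treat anything before it as a part to skip.
    let pending = Buffer.from('\r\n');
    let state = 'skip';                              // 'headers' | 'value' | 'skip'
    let name = null;
    let pos = 0;

    try {
        while (true) {
            if (state === 'headers') {
                // "--" right after a delimiter marks the closing boundary.
                if (pending.length >= 2 && pending[0] === 45 && pending[1] === 45) break;
                const idx = pending.indexOf(headerEnd);
                if (idx !== -1) {
                    const headers = pending.slice(0, idx).toString();
                    pending = pending.slice(idx + headerEnd.length);
                    const m = /\bname="([^"]*)"/i.exec(headers);
                    const isFile = /\bfilename=/i.test(headers);
                    name = m ? m[1] : null;
                    state = (m && !isFile) ? 'value' : 'skip';
                    continue;
                }
            } else {
                const idx = pending.indexOf(delim);
                if (idx !== -1) {
                    if (state === 'value') fields[name] = pending.slice(0, idx).toString();
                    pending = pending.slice(idx + delim.length);
                    state = 'headers';
                    continue;
                }
                // Oversized values are treated like attachments and dropped.
                if (state === 'value' && pending.length > limit) state = 'skip';
                if (state === 'skip') {
                    // Keep only a tail that could hold a partial delimiter.
                    const keep = Math.min(pending.length, delim.length - 1);
                    pending = pending.slice(pending.length - keep);
                }
            }
            // Nothing more to do with what is buffered: read the next frame.
            const n = fs.readSync(fd, chunk, 0, chunk.length, pos);
            if (n === 0) break;
            pos += n;
            pending = Buffer.concat([pending, chunk.slice(0, n)]);
        }
    } finally {
        fs.closeSync(fd);
    }
    return fields;
}
```

In an NJS handler the path would presumably come from the $request_body_file variable when Nginx has buffered the body to disk, and the boundary from the Content-Type request header; both of those wiring details are assumptions and are not shown here.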

It’s possible that something else is going on other than that. qs.parse seems like it could get us into some trouble if the query_string that was passed was unusually long too, from what you’re saying about how memory is handled. None of the situations that I’m handling are for long running requests. They’re all designed for very fast requests that come into the servers that I manage on a constant basis.
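
One way to bound the qs.parse risk mentioned above is to refuse oversized query strings before parsing and to cap the number of keys. The sketch below assumes the querystring module's parse(str, sep, eq, options) form with a maxKeys option, which exists in both Node and NJS. The function name and both limits are hypothetical and would need tuning against real traffic.

```javascript
const qs = require('querystring');

// Hypothetical limits, not taken from any real configuration.
const MAX_QUERY_LENGTH = 2048;
const MAX_QUERY_KEYS = 32;

// Returns the parsed arguments, or null when the raw query string is too
// long to be worth parsing inside the request.
function parseQueryBounded(rawQuery) {
    if (typeof rawQuery !== 'string' || rawQuery.length > MAX_QUERY_LENGTH) {
        return null;
    }
    // maxKeys stops the parser from building an unbounded object.
    return qs.parse(rawQuery, '&', '=', { maxKeys: MAX_QUERY_KEYS });
}
```

In a handler this would be called with something like r.variables.args (an assumption about how the raw query string is obtained), and a null result would be treated as "deny or pass through without inspection".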

If you can shed some light on the way that VMs and their memory are handled per my question above and any insights into what to do about this type of situation, that would help a lot. I don’t know if there are any plans to offer a POST body parsing feature in NJS for those that need to evaluate POST body data like how Lua did it, but if there was some way to be able to do that at the Nginx layer instead of at the NJS layer, it seems like that could be a lot more sensitive to memory use. Right now, if my understanding is correct, the only option that I’d even have would be to just stop doing POST body handling if the POST body is above a certain total size. I guess if there was some way to forcibly free memory, that would help too. But I don’t think that that is as common of a problem as having to deal with very large query strings that some third party appends to a URL (probably maliciously) and/or a very large file upload attached to a multipart POST. So if the large-body parsing problem were solved, the only remaining memory concern that I’d have would be whether multiple js_sets and such would just keep spawning VMs and accumulating memory during a single request.
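
The size-cap fallback mentioned above can be sketched as a small check on the declared body length before any parsing is attempted. The function name and the 64 KB cap are hypothetical, and the headers object is assumed to behave like NJS's r.headersIn. Requests with no usable Content-Length (chunked uploads, for example) are treated as "do not inspect" here, which is a policy choice rather than a requirement.

```javascript
// Hypothetical cap on how large a POST body we are willing to parse.
const MAX_BODY_BYTES = 64 * 1024;

// Returns true only when the request declares a body size that is a plain
// non-negative integer no larger than the cap.
function shouldInspectBody(headersIn) {
    const raw = headersIn['Content-Length'];
    if (typeof raw !== 'string' || !/^\d+$/.test(raw)) {
        return false;   // size unknown or malformed: skip body inspection
    }
    return Number(raw) <= MAX_BODY_BYTES;
}
```

A handler would call shouldInspectBody(r.headersIn) first and only read the request body when it returns true, falling back to a default routing decision otherwise.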

Any thoughts?

—
Lance Dockins

> On Thursday, Sep 21, 2023 at 1:45 AM, Dmitry Volyntsev <xeioex at nginx.com> wrote:
>
> On 20.09.2023 20:37, Lance Dockins wrote:
> > So I guess my question at the moment is whether endless memory use
> > growth being reported by njs.memoryStats.size after file writes is
> > some sort of false positive tied to quirks in how memory use is being
> > reported or whether this is indicative of a memory leak? Any insight
> > would be appreciated.
>
> Hi Lance,
> The reason njs.memoryStats.size keeps growing is that NJS uses an arena
> memory allocator linked to the current request, and a new object
> representing the memoryStats structure is returned every time
> njs.memoryStats is accessed. Currently NJS does not free most of the
> internal objects and structures until the current request is destroyed
> because it is not intended for long-running code.
>
> Regarding the sudden memory spikes, please share some details about JS
> code you are using.
> One place to look is to analyze the amount of traffic that goes to NJS
> locations and what exactly those locations do.
>