Debugging Nginx Memory Spikes on Production Servers

Lance Dockins lance at wordkeeper.com
Thu Sep 21 23:41:13 UTC 2023


That’s good info. Thank you.

I have been doing some additional testing since my email last night and I have seen enough evidence to believe that file I/O in NJS is basically the source of the memory issues. I did some testing with very basic commands like readFileSync and Buffer + readSync and in all cases, the memory footprint when doing file handling in NJS is massive.

Just doing this:

let content = fs.readFileSync(path/to//file);
let parts = content.split(boundary);

Resulted in memory use that was close to a minimum of 4-8x the size of the file during my testing. We do have an upper bound on files that can be uploaded and that does contain this somwhat but it’s not hard for a larger request that is 99% file attachment to use exhorbitant amounts of memory. I actually tried doing a Buffer + readSync variation on the same thing and the memory footprint was actually FAR FAR worse when I did that.

The 4-8x minimum memory commit seems like a problem to me just generally. But the fact that readSync doesn’t seem to be any better on memory (much worse actually) basically means that NJS is only safe to use for processing smaller files (or POST bodies) right now. There’s just no good way to keep data that you don’t care about in a file from occupying excessive amounts of memory that can’t be reclaimed. If there is no way to improve the memory footprint when handling files (or big strings), no memory conservative way to stream a file through some sort of buffer, and no first-party utility for providing parsed POST bodies right now, then it might be worth the time to put some notes in the NJS docs that the fs module may not be appropriate for larger files (e.g. files over 1mb).

For what it’s worth, I’d also love to see some examples of how to properly use fs.readSync in the NJS examples docs. There really wasn’t much out there for that for NJS (or even in a lot of the Node docs) so I can’t say that my specific test implementation for that was ideal. But that’s just above and beyond the basic problems that I’m seeing with memory use with any form of file I/O at all (since the memory problems seem to be persistent whether doing reads or even log writes).

—
Lance Dockins

> On Thursday, Sep 21, 2023 at 5:01 PM, Dmitry Volyntsev <xeioex at nginx.com (mailto:xeioex at nginx.com)> wrote:
>
> On 9/21/23 6:50 AM, Lance Dockins wrote:
>
> Hi Lance,
>
> See my comments below.
>
> > Thanky you, Dmitry.
> >
> > One question before I describe what we are doing with NJS. I did read
> > about the VM handling process before switching from Lua to NJS and it
> > sounded very practical but my current understanding is that there
> > could be multiple VM’s instantiated for a single request. A js_set,
> > js_content, and js_header_filter directive that applies to a single
> > request, for example, would instantiate 3 VMs. And were you to need
> > to set multiple variables with js_set, then keep adding to that # of VMs.
> >
> >
> This is not correct. For js_set, js_content and js_header_filter there
> is only a single VM.
> The internalRedirect() is the exception, because a VM does not survive
> it, but the previous VMs will not be freed until current request is
> finished. BTW, a VM instance itself is pretty small in size (~2kb) so it
> should not be a problem if you have a reasonable number of redirects.
>
>
> >
> > My original understanding of that was that those VMs would be
> > destroyed once they exited so even if you had multiple VMs
> > instantiated per request, the memory impact would not be cumulative in
> > a single request. Is that understanding correct? Or are you saying
> > that each VM accumulates more and more memory until the entire request
> > completes?
> >
> > As far as how we’re using NJS, we’re mostly using it for header
> > filters, internal redirection, and access control. So there really
> > shouldn’t be a threat to memory in most instances unless we’re not
> > just dealing with a single request memory leak inside of a VM but also
> > a memory leak that involves every VM that NJS instantiates just
> > accumulating memory until the request completes.
> >
> > Right now, my working theory about what is most likely to be creating
> > the memory spikes has to do with POST body analysis. Unfortunately,
> > some of the requests that I have to deal with are POSTs that have to
> > either be denied access or routed differently depending on the
> > contents of the POST body. Unfortunately, these same routes can vary
> > in the size of the POST body and I have no control over how any of
> > that works because the way it works is controlled by third parties.
> > One of those third parties has significant market share on the
> > internet so we can’t really avoid dealing with it.
> >
> > In any case, before we switched to NJS, we were using Lua to do the
> > same things and that gave us the advantage of doing both memory
> > cleanup if needed and also doing easy analysis of POST body args. I
> > was able to do this sort of thing with Lua before:
> > local post_args, post_err = ngx.req.get_post_args()
> > if post_args.arg_name = something then
> >
> > But in NJS, there’s no such POST body utility so I had to write my
> > own. The code that I use to parse out the POST body works for both
> > URL encoded POST bodies and multipart POST bodies, but it has to read
> > the entire POST into a variable before I can use it. For small POSTs,
> > that’s not a problem. For larger POSTs that contain a big attachment,
> > it would be. Ultimately, I only care about the string key/value pairs
> > for my purposes (not file attachments) so I was hoping to discard
> > attachment data while parsing the body.
> >
> >
> >
> Thank you for the feedback, I will add it as to a future feature list.
>
> > I think that that is actually how Lua’s version of this works too.
> > So my next thought was that I could use a Buffer and rs.readSync to
> > read the POST body in buffer frames to keep memory minimal so that I
> > could could discard the any file attachments from the POST body and
> > just evaluate the key/value data that uses simple strings. But from
> > what you’re saying, it sounds like there’s basically no difference
> > between fs.readSync w/ a Buffer and rs.readFileSync in terms of actual
> > memory use. So either way, with a large POST body, you’d be
> > steamrolling the memory use in a single Nginx worker thread. When I
> > had to deal with stuff like this in Lua, I’d just run collectgarbage()
> > to clean up memory and it seemed to work fine. But then I also wasn’t
> > having to parse out the POST body myself in Lua either.
> >
> > It’s possible that something else is going on other than that.
> > qs.parse seems like it could get us into some trouble if the
> > query_string that was passed was unusuall long too from what you’re
> > saying about how memory is handled.
> >
> >
> for qs.parse() there is a limit for a number of arguments, which you can
> specify.
>
> >
> > None of the situations that I’m handling are for long running
> > requests. They’re all designed for very fast requests that come into
> > the servers that I manage on a constant basis.
> >
> > If you can shed some light on the way that VM’s and their memory are
> > handled per my question above and any insights into what to do about
> > this type of situation, that would help a lot. I don’t know if there
> > are any plans to offer a POST body parsing feature in NJS for those
> > that need to evalute POST body data like how Lua did it, but if there
> > was some way to be able to do that at the Nginx layer instead of at
> > the NJS layer, it seems like that could be a lot more sensitive to
> > memory use. Right now, if my understanding is correct, the only
> > option that I’d even have would be to just stop doing POST body
> > handling if the POST body is above a certain total size. I guess if
> > there was some way to forcibly free memory, that would help too. But
> > I don’t think that that is as common of a problem as having to deal
> > with very large query strings that some third party appends to a URL
> > (probably maliciously) and/or a very large file upload attached to a
> > multipart POST. So the only concern that I’d have about memory in a
> > situation where I don’t have to worry about memory when parsing a
> > larger file woudl be if multiple js_sets and such would just keep
> > spawning VMs and accumulating memory during a single request.
> >
> > Any thoughts?
> >
> > —
> > Lance Dockins
> >
> >
> > On Thursday, Sep 21, 2023 at 1:45 AM, Dmitry Volyntsev
> > <xeioex at nginx.com> wrote:
> >
> > On 20.09.2023 20:37, Lance Dockins wrote:
> > > So I guess my question at the moment is whether endless memory use
> > > growth being reported by njs.memoryStats.size after file writes is
> > > some sort of false positive tied to quirks in how memory use is
> > > being
> > > reported or whether this is indicative of a memory leak? Any
> > > insight
> > > would be appreicated.
> >
> > Hi Lance,
> > The reason njs.memoryStats.size keeps growing is because NJS uses
> > arena
> > memory allocator linked to a current request and a new object
> > representing memoryStats structure is returned every time
> > njs.memoryStats is accessed. Currently NJS does not free most of the
> > internal objects and structures until the current request is
> > destroyed
> > because it is not intended for a long running code.
> >
> > Regarding the sudden memory spikes, please share some details
> > about JS
> > code you are using.
> > One place to look is to analyze the amount of traffic that goes to
> > NJS
> > locations and what exactly those location do.
> >
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://mailman.nginx.org/pipermail/nginx/attachments/20230921/5173f67b/attachment-0001.htm>


More information about the nginx mailing list