Debugging Nginx Memory Spikes on Production Servers

Dmitry Volyntsev xeioex at nginx.com
Thu Sep 21 22:01:41 UTC 2023


On 9/21/23 6:50 AM, Lance Dockins wrote:

Hi Lance,

See my comments below.

> Thanky you, Dmitry.
>
> One question before I describe what we are doing with NJS.  I did read 
> about the VM handling process before switching from Lua to NJS and it 
> sounded very practical but my current understanding is that there 
> could be multiple VM’s instantiated for a single request.  A js_set, 
> js_content, and js_header_filter directive that applies to a single 
> request, for example, would instantiate 3 VMs.  And were you to need 
> to set multiple variables with js_set, then keep adding to that # of VMs.
>
>
This is not correct. For js_set, js_content and js_header_filter there 
is only a single VM.
The internalRedirect() is the exception, because a VM does not survive 
it, but the previous VMs will not be freed until current request is 
finished. BTW, a VM instance itself is pretty small in size (~2kb) so it 
should not be a problem if you have a reasonable number of redirects.


>
> My original understanding of that was that those VMs would be 
> destroyed once they exited so even if you had multiple VMs 
> instantiated per request, the memory impact would not be cumulative in 
> a single request.  Is that understanding correct?  Or are you saying 
> that each VM accumulates more and more memory until the entire request 
> completes?
>
> As far as how we’re using NJS, we’re mostly using it for header 
> filters, internal redirection, and access control.  So there really 
> shouldn’t be a threat to memory in most instances unless we’re not 
> just dealing with a single request memory leak inside of a VM but also 
> a memory leak that involves every VM that NJS instantiates just 
> accumulating memory until the request completes.
>
> Right now, my working theory about what is most likely to be creating 
> the memory spikes has to do with POST body analysis.  Unfortunately, 
> some of the requests that I have to deal with are POSTs that have to 
> either be denied access or routed differently depending on the 
> contents of the POST body.  Unfortunately, these same routes can vary 
> in the size of the POST body and I have no control over how any of 
> that works because the way it works is controlled by third parties. 
>  One of those third parties has significant market share on the 
> internet so we can’t really avoid dealing with it.
>
> In any case, before we switched to NJS, we were using Lua to do the 
> same things and that gave us the advantage of doing both memory 
> cleanup if needed and also doing easy analysis of POST body args.  I 
> was able to do this sort of thing with Lua before:
> local post_args, post_err = ngx.req.get_post_args()
> if post_args.arg_name = something then
>
> But in NJS, there’s no such POST body utility so I had to write my 
> own.  The code that I use to parse out the POST body works for both 
> URL encoded POST bodies and multipart POST bodies, but it has to read 
> the entire POST into a variable before I can use it.  For small POSTs, 
> that’s not a problem.  For larger POSTs that contain a big attachment, 
> it would be.  Ultimately, I only care about the string key/value pairs 
> for my purposes (not file attachments) so I was hoping to discard 
> attachment data while parsing the body.
>
>
>
Thank you for the feedback, I will add it as to a future feature list.

>  I think that that is actually how Lua’s version of this works too. 
>  So my next thought was that I could use a Buffer and rs.readSync to 
> read the POST body in buffer frames to keep memory minimal so that I 
> could could discard the any file attachments from the POST body and 
> just evaluate the key/value data that uses simple strings.  But from 
> what you’re saying, it sounds like there’s basically no difference 
> between fs.readSync w/ a Buffer and rs.readFileSync in terms of actual 
> memory use. So either way, with a large POST body, you’d be 
> steamrolling the memory use in a single Nginx worker thread. When I 
> had to deal with stuff like this in Lua, I’d just run collectgarbage() 
> to clean up memory and it seemed to work fine.  But then I also wasn’t 
> having to parse out the POST body myself in Lua either.
>
> It’s possible that something else is going on other than that. 
>  qs.parse seems like it could get us into some trouble if the 
> query_string that was passed was unusuall long too from what you’re 
> saying about how memory is handled.
>
>
for qs.parse() there is a limit for a number of arguments, which you can 
specify.

>
> None of the situations that I’m handling are for long running 
> requests.  They’re all designed for very fast requests that come into 
> the servers that I manage on a constant basis.
>
> If you can shed some light on the way that VM’s and their memory are 
> handled per my question above and any insights into what to do about 
> this type of situation, that would help a lot.  I don’t know if there 
> are any plans to offer a POST body parsing feature in NJS for those 
> that need to evalute POST body data like how Lua did it, but if there 
> was some way to be able to do that at the Nginx layer instead of at 
> the NJS layer, it seems like that could be a lot more sensitive to 
> memory use.  Right now, if my understanding is correct, the only 
> option that I’d even have would be to just stop doing POST body 
> handling if the POST body is above a certain total size.  I guess if 
> there was some way to forcibly free memory, that would help too.  But 
> I don’t think that that is as common of a problem as having to deal 
> with very large query strings that some third party appends to a URL 
> (probably maliciously) and/or a very large file upload attached to a 
> multipart POST.  So the only concern that I’d have about memory in a 
> situation where I don’t have to worry about memory when parsing a 
> larger file woudl be if multiple js_sets and such would just keep 
> spawning VMs and accumulating memory during a single request.
>
> Any thoughts?
>
>> Lance Dockins
>
>
>     On Thursday, Sep 21, 2023 at 1:45 AM, Dmitry Volyntsev
>     <xeioex at nginx.com> wrote:
>
>     On 20.09.2023 20:37, Lance Dockins wrote:
>>     So I guess my question at the moment is whether endless memory use
>>     growth being reported by njs.memoryStats.size after file writes is
>>     some sort of false positive tied to quirks in how memory use is
>>     being
>>     reported or whether this is indicative of a memory leak?  Any
>>     insight
>>     would be appreicated.
>
>     Hi Lance,
>     The reason njs.memoryStats.size keeps growing is because NJS uses
>     arena
>     memory allocator linked to a current request and a new object
>     representing memoryStats structure is returned every time
>     njs.memoryStats is accessed. Currently NJS does not free most of the
>     internal objects and structures until the current request is
>     destroyed
>     because it is not intended for a long running code.
>
>     Regarding the sudden memory spikes, please share some details
>     about JS
>     code you are using.
>     One place to look is to analyze the amount of traffic that goes to
>     NJS
>     locations and what exactly those location do.
>


More information about the nginx mailing list