Let's start with some background. I'm developing a module that runs inside the NGINX engine to process requests (standard content phase) against an API (not an upstream-compatible socket-based protocol). These requests can be long-running, so the module uses thread tasks to execute them and generate responses. I'm well aware of how to safely manage content in multi-threaded environments and of the threading/pooling/data management constraints of the NGINX internals. For the most part, I've resolved all of the nuances of handling "small" requests and responses. However, the current code handles a response by accumulating a buffer chain of content that is then issued with output_filter and finalize_request, which is less than optimal for the really large responses that certain parts of the API can produce.
Now for the question - I'm looking for some sample code or basic direction on how best to implement a streaming consumption model that both avoids memory overload and provides write-based throttling from the threaded API processing to the HTTP response instance. Of course, there are the obvious threading boundaries where the task instance cannot call response methods on the request, and I know how to handle inter-thread processing/messaging. What I can't figure out is how to poll/interact with such a transfer mechanism from the main event thread. Three thoughts/possible solutions:
- override the write event handler in the long-running case to check for and consume available chain buffers from the thread processor, with some kind of signalling/hold mechanism to throttle the thread and avoid memory overload. Two issues with this (so far): how to measure the write backlog in order to throttle the thread and, more importantly, how to deal with the edge-triggered model of the underlying event loop so that the write handler actually fires to check for pending records. This is (to me) the 'cleanest' model and the one I've tried to implement, but I have been unable to get the write handler to fire consistently.
- use a continuous/never-ending 1ms timer to simulate a callback on every event-loop iteration and process the backlog. This is even less elegant: it does fire consistently, but it adds a delay and still has the same backlog measurement issue as the previous solution, not to mention having a single callback manage a collection of responses.
- the third option, which I originally didn't really like for several reasons but which, now that I've typed all this out, I think might be the best solution, is to bind a kernel pipe and use it to shuttle the content between the thread and the event loop, as if it were an external upstream. I was originally thinking of using the pipe only to 'wake' the event thread to consume the records, but I now realize that I could use the pipe to carry all of the content. The only thing I don't like about this (aside from it seeming to be a much-too-complex solution to the problem) is the possible overload of pipes when there are a lot of concurrent requests.
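To make the third option concrete, here is a minimal sketch of what I have in mind, written as standalone POSIX C rather than against the NGINX API. The names `producer_t`, `producer_main` and `stream_through_pipe` are made up for illustration, and `poll()` is only a stand-in for the event loop; in a real module the read end would presumably be non-blocking and registered with the NGINX event machinery instead. The point it tries to show is that the kernel pipe buffer bounds the memory in flight, and the worker's blocking `write()` acts as the throttle.

```c
#include <assert.h>
#include <errno.h>
#include <poll.h>
#include <pthread.h>
#include <signal.h>
#include <stddef.h>
#include <string.h>
#include <unistd.h>

/* Hypothetical stand-in for the threaded API task. */
typedef struct {
    int    wfd;    /* write end of the pipe */
    size_t total;  /* number of bytes the "API" will produce */
} producer_t;

static void *producer_main(void *arg)
{
    producer_t *p = arg;
    char        chunk[4096];
    size_t      left;

    assert(p != NULL);
    left = p->total;
    memset(chunk, 'x', sizeof(chunk));

    while (left > 0) {
        size_t  want = left < sizeof(chunk) ? left : sizeof(chunk);
        /* Blocking write: the thread stalls here while the pipe is full,
         * so a slow consumer throttles the producer automatically. */
        ssize_t n = write(p->wfd, chunk, want);

        if (n < 0 && errno == EINTR) continue;
        if (n <= 0) break;              /* e.g. EPIPE: reader went away */
        left -= (size_t) n;
    }

    close(p->wfd);                      /* EOF marks the end of the response */
    return NULL;
}

/* Stand-in for the event-loop side. Returns bytes consumed, or -1 on error. */
static long stream_through_pipe(size_t total)
{
    int        fds[2];
    pthread_t  tid;
    producer_t prod;
    char       buf[8192];
    long       consumed = 0;

    signal(SIGPIPE, SIG_IGN);           /* get EPIPE from write() instead of a signal */

    if (pipe(fds) != 0) return -1;

    prod.wfd = fds[1];
    prod.total = total;

    if (pthread_create(&tid, NULL, producer_main, &prod) != 0) {
        close(fds[0]);
        close(fds[1]);
        return -1;
    }

    for (;;) {
        struct pollfd pfd = { .fd = fds[0], .events = POLLIN };
        ssize_t       n;

        if (poll(&pfd, 1, -1) < 0) {
            if (errno == EINTR) continue;
            consumed = -1;
            break;
        }

        n = read(fds[0], buf, sizeof(buf));
        if (n < 0) { consumed = -1; break; }
        if (n == 0) break;              /* writer closed: response complete */

        /* A real handler would hand buf to the output filter chain here. */
        consumed += n;
    }

    close(fds[0]);
    pthread_join(tid, NULL);
    return consumed;
}
```

Calling `stream_through_pipe(1000000)` pushes one million bytes from the worker thread to the reader without ever holding more than the pipe's capacity plus one read buffer in memory. It does nothing about my concern with descriptor counts: this still costs two file descriptors per in-flight response.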
Any thoughts? Not looking for someone to provide the code (although if someone says 'hey, check out this module, it does exactly what you need' I won't be upset). Just looking for some guidance from anyone who might have some ideas on how to best approach this problem...