One last try - large long-running worker tasks

Dipl. Ing. Sergey Brester serg.brester at sebres.de
Tue Nov 10 12:48:29 UTC 2020


 

You could do it similarly to how the proxy module buffers the response;
for instance, see the proxy_buffering [2] directive: 

_When buffering is enabled, nginx receives a response from the proxied
server as soon as possible, saving it into the buffers set by the
proxy_buffer_size [3] and proxy_buffers [4] directives. If the whole
response does not fit into memory, a part of it can be saved to a
temporary file [5] on the disk. Writing to temporary files is controlled
by the proxy_max_temp_file_size [6] and proxy_temp_file_write_size [7]
directives._ 

This module and the other communicating modules (like fcgi, scgi or
uwsgi) use upstream buffering of the response. The handling around
upstream buffering is almost the same in all of them.
This is already event-driven: the handler is called when the upstream
becomes readable, i.e. on an incoming response chunk (or when the
downstream becomes writable). 

Basically, depending on how your module architecture is built, you could:


 	* either use the default upstream buffering mechanism (if you have
something like an upstream or can simulate one). In this case you have
to set certain properties of r->upstream: buffering, buffer_size,
bufs.num and bufs.size, temp_file_write_size and max_temp_file_size, and
of course register the handler that reads the upstream pipe.
 	* or organize your own response buffering the way it is implemented in
ngx_event_pipe.c and ngx_http_upstream.c; take a look there for
implementation details.
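
To illustrate the second option, here is a minimal standalone sketch of
the "memory first, temp file as fallback" idea. It is not nginx code and
does not use the nginx API: spill_buf_t, spill_init, spill_write and
spill_drain are hypothetical names, mem_limit only loosely stands in for
bufs.num * bufs.size, and max_temp_file_size is borrowed just as a name.
The real logic in ngx_event_pipe.c is event-driven and works on buffer
chains.

```c
#include <assert.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical two-tier buffer: RAM first, temp file as overflow. */
typedef struct {
    size_t  mem_limit;           /* loosely bufs.num * bufs.size */
    size_t  max_temp_file_size;  /* cap on bytes spilled to disk */
    char   *mem;                 /* in-memory tier */
    size_t  mem_used;
    FILE   *temp;                /* created lazily on first spill */
    size_t  temp_used;
} spill_buf_t;

/* Returns 0 on success, -1 if the memory tier cannot be allocated. */
int spill_init(spill_buf_t *b, size_t mem_limit, size_t max_temp_file_size)
{
    memset(b, 0, sizeof(*b));
    b->mem_limit = mem_limit;
    b->max_temp_file_size = max_temp_file_size;
    b->mem = malloc(mem_limit);
    return b->mem != NULL ? 0 : -1;
}

/* Returns 0 if the chunk was stored, -1 if both tiers are full (the
 * producer has to throttle) or the temp file could not be written. */
int spill_write(spill_buf_t *b, const char *data, size_t len)
{
    size_t room = b->mem_limit - b->mem_used;
    size_t to_mem = 0;

    /* Once anything is on disk, keep appending there to preserve order. */
    if (b->temp_used == 0) {
        to_mem = len < room ? len : room;
    }
    size_t to_file = len - to_mem;

    if (b->temp_used + to_file > b->max_temp_file_size) {
        return -1;
    }
    if (to_file > 0 && b->temp == NULL) {
        b->temp = tmpfile();
        if (b->temp == NULL) {
            return -1;
        }
    }

    memcpy(b->mem + b->mem_used, data, to_mem);
    b->mem_used += to_mem;
    assert(b->mem_used <= b->mem_limit);

    if (to_file > 0) {
        if (fwrite(data + to_mem, 1, to_file, b->temp) != to_file) {
            return -1;
        }
        b->temp_used += to_file;
    }
    return 0;
}

/* Copies everything buffered so far into out (memory tier first, then
 * the temp file) and resets the buffer. Returns the number of bytes
 * copied, or 0 if out is too small or the temp file cannot be read. */
size_t spill_drain(spill_buf_t *b, char *out, size_t out_size)
{
    size_t total = b->mem_used + b->temp_used;

    if (total > out_size) {
        return 0;
    }
    memcpy(out, b->mem, b->mem_used);
    if (b->temp_used > 0) {
        rewind(b->temp);
        if (fread(out + b->mem_used, 1, b->temp_used, b->temp)
            != b->temp_used) {
            return 0;
        }
        fclose(b->temp);
        b->temp = NULL;
    }
    b->mem_used = 0;
    b->temp_used = 0;
    return total;
}
```

A caller would spill_init() once, call spill_write() for each chunk the
worker produces, and treat a -1 return as the signal to pause the
producer until spill_drain() has freed space.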

As for performance (disk I/O, etc.), it depends on the buffer size, the
system cache, the mount type of the temp storage, the speed of the
downstream clients, etc.
But if you configure the buffers large enough, nginx will use them for
as long as possible, and storing to a temp file can be considered a safe
on-demand fallback that smooths out load peaks and avoids an OOM
situation.
Using kernel pipe buffers could surely be faster, but indirectly you'd
just relocate the potential OOM issue from the nginx process to the
system. 
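
To make the point about kernel pipe buffers concrete, here is a small
POSIX-only sketch (the function name pipe_capacity_probe is made up for
this example). It fills a non-blocking pipe that nobody reads from and
counts how many bytes the kernel accepts before write() fails with
EAGAIN. Whatever number comes back is memory held by the kernel for that
one pipe, and it is also the point at which a blocking writer thread
would be put to sleep.

```c
#include <assert.h>
#include <errno.h>
#include <fcntl.h>
#include <unistd.h>

/* Fills a fresh non-blocking pipe until the kernel refuses more data.
 * Returns the number of bytes accepted, or -1 on an unexpected error. */
long pipe_capacity_probe(void)
{
    int  fds[2];
    char chunk[4096] = {0};
    long total = 0;

    if (pipe(fds) != 0) {
        return -1;
    }

    int flags = fcntl(fds[1], F_GETFL);
    if (flags == -1 || fcntl(fds[1], F_SETFL, flags | O_NONBLOCK) == -1) {
        close(fds[0]);
        close(fds[1]);
        return -1;
    }

    for (;;) {
        ssize_t n = write(fds[1], chunk, sizeof(chunk));
        if (n > 0) {
            total += n;
            continue;
        }
        /* A write with non-zero length yields a positive count or -1. */
        assert(n == -1);
        if (errno == EINTR) {
            continue;
        }
        if (errno != EAGAIN && errno != EWOULDBLOCK) {
            total = -1;  /* unexpected failure */
        }
        break;  /* pipe is full: a blocking writer would sleep here */
    }

    close(fds[0]);
    close(fds[1]);
    return total;
}
```

The capacity is system dependent (on Linux the default is typically
64 KiB per pipe), so with many concurrent workers the total is memory
that the system, not the nginx process, has to provide.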

Regards,
Sergey 

10.11.2020 02:54, Jeff Heisz wrote: 

> Hi all, I've asked this before with no response, trying one last time
> before I just make something work.
> 
> I'm making a custom module for nginx that does a number of things but
> one of the actions is a long-running (in the nginx sense) task that
> could produce a large response. I've already got proper processing
> around using worker tasks for the other long-running operations that
> have small datasets, but worry about accumulating a large amount of
> memory in a buffer chain for the response. Ideally it would drain as
> fast as the client can consume it and throttle appropriately, there
> could conceivably be gigabytes of content.
> 
> My choices (besides blowing all of the memory in the system) are:
> 
> - write to a temporary file and attach a file buffer as the response,
> less than ideal as it's essentially translating a file to begin with,
> so it's a lot of disk I/O and performance will be less than stellar.
> From what I can tell, this is one of the models for the various CGI
> systems for caching, although in my case caching is not of use
> 
> - somehow hook into the eventing system of nginx to detect the write
> transitions and implement flow control directly using threading
> conditionals. I've tried this for a few weeks but can't figure out
> the 'right' thing to make the hooks work in a separate module without
> changing the core nginx code, which I'm loathe to do (unless you are
> looking for someone to contribute such a solution, but I'd probably
> need some initial guidance)
> 
> - attach a kernel pipe object (yah yah, won't work on Windows, don't
> care) to each of my worker instances and somehow 'connect' that as an
> upstream-like resource, so that the nginx event loop handles the
> read/write consumption and the thread automatically blocks when full
> on the kernel pipe. Would need some jiggery to handle reuse and
> start/end markers. Also not clear if I can override the connection
> model for the upstream without again changing core nginx server code
> 
> Any thoughts? Not looking for code here (although telling me to look
> at the blah-blah-blah that does exactly this would be awesome), but if
> someone who is more familiar with the inner workings of the nginx data
> flow could just say which solution is a non-starter (so I don't waste
> time trying to make it work) or even which would be a suitable
> solution would be awesome!
> 
> jmh
> _______________________________________________
> nginx-devel mailing list
> nginx-devel at nginx.org
> http://mailman.nginx.org/mailman/listinfo/nginx-devel [1]
 

Links:
------
[1] http://mailman.nginx.org/mailman/listinfo/nginx-devel
[2] http://nginx.org/en/docs/http/ngx_http_proxy_module.html#proxy_buffering
[3] http://nginx.org/en/docs/http/ngx_http_proxy_module.html#proxy_buffer_size
[4] http://nginx.org/en/docs/http/ngx_http_proxy_module.html#proxy_buffers
[5] http://nginx.org/en/docs/http/ngx_http_proxy_module.html#proxy_temp_path
[6] http://nginx.org/en/docs/http/ngx_http_proxy_module.html#proxy_max_temp_file_size
[7] http://nginx.org/en/docs/http/ngx_http_proxy_module.html#proxy_temp_file_write_size