[PATCH 00 of 15] Serve all requests from single tempfile
jiri.setnicka at cdn77.com
Fri Jan 28 16:31:52 UTC 2022
Over the last few months, we (a small team of developers including me
and Jan Prachař, both from CDN77) developed a missing feature for the
proxy caching in Nginx. We are happy to share this feature with the
community in the following patch series.
We serve a large number of files to an immense number of clients, and
often multiple clients want the same file at the very same time,
especially when it comes to streaming (when a file is crafted on the
upstream in real time and getting it can take seconds).
Previously there were two options in Nginx when using proxy caching:
* pass all incoming requests to the origin
* use the proxy_cache_lock feature, pass only the first request (served
in real time) and let other requests wait until the first request
completes
We didn't like either of these options (the first one effectively
disables the CDN and the second one is unusable for streaming). We
considered using Varnish, which solves this problem better, but we are
very happy with the Nginx infrastructure we have. Thus we came up with
a third option.
We developed the proxy_cache_tempfile mechanism, which acts similarly to
the proxy_cache_lock, but instead of locking other requests waiting for
the file completion, we open the tempfile used by the primary request
and periodically serve parts of it to the waiting requests.
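
To illustrate the idea, here is a minimal single-process sketch in C of a
secondary reader following a tempfile that is still being written. It is
not code from the patch series: the function names, the chunk data and
the `done` flag (standing in for whatever completion signal the workers
share) are all hypothetical. Each loop iteration plays the role of one
proxy_cache_tempfile_loop tick.

```c
#define _XOPEN_SOURCE 700
#include <assert.h>
#include <fcntl.h>
#include <string.h>
#include <sys/types.h>
#include <unistd.h>

/* Copy bytes that appeared in the tempfile past *off into out and advance
 * *off. Returns bytes copied (0 if nothing new yet) or -1 on error. */
static ssize_t drain_new_bytes(int fd, off_t *off, char *out, size_t cap)
{
    ssize_t n = pread(fd, out, cap, *off);
    if (n > 0) {
        *off += n;
    }
    return n;
}

/* Simulates one primary writer and one secondary reader on the same file.
 * Returns the number of bytes the reader received, or -1 on error.
 * The received data is NUL-terminated in out (cap must be at least 1). */
long simulate_secondary_reader(const char *path, char *out, size_t cap)
{
    static const char *chunks[] = { "seg1,", "seg2,", "seg3" };
    const size_t nchunks = sizeof(chunks) / sizeof(chunks[0]);
    size_t next = 0, total = 0;
    off_t off = 0;
    int done = 0, failed = 0;
    int wfd, rfd;

    if (cap == 0) return -1;
    wfd = open(path, O_WRONLY | O_CREAT | O_TRUNC | O_APPEND, 0600);
    if (wfd < 0) return -1;
    rfd = open(path, O_RDONLY);            /* the secondary request's fd */
    if (rfd < 0) { close(wfd); unlink(path); return -1; }

    for (;;) {                             /* one iteration = one tick */
        ssize_t n;

        if (next < nchunks) {              /* upstream data arrives */
            size_t len = strlen(chunks[next]);
            if (write(wfd, chunks[next], len) != (ssize_t) len) {
                failed = 1;
                break;
            }
            next++;
        } else {
            done = 1;                      /* hypothetical completion flag */
        }

        /* Secondary request: serve whatever is new since the last tick. */
        n = drain_new_bytes(rfd, &off, out + total, cap - 1 - total);
        if (n < 0) { failed = 1; break; }
        total += (size_t) n;

        if (done && n == 0) break;         /* file complete, nothing left */
    }

    out[total] = '\0';
    close(wfd);
    close(rfd);
    unlink(path);
    return failed ? -1 : (long) total;
}
```

A real worker would not spin like this. It would re-arm a timer between
ticks and give up once no data has arrived for the configured timeout.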
Because there may be multiple tempfiles for the same file (for example
when the file expires before it is fully downloaded), we use shared
memory per cache with an `ngx_http_file_cache_tf_node_t` for each
created tempfile to synchronize all workers. When a new request is
passed to the origin, we record its tempfile number, and when another
request is received, we try to open the tempfile with this number and
serve from it. Once a secondary request has started using a tempfile,
it sticks with that same tempfile until its completion.
To accomplish this we rely on a POSIX filesystem feature: you can open
a file and retain its file descriptor even when the file is moved to a
new location (on the same filesystem). I'm afraid that this would be
hard to accomplish on Windows, so this feature will be non-Windows
only.
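
The POSIX behaviour we depend on can be checked with a small standalone C
program. This is only an illustrative sketch, not part of the patches;
the function name and the paths passed to it are hypothetical. It opens a
file for reading under its temporary name, renames the file while a
writer is still appending, and then reads everything back through the
descriptor that was opened before the rename.

```c
#define _XOPEN_SOURCE 700
#include <assert.h>
#include <fcntl.h>
#include <stdio.h>
#include <string.h>
#include <sys/types.h>
#include <unistd.h>

/* Returns 0 if data written both before and after rename() is readable
 * through a descriptor opened under the old name, -1 otherwise.
 * Both paths must be on the same filesystem. */
int fd_survives_rename(const char *tmp_path, const char *final_path)
{
    char buf[16] = {0};
    ssize_t n;
    int wfd, rfd;

    wfd = open(tmp_path, O_WRONLY | O_CREAT | O_TRUNC, 0600);
    if (wfd < 0) return -1;
    rfd = open(tmp_path, O_RDONLY);        /* reader attaches early */
    if (rfd < 0) { close(wfd); unlink(tmp_path); return -1; }

    if (write(wfd, "hello ", 6) != 6) goto fail;
    if (rename(tmp_path, final_path) != 0) goto fail;  /* tempfile -> cache */
    if (write(wfd, "world", 5) != 5) goto fail;        /* writer continues */

    /* The old descriptor still refers to the same file. */
    n = pread(rfd, buf, sizeof(buf) - 1, 0);

    close(wfd);
    close(rfd);
    unlink(final_path);
    return (n == 11 && strcmp(buf, "hello world") == 0) ? 0 : -1;

fail:
    close(wfd);
    close(rfd);
    unlink(tmp_path);
    unlink(final_path);
    return -1;
}
```

Calling `fd_survives_rename("/tmp/tf_demo.tmp", "/tmp/tf_demo.cache")`
should return 0 on a POSIX system when both paths are on the same
filesystem.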
We have tested this feature thoroughly over the last few months and we
already use it in part of our infrastructure without noticing any
negative impact. We noticed only a very small increase in memory usage
and a minimal increase in CPU and disk I/O usage (which corresponds
with the increased throughput of the server).
We also did some synthetic benchmarks where we compared vanilla nginx
and our patched version with and without cache lock and with cache
tempfiles. Results of the benchmarks, charts, and scripts we used for it
are available on my GitHub:
It should also work for fastcgi, uwsgi, and scgi caches (as they use
the same mechanism internally), but we have not tested these.
New configuration directives:
* proxy_cache_tempfile on; -- activate the whole tempfile logic
* proxy_cache_tempfile_timeout 5s; -- how long to wait for a tempfile
before returning 504
* proxy_cache_tempfile_loop 50ms; -- interval at which tempfiles are
checked for new data
(and the same for fastcgi_cache, uwsgi_cache and scgi_cache)
New option for proxy_cache_path: tf_zone=name:size (defaults to the key
zone name with a _tf suffix and a size of 10M). It creates a shared
memory zone used to store tempfile nodes.
We would be very grateful for any reviews and other testing.