[PATCH] Mail: add the "reuseport" option of the "listen" directive
mdounin at mdounin.ru
Wed Aug 18 17:05:40 UTC 2021
On Thu, Aug 19, 2021 at 12:28:59AM +1000, Robert Mueller wrote:
> > Could you please test if compiling with
> > --with-cc-opt="-DNGX_HAVE_EPOLLEXCLUSIVE=0"
> > improves things, notably on production systems? In my limited
> > testing it seems to improve things, and if this is indeed the
> > case, we can consider removing use of EPOLLEXCLUSIVE.
> I can try this tomorrow, but did you see the link Jan posted to the Cloudflare blog?
> This explains the problem we're seeing exactly and why reuseport fixes it.
Yes, I've seen it. It also suggests that EPOLLEXCLUSIVE might be
responsible for the balancing change you've observed with recent
kernels, something I've also suspected.
> > > As you can see, without the reuseport option, this causes severe
> > > scalability problems for us.
> > I tend to think that reuseport is a bad option for load balancing
> > between worker processes, as it can be easily tricked by an outside
> > actor to select a particular worker process, and this opens an
> > obvious DoS attack vector.
> Really? Can you explain how this is possible?
Since reuseport uses a hash of the source address and port to balance
incoming connections between sockets, the client can choose a
source port to use so the hash will direct the connection to a
particular socket, that is, to a particular worker process. This
in turn makes it possible to overload this worker process (which
is usually several times easier than overloading all worker
processes), degrading or completely denying service to clients who
happen to be balanced to the same worker process.
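
To illustrate the argument above, here is a minimal sketch of how a client
could pick source ports that all land on one worker. It is a toy model, not
the kernel's algorithm: pick_worker(), ports_for_worker(), the CRC32 hash,
the worker count, and the example address are all assumptions made for
illustration. The only property it relies on is that the socket is chosen
deterministically from the source address and port.

```python
import zlib

N_WORKERS = 8  # assumed number of worker processes (one listening socket each)


def pick_worker(src_ip: str, src_port: int, n_workers: int) -> int:
    """Toy stand-in for reuseport socket selection.

    Hashes the source address and port and reduces the result modulo the
    number of listening sockets. The real kernel hash is different; CRC32
    is used here only because it is deterministic and in the stdlib.
    """
    key = f"{src_ip}:{src_port}".encode()
    return zlib.crc32(key) % n_workers


def ports_for_worker(src_ip: str, target: int, n_workers: int,
                     ports=range(1024, 65536)) -> list:
    """Return the source ports that map this client to the target worker."""
    return [p for p in ports if pick_worker(src_ip, p, n_workers) == target]


# A single client address (from the documentation range 203.0.113.0/24)
# restricting itself to ports that all hash to worker 3.
chosen = ports_for_worker("203.0.113.7", target=3, n_workers=N_WORKERS)
```

In this model every connection opened from a port in `chosen` is handled by
the same worker, so the client only has to exhaust one worker's capacity
rather than that of all workers combined.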
> Also given that Cloudflare use this option, and I expect
> Cloudflare are literally the largest users of nginx in the world
> and also have to deal with extreme adversarial environments
> given they run a service to protect against DDoS, I would expect
> they would be aware of any potential DoS vector in this regard,
> or if not aware, extremely interested in hearing about it!
I believe Cloudflare has enough resources and/or enough mitigations
in place not to care.