Issue with flooded warning and request limiting

Mon Nov 20 13:01:03 UTC 2017

Hello!

On Mon, Nov 20, 2017 at 11:33:26AM +0100, Stephan Ryer wrote:

> We are using nginx as a proxy server in front of our IIS servers.
> 
> We have a client who needs to call us up to 200 times per second. Due to
> the roundtrip-time, 16 simultaneous connections are opened from the client
> and each connection is used independently to send an HTTPS request, wait for
> x ms and then send again.
> 
> I have been doing some tests and looked into the throttle logic in the
> nginx-code. It seems that when setting request limit to 200/sec it is
> actually interpreted as “minimum 5ms per call” in the code. If we receive 2
> calls at the same time, the warning log will show an “excess”-message and
> the call will be delayed to ensure a minimum of 5ms between the calls
> (and if no burst is set, it will be an error message in the log and an
> error will be returned to the client).
>
> We have set burst to 20, meaning that when our client only sends 1 request
> at a time per connection, he will never get an error reply from nginx,
> instead nginx just delays the call. I conclude that this is by design.

Yes, the code counts the average request rate, and if it sees two 
requests with just 1ms between them the average rate will be 1000 
requests per second.  This is more than what is allowed, and hence 
nginx will either delay the second request (unless configured with 
"nodelay"), or will reject it if the configured burst size is 
reached.

> The issue, however, is that a client using multiple connections naturally
> often won't be able to time the calls between each connection. And even
> though our burst has been set to 20, our log is flooded with warning messages
> which I do not think should be a warning at all. There is a difference
> between sending 2 calls at the same time and sending a total of 201
> requests within a second, the latter being the only case I would expect to
> be logged as a warning.

If you are not happy with log levels used, you can easily tune 
them using the limit_req_log_level directive.  See 
http://nginx.org/r/limit_req_log_level for details.
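
For illustration, a minimal sketch of such tuning (the zone name 
"one", the $binary_remote_addr key and the location are 
assumptions here; burst=20 is taken from your description):

    limit_req_zone $binary_remote_addr zone=one:10m rate=200r/s;

    location / {
        limit_req zone=one burst=20;

        # rejected requests are logged at "notice",
        # delayed ones at one level less, that is, "info"
        limit_req_log_level notice;
    }

By default rejections are logged at the "error" level and delays 
at "warn", which is where the warnings you see come from.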

Note well that given the use case description, you probably don't 
need requests to be delayed at all, so consider using "limit_req 
.. nodelay;".  It will avoid the delaying logic altogether, thus 
allowing as many requests as burst permits.
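
A sketch of this with the burst you already use (the zone name 
"one" and the $binary_remote_addr key are assumptions here):

    limit_req_zone $binary_remote_addr zone=one:10m rate=200r/s;

    # up to 20 excess requests are passed immediately
    limit_req zone=one burst=20 nodelay;

Requests beyond the burst are still rejected, with the status set 
by limit_req_status (503 by default).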

> Instead of calculating the throttling by simply looking at the last call
> time and calculating a minimum timespan between last call and current call, I
> would like the logic to be that nginx keeps a counter of the number of
> requests within the current second, and when the second expires and a new
> second begins, the counter is reset.

This approach is not scalable.  For example, it won't allow you to 
configure a limit of 1 request per minute.  Moreover, it can 
easily allow more requests in a single second than configured - 
for example, a client can do 200 requests at 0.999 and an additional 
200 requests at 1.000.  According to your algorithm, this is 
allowed, yet it is 400 requests in just 2 milliseconds.

The current implementation is much more robust, and it can be 
configured for various use cases.  In particular, if you want to 
maintain a limit of 200 requests per second and want to tolerate 
cases when a client does all requests allowed within a second at 
the same time, consider:

    limit_req_zone $binary_remote_addr zone=one:10m rate=200r/s;
    limit_req zone=one burst=200 nodelay;

This will switch off delays as already suggested above, and will 
allow a burst of up to 200 requests - that is, a client is allowed 
to do all 200 requests when a second starts.  (If you really want 
to allow the case with 400 requests in 2 milliseconds as described 
above, consider using burst=400.)

-- 
Maxim Dounin
http://mdounin.ru/