[PATCH] Dynamic rate limiting for limit_req module

J Carter jordanc.carter at outlook.com
Tue Jan 3 06:25:07 UTC 2023


Hello,

Thanks again for your reply and thoughts.

On 02/01/2023 23:47, Maxim Dounin wrote:
> Hello!
>
> On Mon, Jan 02, 2023 at 06:21:03AM +0000, J Carter wrote:
>
>>> "The fundamental problem with dynamic rates in limit_req, which is
>>> a leaky bucket limiter, is that the rate is a property which
>>> applies to multiple requests as accumulated in the bucket (and
>>> this is basically why it is configured in the limit_req_zone, and
>>> not in limit_req), while the dynamically calculated rate is a
>>> property of the last request.
>>>
>>> This can easily result in counter-intuitive behaviour.
>>> For example, consider 5 requests with 1 second interval, assuming
>>> burst 2, and rate being 1r/m or 100r/s depending on some request
>>> properties:
>>>
>>> - 1st request, rate 1r/m, request is accepted
>>> - 2st request, rate 1r/m, request is accepted
>>> - 3rd request, rate 1r/m, request is rejected, since there is no
>>>    room in the bucket
>>> - 4th request, rate 100r/s, request is accepted
>>> - 5th request, rate 1r/m, request is accepted (unexpectedly)
>>>
>>> Note that the 5th request is accepted, while it is mostly
>>> equivalent to the 3rd request, and one could expect it to be
>>> rejected.  But it happens to to be accepted because the 4th
>>> request "cleared" the bucket."
>> This is true - however I do wonder if the behavior (clearing the bucket)
>> is truly counter intuitive in that scenario.
> Well, certainly it is.  And it is even more so, if you'll assume
> that 4th and 5th requests are processed essentially at the same
> time: it will be a race condition with different outcomes
> depending on the actual order of the requests during limit_req
> zone updates.  For additional fun, this order is not the same as
> order of the requests in logs.
You're right, it's a good point. Given a short duration between 
requests, the behavior during transitions would not be deterministic and 
could 'flip flop' with regards to the excess value. This behavior could 
be harsh in the extreme rate jump scenario presented.
>
>> Should it not be expected that this user/key/state that has been
>> assigned 100r/s (even if just for one request) would have it's
>> outstanding excess requests recalculated (and in this case absolved) by
>> that rate increase?
>> You have after all assigned it more capacity.
>>
>> I'm not sure how to elegantly avoid this if it is an issue, since there
>> is no 'standard' rate to reference (it could interpolate over
>> time/requests, but that might be even more confusing).
> I don't think there is a good solution, and that's one of the
> reasons why there is no such functionality in limit_req.  Instead,
> it provides a fixed rate, and an ability to provide different
> burst limits for different requests (which is checked on a
> per-request basis), as well as multiple different limits to cover
> cases when multiple rates are needed.
I have a couple of ideas to solve this issue, that I plan to investigate 
further:

Interpolation - Instead of transitioning 'rate' in a single request 
straight to a new value, do it smoothly over time/requests.

In practical terms this would be achieved by interpolating the state's 
new rate with the previous rate, and using time between requests for 
that state (ms) as a scale factor.
A new command variable would be added to control the smoothness(the 
slope angle) of the interpolation. Since interpolation is an additive 
process, and considers time, this would smooth 'excess''s change and 
also mask the order of concurrent operations/requests (so no race).

Another option - Create a new state per unique 'key and rate' 
combination. This is effectively the same logical process(and memory 
required) as with the manual approach from your example (just 
dynamically determined at runtime in a single zone).

This means when a user (a given key) changes rate, a new state is 
created. The old state remains - to be cleaned up later via expire, and 
conversely if the user switches back to their original rate before 
expiry the old state will become active once again. This would mean 
concatenating key and rate's values for the hash used for the state (as 
it must be unique).

This would increase memory usage briefly during the transition periods 
due to the additional states and would not consider the other key + rate 
combo's state's excess value when moving between rates (again much the 
same as the manual approach).

The first option is most suitable to a 'fine grained' key (per user) 
rate limiting, whereas the second is more suitable for a 'broad based' 
key (one with lots of different rates).

>>> "Note well that the same limits now can be configured separately,
>>> in distinct shared memory zones.  This makes it possible to limit
>>> them requests consistently, without any unexpected deviations from
>>> the configured behaviour if requests with different rates
>>> interfere.
>>>
>>> Just in case, the very same behaviour can be implemented with
>>> multiple limit_req_zones, with something like this:
>>>
>>>      map $traffic_tier $limit_free {
>>>          free          $binary_remote_addr;
>>>      }
>>>
>>>      map $traffic_tier $limit_basic {
>>>          basic         $binary_remote_addr;
>>>      }
>>>
>>>      map $traffic_tier $limit_premium {
>>>          premium       $binary_remote_addr;
>>>      }
>>>
>>>      limit_req_zone $limit_free zone=free:10m rate=1r/m;
>>>      limit_req_zone $limit_basic zone=basic:10m rate=2r/m;
>>>      limit_req_zone $limit_premium zone=premium:10m rate=1r/s;
>>>
>>>      limit_req zone=free;
>>>      limit_req zone=basic;
>>>      limit_req zone=premium;
>>>
>>>  From the example it is not very clear how the suggested change is
>>> beneficial."
>> The issue with this approach (which is a similar set up to the one that
>> prompted this patch) is the scalability of the solution -  one zone is
>> required per unique rate, and the inability to assign arbitrary rates at
>> runtime.
> Yet the solution is exactly equivalent to the example provided.
>
> Further, given that the use case clearly do no match the intended
> usage of limit_req, which is designed to as a simple
> DoS-mitigation and bruteforce-mitigation solution, it looks pretty
> match trivial to implement.
It is trivial for a handful of states, however a single map can neatly 
contain 100s of such rate mappings. It seems preferable to have one or 
two maps with a single limit_req_zone & limit_req directive as opposed 
to 100x of each (as well as having to ensure each zone is allocated 
sufficient memory, as opposed to simply monitoring a single zone).
>> Perhaps a better example than the first one I gave to illustrate the
>> full utility would be with this scenario - involving a user sending a
>> JWT to determine the rate limit that is applied:
>>
>> The limit_req_zone is assigned with $jwt_claim_sub as the key and
>> $jwt_claim_rate as the rate.
>>
>> A user presents the following JWT with every request, his requests now
>> have a rate limit of 100r/s.
>> {
>>       ...
>>       "sub": "user01",
>>       "rate": "100r/s"
>> }
>> Mid-session, that user then acquires a new JWT to get increased speed -
>> the limit associated with this state/node is now 1000r/s with the same
>> key(sub). Note that these rate values do not need to be preset in the
>> nginx configuration, and it can all be done with a single limit_req_zone.
>> {
>>       ...
>>       "sub": "user01",
>>       "rate": "1000r/s"
>> }
>> Similar could also be done with key_val zone (ie. adding and modifying
>> rate strings at runtime within a key_val zone, and doing a lookup in the
>> key_val zone for that value based upon a user identifier or group as the
>> key).
> This looks exactly equivalent to the configuration previously
> discussed.
This short example is, but see the previous issue. In theory you could 
have a thousands of individual rates(tailored per user) with this method 
with zero additional nginx configuration.
>
> Note well that actually having proper nginx configuration for all
> the allowed rate values ensure that no invalid and/or arbitrary
> rates can be supplied by users, providing an additional
> consistency check.

I avoided mentioning 'untrusted request headers' (cookies) and went with 
JWT claims for this reason. Since you can cryptographically verify a 
JWTs authenticity, you know if it's been tampered with - you can 
determine if it still contains the original rate assigned by the 
administrator controlled application that generated the token).

It is the duty of the administrator to set an appropriate rate value in 
that token.

It's possible to handle JWT in pure NJS 
(https://github.com/lombax85/nginx-jwt) -  I'll put something together 
to illustrate this better with a full example.

>
> Also note that this is not how limit_req is expected to be used.
> As already mentioned above, limit_req is a DoS and bruteforce
> mitigation solution rather than a solution for billing
> enforcement.

I do agree - it an extension to it's intended purpose, however dynamic 
request rates is consistent with other capabilities in nginx.
https://nginx.org/en/docs/http/ngx_http_core_module.html#limit_rate

Perhaps the proposed changes is/will be to far removed from the original 
purpose of the module - in which case I will make the required 
alterations and resubmit it as a separate module for consideration.



More information about the nginx-devel mailing list