How does Nginx look-up cached resource?

Tue Sep 8 23:09:28 UTC 2015

On 08.09.2015 3:29, Sergey Brester wrote:

>> There is no obscurity here. Value of proxy_cache_key is known,
>> hash function is known, nginx sources is open and available.

> If value of proxy_cache_key is known and attackers can generate it,
> what do you want to protect with some hash value?

I want to protect the backend from a DDoS attack caused by nginx cache
poisoning, which in turn is caused by easily discoverable collisions
in MurmurHash.

So using MurmurHash in nginx is a bad idea, because it allows any
attacker to virtually turn off the nginx cache for any cache entry.

> If attacker can use any key - it's no matter which hash algorithm you
> have used (attacker can get entry).

The attacker can get the entry from the nginx cache, not from the
backend, and that is OK. The cache is fast, and the backend will not
be overloaded in this case.

For example, if the site www.examle.com has a popular page
http://www.examle.com/very-popular-page/ that is frequently served
to users from the cache, the site works fine.

But suppose an attacker requests another page, for example
http://www.examle.com/other-page/?some-text-wkjhwhgfwjefwje
and the hash of the proxy_cache_key value of this page
is the same as the hash of the proxy_cache_key value
of the page http://www.examle.com/very-popular-page/
Then, as you said previously, the nginx cache entry
will be replaced with the new page content.

And when any other user requests http://www.examle.com/very-popular-page/
nginx can't serve this request from the cache and must send it to the
backend, because the full key comparison says that the cache entry
contains a different page, not the requested one. In the same way the
nginx cache can be turned off for any set of the most popular pages on
the site. The backend will be under DDoS and the nginx cache can't
help, because it is poisoned by collisions.
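
The scenario above can be sketched as a toy model. This is not nginx
code: it assumes a cache that keeps one entry per hash value and
overwrites the entry when the full keys differ, as described earlier
in this thread. The names weak_hash, ToyCache, find_collision and
simulate are hypothetical, and a truncated CRC32 stands in for any
hash whose collisions are cheap to find.

```python
import zlib


def weak_hash(key: str) -> int:
    # Assumption: a 12-bit truncated CRC32 stands in for a hash with
    # easily discoverable collisions. It is not the hash nginx uses.
    return zlib.crc32(key.encode()) & 0xFFF


class ToyCache:
    """One entry per hash value; the full key is compared on lookup."""

    def __init__(self, hash_fn):
        self.hash_fn = hash_fn
        self.slots = {}        # hash value -> (full key, body)
        self.backend_hits = 0  # how often the backend had to render a page

    def fetch(self, key: str) -> str:
        slot = self.hash_fn(key)
        entry = self.slots.get(slot)
        if entry is not None and entry[0] == key:
            return entry[1]    # served from cache
        # Miss, or same hash with a different key: ask the backend
        # and overwrite whatever was stored under this hash value.
        self.backend_hits += 1
        body = "content of " + key
        self.slots[slot] = (key, body)
        return body


def find_collision(target: str, prefix: str, hash_fn) -> str:
    # Brute-force a query string whose hash equals the target's hash.
    wanted = hash_fn(target)
    n = 0
    while True:
        candidate = f"{prefix}{n}"
        if candidate != target and hash_fn(candidate) == wanted:
            return candidate
        n += 1


def simulate(rounds: int = 5) -> int:
    popular = "http://www.examle.com/very-popular-page/"
    cache = ToyCache(weak_hash)
    cache.fetch(popular)       # first request fills the cache
    cache.fetch(popular)       # second request is a cache hit
    evil = find_collision(
        popular, "http://www.examle.com/other-page/?", weak_hash)
    for _ in range(rounds):
        cache.fetch(evil)      # attacker evicts the popular page
        cache.fetch(popular)   # ordinary user now reaches the backend
    return cache.backend_hits


if __name__ == "__main__":
    print("backend hits:", simulate())
```

In this model every request for the popular page that follows an
attacker request goes to the backend, so the cache gives no protection
for that page.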

The root cause of this DDoS attack is the use of the insecure MurmurHash.

>>> Hash value should be used only for fast searching of hash key. Not to
>>> identify the cached resources!
>> You remember proposed solution from your message?
>> http://mailman.nginx.org/pipermail/nginx-devel/2015-September/007286.html
>> [1]
>> Attacker easily can provide DDoS attack against nginx in this case:
>> http://www.securityweek.com/hash-table-collision-attacks-could-trigger-ddos-massive-scale
>> [2]
>> Hash Table Vulnerability Enables Wide-Scale DDoS Attacks
>
> And what's stopping him to do the same with much safe hash function?

"Collision resistance is a property of cryptographic hash functions".
- https://en.wikipedia.org/wiki/Collision_resistance

> On the contrary, don't forget the generating of such hash values is cpu
> greedy also.

You can check it with a benchmark: openssl speed md5 sha1

A slow backend can take several seconds to generate *one* page.

>>> If your entry should be secure, the key (not it hash) should contain
>>> part of security token, authentication, salt etc.
>> This is "security through obscurity",
>> and you say, what this is bad thing.
> Wrong! Because if this secure parts in key are an internal nginx
> values/variables (authenticated user name, salt etc.), that the attacker
> never can use!

The default value of proxy_cache_key is $scheme$proxy_host$request_uri;
A frequently used proxy_cache_key value is $scheme$host$request_uri;

> He can theoretical use a variable part of key, to generate or brute some
> expected hash, equal with searched one, but the keys comparison makes
> all his attempts void.

This also makes the nginx cache void.

If the backend can't work without the nginx cache, the site will deny
service to customers, for example with 504 Gateway Timeout errors.

> 4) what do you want to do against it? If I understood it correct,
> instead of intensive load by page generation, you want shift the load
> into hash generation? Well, very clever. :)

Yes.

Doing sha1 for 3s on 1024 size blocks: 1955911 sha1's in 3.01s
Doing sha1 for 3s on 8192 size blocks:  289605 sha1's in 3.00s

The backend can generate only two pages in 3.00s.

Compare 2 with 289605 and you will see a difference in throughput
of roughly 100,000 times.
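
The comparison above can be checked with a few lines of arithmetic.
The sha1 figure is taken from the openssl speed output quoted above,
and the backend figure is the two pages per 3.00s stated above. The
variable names are only illustrative.

```python
# Figures from this message: sha1 on 8192-byte blocks, and a slow backend.
sha1_ops = 289605      # sha1 computations completed
sha1_seconds = 3.00    # in this many seconds
pages = 2              # pages the backend can generate
page_seconds = 3.00    # in this many seconds

sha1_rate = sha1_ops / sha1_seconds   # hashes per second
page_rate = pages / page_seconds      # pages per second
ratio = sha1_rate / page_rate         # hashes computed per page generated

print(f"sha1: {sha1_rate:,.0f}/s, backend: {page_rate:.2f} pages/s")
print(f"ratio: about {ratio:,.0f} to 1")
```

The ratio comes out at about 145,000 to 1, which is the same order of
magnitude as the rough figure of 100,000.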

>> Hash table implementations vulnerable to algorithmic complexity attacks
> That means the pure hashing, without keys comparison

With key comparison too, because insecure hash collisions let an
attacker easily "convert" the hash table into a linked list.
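
A minimal sketch of that effect, assuming a hash table with separate
chaining. The additive hash below is a deliberately weak stand-in, not
MurmurHash, and all function names are hypothetical. It counts key
comparisons while inserting keys that an attacker has chosen to share
one hash value.

```python
import hashlib


def additive_hash(key: str) -> int:
    # Deliberately weak: any rearrangement of the same bytes collides.
    return sum(key.encode())


def sha1_hash(key: str) -> int:
    # First 8 bytes of SHA-1 as an integer; collisions are hard to craft.
    return int.from_bytes(hashlib.sha1(key.encode()).digest()[:8], "big")


def count_comparisons(keys, hash_fn, buckets: int = 1024) -> int:
    """Insert keys into a chained hash table and count key comparisons."""
    table = [[] for _ in range(buckets)]
    comparisons = 0
    for key in keys:
        chain = table[hash_fn(key) % buckets]
        found = False
        for existing in chain:      # walk the chain, comparing full keys
            comparisons += 1
            if existing == key:
                found = True
                break
        if not found:
            chain.append(key)
    return comparisons


def attacker_keys(n: int):
    # n distinct keys with identical byte sums: one "b" among n-1 "a"s.
    return ["a" * i + "b" + "a" * (n - 1 - i) for i in range(n)]


if __name__ == "__main__":
    keys = attacker_keys(500)
    print("weak hash:  ", count_comparisons(keys, additive_hash))
    print("sha1 prefix:", count_comparisons(keys, sha1_hash))
```

With the weak hash all 500 keys land in one chain, so inserting them
costs 500*499/2 = 124750 comparisons, which is the behaviour of a
linked list. The same keys under the SHA-1 based hash spread over many
buckets and need far fewer comparisons.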

-- 
Best regards,
  Gena