How does Nginx look-up cached resource?

Sergey Brester serg.brester at sebres.de
Fri Sep 4 21:00:58 UTC 2015


On 04.09.2015 21:43, Maxim Dounin wrote:

> No one yet happened. And likely won't ever happen, as md5 is a
> good hash function 128 bits wide, and it took many years to find
> even a single collision of md5.

You are confusing "good" in the sense of resistance to "collision-search 
algorithms" with "good" in the sense of the "probability that a collision 
can occur". An estimate of collisions in the sense of a "collision-search 
algorithm" and the like implies that the hashed string is unknown; it 
estimates, for example, attacks that try to find it (brute force, 
chosen-prefix, etc.).

I'm talking about the probability that two different cache keys get the 
same hash.
In addition, because of the so-called birthday problem 
(https://en.wikipedia.org/wiki/Birthday_problem), this probability rises 
to a level at least comparable to that of a 64-bit hash, even for truly 
random data (of varying length).
Don't forget that our keys, which will be hashed, are not really "random" 
data - most of the time they contain only specific characters and/or have 
a specific length.
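
As a rough illustration of the birthday bound mentioned above, here is a 
minimal Python sketch. It assumes the hash output is uniformly distributed 
over 2^128 values, which is an idealization: as noted, real cache keys are 
not random data, so this says nothing about structured keys. The function 
names are hypothetical and are not part of nginx.

```python
import math

def birthday_collision_probability(num_keys: int, hash_bits: int = 128) -> float:
    """Approximate P(at least one collision) among num_keys keys,
    ASSUMING hash values are uniform over 2**hash_bits outputs."""
    space = 2.0 ** hash_bits
    # Birthday approximation: p ~= 1 - exp(-n(n-1) / (2N)).
    # expm1 keeps precision when the exponent is very small.
    return -math.expm1(-num_keys * (num_keys - 1) / (2.0 * space))

def keys_for_probability(p: float, hash_bits: int = 128) -> float:
    """Approximate number of keys needed to reach collision probability p
    (same uniformity assumption)."""
    space = 2.0 ** hash_bits
    return math.sqrt(2.0 * space * math.log(1.0 / (1.0 - p)))

if __name__ == "__main__":
    for n in (10**6, 10**9, 10**12, 2**64):
        p = birthday_collision_probability(n)
        print(f"{n:.3e} keys -> p ~= {p:.3e}")
    half = keys_for_probability(0.5)
    print(f"keys needed for p = 0.5: about 2^{math.log2(half):.2f}")
```

Under this assumption, a 50% chance of at least one collision needs on the 
order of 2^64 keys, which is the "comparable to 64 bit" level referred to 
above.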

So the probability that a collision will occur is still significantly 
larger (a billion billion times larger).

> And even if it'll happen, we have
> crc32 check in place to protect us.

Very funny... What do you base such conclusions on?
And last but not least, if you still haven't seen a collision in the 
sense of md5 "protected" by crc32, how can you be sure that it has not 
already occurred?

For example, how large would you estimate the probability of a collision 
to be if my keys contain exactly 32 characters in the range 
[0-9A-Za-z]? And its frequency? Just the approximate order of magnitude...
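
To put numbers on the key format in this example, a small Python sketch 
compares the size of the key space with the size of a 128-bit hash space. 
It shows only that colliding keys must exist (pigeonhole principle); it 
does not estimate how often a real cache would hit one. The constant names 
are illustrative.

```python
import math

ALPHABET_SIZE = 62   # characters in [0-9A-Za-z]
KEY_LENGTH = 32      # exactly 32 characters per key
HASH_BITS = 128      # md5 output width

key_space = ALPHABET_SIZE ** KEY_LENGTH   # number of distinct possible keys
hash_space = 2 ** HASH_BITS               # number of distinct md5 values

# Size of the key space expressed in bits.
key_bits = KEY_LENGTH * math.log2(ALPHABET_SIZE)

# Average number of possible keys mapping to one hash value,
# if the keys were spread evenly over the hash space.
avg_keys_per_hash = key_space // hash_space

if __name__ == "__main__":
    print(f"key space is about 2^{key_bits:.1f}")
    print(f"hash space is 2^{HASH_BITS}")
    print(f"avg keys per hash value: about 2^{math.log2(avg_keys_per_hash):.1f}")
```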

Regards,
sebres.


