My problem is this. I have a block of data. Sometimes this block gets updated and a new, modified version appears. I need to detect whether the data I'm looking at matches the version I expect to receive.
I decided to use a fingerprint so that I don't have to store the "expected" version of the data in full. It seems that the default choice for this kind of thing is an MD5 hash.
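To make the setup concrete, here is a minimal sketch of the check I have in mind. It uses `hashlib.md5` from the Python standard library as a stand-in for whichever hash I end up choosing; the names `fingerprint` and `matches_expected` and the sample data are purely illustrative, not part of any real system.

```python
import hashlib


def fingerprint(data: bytes) -> bytes:
    """Return a 16-byte MD5 digest used as the block's fingerprint."""
    return hashlib.md5(data).digest()


def matches_expected(data: bytes, expected: bytes) -> bool:
    """Check a received block against the stored fingerprint."""
    return fingerprint(data) == expected


# Store only the fingerprint of the version I expect, not the data itself.
original = b"block of data, version 1"
expected = fingerprint(original)

# Later, a block arrives and is compared against the stored fingerprint.
modified = b"block of data, version 2"
print(matches_expected(original, expected))  # True
print(matches_expected(modified, expected))  # False
```

Swapping in a different hash would only change the body of `fingerprint`; the comparison logic stays the same.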
However, MD5 was designed to be cryptographically secure, and there are much faster hash functions. I am looking at modern non-cryptographic functions like CityHash and SpookyHash.
Since I control all the data in my system, I only care about random collisions, where a modified block of data happens to hash to the same value as the expected one. Therefore, I don't think I need to worry about collisions deliberately constructed by an attacker, and I may be able to get away with a simpler hash function.
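As a rough way to quantify the random-collision risk, here is a back-of-the-envelope sketch. It assumes the hash behaves like an ideal function with uniformly distributed output, which is an assumption on my part and not a property I know any particular non-cryptographic hash to have. Under that assumption, each modified block has a 2^-bits chance of matching the expected fingerprint. The function name and the example numbers are hypothetical.

```python
import math


def false_match_probability(num_modified_versions: int, hash_bits: int) -> float:
    """Chance that at least one modified block matches the expected
    fingerprint, ASSUMING an ideal hash with uniform hash_bits-bit output."""
    per_check = 2.0 ** -hash_bits
    # 1 - (1 - p)^k, computed with log1p/expm1 to stay accurate for tiny p.
    return -math.expm1(num_modified_versions * math.log1p(-per_check))


# Compare a 64-bit and a 128-bit fingerprint over a billion modified versions.
for bits in (64, 128):
    print(bits, false_match_probability(10**9, bits))
```

For small probabilities the result is approximately the number of modified versions divided by 2^bits, so each extra bit of output halves the risk.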
Is there a problem with using a hash function like CityHash or SpookyHash for this purpose, or should I just stick with MD5? Or should I use something designed specifically for fingerprinting, such as a Rabin fingerprint?