I asked a similar question about the string.GetHashCode() method in .NET. From that, I learned that we cannot rely on the implicit hash code implementation of built-in types if the values are used across different machines. I therefore assume that Java's String.hashCode() is also unstable across different hardware configurations and may behave differently between virtual machines (not forgetting that there are different VM implementations).
We are currently discussing how to safely convert strings to numbers in Java by hashing. The hash algorithm must be stable across the nodes of a cluster and fast to evaluate, since it will be called at a high frequency. My teammates insist on the built-in hashCode method, and I need some solid arguments to get them to consider a different approach. So far I can only think of differences in machine configuration (x86 vs. x64), possibly different JVM vendors on some machines (hardly applicable in our case), and byte order differences depending on the machine the process is started on. Character encoding probably needs to be considered as well.
Although all of these come to mind, I am not 100% sure that any of them is a strong enough reason, and I would appreciate your expertise and experience in this area. It would help me build a stronger case for writing a custom hashing algorithm. I would also be grateful for advice on what to avoid when implementing it.
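As a point of comparison, a hash that pins down both the character encoding and the arithmetic could be sketched roughly as follows. This is only an illustration under assumptions: FNV-1a (32-bit) is a stand-in choice, not something the team has agreed on, and the class and method names (`StableHash`, `fnv1a32`) are hypothetical.

```java
import java.nio.charset.StandardCharsets;

// Hypothetical helper: 32-bit FNV-1a over the UTF-8 bytes of a string.
final class StableHash {
    // Standard 32-bit FNV parameters.
    private static final int FNV_OFFSET_BASIS = 0x811c9dc5;
    private static final int FNV_PRIME = 0x01000193;

    private StableHash() {}

    static int fnv1a32(String s) {
        // Fix the encoding explicitly instead of using the platform default charset.
        byte[] bytes = s.getBytes(StandardCharsets.UTF_8);
        int hash = FNV_OFFSET_BASIS;
        for (byte b : bytes) {
            hash ^= (b & 0xff);  // treat each byte as unsigned
            hash *= FNV_PRIME;   // int multiplication wraps modulo 2^32
        }
        return hash;
    }
}
```

Calling `StableHash.fnv1a32(key)` depends only on the UTF-8 bytes of `key` and on Java's `int` arithmetic, which the language defines to wrap on overflow, so word size and byte order of the host should not enter into the result.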