Hash function for bits from 64 bits to 10 bits

Question

Hash function for bits from 64 bits to 10 bits

I want the hash function to take a long number (64 bits) and produce a result of 10 bits. What is the best hash function for this purpose. Inputs are basically addresses of variables (addresses have 64 bits or 8 bytes in Linux), so my hash function should be optimized for this purpose.

+5

c gcc linux x86-64 hash

Metallicpriest Jul 6 '12 at 9:25

source share

3 answers

rodrigo · Answer 1 · 2012-07-06T09:34:30+0000

I would say something like this:

uint32_t hash(uint64_t x)
{
    x >>= 3;
    return (x ^ (x>>10) ^ (x>>20)) & 0x3FF;
}

Minor 3 bits are not very useful, since most variables are 4-byte or 8-byte aligned, so we delete them. Then we take the next 30 bits and mix them (XOR) in blocks of 10 bits each.

Naturally, you can also take (x>>30)^(x>>40)^(x>>50), but I'm not sure that they will have any value in practice.

jthill · Answer 2 · 2012-07-06T11:45:43+0000

- mod by prime, 1021 - 10- . .

static inline int hashaddress(void *v)
{
        return (uintptr_t)v % 1021;
}

, , . ; , .

Arjunshankar · Answer 3 · 2012-07-06T14:54:53+0000

, , . 4 , 4 2 mallocs. . :

 20125e8
 20125e6
 20125e7
 20125e4
3fef2131
3fef2130
3fef212f
3fef212c
 25e4802
 25e4806

:

LSB ( ) 'on' 'off'. . 2 LSB .
, 8-10 . .
, 64- 48 .

:

/* Drop two LSBs.  */
a >>= 2;

/* Get rid of the MSBs. Keep 46 bits. */
a &= 0x3fffffffffff;

/* Get the 14 MSBs and fold them in to get a 32 bit integer.
The MSBs are mostly 0s anyway, so we don't lose much entropy.  */
msbs = (a >> 32) << 18;
a ^= msbs;

" " -, , . " " , :

uint32_t half_avalanche( uint32_t a)
{
    a = (a+0x479ab41d) + (a<<8);
    a = (a^0xe4aa10ce) ^ (a>>5);
    a = (a+0x9942f0a6) - (a<<14);
    a = (a^0x5aedd67d) ^ (a>>3);
    a = (a+0x17bea992) + (a<<7);
    return a;
}

10- 10 MSB uint32_t. - , N MSB - N, .

I was a little bored, so I wrote a toy test for this. Nothing out of the ordinary, it allocates a bunch of memory on the heap and tries to use the hash described above. The source may be here . Result:

1024 buckets, 256 values, 29 collisions
1024 buckets, 512 values, 103 collisions
1024 codes, 1024 values, 370 collisions

Next: I tried the other two hashes that are given here. Both of them have the same performance. Looks like: just choose the fastest;)

Hash function for bits from 64 bits to 10 bits

More articles: