Fast bitwise comparison

I have two bytes that consist of two 4-bit numbers packed together. I need to know if one of the two numbers from the first byte matches any of the numbers from the second byte. Zero is considered zero and must not match.

Obviously, I can do this by unpacking the numbers and comparing them one at a time:

a = 0b10100101;
b = 0b01011111; // should have a match because 0101 is in both

a1 = a>>4; a2 = a&15;
b1 = b>>4; b2 = b&15;

return (a1 != 0 && (a1 == b1 || a1 == b2)) || (a2 != 0 && (a2 == b1 || a2 == b2));

//     ( true   && (  false  ||   false )) || ( true   && (  true   ||   false ))
//     ( true   &&         false         ) || ( true   &&         true          )
//            false                        ||         true
//                                        TRUE

However, I'm just wondering if anyone knows of a cleaner way to do this?

+5
source share
3 answers

A cleaner way is to get rid of this hard expression and make the code more readable.

def sameNybble (a, b):
    # Get high and low nybbles.

    ahi = (a >> 4) & 15 ; alo = a & 15;
    bhi = (b >> 4) & 15 ; blo = b & 15;

    # Only check ahi if non-zero, then check against bhi/blo

    if ahi != 0:
        if ahi == bhi or ahi == blo:
            return true

    # Only check alo if non-zero, then check against bhi/blo

    if alo != 0:
        if alo == bhi or alo == blo:
            return true

    # No match

    return false

Any decent optimizing compiler will basically give you the same basic code, so sometimes it’s better to optimize it for readability.

0

. - 16 ((a<<8)+b). 1 ( 8K), 8 ( 64K).

+2

++, 1,6 . , . true/false / ( ).

4 , , 16 , ( 16 ) , , nibble . AND , . AND nibble 0.

inline unsigned set( unsigned char X ) {
    return (1 << (X & 15)) | (1 << (X >> 4));
}

// Return true if a nibble in 'A' matches a non-null nibble in 'B'
inline bool match( unsigned char A, unsigned char B ) {
    return set( A ) & set( B ) & ~set( 0 );
}

I dated it to Intel Xeon X5570 @ 2.93 GHz, and it is 1.6 times faster than the original in question. Here is the code I used for its time:

#include <time.h>
#include <iostream>

bool original( unsigned char A, unsigned char B ) {
    unsigned char a1 = A >> 4;
    unsigned char a2 = A & 15;
    unsigned char b1 = B >> 4;
    unsigned char b2 = B & 15;

    return (a1 != 0 && (a1 == b1 || a1 == b2)) || (a2 != 0 && (a2 == b1 || a2 == b2));
}

static inline unsigned set( unsigned char X ) {
    return (1 << (X & 15)) | (1 << ((X >> 4)&15));
}

bool faster( unsigned char A, unsigned char B ) {
    return set( A ) & set( B ) & ~set( 0 );
}

class Timer {
    size_t _time;
    size_t & _elapsed;
    size_t nsec() {
        timespec ts;
        clock_gettime( CLOCK_REALTIME, &ts );
        return ts.tv_sec * 1000 * 1000 * 1000 + ts.tv_nsec;
    }
public:
    Timer(size_t & elapsed) : _time(nsec()), _elapsed(elapsed) {}
    ~Timer() { _elapsed = nsec() - _time; }
};

int main()
{
    size_t original_nsec, faster_nsec;
    const size_t iterations = 200000000ULL;
    size_t count = 0;

    { Timer t(original_nsec);
        for(size_t i=0; i < iterations; ++i) {
            count += original( 0xA5 , i & 0xFF );
        }
    }

    { Timer t(faster_nsec);
        for(size_t i=0; i < iterations; ++i) {
            count += faster( 0xA5 , i & 0xFF );
        }
    }

    std::cout << double(original_nsec) / double(faster_nsec)
              << "x faster" << std::endl;

    return count > 0 ? 0 : 1;
}

Here's the conclusion:

$ g++ -o match -O3 match.cpp -lrt && ./match
1.61564x faster
$ 
0
source

All Articles