How can I speed up this BIT_COUNT query for Hamming distance?

I have a PHP script that checks the Hamming distance between two photos taken from a security camera.

The MySQL table has 2.4M rows and consists of a key and four INT(10) columns. The INT(10) columns were indexed individually and together with the key, but I have no substantial evidence that any combination was faster than the others. I can try again if you suggest it.

The Hamming weights are calculated by reducing the image to 8x16 pixels (128 bits), and each quarter of the bits is stored in its own column: pHash0, pHash1, and so on.
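To make the storage layout concrete, here is a minimal Python sketch of splitting a 128-bit hash into four 32-bit words and computing the Hamming distance across them. The function names and the choice of putting the lowest 32 bits in the first word are assumptions made for illustration; the question does not say which quarter goes into which column.

```python
from typing import List

MASK32 = 0xFFFFFFFF  # keeps the low 32 bits of an integer


def split_hash(phash128: int) -> List[int]:
    """Split a 128-bit hash into four 32-bit words.

    Assumption: word 0 holds the lowest 32 bits (stand-in for pHash0).
    """
    return [(phash128 >> (32 * i)) & MASK32 for i in range(4)]


def hamming(a: List[int], b: List[int]) -> int:
    """Sum of popcount(x ^ y) over the word pairs, like the four Bit_Count terms."""
    return sum(bin(x ^ y).count("1") for x, y in zip(a, b))


if __name__ == "__main__":
    h1 = split_hash(0b1011)
    h2 = split_hash(0b0001)
    print(hamming(h1, h2))  # the two hashes differ in two bit positions
```

The sum over the four words equals the Hamming distance of the full 128-bit values, which is why the queries can add the four Bit_Count results.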

I have written this two ways. The first uses nested derived tables (subqueries). Theoretically, each level should have less data to check than its predecessor. The query is a prepared statement, and the ? fields are the pHash[0-3] values of the file I am checking.

Select
    `Key`,
    Bit_Count(T3.pHash3 ^ ?) + T3.BC2 As BC3
  From
    (Select
      *,
      Bit_Count(T2.pHash2 ^ ?) + T2.BC1 As BC2
    From
      (Select
        *,
        Bit_Count(T1.pHash1 ^ ?) + T1.BC0 As BC1
      From
        (Select
          `Key`,
          pHash0,
          pHash1,
          pHash2,
          pHash3,
          Bit_Count(pHash0 ^ ?) As BC0
        From
          files
        Where
          Not pHash0 Is Null And
          Bit_Count(pHash0 ^ ?) < 4) As T1
      Where
        Bit_Count(T1.pHash1 ^ ?) + T1.BC0 < 4) As T2
    Where
      Bit_Count(T2.pHash2 ^ ?) + T2.BC1 < 4) As T3
  Where
    Bit_Count(T3.pHash3 ^ ?) + T3.BC2 < 4

The second approach is a bit more direct. It simply does all of the work at once.

Select
    `Key`
  From
    files
  Where
    Not pHash0 Is Null And
    Bit_Count(pHash0 ^ ?) + Bit_Count(pHash1 ^ ?) +
    Bit_Count(pHash2 ^ ?) + Bit_Count(pHash3 ^ ?) < 4

The first query is faster on larger record sets and the second on smaller ones, but neither does better than roughly 1 1/3 seconds per comparison across the 2.4M records.

Do you see a way to tune this process to make it faster? Any suggestions, such as changing data types or indexes, can be tested quickly.

The setup is Win7x64, MySQL/5.6.6 InnoDB, nginx/1.99, php-cgi/7.0.0 (Zend).

EDIT:

, 4 32- 1 (16), 4 , 4 128- , php . , .

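Performance improved by ~500%. The idea: only compare rows whose bit count for pHash "A" is within +/- the threshold of the bit count for pHash "B".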

Credit for the idea goes to @duskwuff. Thanks, @duskwuff!

The final query:

Select
  files.`Key`, 
  Bit_Count(? ^ pHash0) + Bit_Count(? ^ pHash1) +
  Bit_Count(? ^ pHash2) + Bit_Count(? ^ pHash3) as BC
  From
    files FORCE INDEX (bitcount)
  Where
    bitCount Between ? And ? 
  AND Bit_Count(? ^ pHash0) + Bit_Count(? ^ pHash1) +
  Bit_Count(? ^ pHash2) + Bit_Count(? ^ pHash3) <= ?
  ORDER BY Bit_Count(? ^ pHash0) + Bit_Count(? ^ pHash1) +
  Bit_Count(? ^ pHash2) + Bit_Count(? ^ pHash3)

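Each group of 4 "?" is the four 32-bit hashes of the file being checked, the 2 "?" in the BETWEEN are that file's bit count +/- the threshold, and the remaining "?" is the threshold itself. The ORDER BY puts the closest matches first, so LIMIT 1 can be added to return only the best match. The bitcount index is a B-TREE.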
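Since the final query has fifteen positional placeholders, the binding order is easy to get wrong. The following Python sketch builds the parameter list in the order the "?" marks appear in the query text above. The helper name `build_params` is hypothetical, and it assumes the `bitCount` column stores the sum of the bit counts of the four hash columns, which the question implies but does not state outright.

```python
from typing import List


def popcount(x: int) -> int:
    """Number of set bits, the Python counterpart of MySQL's BIT_COUNT."""
    return bin(x).count("1")


def build_params(hashes: List[int], threshold: int) -> List[int]:
    """Return the 15 values for the final query, in placeholder order.

    Order: 4 hashes (SELECT), low and high bounds (BETWEEN),
    4 hashes (WHERE), threshold (<= ?), 4 hashes (ORDER BY).
    """
    if len(hashes) != 4:
        raise ValueError("expected four 32-bit hash words")
    total = sum(popcount(h) for h in hashes)  # assumed meaning of bitCount
    h = list(hashes)
    return h + [total - threshold, total + threshold] + h + [threshold] + h


if __name__ == "__main__":
    print(build_params([0b1, 0b11, 0, 0xFFFFFFFF], 3))
```

The same list could be passed to a prepared statement in PHP; only the ordering matters here.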

2,4 , 3 4 , 70 000 . 64 ( ), 3 20% (490 000 ), 0 2,8% (70 000, ).


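Observe that BIT_COUNT(a ^ b) is bounded below by the difference between BIT_COUNT(a) and BIT_COUNT(b). (That is, it is always at least that difference, and may be greater.) So if you precompute the total bit count for each row, you can use it to rule out rows whose total bit count is too far from the target's. Better yet, you can create an index on that column, and that index can actually be used.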
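For readers who want to see why BIT_COUNT(a ^ b) can never be smaller than the difference between BIT_COUNT(a) and BIT_COUNT(b), here is a short derivation. The shorthand $w(x)$ for BIT_COUNT(x) is my own notation, not the answer's.

```latex
\begin{aligned}
w(a \oplus b) &= w(a) + w(b) - 2\,w(a \wedge b) \\
              &\ge w(a) + w(b) - 2\min\bigl(w(a),\, w(b)\bigr) \\
              &= \lvert w(a) - w(b) \rvert
\end{aligned}
```

The first line holds because every bit set in both values is counted twice in $w(a) + w(b)$ but is cleared by the XOR. The second holds because the shared bits cannot outnumber the bits of either operand.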

What I have in mind would look something like this:

ALTER TABLE files ADD COLUMN totalbits INTEGER;
CREATE INDEX totalbits_index ON files (totalbits);

UPDATE files SET totalbits = BIT_COUNT(pHash0) + BIT_COUNT(pHash1)
                           + BIT_COUNT(pHash2) + BIT_COUNT(pHash3);

SELECT `Key` FROM files WHERE (totalbits BETWEEN … AND …) AND …
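The effect of the totalbits pre-filter can be checked outside the database. The sketch below is an in-memory Python stand-in, not the MySQL implementation: rows are `(key, hash, totalbits)` tuples, the function names are hypothetical, and a match is taken to mean a distance of at most `threshold`. It first narrows by the precomputed bit count, then runs the exact XOR comparison only on the survivors, and a brute-force scan is included to confirm that no match is lost.

```python
import random
from typing import List, Tuple

Row = Tuple[int, int, int]  # (key, 128-bit hash, precomputed totalbits)


def popcount(x: int) -> int:
    return bin(x).count("1")


def make_rows(n: int, seed: int = 0) -> List[Row]:
    """Random 128-bit hashes with their bit counts precomputed."""
    rng = random.Random(seed)
    hashes = (rng.getrandbits(128) for _ in range(n))
    return [(key, h, popcount(h)) for key, h in enumerate(hashes)]


def search_prefiltered(rows: List[Row], target: int, threshold: int):
    """Filter on totalbits first, then do the exact comparison."""
    t = popcount(target)
    candidates = [r for r in rows if t - threshold <= r[2] <= t + threshold]
    matches = [r[0] for r in candidates if popcount(r[1] ^ target) <= threshold]
    return matches, len(candidates)


def search_brute(rows: List[Row], target: int, threshold: int) -> List[int]:
    """Exact comparison against every row, for reference."""
    return [r[0] for r in rows if popcount(r[1] ^ target) <= threshold]


if __name__ == "__main__":
    rows = make_rows(5000)
    target = rows[42][1] ^ 0b101  # a near-duplicate of row 42, two bits flipped
    matches, scanned = search_prefiltered(rows, target, 3)
    print(matches == search_brute(rows, target, 3), scanned, len(rows))
```

In MySQL the first filter is what the index on totalbits serves, so the expensive BIT_COUNT expression is evaluated only for rows inside the window.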

