Find overlaps among a set of strings of equal length?

There are 1 million strings of equal length (short strings). For instance:

abcdefghi

fghixyzyz

ghiabcabc

zyzdddxfg

...

I want to find the pairwise overlap of two strings. The overlap of A "abcdefghi" and B "fghixyzyz" is "fghi": the longest suffix of A that is also a prefix of B.
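To make the definition concrete, here is a minimal brute-force sketch for a single pair. The function name `overlap` is only illustrative, and treating a full-length match as a valid overlap is an assumption, not something stated above.

```python
def overlap(a: str, b: str) -> str:
    """Return the longest suffix of a that is also a prefix of b."""
    # Try the longest candidate first and stop at the first match.
    # Assumption: a full-length match (a == b) counts as an overlap.
    for k in range(min(len(a), len(b)), 0, -1):
        if a[-k:] == b[:k]:
            return b[:k]
    return ""  # no non-empty overlap


print(overlap("abcdefghi", "fghixyzyz"))  # → fghi
```

This costs O(L^2) character comparisons per pair for strings of length L, so running it on every pair of 1 million strings is what needs to be avoided.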

Is there an efficient algorithm that can find the overlap of any two strings in the set?

1 answer

One efficient way is to build a generalized suffix tree for the set of strings. To find the overlap between strings x and y:

y . node , x, , - x- > y.

. . 137 ( " - " ) " , ".

: (/ ) .
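As a rough illustration of the all-pairs problem, the sketch below uses a plain dictionary of prefixes as a simpler stand-in. It is not a suffix tree, and it trades memory for simplicity. The name `all_overlaps` is hypothetical, and the sketch assumes that only proper overlaps (shorter than the whole string) between different strings are wanted.

```python
from collections import defaultdict


def all_overlaps(strings):
    """Map (i, j) to the length of the longest proper suffix of
    strings[i] that is also a prefix of strings[j], for i != j."""
    # Index every non-empty proper prefix by the strings that start with it.
    by_prefix = defaultdict(list)
    for j, s in enumerate(strings):
        for k in range(1, len(s)):
            by_prefix[s[:k]].append(j)

    best = {}
    for i, s in enumerate(strings):
        # Longest suffixes first, so the first hit for a pair is its maximum.
        for k in range(len(s) - 1, 0, -1):
            for j in by_prefix.get(s[-k:], ()):
                if i != j:
                    best.setdefault((i, j), k)
    return best


strings = ["abcdefghi", "fghixyzyz", "ghiabcabc", "zyzdddxfg"]
for (i, j), k in sorted(all_overlaps(strings).items()):
    print(strings[i], strings[j], strings[j][:k])
```

Pairs with no overlap are simply absent from the result. With n strings of length L the index holds about n * L prefix entries, which is the kind of cost a suffix tree is meant to reduce.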
