Finding similarity across multiple records using PHP

I am working on a web application that tracks help desk entries. We want to stop people from copying and pasting their notes on common issues; we want an original note to be recorded for every issue.

We have thousands of records, and some of them are similar. I'm trying to find a way to compare them all with each other and flag any record that is very similar to another; for example, a match of 80% or more is probably a direct copy.

I looked at similar_text() and several other built-in PHP functions, but I am interested to hear whether anyone has done something like this before. I do not believe I can use compare_text() effectively, since I need to compare many records against each other, not just two strings.

Any input is appreciated.

+3
3 answers

I think similar_text() will do what you want. As long as your machine has enough memory to handle the comparisons, it should work fine. Also look at levenshtein() and soundex().
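As a rough sketch of how that could look for many records rather than two strings: loop over every unique pair and pass a third argument to similar_text(), which receives the similarity as a percentage. The $notes array, the record IDs, the findSimilarPairs() name, and the 80% default are all made up for illustration; in a real application the notes would come from your database.

```php
<?php
// Hypothetical sample data: record ID => note text.
$notes = [
    101 => 'Reset the user password and asked them to log in again.',
    102 => 'Reset the user password and asked them to log in again!',
    103 => 'Replaced the toner cartridge in the second floor printer.',
];

/**
 * Return every pair of records whose similar_text() percentage
 * is at or above $threshold, as [idA, idB, percent] triples.
 * (findSimilarPairs is an illustrative helper, not a PHP built-in.)
 */
function findSimilarPairs(array $notes, float $threshold = 80.0): array
{
    $ids = array_keys($notes);
    $count = count($ids);
    $pairs = [];

    // Compare each record only with the records after it,
    // so every pair is checked exactly once.
    for ($i = 0; $i < $count; $i++) {
        for ($j = $i + 1; $j < $count; $j++) {
            $percent = 0.0;
            // The third argument is filled in by reference.
            similar_text($notes[$ids[$i]], $notes[$ids[$j]], $percent);
            if ($percent >= $threshold) {
                $pairs[] = [$ids[$i], $ids[$j], round($percent, 1)];
            }
        }
    }

    return $pairs;
}

foreach (findSimilarPairs($notes) as [$a, $b, $percent]) {
    echo "Records $a and $b are $percent% similar\n";
}
```

The number of pairs grows quadratically with the number of records, and the PHP manual describes similar_text() itself as cubic in the length of the longest string, so with thousands of notes this is probably best run as an offline batch job, or only against recent records, rather than on every page load.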

0

, Solr. , , , "" . Solr ( - ) , , , "" "" ..

, Solr, , .

0
0
