Why does the LogLikelihoodSimilarity function return values greater than 1.0 for a dataset of 0s and 1s?

Question

I have a large set of Boolean preference data, with each preference expressed as 1.0, and I use the Tanimoto similarity function with the generic Boolean-preference recommenders for users and items. The recommendation scores typically fall between 0 and 1.0.

Many sources, such as the Mahout in Action book and this previous SO thread, recommend the LogLikelihoodSimilarity metric over Tanimoto for Boolean datasets. When I switched to the log-likelihood similarity metric, it produced some scores in a much higher range, for example 11. I had to go back to Tanimoto to get more sensible scores. Can you suggest any potential fixes, or am I misunderstanding the values returned with the recommendations?

+3

mahout collaborative-filtering similarity

infomofo Apr 16 '12 at 17:47


1 answer

Sean Owen · Accepted Answer · 2012-04-16T18:01:59+0000

, , . , 1.0 . , . [0,1] - .
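For reference, here is a minimal Java sketch of the two measures named in the question, computed from co-occurrence counts. It does not use Mahout; the class name `SimilaritySketch`, its method names, and the example counts are all hypothetical. The `1 - 1/(1 + LLR)` mapping is an assumption about how a raw log-likelihood ratio can be squashed into [0,1), and nothing here establishes where a score such as 11 comes from. It only shows that the Tanimoto coefficient is confined to [0,1] while the raw log-likelihood ratio has no upper bound.

```java
// Hypothetical illustration only; not Mahout's API.
final class SimilaritySketch {

    // x * ln(x), with the convention 0 * ln(0) = 0.
    static double xLogX(long x) {
        return x == 0 ? 0.0 : x * Math.log(x);
    }

    // Unnormalized entropy: xLogX(sum) - sum of xLogX(element).
    static double entropy(long... counts) {
        long sum = 0;
        double parts = 0.0;
        for (long c : counts) {
            sum += c;
            parts += xLogX(c);
        }
        return xLogX(sum) - parts;
    }

    // Raw log-likelihood ratio (G^2) for a 2x2 contingency table:
    // k11 = both, k12 = only A, k21 = only B, k22 = neither.
    static double logLikelihoodRatio(long k11, long k12, long k21, long k22) {
        double rowEntropy = entropy(k11 + k12, k21 + k22);
        double columnEntropy = entropy(k11 + k21, k12 + k22);
        double matrixEntropy = entropy(k11, k12, k21, k22);
        // Clamp tiny negative values caused by floating-point round-off.
        return Math.max(0.0, 2.0 * (rowEntropy + columnEntropy - matrixEntropy));
    }

    // Assumed squashing of the raw ratio into [0, 1).
    static double squash(double llr) {
        return 1.0 - 1.0 / (1.0 + llr);
    }

    // Tanimoto (Jaccard) coefficient: intersection over union.
    static double tanimoto(long both, long onlyA, long onlyB) {
        long union = both + onlyA + onlyB;
        return union == 0 ? 0.0 : (double) both / union;
    }

    public static void main(String[] args) {
        double llr = logLikelihoodRatio(10, 5, 5, 1000);
        System.out.println("Tanimoto:     " + tanimoto(10, 5, 5));
        System.out.println("Raw LLR:      " + llr);
        System.out.println("Squashed LLR: " + squash(llr));
    }
}
```

With these made-up counts the raw ratio is far above 1.0, while both the Tanimoto coefficient and the squashed value stay below 1.0.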
