Different LOF implementation results in ELKI and RapidMiner

I wrote my own LOF implementation, and I'm trying to compare the results with the ELKI and RapidMiner implementations, but all 3 give different results! I am trying to understand why.

My reference dataset is one-dimensional: 102 real values with many duplicates. I will post it below.

First, the RapidMiner implementation. Its LOF scores are very different from both ELKI's and mine; many come back with a LOF of infinity. Has this implementation been validated as correct?

My results are similar to ELKI, but I do not get exactly the same LOF values. From a quick review of the comments in the ELKI source code, I think this may be due to differences in the way k-neighborhoods are computed.

In the LOF paper, the MinPts parameter (elsewhere referred to as k) specifies the minimum number of points to be included in the k-neighborhood. In the ELKI implementation, I believe the k-neighborhood is defined as exactly k points, rather than all points within the k-distance or the k-distinct distance. Can someone confirm exactly how ELKI builds the k-neighborhood? There is also a private variable that allows the point itself to be included in its own neighborhood, but by default it appears not to be included.

Does anyone know of a public reference dataset that has LOF scores attached for validation?
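To make the three candidate neighborhood definitions concrete, here is a minimal Python sketch on made-up toy data (the function name `neighborhoods` and the values are hypothetical, and nothing here is taken from the ELKI or RapidMiner source). It only illustrates how the neighborhood size can differ when duplicates are present:

```python
def neighborhoods(values, i, k):
    """Return three candidate k-neighborhoods of values[i] as index lists."""
    # Distances from point i to every other point, nearest first.
    d = sorted((abs(values[i] - v), j) for j, v in enumerate(values) if j != i)

    # (a) exactly the k nearest points (ties broken arbitrarily)
    exactly_k = [j for _, j in d[:k]]

    # (b) all points within the k-distance, i.e. the distance to the k-th nearest point
    k_dist = d[k - 1][0]
    within_k_dist = [j for dist, j in d if dist <= k_dist]

    # (c) all points within the k-th *distinct* distance
    distinct = sorted({dist for dist, _ in d})
    k_distinct_dist = distinct[k - 1]
    within_k_distinct = [j for dist, j in d if dist <= k_distinct_dist]

    return exactly_k, within_k_dist, within_k_distinct


toy = [1.0, 2.0, 2.0, 2.0, 3.0, 3.0, 5.0]  # hypothetical data with duplicates
sizes = [len(n) for n in neighborhoods(toy, 0, 2)]
print(sizes)  # [2, 3, 5]
```

With k = 2, the same query point gets 2, 3, or 5 neighbors depending on the definition, which is enough to change every downstream reachability distance and LOF score.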

--- more details follow ---

Reference: the ELKI source code is here:

http://elki.dbs.ifi.lmu.de/browser/elki/trunk/src/de/lmu/ifi/dbs/elki/algorithm/outlier/lof/LOF.java

The source code for RapidMiner is here:

http://code.google.com/p/rapidminer-anomalydetection/source/browse/trunk/src/de/dfki/madm/anomalydetection/evaluator/nearest_neighbor_based/LOFEvaluator.java

Here is my test dataset:

4.32323 5.12595 5.12595 5.12595 5.12595 5.7457 5.7457 5.7457 5.7457 5.7457 5.7457 5.97766 5.97766 6.07352 6.07352 6.12015 6.12015 6.12015 6.44797 6.44797 6.48131 6.48131 6.48131 6.48131 6.48131 6.48131 6.6333 6.6333 6.6333 6.70872 6.70872 6.70872 6.70872 6.70872 6.77579 6.77579 6.77579 6.77579 6.77579 6.77579 6.77579 6.77579 6.77579 6.77579 6.77579 6.77579 6.77579 6.77579 6.77579 7.03654 7.03654 7.03654 7.03654 7.03654 7.03654 7.03654 7.03654 7.03654 7.03654 7.03654 7.03654 7.03654 7.03654 7.03654 7.10361 7.10361 7.10361 7.10361 7.10361 7.10361 7.10361 7.10361 7.15651 7.15651 7.15651 7.15651 7.15651 7.15651 7.15651 7.15651 8.22598 8.22598 8.22598 8.22598 8.5538 8.5538 8.5538 8.5538 8.5538 8.5538 8.5538 8.5538 8.5538 8.5538 8.5538 8.5538 8.5538 8.5538 8.5538 8.5538 8.5538 8.5538

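For example, for the first value (4.32323) I get the following LOF scores:

  • RapidMiner: (with the lower/upper bounds for MinPts set to 10 and 100)
  • ELKI: 2.6774 (with k = 10 and distfunction/reachdist left at their defaults)
  • My implementation: 1.9531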

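Some more detail on what my implementation does:

  • MinPts is 10, so I look for the 10 distinct nearest neighbors. The neighborhood of 4.32323 therefore contains 48 points, from 5.12595 to 6.77579.
  • This gives a k-distinct distance of 2.45256
  • The reachability distance to the first neighbor is 1.58277
  • The LRD of the point is 1/(99.9103/48)
  • The sum of lrd(o)/lrd(p) over all 48 neighbors is 93.748939
  • Dividing by 48 gives a LOF of 1.9531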
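The steps listed above can be sketched in Python as follows. This is an illustrative reading of the description, not the asker's actual code: the names `k_distance`, `neighbors`, `reach_dist`, `lrd` and `lof` are hypothetical, and it assumes that duplicates at distance 0 count as one distinct distance and that the query point itself is excluded. Under those assumptions it reproduces the neighborhood size of 48, the k-distinct distance of 2.45256 and the reachability distance of 1.58277.

```python
from functools import lru_cache

# (value, count) pairs that expand to the 102-value dataset from the question.
COUNTS = [(4.32323, 1), (5.12595, 4), (5.7457, 6), (5.97766, 2), (6.07352, 2),
          (6.12015, 3), (6.44797, 2), (6.48131, 6), (6.6333, 3), (6.70872, 5),
          (6.77579, 15), (7.03654, 15), (7.10361, 8), (7.15651, 8),
          (8.22598, 4), (8.5538, 18)]
DATA = [v for v, n in COUNTS for _ in range(n)]
K = 10  # MinPts


def dist(a, b):
    # Rounded so that floating-point noise does not create spurious distinct distances.
    return round(abs(a - b), 9)


@lru_cache(maxsize=None)
def k_distance(i):
    """K-th smallest *distinct* distance from point i to the other points."""
    distinct = sorted({dist(DATA[i], DATA[j]) for j in range(len(DATA)) if j != i})
    return distinct[K - 1]


def neighbors(i):
    """All other points within the k-distinct distance of point i."""
    kd = k_distance(i)
    return [j for j in range(len(DATA)) if j != i and dist(DATA[i], DATA[j]) <= kd]


def reach_dist(p, o):
    """Reachability distance of p with respect to o."""
    return max(k_distance(o), dist(DATA[p], DATA[o]))


def lrd(p):
    """Local reachability density: inverse of the mean reachability distance."""
    nb = neighbors(p)
    return len(nb) / sum(reach_dist(p, o) for o in nb)


def lof(p):
    """Mean ratio of the neighbors' lrd to the lrd of p."""
    nb = neighbors(p)
    return sum(lrd(o) for o in nb) / (len(nb) * lrd(p))


print(len(neighbors(0)), k_distance(0), reach_dist(0, 1))
print(lof(0))
```

Swapping `k_distance` for a version that uses the k-th nearest point instead of the k-th distinct distance is a quick way to see how much the final score depends on the neighborhood definition.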

, . Weka LOF, , , .

: , RapidMiner , . , , !

. .

, - , , .

ELKI , .

, 100% , "" . :

  • : A) , B) drop, C)

    C , A ( ) - , . B - , . " ", ?

  • k k-.

    k- - , , k . , 5.7457: 5.7457 - 4.32323 10 .

    k , k. , kth! , RapidMiner k, LOF (. 4 LOF!)

    k ( , , k ), , k-ths . ""?

    3 4 LOF LON- kNN.

    48 , , .

  • , , minPts ( , LOF 1.0)

    , Rapidminer.

: , . .

- k- , ( ) reach-dist(x[0], x[1]) = max(5.97766 - 5.12595, 5.12595 - 4.32323) = 0.80272

outlier , LOF. , LOF. , LOF .


RapidMiner [1] ( LOF), . LOF . , ELKI. , ELKI, . ( , k + 1), ELKI . ( k-)

,  Hans

[1] http://marketplace.rapid-i.com/UpdateServer/faces/product_details.xhtml?productId=rmx_anomalydetection
