Sentiment analysis with Mahout

Using Mahout, I can classify the sentiment of my data, but I'm stuck with a confusion matrix.

I use the Mahout 0.7 naive Bayes algorithms to classify the sentiment of tweets. I use the naive Bayes trainnb and testnb commands to train the classifier and to classify the sentiment of tweets as "positive", "negative" or "neutral".

Example positive training set

      'positive','i love my i phone'
      'positive','it pleasure to have i phone'

In the same way, I prepared negative and neutral training samples. It is a huge data set.

The test tweets I provide do not include a sentiment label.

      'it is nice model'
      'simply fantastic'

I can run the Mahout classification algorithm, and it reports the classified instances as a confusion matrix.

What I need instead is the predicted label for each tweet. Expected output:

      'negative','very bad btr life time'
      'positive','i phone has excellent design features'

How can I get output in this format from Apache Mahout for my Twitter data?

+5
2

You can load the trained Naive Bayes model in your own code and classify each document yourself:

Parameters p = new Parameters();
p.set("basePath", modelDir.getCanonicalPath());
Datastore ds = new InMemoryBayesDatastore(p);
Algorithm a = new BayesAlgorithm();
ClassifierContext ctx = new ClassifierContext(a,ds);
ctx.initialize();

....

ClassifierResult result = ctx.classifyDocument(tokens, defaultCategory);
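
The snippet above does not show how tokens is built. A minimal sketch, assuming classifyDocument takes the document as an array of token strings and that the model was trained on lowercased words. The class TweetTokenizer and its tokenize method are hypothetical helpers, not part of Mahout:

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Locale;

public class TweetTokenizer {

    // Hypothetical helper (not a Mahout API): lowercases a tweet and splits
    // it on runs of characters that are not letters, digits or apostrophes.
    public static String[] tokenize(String tweet) {
        List<String> tokens = new ArrayList<>();
        for (String part : tweet.toLowerCase(Locale.ROOT).split("[^a-z0-9']+")) {
            // split() can yield an empty first element when the tweet
            // starts with a separator, so skip empty parts
            if (!part.isEmpty()) {
                tokens.add(part);
            }
        }
        return tokens.toArray(new String[0]);
    }

    public static void main(String[] args) {
        // Prints one token per line for a sample test tweet
        for (String token : tokenize("It is nice model")) {
            System.out.println(token);
        }
    }
}
```

The tokenization used at classification time should match whatever was used when the training data was prepared, otherwise the features will not line up with the model.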


+3

The code below is what I use to classify a single instance with Mahout. Note that it is written for version 0.7.

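import java.io.IOException;

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.Path;
import org.apache.mahout.classifier.naivebayes.AbstractNaiveBayesClassifier;
import org.apache.mahout.classifier.naivebayes.NaiveBayesModel;
import org.apache.mahout.classifier.naivebayes.StandardNaiveBayesClassifier;
import org.apache.mahout.math.RandomAccessSparseVector;
import org.apache.mahout.math.Vector;
import org.apache.mahout.vectorizer.encoders.AdaptiveWordValueEncoder;
import org.apache.mahout.vectorizer.encoders.FeatureVectorEncoder;

public void classify(String modelLocation, RawEntry unclassifiedInstanceRaw) throws IOException {

    Configuration conf = new Configuration();

    // Load the model written by trainnb and wrap it in a classifier
    NaiveBayesModel model = NaiveBayesModel.materialize(new Path(modelLocation), conf);
    AbstractNaiveBayesClassifier classifier = new StandardNaiveBayesClassifier(model);

    // Space-separated features, e.g. "word1 word2 word3 word4"
    String unclassifiedInstanceFeatures = RawEntry.toNaiveBayesTrainingFormat(unclassifiedInstanceRaw);
    String[] features = unclassifiedInstanceFeatures.split(" ");

    FeatureVectorEncoder vectorEncoder = new AdaptiveWordValueEncoder("features");
    vectorEncoder.setProbes(1); // my features vectors are tiny

    Vector unclassifiedInstanceVector = new RandomAccessSparseVector(features.length);

    // Encode each feature into the vector
    for (String feature : features) {
        vectorEncoder.addToVector(feature, unclassifiedInstanceVector);
    }

    // One score per class
    Vector classificationResult = classifier.classifyFull(unclassifiedInstanceVector);

    System.out.println(classificationResult.asFormatString());

}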

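Notes:

1) The input, modelLocation, is the location of the model produced by trainnb. It is the path given with the -o parameter when running trainnb, and the model is a .bin file.

2) The model is used to create a naive Bayes classifier (here a StandardNaiveBayesClassifier).

3) RawEntry is my own class that wraps a raw instance. toNaiveBayesTrainingFormat returns its features as a single space-separated string, such as "word1 word2 word3 word4".

4) The features are encoded into a Mahout Vector, because the classifier takes a Vector as input.

5) The vector is passed to the classifier.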

The output is a Vector with one score per class. I then did the following:

1) , ,

2) ( , StandardNaiveBayesClassifier )

3) , , ,

4) jC.set("mapreduce.textoutputformat.separator", ","); where jC is the JobConf. This sets the separator of the MapReduce text output to ",".
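
Turning the score Vector into the 'label','text' format from the question could look roughly like the sketch below. It is plain Java with no Mahout dependency: the scores are assumed to have been copied out of the result Vector into a double[] (for example with Vector.get(i)), the labels array is assumed to be in the same order as the label index used during training, and the class ScoreToLabel is hypothetical. It also assumes that the largest score marks the predicted class, which is worth verifying against testnb output for your Mahout version.

```java
public class ScoreToLabel {

    // Returns the index of the largest score. Treating the largest score as
    // the predicted class is an assumption; check it for your Mahout version.
    public static int bestIndex(double[] scores) {
        int best = 0;
        for (int i = 1; i < scores.length; i++) {
            if (scores[i] > scores[best]) {
                best = i;
            }
        }
        return best;
    }

    // Formats a tweet in the 'label','text' style used in the question.
    // labels[i] is assumed to be the class name for scores[i].
    public static String tag(String[] labels, double[] scores, String tweet) {
        return "'" + labels[bestIndex(scores)] + "','" + tweet + "'";
    }

    public static void main(String[] args) {
        String[] labels = {"negative", "neutral", "positive"}; // hypothetical order
        double[] scores = {-9.1, -7.4, -3.2};                   // made-up scores
        System.out.println(tag(labels, scores, "simply fantastic"));
    }
}
```

With the made-up scores in main, the third entry is the largest, so the tweet is tagged with the third label.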

Again, this applies to Mahout 0.7.

+1
