I am trying to integrate OpenNLP into a map-reduce job on Hadoop, starting with some basic sentence splitting. Inside the map function, the following code is executed:
    public AnalysisFile analyze(String content) {
        InputStream modelIn = null;
        String[] sentences = null;
        logger.info("sentenceModelPath: " + sentenceModelPath);
        try {
            modelIn = getClass().getResourceAsStream(sentenceModelPath);
            SentenceModel model = new SentenceModel(modelIn);
            SentenceDetectorME sentenceBreaker = new SentenceDetectorME(model);
            sentences = sentenceBreaker.sentDetect(content);
        } catch (FileNotFoundException e) {
            logger.error("Unable to locate sentence model.");
            e.printStackTrace();
        } catch (IOException e) {
            e.printStackTrace();
        } finally {
            if (modelIn != null) {
                try {
                    modelIn.close();
                } catch (IOException e) {
                }
            }
        }
        logger.info("number of sentences: " + sentences.length);
        <snip>
    }
When I run my job, I get an error in the log saying "in must not be null!" (source of the class throwing the error), which means that somehow I cannot open the InputStream for the model. Other tidbits:
- I checked that the model file exists at the location given by sentenceModelPath.
- I added Maven dependencies for opennlp-maxent:3.0.2-incubating, opennlp-tools:1.5.2-incubating and opennlp-uima:1.5.2-incubating.
- Hadoop is only running on my local machine.
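To separate "the file exists on disk" from "the file is visible as a classpath resource", a small check along these lines may help. It is only a sketch: the class name ResourceCheck and its two helper methods are made up for illustration and are not part of OpenNLP or Hadoop. It relies on the documented behaviour that getResourceAsStream returns null, rather than throwing, when no resource with the given name is found.

```java
import java.io.File;
import java.io.IOException;
import java.io.InputStream;

// Hypothetical helper, not part of OpenNLP or Hadoop.
public class ResourceCheck {

    // True if the path resolves as a classpath resource.
    // getResourceAsStream returns null (no exception) when nothing is found.
    static boolean onClasspath(String path) throws IOException {
        try (InputStream in = ResourceCheck.class.getResourceAsStream(path)) {
            return in != null;
        }
    }

    // True if the path points at a regular file on the local file system.
    static boolean onFileSystem(String path) {
        return new File(path).isFile();
    }

    public static void main(String[] args) throws IOException {
        if (args.length != 1) {
            System.err.println("usage: java ResourceCheck <path>");
            return;
        }
        String path = args[0];
        System.out.println("on classpath:   " + onClasspath(path));
        System.out.println("on file system: " + onFileSystem(path));
    }
}
```

Running it with the same value that is logged for sentenceModelPath should show which of the two lookups actually sees the model file.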
Has anyone used OpenNLP with Hadoop and can point me in the right direction?