I want to use the Web-scale Parallel Inference Engine (WebPIE) on the Hadoop platform. I have already set up a Hadoop cluster on two Ubuntu virtual machines, and it works well. When I try to use WebPIE to process RDF files, the job crashes because it expects its input in the SequenceFile format.
The WebPIE tutorial doesn't mention the SequenceFile format as a prerequisite for reasoning on Hadoop. To create the sequence files, I wrote the following code:
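import java.io.File;
import java.io.IOException;
import java.nio.file.Files;

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.FSDataOutputStream;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.io.BytesWritable;
import org.apache.hadoop.io.SequenceFile;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.io.compress.DefaultCodec;

public static void main(String[] args) {
    Configuration conf = new Configuration();
    File outputDirectory = new File("output");
    File inputDirectory = new File("input");
    File[] files = inputDirectory.listFiles();
    if (files == null) {
        // listFiles() returns null when the directory is missing or unreadable
        System.err.println("Cannot list input directory: " + inputDirectory.getAbsolutePath());
        return;
    }
    try {
        for (File inputFile : files) {
            // Read the whole file; a single InputStream.read() call may return fewer bytes
            byte[] content = Files.readAllBytes(inputFile.toPath());
            // Key = file name, value = raw file bytes
            Text key = new Text(inputFile.getName());
            BytesWritable value = new BytesWritable(content);
            // One sequence file per input file, with the same name, under "output"
            Path outputPath = new Path(outputDirectory.getAbsolutePath() + "/" + inputFile.getName());
            FileSystem hdfs = outputPath.getFileSystem(conf);
            // try-with-resources closes the writer and then the stream for every file,
            // not only for the last one
            try (FSDataOutputStream dos = hdfs.create(outputPath);
                 SequenceFile.Writer swriter = SequenceFile.createWriter(conf, dos, Text.class,
                         BytesWritable.class, SequenceFile.CompressionType.BLOCK, new DefaultCodec())) {
                swriter.append(key, value);
            }
        }
    } catch (IOException e) {
        System.err.println(e.getMessage());
    }
}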
RDF, 100% . - , , , RDF ?