Semantic web reasoning on distributed systems

I want to use the Web-scale Parallel Inference Engine (WebPIE) reasoner on the Hadoop platform. I have already set up Hadoop on two Ubuntu virtual machines, and it works well. When I use WebPIE to reason over RDF files, the process fails because it requires the SequenceFile format. The WebPIE tutorial does not mention the SequenceFile format as a prerequisite for reasoning on Hadoop. To produce SequenceFile output, I wrote the following code:

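import java.io.File;
import java.io.IOException;
import java.nio.file.Files;

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.FSDataOutputStream;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.io.BytesWritable;
import org.apache.hadoop.io.SequenceFile;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.io.compress.DefaultCodec;

// The class name is illustrative; the snippet originally showed only main().
public class SequenceFileConverter {

    public static void main(String[] args) {

        Configuration conf = new Configuration();

        File outputDirectory = new File("output");
        File inputDirectory = new File("input");
        File[] files = inputDirectory.listFiles();
        if (files == null) {
            // listFiles() returns null when the directory is missing or unreadable
            System.out.println("Cannot list " + inputDirectory.getAbsolutePath());
            return;
        }

        for (File inputFile : files) {
            try {
                // Input: read the whole file (a single read() call may stop short)
                byte[] content = Files.readAllBytes(inputFile.toPath());

                // One record per file: file name as key, raw bytes as value
                Text key = new Text(inputFile.getName());
                BytesWritable value = new BytesWritable(content);

                // Output
                Path outputPath = new Path(outputDirectory.getAbsolutePath() + "/" + inputFile.getName());
                FileSystem hdfs = outputPath.getFileSystem(conf);

                // Close the writer and the stream for every file, not only the last one
                try (FSDataOutputStream dos = hdfs.create(outputPath);
                     SequenceFile.Writer swriter = SequenceFile.createWriter(conf, dos, Text.class,
                             BytesWritable.class, SequenceFile.CompressionType.BLOCK, new DefaultCodec())) {
                    swriter.append(key, value);
                }

            } catch (IOException e) {
                System.out.println(e.getMessage());
            }
        }
    }
}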

RDF, 100% . - , , , RDF ?


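The WebPIE tutorial is based on running it on Amazon EC2, so there may be some differences in configuration. Note, however, that according to the tutorial the inputs are not plain RDF files but "gzipped compressed files of triples in N-Triples format":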

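Before launching the reasoner, the input data has to be uploaded to the HDFS filesystem and compressed into a suitable format. The input data must consist of gzipped files of triples in N-Triples format. Try to keep the files similar in size, and have more files than CPU cores, since each file is processed by one machine.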
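Since the tutorial asks for gzipped N-Triples files rather than SequenceFiles, one way to prepare the input locally is to split a large N-Triples file into several gzipped parts before uploading them. The following is only a minimal sketch under that assumption: the class `NTriplesGzipSplitter`, the method `split`, and the `linesPerPart` parameter are hypothetical names and are not part of WebPIE. It relies on N-Triples being a line-based format (one triple per line) and uses only the Java standard library.

```java
import java.io.BufferedReader;
import java.io.BufferedWriter;
import java.io.IOException;
import java.io.OutputStreamWriter;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.util.ArrayList;
import java.util.List;
import java.util.zip.GZIPOutputStream;

// Hypothetical helper, not part of WebPIE: splits one N-Triples file into
// gzipped parts that each hold at most linesPerPart lines.
class NTriplesGzipSplitter {

    static List<Path> split(Path input, Path outputDir, int linesPerPart) throws IOException {
        if (linesPerPart <= 0) {
            throw new IllegalArgumentException("linesPerPart must be positive");
        }
        Files.createDirectories(outputDir);
        List<Path> parts = new ArrayList<>();
        try (BufferedReader reader = Files.newBufferedReader(input, StandardCharsets.UTF_8)) {
            BufferedWriter writer = null;
            int linesInPart = 0;
            try {
                String line;
                while ((line = reader.readLine()) != null) {
                    if (writer == null) {
                        // Start a new part: triplePart1.gz, triplePart2.gz, ...
                        Path part = outputDir.resolve("triplePart" + (parts.size() + 1) + ".gz");
                        parts.add(part);
                        writer = new BufferedWriter(new OutputStreamWriter(
                                new GZIPOutputStream(Files.newOutputStream(part)),
                                StandardCharsets.UTF_8));
                        linesInPart = 0;
                    }
                    writer.write(line);
                    writer.write('\n');
                    linesInPart++;
                    if (linesInPart == linesPerPart) {
                        // Closing the writer also finishes the gzip stream
                        writer.close();
                        writer = null;
                    }
                }
            } finally {
                if (writer != null) {
                    writer.close();
                }
            }
        }
        return parts;
    }

    public static void main(String[] args) throws IOException {
        // Usage: java NTriplesGzipSplitter <input.nt> <outputDir> <linesPerPart>
        List<Path> parts = split(Paths.get(args[0]), Paths.get(args[1]), Integer.parseInt(args[2]));
        System.out.println("Wrote " + parts.size() + " gzipped parts");
    }
}
```

Choosing `linesPerPart` so that the number of parts exceeds the number of cores follows the tutorial's advice about file counts; the resulting directory can then be copied to HDFS with `hadoop fs -copyFromLocal`.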

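The tutorial's section on the second step, uploading the input data to the cluster, describes how to actually get the data into the system, and it looks like it should apply to Amazon EC2 as well as to your own Hadoop installation. I don't want to quote that section in full here, but the sequence of commands it gives is: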

$ ./cmd-hadoop-cluster login webpie
$ hadoop fs -ls /
$ hadoop fs -mkdir /input
$ ./cmd-hadoop-cluster push webpie input_triples.tar.gz

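That only gets the data into HDFS, though. The third step of the tutorial, compressing the input data, says:

The reasoner works with the data in a compressed format. We compress the data with the command: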

hadoop jar webpie.jar jobs.FilesImportTriples /input /tmp /pool --maptasks 4 --reducetasks 2 --samplingPercentage 10 --samplingThreshold 1000

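... the above command can be read as: launch the compression and split the job between 4 map tasks and 2 reduce tasks, sample the input using 10% of the data, and mark as popular all the resources that appear more than 1000 times in this sample.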

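After this job has finished, the compressed input data is in the directory /pool, and we can proceed to the reasoning.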
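To make the two sampling flags of the compression command more concrete, here is a toy, single-machine sketch of one possible reading of `--samplingPercentage` and `--samplingThreshold`: keep roughly the given percentage of resource occurrences, count them, and treat as popular every resource counted more than the threshold. This is an assumption made for illustration only and is not WebPIE's implementation; the class `PopularResourceSampler`, the method `popular`, and the `seed` parameter are hypothetical.

```java
import java.util.HashMap;
import java.util.HashSet;
import java.util.List;
import java.util.Map;
import java.util.Random;
import java.util.Set;

// Illustrative only, NOT WebPIE's code: a local reading of the sampling flags.
class PopularResourceSampler {

    static Set<String> popular(List<String> resources, int samplingPercentage,
                               int samplingThreshold, long seed) {
        Random random = new Random(seed);
        Map<String, Integer> counts = new HashMap<>();
        for (String resource : resources) {
            // Keep each occurrence with probability samplingPercentage / 100
            if (random.nextInt(100) < samplingPercentage) {
                counts.merge(resource, 1, Integer::sum);
            }
        }
        Set<String> result = new HashSet<>();
        for (Map.Entry<String, Integer> entry : counts.entrySet()) {
            // "More than" the threshold, so the comparison is strict
            if (entry.getValue() > samplingThreshold) {
                result.add(entry.getKey());
            }
        }
        return result;
    }
}
```

With `samplingPercentage` set to 10 and `samplingThreshold` set to 1000, as in the command above, only resources that are very frequent in the data would pass the threshold under this reading.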

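The remaining sections cover reasoning, getting the data back out, and so on, which shouldn't be a problem once the data is in, I expect.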


The input data must consist of gzipped files of triples in N-Triples format, for example (triplePart1.gz, triplePart2.gz, ...). So we have input_triples.tar.gz, which contains the compressed N-Triples files (triplePart1.gz, triplePart2.gz, ...).

  • Uncompress the tar file and copy its contents to HDFS:

    ---/hadoop$ tar zxvf /tmp/input_triples.tar.gz /tmp/input_triples

    ---/hadoop$ bin/hadoop fs -copyFromLocal /tmp/input-files /input

  • Compress the input data:

    ---/hadoop$ bin/hadoop jar webpie.jar jobs.FilesImportTriples /input /tmp /pool --maptasks 4 --reducetasks 2 --samplingPercentage 10 --samplingThreshold 1000

  • Run the reasoner:

    ---/hadoop$ bin/hadoop jar webpie.jar jobs.Reasoner /pool --fragment owl --rulesStrategy fixed --reducetasks 2 --samplingPercentage 10 --samplingThreshold 1000

:-)
