Well, this seems obvious once you have figured it out, but as a Java beginner it can take a while.
Set up your project first: add the .java file that Sqoop generated to your source folder. I use Eclipse and import it into the same source folder as my own class.
Then make sure that the build path of the Java project is configured correctly.
In Eclipse, add the following jar files under Project Properties > Java Build Path > Libraries > Add External JARs (for Hadoop CDH4+):
/usr/lib/hadoop/hadoop-common.jar
/usr/lib/hadoop-[version]-mapreduce/hadoop-core.jar
/usr/lib/sqoop/sqoop-[sqoop-version]-cdh[cdh-version].jar
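A quick way to confirm that the build path really picks up those jars is to ask the JVM whether a class from each one can be found. The following is only a sketch: the helper name ClasspathCheck is made up for illustration, and the three class names are my assumption about what each jar provides (org.apache.hadoop.conf.Configuration, org.apache.hadoop.mapreduce.Job, org.apache.sqoop.lib.SqoopRecord). Adjust them if your versions differ.

```java
// Hypothetical helper, not part of Hadoop or Sqoop: reports whether
// the classes the MapReduce job depends on are visible on the classpath.
final class ClasspathCheck {

    // Returns true if the named class can be located, without initializing it.
    static boolean isOnClasspath(String className) {
        try {
            Class.forName(className, false, ClasspathCheck.class.getClassLoader());
            return true;
        } catch (ClassNotFoundException | LinkageError e) {
            return false;
        }
    }

    public static void main(String[] args) {
        // Assumed representative classes, one per jar listed above.
        String[] required = {
            "org.apache.hadoop.conf.Configuration", // hadoop-common
            "org.apache.hadoop.mapreduce.Job",      // hadoop-core (MapReduce)
            "org.apache.sqoop.lib.SqoopRecord"      // sqoop
        };
        for (String name : required) {
            System.out.println((isOnClasspath(name) ? "found    " : "MISSING  ") + name);
        }
    }
}
```

Run it as a plain Java application inside the same Eclipse project. Any line marked MISSING points at a jar that still has to be added to the build path.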
Then adapt the MapReduce source code. First, configure the job:
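// Needs imports for Job, Path, Text, FileInputFormat and FileOutputFormat
// (the org.apache.hadoop.mapreduce versions of the two format classes).
public int run(String[] args) throws Exception
{
    Job job = new Job(getConf());
    job.setJarByClass(YourClass.class);
    job.setMapperClass(SqoopImportMap.class);
    job.setReducerClass(SqoopImportReduce.class);
    // Both calls take the job and a Path object, not a bare String.
    FileInputFormat.addInputPath(job, new Path("hdfs_path_to_your_sqoop_imported_file"));
    FileOutputFormat.setOutputPath(job, new Path("hdfs_output_path"));
    // The class literal is written in lowercase: Text.class
    job.setMapOutputKeyClass(Text.class);
    job.setMapOutputValueClass(Text.class);
    job.setOutputKeyClass(Text.class);
    job.setOutputValueClass(Text.class);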
...
Now suppose the Java class that Sqoop generated is Sqimp.java and the imported table has the columns id, name, age.
The mapper could then look like this:
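public static class SqoopImportMap
        extends Mapper<LongWritable, Text, Text, Text>
{
    @Override
    public void map(LongWritable k, Text v, Context context)
    {
        Sqimp s = new Sqimp();
        try
        {
            // v is one line of the Sqoop-imported file
            s.parse(v);
        }
        catch (ParseError pe)
        {
            // the line could not be parsed: skip this record
            return;
        }
        try
        {
            // the imported columns are now accessible; if your generated class
            // keeps its fields private, use its accessor methods instead
            if (s.age > 30)
            {
                // Text has no numeric constructor, so convert to String first
                context.write(new Text(String.valueOf(s.age)),
                              new Text(String.valueOf(s.id)));
            }
        }
        catch (Exception ex)
        {
            // handle or log the write error here
        }
    }
}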
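To see what this mapper does without a cluster, here is a minimal Hadoop-free sketch of the same logic: parse one delimited line into id, name, age, skip lines that fail to parse, and emit (age, id) only when age is over 30. Everything in it is an assumption made for illustration. The Person class is a hypothetical stand-in for the generated Sqimp class, the comma delimiter assumes the import used Sqoop's default text format, and the sample lines are invented.

```java
import java.util.AbstractMap.SimpleEntry;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.Map;

// Stand-alone imitation of the mapper logic, for experimenting locally.
final class AgeFilterSketch {

    // Hypothetical stand-in for the Sqoop-generated record class.
    static final class Person {
        final int id;
        final String name;
        final int age;

        Person(int id, String name, int age) {
            this.id = id;
            this.name = name;
            this.age = age;
        }

        // Parses "id,name,age"; throws IllegalArgumentException on bad input
        // (NumberFormatException is a subclass of it).
        static Person parse(String line) {
            String[] fields = line.split(",", -1);
            if (fields.length != 3) {
                throw new IllegalArgumentException("expected 3 fields: " + line);
            }
            return new Person(Integer.parseInt(fields[0].trim()),
                              fields[1],
                              Integer.parseInt(fields[2].trim()));
        }
    }

    // Plays the role of map(): one input line yields zero or one (age, id) pair.
    static List<Map.Entry<String, String>> mapAll(List<String> lines) {
        List<Map.Entry<String, String>> out = new ArrayList<>();
        for (String line : lines) {
            Person p;
            try {
                p = Person.parse(line);
            } catch (IllegalArgumentException e) {
                continue; // unparseable record: skip it, as the mapper does
            }
            if (p.age > 30) {
                out.add(new SimpleEntry<>(String.valueOf(p.age), String.valueOf(p.id)));
            }
        }
        return out;
    }

    public static void main(String[] args) {
        List<String> lines = Arrays.asList("1,foo,30", "2,bar,42", "garbage", "3,baz,31");
        for (Map.Entry<String, String> e : mapAll(lines)) {
            System.out.println(e.getKey() + "\t" + e.getValue());
        }
    }
}
```

With the sample lines above, only the records with ages 42 and 31 are emitted. The record with age 30 fails the strict comparison, and the malformed line is skipped.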