How to see the output in Amazon EMR / S3?

I am new to Amazon Web Services and have been trying to run an application on Amazon EMR.

To do this, I have completed the following steps:

1) Created a Hive script that contains a create table statement, a load data statement that loads a file into the Hive table, and a select * from the table.

2) Created an S3 bucket and uploaded the objects into it: the Hive script and the file to load into the table.

3) Then created a job flow (using the Sample Hive program) and gave it the input, output, and script paths (e.g. s3n://bucketname/script.q, s3n://bucketname/input.txt, s3n://bucketname/out/). I did not create the out directory; I assumed it would be created automatically.

4) Then the job flow started, and after a while I saw states like STARTING, BOOTSTRAPPING, RUNNING and SHUTTING_DOWN.

5) After the SHUTTING_DOWN state, the job flow terminated automatically and showed the FAILED state.

Then on S3 I did not see the out directory. How can I see the result? I only saw directories such as daemons, nodes, etc.

Also, how do I view data from HDFS in Amazon EMR?

1 answer

The output path specified in step 3 should contain your results (from your description, this is s3n://bucketname/out/).
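One way to check is to list whatever sits under that prefix. Below is a minimal Python sketch, assuming boto3 is installed and AWS credentials are configured. `split_s3_uri` and `list_output_keys` are hypothetical helper names, not part of any library, and `bucketname` is the placeholder from the question.

```python
from urllib.parse import urlparse


def split_s3_uri(uri):
    """Split an s3:// or s3n:// URI into (bucket, key_prefix)."""
    parsed = urlparse(uri)
    if parsed.scheme not in ("s3", "s3n") or not parsed.netloc:
        raise ValueError(f"not an S3 URI: {uri!r}")
    return parsed.netloc, parsed.path.lstrip("/")


def list_output_keys(uri):
    """Return every object key under the given S3 URI.

    Needs boto3 and valid AWS credentials; boto3 is imported lazily so that
    split_s3_uri can be used without it.
    """
    import boto3

    bucket, prefix = split_s3_uri(uri)
    paginator = boto3.client("s3").get_paginator("list_objects_v2")
    keys = []
    for page in paginator.paginate(Bucket=bucket, Prefix=prefix):
        # Pages for an empty prefix carry no "Contents" entry at all.
        keys.extend(obj["Key"] for obj in page.get("Contents", []))
    return keys


# Usage (placeholder bucket name taken from the question):
#   list_output_keys("s3n://bucketname/out/")
```

An empty list means the job wrote nothing to the output location.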

If it is not there, something went wrong with your Hive script. If your Hive job failed, you will find the error / exception information in the jobtracker log. The jobtracker log is located at <s3 log location>/daemons/<master instance name>/hadoop-hadoop-jobtracker-<some Amazon internal IP>.log
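If the log location holds many files, the jobtracker log can be picked out by name. This is a rough Python sketch that filters a list of S3 object keys against the layout described above. `find_jobtracker_logs` is a hypothetical helper, the optional `.gz` suffix is an assumption in case the logs are stored compressed, and the sample keys in the comment are made up for illustration.

```python
import re

# Layout taken from the path above:
#   <log location>/daemons/<master instance name>/hadoop-hadoop-jobtracker-<ip>.log
# The trailing ".gz" is optional and is an assumption, not something the
# answer states.
JOBTRACKER_LOG = re.compile(
    r"(?:^|/)daemons/[^/]+/hadoop-hadoop-jobtracker-[^/]+\.log(?:\.gz)?$"
)


def find_jobtracker_logs(keys, log_prefix=""):
    """Return the keys under log_prefix that look like jobtracker logs."""
    return [
        key
        for key in keys
        if key.startswith(log_prefix) and JOBTRACKER_LOG.search(key)
    ]


# Usage with made-up keys (the instance id and IP are illustrative only):
#   find_jobtracker_logs(
#       ["logs/daemons/i-abc/hadoop-hadoop-jobtracker-ip-10-0-0-1.log",
#        "logs/daemons/i-abc/hadoop-hadoop-namenode-ip-10-0-0-1.log"],
#       log_prefix="logs/",
#   )
```

The keys themselves can come from any S3 listing of the log location, for example the S3 console or a bucket listing made with an SDK.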

