I am new to Amazon Web Services and have been trying to run an application on Amazon EMR.
To do this, I completed the following steps:
1) Created a Hive script that contains a create table statement, a load data statement that loads a file into Hive, and a select * from query.
2) Created an S3 bucket and uploaded two objects into it: the Hive script and the file to load into the table.
3) Created a job flow (using the Sample Hive program) and supplied the script, input, and output paths (e.g. s3n://bucketname/script.q, s3n://bucketname/input.txt, s3n://bucketname/out/). I did not create the out directory myself; I assumed it would be created automatically.
4) The job flow then started, and after a while I saw states such as STARTING, BOOTSTRAPPING, RUNNING and SHUT DOWN.
5) Once it reached the SHUT DOWN state, the job flow terminated automatically and showed the status FAILED.
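For reference, here is a minimal sketch of how the three paths from step 3 are usually handed to a Hive step: the script location goes after `-f`, and the input and output locations are defined as the variables `INPUT` and `OUTPUT` with `-d`. The helper name `hive_step_args` is hypothetical, `bucketname` is the placeholder used above, and the exact arguments the console generates may differ.

```python
def hive_step_args(script_uri, input_uri, output_uri):
    """Build the argument list for a Hive step (hypothetical helper).

    Assumes the script location is passed with -f and that the input and
    output locations are defined as the variables INPUT and OUTPUT with -d.
    """
    return [
        "-f", script_uri,
        "-d", f"INPUT={input_uri}",
        "-d", f"OUTPUT={output_uri}",
    ]


bucket = "bucketname"  # placeholder bucket name from the steps above
args = hive_step_args(
    f"s3n://{bucket}/script.q",
    f"s3n://{bucket}/input.txt",
    f"s3n://{bucket}/out/",
)
print(" ".join(args))
```

With arguments of this form, the script itself would refer to the two locations as `${INPUT}` and `${OUTPUT}`.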
Afterwards, I did not see the out directory on S3. How can I see the result? I only saw directories such as daemons, nodes, etc.
Also, how can I view the data stored in HDFS on Amazon EMR?