How to map dynamic DynamoDB columns in EMR Hive

Question

I have a table in Amazon DynamoDB with a record structure like this:

{"username" : "joe bloggs" , "products" : ["1","2"] , "expires1" : "01/01/2013" , "expires2" : "01/02/2013"}

Here the products property is a list of the products owned by the user, and the expiresN properties refer to the products in that list. The list of products is dynamic, and there are many of them. I need to transfer this data to S3 in a format like this:

joe bloggs|1|01/01/2013
joe bloggs|2|01/02/2013

Using external Hive tables I can map the username and products columns in DynamoDB, but I cannot map the dynamic columns. Is there a way to extend or adapt org.apache.hadoop.hive.dynamodb.DynamoDBStorageHandler so that it interprets and restructures the data received from DynamoDB before Hive ingests it? Or is there an alternative way to convert the DynamoDB data to first normal form?

One of my key requirements is to keep the throttling provided by the dynamodb.throughput.read.percent parameter, so that I do not compromise the operational use of the table.

+3

amazon-dynamodb hive amazon-emr

stjohnroe Apr 11 '12 at 11:36

1 answer

Rodrigo Ribeiro · Answer 1 · 2012-04-11T20:49:04+0000

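One option is a UDTF (user-defined table-generating function) in Hive, which can emit several output rows for a single input row.

- The built-in explode() function is an example of a UDTF.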
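To make the UDTF suggestion concrete, here is a minimal sketch of the row expansion such a function would have to perform. It is written in Python only to illustrate the logic; a real Hive UDTF would be written in Java (for example by extending GenericUDTF). The helper name flatten_item is hypothetical, and the code assumes that the expiry date for product "N" is stored under the attribute "expiresN", which the question implies but does not state outright.

```python
def flatten_item(item, delimiter="|"):
    """Yield one delimited line per product in a DynamoDB-style item.

    Hypothetical helper, not part of Hive or the DynamoDB connector.
    """
    username = item["username"]
    for product in item.get("products", []):
        # Assumption: the expiry for product "N" lives under "expiresN".
        # A missing attribute becomes an empty field rather than an error.
        expires = item.get("expires" + str(product), "")
        yield delimiter.join([username, str(product), expires])


# Sample record taken from the question.
item = {
    "username": "joe bloggs",
    "products": ["1", "2"],
    "expires1": "01/01/2013",
    "expires2": "01/02/2013",
}

lines = list(flatten_item(item))
for line in lines:
    print(line)
```

Run on the sample record, this prints the two pipe-delimited lines shown in the question. In a UDTF the same loop would forward one row per product to Hive instead of printing it.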
