Amazon MapReduce with cronjob + API

I have a site running on an EC2 instance that lets users view information from four of their social networks.

Once a user has joined, the site must refresh their information every night so that the latest data is displayed the next day.

Initially, we had a cron job that looped through each user, made the necessary API calls, and then stored the data in a DB (an Amazon RDS instance).

This operation takes between 2 and 30 seconds per user, which means that processing users one at a time would take several days to complete a single update.
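To put rough numbers on that, here is a back-of-the-envelope calculation. The user count is purely illustrative (it is not our real figure), and the `workers` parameter assumes the users could be split evenly across parallel workers:

```python
SECONDS_PER_DAY = 24 * 60 * 60

def nightly_run_days(num_users, seconds_per_user, workers=1):
    """Wall-clock days for one full refresh, assuming an even split across workers."""
    return num_users * seconds_per_user / workers / SECONDS_PER_DAY

HYPOTHETICAL_USERS = 20_000  # illustrative assumption, not a figure from the site

best = nightly_run_days(HYPOTHETICAL_USERS, 2)    # every user takes 2 seconds
worst = nightly_run_days(HYPOTHETICAL_USERS, 30)  # every user takes 30 seconds
print(f"one at a time: {best:.2f} to {worst:.2f} days")

# The same worst case spread over 20 hypothetical parallel workers.
print(f"20 workers, worst case: {nightly_run_days(HYPOTHETICAL_USERS, 30, workers=20):.2f} days")
```

The point is only that the serial total grows linearly with the number of users, so the job cannot fit into one night unless the work is done in parallel.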

I looked into MapReduce and would like to know whether it would be a suitable option for what I'm trying to do, but at the moment I can't tell for sure.

Can I give MapReduce a .sql file containing all the records I want to update, plus a script that tells MapReduce what to do with each record, and have it process them all at the same time?
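To make the question concrete, this is roughly the kind of per-record script I have in mind. It is only a sketch: it assumes the records arrive one user ID per line on standard input (the way Hadoop Streaming feeds a mapper), and `fetch_social_data` is a hypothetical placeholder for our real API calls.

```python
import sys

def fetch_social_data(user_id):
    """Hypothetical placeholder for the four social-network API calls."""
    return {"user_id": user_id, "networks_updated": 4}

def run_mapper(lines, out=sys.stdout):
    """Handle one record per input line and emit tab-separated key/value pairs."""
    for line in lines:
        user_id = line.strip()
        if not user_id:
            continue  # skip blank lines
        result = fetch_social_data(user_id)
        out.write(f"{user_id}\t{result['networks_updated']}\n")

# As a streaming mapper, the script would end with: run_mapper(sys.stdin)
```

Each input line is handled independently of the others, which is the property that would let the records be processed at the same time.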

If not, what would be the best approach?

Thanks for your help in advance.

+3
2 answers

I assume that the data of each user does not depend on the data of other users, which seems logical to me. If this is not the case, please ignore this answer.
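When users are independent like this, the per-user work can be run concurrently, since most of the time is spent waiting on API responses. A minimal sketch, assuming a hypothetical `update_user` function that wraps your API calls and DB write (the worker count is only an illustrative guess and would need tuning against API rate limits):

```python
from concurrent.futures import ThreadPoolExecutor, as_completed

def update_user(user_id):
    """Hypothetical stand-in for the API calls and DB write for one user."""
    return user_id, "ok"

def update_all(user_ids, max_workers=50):
    """Refresh independent users concurrently; collect failures instead of aborting."""
    results, failures = {}, {}
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        # Map each future back to its user so errors can be attributed.
        futures = {pool.submit(update_user, uid): uid for uid in user_ids}
        for future in as_completed(futures):
            uid = futures[future]
            try:
                _, status = future.result()
                results[uid] = status
            except Exception as exc:  # one bad user should not stop the rest
                failures[uid] = exc
    return results, failures

results, failures = update_all(range(100), max_workers=10)
print(len(results), "updated,", len(failures), "failed")
```

Because no user's update reads another user's data, the order in which the futures complete does not matter.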

( , ), MapReduce. MR - , , ( , , ).

, , - ~ 10000 ( ). 1000 , , , .

MR (, Hadoop), ( ). ( , ,...), .

, MR , , - YMMV.

+4

. MapReduce, Map , , . , , EC2 sql. , , . , Elastic MapReduce MapReduce.

+1
