I have a site running on an EC2 instance that lets users view information from four of their social networks.
Once a user signs up, the site must refresh their information every night so that the latest data is shown the next day.
Initially, we had a cron job that looped through each user, made the necessary API calls, and then stored the data in a database (an Amazon RDS instance).
Each user's update takes between 2 and 30 seconds, which means that processing users one at a time would take several days to finish.
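To make that arithmetic concrete, here is a rough back-of-the-envelope sketch. The user count of 50,000 and the worker count of 100 are made-up numbers for illustration only, and `update_hours` is a hypothetical helper, not part of our code.

```python
import math


def update_hours(num_users: int, seconds_per_user: float, workers: int = 1) -> float:
    """Estimate wall-clock hours to update every user.

    Assumes each worker handles one user at a time and that all users
    take the same number of seconds (a simplification).
    """
    batches = math.ceil(num_users / workers)  # rounds of work per worker
    return batches * seconds_per_user / 3600


if __name__ == "__main__":
    users = 50_000  # hypothetical user count, not our real figure
    for secs in (2, 30):  # best and worst case per user, from the timings above
        one_by_one = update_hours(users, secs)
        parallel = update_hours(users, secs, workers=100)  # hypothetical worker count
        print(f"{secs}s per user: {one_by_one:.1f} h sequential, {parallel:.2f} h with 100 workers")
```

With these assumed numbers, the worst case goes well past a single night when run one by one, which is the problem I am trying to solve.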
I have looked at MapReduce and would like to know whether it is a suitable option for what I'm trying to do, but at the moment I can't say for sure.
Can I give MapReduce a .sql file containing all the records I want to update, plus a script that tells it what to do with each record, and have it process them all at the same time?
If not, what would be the best approach?
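To show what I mean by "process them all at the same time", here is a minimal local sketch of the shape I have in mind. It uses a plain Python thread pool as a stand-in, not MapReduce itself, and `update_user` and `update_all` are hypothetical placeholder names rather than our real code.

```python
from concurrent.futures import ThreadPoolExecutor
from typing import Iterable


def update_user(user_id: int) -> tuple[int, str]:
    """Hypothetical placeholder for the per-user work.

    In the real job this would call the four social network APIs
    and write the results to the database.
    """
    return user_id, "updated"


def update_all(user_ids: Iterable[int], max_workers: int = 20) -> list[tuple[int, str]]:
    """Apply update_user to every user ID using a pool of worker threads."""
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        # pool.map returns results in the same order as the input IDs
        return list(pool.map(update_user, user_ids))


if __name__ == "__main__":
    print(update_all(range(5)))
```

What I can't tell is whether MapReduce is the right tool for spreading this kind of per-record work across machines, or whether something simpler would do.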
Thanks in advance for your help.