Cron job at a specific time of day - what is the limit?

I am after a little advice on working with Cron with PHP. My scenario is this:

I have a site with a lot of members. Users have one or more URLs associated with their account. At midnight (or some other specific time), I would like to run a script that requests the websites for each user and updates the database with the information it finds. Think of it as a screen scraper service.

My question is about server stress. I will test this new feature on a shared server, but eventually I will switch to a dedicated server.

So, if the c. 5,000 members have 2 URLs each, that is 10,000 websites the script will request. What do people think is the best way to do this? Should I have a cron job that processes the first 500 members, then 10 minutes later processes the next 500, and so on?

Or is there some kind of magic that I have not heard about that could help?

Thanks for any tips!

+3
3 answers

cron is a great tool for basic cases like this. However, as you guessed, it does not scale well. Look into job-processing tools such as the open-source (and multi-language) Gearman:

http://gearman.org/

It should be a more reliable system for this task.

+2

script, script 10 000 - . script, - . - , imho.

0

, URL script . .

, cron script, / . script , , , evens , - , .

Regarding the implementation of this, I would think about having the script accept two integer values that let you define the modulus and the remainder. For instance, for evens and odds you would pass “2 0” and “2 1”, which would lead to queries such as SELECT * FROM myTable WHERE id % 2 = 0 and SELECT * FROM myTable WHERE id % 2 = 1 being executed against the SQL database. With this approach, it would be very easy to configure any number of jobs to run in parallel.
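To make the modulus and remainder idea concrete, here is a minimal sketch. It is written in Python with an in-memory SQLite database purely so that it runs on its own; the same structure should carry over to a PHP script. The table name myTable comes from the queries above, while the url column, the select_shard and demo functions, and the sample rows are assumptions made up for illustration.

```python
import sqlite3


def select_shard(conn, modulus, remainder):
    """Return the (id, url) rows that belong to one worker's shard."""
    if modulus < 1 or not 0 <= remainder < modulus:
        raise ValueError("need modulus >= 1 and 0 <= remainder < modulus")
    cur = conn.execute(
        "SELECT id, url FROM myTable WHERE id % ? = ? ORDER BY id",
        (modulus, remainder),
    )
    return cur.fetchall()


def demo():
    """Build a small hypothetical table and split it into evens and odds."""
    conn = sqlite3.connect(":memory:")
    # The 'url' column is an assumption; the original only names myTable and id.
    conn.execute("CREATE TABLE myTable (id INTEGER PRIMARY KEY, url TEXT)")
    conn.executemany(
        "INSERT INTO myTable (id, url) VALUES (?, ?)",
        [(i, f"http://example.com/{i}") for i in range(1, 11)],
    )
    evens = select_shard(conn, 2, 0)  # the "2 0" job
    odds = select_shard(conn, 2, 1)   # the "2 1" job
    conn.close()
    return evens, odds
```

Each cron entry would then start the script with a different pair of values. Because every id matches exactly one remainder, no row is processed twice and none is skipped, and moving from two jobs to four only means changing the pairs to "4 0" through "4 3".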

Gearman is very powerful, and I have used it in a number of projects, but it comes with a steeper learning curve. I think my simple solution should serve you well.

0
