I'm trying to fetch news for about 5,000 companies from Google News using Python.
I plan to run the job every 12 hours.
What I'm actually doing is using the Google News RSS link ( https://news.google.com/news/feeds?q=MyQuery&output=rss ). I build a link for each company and then parse the returned XML to get the required data.
The problem is that it returns results for about 500 companies in 20 minutes, with the feeds populated, but after that it starts returning empty results. If I open the link myself, it has entries, but while the code is running, it stops returning results after providing news for about 500 companies.
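To tell whether an empty result is a refused request or a feed that really has no entries, something like the following could be logged for each company. This is only a minimal sketch: `classify_feed_result` is a hypothetical helper name, and it assumes the object returned by `feedparser.parse` exposes `entries`, `status` (set only when the feed was fetched over HTTP) and `bozo` (set when the response could not be parsed cleanly).

```python
def classify_feed_result(parsed):
    """Return a short label describing a feedparser-style result.

    `parsed` is expected to behave like the dict returned by feedparser.parse:
    'entries' (list), 'status' (HTTP status, present only for HTTP fetches)
    and 'bozo' (truthy when the response was not a well-formed feed).
    """
    status = parsed.get("status")
    entries = parsed.get("entries", [])
    if entries:
        return "ok"
    if status is not None and status >= 400:
        # The server answered with an error code instead of a feed
        return "http_error_%d" % status
    if parsed.get("bozo"):
        # Something came back, but it was not parseable as a feed
        return "unparseable_response"
    return "empty_feed"
```

Printing `classify_feed_result(feeds)` next to each company's query should show whether the label changes around the point where the results go empty.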
Now I'm wondering whether Google News has a rate limit or a time limit?
Below is my code:
import traceback
from urllib.parse import quote

import feedparser

# Company is a Django model defined elsewhere in the project (import not shown).
companies = Company.objects.all()
for company in companies:
    try:
        # URL-encode the query: spaces become %20, other unsafe characters are escaped too
        search_query = quote(company.query)
        rss = "https://news.google.com/news/feeds?q=" + search_query + "&output=rss"
        feeds = feedparser.parse(rss)
        for post in feeds['entries']:
            try:
                url = post.link
                print("RSS Entry, Link: " + url)
                title = post.title
                print("Inserting Article (Title): " + title)
            except Exception:
                # Print the traceback for this entry and move on to the next one
                print(traceback.format_exc())
    except Exception:
        # Print the traceback for this company and move on to the next one
        print(traceback.format_exc())
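If the empty results do turn out to be throttling (which I have not confirmed), one thing I could try is pausing and retrying with growing delays. The sketch below is an assumption-laden illustration, not a known fix: `fetch_with_backoff`, `max_attempts` and `base_delay` are hypothetical names and values, and `fetch` stands for any callable such as `feedparser.parse`.

```python
import time


def fetch_with_backoff(fetch, url, max_attempts=4, base_delay=5.0, sleep=time.sleep):
    """Call fetch(url) until it yields entries, waiting longer after each empty result.

    fetch: any callable returning a feedparser-style result, e.g. feedparser.parse.
    Returns the last result, which may still be empty.
    """
    result = None
    for attempt in range(max_attempts):
        result = fetch(url)
        if result.get("entries"):
            return result
        if attempt < max_attempts - 1:
            # Delays grow as base_delay, 2 * base_delay, 4 * base_delay, ...
            sleep(base_delay * (2 ** attempt))
    return result
```

In the loop above this would replace the direct call, e.g. `feeds = fetch_with_backoff(feedparser.parse, rss)`. A company that truly has no news would also trigger the retries, so the attempt count should stay small.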
Many thanks for your help.