I am running the following code with Jsoup:
Document parse = Jsoup.connect("http://www.google.com/movies?near=<MyCity>&sort=1&start=0")
        .followRedirects(true)
        .ignoreContentType(true)
        .timeout(12000)
        .userAgent("Mozilla/5.0 (Windows NT 6.1; Win64; x64; rv:25.0) Gecko/20100101 Firefox/25.0")
        .referrer("http://www.google.com")
        .execute()
        .parse();
Elements elements = parse.select(".movie_results .movie");
But when I inspect elements, it clearly skips a lot of content. I am trying to get the title and description of each movie from the page above.
What am I missing? Could this be caused by missing request headers or cookies? Is there another library that could solve the problem?
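I am also trying to reproduce the problem outside Jsoup by fetching the page with curl (the URL is quoted so the shell does not interpret & and <MyCity>):
curl "http://www.google.com/movies?near=<MyCity>&sort=1&start=0" > page.html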
Just to highlight one of the comments: try.jsoup.org is a good place to start with Jsoup. It lets you parse HTML and try out selectors in a very simple way.
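If you would rather check the selector locally, something along these lines might help. It is only a minimal sketch: the sample markup, the SelectorCheck class, the movieTitles helper, and the h2 title tag are all assumptions made up for illustration, and the real page may be structured differently. It parses a string offline with Jsoup.parse, so it tells you whether the selector works on markup you already have, not whether the server sent you that markup.

```java
import org.jsoup.Jsoup;
import org.jsoup.nodes.Document;
import org.jsoup.nodes.Element;
import org.jsoup.select.Elements;

import java.util.ArrayList;
import java.util.List;

public class SelectorCheck {

    // Hypothetical sample markup (an assumption, not the real page source).
    static final String SAMPLE_HTML =
        "<div class=\"movie_results\">"
      + "<div class=\"movie\"><h2>First Movie</h2><p>First description</p></div>"
      + "<div class=\"movie\"><h2>Second Movie</h2><p>Second description</p></div>"
      + "</div>";

    // Parses the given HTML offline and collects the <h2> text
    // of every element matched by the selector from the question.
    static List<String> movieTitles(String html) {
        Document doc = Jsoup.parse(html);
        Elements movies = doc.select(".movie_results .movie");
        List<String> titles = new ArrayList<>();
        for (Element movie : movies) {
            titles.add(movie.select("h2").text());
        }
        return titles;
    }

    public static void main(String[] args) {
        System.out.println(movieTitles(SAMPLE_HTML));
    }
}
```

If the selector matches everything in a saved copy of the page (for example the page.html from curl) but not in the live response, the difference is probably in what the server returns to your request rather than in the parsing.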
Please +1 if you liked the tip and it saved your day :D