How can I parse dynamic content from a web page?

I am trying to get a proxy list from this URL:

Free proxy list

The problem is that the port number is dynamic JavaScript content. How can I get the JavaScript-generated content from this page? I have jsoup and DJ Native Swing (djNativeSwing), but I want to do this in a background thread.

JWebBrowser webBrowser = new JWebBrowser();
webBrowser.navigate("http://spys.ru/en/free-proxy-list/");
System.out.println(webBrowser.getHTMLContent());

This code returns null. Please help.

1 answer

The web browser has not finished loading the page when you call getHTMLContent(). Instead, wait until loading is complete, with something like this:

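// Classes are from chrriis.dj.nativeswing.swtimpl.components (DJ Native Swing).
// webBrowser is final so the anonymous listener can reference it.
final JWebBrowser webBrowser = new JWebBrowser();
// Register the listener before navigating so that no progress event is missed.
// WebBrowserAdapter supplies empty defaults for the other WebBrowserListener methods.
webBrowser.addWebBrowserListener(new WebBrowserAdapter() {
    @Override
    public void loadingProgressChanged(WebBrowserEvent e) {
        // Read the HTML only once the page reports that loading is complete.
        if (e.getWebBrowser().getLoadingProgress() == 100) {
            System.out.println(webBrowser.getHTMLContent());
        }
    }
});
webBrowser.navigate("http://spys.ru/en/free-proxy-list/");
/* Note: I wrote this in the comment field without any testing. */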

The Javadocs are your friend!
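
Since the question asks for a background thread, one option is to bridge the progress callback to a blocking wait. The following is only a minimal sketch and has not been tested against DJ Native Swing: `PageLoadWaiter`, `onProgress`, and `awaitHtml` are hypothetical names introduced here for illustration, not part of the library. It uses only the JDK, and assumes you forward `getLoadingProgress()` and `getHTMLContent()` into it from `loadingProgressChanged`.

```java
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.TimeoutException;
import java.util.function.Supplier;

/**
 * Hypothetical helper (not part of DJ Native Swing): turns "progress reached 100"
 * callbacks into a future that a background thread can block on.
 */
class PageLoadWaiter {
    private final CompletableFuture<String> html = new CompletableFuture<>();

    /**
     * Intended to be called from loadingProgressChanged(...), passing the current
     * loading progress and a supplier that wraps getHTMLContent().
     */
    void onProgress(int progress, Supplier<String> htmlSource) {
        // Read the HTML only when loading is complete; ignore any later events.
        if (progress == 100 && !html.isDone()) {
            html.complete(htmlSource.get());
        }
    }

    /** Blocks the calling thread until the HTML is available or the timeout expires. */
    String awaitHtml(long timeout, TimeUnit unit) {
        try {
            return html.get(timeout, unit);
        } catch (InterruptedException ex) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException("Interrupted while waiting for the page", ex);
        } catch (ExecutionException | TimeoutException ex) {
            throw new IllegalStateException("Page did not finish loading", ex);
        }
    }
}
```

Inside the listener you would call something like `waiter.onProgress(e.getWebBrowser().getLoadingProgress(), () -> e.getWebBrowser().getHTMLContent())`, and the background thread would call `waiter.awaitHtml(30, TimeUnit.SECONDS)`. Do not call `awaitHtml` on the UI thread, because it blocks until the callback fires.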
