I am reading text from a URL using jsoup. The following question has some tips for preserving newlines when converting the body to text:
How do I save line breaks when using jsoup to convert html to plain text?
I use the following lines to clean the body while keeping the br, p, and h1 tags:
String prettyPrintedBodyFragment = Jsoup.clean(body, "",
        Whitelist.none().addTags("br", "p", "h1"),
        new OutputSettings().prettyPrint(true));
System.out.println(prettyPrintedBodyFragment);
I still get the body content on a single line. Any tips, please?
EDIT: Here is the complete source code; the output is still on a single line.
public static void main(String[] args) throws Exception {
    Connection conn = Jsoup.connect("http://finance.yahoo.com/");
    Document doc = conn.get();
    String body = doc.body().text();
    String prettyPrintedBodyFragment = Jsoup.clean(body, "",
            Whitelist.none().addTags("br", "p", "h1"),
            new OutputSettings().prettyPrint(true));
    System.out.println(prettyPrintedBodyFragment);
}
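One likely cause, offered as a sketch rather than a definitive fix: doc.body().text() already strips every tag, so by the time Jsoup.clean runs there is no br or p left to keep. The example below instead starts from the markup, marks each br and p with a placeholder, and only then strips the tags with pretty-printing turned off, following the placeholder idea from the linked question. The class name LineBreakSketch, the method htmlToTextWithBreaks, and the sample HTML are hypothetical and used only for illustration. It keeps Whitelist as in the code above; newer jsoup releases name this class Safelist.

```java
import org.jsoup.Jsoup;
import org.jsoup.nodes.Document;
import org.jsoup.safety.Whitelist;

public class LineBreakSketch {

    // Converts HTML to plain text, keeping a line break for each <br> and <p>.
    // Hypothetical helper, not part of jsoup.
    static String htmlToTextWithBreaks(String html) {
        Document doc = Jsoup.parse(html);
        // Turn off pretty-printing so html() does not reflow or indent the text.
        doc.outputSettings(new Document.OutputSettings().prettyPrint(false));
        // Mark break positions with a literal backslash-n placeholder.
        doc.select("br").after("\\n");
        doc.select("p").prepend("\\n");
        // Work on the markup (html(), not text()), then swap the
        // placeholders for real newline characters.
        String marked = doc.body().html().replaceAll("\\\\n", "\n");
        // Strip all remaining tags; prettyPrint(false) leaves the newlines alone.
        String text = Jsoup.clean(marked, "", Whitelist.none(),
                new Document.OutputSettings().prettyPrint(false));
        return text.trim();
    }

    public static void main(String[] args) {
        // Sample input assumed for illustration; a fetched page would be
        // passed in as doc.body().html() instead.
        String html = "<p>First line<br>Second line</p><p>Third line</p>";
        System.out.println(htmlToTextWithBreaks(html));
    }
}
```

With a page fetched as in the code above, the same helper could be called as htmlToTextWithBreaks(doc.body().html()). Elements other than br and p (h1, for example) would need their own placeholder line if they should start a new line too.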