I am trying to fetch the source of a page from a URL. The page is reported to be UTF-8. I also tried ISO-8859-1, Windows-1250 and ISO-8859-2.
Here is my latest attempt (using ISO-8859-2 as an example):
import java.io.IOException;
import java.net.URL;

// Fetches the page and decodes the response body with the given charset.
public static String getPage(String page, String charset) throws IOException {
    URL url = new URL(page);
    return org.apache.commons.io.IOUtils.toString(url.openConnection().getInputStream(), charset);
}

public static void main(String[] args) throws Exception {
    String page = getPage("http://buscon.rae.es/drae/srv/search?val=aba", "ISO-8859-2");
    System.out.println(page);
}
But the result is:
apÄ? ge 'quita, aparta', y este del gr. á¼? I AM? I ± γÎμ)
instead of:
(Del lat. Apăge 'quita, aparta', y este del gr. Ἄπαγε).
UTF-8 (which works with other code, as well as in browsers) and the other encoding names fail in a similar way.
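To illustrate the kind of mismatch that might be involved, here is a minimal offline sketch. It assumes the server really sends UTF-8 bytes, which is not confirmed here. The class name CharsetMismatchDemo and the helper decode are hypothetical and only for illustration. It encodes the expected text as UTF-8 and decodes the same bytes with different charset names: only the UTF-8 decode round-trips.

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

public class CharsetMismatchDemo {

    // Hypothetical helper: decodes raw bytes using the named charset.
    public static String decode(byte[] raw, String charsetName) {
        return new String(raw, Charset.forName(charsetName));
    }

    public static void main(String[] args) {
        // Expected text, written with Unicode escapes so the source file's
        // own encoding cannot interfere with the demonstration.
        String expected = "(Del lat. Ap\u0103ge 'quita, aparta', "
                + "y este del gr. \u1F0C\u03C0\u03B1\u03B3\u03B5).";

        // Assumption: the bytes on the wire are UTF-8.
        byte[] raw = expected.getBytes(StandardCharsets.UTF_8);

        for (String name : new String[] {"UTF-8", "ISO-8859-1", "ISO-8859-2", "windows-1250"}) {
            String decoded = decode(raw, name);
            // Compare in memory rather than by eye, since the console
            // may garble the printed text on its own.
            System.out.println(name + " round-trips: " + decoded.equals(expected));
        }
    }
}
```

If the UTF-8 decode is correct in memory but still prints wrongly, the console or the encoding used by System.out would be the next thing to check, rather than the charset passed to getPage.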