I am trying to read HTML from a URL connection. In one case, the HTML file I'm trying to read starts with 5 line breaks before the document type declaration. In this case, the input reader throws an EOFException.
URL pageUrl = new URL("http://www.nytimes.com/2011/03/15/sports/basketball/15nbaround.html");
URLConnection getConn = pageUrl.openConnection();
getConn.connect();
DataInputStream dis = new DataInputStream(getConn.getInputStream());
Has anyone encountered such a problem?
URL pageUrl = new URL("http://www.nytimes.com/2011/03/15/sports/basketball/15nbaround.html");
URLConnection getConn = pageUrl.openConnection();
getConn.connect();
DataInputStream dis = new DataInputStream(getConn.getInputStream());
String urlData = "";
while ((urlData = dis.readUTF()) != null)
    System.out.println(urlData);
// exception thrown
java.io.EOFException
    at java.io.DataInputStream.readUnsignedShort(DataInputStream.java:323)
    at java.io.DataInputStream.readUTF(DataInputStream.java:572)
    at java.io.DataInputStream.readUTF(DataInputStream.java:547)
With a BufferedReader, readLine() just returns null and reading does not continue:
URL pageUrl = new URL("http://www.nytimes.com/2011/03/15/sports/basketball/15nbaround.html");
URLConnection getConn = pageUrl.openConnection();
getConn.connect();
BufferedReader br = new BufferedReader(new InputStreamReader(getConn.getInputStream()));
String urlData = "";
while (true) {
    urlData = br.readLine(); // comes back null on the first call
    System.out.println(urlData);
}
This just prints null.
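For comparison, here is a minimal sketch (not the code from this post) of how the two readers treat the same bytes. It runs on in-memory data instead of the real URL, assumes UTF-8, and the names LineReadingSketch, readLines and readUtfHitsEof are made up for illustration. It relies on two documented behaviours: readUTF() expects data written by DataOutputStream.writeUTF(), that is a 2-byte length followed by modified UTF-8, so it is not meant for plain HTML text; and readLine() returns null only at end of stream, so leading blank lines come back as empty strings rather than null.

```java
import java.io.BufferedReader;
import java.io.ByteArrayInputStream;
import java.io.DataInputStream;
import java.io.EOFException;
import java.io.IOException;
import java.io.InputStream;
import java.io.InputStreamReader;
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper class, for illustration only.
final class LineReadingSketch {

    // Collects every line until readLine() returns null (end of stream).
    // UTF-8 is an assumption; a real page may declare another charset.
    static List<String> readLines(InputStream in) throws IOException {
        List<String> lines = new ArrayList<>();
        try (BufferedReader br = new BufferedReader(
                new InputStreamReader(in, StandardCharsets.UTF_8))) {
            String line;
            while ((line = br.readLine()) != null) {
                lines.add(line);
            }
        }
        return lines;
    }

    // Returns true if a single readUTF() call on these bytes ends in EOFException.
    static boolean readUtfHitsEof(byte[] data) throws IOException {
        try (DataInputStream dis = new DataInputStream(new ByteArrayInputStream(data))) {
            dis.readUTF(); // interprets the first two bytes as a length prefix
            return false;
        } catch (EOFException e) {
            return true;
        }
    }

    public static void main(String[] args) throws IOException {
        // Stand-in for a page that starts with 5 line breaks before the doctype.
        byte[] page = "\n\n\n\n\n<!DOCTYPE html>\n<html></html>\n"
                .getBytes(StandardCharsets.UTF_8);

        System.out.println("lines read: " + readLines(new ByteArrayInputStream(page)).size());
        System.out.println("readUTF hits EOF on page: " + readUtfHitsEof(page));
        System.out.println("readUTF hits EOF on empty stream: " + readUtfHitsEof(new byte[0]));
    }
}
```

In this sketch an empty stream makes readLine() return null on the first call and makes readUTF() fail while reading its length prefix. Both symptoms above would therefore be consistent with a response body that contains no data at all, although that is only a guess without inspecting the actual response.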