XPath: How to get a table cell value from an HTML document

I have an HTML document, and somewhere inside it is the table shown below. I can get the table rows as Java DOM objects, but it is not clear to me how to extract the value of a table cell when that value is a string, and also when it is a binary resource (an image).

I use code like:

    XPath xpath;
    XPathExpression expr;
    NodeList nodes = null;
    // Use XPath to obtain whatever you want from the (X)HTML
    try {

        xpath = XPathFactory.newInstance().newXPath();
        //<table class="data">

        NodeList list = doc.getElementsByTagName("table");
        // Node node = list.item(0);
        // System.out.println(node.getTextContent());
        // String textContent = node.getTextContent();

        expr = xpath.compile("//table/tr/td");
        nodes = (NodeList) expr.evaluate(doc, XPathConstants.NODESET);

and loop over the result like this:

     for (int i = 0; i < nodes.getLength(); i++) {

       Node ln = list.item(i);
       String lnText=ln.toString();
       NodeList rowElements=ln.getChildNodes();
       Node one=rowElements.item(0);

       String oneText=one.toString();
       String nodeName=one.getNodeName();
       String valOne = one.getNodeValue();

But I do not see the cell values. This is the table:

 <table class="data">
 <tr><td>ImageName1</td><td width="50"></td><td><img src="/images/036000291452" alt="036000291452" /></td></tr>
 <tr><td>ImageName2</td><td width="50"></td><td><img src="/images/36000291452" alt="36000291452" /></td></tr>
 <tr><td>Description</td><td></td><td>Time Magazine</td></tr>
 <tr><td>Size/Weight</td><td></td><td>14 Issues</td></tr>
 <tr><td>Issuing Country</td><td></td><td>United States</td></tr>
  </table>
+3
2 answers

This XPath expression:

/*/tr[1]/td[1]

selects the td element (in no namespace) that is the first td child of the first tr child of the top element (table) of the provided XML document.

The XPath expression:

/*/tr[1]/td[2]

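selects the td element (in no namespace) that is the second td child of the first tr child of the top element (table) of the provided XML document.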

In general:

/*/tr[$m]/td[$n]

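selects the td element (in no namespace) that is the $n-th td child of the $m-th tr child of the top element (table) of the provided XML document. Replace $m and $n with the desired integer values.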

To get the cell's text, use the standard XPath function string():

string(/*/tr[$m]/td[$n])

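This evaluates to the string value of the td element (in no namespace) that is the $n-th td child of the $m-th tr child of the top element (table) of the provided XML document.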
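
In Java, an expression of the form string(/*/tr[$m]/td[$n]) could be evaluated roughly as follows. This is only a sketch under a few assumptions: the table has been parsed as well-formed XML with table as the top element, and the names CellText, parse, cellText and SAMPLE are made up for illustration. The $m and $n variables are supplied through an XPathVariableResolver rather than by editing the expression text.

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.xpath.XPath;
import javax.xml.xpath.XPathConstants;
import javax.xml.xpath.XPathFactory;
import org.w3c.dom.Document;
import org.xml.sax.InputSource;

class CellText {
    // The table from the question, assumed here to be the whole document.
    static final String SAMPLE =
          "<table class=\"data\">"
        + "<tr><td>ImageName1</td><td width=\"50\"></td>"
        + "<td><img src=\"/images/036000291452\" alt=\"036000291452\" /></td></tr>"
        + "<tr><td>ImageName2</td><td width=\"50\"></td>"
        + "<td><img src=\"/images/36000291452\" alt=\"36000291452\" /></td></tr>"
        + "<tr><td>Description</td><td></td><td>Time Magazine</td></tr>"
        + "<tr><td>Size/Weight</td><td></td><td>14 Issues</td></tr>"
        + "<tr><td>Issuing Country</td><td></td><td>United States</td></tr>"
        + "</table>";

    // Hypothetical helper: parse a well-formed XML string into a DOM document.
    static Document parse(String xml) throws Exception {
        return DocumentBuilderFactory.newInstance()
                .newDocumentBuilder()
                .parse(new InputSource(new StringReader(xml)));
    }

    // Hypothetical helper: text of the n-th td in the m-th tr (both 1-based).
    static String cellText(Document doc, int m, int n) throws Exception {
        XPath xpath = XPathFactory.newInstance().newXPath();
        // Bind the XPath variables $m and $n to the method arguments.
        xpath.setXPathVariableResolver(name -> {
            switch (name.getLocalPart()) {
                case "m": return m;
                case "n": return n;
                default:
                    throw new IllegalArgumentException("Unknown variable: " + name);
            }
        });
        // Asking for XPathConstants.STRING yields the string value directly.
        return (String) xpath.evaluate(
                "string(/*/tr[$m]/td[$n])", doc, XPathConstants.STRING);
    }

    public static void main(String[] args) throws Exception {
        Document doc = parse(SAMPLE);
        // Row 3, column 3 of the sample table holds "Time Magazine".
        System.out.println(cellText(doc, 3, 3));
    }
}
```

If the table sits deeper inside a full HTML page, the leading /* would have to be replaced by a path to the table (for example //table[@class='data']), and the page would need to be parsed into a DOM first.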

+1

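For the text, use an expression such as "string(//td)", which returns the string value of the first matching cell. For the image, use something like "//td/img/@src", which gives you a URL; resolve that URL against the document's base URL and then fetch the resource it points to.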
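
A rough Java sketch of the image case: select the src attributes with //td/img/@src and resolve each one against a base URL. The base URL http://example.com/catalog/item.html is purely hypothetical, as are the names ImageUrls, parse, imageUrls and SAMPLE; the real base depends on where the page was loaded from.

```java
import java.io.StringReader;
import java.net.URI;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.xpath.XPath;
import javax.xml.xpath.XPathConstants;
import javax.xml.xpath.XPathFactory;
import org.w3c.dom.Document;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

class ImageUrls {
    // Two rows from the question's table, enough to show the image cells.
    static final String SAMPLE =
          "<table class=\"data\">"
        + "<tr><td>ImageName1</td><td width=\"50\"></td>"
        + "<td><img src=\"/images/036000291452\" alt=\"036000291452\" /></td></tr>"
        + "<tr><td>ImageName2</td><td width=\"50\"></td>"
        + "<td><img src=\"/images/36000291452\" alt=\"36000291452\" /></td></tr>"
        + "</table>";

    // Hypothetical helper: parse a well-formed XML string into a DOM document.
    static Document parse(String xml) throws Exception {
        return DocumentBuilderFactory.newInstance()
                .newDocumentBuilder()
                .parse(new InputSource(new StringReader(xml)));
    }

    // Hypothetical helper: absolute URLs of all images found inside table cells.
    static List<URI> imageUrls(Document doc, URI base) throws Exception {
        XPath xpath = XPathFactory.newInstance().newXPath();
        NodeList srcs = (NodeList) xpath.evaluate(
                "//td/img/@src", doc, XPathConstants.NODESET);
        List<URI> result = new ArrayList<>();
        for (int i = 0; i < srcs.getLength(); i++) {
            // Each node is an attribute; its node value is the raw src text.
            result.add(base.resolve(srcs.item(i).getNodeValue()));
        }
        return result;
    }

    public static void main(String[] args) throws Exception {
        URI base = URI.create("http://example.com/catalog/item.html"); // assumed
        for (URI url : imageUrls(parse(SAMPLE), base)) {
            // The binary content could then be read with url.toURL().openStream().
            System.out.println(url);
        }
    }
}
```

Actually downloading the bytes is left out here, since it needs network access and error handling that depend on the application.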

-1
