How to get the InnerText of multiple <a> tags?

This is my sample page. I want to combine the inner text of all the <a> tags into one line.

<body>
    <div id="infor">
        <div id="genres">
            <a href="#" >Animation</a>
            <a href="#" >Short</a>
            <a href="#" >Action</a>
        </div>
    </div>
</body>

I want to get the inner text of all the <a> tags on one line. I used the code below for this, but it does not work correctly.

class Values
{
    private HtmlAgilityPack.HtmlDocument _markup;

    HtmlWeb web = new HtmlWeb(); //creating object of HtmlWeb
    form1 frm = new form1;

    _markup = web.Load("mypage.html"); // load page

    public string Genres
    {
        get
        {
            HtmlNodeCollection headers = _markup.DocumentNode.SelectNodes("//div[contains(@id, 'infor')]/a"); // I filter all of <a> tags in <div id="infor">
            if (headers != null)
            {
                string genres = "";
                foreach (HtmlNode header in headers) // I'm not sure what happens here. 
                {
                    HtmlNode genre = header.ParentNode.SelectSingleNode(".//a[contains(@href, '#')]"); //I think an error occurred in here... 
                    if (genre != null)
                    {
                        genres += genre.InnerText + ", ";
                    }
                }
                return genres;
            }
            return String.Empty;
        }
    }

    frm.text1.text=Genres;
}

text1 (return value):

Animation, Animation, Animation,

But I need the output as follows:

Animation, Short, Action,
2 answers

I think a little LINQ and the use of Descendants would make this easier.

// Requires "using System.Linq;". The div in the sample page has id "genres".
var genreNode = _markup.DocumentNode.Descendants("div").FirstOrDefault(n => n.Id == "genres");
if (genreNode != null)
{
    // This pulls all <a> nodes under the genres div, puts their inner text into an array,
    // then joins that array using ", " as the separator.
    return string.Join(", ", genreNode.Descendants("a")
        .Where(n => n.GetAttributeValue("href", string.Empty).Equals("#"))
        .Select(n => n.InnerText).ToArray());
}

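I think the problem is in header.ParentNode.SelectSingleNode(".//a[contains(@href, '#')]"). On every iteration you go back up to the parent div and then select the first matching a beneath it, which is always the same one (Animation). You already have the a node in header, so there is no need to query again. If you want to keep the href check, you can move it into the first query, like this: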

HtmlNodeCollection headers = _markup.DocumentNode.SelectNodes("//div[contains(@id, 'infor')]//a[contains(@href, '#')]"); // "//a" also matches links nested in an inner div such as <div id="genres">
if (headers != null)
{
    string genres = "";
    foreach (HtmlNode header in headers) // header is already the <a> node, so read its text directly
    {
        genres += header.InnerText + ", ";
    }
    return genres;
}
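
For anyone who wants to try this outside the original class, here is a minimal self-contained sketch of the same idea. It assumes the HtmlAgilityPack NuGet package is referenced; the GenreDemo class and the JoinGenres method are hypothetical names used only for illustration, and the HTML is the sample page from the question inlined as a string (with single-quoted attributes so it fits in a C# literal).

```csharp
using System;
using System.Linq;
using HtmlAgilityPack;

static class GenreDemo
{
    // Hypothetical helper: returns the inner text of every <a> inside
    // <div id="genres">, joined with ", ".
    public static string JoinGenres(string html)
    {
        var doc = new HtmlDocument();
        doc.LoadHtml(html);

        // "//a" after the div selects descendant links at any depth.
        // SelectNodes returns null rather than an empty collection when nothing matches.
        var links = doc.DocumentNode.SelectNodes("//div[@id='genres']//a");
        if (links == null)
            return string.Empty;

        // Each element is already an <a> node, so read its text directly.
        return string.Join(", ", links.Select(a => a.InnerText.Trim()));
    }

    static void Main()
    {
        const string html = @"
<body>
    <div id='infor'>
        <div id='genres'>
            <a href='#' >Animation</a>
            <a href='#' >Short</a>
            <a href='#' >Action</a>
        </div>
    </div>
</body>";

        Console.WriteLine(JoinGenres(html));
    }
}
```

Note that string.Join does not leave a separator after the last item, so this gives "Animation, Short, Action" without the trailing comma; append ", " yourself if you really need it.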