How to use ScrapySharp to analyze elements in an HTML document?

Question

Here is the project's official "documentation":

https://bitbucket.org/rflechner/scrapysharp/wiki/Home

No matter what I try, I cannot find the CssSelect() method that the library is supposed to add to simplify querying. Here is what I tried:

using ScrapySharp.Core;
using ScrapySharp.Html.Parsing;
using HtmlAgilityPack;

HtmlWeb web = new HtmlWeb();
HtmlDocument doc = web.Load("http://www.stackoverflow.com");

var page = doc.DocumentNode.SelectSingleNode("//body");
page.CssSel???

How can I use this library? The documentation is not clear about what type html is.

html c# web-scraping html-agility-pack scrapysharp

sergserg Mar 31 '13 at 1:11

1 answer

Ben allred · Accepted Answer · 2013-03-31T07:08:35+0000

Add

using ScrapySharp.Extensions;

It seems like you are missing it. Adding it should make the CssSelect extension method available.

In case an example helps, here is a method that I use in a project:

private string GetPdfUrl(HtmlDocument document, string baseUrl)
{
    // Select the single PDF link with a CSS selector, then resolve its href against the base URL.
    var href = document.DocumentNode
        .CssSelect(".table-of-content .head-row td.download a.text-pdf")
        .Single().Attributes["href"].Value;
    return new Uri(new Uri(baseUrl), href).ToString();
}
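
If a smaller, self-contained starting point is useful, here is a minimal sketch of the same idea. It assumes the HtmlAgilityPack and ScrapySharp NuGet packages are referenced. The class CssSelectDemo and the method ExtractHrefs are hypothetical names made up for illustration, not part of either library. It parses HTML from a string rather than loading a URL, so it does not depend on any particular site:

```csharp
using System.Collections.Generic;
using System.Linq;
using HtmlAgilityPack;
using ScrapySharp.Extensions; // brings the CssSelect() extension methods into scope

public static class CssSelectDemo
{
    // Hypothetical helper (not part of ScrapySharp): returns the href of every
    // element matched by the given CSS selector, skipping elements without one.
    public static List<string> ExtractHrefs(string html, string selector)
    {
        var doc = new HtmlDocument();
        doc.LoadHtml(html); // parse from a string, so no network access is needed

        return doc.DocumentNode
            .CssSelect(selector) // extension method on HtmlNode, yields the matching nodes
            .Select(node => node.GetAttributeValue("href", string.Empty))
            .Where(href => href.Length > 0)
            .ToList();
    }
}
```

Calling CssSelectDemo.ExtractHrefs(markup, "ul.nav a") on a page with a ul.nav list should return the href values of the links inside that list. The same CssSelect call works on the page node from the question once the using directive is in place.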
