Html Agility Pack settings

I am using Html Agility Pack to parse HTML, following the question "What is the best way to parse HTML in C#?", and I get excellent results :) The problem is that some web pages return results based on my location. For example, because I am in Spain, I get results for the Spanish region, and I would like to get the results I would see if I were in England. How can this be done? Specifically, what do I need to change in the user agent? (I use "Mozilla/5.0 (Windows; U; Windows NT 5.1; en-US; rv:xxx) Gecko/20041107 Firefox/xx" as the user agent.)

2 answers

You can download the page with WebClient.DownloadString. WebClient lets you set HTTP request headers before the request is sent, and you can then pass the downloaded HTML to Html Agility Pack.

The User-Agent header does not control the language. That is the job of the Accept-Language header. For example:

// Requires: using System.Net; using HtmlAgilityPack;
using (var client = new WebClient())
{
    // Ask for British English content, since the goal is results as seen from England
    client.Headers[HttpRequestHeader.AcceptLanguage] = "en-GB";
    client.Headers[HttpRequestHeader.UserAgent] = "some user agent if you wish";
    string html = client.DownloadString("http://example.com");

    // feed the HTML to HTML Agility Pack
    var doc = new HtmlDocument();
    doc.LoadHtml(html);

    // now do the parsing
}

But if the site uses IP-based geolocation to decide which content to send you, nothing you change in the request headers on the client side will affect that.


Location-specific search results or pages are usually chosen based on your IP address, or on the country you selected when you registered with the site. You may want to look into an anonymous proxy located in the country you want to appear to be browsing from.
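
As a rough sketch of the proxy idea, WebClient can send its requests through a proxy via its Proxy property. The class name ProxyClientSketch, the method CreateClient, and the proxy address used in the comments are made up for illustration. You would need to substitute a proxy in the target country that you are permitted to use, and whether the site then serves the other region's content depends entirely on the site.

```csharp
using System;
using System.Net;

public static class ProxyClientSketch
{
    // Builds a WebClient whose requests go through the given HTTP proxy.
    // proxyHost and proxyPort are placeholders supplied by the caller;
    // no real proxy service is assumed here.
    public static WebClient CreateClient(string proxyHost, int proxyPort, string acceptLanguage)
    {
        var client = new WebClient();

        // Route traffic through the proxy so the site sees the proxy's IP address
        client.Proxy = new WebProxy(proxyHost, proxyPort);

        // Also state the preferred language, in case the site honours it
        client.Headers[HttpRequestHeader.AcceptLanguage] = acceptLanguage;

        return client;
    }
}

// Possible usage (hypothetical proxy address):
//
// using (var client = ProxyClientSketch.CreateClient("proxy.example.co.uk", 8080, "en-GB"))
// {
//     string html = client.DownloadString("http://example.com");
//     // load html into an HtmlDocument as in the answer above
// }
```

Keeping the client setup in one method means the download and parsing code stays the same whether or not a proxy is used.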
