Reading and posting to web pages using C#

I have a project at work that requires me to enter information on a web page, read the page I'm redirected to, and then take further action. A simplified real-world example would be going to google.com, entering "Coding Tricks" as the search criteria, and reading the resulting page.

Small coding examples, such as the one at http://www.csharp-station.com/HowTo/HttpWebFetch.aspx , show how to read a web page, but not how to interact with it by submitting information to a form and moving on to the next page.

For the record, I am not building a malicious and/or spam-related product.

So, how do I go about reading web pages that require a few simple browsing steps to reach?

+2
6 answers

You can programmatically create an HTTP request and read the response:

        // Requires: using System; using System.IO; using System.Net; using System.Text;
        string uri = "http://www.google.com/search";
        HttpWebRequest request = (HttpWebRequest)WebRequest.Create(uri);
        request.Method = "POST";
        request.ContentType = "application/x-www-form-urlencoded";

        // encode the data to POST:
        string postData = "q=searchterm&hl=en";
        byte[] encodedData = Encoding.ASCII.GetBytes(postData);
        request.ContentLength = encodedData.Length;

        // write the body; disposing the stream completes the request body
        using (Stream requestStream = request.GetRequestStream())
        {
            requestStream.Write(encodedData, 0, encodedData.Length);
        }

        // send the request and get the response
        using (HttpWebResponse response = (HttpWebResponse)request.GetResponse())
        {
            // Do something with the response stream. As an example, we'll
            // stream the response to the console via a 256 character buffer
            using (StreamReader reader = new StreamReader(response.GetResponseStream()))
            {
                char[] buffer = new char[256];
                int count = reader.Read(buffer, 0, buffer.Length);
                while (count > 0)
                {
                    // Write, not WriteLine, so no line break is added per chunk
                    Console.Write(new string(buffer, 0, count));
                    count = reader.Read(buffer, 0, buffer.Length);
                }
            } // reader is disposed here
        } // response is disposed here

Of course, this code will return an error since Google uses GET, not POST, for search queries.

This method will work if you are dealing with specific web pages, since the URLs and POST data are mostly hardcoded. If you need something more dynamic, you will have to:

  • Fetch the page
  • Extract the form
  • Build a POST string from the form fields
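The last two steps can be sketched roughly as follows. This is a minimal illustration, not a robust scraper: `FormPostBuilder`, `ExtractInputs`, and `BuildPostData` are hypothetical names, the sample HTML is made up, and the regular expressions assume double-quoted attributes on `<input>` tags only. A real page would likely call for a proper HTML parser and for handling `<select>` and `<textarea>` as well.

```csharp
using System;
using System.Collections.Generic;
using System.Text.RegularExpressions;

static class FormPostBuilder
{
    // Collects name/value pairs from <input> tags. The regexes are a rough
    // stand-in for an HTML parser and assume double-quoted attributes.
    public static Dictionary<string, string> ExtractInputs(string html)
    {
        var fields = new Dictionary<string, string>();
        foreach (Match tag in Regex.Matches(html, "<input\\b[^>]*>", RegexOptions.IgnoreCase))
        {
            Match name = Regex.Match(tag.Value, "\\sname=\"([^\"]*)\"", RegexOptions.IgnoreCase);
            Match value = Regex.Match(tag.Value, "\\svalue=\"([^\"]*)\"", RegexOptions.IgnoreCase);
            if (name.Success)
                fields[name.Groups[1].Value] = value.Success ? value.Groups[1].Value : "";
        }
        return fields;
    }

    // Joins the fields into an application/x-www-form-urlencoded body.
    public static string BuildPostData(IDictionary<string, string> fields)
    {
        var pairs = new List<string>();
        foreach (KeyValuePair<string, string> field in fields)
            pairs.Add(Uri.EscapeDataString(field.Key) + "=" + Uri.EscapeDataString(field.Value));
        return string.Join("&", pairs.ToArray());
    }

    static void Main()
    {
        // Made-up form markup standing in for a fetched page.
        string html = "<form action=\"/login\" method=\"post\">" +
                      "<input type=\"hidden\" name=\"token\" value=\"abc123\">" +
                      "<input type=\"text\" name=\"user\" value=\"\"></form>";
        Dictionary<string, string> fields = ExtractInputs(html);
        fields["user"] = "jane doe"; // fill in the field the user would type
        Console.WriteLine(BuildPostData(fields));
    }
}
```

The resulting string can be passed as `postData` to a POST request like the one above. Hidden fields such as the `token` here are the usual reason to read the form first instead of hardcoding the body.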

FWIW, I think something like Perl or Python might be better suited for this kind of task.

edit: x-www-form-urlencoded

+4

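Try Selenium. Record the steps in Firefox with Selenium IDE, export the script as C#, and play it back through Selenium RC from C#. Alternatively, use System.Net.HttpWebRequest or System.Net.WebClient. For a desktop application, also look at System.Windows.Forms.WebBrowser.

Addendum: Selenium IDE and Selenium RC are Java-based; the comparable WatiN Test Recorder and WatiN are .NET-based.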

+3

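You will be working directly with the HTML that the server returns.

To fetch it, use System.Net.HttpWebRequest/HttpWebResponse or the simpler System.Net.WebClient. You may also need to handle cookies and similar session details.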
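For the cookie part, one common approach with HttpWebRequest is to share a single CookieContainer across requests, so cookies set by one response are sent with the next request. A minimal sketch, assuming hypothetical names (`CookieSession`, `CreateRequest`, `Fetch`) and placeholder example.com URLs:

```csharp
using System;
using System.IO;
using System.Net;

static class CookieSession
{
    // One container shared by every request, so cookies set by an earlier
    // response are sent back on later requests to the same site.
    static readonly CookieContainer Cookies = new CookieContainer();

    public static HttpWebRequest CreateRequest(string uri)
    {
        HttpWebRequest request = (HttpWebRequest)WebRequest.Create(uri);
        request.CookieContainer = Cookies; // without this, cookies are ignored
        return request;
    }

    public static string Fetch(string uri)
    {
        HttpWebRequest request = CreateRequest(uri);
        using (HttpWebResponse response = (HttpWebResponse)request.GetResponse())
        using (StreamReader reader = new StreamReader(response.GetResponseStream()))
        {
            return reader.ReadToEnd();
        }
    }

    static void Main()
    {
        // Placeholder URLs: the first page sets a session cookie,
        // the second request sends it back automatically.
        Fetch("http://example.com/login");
        string page = Fetch("http://example.com/account");
        Console.WriteLine(page.Length);
    }
}
```

WebClient does not expose a CookieContainer directly, which is one reason to drop down to HttpWebRequest once a site depends on session state.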

+2

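Often the form's input is simply passed in the URL, so you do not need to submit a form at all. For example, a Google search for "beetles" is just a request to google.com?q=beetles.

If the site takes its input through the query string (URL), build that URL yourself and request it directly. For Google, a WebRequest and its WebResponse are all you need.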
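A query-string request of this kind might look like the sketch below. `QueryStringSearch` and `BuildSearchUrl` are hypothetical names, and the `/search` path and `q` parameter are taken from the POST example earlier on this page; whether Google accepts a plain scripted request like this is not guaranteed.

```csharp
using System;
using System.IO;
using System.Net;

static class QueryStringSearch
{
    // Hypothetical helper: appends an escaped search term as the q parameter.
    public static string BuildSearchUrl(string baseUri, string term)
    {
        return baseUri + "?q=" + Uri.EscapeDataString(term);
    }

    static void Main()
    {
        string url = BuildSearchUrl("http://www.google.com/search", "beetles");
        WebRequest request = WebRequest.Create(url); // GET is the default method
        using (WebResponse response = request.GetResponse())
        using (StreamReader reader = new StreamReader(response.GetResponseStream()))
        {
            Console.WriteLine(reader.ReadToEnd());
        }
    }
}
```

Escaping the term matters as soon as it contains spaces or punctuation: "Coding Tricks" becomes `Coding%20Tricks` in the URL.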

0

Another option is iMacros:

http://www.iopus.com/

The top-tier edition has a graphical interface for recording and editing macros, as well as C# libraries that you can call from .NET code.

IMHO, this is one of those areas of programming that seems simple at the outset ("I just GET the HTML for the page, process the string, then GET the next page ..."), but in practice it turns out to be a real PITA.

0