TEAR: "Tearing" HTML Pages Off the Internet


The TEAR sample shows how to write an MFC console application that uses WININET.DLL to communicate with the Internet. The sample shows how to form an HTTP request using CHttpFile against CHttpConnection and CInternetSession objects.

Running the Sample

You must pass TEAR a URL that references an HTTP server and the Web page you would like to retrieve. To run the sample, open a command prompt window, type TEAR and the URL at the command line, and press ENTER. The program connects to the Internet, retrieves the page, and copies the raw HTML for the page to stdout. For example:

TEAR http://www.microsoft.com/ie/support/default.htm

The TEAR sample accepts several command-line options. To display them, open a command prompt window, type TEAR with no arguments at the command line, and press ENTER. The program displays a list of command-line options.

You can use the /D option to force the program to use the preconfigured Internet access parameters that you have set up on your computer via Control Panel or the Internet Access Setup Wizard. If your local area network is directly connected to the Internet, use the /L option. If you need to reach the Internet through a gateway, use the /G option. The program responds to these options by changing the session flags passed to the CInternetSession constructor.
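The mapping from these options to an access type can be sketched in portable C++. This is only an illustration, not the sample's own code: the AccessType enumeration and the accessTypeFromArgs function are hypothetical names, the enumerators stand in for whatever flag values the CInternetSession constructor actually expects, and defaulting to the preconfigured setting when no option is given is an assumption.

```cpp
#include <string>
#include <vector>

// Hypothetical stand-ins for the access-type flags passed to the session
// constructor; the real WININET constants are not reproduced here.
enum class AccessType { Preconfigured, Direct, Gateway };

// Scan the arguments for /D, /L, or /G (either case) and return the
// matching access type. The last recognized option wins. Falling back to
// Preconfigured when no option is present is an assumption of this sketch.
AccessType accessTypeFromArgs(const std::vector<std::string>& args) {
    AccessType type = AccessType::Preconfigured;
    for (const std::string& arg : args) {
        if (arg.size() != 2 || arg[0] != '/') {
            continue;  // not a single-letter option (for example, the URL)
        }
        switch (arg[1]) {
            case 'D': case 'd': type = AccessType::Preconfigured; break;
            case 'L': case 'l': type = AccessType::Direct;        break;
            case 'G': case 'g': type = AccessType::Gateway;       break;
            default: break;  // some other option; ignored here
        }
    }
    return type;
}
```

A caller would build the argument vector from argv and translate the returned value into the session flag before constructing the session object.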

The /S option strips HTML formatting tags from the retrieved text, while the /P option causes the application to use CInternetSession::EnableStatusCallback to get information on the progress of the connection.
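As a rough idea of what stripping formatting tags involves, the following portable C++ sketch drops everything between a '<' and the next '>'. The stripTags function is a hypothetical helper written for this illustration and is not taken from the sample; it is deliberately naive and does not handle comments, scripts, character entities, or a '>' inside an attribute value.

```cpp
#include <string>

// Return a copy of html with every "<...>" run removed.
// Text outside of tags, including a stray '>', is kept unchanged.
std::string stripTags(const std::string& html) {
    std::string text;
    bool inTag = false;
    for (char ch : html) {
        if (ch == '<') {
            inTag = true;            // start skipping at the opening bracket
        } else if (ch == '>' && inTag) {
            inTag = false;           // closing bracket ends the skipped run
        } else if (!inTag) {
            text += ch;              // ordinary text between tags
        }
    }
    return text;
}
```

For example, passing "<p>Hello <b>world</b></p>" yields "Hello world".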

Redirection to a Different Server

Redirection in the TEAR sample deserves special attention. TEAR uses the INTERNET_FLAG_NO_AUTO_REDIRECT flag when it sends its request. This means that WININET doesn't automatically handle redirection responses by redirecting to another server. Instead, WININET reports them back to TEAR as errors.

There is a substantial amount of extra code in the sample to detect and react to these errors, and most applications won't need that code. Normally, applications won't specify the INTERNET_FLAG_NO_AUTO_REDIRECT flag when making HTTP requests. This lets WININET handle redirection transparently. Since TEAR is a sample, however, it provides a good vehicle for demonstrating the use of the CHttpFile::QueryInfo and CHttpFile::QueryInfoStatusCode member functions.

When TEAR detects a redirection error, it reacts by using QueryInfo to return all of the headers passed in the response header. It parses the response header to find the new location (that is, the target of the server's redirection) and handles the error appropriately. It is possible to ask TEAR to connect to a site that issues a redirection to another URL that in turn issues yet another redirection. In this situation, TEAR fails the request and prints an error message.
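The header parsing described above can be sketched in portable C++, independent of MFC. Both functions are hypothetical helpers written for this illustration. The sketch assumes that the raw headers arrive as one string of CRLF-separated lines, that the new location is carried in a Location header, and that 301, 302, and 303 are the status codes treated as redirections; the sample's own checks may differ.

```cpp
#include <algorithm>
#include <cctype>
#include <sstream>
#include <string>

// Status codes this sketch treats as redirections (an assumption).
bool isRedirectStatus(int status) {
    return status == 301 || status == 302 || status == 303;
}

// Return the value of the first "Location" header found in a raw header
// block, or an empty string if there is none. The header name is matched
// without regard to case, and surrounding spaces and tabs are trimmed.
std::string findLocation(const std::string& rawHeaders) {
    const std::string name = "location:";
    std::istringstream lines(rawHeaders);
    std::string line;
    while (std::getline(lines, line)) {
        if (!line.empty() && line.back() == '\r') {
            line.pop_back();  // drop the CR left over from a CRLF ending
        }
        if (line.size() < name.size()) {
            continue;
        }
        std::string prefix = line.substr(0, name.size());
        std::transform(prefix.begin(), prefix.end(), prefix.begin(),
                       [](unsigned char c) {
                           return static_cast<char>(std::tolower(c));
                       });
        if (prefix != name) {
            continue;
        }
        std::size_t first = line.find_first_not_of(" \t", name.size());
        if (first == std::string::npos) {
            return "";        // header present but has no value
        }
        std::size_t last = line.find_last_not_of(" \t");
        return line.substr(first, last - first + 1);
    }
    return "";
}
```

A caller that follows at most one redirection would check the status code with isRedirectStatus, request the URL returned by findLocation once, and report an error if the second response is also a redirection.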

You can learn more about the redirection status codes and the HTTP protocol in general by reading the specifications available at http://www.w3.org.