Hyper Text Markup Language - or HTML

Most of you will be at least a little familiar with HyperText Markup Language, or HTML. When you want to go to a web site, you typically enter a Uniform Resource Locator, or URL. This might be something like http://www.wrox.com. The http prefix tells the browser to use the HyperText Transfer Protocol (HTTP) to retrieve a page from that web site. If no page is specified, as in our example, the web server sends whichever page it is configured to treat as its default. Here we will assume that is default.html, so the default.html page is downloaded from the domain www.wrox.com.

So when we retrieve a page from a web site by specifying a URL, we are really providing our browser with three pieces of information:

the protocol - http
the host, the unique location of the resource we want - www.wrox.com
the page on that host that we want to download - in this case default.html

So the URL is really http://www.wrox.com/default.html. We provide our browser with the protocol, the location of the .html file, and the name of the file to retrieve. The URL specifies a unique .html file somewhere in the World Wide Web - there is a fully qualified URL that is unique to each and every web page on the planet.
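The three pieces can be pulled apart in a few lines of code. The following is only a sketch in Python, using the standard urllib.parse module. The function name describe_url and its default_page parameter are made up for illustration, and the fallback to default.html simply mirrors the assumption in our example. In practice the web server, not the browser, decides which page is the default.

```python
from urllib.parse import urlsplit


def describe_url(url, default_page="default.html"):
    """Split a URL into the three pieces discussed above.

    default_page is an assumption for illustration: the real default
    document is chosen by the web server's configuration.
    """
    parts = urlsplit(url)
    path = parts.path
    # An empty path, or one ending in "/", names no specific page.
    if path == "" or path.endswith("/"):
        page = default_page
    else:
        page = path.rsplit("/", 1)[-1]
    return {"protocol": parts.scheme, "host": parts.hostname, "page": page}


if __name__ == "__main__":
    print(describe_url("http://www.wrox.com"))
    print(describe_url("http://www.wrox.com/default.html"))
```

Under these assumptions, both calls report the same protocol, host, and page, which is the point made above: the short URL is shorthand for the fully qualified one.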

An HTML file is really just an ASCII file that uses HTML tags to tell the browser how to render the text. For example, the HTML page might have something like:

 <B>This text will be displayed in bold</B>

The text we will see on the page when we view it with our browser is sandwiched between the two tags: there's a 'Bold' tag - <B>, and an 'End Bold' tag - </B>. The tags are enclosed in <> brackets. If you know the meanings of the various tags, it is actually pretty easy to figure out what the resulting page will look like.

In this case the <B> tag tells the browser to make everything that follows it bold until the closing tag, </B>, is encountered. As you can see, most tags come in pairs and act, conceptually, as containers. Another tag, <CENTER>, centres whatever it contains:

<CENTER>This text will be centred in the browser display</CENTER>

That's all there is to it in principle. A single human-readable ASCII HTML file can be rendered to look almost the same in any browser, running on any type of computer, anywhere in the world. It is the individual browser's responsibility to render the tags in the way that browser does best. So the beauty of HTML is that it is browser-independent: we can display an HTML page in Netscape Navigator, Microsoft Internet Explorer, or any other browser. Whether the browser is running on an Apple, IBM, Sun, or any other machine, it is responsible for rendering the HTML to the screen. As long as we serve up the HTML page correctly, it is the browser's responsibility to make it look good.
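To make the container idea concrete, here is a small sketch in Python of the bookkeeping a browser has to do: for each run of text, work out which tags currently enclose it. It uses the standard html.parser module. The class name TagTracer and the function trace are invented for this illustration, and a real browser does far more than this. Note that the parser reports tag names in lowercase.

```python
from html.parser import HTMLParser


class TagTracer(HTMLParser):
    """Record which tags enclose each run of text."""

    def __init__(self):
        super().__init__()
        self.open_tags = []  # stack of tags that are currently open
        self.runs = []       # (text, enclosing tags) pairs, in page order

    def handle_starttag(self, tag, attrs):
        self.open_tags.append(tag)

    def handle_endtag(self, tag):
        # Pop back to the most recent matching open tag, if there is one.
        if tag in self.open_tags:
            while self.open_tags.pop() != tag:
                pass

    def handle_data(self, data):
        text = data.strip()
        if text:
            self.runs.append((text, tuple(self.open_tags)))


def trace(html):
    tracer = TagTracer()
    tracer.feed(html)
    tracer.close()  # flush any text still buffered by the parser
    return tracer.runs


if __name__ == "__main__":
    for text, tags in trace("<B>This text will be displayed in bold</B> and this will not"):
        print(tags, text)
```

Each browser is free to decide what a 'b' or 'center' container should look like on its own screen, which is exactly why the same file can be displayed anywhere.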

When you get a second, fire up your browser and access any web page on the Internet. Then from the main menu of the browser, select View and then Source. The browser will typically open a Notepad window and show you the HTML code for the web page that it is currently displaying. This exercise is very informative. You can see just how simple most web pages really are.

The problem is that HTML pages are like the old 8-millimeter movies of the 1950s. They are quaint, but can't do anything beyond displaying static text. Of course, there are tags to include graphics, such as .GIF files. Other tags permit the downloading of sound in .WAV files or even movies in .AVI files. But once an HTML page is defined by writing it in the tag-based HTML language, that's it. It is essentially static.

So what happens if we need to present specific information depending on who is asking for it? Until recently, only gurus could accomplish this magic. They would write what are called Common Gateway Interface (CGI) scripts, typically in a rather arcane scripting language called Perl. This approach usually worked, but was very esoteric and error prone.
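To give a feel for what such a script does, here is a minimal sketch. It is written in Python rather than Perl, purely to keep the example short, and it is not taken from any real site. The function name build_response and the "name" parameter are hypothetical. The sketch assumes the web server follows the usual CGI convention of passing the part of the URL after the "?" in the QUERY_STRING environment variable.

```python
import os
from html import escape
from urllib.parse import parse_qs


def build_response(query_string):
    """Return a complete CGI response tailored to the requester.

    The "name" parameter is a hypothetical example chosen for this sketch.
    """
    params = parse_qs(query_string)
    name = params.get("name", ["stranger"])[0]
    # Escape the value so that user input cannot inject its own tags.
    body = "<HTML><BODY><B>Hello, " + escape(name) + "</B></BODY></HTML>"
    # A CGI response is a header block, a blank line, then the page itself.
    return "Content-Type: text/html\r\n\r\n" + body


if __name__ == "__main__":
    # The web server sets QUERY_STRING before it runs the script.
    print(build_response(os.environ.get("QUERY_STRING", "")), end="")
```

The page is no longer a fixed file. The HTML is generated afresh for each request, so two visitors asking for the same URL with different parameters receive different pages.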