A First Web Page

Web pages are written using the HyperText Markup Language (HTML). In January 2000, the W3C published a recommendation that reformulated the language as XHTML (eXtensible HTML). The main difference between HTML 4.01 (the last version of HTML before HTML5) and XHTML is that the latter is an application of XML, and as such may be processed using standard XML tools. In most other respects, XHTML is identical to HTML 4.01, although documents must strictly conform to the rules of XML. All tags must be closed, for example, and tag names and attribute names are case sensitive (XHTML requires lower case for all tag and attribute names).

Having said that, we will start by using some generic HTML code to create a simple web page that should display correctly in any browser, and worry about standards later. Before you start, create a directory called "Web" on your hard drive (or on an external disk drive or flash drive). Then, using a text editor such as MS Notepad, open a new text file and type in the following code, exactly as shown (the indentation has no effect on how the web page will be displayed; it simply highlights the structure of the document and makes the code a bit easier to read):


<html>
  <head>
    <title>My First Web Page</title>
  </head>

  <body>
    Hello World!
  </body>
</html>


Save the file in the "Web" directory with the filename "index.html". If you are using Notepad, be sure to set the "Save as type:" box to "All files", or the file will be saved with Notepad's default ".txt" extension. Note that it is standard practice to use only lower case characters for HTML or XHTML file names.

The name "index.html" is significant. If a web server receives an incoming HTTP request that contains only the server's domain name (e.g. "www.technologyuk.net"), most web server applications look by default for a document called "index.html" in the root directory of the web server, and return that document to the client browser if it is found. MS Windows-based servers have traditionally been the exception, using default document names such as "default.htm", "default.asp" or "default.aspx".

Open a web browser, and use the browser's File menu to navigate to the "Web" directory and open the file "index.html". The result should be a web page that looks something like the illustration below.


The browser output for a basic 'Hello World!' HTML document


This web page is admittedly not particularly inspiring, but it contains the main structural elements found in all web pages. The structure and content of a web page are marked up using tags. A tag consists of the name of the tag enclosed within angle brackets (e.g. <tagname>). An opening tag may optionally contain one or more attributes, separated from the tag name and from each other by spaces. An attribute consists of an attribute name, followed by an equals sign (=) and a value enclosed within quotes (""). Paired tags consist of an opening tag and a closing tag. The closing tag is identified by a solidus (forward slash) immediately following the opening angle bracket. A typical tag set might appear as follows:


<tag_name attr_name1="value1" attr_name2="value2" . . . >
  ... something goes here ...
</tag_name>

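As a concrete illustration of the generic pattern above, here is a minimal sketch of one real tag set: a hyperlink. The "a" tag and its "href" and "title" attributes are standard HTML, but the URL and the link text are chosen purely for illustration and are not part of the page we are building.

```html
<!-- Opening tag: tag name "a" followed by two attributes.
     Attributes are separated by white space, and each value is quoted. -->
<a href="https://www.w3.org/" title="World Wide Web Consortium">
  W3C home page
</a>
<!-- The closing tag above repeats the tag name after a solidus (/) -->
```

If you paste these lines between the <body> and </body> tags of "index.html", the browser should display the text "W3C home page" as a clickable link.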

The <html> . . . </html> tag set encloses the entire contents of the document, and its main purpose is to inform the browser that this is HTML code. The document is subdivided into a head section (enclosed within the <head> . . . </head> tag set) and a body section (enclosed within the <body> . . . </body> tag set).

The head of the document usually contains the <title> . . . </title> tag set as a minimum, and may contain other data, such as links to external style sheets or script files, and metadata (information that describes the contents of the document). None of the information in the head section is displayed in the main browser window, although it may indirectly affect the way in which the content of the main browser window is displayed. The <title> . . . </title> tag set encloses a text string that is displayed in the browser's title bar (or on the page's tab) when the document is loaded.

The <body> . . . </body> tag set encloses the visible content of the document, i.e. the information that will be displayed within the main browser window. The text "Hello World!" is simply displayed in the top left-hand corner of the main browser window, in the browser's default font (we have not yet applied any markup to this text).
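To make the role of the head section more concrete, the sketch below extends the same page with the kinds of head content mentioned above: a metadata item, a link to an external style sheet, and a link to an external script file. The file names "styles.css" and "script.js" and the description text are hypothetical and used only for illustration; they do not refer to files created earlier in this section.

```html
<html>
  <head>
    <!-- Shown in the browser's title bar, not in the page itself -->
    <title>My First Web Page</title>

    <!-- Metadata: describes the document's contents -->
    <meta name="description" content="A first web page">

    <!-- Link to an external style sheet (hypothetical file name) -->
    <link rel="stylesheet" type="text/css" href="styles.css">

    <!-- Link to an external script file (hypothetical file name) -->
    <script type="text/javascript" src="script.js"></script>
  </head>

  <body>
    Hello World!
  </body>
</html>
```

If the style sheet and script files do not exist, the browser should simply ignore them, and the page will still display "Hello World!" as before. Nothing in the head section appears in the main browser window.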