Chapter 1. HTML

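Web browsers deal with three major languages: HTML, which describes the structure and content of a page; CSS, which describes how the page is presented; and JavaScript, which specifies how the page behaves.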

Developing a modern Web application requires some familiarity with all three of these. We'll start with HTML: It was originally the only language for building Web sites, and it still forms the foundation of the Web.

1.1. Tags and elements

Here is a simple example of an HTML snippet:

<p><em>HTML</em> was defined in 1991 by
<a href="http://www.w3.org/People/Berners-Lee/">Tim Berners-Lee</a>
alone.</p>

Whitespace is ignored except as a separator between words: any run of spaces and line breaks is treated as a single space. So while our original HTML is written on three different lines, it would actually be displayed on a page as:

HTML was defined in 1991 by Tim Berners-Lee alone.
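
To illustrate the whitespace rule further, here is a hypothetical reformatting of the same snippet with extra spaces, indentation, and line breaks added. Assuming a browser's default rendering, it should display exactly the same sentence.

```html
<!-- Same content as the original snippet; only the whitespace differs. -->
<p>
  <em>HTML</em>   was defined
  in 1991 by
  <a href="http://www.w3.org/People/Berners-Lee/">Tim Berners-Lee</a>
  alone.
</p>
```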

You can see that HTML contains pieces enclosed in angle brackets (‘<’ and ‘>’), which are also known as the less-than and greater-than symbols. Each of these angle-bracket-enclosed pieces, called a tag, does not appear to the user, but rather gives information about the page's structure.

In our example, notice the pair of similar tags “<em>” and “</em>”. The first “<em>” indicates the beginning of some text that should be displayed as emphasized (which most browsers render using italics); the second “</em>” indicates the end of the emphasized text. Together, this pair of an opening tag and a closing tag, along with everything in between, is called an element.

You can also see an <a> element, which indicates an anchor that a user might want to click. In this case, the element's opening tag includes some additional information called an attribute. Each element has a set of possible attributes that you can use. In this case, we've defined the href attribute, which says which Web page the browser should load when the user clicks the anchor. Attributes are always listed within the angle brackets following the tag name, separated by whitespace. Almost always, an attribute is written with the attribute name (href here) followed by an equals sign followed by the value of the attribute enclosed in quotation marks.
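
As a small sketch of this syntax, the following anchor carries two attributes, href and title. The title attribute and the address used here are chosen only for illustration; they are not part of the earlier example.

```html
<!-- Two attributes inside the opening tag, separated by whitespace.
     Each is written as name="value". -->
<a href="https://www.w3.org/" title="World Wide Web Consortium">W3C</a>
```

The closing tag never carries attributes; they belong to the opening tag only.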

Overall, our example snippet includes three elements: a larger <p> element (which represents a paragraph), in which can be found an <em> element and an <a> element. HTML requires that elements be properly nested, so that an element that starts inside another also ends inside it: “<p><em>Short paragraph</p></em>” would be illegal, since the <em> element starts inside the <p> element, but its closing tag is outside the <p> element.
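
For comparison, here is a sketch of properly nested markup. The second line is a hypothetical example with three levels of nesting; the file name page.html is an assumption made for illustration.

```html
<!-- Properly nested: <em> opens and closes inside <p>. -->
<p><em>Short paragraph</em></p>

<!-- Deeper nesting is fine as long as inner elements close first. -->
<p>A link with <a href="page.html"><em>emphasized</em> text</a>.</p>
```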

There are a few elements where no closing tag is required or expected, since the element can never have any content nested within it. An example is the <img> element for incorporating an image into a Web page. In this case, you simply omit the closing tag; current HTML does not permit one for such elements.

<p>Our logo (<img src="logo.png">) is orange.</p>

1.2. HTML page structure

Modern versions of HTML require a file containing HTML code to have a specific overall structure: It begins with a document type declaration (<!DOCTYPE html>) identifying the file as HTML, followed by one huge <html> element. The <html> element contains two elements, <head> and <body>. Here is a simple example of a full HTML file. (The indentation is optional; it's just to help you identify the different elements.)

<!DOCTYPE html>
<html lang="en">
  <head>
    <meta charset="utf-8">
    <title>Simple example</title>
  </head>

  <body>
    <p>Here is a simple example of a full HTML page.</p>
  </body>
</html>

The <head> element contains information about the file that shouldn't be displayed within the page itself. In this case, you can see a <meta> element indicating how the file is encoded (an important issue for non-ASCII characters) and a <title> element giving a name to the page (which most browsers will display in the window's title bar or tab).

The <body> element contains the elements that actually should be displayed within the Web page.

1.3. Element classes and identifiers

Soon, we'll need a way to identify particular elements within CSS and JavaScript. HTML provides two such ways: identifiers and classes. These simply provide names for identifying elements within CSS and JavaScript; they do not themselves affect the HTML structure in any other way.

An element's identifier is simply a string that is supposed to give a unique developer-provided name to the element. It is defined in HTML using an id attribute.

<p id="ident-defn">The developer is responsible for
ensuring that identifiers are truly unique within a page.</p>

While each element can have only one identifier, and each identifier can be used only once, multiple elements may belong to a class, and any one element may belong to several classes. The class(es) of an element are defined using the class attribute.

<p>An element may have an <em class="dfn">identifier</em>
and one or more <em class="dfn">classes</em>.</p>

For an element that belongs to multiple classes, you would list the class names separated by spaces within the attribute value.
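
A minimal sketch of this, using hypothetical identifier and class names (class-defn, note, and important are assumptions chosen for illustration):

```html
<!-- One identifier and two classes on the same element.
     The class names are separated by a space within one attribute value. -->
<p id="class-defn" class="note important">This paragraph belongs
to both the note class and the important class.</p>
```

The order of the class names within the attribute value does not matter.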