Building Blocks

What is a Web Page?

Web pages are made up of three principal components:

  • text content - the actual headers and paragraphs that appear on the page.
  • references - more complex content like links, images, and multimedia.
  • markup - instructions that describe how the content and references should be displayed.

Each of these components consists of text.  This means that Web pages can be saved in text-only format and viewed in practically any browser, which helps guarantee the universality of the Web.

Web pages also contain information about the language or script in which the text was written (encoding) and the kind of markup that describes it (doctype).
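
As a minimal sketch of where these two pieces of information usually live, the document below declares a doctype on its first line and an encoding in a meta element. The short HTML5-style doctype, the UTF-8 encoding, and the page text are assumptions chosen for illustration; HTML 4 and XHTML documents use longer doctype declarations.

```html
<!DOCTYPE html>
<!-- The doctype above tells the browser which kind of markup follows -->
<html lang="en">
<head>
  <!-- The encoding declares how the text characters are stored -->
  <meta charset="utf-8">
  <title>Example Page</title>
</head>
<body>
  <h1>A Header</h1>
  <p>A paragraph of text content.</p>
</body>
</html>
```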

How HTML tagging works

HTML tells a browser how to display a web page. Documents containing HTML are plain text files (such as ASCII) with special "tags" that a web browser knows how to interpret and display on your screen. The basic syntax for HTML markup is <tag>the content the tag affects</tag>.
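
For instance, the general pattern might look like this with two common elements. The heading and paragraph text are placeholders, not taken from the original page.

```html
<!-- opening tag, the content it affects, closing tag -->
<h1>My First Heading</h1>
<p>This sentence is marked up as a paragraph.</p>
```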

Elements, Attributes, and Values

HTML is made up of three principal types of markup: elements, attributes, and values.

Elements are labels (tags) that identify and structure the different parts of a Web page.  Some elements have one or more attributes which further describe the purpose and content of the element. Elements can contain content (container tags) or they can be empty (stand-alone tags):

Container Tags

A typical HTML element with content: the opening and closing tags surround the text. With an emphasis element such as em, the enclosed words will be italicized when viewed in the browser window.

Stand-Alone Tags

Empty elements do not surround any content. They consist of a single tag that serves as both the opening and the closing tag. The final space and slash are optional in HTML, but the closing slash is required in XHTML.
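
The two kinds of tags can be sketched as follows. The em element is used as the container example because browsers typically render it in italics; the file name photo.jpg and the alt text are hypothetical.

```html
<!-- Container tag: opening and closing tags surround the content -->
<em>these words are emphasized</em>

<!-- Stand-alone (empty) tags: one tag, no content -->
<br>
<img src="photo.jpg" alt="A sample photo">

<!-- The same stand-alone tags written XHTML-style, with a space and slash -->
<br />
<img src="photo.jpg" alt="A sample photo" />
```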

 

Attribute and Value

Attributes are always located inside an element's opening tag. Their values should always be enclosed in quotation marks. The value might be a number, a word, a string of text, a URL, or a measurement. Separate each attribute pair from the next with a space.

Attribute and Predefined Value

Some attributes only accept specific values. For example, the align attribute of the paragraph element can only be set to left, right, center, or justify. Note that this attribute has been deprecated in the HTML specifications in favor of style sheet alignment control.
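
A short sketch of both cases follows. The file name, dimensions, and URL are hypothetical values picked for illustration; the last line shows one possible style sheet replacement for the deprecated align attribute.

```html
<!-- attribute="value" pairs sit inside the opening tag, separated by spaces -->
<img src="photo.jpg" alt="A sample photo" width="300" height="200">

<!-- a URL as a value -->
<a href="http://www.example.com/">Example link</a>

<!-- align accepts only predefined values; deprecated in favor of CSS -->
<p align="center">Centered the old way.</p>
<p style="text-align: center;">Centered with a style sheet rule.</p>
```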

Parents and Children

If one element contains another, it is considered the parent of the enclosed element (child). This structure becomes important when adding styles to elements or applying JavaScript effects to them. When elements contain other elements, each element must be properly nested.

Nesting

It's OK to use multiple elements, but they must be properly nested. If you open p and then em, you must close em before you close p.
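
The rule can be illustrated with the p and em elements mentioned above; the sentence text is a placeholder. In the first line, p is the parent and em is its child.

```html
<!-- Properly nested: em opens inside p and closes before p closes -->
<p>This is <em>important</em> text.</p>

<!-- Improperly nested: p closes while em is still open -->
<p>This is <em>important text.</p></em>
```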

HTML vs. XHTML

No browser supports every tag or every property of every tag in the current HTML/XHTML standard.  The current effort by the W3C (World Wide Web Consortium), built around XHTML and CSS, is an attempt to encourage standardization among all browser manufacturers.

  • HTML and XHTML use the same elements, attributes, and values.  The difference is in the syntax. Where HTML is forgiving, XHTML is very strict.
  • HTML doesn't care if you use html, head and body elements.  XHTML requires them.
  • HTML lets you omit some closing tags. XHTML requires them for every element, even empty ones.
  • HTML lets you omit quotes around attribute values that contain just letters, numbers and four simple symbols (-, ., _, and :). XHTML has nightmares if you leave out the quotes.
  • HTML is flexible about case.  XHTML demands that all elements, attributes, and predefined values be in lower case.
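
The differences listed above might look like this side by side. This is a fragment meant only to contrast the two syntaxes, with placeholder text; a complete XHTML document would also need the html, head, and body elements.

```html
<!-- Forgiving HTML: uppercase names, unquoted value, omitted closing tags -->
<P ALIGN=center>First paragraph
<P>Second paragraph<BR>with a line break

<!-- Strict XHTML equivalent: lowercase, quoted values, every element closed -->
<p align="center">First paragraph</p>
<p>Second paragraph<br />with a line break</p>
```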

Browser Basics

Special Characters

Some characters have a special meaning in HTML, like the less than sign (<) that defines the start of an HTML tag. If we want the browser to actually display these characters we must insert character entities in the HTML source.

A character entity has three parts: an ampersand (&), then either an entity name or a # followed by an entity number, and finally a semicolon (;). So to display a less-than sign in an HTML document we must write either &lt; or &#60;

The advantage of using a name instead of a number is that a name is easier to remember. The disadvantage is that not all browsers support the newest entity names, while the support for entity numbers is very good in almost all browsers.

Note that entities are case sensitive. 

The most common character entity in HTML is the non-breaking space: &nbsp;
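
A few entities in use, as a rough sketch. The sample sentences are invented for illustration; the comments describe what a browser would be expected to display.

```html
<!-- Named and numeric forms; both paragraphs display as: 5 < 10 -->
<p>5 &lt; 10</p>
<p>5 &#60; 10</p>

<!-- The browser shows the literal text <em> instead of treating it as a tag -->
<p>Use the &lt;em&gt; element for emphasis.</p>

<!-- The ampersand needs its own entity; displays as: Tom & Jerry -->
<p>Tom &amp; Jerry</p>

<!-- Entity names are case sensitive: &Eacute; is É, &eacute; is é -->
<p>&Eacute;cole and caf&eacute;</p>
```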

Special Characters in HTML and the Browser View

Browsers collapse multiple character spaces in an HTML file into a single space, so extra spaces typed in the source do not appear in the browser view. You can add hard spaces using the "nonbreaking space" (&nbsp;) character entity.
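
A small sketch of the difference, using placeholder text:

```html
<!-- The extra spaces in the source collapse to one space in the browser -->
<p>Hello          world</p>

<!-- Each &nbsp; is kept, so the browser shows a wider gap -->
<p>Hello&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;world</p>
```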

 
  Visit (X)HTML Elements and Attributes Reference Site
http://www.cookwood.com/html/extras/xhtml_ref.html

Entities for Characters with Special Meanings
http://www.cookwood.com/html/extras/entities.html#html

W3 Schools Special Character Entity Reference
http://www.w3schools.com/html/html_entitiesref.asp

 

