The Differences between XHTML and HTML
Updated Nov 9, 2006
HTML stands for Hypertext Mark-up Language and XHTML stands for Extensible Hypertext Mark-up Language. Many people will be familiar with HTML, but they may not know about XHTML. The two are fairly similar: both use tags, and most, if not all, of the tags in HTML are available in XHTML. XHTML, however, is based on XML (Extensible Mark-up Language) and as such is much stricter about how it is written.
The Differences
The main differences between HTML and XHTML are described below. There are a few others, but these are the ones most likely to affect most people.
All the differences can be found in the W3C's XHTML specification in the Differences with HTML 4 section.
Closing and Nesting
All tags must be correctly nested - if a tag was started within another tag it must also be ended within that tag.
Correct:
<p>This text is <b>bold</b></p>
Incorrect:
<p>This text is <b>bold</p></b>
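Because XHTML is XML, the nesting rule can be checked with any XML parser. The following is a minimal sketch that assumes Python's standard xml.etree.ElementTree module is available; the helper name is_well_formed is invented for illustration. It only tests whether a fragment is well formed, not whether it is valid XHTML.

```python
import xml.etree.ElementTree as ET


def is_well_formed(fragment: str) -> bool:
    """Return True if the fragment parses as XML (hypothetical helper)."""
    try:
        ET.fromstring(fragment)
        return True
    except ET.ParseError:
        # Raised for mismatched or unclosed tags, among other errors.
        return False


print(is_well_formed("<p>This text is <b>bold</b></p>"))  # True
print(is_well_formed("<p>This text is <b>bold</p></b>"))  # False
```

An HTML browser will usually try to recover from the second fragment, whereas an XML parser rejects it outright.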
All tags must be closed. This applies both to tags that surround text and to 'empty elements' such as <img> and <hr>, which define objects rather than format text: they only have attributes and don't surround anything. An empty element must either be given a closing tag, or its start tag must end with /> rather than just >.
Surrounding Tags:
Correct:
<p>This is some text</p>
Incorrect:
<p>This is some text
Empty Elements:
Correct:
<br></br> or <br/>
Incorrect:
<br>
So that XHTML documents remain backwards compatible with HTML browsers, a space can be placed before the />, for example <br />.
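To an XML parser, the three ways of closing an empty element are equivalent. The sketch below, again assuming Python's standard xml.etree.ElementTree module, parses each form and compares the re-serialised results; the surrounding <p> and the sample text are made up for the example.

```python
import xml.etree.ElementTree as ET

closed_forms = [
    "<p>one<br></br>two</p>",
    "<p>one<br/>two</p>",
    "<p>one<br />two</p>",
]

# Serialise each parsed tree; identical output means the parser
# sees no difference between the three spellings.
serialised = {
    ET.tostring(ET.fromstring(form), encoding="unicode") for form in closed_forms
}
print(len(serialised))  # 1

# The unclosed HTML-style <br> is rejected.
try:
    ET.fromstring("<p>one<br>two</p>")
    unclosed_ok = True
except ET.ParseError:
    unclosed_ok = False
print(unclosed_ok)  # False
```

The choice between the forms therefore only matters for older HTML browsers, not for XML tools.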
Tags and Attributes
All tag and attribute names must be lower case. This is necessary because XML is case sensitive: <b> and <B> are classed as different tags in XML.
Correct:
<a href="http://example.com"></a>
Incorrect:
<A HREF="http://example.com"></A>
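The case sensitivity of XML can be seen by parsing both spellings. This is a sketch using Python's standard xml.etree.ElementTree module. Note that this parser checks well-formedness only, so it accepts the upper-case fragment as XML; it simply treats A as a different element from the a that XHTML defines.

```python
import xml.etree.ElementTree as ET

lower = ET.fromstring('<a href="http://example.com"></a>')
upper = ET.fromstring('<A HREF="http://example.com"></A>')

# The parser keeps names exactly as written, so the two do not match.
print(lower.tag == upper.tag)  # False
print(upper.attrib)  # {'HREF': 'http://example.com'}

# Opening in one case and closing in another is a mismatched tag.
try:
    ET.fromstring("<b>bold</B>")
    mixed_case_ok = True
except ET.ParseError:
    mixed_case_ok = False
print(mixed_case_ok)  # False
```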
All attribute values in tags must be quoted, even numbers.
Correct:
<table border="200">
Incorrect:
<table border=200>
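As a rough illustration of the quoting rule, the sketch below feeds both versions to Python's standard xml.etree.ElementTree parser. A closing </table> tag has been added to each fragment so that quoting is the only difference between them.

```python
import xml.etree.ElementTree as ET

table = ET.fromstring('<table border="200"></table>')
# Attribute values are always read back as strings, even numeric ones.
print(repr(table.get("border")))  # '200'

try:
    ET.fromstring("<table border=200></table>")
    unquoted_ok = True
except ET.ParseError:
    unquoted_ok = False
print(unquoted_ok)  # False
```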
Attribute minimization is not supported by XML. The attribute and its value (that is, attribute="value") must be written in full, so attribute names such as checked for form elements cannot appear on their own in tags.
Correct:
<input checked="checked" />
Incorrect:
<input checked>
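The contrast between the two languages shows up when the same minimized attribute is given to an HTML parser and an XML parser. The sketch below assumes Python's standard html.parser and xml.etree.ElementTree modules; the AttrCollector class is a hypothetical helper written for this example.

```python
import xml.etree.ElementTree as ET
from html.parser import HTMLParser


class AttrCollector(HTMLParser):
    """Hypothetical helper that records attributes as the HTML parser reports them."""

    def __init__(self):
        super().__init__()
        self.attrs_seen = []

    def handle_starttag(self, tag, attrs):
        self.attrs_seen.extend(attrs)


# The HTML parser accepts the minimized form and reports no value.
collector = AttrCollector()
collector.feed("<input checked>")
print(collector.attrs_seen)  # [('checked', None)]

# The XML parser needs the attribute written in full.
full = ET.fromstring('<input checked="checked" />')
print(full.attrib)  # {'checked': 'checked'}

# Even with the tag closed, a bare attribute name is not well formed.
try:
    ET.fromstring("<input checked />")
    minimized_ok = True
except ET.ParseError:
    minimized_ok = False
print(minimized_ok)  # False
```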
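In HTML, the name attribute was defined for certain tags (a, applet, form, frame, iframe, img and map), alongside the id attribute. In XML, fragment identifiers are of type ID and an element can have only one attribute of that type, so XHTML documents must use the id attribute to identify elements.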
So that XHTML documents are backwards compatible with HTML browsers, both id and name attributes can be used, with the same value in each.
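One way to look for elements that still rely on name alone is sketched below, using Python's standard xml.etree.ElementTree module. The function name, the sample fragment and the choice to require matching values are assumptions made for this example; the list of elements is the one for which the XHTML 1.0 specification deprecates name.

```python
import xml.etree.ElementTree as ET

# Elements for which XHTML 1.0 deprecates name in favour of id.
NAME_DEPRECATED = {"a", "applet", "form", "frame", "iframe", "img", "map"}


def name_without_matching_id(fragment: str) -> list[str]:
    """Return tags that carry a name but no id with the same value (hypothetical helper)."""
    root = ET.fromstring(fragment)
    return [
        el.tag
        for el in root.iter()
        if el.tag in NAME_DEPRECATED
        and "name" in el.attrib
        and el.get("id") != el.get("name")
    ]


# Made-up fragment: one anchor with both attributes, one with name only,
# and a form control, where name is still the normal attribute to use.
fragment = (
    "<div>"
    '<a id="top" name="top"></a>'
    '<a name="contents"></a>'
    '<input name="email" />'
    "</div>"
)
print(name_without_matching_id(fragment))  # ['a']
```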