Revision 36825 of Fixing common validation problems

  • Revision slug: Fixing_common_validation_problems
  • Revision title: Fixing common validation problems
  • Revision id: 36825
  • Created:
  • Creator: CitizenK
  • Is current revision? No
  • Comment /* Script Trouble */

Revision Content

Summary: Document validation is the single most important tool an author has available. Validating not only makes sure your documents are well-formed, but makes them more robust and ready for the future. Get the details on common validation errors and how to fix your documents so you can avoid them.

Like a nation or a house, a page divided against itself cannot stand-- not in standards-compliant browsers, anyway. Every page has a structure, and it turns out that if you aren't careful with your construction methods, the structure will be weakened, flawed, and potentially dangerous. If you've ever loaded up a page in Opera or Netscape 6 or Internet Explorer and had it look totally mangled, odds are that you've inadvertently built a shaky structure.

Imagine building a house on a foundation of sand, or with rubber support beams. Most people wouldn't even bother, and anyone who did shouldn't be surprised by huge cracks in the walls, wildly uneven flooring, or even total collapse of the structure. Yet many authors are shocked to discover that their pages fall apart in recent browsers. The usual reaction is that "the page was fine before!" which is exactly like saying "my rubber-column house didn't collapse on the same day it was built!" Perhaps not, but it was always in danger of falling over.

So how does one ensure a good, solid Web house? Well-structured markup. A clean document structure is absolutely essential to ensuring that your pages will behave in browsers both present and future. Fortunately, fixing up a page's structure after it's been built is a lot easier and less expensive than trying to correct structural flaws in a house! In fact, there are HTML validators out there that can help you identify the problems and quickly correct them. We highly recommend the World Wide Web Consortium's HTML Validator (http://validator.w3.org/)-- not only because it's provided by the same people who are responsible for the HTML and XHTML specifications, but also because most of its error messages provide a link to an explanation of what the error means. Eventually, of course, you'll recognize what each error message means without having to look up the explanation, but when you're starting out these help files are invaluable.

Your goal is simple: to bring your page to a state where it doesn't generate any errors at all. For bonus points, you could try to eliminate any warnings as well, but the important thing is to avoid having errors. There are, practically speaking, two general kinds of errors:

  • Errors about elements, which are the most serious and can really mangle a page if left uncorrected. For example, an error like "element 'TD' not allowed here" implies that you either have a TD outside of a table element, or else the validator thinks you do. Either way it's a major problem, and finding out why should be a top priority. An element error is equivalent to a contractor telling you that he left some critical support beams out of your house.
  • Errors about attributes, which are less serious since most browsers will ignore any attribute they don't understand. This is not to say that attribute errors can be ignored, but they are generally less of a concern than element errors.
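
To make the two kinds concrete, here is a small hypothetical fragment (the content and attribute values are our own invention, not taken from any real page). Checked against an HTML 4 Strict DTD, we would expect it to produce one error of each kind: a TD that is not inside any table, and a BODY carrying the non-standard marginwidth attribute.

```html
<body marginwidth="0">  <!-- attribute error: marginwidth is not defined for BODY in HTML 4 -->
  <td>Stray cell</td>   <!-- element error: TD may only appear inside a table row -->
  <p>Ordinary paragraph.</p>
</body>
```

Most browsers will quietly ignore the unknown attribute, but how they recover from the stray TD is much harder to predict.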

As you fix your markup to remove one error, you may find that you generate more-- or that suddenly several other errors go away. For example, if you add a missing end-table tag (</table>) to a document, you might fix every "element not allowed here" error that followed. In any case, the goal of every author should be to have no errors at all of either kind.
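
As a sketch of how one fix can clear several errors, consider this hypothetical fragment in which the end-table tag has been left out. Because the table is never closed, a validator is likely to treat the heading and paragraph after it as misplaced table content and flag each of them.

```html
<table summary="Hypothetical price list">
  <tr><td>Widget</td><td>4.00</td></tr>
<!-- missing </table> here -->
<h2>Ordering</h2>
<p>Prices are in US dollars.</p>
```

Adding </table> on the marked line closes the table, and the heading and paragraph become ordinary body content again.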

DOCTYPE and Validation

When you validate your document, you have the option to pick which Document Type Definition (DTD) you want to use as the standard. There are many options available, from HTML 2.0 up to the most recent standard available (it was XHTML 1.1 when we wrote this article). If you want your pages to work in today's browsers, then the best choice is a recent DTD. Given the generally backwards-compatible nature of HTML and XHTML, validating against a recent DTD should mean you'll be all right if older browsers drop by.

Rather than picking a DTD from the provided list, you can also place a DOCTYPE declaration at the top of your document, thus marking it as using a specific DTD. Let's say you wanted to use the HTML 4.0 Strict DTD. In that case, the very first line of your document (even before the <html> tag) should be:

<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.0//EN"
"http://www.w3.org/TR/REC-html40/strict.dtd">

Once you've added this declaration to the top of your document, then you can use the Document Type option "(specified inline)" and the W3C validator will use the DTD you've declared in your document to validate the markup.
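
For reference, a minimal document built around that declaration might look like the sketch below. The title and paragraph text are placeholders of our own; the point is only to show where the DOCTYPE sits relative to the rest of the markup.

```html
<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.0//EN"
"http://www.w3.org/TR/REC-html40/strict.dtd">
<html>
<head>
<title>Hypothetical test page</title>
</head>
<body>
<p>Placeholder content to validate.</p>
</body>
</html>
```

A skeleton like this is a handy starting point: if it validates cleanly, any errors that appear later come from the markup you add to it.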

It is also the case that recent browsers (Netscape 6, Explorer 5 for Macintosh, Explorer 6 for Windows) will make use of the DOCTYPE declaration to determine the "rendering mode" you want the browser to use when displaying your document. Generally speaking, any "transitional" or "loose" DTD, or even a lack of a DOCTYPE, will cause the browsers to use a rendering mode that emulates legacy browser behavior. "Strict" DTDs, on the other hand, will switch browsers into a standards-compliant rendering mode. This is an easy way for authors to decide how they want browsers to handle their markup. The Apple Developer Connection has an article called "DOCTYPE Explained" (http://www.oreillynet.com/pub/a/javascript/synd/2001/08/28/doctype.html) that covers this territory in more detail; note that Internet Explorer 6 for Windows also supports the "DOCTYPE switching" described in the article.

Common Problems

There are a few errors that authors will likely see many, many times as they validate pages. There are also a few things that a validator might not catch (software is only as perfect as the humans who write it). Here are a few of the most common errors and pitfalls to avoid.

Forgetting Important Attributes

If you get an attribute-related error, it's very likely going to tell you that you forgot to include a required attribute. These include:

  • the type attribute for the elements script and style
  • the alt attribute for the elements img and area
  • the summary attribute for the element table (recommended for accessibility, though the HTML 4 DTDs do not strictly require it)

The latter two attributes are important for accessibility reasons, as their inclusion assists users who are using text-only or audio browsers. The first attribute we mention, type, is critical for forward compatibility. As an example, many browsers (including Netscape 6) will ignore any STYLE element that has no type attribute, which has the usually unwanted effect of disabling the entire stylesheet.
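
The fragment below sketches each of these attributes in use. The file names, coordinates, and text are hypothetical; only the attribute names come from the list above.

```html
<!-- In the document head: type names the style language being used -->
<style type="text/css">
  h1 { color: navy; }
</style>

<!-- In the document body: alt gives a text equivalent for the image and for each image-map area -->
<p>
  <img src="logo.png" alt="Example Corp logo" usemap="#nav">
  <map name="nav">
    <area shape="rect" coords="0,0,50,20" href="home.html" alt="Home">
  </map>
</p>

<!-- summary describes the table's purpose for non-visual browsers -->
<table summary="Monthly visitor counts, one row per month">
  <tr><th>Month</th><th>Visitors</th></tr>
  <tr><td>January</td><td>1200</td></tr>
</table>
```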

A related situation is that the Strict DTDs for HTML and XHTML do not permit the language attribute, so type is the only way to mark what kind of script is being used. Thus, if you have a script that starts like this:

<script language="Javascript">

...then the validator is quite likely to throw an error. You can fix this by modifying the element to read:

<script type="text/javascript">

Script Trouble

Besides the potential problems centered around the language attribute, there are a few other ways in which scripts can cause you trouble when validating your HTML.

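If your script contains any HTML tags inside string values, make sure to escape the forward slash in each end tag. For example, you need to write var docEle = "<html><\/html>" (note the backslash before the slash) in order to prevent validation problems: in HTML 4, the characters </ followed by a letter end the SCRIPT element's content, so an unescaped end tag inside a string can be read as the end of the script. This is a good practice in any case.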
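
Here is a slightly fuller sketch of the same idea, using a hypothetical script that writes a paragraph into the page. The variable name and the text are our own; the only part that matters is the backslash in the end tag.

```html
<script type="text/javascript">
// The backslash keeps the parser from reading the end-tag sequence
// in the string as the close of the script element.
var closing = "<\/p>";
document.write("<p>Generated by script" + closing);
</script>
```

The JavaScript engine treats \/ as a plain slash, so the string that reaches document.write still ends with a normal paragraph end tag.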

You should also enclose the contents of your SCRIPT element in an HTML comment. This is often done for both scripts and STYLE elements, so you may not encounter this problem. The usual way this is done looks something like this:

<script type="text/javascript"><!--
   (...script goes here...)
//--></script>

Note the JavaScript single-line comment (//) in the final line, which is needed to make sure that the browser's JavaScript engine ignores the string -->.
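
The same wrapper is commonly applied to STYLE elements. The sketch below uses a placeholder rule of our own; no // is needed on the last line, because CSS parsers are defined to skip the comment delimiters themselves.

```html
<style type="text/css"><!--
  body { color: black; background: white; }
--></style>
```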

Improper Nesting of Elements

Original Document Information

  • Author(s): Eric A. Meyer, Netscape Communications
  • Last Updated Date: Published 05 Mar 2001
  • Copyright Information: Copyright © 2001-2003 Netscape. All rights reserved.
  • Note: This reprinted article was originally part of the DevEdge site.
