Revision 36857 of Fixing common validation problems

  • Revision slug: Fixing_common_validation_problems
  • Revision title: Fixing common validation problems
  • Revision id: 36857
  • Created:
  • Creator: Dikrib
  • Is current revision? No
  • Comment DOCTYPE updated to HTML5; 291 words added, 543 words removed

Revision Content

If you are not careful, you risk building your web pages on assumptions about browser behavior that were never intentional, never documented, and are unlikely to hold in the future. The behaviors behind these assumptions are called quirks, and you should avoid relying on them in your web pages. Quirks may cause web pages to behave differently in different browsers, and even in different versions of the same browser.

One way to make web pages that work across multiple browsers is to use an HTML validator, which can help you identify problems and correct them quickly. A commonly used validator is the World Wide Web Consortium's HTML Validator. It's provided by the same people who are responsible for the HTML specification and, more importantly, most of its error messages link to an explanation of what the error means. Eventually, of course, you'll recognize what each error message means without having to look up the explanation, but when you're starting out these help files are invaluable.

Your goal is simple: to bring your page to a state where it doesn't generate any errors at all. For bonus points, you could try to eliminate any warnings as well.

As you fix your markup to remove one error, you may find that you generate more, or that several other errors suddenly go away. For example, if you add a missing end-table tag (</table>) to a document, you might fix every "element not allowed here" error that followed. In any case, the goal of every author should be to have no errors at all.

DOCTYPE and Validation

Validators typically use a DOCTYPE to choose which standard to validate against. In browsers, a DOCTYPE is only used to switch between quirks mode and standards mode. Before you validate your document, it is important to have a correct DOCTYPE, for two main reasons:

  • If browsers use quirks mode, validating your document will not help you make it behave the same in different browsers. Browsers deliberately act differently in this mode due to backwards compatibility.
  • If you validate your code against a different standard than the one used by browsers, validation will just lead you in the wrong direction.

The DOCTYPE must be placed at the top of your HTML document and should look like this:

<!DOCTYPE html>

This is the HTML5 doctype. There are several good reasons to validate against HTML5 rather than the older HTML4 or XHTML standards, even if your HTML document does not use any of the new features introduced in HTML5.

  • HTML5 is the standard modern browsers are trying to implement. Validating against HTML4 may lead you in the wrong direction, as it may give you advice that is no longer correct in HTML5 and has never been correct in browser implementations.
  • Validating against XHTML will most likely lead you in the wrong direction, since your document is most likely not interpreted as XHTML.
  • Validation is not the best method to ensure backwards compatibility with older HTML4 browsers, as those browsers are typically less standards compliant.
  • Great care has been taken to make HTML5 backwards compatible with older browser implementations.

Once you've added this declaration to the top of your document, you can use the Document Type option "(specified inline)" and the W3C validator will validate the markup against the document type you've declared.
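
As a minimal sketch, a complete document that starts with this DOCTYPE might look like the following. The language code, title, and paragraph text are placeholders chosen for illustration, not part of the original article.

```html
<!DOCTYPE html>
<html lang="en">
<head>
  <meta charset="utf-8">
  <!-- A title element is required in every HTML document -->
  <title>Example page</title>
</head>
<body>
  <p>Hello, validator!</p>
</body>
</html>
```

Nothing, not even a comment, should come before the DOCTYPE if you want to be sure that every browser stays in standards mode.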

Common Problems

There are a few errors that authors will likely see many, many times as they validate pages. There are also a few things that a validator might not catch (software is generally only as perfect as the humans who write it). Here are a few of the most common errors and pitfalls to avoid.

Forgetting Important Attributes

If you get an attribute-related error, it's very likely going to tell you that you forgot to include a required attribute. These include:

  • the type attribute for the elements script and style
  • the alt attribute for the elements img and area
  • the summary attribute for the element table

The latter two attributes are important for accessibility reasons, as their inclusion assists users who are using text-only or audio browsers. The first attribute we mention, type, is critical for forward compatibility. As an example, many browsers (including Netscape 6) will ignore any STYLE element that has no type attribute, which has the usually unwanted effect of disabling the entire stylesheet. Note that HTML5 makes type optional on both elements and treats summary as obsolete, so this advice applies chiefly to documents validated against HTML 4.01 or XHTML 1.0.
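
The fragment below sketches the three attributes in use, assuming a document validated against HTML 4.01. The file name, alternative text, and table contents are made up for illustration.

```html
<!-- type names the style sheet language -->
<style type="text/css">
  p { color: navy; }
</style>

<!-- alt gives text-only and audio browsers something to present -->
<img src="logo.png" alt="Example Company logo">

<!-- summary describes the purpose of the table for non-visual users -->
<table summary="Monthly sales totals by region">
  <tr><th>Region</th><th>Total</th></tr>
  <tr><td>North</td><td>120</td></tr>
</table>
```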

A related situation is that the strict DOCTYPE for HTML and XHTML does not permit the attribute language, so type is the only way to mark what kind of script is being used. Thus, if you have a script that starts like this:

<script language="Javascript">

...then the validator is quite likely to throw an error. You can fix this by modifying the element to read:

<script type="text/javascript">

Script Trouble

Besides the potential problems centered around the language attribute, there are a few other ways in which scripts can cause you trouble when validating your HTML.

If your script contains any HTML tags inside string values, then make sure to escape the forward-slash symbol. For example, you need to write var docEle = "<html><\/html>" (note the backslash before the forward slash) in order to prevent validation problems. This is a good practice in any case.

You should also enclose the contents of your SCRIPT element in an HTML comment. This is often done for both scripts and STYLE elements, so you may not encounter this problem. The usual way this is done looks something like this:

<script type="text/javascript"><!--
   (...script goes here...)
//--></script>

Note the JavaScript single-line comment (//) in the final line, which is needed to make sure that the browser's JavaScript engine ignores the string -->.
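
Putting the two script techniques together, a small inline script might be written as in the sketch below. The variable name comes from the earlier example; the document.write call and its text are only illustrative.

```html
<script type="text/javascript"><!--
  // The backslash keeps the sequence "</" out of the script source
  var docEle = "<html><\/html>";
  document.write("<p>Generated paragraph<\/p>");
//--></script>
```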

Improper Nesting of Elements

Over the years, authors have developed a number of tricks that get the effects they want with a minimum of typing, and which avoid certain display effects. Unfortunately, most of these are based on wholly invalid markup and will cause a validator to choke. They'll also lead to display and functionality problems in standards-compliant browsers like Firefox 2 and Internet Explorer 6+ (in "strict" mode), so they need to be fixed anyway.

One very common example is wrapping a FONT element around one or more paragraphs, tables, or other block-level elements. As it happens, FONT is an inline element, and therefore cannot contain block-level elements. So the following markup is structurally incorrect:

<font color="red">
<p>Hey, paragraphs can't be inside font elements!</p>
</font>

It's exactly the same if you wrap a FONT element around a table. If you must color all of the text in your table, and you feel you must use FONT to do it, then you'll have to put the font elements inside each cell of the table. Of course, CSS makes this a lot easier:

<table style="color: red;">
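
The same approach repairs the paragraph example. The sketch below shows two possible corrections; the second assumes a transitional doctype, since FONT is not allowed in strict documents.

```html
<!-- Preferred: style the paragraph itself instead of wrapping it -->
<p style="color: red;">This paragraph is red, and the markup nests correctly.</p>

<!-- If FONT must be used, it goes inside the block-level element -->
<p><font color="red">FONT is inline, so it belongs inside the paragraph.</font></p>
```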

On a related note, some authors like to avoid the "white space" that the FORM element introduces inside table cells by doing something like this:

<table>
<form action="script.cgi" method="get">
<tr><td>(...form widgets here...)</td></tr>
</form>
</table>

That will trigger an error because you can't put FORM inside a table but outside a table cell. You could wrap the form element around the entire table, or put the form into the table cell and use CSS to set its margins to zero, but in the latter case the entire form would have to be placed within that single table cell. If you're using a table to lay out your form, then you need to wrap the form around the whole table, or around an entire section of the document if that's feasible.
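
Both corrections can be sketched as follows, reusing the placeholder action and widgets from the invalid example. The inline margin style is one possible way to remove the extra space.

```html
<!-- Option 1: the form wraps the whole table -->
<form action="script.cgi" method="get">
  <table>
    <tr><td>(...form widgets here...)</td></tr>
  </table>
</form>

<!-- Option 2: the whole form sits inside a single cell, with its margins removed -->
<table>
  <tr>
    <td>
      <form action="script.cgi" method="get" style="margin: 0;">
        (...form widgets here...)
      </form>
    </td>
  </tr>
</table>
```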

Inconsistent Case in Class and ID Values

Despite the fact that HTML has been historically case-insensitive, values in modern HTML and XHTML (as well as XML) are quite case-sensitive. This includes the names of class and ID identifiers. Thus, ExternalLink is not the same as externalLINK or even externallink. Standards-compliant browsers such as Netscape 6 enforce the case sensitivity of class and ID names. However, the HTML validator does not check case in values against other instances of the same values, either in the document or in any associated scripts or stylesheets, and so will not catch any inconsistencies that might lead to trouble in page display. For more information on this point, please see the Tech Note "Case Sensitivity in class and id Names."
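
The sketch below illustrates the kind of mismatch a validator will not report. The page validates, yet in standards mode only the first link picks up the style; the class names follow the example above, and the rest of the page is invented for illustration.

```html
<!DOCTYPE html>
<html lang="en">
<head>
  <meta charset="utf-8">
  <title>Case-sensitive class names</title>
  <style type="text/css">
    /* Matches only elements whose class is spelled exactly "ExternalLink" */
    .ExternalLink { color: green; }
  </style>
</head>
<body>
  <p><a class="ExternalLink" href="http://validator.w3.org/">Styled: the case matches the selector</a></p>
  <p><a class="externallink" href="http://validator.w3.org/">Not styled: the case differs</a></p>
</body>
</html>
```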

Improper Comments

Although it may seem picky, it is important to be sure that you format your HTML comments correctly. The correct form of an HTML comment is:

<!-- comment -->

That's two dashes at either end, not three as some authors like to include. In general, you should avoid any sequence of dashes within a comment, and stick to the allowed pair of dashes to help mark the beginning and end of the comment. (See HTML 4.01, section 3.2.4 for more information.)
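
A few comments that follow these rules are sketched below. Forms to avoid include three-dash delimiters and decorative runs of dashes inside the comment text.

```html
<!-- A correct comment: two dashes open it and two dashes close it -->

<!-- Separators made of other characters are safe: ======== -->

<!--
  Multi-line comments follow the same rule,
  as long as no run of dashes appears in the text.
-->
```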

Ampersands

Because the ampersand character (&) is reserved for marking character entities, authors should never use raw ampersands in their HTML source, and that includes ampersands inside URLs! Thus, any URL that needs an ampersand should be written like this:

http://www.site.web/path/doc.html?var1=val1&amp;var2=val2&amp;var3=val3

Each instance of &amp; will be translated by a Web browser into an ampersand, without triggering validation warnings.
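
In context, the escaped URL from above might appear in a link as in the following sketch. The link text and the second paragraph are illustrative only.

```html
<!-- The browser requests ...doc.html?var1=val1&var2=val2&var3=val3 -->
<a href="http://www.site.web/path/doc.html?var1=val1&amp;var2=val2&amp;var3=val3">View document</a>

<!-- The same rule applies to ampersands in ordinary text -->
<p>Fish &amp; chips</p>
```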

Attribute Value Presence and Quotation

If you're validating against an XHTML DOCTYPE, then all of your attributes must have values, and all of these values must be enclosed in quotation marks. You must also close every element you open, so in those cases where there is no close tag, the end of the element should include a forward-slash. These are requirements of XHTML (and of XML-based languages in general), and so the validator will flag any instance where you do not follow these rules. One example of valid XHTML markup that will differ noticeably from historical HTML:

<input type="checkbox" checked="checked" name="prefSys" value="MacOS" />

Note the addition of a (quoted) value to checked and the slash at the end of the tag. Without these additions, this markup fragment would not be valid XHTML.
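
A few more fragments written to the same XHTML rules are sketched below. The file name, option values, and field names are placeholders.

```html
<!-- Elements with no close tag end with a slash -->
<br />
<img src="photo.jpg" alt="A sample photo" />

<!-- Attributes that HTML allowed without a value take their own name as the value -->
<select name="prefSys">
  <option value="MacOS" selected="selected">MacOS</option>
  <option value="Linux">Linux</option>
</select>
<input type="text" name="userName" readonly="readonly" />
```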

Conclusion

Although it may seem like more work at first, validating your markup now will pay off handsomely in saved time and effort later. Not only will your documents stand a much better chance of being properly displayed in all current and future browsers, but it will be much easier to maintain your documents, or even to convert them from HTML to another markup language such as XML.

Although the ideal goal is to have pages that generate no validation errors and no warnings, your primary concern should be the elimination of actual errors. Similarly, you should be more concerned about element errors than about attribute errors, although you really can't afford to ignore either kind. Once you've cleaned things up so that you no longer get errors, then you can turn to the task of styling the document and feel confident that the page will display in just about any known browser, as well as any decent browser to come.

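Also On MDC

  • Case Sensitivity in class and id Names

Related Links

  • The W3C's HTML Validator (http://validator.w3.org/)
  • W3C DTD List (http://www.w3.org/QA/2002/04/valid-dtd-list.html)
  • HTML 4.01, section 3.2.4 (http://www.w3.org/TR/html401/intro/sgmltut.html#h-3.2.4)
  • Mozilla's Quirks Mode
  • DOCTYPE Explained (http://www.oreillynet.com/pub/a/javascript/synd/2001/08/28/doctype.html)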

Original Document Information

  • Author(s): Eric A. Meyer, Netscape Communications
  • Last Updated Date: Published 05 Mar 2001
  • Copyright Information: Copyright © 2001-2003 Netscape. All rights reserved.
  • Note: This reprinted article was originally part of the DevEdge site.

