Writing JavaScript for XHTML

  • Revision slug: Writing_JavaScript_for_XHTML
  • Revision title: Writing JavaScript for XHTML
  • Revision id: 69756
  • Created:
  • Creator: Neil
  • Is current revision? No
  • Comment HTML elements now have XHTML namespace in Gecko 2.; 25 words added, 41 words removed

Revision Content

Website authors have been writing XHTML files instead of HTML 4.01 for about nine years, and some are starting to write XHTML5. But alas, few XHTML files viewed over the web are served with the correct MIME type, that is, with application/xhtml+xml! One reason is a certain browser that is not capable of handling XHTML as XML. Another is the experience that JavaScript, authored carefully for HTML, suddenly breaks in an XML environment.

This article shows some of the reasons, along with strategies to remedy the problems. It aims to encourage web authors to use more XML features and to make their JavaScript interoperable with real XHTML applications.

To test the following examples locally, use Firefox's extension switch. Just write an ordinary (X)HTML file and save it once as test.html and once as test.xhtml.

Problem: Nothing Works

After switching the MIME type, suddenly no inline script works anymore. Even the plain old alert() call does nothing. The code looks something like this:

<script type="text/javascript">
//<!--
window.alert("Hello World!");
//-->
</script>

Solution: The CDATA Trick

This problem usually arises when inline scripts are wrapped in comments. This was common practice in HTML, to hide the scripts from browsers not capable of JS. In the age of XML, comments are what they were intended to be: comments. Before processing the file, all comments will be stripped from the document, so enclosing your script in them is like throwing your lunch into a piranha pool. Moreover, there's really no point in commenting out your scripts: no browser written in the last ten years will display your code on the page.

The easy solution is to do away with the commenting entirely:

<script type="text/javascript">
window.alert("Hello World!");
</script>

This will work so long as your code doesn't contain characters which are "special" in XML, which usually means < and &. If your code contains either of these, you can work around this with CDATA sections:

<script type="text/javascript">
<![CDATA[
// is the variable a non-negative integer less than 10?
if (variable < 10 && variable >= 0)
action();
]]>
</script>

Note that the CDATA section is only necessary because of the < and & characters in the code; otherwise you could have left it out.
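If inline script text is generated by a tool or template, the decision above can be automated. The following is a minimal sketch, not part of any library: the function name wrapInlineScript is made up for illustration, and it assumes the result is placed directly inside a script element of a document served as XML.

```javascript
// Hypothetical helper: wraps inline script source in a CDATA section
// only when it contains characters that are special in XML (< or &).
function wrapInlineScript(code) {
  // "]]>" is not allowed in XML character data and would also end a
  // CDATA section early, so refuse such code instead of guessing.
  if (code.indexOf("]]>") !== -1) {
    throw new Error("Script text contains ]]> and cannot be embedded as is");
  }
  if (/[<&]/.test(code)) {
    return "<![CDATA[\n" + code + "\n]]>";
  }
  // Nothing special in the code: no CDATA section needed
  return code;
}
```

Code without < or & is returned unchanged, so the simple form shown earlier stays simple.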

A third solution is to use only external scripts, neatly sidestepping the special-character problem.

Alternatively, the CDATA section can be couched within comments so as to be able to work in either application/xhtml+xml or text/html:

<script type="text/javascript">
//<![CDATA[
        ...
//]]>
</script>

<!-- (For styles, it is different) -->

<style type="text/css">
/*<![CDATA[*/
        ...
/*]]>*/
</style>

And if you really need compatibility with very old browsers that do not recognize the script or style tags and would therefore display their contents on the page, you can use this:

<script type="text/javascript"><!--//--><![CDATA[//><!--
        ...
//--><!]]></script>

<!-- (For styles, it is different) -->

<style type="text/css"><!--/*--><![CDATA[/*><!--*/
        ...
/*]]>*/--></style>

See this document for more on the issues related to application/xhtml+xml and text/html (at least as far as XHTML 1.* and HTML 4; HTML5 addresses many of these problems).

Problem: The DOM Changed

The central object in the DOM, the document object, is of type HTMLDocument in HTML, whereas it is an XMLDocument in XML files. This has an especially large impact on methods JavaScript authors use in their daily work. Take the document.getElementsByTagName method, for example. This is a DOM 1 method, which means it does not respect XML namespaces. Take a look at this common snippet:

var headings = document.getElementsByTagName("h1");
for (var i = 0; i < headings.length; i++) {
  doSomethingWith( headings[i] );
}

Enter the problem: in XHTML served as XML, all elements are in the XHTML namespace (remember the xmlns attribute in the html tag?). This means our plain old DOM 1 method suddenly finds no elements anymore. Bang! Immediately 80% of today's JavaScripts on the web crash, including our snippet above.

Solution: Use DOM 2 Methods

The W3C introduced DOM 2 to address the need to distinguish namespaces. Perhaps you have seen a method like document.getElementsByTagNameNS before? The difference is the NS part, meaning that it takes namespaces into account. How do we use this method? It is straightforward:

var headings = document.getElementsByTagNameNS("http://www.w3.org/1999/xhtml","h1");
for (var i = 0; i < headings.length; i++) {
  doSomethingWith( headings[i] );
}

The only difference is that we name the namespace the element is in. Okay, that is more letters to type, but you can define shorthands. So let's use only DOM 2 methods from now on!
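One possible shorthand is sketched here. The names XHTML_NS and byTag are illustrative assumptions, not part of the DOM; the optional second argument lets you search below a particular element instead of the whole document.

```javascript
// Assumed names for illustration only: XHTML_NS and byTag.
var XHTML_NS = "http://www.w3.org/1999/xhtml";

function byTag(name, root) {
  // root may be a document or an element; it defaults to the global document
  return (root || document).getElementsByTagNameNS(XHTML_NS, name);
}
```

With this in place, the loop above could start with var headings = byTag("h1");.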

The following section only applies to previous versions of Gecko. In Gecko 2.0 all HTML elements have the XHTML namespace.

But wait! Now, taking a look at our HTML file, the script breaks again! Remember, in HTML the elements are in no namespace at all! So what we have to do now is write a wrapper that determines whether we are dealing with an HTML or an XML file. Check out this piece of code:

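function getHTMLByTagName(tagName) {
  // A null namespaceURI on the root element means plain HTML: use the DOM 1 method
  if (document.documentElement.namespaceURI === null) {
    return document.getElementsByTagName(tagName);
  } else {
    // Otherwise the elements are in the XHTML namespace
    return document.getElementsByTagNameNS("http://www.w3.org/1999/xhtml", tagName);
  }
}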

Rather than using the non-standard document.contentType (and/or document.mimeType for Explorer), we check for the namespaceURI in order to detect whether we can use the namespace-aware getElementsByTagNameNS. We cannot use it in HTML 4 or XHTML 1.0 served as text/html (XHTML 1.1 should not be served as text/html), and the namespaceURI check will correctly return null. On the other hand, true XHTML (served as application/xhtml+xml) and HTML5 (which defines HTML elements as supporting namespaces as far as the DOM for greater portability with XHTML) indicate a namespaceURI (the XHTML one) and can both support getElementsByTagNameNS.

NB: The DOM 1 method getElementsByTagName also exists in XML documents. It will find every element of a given name that is in no namespace at all. For this reason, AJAX's responseXML is often processed with DOM 1 methods without any problems, because very little XML sent via XMLHttpRequest bothers with namespaces.

NB: Also note that detection of support for the method getElementsByTagNameNS is not sufficient (as some believe browser support automatically means XHTML will be used), since detection will also return true in a text/html environment.

For creating an element in a similarly agnostic fashion (e.g., within a library), a similar process can be used. If there is browser support, and if the document is in a namespace, we know we can use the XML/XHTML methods. If the root tag is not html, we also know we can use them, since this cannot be HTML 4 or earlier (if it were HTML5, there would be a namespaceURI). Lastly, in order to catch all possible cases of non-HTML, non-XHTML XML, we can detect the content type (at least in Mozilla and IE) using non-standard properties. Why is this necessary? Because XML allows root tags with any name, even "html", without the document actually being XHTML (it becomes XHTML solely when it is served with the specific content type). If the content type is not text/html (e.g., application/xml), then we can assume that the document we are in is XML (but with the root node in the null namespace), and we can use the namespaced method. It is highly unlikely, and certainly poor practice, for a document to use a name like "html" in plain, non-XHTML XML, but we cover the case to illustrate the possibility.

function createElementNSIfPossible (ns, name, doc) {
  var elem, d = doc || document;
  if (d.createElementNS &&  // Browser supports the method
      (d.documentElement.nodeName.toLowerCase() != 'html' || // We know it's not HTML4 or less, if the tag is not HTML
       (d.contentType && d.contentType != 'text/html') || // We know it's not regular HTML4 or less if this is Mozilla and the content type is something other than text/html
       (d.mimeType && d.mimeType != 'text/html') // May be possible for IE to be similarly comprehensive now that IE9 supports XHTML
       )
      ) { // Don't create namespaced elements if we're being served as HTML (currently only Mozilla supports this detection in true XHTML-supporting browsers, but Safari and Opera should work with the above DOMParser anyways, and IE doesn't support createElementNS anyways); last test is for the sake of being in a pure XML document
      elem = d.createElementNS(ns, name);
  }
  else {
    elem = d.createElement(name);
  }
  return elem;
}

Problem: Names in XHTML and HTML are represented in different cases

Scripts that call getElementsByTagName() with an upper case HTML name no longer work, and properties like nodeName or tagName return upper case in HTML and lower case in XHTML.

Solution: Use or convert to lower case

For methods like getElementsByTagName(), passing the name in lower case will work in both HTML and XHTML. For name comparisons, convert to lower case before doing the comparison (e.g., "el.nodeName.toLowerCase() === 'html'"). This ensures that names in HTML documents compare correctly and does no harm in XHTML, where the names are already lower case.
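The comparison can be wrapped in a small helper so that the lower-casing is never forgotten. This is only a sketch; the function name hasNodeName is an assumption made for this example.

```javascript
// Hypothetical helper: case-insensitive comparison of a node's name.
// Works for HTML (upper case names) and XHTML (lower case names) alike.
function hasNodeName(node, name) {
  return node.nodeName.toLowerCase() === name.toLowerCase();
}
```

For example, hasNodeName(document.documentElement, "html") gives the same answer whether the document was parsed as HTML or as XML.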

Problem: My Cookie Won't Be Saved!

We found out already that the document object in XML files is different from the one in HTML files. Now we take a look at one property that is missing in XML files and that we will miss badly. In XML documents there is no document.cookie. That is, you can write something like

document.cookie = "key=value";

in XML as well, but you will find that literally nothing is saved in the cookie storage.

Solution: Use the Storage Object

With Firefox 2 a new feature was enabled, the HTML 5 Storage object. Although this feature is not without its critics, you can use it to work around the missing cookie if your document is of type XML. Again, you will have to write your own wrapper to handle any given combination of MIME type and browser.
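Such a wrapper might look like the sketch below. Everything here is an assumption for illustration: the name makeStore is made up, the storage argument is expected to offer the setItem and getItem methods of the Storage interface, and which storage object a given browser actually exposes is left to the caller. When no usable storage object is passed in, the sketch falls back to document.cookie, which only helps in HTML documents.

```javascript
// Hypothetical wrapper: prefers a Storage object when one is passed in
// and falls back to the cookie of the given document otherwise.
function makeStore(storage, doc) {
  if (storage && typeof storage.setItem === "function") {
    return {
      set: function (key, value) { storage.setItem(key, String(value)); },
      get: function (key) { return storage.getItem(key); }
    };
  }
  return {
    set: function (key, value) {
      doc.cookie = encodeURIComponent(key) + "=" + encodeURIComponent(value);
    },
    get: function (key) {
      var pairs = doc.cookie ? doc.cookie.split("; ") : [];
      for (var i = 0; i < pairs.length; i++) {
        var eq = pairs[i].indexOf("=");
        if (eq === -1) continue; // skip malformed entries
        if (decodeURIComponent(pairs[i].slice(0, eq)) === key) {
          return decodeURIComponent(pairs[i].slice(eq + 1));
        }
      }
      return null; // key not found
    }
  };
}
```

The rest of the script then only talks to the returned object, for example store.set("key", "value"), and does not need to know which mechanism is behind it.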

Problem: I Can't Use document.write()

This problem has the same cause as the one above. This method does not exist in XMLDocuments anymore. There are reasons why this decision was made, one being that a string of invalid markup will instantly break the whole document.

Solution: Use DOM Methods

Many people avoided DOM methods because of all the typing needed to create one simple element, when document.write() was completely sufficient. Now you can't do this as easily as before. Use DOM methods to create all of your elements, attributes and other nodes. This is XML-proof, as long as you keep the namespace problem in mind (e.g., there is a document.createElementNS method).
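To keep the typing manageable, the DOM calls can be bundled in a helper. The sketch below is one way to do it; the names XHTML_NS and appendTextElement are assumptions for this example, and the helper always creates elements in the XHTML namespace.

```javascript
// Assumed names for illustration only: XHTML_NS and appendTextElement.
var XHTML_NS = "http://www.w3.org/1999/xhtml";

// Creates <name>text</name> in the XHTML namespace and appends it to parent.
// doc is the document that will own the new nodes.
function appendTextElement(doc, parent, name, text) {
  var elem = doc.createElementNS(XHTML_NS, name);
  elem.appendChild(doc.createTextNode(text));
  parent.appendChild(elem);
  return elem;
}
```

A call such as appendTextElement(document, body, "h1", "Hello World!") then replaces the corresponding document.write() line, assuming body refers to the body element.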

Now, to be honest, you can still use strings as in document.write(), but it takes a little more effort. This code shows you how to do it:

var string = '<div xmlns="http://www.w3.org/1999/xhtml"><h1>Hello World!</h1></div>';
var parser = new DOMParser();
// parseFromString returns a whole Document, so import its root element
var parsed = parser.parseFromString(string, "text/xml");
body.appendChild(document.importNode(parsed.documentElement, true)); // assuming 'body' is the body element

But be aware that if your string is not well-formed XML (e.g., you have an & where it should not be), parsing will fail, leaving you with a parser error instead of your elements.
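One way to avoid the stray & problem is to escape any plain text before splicing it into the markup string. The following sketch assumes a helper named escapeXml, which is not a built-in function; it only handles the characters that matter for element content and double-quoted attribute values.

```javascript
// Hypothetical helper: escapes text so it can be placed inside XML markup.
// The ampersand must be replaced first, or the other entities would be
// escaped a second time.
function escapeXml(text) {
  return String(text)
    .replace(/&/g, "&amp;")
    .replace(/</g, "&lt;")
    .replace(/>/g, "&gt;")
    .replace(/"/g, "&quot;");
}
```

A string built as '<h1 xmlns="http://www.w3.org/1999/xhtml">' + escapeXml(title) + '</h1>' stays well-formed whatever the title variable contains.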

Problem: I want to remain forward compatible!

Given the move away from formatting attributes and the possibility of XHTML eventually becoming more prominent (or at least of the document author later wanting to make documents available in XHTML for browsers that support it), one may wish to avoid features that are unlikely to stay compatible in the future.

Solution: Avoid HTML-specific DOM

The HTML DOM, even though it is compatible with XHTML 1.0, is not guaranteed to work with future versions of XHTML (perhaps especially the formatting properties, which have been deprecated as element attributes). The regular XML DOM provides sufficient methods via the Element interface for getting, setting and removing attributes.
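As a small illustration of relying only on the Element interface, the sketch below sets or removes an attribute through setAttribute and removeAttribute rather than through an HTML-specific property such as elem.align. The helper name setOrRemoveAttribute is an assumption for this example.

```javascript
// Hypothetical helper: uses only core DOM Element methods.
// Passing null as the value removes the attribute.
function setOrRemoveAttribute(elem, name, value) {
  if (value === null) {
    elem.removeAttribute(name);
  } else {
    elem.setAttribute(name, String(value));
  }
  return elem.getAttribute(name); // current value, or null if absent
}
```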

Problem: My Favourite JS Library still Breaks

If you use JavaScript libraries like the famous prototype.js or Yahoo's, there is bad news for you: as long as the developers don't apply the points mentioned above, you won't be able to use them in your XML-based XHTML applications.

There are still two possible ways forward, but neither is very promising: take the library, recode it and publish it, or e-mail the developers, e-mail your friends to e-mail the developers and e-mail your customers to e-mail the developers. If they get the hint and are not too annoyed, perhaps they will start to implement XML features in their libraries.

I Read about E4X. Now, This Is Perfect, Isn't It?

As a matter of fact, it isn't. E4X is a new method of using and manipulating XML in JavaScript. But the ECMA standard does not define an interface to let E4X objects interact with the DOM objects our document consists of. So, for all the advantages E4X has, without a DOM interface you can't use it productively to manipulate your document. However, it can be used for data, and be converted into a string which can then be converted into a DOM object. DOM objects can similarly be converted into strings which can then be converted into E4X.

Finally: Content Negotiation

Now, how do we decide when to serve XHTML as XML? We can do this on the server side by evaluating the HTTP request header. Every browser sends with its request a list of MIME types it understands. So if the browser tells our server that it can handle XHTML as XML, that is, the Accept field in the HTTP header contains application/xhtml+xml somewhere, we are safe to send the content as XML.

In PHP, for example, you would write something like this:

if ( isset( $_SERVER['HTTP_ACCEPT'] ) && strpos( $_SERVER['HTTP_ACCEPT'], "application/xhtml+xml" ) !== false ) {
  header( "Content-type: application/xhtml+xml" );
  echo '<?xml version="1.0" ?>'."\n";
} else {
  header( "Content-type: text/html" );
}

This branch also sends the XML declaration, which is strongly recommended when the document is an XML file. If the content is sent as HTML, an XML declaration would break IE's doctype switch, so we don't want it there.

For completeness, here is the Accept field that Firefox 2.0.0.9 sends with its requests:

Accept: text/xml,application/xml,application/xhtml+xml,text/html;q=0.9,text/plain;q=0.8,image/png,*/*;q=0.5
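A plain substring test ignores the q values in the Accept field, so a browser that explicitly refuses application/xhtml+xml with q=0 would still be sent XML. The sketch below shows the same check written in JavaScript, as it might be used in a server-side script. The function name acceptsXhtml is an assumption, and the sketch deliberately ignores wildcard entries such as */*.

```javascript
// Hypothetical helper: returns true if the Accept header value lists
// application/xhtml+xml explicitly with a quality value above 0.
function acceptsXhtml(acceptHeader) {
  var entries = (acceptHeader || "").split(",");
  for (var i = 0; i < entries.length; i++) {
    var parts = entries[i].split(";");
    if (parts[0].trim().toLowerCase() !== "application/xhtml+xml") continue;
    var q = 1; // default quality when no q parameter is given
    for (var j = 1; j < parts.length; j++) {
      var m = /^\s*q\s*=\s*([0-9.]+)\s*$/i.exec(parts[j]);
      if (m) q = parseFloat(m[1]);
    }
    return q > 0;
  }
  return false; // type not listed explicitly
}
```

For the Firefox header shown above the function returns true, because application/xhtml+xml is listed without a q parameter.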

Further Reading

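You will find several useful articles in the developer wiki:

  • XML in Mozilla
  • DOM
  • XML Introduction
  • XML Extras

DOM 2 methods you will need are:

  • DOM:document.createElementNS
  • DOM:document.getElementsByTagNameNS

See also

  • Properly Using CSS and JavaScript in XHTML Documents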

Revision Source

<p>Website authors have started to write XHTML files now instead of HTML 4.01 for about 9 years, and some are starting to write XHTML5. But alas, few XHTML files viewed over the web are served with the correct MIME type, that is, with <em>application/xhtml+xml</em>! This is for one reason due to a certain browser, that is not capable of XHTML as XML. But it is also founded in the experience that the JavaScript, authored carefully for HTML, suddenly breaks in an XML environment.</p>
<p>This article shows some of the reasons alongside with strategies to remedy the problems. It will encourage web authors to use more XML features and make their JavaScripts interoperable with real XHTML applications.</p>
<p>To test the following examples locally, use <a href="/en/XML_in_Mozilla#XHTML" title="en/XML_in_Mozilla#XHTML">Firefox's extension switch</a>. Just write an ordinary (X)HTML file and save it once as <em>test.html</em> and once as <em>test.xhtml</em>.</p>
<h3 name="Problem:_Nothing_Works">Problem: Nothing Works</h3>
<p>After switching the MIME type suddenly no inline script works anymore. Even the plain old <em>alert()</em> method is gone. The code looks something like this:</p>
<pre class="brush: html"><span class="nowiki">&lt;script type="text/javascript"&gt;<br>   //&lt;!--<br>   window.alert("Hello World!");<br>   //--&gt;<br> &lt;/script&gt;</span>
</pre>
<h4 name="Solution:_The_CDATA_Trick">Solution: The CDATA Trick</h4>
<p>This problem usually arises, when inline scripts are included in comments. This was common practice in HTML, to hide the scripts from browsers not capable of JS. In the age of XML comments are what they were intended: comments. Before processing the file, all comments will be stripped from the document, so enclosing your script in them is like throwing your lunch in a Piranha pool. Moreover, there's really no point to commenting out your scripts -- <em>no browser written in the last ten years will display your code on the page</em>.</p>
<p>The easy solution is to do away with the commenting entirely:</p>
<pre class="brush: html"><span class="nowiki">&lt;script type="text/javascript"&gt;<br>   window.alert("Hello World!");<br> &lt;/script&gt;</span>
</pre>
<p>This will work so long as your code doesn't contain characters which are "special" in XML, which usually means <code>&lt;</code> and <code>&amp;</code>. If your code contains either of these, you can work around this with CDATA sections:</p>
<pre class="brush: html"><span class="nowiki">&lt;script type="text/javascript"&gt;<br> &lt;![CDATA[<br>   // is the variable a non-negative integer less than 10?<br>   if (variable &lt; 10 &amp;&amp; variable &gt;= 0)<br>     action();<br> ]]&gt;<br> &lt;/script&gt;</span>
</pre>
<p>Note that the CDATA section is only necessary because of the <code>&lt;</code> and <code>&amp;</code> characters in the code; otherwise you could have left it out.</p>
<p>A third solution is to <a href="/en/Properly_Using_CSS_and_JavaScript_in_XHTML_Documents" title="en/Properly_Using_CSS_and_JavaScript_in_XHTML_Documents">use only external scripts</a>, neatly sidestepping the special-character problem.</p>
<p>Alternatively, the CDATA section can be couched within comments so as to be able to work in either application/xhtml+xml or text/html:</p>
<pre class="brush: html">&lt;script type="text/javascript"&gt;
//&lt;![CDATA[
        ...
//]]&gt;
&lt;/script&gt;

&lt;!-- (For styles, it is different) --&gt;

&lt;style type="text/css"&gt;
/*&lt;![CDATA[*/
        ...
/*]]&gt;*/
&lt;/style&gt;</pre>
<p>And if you really need compatibility with very old browsers that do not recognize the script or style tags resulting in their contents displayed on the page, you can use this:</p>
<pre class="brush: html">&lt;script type="text/javascript"&gt;&lt;!--//--&gt;&lt;![CDATA[//&gt;&lt;!--
        ...
//--&gt;&lt;!]]&gt;&lt;/script&gt;

&lt;!-- (For styles, it is different) --&gt;

&lt;style type="text/css"&gt;&lt;!--/*--&gt;&lt;![CDATA[/*&gt;&lt;!--*/
        ...
/*]]&gt;*/--&gt;&lt;/style&gt;
</pre>
<p>See <a class="external" href="http://hixie.ch/advocacy/xhtml" title="http://hixie.ch/advocacy/xhtml">this document</a> for more on the issues related to application/xhtml+xml and text/html (at least as far as XHTML 1.* and HTML 4; HTML5 addresses many of these problems).</p>
<h3 name="Problem:_The_DOM_Changed">Problem: The DOM Changed</h3>
<p>The central object in the DOM, the <em>document</em> object, is of type <em>HTMLDocument</em> in HTML, whereas it is an <em>XMLDocument</em> in XML files. This has an especially huge impact on methods JavaScript authors are used to in daily work. Take the <em>document.getElementsByTagName</em> method, for example. This is a DOM 1 method, which means, there are no XML namespaces respected. Take a look at this common snippet:</p>
<pre class="brush: js">var headings = document.getElementsByTagName("h1");
for (var i = 0; i &lt; headings.length; i++) {
  doSomethingWith( headings[i] );
}
</pre>
<p>Enter the problem: in XHTML, served as XML, <strong>all</strong> elements are in the XHTML namespace (remember the <em>xmlns</em> attribute in the <em>html</em> tag?). This means, our plain old DOM 1 method suddenly finds no elements anymore. <em>Bang!</em> Immediately 80% of today's JavaScripts on the web crashed, including our snippet above.</p>
<h4 name="Solution:_Use_DOM_2_Methods">Solution: Use DOM 2 Methods</h4>
<p>The W3C introduced the DOM 2, addressing the needs of distinguishing namespaces. Perhaps you have seen sometimes before a method like <em>document.getElementsByTagNameNS</em>? The difference is the <strong>NS</strong> part, meaning, it looks for namespaces. How do we use this method? This is straight forward:</p>
<pre class="brush: js">var headings = document.getElementsByTagNameNS(<strong>"<a class=" external" href="http://www.w3.org/1999/xhtml" rel="freelink">http://www.w3.org/1999/xhtml</a>"</strong>,"h1");
for (var i = 0; i &lt; headings.length; i++) {
  doSomethingWith( headings[i] );
}
</pre>
<p>The only difference is the mentioning of the namespace the element is in. Okay, more letters to type, but you can define shorthands. Then, let's take only DOM 2 methods from now on!</p>
<div class="warning">The following section only applies to previous versions of Gecko. In Gecko 2.0 all HTML elements have the XHTML namespace.</div>
<p>But wait! Now, taking a look in our HTML file, the script breaks again! Remember, in HTML the elements are in <strong>no namespace at all</strong>! So, what we have to do now is writing a wrapper, that determines, if we are dealing with an HTML or an XML file. Check out this piece of code:</p>
<pre class="brush: js">function getHTMLByTagName(tagName) {
  if (document.documentElement.namespaceURI === null) {
    return document.getElementsByTagName(tagName);
  } else {
    return document.getElementsByTagNameNS("<a class=" external" href="http://www.w3.org/1999/xhtml" rel="freelink">http://www.w3.org/1999/xhtml</a>", tagName);
  }
}
</pre>
<p>Rather than using the non-standard document.contentType (and/or document.mimeType for Explorer), we check for the namespaceURI in order to detect whether we can use the namespace-aware getElementsByTagNameNS. We cannot use it in HTML 4 or XHTML 1.0 served as text/html (XHTML 1.1 should not be served as text/html), and the namespaceURI check will correctly return null. On the other hand, true XHTML (served as application/xhtml+xml) and HTML5 (which defines HTML elements as supporting namespaces as far as the DOM for greater portability with XHTML) indicate a namespaceURI (the XHTML one) and can both support getElementsByTagNameNS.</p>
<p><em>NB:</em> The DOM 1 method <em>getElementsByTagName</em> also exists in XML documents. It will find every element of a given name that is in no namespace at all. For this reason, AJAX's responseXML is often processed with DOM 1 methods without any problems, because very little XML sent via XMLHttpRequest bothers with namespaces.</p>
<p><em>NB:</em> Also note that detection of support for the method <em>getElementsByTagNameNS </em>is not sufficient (as some believe browser support automatically means XHTML will be used), since detection will also return true in a text/html environment.</p>
<p>For creating an element in a similarly agnostic fashion (e.g., within a library), a similar process can be used. If there is browser support, and if the document is in a namespace, we know we can use the XML/XHTML methods. If the root tag is not html, we also know we can use them, since this cannot be HTML 4 or earlier (if it were HTML5, there would be a namespaceURI). Lastly, in order to catch all possible cases of non-HTML, non-XHTML XML, we can detect the content type (at least in Mozilla and IE) using non-standard properties. Why is this necessary? Because XML allows root tags with any name, even "html", without the document actually being XHTML (it becomes XHTML solely when it is served with the specific content type). If the content type is not text/html (e.g., application/xml), then we can assume that the document we are in is XML (but with the root node in the null namespace), and we can use the namespaced method. It is highly unlikely, and certainly poor practice, for a document to use a name like "html" in plain, non-XHTML XML, but we cover the case to illustrate the possibility.</p>
<pre class="brush: js">function createElementNSIfPossible (ns, name, doc) {
  var elem, d = doc || document;
  if (d.createElementNS &amp;&amp;  // Browser supports the method
      (d.documentElement.nodeName.toLowerCase() != 'html' || // We know it's not HTML4 or less, if the tag is not HTML
       (d.contentType &amp;&amp; d.contentType != 'text/html') || // We know it's not regular HTML4 or less if this is Mozilla and the content type is something other than text/html
       (d.mimeType &amp;&amp; d.mimeType != 'text/html') // May be possible for IE to be similarly comprehensive now that IE9 supports XHTML
       )
      ) { // Don't create namespaced elements if we're being served as HTML (currently only Mozilla supports this detection in true XHTML-supporting browsers, but Safari and Opera should work with the above DOMParser anyways, and IE doesn't support createElementNS anyways); last test is for the sake of being in a pure XML document
      elem = d.createElementNS(ns, name);
  }
  else {
    elem = d.createElement(name);
  }
  return elem;
}</pre><h3>Problem: Names in XHTML and HTML are represented in different cases</h3>
<p>Scripts that used getElementsByTagName() with an upper case HTML name no longer work, and attributes like nodeName or tagName return upper case in HTML and lower case in XHTML.</p>
<h3>Solution: Use or convert to lower case</h3>
<p>For methods like getElementsByTagName(), passing the name in lower case will work in both HTML and XHTML. For name comparisons, convert to lower case before doing the comparison (e.g., "el.nodeName.toLowerCase() === 'html'"). This ensures that names in HTML documents compare correctly and does no harm in XHTML, where the names are already lower case.</p>
<h3 name="Problem:_My_Cookie_Won.27t_Be_Saved.21">Problem: My Cookie Won't Be Saved!</h3>
<p>We found out already, that the document object in XML files is different from the ones in HTML files. Now we take a look at one property, that is missing in XML files and that we will miss very bad. In XML documents there is no <em>document.cookie</em>. That is, you can write something like</p>
<pre class="brush: js">document.cookie = "key=value";
</pre>
<p>in XML as well, but you will find out, that literally nothing is saved in the cookie storage.</p>
<h4 name="Solution:_Use_the_Storage_Object">Solution: Use the Storage Object</h4>
<p>With Firefox 2 there was a new feature enabled, the <a href="/en/DOM/Storage" title="en/DOM/Storage">HTML 5 Storage object</a>. Although this feature is not free of critics, you can use it to bypass the non-existing cookie, if your document is of type XML. Again, you will have to write your own wrapper to respect any given combination of MIME type and browser.</p>
<h3 name="Problem:_I_Can.27t_Use_document.write.28.29">Problem: I Can't Use <em>document.write()</em></h3>
<p>This problem has the same cause as the one above. This method does not exist in <em>XMLDocument</em>s anymore. There are reasons why this decision was made, one being that a string of invalid markup will instantly break the whole document.</p>
<h4 name="Solution:_Use_DOM_Methods">Solution: Use DOM Methods</h4>
<p>Many people avoided DOM methods because of the typing to create one simple element, when <em>document.write()</em> was completely satisfying. Now you can't do this as easily as before. Use DOM methods to create all of your elements, attributes and other nodes. This is XML proof, as long as you keep the namespace problem in focus (e.g., there is a <em>document.createElementNS</em> method).</p>
<p>Now, to be honest, you can still use strings as in document.write(), but it takes a little more effort. This code shows you how to do it:</p>
<pre class="brush: js">var string = '<span class="nowiki">&lt;div xmlns="http://www.w3.org/1999/xhtml"&gt;&lt;h1&gt;Hello World!&lt;/h1&gt;&lt;/div&gt;</span>';
var parser = new DOMParser();
// parseFromString returns a whole Document, so import its root element
var parsed = parser.parseFromString(string, "text/xml");
body.appendChild(document.importNode(parsed.documentElement, true)); // assuming 'body' is the body element
</pre>
<p>But be aware, that if your string is not well-formed XML (e.g., you have an &amp; where it should not be), then this method will crash, leaving you with a parser error.</p>
<h3>Problem: I want to remain forward compatible!</h3>
<p>Given the direction away from formatting attributes and the possibility of XHTML becoming eventually more prominent (or at least the document author having the possibility of later wanting to make documents available in XHTML for browsers that support it), one may wish to avoid features which are not likely to stay compatible into the future.</p>
<h3>Solution: Avoid HTML-specific DOM</h3>
<p>The <a class="external" href="http://www.w3.org/TR/DOM-Level-2-HTML/html.html" title="http://www.w3.org/TR/DOM-Level-2-HTML/html.html">HTML DOM</a> , even though it is compatible with XHTML 1.0, is not guaranteed to work with future versions of XHTML (perhaps especially the formatting properties which have been deprecated as element attributes). The regular XML <a href="/en/DOM" title="En/DOM">DOM</a> provides sufficient methods via the <a href="/en/DOM/element" title="En/DOM/Element">Element</a> interface for getting/setting/removing attributes.</p>
<h3 name="Problem:_My_Favourite_JS_Library_still_Breaks">Problem: My Favourite JS Library still Breaks</h3>
<p>If you use JavaScript libraries like the famous prototype.js or Yahoo's one, there is bad news for you: As long as the developers don't start to apply the points mentioned above, you won't be able to use them in your XML-XHTML applications.</p>
<p>There are still two possible ways forward, but neither is very promising: take the library, recode it and publish it, or e-mail the developers, e-mail your friends to e-mail the developers and e-mail your customers to e-mail the developers. If they get the hint and are not too annoyed, perhaps they will start to implement XML features in their libraries.</p>
<h3 name="I_Read_about_E4X._Now.2C_This_Is_Perfect.2C_Isn.27t_It.3F">I Read about E4X. Now, This Is Perfect, Isn't It?</h3>
<p>As a matter of fact, it isn't. <a href="/en/E4X" title="en/E4X">E4X</a> is a new method of using and manipulating XML in JavaScript. But, standardized by ECMA, they neglected to implement an interface to let E4X objects interact with DOM objects our document consists of. So, with every advantage E4X has, without a DOM interface you can't use it productively to manipulate your document. However, it can be used for data, and be converted into a string which can then be converted into a DOM object. DOM objects can similarly be converted into strings which can then be converted into E4X.</p>
<h3 name="Finally:_Content_Negotiation">Finally: Content Negotiation</h3>
<p>Now, how do we decide, when to serve XHTML as XML? We can do this on server side by evaluating the HTTP request header. Every browser sends with its request a list of MIME types it understands. So if the browser tells our server, that it can handle XHTML as XML, that is, the <em>Accept</em> field in the HTTP header contains <em>application/xhtml+xml</em> somewhere, we are safe to send the content as XML.</p>
<p>In PHP, for example, you would write something like this:</p>
<pre class="brush: js">if ( isset( $_SERVER['HTTP_ACCEPT'] ) &amp;&amp; strpos( $_SERVER['HTTP_ACCEPT'], "application/xhtml+xml" ) !== false ) {
  header( "Content-type: application/xhtml+xml" );
  echo '&lt;?xml version="1.0" ?&gt;'."\n";
} else {
  header( "Content-type: text/html" );
}
</pre>
<p>This distinction also sends the XML declaration, which is strongly recommended, when the document is an XML file. If the content is sent as HTML, an XML declaration would break IE's Doctype switch, so we don't want it there.</p>
<p>For completeness here is the <em>Accept</em> field, that Firefox 2.0.0.9 sends with its requests:</p>
<pre class="eval">Accept: text/xml,application/xml,<strong>application/xhtml+xml</strong>,text/html;q=0.9,text/plain;q=0.8,image/png,*/*;q=0.5
</pre>
<h3 name="Further_Reading">Further Reading</h3>
<p>You will find several useful articles in the developer wiki:</p>
<ul> <li><a href="/en/XML_in_Mozilla" title="en/XML_in_Mozilla">XML in Mozilla</a></li> <li><a href="/en/DOM" title="en/DOM">DOM</a></li> <li><a href="/en/XML_Introduction" title="en/XML_Introduction">XML Introduction</a></li> <li><a href="/en/XML_Extras" title="en/XML_Extras">XML Extras</a></li>
</ul>
<p>DOM 2 methods you will need are:</p>
<ul> <li><a href="/en/DOM/document.createElementNS" title="en/DOM/document.createElementNS">DOM:document.createElementNS</a></li> <li><a href="/en/DOM/document.getElementsByTagNameNS" title="en/DOM/document.getElementsByTagNameNS">DOM:document.getElementsByTagNameNS</a></li>
</ul>
<h3> See also</h3>
<ul> <li><a href="/en/Properly_Using_CSS_and_JavaScript_in_XHTML_Documents" title="en/Properly_Using_CSS_and_JavaScript_in_XHTML_Documents">Properly Using CSS and JavaScript in XHTML Documents</a></li>
</ul>