Parsing and serializing XML

  • Revision slug: Parsing_and_serializing_XML
  • Revision title: Parsing and serializing XML
  • Revision id: 14973
  • Created:
  • Creator: Brettz9
  • Is current revision? No
  • Comment 3 words added, 3 words removed

Revision Content

 

Mozilla doesn't support the W3C's Document Object Model Load and Save at this moment ({{ Bug("155749") }}), so the easiest way to serialize and deserialize DOM trees is to use the following Mozilla-specific interfaces:

  • XMLSerializer to serialize DOM trees to strings or to files
  • DOMParser to parse XML from strings into DOM trees
  • XMLHttpRequest to parse XML from files into DOM trees. Although DOMParser does have a method named parseFromStream(), it's actually easier to use XMLHttpRequest, which works for both remote (not limited to HTTP) and local files.

Serializing DOM trees to strings

First, create a DOM tree as described in How to Create a DOM tree. Alternatively, use a DOM tree obtained from XMLHttpRequest.

Now, let's serialize doc — the DOM tree — to a string:

var serializer = new XMLSerializer();
var xml = serializer.serializeToString(doc);

From within a JS XPCOM component, new XMLSerializer() is not available. Instead, write:

 var serializer = Components.classes["@mozilla.org/xmlextras/xmlserializer;1"].createInstance(Components.interfaces.nsIDOMSerializer);
 var xml = serializer.serializeToString(doc);

"Pretty" serialization of DOM trees to strings

You can pretty print a DOM tree using XMLSerializer and E4X. First, create a DOM tree as described in the How to Create a DOM tree article. Alternatively, use a DOM tree obtained from XMLHttpRequest. We assume it's in the doc variable.

var serializer = new XMLSerializer();
var prettyString = XML(serializer.serializeToString(doc)).toXMLString();

The output is indented with two spaces per nesting level. You can also use DOM:treeWalker to write your own, more performant version, which has the added advantage of letting you choose the indent string.

Note: When you use the E4X toXMLString method, CDATA sections are lost and only the text they contain remains. The method above may therefore not be suitable if your XML contains CDATA sections. For example,

<content><![CDATA[This is the content]]></content>

will become


<content>This is the content</content>
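
A hand-written pretty-printer of the kind mentioned above might look like the following minimal sketch. It uses plain recursion over childNodes rather than a TreeWalker, and the names prettySerialize and escapeXML are hypothetical helpers, not part of any Mozilla API. It assumes whitespace-only text nodes are insignificant, skips comments and processing instructions, and keeps CDATA sections intact.

```javascript
// Node type constants from DOM Level 1 (same values as Node.ELEMENT_NODE etc.).
var ELEMENT_NODE = 1, TEXT_NODE = 3, CDATA_SECTION_NODE = 4, DOCUMENT_NODE = 9;

// Escape the characters that are special in XML text and attribute values.
function escapeXML(text) {
  return String(text).replace(/&/g, "&amp;").replace(/</g, "&lt;")
                     .replace(/>/g, "&gt;").replace(/"/g, "&quot;");
}

// Hypothetical helper: recursively serialize `node`, indenting each nesting
// level with `indent` (two spaces by default, but "\t" or anything else works).
function prettySerialize(node, indent, depth) {
  indent = (indent === undefined) ? "  " : indent;
  depth = depth || 0;
  var pad = new Array(depth + 1).join(indent); // `indent` repeated `depth` times
  var i;

  if (node.nodeType == DOCUMENT_NODE) {
    return prettySerialize(node.documentElement, indent, depth);
  }
  if (node.nodeType == CDATA_SECTION_NODE) {
    return pad + "<![CDATA[" + node.nodeValue + "]]>\n"; // kept as CDATA
  }
  if (node.nodeType == TEXT_NODE) {
    var text = node.nodeValue.replace(/^\s+|\s+$/g, "");
    return text ? pad + escapeXML(text) + "\n" : "";
  }
  if (node.nodeType != ELEMENT_NODE) {
    return ""; // comments, processing instructions etc. are skipped in this sketch
  }

  var tag = "<" + node.nodeName;
  for (i = 0; i < node.attributes.length; i++) {
    tag += " " + node.attributes[i].name + '="' + escapeXML(node.attributes[i].value) + '"';
  }
  var children = "";
  for (i = 0; i < node.childNodes.length; i++) {
    children += prettySerialize(node.childNodes[i], indent, depth + 1);
  }
  if (!children) {
    return pad + tag + "/>\n";
  }
  return pad + tag + ">\n" + children + pad + "</" + node.nodeName + ">\n";
}
```

Calling prettySerialize(doc, "\t") would indent with tabs instead of two spaces. Because every text node is placed on its own line, this sketch is not suitable for documents where whitespace inside mixed content matters.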

Serializing DOM trees to files

First, create a DOM tree as described in the How to Create a DOM tree article. If you already have a DOM tree from XMLHttpRequest, skip to the end of this section.

Now, let's serialize doc — the DOM tree — to a file (you can read more about using files in Mozilla):

var serializer = new XMLSerializer();
var foStream = Components.classes["@mozilla.org/network/file-output-stream;1"]
               .createInstance(Components.interfaces.nsIFileOutputStream);
var file = Components.classes["@mozilla.org/file/directory_service;1"]
           .getService(Components.interfaces.nsIProperties)
           .get("ProfD", Components.interfaces.nsIFile); // get profile folder
file.append("extensions");   // extensions sub-directory
file.append("{5872365E-67D1-4AFD-9480-FD293BEBD20D}");   // GUID of your extension
file.append("myXMLFile.xml");   // filename
foStream.init(file, 0x02 | 0x08 | 0x20, 0664, 0);   // write, create, truncate
serializer.serializeToStream(doc, foStream, "");   // remember, doc is the DOM tree
foStream.close();
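
The numeric arguments passed to foStream.init() above are NSPR open flags and a Unix-style permission mode. As a small sketch, they can be given names so the call is easier to read. The constant names follow NSPR's prio.h, while the variable names ioFlags and permissions are only illustrative.

```javascript
// NSPR open flags accepted by nsIFileOutputStream.init() (values from prio.h).
var PR_WRONLY      = 0x02; // open for writing only
var PR_CREATE_FILE = 0x08; // create the file if it does not exist
var PR_TRUNCATE    = 0x20; // truncate an existing file to zero length

var ioFlags = PR_WRONLY | PR_CREATE_FILE | PR_TRUNCATE;

// Permissions rw-rw-r-- (octal 664), written without an octal literal.
var permissions = parseInt("664", 8);

// Usage, assuming foStream and file are set up as in the example above:
// foStream.init(file, ioFlags, permissions, 0);
```

Parsing the mode with parseInt("664", 8) avoids the legacy octal literal 0664, which some JavaScript environments reject in strict mode.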

Serializing XMLHttpRequest objects to files

If you already have a DOM tree from using XMLHttpRequest, use the same code as above but replace serializer.serializeToStream(doc, foStream, "") with serializer.serializeToStream(xmlHttpRequest.responseXML.documentElement, foStream, "") where xmlHttpRequest is an instance of XMLHttpRequest.

Note that this first parses the XML retrieved from the server, then re-serializes it into a stream. Depending on your needs, you could just save the xmlHttpRequest.responseText directly.

Parsing strings into DOM trees

var theString='<a id="a"><b id="b">hey!</b></a>';
var parser = new DOMParser();
var dom = parser.parseFromString(theString, "text/xml");
// print the name of the root element or error message
dump(dom.documentElement.nodeName == "parsererror" ? "error while parsing" : dom.documentElement.nodeName);

Tutorial to make this work cross browser
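
The parsererror check above recurs in every parsing example on this page, so it may be convenient to wrap it in a function. The following is a minimal sketch; getRootOrThrow is a hypothetical helper name, and it assumes, as in Gecko, that a failed parse yields a document whose root element is named parsererror and whose text content carries the error message.

```javascript
// Hypothetical helper: return the root element of `dom`, or throw an Error
// if the parser produced a <parsererror> document instead.
function getRootOrThrow(dom) {
  var root = dom.documentElement;
  if (root.nodeName == "parsererror") {
    // The text content of the parsererror element holds the parser's message.
    throw new Error("XML parsing failed: " + root.textContent);
  }
  return root;
}

// Usage, assuming theString holds XML as in the example above:
// var root = getRootOrThrow(new DOMParser().parseFromString(theString, "text/xml"));
```

Throwing an exception rather than printing with dump() lets the caller decide how to report or recover from malformed input.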

Parsing files into DOM trees

XMLHttpRequest

As was previously mentioned, even though DOMParser does have a method named parseFromStream(), it's easier to use XMLHttpRequest to parse XML files into DOM trees (XMLHttpRequest works for both local and remote files). Here is sample code which reads and parses a local XML file into a DOM tree:

var req = new XMLHttpRequest();
req.open("GET", "chrome://passwdmaker/content/people.xml", false); 
req.send(null);
// print the name of the root element or error message
var dom = req.responseXML;
dump(dom.documentElement.nodeName == "parsererror" ? "error while parsing" : dom.documentElement.nodeName);

req.responseXML is a Document instance.

io.js

If you prefer io.js, this code will also parse a file into a DOM tree. Unlike XMLHttpRequest, it will not work with remote files:

var file = DirIO.get("ProfD"); // %Profile% dir
file.append("extensions");
file.append("{5872365E-67D1-4AFD-9480-FD293BEBD20D}");
file.append("people.xml");
var fileContents = FileIO.read(file);
var domParser = new DOMParser();
var dom = domParser.parseFromString(fileContents, "text/xml");
// print the name of the root element or error message
dump(dom.documentElement.nodeName == "parsererror" ? "error while parsing" : dom.documentElement.nodeName);

Resources

  • Sarissa - Sarissa is a cross-browser ECMAScript library for client side XML manipulation, including loading XML from URLs or strings, performing XSLT transformations, XPath queries and more. Supported: Gecko (Mozilla, Firefox etc), IE, KHTML (Konqueror, Safari). If you're writing JavaScript that is used in both XUL applications and HTML pages, and the HTML pages may be viewed in non-Gecko-based applications (such as Internet Explorer, Opera, Konqueror, Safari), you should consider using Sarissa to parse and/or serialize XML. Note: Do not create a DOM object using document.implementation.createDocument() and then use Sarissa classes and methods to manipulate that object. It will not work. You must use Sarissa to create the initial DOM object.
  • Parsing and Serializing XML at XUL Planet

