International characters in XUL JavaScript

  • Revision slug: International_characters_in_XUL_JavaScript
  • Revision title: International characters in XUL JavaScript
  • Revision id: 128361
  • Created:
  • Creator: Biesi
  • Is current revision? No
  • Comment consistently use character encoding

Revision Content

Introduction

Gecko 1.8, as used in Firefox 1.5 and other applications, added support for non-ASCII characters in JavaScript files loaded from XUL files.

This means that such script files can use any character from virtually any language of the world. For example, they can contain a line:

var text = "Ein schönes Beispiel eines mehrsprachigen Textes: 日本語";

This mixes German and Japanese characters.

Earlier versions always interpreted JS files loaded from XUL as ISO-8859-1 (Latin-1). (Unicode escapes, as discussed below, have always worked.)
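
The effect of the old Latin-1 interpretation on a UTF-8 script can be illustrated with a small sketch. It is not Gecko code: the function name misreadUtf8AsLatin1 is hypothetical, and the sketch assumes an environment that provides TextEncoder. It encodes a string as UTF-8 and then maps each byte to the Latin-1 character with the same number, which is what reading the file as ISO-8859-1 amounts to.

```javascript
// Hypothetical helper: shows what a UTF-8 encoded string looks like
// when every byte is treated as one ISO-8859-1 (Latin-1) character.
function misreadUtf8AsLatin1(text) {
  // Encode the text as UTF-8 bytes.
  const utf8Bytes = new TextEncoder().encode(text);
  let result = "";
  for (const byte of utf8Bytes) {
    // Latin-1 maps byte values 0x00-0xFF directly to U+0000-U+00FF.
    result += String.fromCharCode(byte);
  }
  return result;
}
```

Under these assumptions, "schön" comes out as "sch" followed by the two characters U+00C3 and U+00B6 and then "n", because "ö" occupies two bytes (0xC3 0xB6) in UTF-8.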

How the character encoding is determined

When the JavaScript file is loaded from a chrome:// URL, for example as part of an extension, a Byte Order Mark (BOM) is used to determine the character encoding of the script. Otherwise, the character encoding is the same as that of the XUL file, which can be specified with the encoding attribute of the XML declaration (<?xml ... ?>). By default this is UTF-8, which can represent virtually all characters in the world.
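
As a rough illustration of what BOM-based detection involves, the sketch below inspects the first bytes of a file. It is not Gecko's implementation: the function name detectBom is hypothetical, and the set of BOMs checked (UTF-8 and the two UTF-16 byte orders) is an assumption made for the example. UTF-32 BOMs are deliberately ignored.

```javascript
// Hypothetical helper: returns the encoding named by a leading
// Byte Order Mark, or null if the bytes do not start with one.
// `bytes` is any array-like of byte values, e.g. a Uint8Array.
function detectBom(bytes) {
  // UTF-8 BOM: EF BB BF
  if (bytes.length >= 3 &&
      bytes[0] === 0xEF && bytes[1] === 0xBB && bytes[2] === 0xBF) {
    return "UTF-8";
  }
  // UTF-16 big-endian BOM: FE FF
  if (bytes.length >= 2 && bytes[0] === 0xFE && bytes[1] === 0xFF) {
    return "UTF-16BE";
  }
  // UTF-16 little-endian BOM: FF FE
  if (bytes.length >= 2 && bytes[0] === 0xFF && bytes[1] === 0xFE) {
    return "UTF-16LE";
  }
  // No recognized BOM; the caller would fall back to another rule.
  return null;
}
```

A caller would pass the raw bytes of the script file, for example detectBom(new Uint8Array([0xEF, 0xBB, 0xBF, 0x76])), and fall back to the XUL file's encoding when the result is null.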

If the script file is loaded via HTTP, the server can declare the character encoding in the charset parameter of the Content-Type header, for example:

Content-Type: application/x-javascript; charset=UTF-8

If no charset parameter is specified, the same rules as above apply.
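
The following sketch shows one way the charset parameter could be read from a Content-Type value. It is a simplified illustration, not Gecko's header parser: the function name charsetFromContentType is hypothetical, and the parsing assumes parameters are separated by semicolons with no semicolons inside quoted values.

```javascript
// Hypothetical helper: extracts the charset parameter from a
// Content-Type header value, or returns null if there is none.
function charsetFromContentType(headerValue) {
  // Everything after the first ";" is a list of name=value parameters.
  const params = headerValue.split(";").slice(1);
  for (const param of params) {
    const eq = param.indexOf("=");
    if (eq === -1) continue; // not a name=value pair
    const name = param.slice(0, eq).trim().toLowerCase();
    if (name !== "charset") continue;
    let value = param.slice(eq + 1).trim();
    // Strip optional surrounding double quotes.
    if (value.length >= 2 && value.startsWith('"') && value.endsWith('"')) {
      value = value.slice(1, -1);
    }
    return value === "" ? null : value;
  }
  return null;
}
```

For the header shown above, charsetFromContentType("application/x-javascript; charset=UTF-8") yields "UTF-8"; a null result corresponds to the case where no charset parameter is specified.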

Cross-version compatibility

If you want the same code to work in both Gecko 1.8 and earlier versions, you must limit yourself to ASCII. However, you can use Unicode escapes; the earlier example rewritten with them would be:

var text = "Ein sch\u00F6nes Beispiel eines mehrsprachigen Textes: \u65E5\u672C\u8A9E";
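
Writing such escapes by hand is error-prone, so a small conversion helper can be useful. The sketch below is one possible approach; the function name escapeNonAscii is hypothetical. It works on UTF-16 code units, so characters outside the Basic Multilingual Plane become two \u escapes (a surrogate pair), which JavaScript string literals accept.

```javascript
// Hypothetical helper: replaces every non-ASCII UTF-16 code unit in
// `text` with a \uXXXX escape, leaving ASCII characters untouched.
function escapeNonAscii(text) {
  let result = "";
  for (let i = 0; i < text.length; i++) {
    const code = text.charCodeAt(i);
    if (code < 0x80) {
      result += text[i];
    } else {
      // Four uppercase hex digits, zero-padded on the left.
      result += "\\u" + code.toString(16).toUpperCase().padStart(4, "0");
    }
  }
  return result;
}
```

Applied to the mixed German and Japanese string from the introduction, this produces the escaped form shown above. Note that it only escapes the characters; quotes and backslashes already in the input are not handled.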

An alternative is to use property files via nsIStringBundle or the XUL <stringbundle> element; this would also allow the XUL to be localized. This cannot be done in XUL files loaded from the web, only in privileged code such as extensions.
