International characters in XUL JavaScript

  • Revision slug: International_characters_in_XUL_JavaScript
  • Revision title: International characters in XUL JavaScript
  • Revision id: 128358
  • Created:
  • Creator: Biesi
  • Is current revision? No
  • Comment /* Introduction */

Revision Content

Introduction

Gecko 1.8, as used in Firefox 1.5 and other browsers, added support for non-ASCII characters in JavaScript files loaded from XUL files.

This means that such script files can contain characters from virtually any language. For example, a script can contain the line:

 var text = "Ein schönes Beispiel eines mehrsprachigen Textes: 日本語";

This mixes German and Japanese characters.

Earlier versions always interpreted JS files loaded from XUL as ISO-8859-1 (Latin-1), regardless of how the file was actually encoded. (Unicode escapes, as discussed below, have always worked.)
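To illustrate why the Latin-1 assumption mattered, the following sketch reproduces the effect in a modern JavaScript runtime. It assumes the script file was saved as UTF-8; the helper name readAsLatin1 is hypothetical, and TextEncoder is used only to obtain the UTF-8 bytes (it is not part of Gecko 1.8). ISO-8859-1 maps each byte to the code point with the same value, so every multi-byte UTF-8 sequence turns into several unrelated characters.

```javascript
// Hypothetical helper: decode bytes the way a Latin-1-only reader would,
// mapping every byte 0x00-0xFF to the code point with the same value.
function readAsLatin1(bytes) {
  return String.fromCharCode(...bytes);
}

// UTF-8 encodes "ö" (U+00F6) as the two bytes 0xC3 0xB6.
const utf8Bytes = new TextEncoder().encode("sch\u00F6nes");

// Read as Latin-1, those two bytes become "Ã" (U+00C3) and "¶" (U+00B6).
const misread = readAsLatin1(utf8Bytes);
console.log(misread);
```

Pure ASCII text is unaffected, because ASCII bytes have the same meaning in UTF-8 and ISO-8859-1.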

How the character encoding is determined

In most cases, the JavaScript file will be loaded from a chrome:// URL as part of an extension. In that case, if the file starts with a Byte Order Mark (BOM), the BOM determines the character encoding of the script. Otherwise, the script uses the same character encoding as the XUL file that loads it (which can be specified using the encoding attribute of the <?xml?> declaration). If the XUL file declares no encoding, it defaults to UTF-8, which can represent almost all characters in the world.
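The decision order described above can be sketched as a small function. This is an illustration only, not Gecko's implementation: the function name resolveScriptCharset and its parameters are hypothetical, and the set of BOMs checked (UTF-8, UTF-16BE, UTF-16LE) is an assumption rather than something stated here.

```javascript
// Hypothetical sketch of the lookup order: BOM first, then the XUL file's
// encoding, then the UTF-8 default.
// bytes: array-like of the script file's first bytes.
// xulEncoding: encoding declared by the loading XUL file, or null if none.
function resolveScriptCharset(bytes, xulEncoding) {
  // UTF-8 BOM: EF BB BF
  if (bytes.length >= 3 && bytes[0] === 0xEF && bytes[1] === 0xBB && bytes[2] === 0xBF) {
    return "UTF-8";
  }
  // UTF-16 big-endian BOM: FE FF
  if (bytes.length >= 2 && bytes[0] === 0xFE && bytes[1] === 0xFF) {
    return "UTF-16BE";
  }
  // UTF-16 little-endian BOM: FF FE
  if (bytes.length >= 2 && bytes[0] === 0xFF && bytes[1] === 0xFE) {
    return "UTF-16LE";
  }
  // No BOM: inherit from the XUL file, which itself defaults to UTF-8.
  return xulEncoding || "UTF-8";
}

// "var" preceded by a UTF-8 BOM: the BOM wins over the XUL encoding.
console.log(resolveScriptCharset([0xEF, 0xBB, 0xBF, 0x76, 0x61, 0x72], "ISO-8859-1"));
// No BOM: the XUL file's encoding is used.
console.log(resolveScriptCharset([0x76, 0x61, 0x72], "ISO-8859-1"));
```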

If the script file is loaded over HTTP, the server can declare the character encoding in the charset parameter of the Content-Type response header, for example:

Content-Type: application/x-javascript; charset=UTF-8

If the header specifies no charset, the same rules as above apply: a Byte Order Mark takes precedence, and otherwise the encoding of the XUL file is used.
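As a rough sketch of what "specifying a charset" means at the header level, the following hypothetical helper extracts the charset parameter from a Content-Type value. The name charsetFromContentType is an assumption, and the regular expression is a simplification that handles only the common quoted and unquoted forms. A null result corresponds to the case where the rules above apply.

```javascript
// Hypothetical helper: return the charset parameter of a Content-Type
// header value, or null when the header does not declare one.
function charsetFromContentType(headerValue) {
  // Matches "; charset=UTF-8" as well as "; charset=\"UTF-8\"".
  const match = /;\s*charset\s*=\s*("?)([^";\s]+)\1/i.exec(headerValue);
  return match ? match[2] : null;
}

console.log(charsetFromContentType("application/x-javascript; charset=UTF-8"));
console.log(charsetFromContentType("application/x-javascript"));
```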

Cross-version compatibility

If you want the same code to work in both Gecko 1.8 and earlier versions, you must limit the script file to ASCII characters. However, you can still express any character with Unicode escapes. The earlier example, rewritten using them, becomes:

var text = "Ein sch\u00F6nes Beispiel eines mehrsprachigen Textes: \u65E5\u672C\u8A9E";
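Writing the escapes by hand is error-prone, so a conversion like the one above could be automated. The following is a minimal sketch, with the function name toUnicodeEscapes chosen for illustration. It escapes every UTF-16 code unit above 0x7F, which means characters outside the Basic Multilingual Plane come out as two escapes (a surrogate pair), a form JavaScript string literals accept. It is meant to be applied to the contents of string literals, not to whole source files.

```javascript
// Hypothetical helper: replace every non-ASCII UTF-16 code unit in a string
// with its \uXXXX escape so the result contains only ASCII characters.
function toUnicodeEscapes(source) {
  let out = "";
  for (let i = 0; i < source.length; i++) {
    const unit = source.charCodeAt(i);
    if (unit > 0x7F) {
      // Four uppercase hex digits, zero-padded, e.g. 0xF6 -> "\u00F6".
      out += "\\u" + unit.toString(16).toUpperCase().padStart(4, "0");
    } else {
      out += source[i];
    }
  }
  return out;
}

console.log(toUnicodeEscapes("Ein schönes Beispiel eines mehrsprachigen Textes: 日本語"));
```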

An alternative is to move the strings into property files and read them via nsIStringBundle or the XUL <stringbundle> element; this also allows the XUL to be localized. This cannot be done in XUL files loaded from the web, only in privileged code, e.g. in extensions.
