Revision 129357 of Encodings for localization files

  • Revision slug: Encodings_for_localization_files
  • Revision title: Encodings for localization files
  • Revision id: 129357
  • Created:
  • Creator: Hamaryns
  • Is current revision? No
  • Comment better wording, quote signs

Revision Content

When creating a localization for Mozilla products, it’s important to be aware of the encoding of the files that you generate.

In general, files in the Mozilla CVS repositories are UTF-8 encoded. There are a few exceptions, though.

Installer

The Windows installer can’t handle UTF-8; it supports only the codepages provided by Windows. The way this is wired into the build process is a little tricky, so the details are listed here:

File | Encoding | Notes
toolkit/installer/windows/charset.mk | ASCII | The WIN_INSTALLER_CHARSET variable must be set to an encoding that matches the CHARSET= parameter in toolkit/installer/windows/install.it. See the table below for appropriate values.
toolkit/installer/windows/install.it | A Windows codepage. It must match the CHARSET= parameter in this file and the WIN_INSTALLER_CHARSET variable in charset.mk. | The FONTNAME, FONTSIZE, and CHARSET parameters in this file must be set to suitable values. For most Western scripts, ‘MS Sans Serif’ and ‘8’ are good defaults for the font settings. For Eastern scripts, choose appropriate fonts that ship with Windows. See the table below for appropriate values for the CHARSET= parameter.
browser/installer/installer.inc | UTF-8 |
toolkit/installer/unix/install.it | UTF-8 | Deprecated
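
As an illustration of what storing a file in a Windows codepage involves, the following Python sketch re-encodes text from UTF-8 into a given codepage. The function name `reencode_for_windows_installer` is hypothetical and not part of the Mozilla build system, and the sketch assumes the translation was originally written in UTF-8.

```python
def reencode_for_windows_installer(data: bytes, codepage: str) -> bytes:
    """Convert UTF-8 bytes to the given Windows codepage (hypothetical helper).

    `codepage` is a Python codec name such as "cp1252" or "cp1251".
    Raises UnicodeEncodeError if a character has no representation in the
    target codepage, which usually means the wrong codepage was chosen.
    """
    text = data.decode("utf-8")
    return text.encode(codepage)


if __name__ == "__main__":
    # A German string fits into the Western codepage CP1252.
    western = reencode_for_windows_installer("Größe".encode("utf-8"), "cp1252")
    print(len(western))

    # A Russian string does not fit into CP1252 and is rejected.
    try:
        reencode_for_windows_installer("Привет".encode("utf-8"), "cp1252")
    except UnicodeEncodeError as error:
        print("not representable:", error.reason)
```

Failing loudly on unrepresentable characters, rather than substituting a placeholder, is a deliberate choice in this sketch: a silent substitution would only show up later as a wrong character in the installer.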

Native Windows encodings

The following table lists native Windows encodings, and the WIN_INSTALLER_CHARSET and CHARSET= values for each:

Encoding Name | WIN_INSTALLER_CHARSET (charset.mk) | CHARSET= (windows/install.it)
ANSI_CHARSET | CP1252 | 0
BALTIC_CHARSET | CP1257 | 186
CHINESEBIG5_CHARSET | CP950 | 136
EASTEUROPE_CHARSET | CP1250 | 238
GB2312_CHARSET | CP936 | 134
GREEK_CHARSET | CP1253 | 161
HANGUL_CHARSET | CP949 | 129
RUSSIAN_CHARSET | CP1251 | 204
SHIFTJIS_CHARSET | CP932 | 128
TURKISH_CHARSET | CP1254 | 162
VIETNAMESE_CHARSET | CP1258 | 163
Middle East language editions of Windows:
ARABIC_CHARSET | CP1256 | 178
HEBREW_CHARSET | CP1255 | 177
Thai language editions of Windows:
THAI_CHARSET | CP874 | 222
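
The pairing in this table can be checked mechanically. The Python sketch below copies the table into a dictionary and tests whether a WIN_INSTALLER_CHARSET value and a CHARSET= value belong together. The names `WINDOWS_CHARSETS` and `charsets_match` are hypothetical, and how the two values are read from charset.mk and install.it is left out because the exact file syntax is not described here.

```python
# WIN_INSTALLER_CHARSET value -> CHARSET= value, copied from the table above.
WINDOWS_CHARSETS = {
    "CP1252": 0,    # ANSI_CHARSET
    "CP1257": 186,  # BALTIC_CHARSET
    "CP950": 136,   # CHINESEBIG5_CHARSET
    "CP1250": 238,  # EASTEUROPE_CHARSET
    "CP936": 134,   # GB2312_CHARSET
    "CP1253": 161,  # GREEK_CHARSET
    "CP949": 129,   # HANGUL_CHARSET
    "CP1251": 204,  # RUSSIAN_CHARSET
    "CP932": 128,   # SHIFTJIS_CHARSET
    "CP1254": 162,  # TURKISH_CHARSET
    "CP1258": 163,  # VIETNAMESE_CHARSET
    "CP1256": 178,  # ARABIC_CHARSET
    "CP1255": 177,  # HEBREW_CHARSET
    "CP874": 222,   # THAI_CHARSET
}


def charsets_match(win_installer_charset: str, charset_param: int) -> bool:
    """Return True if the two values form a row of the table (hypothetical helper).

    Unknown codepage names never match.
    """
    expected = WINDOWS_CHARSETS.get(win_installer_charset.upper())
    return expected == charset_param


if __name__ == "__main__":
    print(charsets_match("CP1251", 204))  # Russian pairing from the table
    print(charsets_match("CP1252", 204))  # mismatched pairing
```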

Searchplugins up to 1.5

Sherlock searchplugins, used in Firefox up to version 1.5, are encoded in MAC-ROMAN by default. The values inside a searchplugin can use a few other encodings, which are described at mycroft.mozdev.org. The new searchplugin format introduced in Firefox 2.0 is UTF-8 encoded.
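
To show the difference between the two encodings, the following Python sketch converts the bytes of a MAC-ROMAN file to UTF-8. The function name `mac_roman_to_utf8` is hypothetical. The sketch handles only the default file encoding; it does not interpret any per-value encodings inside a searchplugin, and it does not convert the Sherlock format itself into the newer searchplugin format.

```python
def mac_roman_to_utf8(raw: bytes) -> bytes:
    """Re-encode MAC-ROMAN bytes as UTF-8 (hypothetical helper).

    Every byte value is defined in Python's "mac_roman" codec, so decoding
    cannot fail; whether the result is meaningful depends on the input
    really being MAC-ROMAN.
    """
    return raw.decode("mac_roman").encode("utf-8")


if __name__ == "__main__":
    # 0x8E is the MAC-ROMAN byte for a lowercase e with acute accent.
    converted = mac_roman_to_utf8(b"caf\x8e")
    print(converted.decode("utf-8"))
```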


Revision Source

<p>When creating a localization for Mozilla products, it’s important to be aware of the encoding of the files that you generate.
</p><p>In general, files in the Mozilla CVS repositories are {{mediawiki.interwiki('wikipedia', 'UTF-8', 'UTF-8')}} encoded. There are a few exceptions, though.
</p>
<h3 name="Installer"> Installer </h3>
<p>The Windows installer can’t handle UTF-8; it supports only the codepages provided by Windows. The way this is wired into the build process is a little tricky, so the details are listed here:
</p>
<table class="fullwidth-table">

<tbody><tr>
<td class="header">File
</td><td class="header">Encoding
</td><td class="header">Notes
</td></tr>

<tr>
<td>toolkit/installer/windows/charset.mk
</td><td>ASCII
</td><td>The WIN_INSTALLER_CHARSET variable must be set to an encoding that matches the CHARSET= parameter in toolkit/installer/windows/install.it. See the table below for appropriate values.
</td></tr>

<tr>
<td>toolkit/installer/windows/install.it
</td><td>A Windows codepage. It must match the CHARSET= parameter in this file and the WIN_INSTALLER_CHARSET variable in charset.mk.
</td><td>The FONTNAME, FONTSIZE, and CHARSET parameters in this file must be set to suitable values. For most Western scripts, ‘MS Sans Serif’ and ‘8’ are good defaults for the font settings. For Eastern scripts, choose appropriate fonts that ship with Windows. See the table below for appropriate values for the CHARSET= parameter.
</td></tr>

<tr>
<td>browser/installer/installer.inc
</td><td>UTF-8
</td><td>
</td></tr>
<tr>
<td>toolkit/installer/unix/install.it
</td><td>UTF-8 </td><td> {{template.Deprecated_inline()}}
</td></tr>
</tbody></table>
<h4 name="Native_Windows_encodings"> Native Windows encodings </h4>
<p>The following table lists native Windows encodings, and the WIN_INSTALLER_CHARSET and CHARSET= values for each:
</p>
<table class="standard-table">

<tbody><tr>
<td class="header">Encoding Name
</td><td class="header">WIN_INSTALLER_CHARSET (charset.mk)
</td><td class="header">CHARSET= (windows/install.it)
</td></tr>

<tr>
<td>ANSI_CHARSET </td><td> CP1252 </td><td> 0
</td></tr>
<tr>
<td>BALTIC_CHARSET </td><td> CP1257 </td><td> 186
</td></tr>
<tr>
<td>CHINESEBIG5_CHARSET </td><td> CP950 </td><td> 136
</td></tr>
<tr>
<td>EASTEUROPE_CHARSET </td><td> CP1250 </td><td> 238
</td></tr>
<tr>
<td>GB2312_CHARSET </td><td> CP936 </td><td> 134
</td></tr>
<tr>
<td>GREEK_CHARSET </td><td> CP1253 </td><td> 161
</td></tr>
<tr>
<td>HANGUL_CHARSET </td><td> CP949 </td><td> 129
</td></tr>
<tr>
<td>RUSSIAN_CHARSET </td><td> CP1251 </td><td> 204
</td></tr>
<tr>
<td>SHIFTJIS_CHARSET </td><td> CP932 </td><td> 128
</td></tr>
<tr>
<td>TURKISH_CHARSET </td><td> CP1254 </td><td> 162
</td></tr>
<tr>
<td>VIETNAMESE_CHARSET </td><td> CP1258 </td><td> 163
</td></tr>
<tr>
<td colspan="3"><b>Middle East language editions of Windows</b>:
</td></tr>
<tr>
<td>ARABIC_CHARSET </td><td> CP1256 </td><td> 178
</td></tr>
<tr>
<td>HEBREW_CHARSET </td><td> CP1255 </td><td> 177
</td></tr>
<tr>
<td colspan="3"><b>Thai language editions of Windows</b>:
</td></tr>
<tr>
<td>THAI_CHARSET </td><td> CP874 </td><td> 222
</td></tr>
</tbody></table>
<h3 name="Searchplugins_up_to_1.5"> Searchplugins up to 1.5 </h3>
<p>Sherlock searchplugins, used in Firefox up to version 1.5, are encoded in <code>MAC-ROMAN</code> by default. The values inside a searchplugin can use a few other encodings, which are described at <a class="external" href="http://mycroft.mozdev.org/deepdocs/deepdocs.html">mycroft.mozdev.org</a>. The new searchplugin format introduced in Firefox 2.0 is UTF-8 encoded.
</p>{{ wiki.languages( { "ja": "ja/Encodings_for_localization_files" } ) }}