Revision 31297 of Character sets supported by Gecko

  • Revision slug: Character_Sets_Supported_by_Gecko
  • Revision title: Character sets supported by Gecko
  • Revision id: 31297
  • Created:
  • Creator: Prodoc
  • Is current revision? No
  • Comment

Revision Content

This page is still far from finished; at this stage it is only meant to give a general impression of where I'm heading. For that reason, no links to this page have been added from other pages yet.

Introduction

Character set MIME names are used in the header of an HTML document to identify the character set with which the content of that page should be processed.
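
To illustrate why the declared name matters, here is a minimal Python sketch. It is not Gecko code: it uses Python's standard library to read the charset parameter from a Content-Type value (such as the one carried by a meta element in the document header) and then decodes the same byte under two different character sets. The function name `charset_from_content_type` and the sample values are assumptions made for illustration only.

```python
from email.message import Message
from typing import Optional


def charset_from_content_type(value: str) -> Optional[str]:
    """Return the lowercased charset parameter of a Content-Type value.

    Hypothetical helper for illustration; returns None when no charset
    parameter is present.
    """
    msg = Message()
    msg["Content-Type"] = value
    return msg.get_content_charset()


# One byte, as it might appear in the body of a page.
raw = b"\xe9"

# Content-Type values as they might be declared in the document header.
latin = charset_from_content_type("text/html; charset=ISO-8859-1")
cyrillic = charset_from_content_type("text/html; charset=Windows-1251")

# The same byte is a different character under each character set.
print(latin, repr(raw.decode(latin)))
print(cyrillic, repr(raw.decode(cyrillic)))
```

Python's codec registry is separate from Gecko's decoders, so a name accepted here is not evidence that a browser accepts it, and the reverse also holds.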

Supported Character Sets

The following character sets are supported by Mozilla-based browsers:

<table class="standard-table">
  <tr> <td class="header">MIME name</td> <td class="header">Language</td> </tr>
  <tr> <td>IBM-864</td> <td>Arabic</td> </tr>
  <tr> <td>ISO-8859-6</td> <td>Arabic</td> </tr>
  <tr> <td>MacArabic</td> <td>Arabic</td> </tr>
  <tr> <td>Windows-1256</td> <td>Arabic</td> </tr>
  <tr> <td>ARMSCII-8</td> <td>Armenian</td> </tr>
  <tr> <td>ISO-8859-13</td> <td>Baltic</td> </tr>
  <tr> <td>ISO-8859-4</td> <td>Baltic</td> </tr>
  <tr> <td>Windows-1257</td> <td>Baltic</td> </tr>
  <tr> <td>ISO-8859-14</td> <td>Celtic</td> </tr>
  <tr> <td>IBM-852</td> <td>Central European</td> </tr>
  <tr> <td>ISO-8859-2</td> <td>Central European</td> </tr>
  <tr> <td>MacCE</td> <td>Central European</td> </tr>
  <tr> <td>Windows-1250</td> <td>Central European</td> </tr>
  <tr> <td>GB18030</td> <td>Chinese Simplified</td> </tr>
  <tr> <td>GB2312</td> <td>Chinese Simplified</td> </tr>
  <tr> <td>GBK</td> <td>Chinese Simplified</td> </tr>
  <tr> <td>HZ</td> <td>Chinese Simplified</td> </tr>
  <tr> <td>ISO-2022-CN</td> <td>Chinese Simplified</td> </tr>
  <tr> <td>Big5</td> <td>Chinese Traditional</td> </tr>
  <tr> <td>Big5-HKSCS</td> <td>Chinese Traditional</td> </tr>
  <tr> <td>EUC-TW</td> <td>Chinese Traditional</td> </tr>
  <tr> <td>MacCroatian</td> <td>Croatian</td> </tr>
  <tr> <td>IBM-855</td> <td>Cyrillic</td> </tr>
  <tr> <td>ISO-8859-5</td> <td>Cyrillic</td> </tr>
  <tr> <td>ISO-IR-111</td> <td>Cyrillic</td> </tr>
  <tr> <td>KOI8-R</td> <td>Cyrillic</td> </tr>
  <tr> <td>MacCyrillic</td> <td>Cyrillic</td> </tr>
  <tr> <td>Windows-1251</td> <td>Cyrillic</td> </tr>
  <tr> <td>CP-866</td> <td>Cyrillic/Russian</td> </tr>
  <tr> <td>KOI8-U</td> <td>Cyrillic/Ukrainian</td> </tr>
  <tr> <td>MacUkrainian</td> <td>Cyrillic/Ukrainian</td> </tr>
  <tr> <td>MacFarsi</td> <td>Farsi</td> </tr>
  <tr> <td>GEOSTD8</td> <td>Georgian</td> </tr>
  <tr> <td>ISO-8859-7</td> <td>Greek</td> </tr>
  <tr> <td>MacGreek</td> <td>Greek</td> </tr>
  <tr> <td>Windows-1253</td> <td>Greek</td> </tr>
  <tr> <td>MacGujarati</td> <td>Gujarati</td> </tr>
  <tr> <td>MacGurmukhi</td> <td>Gurmukhi</td> </tr>
  <tr> <td>IBM-862</td> <td>Hebrew</td> </tr>
  <tr> <td>ISO-8859-8-I</td> <td>Hebrew</td> </tr>
  <tr> <td>MacHebrew</td> <td>Hebrew</td> </tr>
  <tr> <td>Windows-1255</td> <td>Hebrew</td> </tr>
  <tr> <td>ISO-8859-8</td> <td>Hebrew Visual</td> </tr>
  <tr> <td>MacDevanagari</td> <td>Hindi</td> </tr>
  <tr> <td>MacIcelandic</td> <td>Icelandic</td> </tr>
  <tr> <td>EUC-JP</td> <td>Japanese</td> </tr>
  <tr> <td>ISO-2022-JP</td> <td>Japanese</td> </tr>
  <tr> <td>Shift_JIS</td> <td>Japanese</td> </tr>
  <tr> <td>EUC-KR</td> <td>Korean</td> </tr>
  <tr> <td>ISO-2022-KR</td> <td>Korean</td> </tr>
  <tr> <td>JOHAB</td> <td>Korean</td> </tr>
  <tr> <td>UHC</td> <td>Korean</td> </tr>
  <tr> <td>ISO-8859-10</td> <td>Nordic</td> </tr>
  <tr> <td>ISO-8859-16</td> <td>Romanian</td> </tr>
  <tr> <td>MacRomanian</td> <td>Romanian</td> </tr>
  <tr> <td>ISO-8859-3</td> <td>South European</td> </tr>
  <tr> <td>ISO-8859-11</td> <td>Thai</td> </tr>
  <tr> <td>TIS-620</td> <td>Thai</td> </tr>
  <tr> <td>Windows-874</td> <td>Thai</td> </tr>
  <tr> <td>IBM-857</td> <td>Turkish</td> </tr>
  <tr> <td>ISO-8859-9</td> <td>Turkish</td> </tr>
  <tr> <td>MacTurkish</td> <td>Turkish</td> </tr>
  <tr> <td>Windows-1254</td> <td>Turkish</td> </tr>
  <tr> <td>UTF-16 Big Endian (probably UTF-16BE though)</td> <td>Unicode</td> </tr>
  <tr> <td>UTF-16 Little Endian (probably UTF-16LE though)</td> <td>Unicode</td> </tr>
  <tr> <td>UTF-16</td> <td>Unicode</td> </tr>
  <tr> <td>UTF-32 Big Endian (probably UTF-32BE though)</td> <td>Unicode</td> </tr>
  <tr> <td>UTF-32 Little Endian (probably UTF-32LE though)</td> <td>Unicode</td> </tr>
  <tr> <td>UTF-7</td> <td>Unicode</td> </tr>
  <tr> <td>UTF-8</td> <td>Unicode</td> </tr>
  <tr> <td>TCVN</td> <td>Vietnamese</td> </tr>
  <tr> <td>VISCII</td> <td>Vietnamese</td> </tr>
  <tr> <td>VPS</td> <td>Vietnamese</td> </tr>
  <tr> <td>Windows-1258</td> <td>Vietnamese</td> </tr>
  <tr> <td>IBM-850</td> <td>Western</td> </tr>
  <tr> <td>ISO-8859-1</td> <td>Western</td> </tr>
  <tr> <td>ISO-8859-15</td> <td>Western</td> </tr>
  <tr> <td>MacRoman</td> <td>Western</td> </tr>
  <tr> <td>Windows-1252</td> <td>Western</td> </tr>
</table>

Revision Source

<p><span class="comment">This page is still far from finished; at this stage it is only meant to give a general impression of where I'm heading. For that reason, no links to this page have been added from other pages yet.</span>
</p>
<h3 name="Introduction">Introduction</h3>
<p>Character set MIME names are used in the header of an HTML document to identify the character set with which the content of that page should be processed.
</p>
<h3 name="Supported_Character_Sets">Supported Character Sets</h3>
<p>The following character sets are supported by Mozilla-based browsers:
</p>

<table class="standard-table">
  <tbody>
  <tr> <td class="header">MIME name</td> <td class="header">Language</td> </tr>
  <tr> <td>IBM-864</td> <td>Arabic</td> </tr>
  <tr> <td>ISO-8859-6</td> <td>Arabic</td> </tr>
  <tr> <td>MacArabic</td> <td>Arabic</td> </tr>
  <tr> <td>Windows-1256</td> <td>Arabic</td> </tr>
  <tr> <td>ARMSCII-8</td> <td>Armenian</td> </tr>
  <tr> <td>ISO-8859-13</td> <td>Baltic</td> </tr>
  <tr> <td>ISO-8859-4</td> <td>Baltic</td> </tr>
  <tr> <td>Windows-1257</td> <td>Baltic</td> </tr>
  <tr> <td>ISO-8859-14</td> <td>Celtic</td> </tr>
  <tr> <td>IBM-852</td> <td>Central European</td> </tr>
  <tr> <td>ISO-8859-2</td> <td>Central European</td> </tr>
  <tr> <td>MacCE</td> <td>Central European</td> </tr>
  <tr> <td>Windows-1250</td> <td>Central European</td> </tr>
  <tr> <td>GB18030</td> <td>Chinese Simplified</td> </tr>
  <tr> <td>GB2312</td> <td>Chinese Simplified</td> </tr>
  <tr> <td>GBK</td> <td>Chinese Simplified</td> </tr>
  <tr> <td>HZ</td> <td>Chinese Simplified</td> </tr>
  <tr> <td>ISO-2022-CN</td> <td>Chinese Simplified</td> </tr>
  <tr> <td>Big5</td> <td>Chinese Traditional</td> </tr>
  <tr> <td>Big5-HKSCS</td> <td>Chinese Traditional</td> </tr>
  <tr> <td>EUC-TW</td> <td>Chinese Traditional</td> </tr>
  <tr> <td>MacCroatian</td> <td>Croatian</td> </tr>
  <tr> <td>IBM-855</td> <td>Cyrillic</td> </tr>
  <tr> <td>ISO-8859-5</td> <td>Cyrillic</td> </tr>
  <tr> <td>ISO-IR-111</td> <td>Cyrillic</td> </tr>
  <tr> <td>KOI8-R</td> <td>Cyrillic</td> </tr>
  <tr> <td>MacCyrillic</td> <td>Cyrillic</td> </tr>
  <tr> <td>Windows-1251</td> <td>Cyrillic</td> </tr>
  <tr> <td>CP-866</td> <td>Cyrillic/Russian</td> </tr>
  <tr> <td>KOI8-U</td> <td>Cyrillic/Ukrainian</td> </tr>
  <tr> <td>MacUkrainian</td> <td>Cyrillic/Ukrainian</td> </tr>
  <tr> <td>MacFarsi</td> <td>Farsi</td> </tr>
  <tr> <td>GEOSTD8</td> <td>Georgian</td> </tr>
  <tr> <td>ISO-8859-7</td> <td>Greek</td> </tr>
  <tr> <td>MacGreek</td> <td>Greek</td> </tr>
  <tr> <td>Windows-1253</td> <td>Greek</td> </tr>
  <tr> <td>MacGujarati</td> <td>Gujarati</td> </tr>
  <tr> <td>MacGurmukhi</td> <td>Gurmukhi</td> </tr>
  <tr> <td>IBM-862</td> <td>Hebrew</td> </tr>
  <tr> <td>ISO-8859-8-I</td> <td>Hebrew</td> </tr>
  <tr> <td>MacHebrew</td> <td>Hebrew</td> </tr>
  <tr> <td>Windows-1255</td> <td>Hebrew</td> </tr>
  <tr> <td>ISO-8859-8</td> <td>Hebrew Visual</td> </tr>
  <tr> <td>MacDevanagari</td> <td>Hindi</td> </tr>
  <tr> <td>MacIcelandic</td> <td>Icelandic</td> </tr>
  <tr> <td>EUC-JP</td> <td>Japanese</td> </tr>
  <tr> <td>ISO-2022-JP</td> <td>Japanese</td> </tr>
  <tr> <td>Shift_JIS</td> <td>Japanese</td> </tr>
  <tr> <td>EUC-KR</td> <td>Korean</td> </tr>
  <tr> <td>ISO-2022-KR</td> <td>Korean</td> </tr>
  <tr> <td>JOHAB</td> <td>Korean</td> </tr>
  <tr> <td>UHC</td> <td>Korean</td> </tr>
  <tr> <td>ISO-8859-10</td> <td>Nordic</td> </tr>
  <tr> <td>ISO-8859-16</td> <td>Romanian</td> </tr>
  <tr> <td>MacRomanian</td> <td>Romanian</td> </tr>
  <tr> <td>ISO-8859-3</td> <td>South European</td> </tr>
  <tr> <td>ISO-8859-11</td> <td>Thai</td> </tr>
  <tr> <td>TIS-620</td> <td>Thai</td> </tr>
  <tr> <td>Windows-874</td> <td>Thai</td> </tr>
  <tr> <td>IBM-857</td> <td>Turkish</td> </tr>
  <tr> <td>ISO-8859-9</td> <td>Turkish</td> </tr>
  <tr> <td>MacTurkish</td> <td>Turkish</td> </tr>
  <tr> <td>Windows-1254</td> <td>Turkish</td> </tr>
  <tr> <td>UTF-16 Big Endian (probably UTF-16BE though)</td> <td>Unicode</td> </tr>
  <tr> <td>UTF-16 Little Endian (probably UTF-16LE though)</td> <td>Unicode</td> </tr>
  <tr> <td>UTF-16</td> <td>Unicode</td> </tr>
  <tr> <td>UTF-32 Big Endian (probably UTF-32BE though)</td> <td>Unicode</td> </tr>
  <tr> <td>UTF-32 Little Endian (probably UTF-32LE though)</td> <td>Unicode</td> </tr>
  <tr> <td>UTF-7</td> <td>Unicode</td> </tr>
  <tr> <td>UTF-8</td> <td>Unicode</td> </tr>
  <tr> <td>TCVN</td> <td>Vietnamese</td> </tr>
  <tr> <td>VISCII</td> <td>Vietnamese</td> </tr>
  <tr> <td>VPS</td> <td>Vietnamese</td> </tr>
  <tr> <td>Windows-1258</td> <td>Vietnamese</td> </tr>
  <tr> <td>IBM-850</td> <td>Western</td> </tr>
  <tr> <td>ISO-8859-1</td> <td>Western</td> </tr>
  <tr> <td>ISO-8859-15</td> <td>Western</td> </tr>
  <tr> <td>MacRoman</td> <td>Western</td> </tr>
  <tr> <td>Windows-1252</td> <td>Western</td> </tr>
  </tbody>
</table>