Revision 488889 of Character sets supported by Gecko

  • Revision slug: Gecko/Character_sets_supported_by_Gecko
  • Revision title: Character sets supported by Gecko
  • Revision id: 488889
  • Created:
  • Creator: kscarfone
  • Is current revision? No
  • Comment Updated tags

Revision Content

{{ draft() }}

Character set names are used in the header of an HTML document to identify the character set with which the page's content should be processed. Gecko and software based on it support the following character sets:

| Charset name | Language | Notes |
|---|---|---|
| IBM-864 | Arabic | |
| ISO-8859-6 | Arabic | |
| MacArabic | Arabic | |
| Windows-1256 | Arabic | |
| ARMSCII-8 | Armenian | |
| ISO-8859-13 | Baltic | |
| ISO-8859-4 | Baltic | |
| Windows-1257 | Baltic | |
| ISO-8859-14 | Celtic | |
| IBM-852 | Central European | |
| ISO-8859-2 | Central European | |
| MacCE | Central European | |
| Windows-1250 | Central European | |
| GB18030 | Chinese Simplified | |
| GB2312 | Chinese Simplified | |
| GBK | Chinese Simplified | |
| HZ | Chinese Simplified | |
| ISO-2022-CN | Chinese Simplified | |
| Big5 | Chinese Traditional | |
| Big5-HKSCS | Chinese Traditional | |
| EUC-TW | Chinese Traditional | |
| MacCroatian | Croatian | |
| IBM-855 | Cyrillic | |
| ISO-8859-5 | Cyrillic | |
| ISO-IR-111 | Cyrillic | |
| KOI8-R | Cyrillic | |
| MacCyrillic | Cyrillic | |
| Windows-1251 | Cyrillic | |
| CP-866 | Cyrillic/Russian | |
| KOI8-U | Cyrillic/Ukrainian | |
| MacUkrainian | Cyrillic/Ukrainian | |
| MacFarsi | Farsi | |
| GEOSTD8 | Georgian | Note: This was never actually fully supported, and support was removed in Gecko 12.0 {{ geckoRelease("12.0") }}. |
| ISO-8859-7 | Greek | |
| MacGreek | Greek | |
| Windows-1253 | Greek | |
| MacGujarati | Gujarati | |
| MacGurmukhi | Gurmukhi | |
| IBM-862 | Hebrew | |
| ISO-8859-8-I | Hebrew | |
| MacHebrew | Hebrew | |
| Windows-1255 | Hebrew | |
| ISO-8859-8 | Hebrew Visual | |
| MacDevanagari | Hindi | |
| MacIcelandic | Icelandic | |
| EUC-JP | Japanese | |
| ISO-2022-JP | Japanese | |
| Shift_JIS | Japanese | |
| EUC-KR | Korean | |
| ISO-2022-KR | Korean | |
| JOHAB | Korean | |
| UHC | Korean | |
| ISO-8859-10 | Nordic | |
| ISO-8859-16 | Romanian | |
| MacRomanian | Romanian | |
| ISO-8859-3 | South European | |
| ISO-8859-11 | Thai | |
| TIS-620 | Thai | |
| Windows-874 | Thai | |
| IBM-857 | Turkish | |
| ISO-8859-9 | Turkish | |
| MacTurkish | Turkish | |
| Windows-1254 | Turkish | |
| UTF-16BE | Unicode | |
| UTF-16LE | Unicode | |
| UTF-16 | Unicode | |
| UTF-32BE {{ obsolete_inline("5.0") }} | Unicode | Support removed for HTML5 compatibility. |
| UTF-32LE {{ obsolete_inline("5.0") }} | Unicode | Support removed for HTML5 compatibility. |
| UTF-7 {{ obsolete_inline("5.0") }} | Unicode | Support removed for HTML5 compatibility. |
| UTF-8 | Unicode | |
| TCVN | Vietnamese | |
| VISCII | Vietnamese | |
| VPS | Vietnamese | |
| Windows-1258 | Vietnamese | |
| IBM-850 | Western | |
| ISO-8859-1 | Western | |
| ISO-8859-15 | Western | |
| MacRoman | Western | |
| Windows-1252 | Western | |
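
To illustrate why the declared character set name matters, here is a minimal sketch in Python. It is not Gecko code: the helper `decode_with` is hypothetical, and Python's built-in codecs are assumed to stand in for a browser's decoders. Python happens to accept the three names used here, but its name matching is its own and may differ from Gecko's for other entries in the table.

```python
# Illustrative sketch only: Python's codecs stand in for a browser's decoders.

# A single byte whose meaning depends entirely on the declared charset.
SAMPLE = b"\xe9"

# Three names taken from the table above (assumed to be accepted by Python).
LABELS = ("ISO-8859-1", "Windows-1251", "KOI8-R")


def decode_with(raw: bytes, charset: str) -> str:
    """Decode raw bytes using the given charset name (hypothetical helper)."""
    return raw.decode(charset)


if __name__ == "__main__":
    # The same byte yields a different character under each charset.
    for label in LABELS:
        print(label, "->", decode_with(SAMPLE, label))
```

The same byte becomes a Latin letter under ISO-8859-1 and two different Cyrillic letters under Windows-1251 and KOI8-R, which is why a page that declares the wrong character set displays as garbled text.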

 

Revision Source

<p>{{ draft() }}</p>
<p>Character set names are used in the header of an HTML document to identify the character set with which the page's content should be processed. Gecko and software based on it support the following character sets:</p>
<table class="standard-table">
 <tbody>
  <tr>
   <td class="header">Charset name</td>
   <td class="header">Language</td>
   <td class="header">Notes</td>
  </tr>
  <tr>
   <td>IBM-864</td>
   <td>Arabic</td>
   <td>&nbsp;</td>
  </tr>
  <tr>
   <td>ISO-8859-6</td>
   <td>Arabic</td>
   <td>&nbsp;</td>
  </tr>
  <tr>
   <td>MacArabic</td>
   <td>Arabic</td>
   <td>&nbsp;</td>
  </tr>
  <tr>
   <td>Windows-1256</td>
   <td>Arabic</td>
   <td>&nbsp;</td>
  </tr>
  <tr>
   <td>ARMSCII-8</td>
   <td>Armenian</td>
   <td>&nbsp;</td>
  </tr>
  <tr>
   <td>ISO-8859-13</td>
   <td>Baltic</td>
   <td>&nbsp;</td>
  </tr>
  <tr>
   <td>ISO-8859-4</td>
   <td>Baltic</td>
   <td>&nbsp;</td>
  </tr>
  <tr>
   <td>Windows-1257</td>
   <td>Baltic</td>
   <td>&nbsp;</td>
  </tr>
  <tr>
   <td>ISO-8859-14</td>
   <td>Celtic</td>
   <td>&nbsp;</td>
  </tr>
  <tr>
   <td>IBM-852</td>
   <td>Central European</td>
   <td>&nbsp;</td>
  </tr>
  <tr>
   <td>ISO-8859-2</td>
   <td>Central European</td>
   <td>&nbsp;</td>
  </tr>
  <tr>
   <td>MacCE</td>
   <td>Central European</td>
   <td>&nbsp;</td>
  </tr>
  <tr>
   <td>Windows-1250</td>
   <td>Central European</td>
   <td>&nbsp;</td>
  </tr>
  <tr>
   <td>GB18030</td>
   <td>Chinese Simplified</td>
   <td>&nbsp;</td>
  </tr>
  <tr>
   <td>GB2312</td>
   <td>Chinese Simplified</td>
   <td>&nbsp;</td>
  </tr>
  <tr>
   <td>GBK</td>
   <td>Chinese Simplified</td>
   <td>&nbsp;</td>
  </tr>
  <tr>
   <td>HZ</td>
   <td>Chinese Simplified</td>
   <td>&nbsp;</td>
  </tr>
  <tr>
   <td>ISO-2022-CN</td>
   <td>Chinese Simplified</td>
   <td>&nbsp;</td>
  </tr>
  <tr>
   <td>Big5</td>
   <td>Chinese Traditional</td>
   <td>&nbsp;</td>
  </tr>
  <tr>
   <td>Big5-HKSCS</td>
   <td>Chinese Traditional</td>
   <td>&nbsp;</td>
  </tr>
  <tr>
   <td>EUC-TW</td>
   <td>Chinese Traditional</td>
   <td>&nbsp;</td>
  </tr>
  <tr>
   <td>MacCroatian</td>
   <td>Croatian</td>
   <td>&nbsp;</td>
  </tr>
  <tr>
   <td>IBM-855</td>
   <td>Cyrillic</td>
   <td>&nbsp;</td>
  </tr>
  <tr>
   <td>ISO-8859-5</td>
   <td>Cyrillic</td>
   <td>&nbsp;</td>
  </tr>
  <tr>
   <td>ISO-IR-111</td>
   <td>Cyrillic</td>
   <td>&nbsp;</td>
  </tr>
  <tr>
   <td>KOI8-R</td>
   <td>Cyrillic</td>
   <td>&nbsp;</td>
  </tr>
  <tr>
   <td>MacCyrillic</td>
   <td>Cyrillic</td>
   <td>&nbsp;</td>
  </tr>
  <tr>
   <td>Windows-1251</td>
   <td>Cyrillic</td>
   <td>&nbsp;</td>
  </tr>
  <tr>
   <td>CP-866</td>
   <td>Cyrillic/Russian</td>
   <td>&nbsp;</td>
  </tr>
  <tr>
   <td>KOI8-U</td>
   <td>Cyrillic/Ukrainian</td>
   <td>&nbsp;</td>
  </tr>
  <tr>
   <td>MacUkrainian</td>
   <td>Cyrillic/Ukrainian</td>
   <td>&nbsp;</td>
  </tr>
  <tr>
   <td>MacFarsi</td>
   <td>Farsi</td>
   <td>&nbsp;</td>
  </tr>
  <tr>
   <td>GEOSTD8</td>
   <td>Georgian</td>
   <td>
    <div class="note">
     <strong>Note:</strong> This was never actually fully supported, and support was removed in Gecko 12.0 {{ geckoRelease("12.0") }}.</div>
   </td>
  </tr>
  <tr>
   <td>ISO-8859-7</td>
   <td>Greek</td>
   <td>&nbsp;</td>
  </tr>
  <tr>
   <td>MacGreek</td>
   <td>Greek</td>
   <td>&nbsp;</td>
  </tr>
  <tr>
   <td>Windows-1253</td>
   <td>Greek</td>
   <td>&nbsp;</td>
  </tr>
  <tr>
   <td>MacGujarati</td>
   <td>Gujarati</td>
   <td>&nbsp;</td>
  </tr>
  <tr>
   <td>MacGurmukhi</td>
   <td>Gurmukhi</td>
   <td>&nbsp;</td>
  </tr>
  <tr>
   <td>IBM-862</td>
   <td>Hebrew</td>
   <td>&nbsp;</td>
  </tr>
  <tr>
   <td>ISO-8859-8-I</td>
   <td>Hebrew</td>
   <td>&nbsp;</td>
  </tr>
  <tr>
   <td>MacHebrew</td>
   <td>Hebrew</td>
   <td>&nbsp;</td>
  </tr>
  <tr>
   <td>Windows-1255</td>
   <td>Hebrew</td>
   <td>&nbsp;</td>
  </tr>
  <tr>
   <td>ISO-8859-8</td>
   <td>Hebrew Visual</td>
   <td>&nbsp;</td>
  </tr>
  <tr>
   <td>MacDevanagari</td>
   <td>Hindi</td>
   <td>&nbsp;</td>
  </tr>
  <tr>
   <td>MacIcelandic</td>
   <td>Icelandic</td>
   <td>&nbsp;</td>
  </tr>
  <tr>
   <td>EUC-JP</td>
   <td>Japanese</td>
   <td>&nbsp;</td>
  </tr>
  <tr>
   <td>ISO-2022-JP</td>
   <td>Japanese</td>
   <td>&nbsp;</td>
  </tr>
  <tr>
   <td>Shift_JIS</td>
   <td>Japanese</td>
   <td>&nbsp;</td>
  </tr>
  <tr>
   <td>EUC-KR</td>
   <td>Korean</td>
   <td>&nbsp;</td>
  </tr>
  <tr>
   <td>ISO-2022-KR</td>
   <td>Korean</td>
   <td>&nbsp;</td>
  </tr>
  <tr>
   <td>JOHAB</td>
   <td>Korean</td>
   <td>&nbsp;</td>
  </tr>
  <tr>
   <td>UHC</td>
   <td>Korean</td>
   <td>&nbsp;</td>
  </tr>
  <tr>
   <td>ISO-8859-10</td>
   <td>Nordic</td>
   <td>&nbsp;</td>
  </tr>
  <tr>
   <td>ISO-8859-16</td>
   <td>Romanian</td>
   <td>&nbsp;</td>
  </tr>
  <tr>
   <td>MacRomanian</td>
   <td>Romanian</td>
   <td>&nbsp;</td>
  </tr>
  <tr>
   <td>ISO-8859-3</td>
   <td>South European</td>
   <td>&nbsp;</td>
  </tr>
  <tr>
   <td>ISO-8859-11</td>
   <td>Thai</td>
   <td>&nbsp;</td>
  </tr>
  <tr>
   <td>TIS-620</td>
   <td>Thai</td>
   <td>&nbsp;</td>
  </tr>
  <tr>
   <td>Windows-874</td>
   <td>Thai</td>
   <td>&nbsp;</td>
  </tr>
  <tr>
   <td>IBM-857</td>
   <td>Turkish</td>
   <td>&nbsp;</td>
  </tr>
  <tr>
   <td>ISO-8859-9</td>
   <td>Turkish</td>
   <td>&nbsp;</td>
  </tr>
  <tr>
   <td>MacTurkish</td>
   <td>Turkish</td>
   <td>&nbsp;</td>
  </tr>
  <tr>
   <td>Windows-1254</td>
   <td>Turkish</td>
   <td>&nbsp;</td>
  </tr>
  <tr>
   <td>UTF-16BE</td>
   <td>Unicode</td>
   <td>&nbsp;</td>
  </tr>
  <tr>
   <td>UTF-16LE</td>
   <td>Unicode</td>
   <td>&nbsp;</td>
  </tr>
  <tr>
   <td>UTF-16</td>
   <td>Unicode</td>
   <td>&nbsp;</td>
  </tr>
  <tr>
   <td>UTF-32BE {{ obsolete_inline("5.0") }}</td>
   <td>Unicode</td>
   <td>Support removed for HTML5 compatibility.</td>
  </tr>
  <tr>
   <td>UTF-32LE {{ obsolete_inline("5.0") }}</td>
   <td>Unicode</td>
   <td>Support removed for HTML5 compatibility.</td>
  </tr>
  <tr>
   <td>UTF-7 {{ obsolete_inline("5.0") }}</td>
   <td>Unicode</td>
   <td>Support removed for HTML5 compatibility.</td>
  </tr>
  <tr>
   <td>UTF-8</td>
   <td>Unicode</td>
   <td>&nbsp;</td>
  </tr>
  <tr>
   <td>TCVN</td>
   <td>Vietnamese</td>
   <td>&nbsp;</td>
  </tr>
  <tr>
   <td>VISCII</td>
   <td>Vietnamese</td>
   <td>&nbsp;</td>
  </tr>
  <tr>
   <td>VPS</td>
   <td>Vietnamese</td>
   <td>&nbsp;</td>
  </tr>
  <tr>
   <td>Windows-1258</td>
   <td>Vietnamese</td>
   <td>&nbsp;</td>
  </tr>
  <tr>
   <td>IBM-850</td>
   <td>Western</td>
   <td>&nbsp;</td>
  </tr>
  <tr>
   <td>ISO-8859-1</td>
   <td>Western</td>
   <td>&nbsp;</td>
  </tr>
  <tr>
   <td>ISO-8859-15</td>
   <td>Western</td>
   <td>&nbsp;</td>
  </tr>
  <tr>
   <td>MacRoman</td>
   <td>Western</td>
   <td>&nbsp;</td>
  </tr>
  <tr>
   <td>Windows-1252</td>
   <td>Western</td>
   <td>&nbsp;</td>
  </tr>
 </tbody>
</table>