Character set

A character set is an encoding system that lets computers know how to recognize characters, including letters, numbers, punctuation marks, and whitespace.
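As a rough illustration of what "encoding" means here, the following Python sketch maps the same characters to bytes under two different encodings. The sample characters and variable names are chosen for this example only and are not part of the original entry.

```python
# Illustrative sketch: the sample characters below are assumptions for this example.
letter = "A"
accented = "\u00e9"  # é, LATIN SMALL LETTER E WITH ACUTE

# Each character has a numeric code point in Unicode.
code_point = ord(letter)  # 65

# The same character can map to different bytes under different encodings.
utf8_bytes = accented.encode("utf-8")      # two bytes: 0xC3 0xA9
latin1_bytes = accented.encode("latin-1")  # one byte: 0xE9

print(code_point, utf8_bytes, latin1_bytes)
```

Because the byte sequences differ, a program needs to know which character set was used in order to turn the bytes back into the intended characters.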

In earlier times, countries developed their own character sets for the languages they used, such as Kanji JIS codes (e.g., Shift-JIS, EUC-JP, etc.) for Japanese, Big5 for traditional Chinese, and KOI8-R for Russian. However, Unicode gradually became the most widely accepted character set because of its universal language support.

If the wrong character set is used to read text (for example, Unicode for an article encoded in Big5), you may see nothing but broken characters, which are called mojibake.
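The mismatch can be reproduced in a few lines. This is a minimal Python sketch, assuming a short sample string and using UTF-8 as the Unicode encoding; the variable names are illustrative. Bytes written as Big5 are read back once with the correct character set and once with the wrong one.

```python
# Illustrative sketch: the sample text and names are assumptions for this example.
text = "\u4e2d\u6587"  # 中文, two Traditional Chinese characters

# Store the text as Big5 bytes, as an older Big5-encoded article would be.
big5_bytes = text.encode("big5")

# Reading the bytes with the matching character set recovers the text.
correct = big5_bytes.decode("big5")

# Reading the same bytes as UTF-8 does not. errors="replace" substitutes
# U+FFFD (the replacement character) for each invalid byte sequence
# instead of raising UnicodeDecodeError.
garbled = big5_bytes.decode("utf-8", errors="replace")

print(correct)
print(garbled)
```

Without `errors="replace"`, Python raises `UnicodeDecodeError` here. Browsers and text editors typically render replacement or unrelated characters instead, which is the mojibake a reader sees.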

This page was last modified on Jul 11, 2025 by MDN contributors.
