Character set

A character set is an encoding system that tells a computer how to recognize characters, including letters, numbers, punctuation marks, and whitespace.
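As a rough illustration of that idea, the sketch below encodes one Japanese character under three different encodings using Python's built-in codecs. The helper name `encoded_bytes` and the sample character are only illustrative assumptions, not part of any standard.

```python
# Minimal sketch: one character, several encodings, different bytes.
# `encoded_bytes` is a hypothetical helper name chosen for this example.

def encoded_bytes(char, encodings):
    """Return a dict mapping each encoding name to the bytes it produces."""
    return {name: char.encode(name) for name in encodings}

# Hiragana "a", encoded with a Unicode encoding and two JIS-based ones.
sample = encoded_bytes("あ", ["utf-8", "shift_jis", "euc_jp"])

for name, raw in sample.items():
    # Show the raw bytes as hexadecimal so the differences are visible.
    print(name, raw.hex())
```

The character is the same in every case; only the bytes that the computer stores for it differ, which is why the reader of a file has to know which character set was used to write it.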

In earlier times, countries developed their own character sets to suit the languages they used, such as the JIS kanji codes (e.g. Shift-JIS, EUC-JP) for Japanese, Big5 for Traditional Chinese, and KOI8-R for Russian. However, Unicode has gradually become the most widely accepted character set because it supports nearly all of the world's languages.

If the wrong character set is used (for example, a Unicode encoding such as UTF-8 for an article encoded in Big5), you may see nothing but broken characters, which is called mojibake.
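The effect can be reproduced in a few lines. The following is a minimal sketch, assuming Python's built-in `big5`, `utf-8`, and `latin-1` codecs; the sample text and variable names are made up for illustration.

```python
# Minimal sketch of mojibake: bytes written in Big5, read with the wrong codec.

text = "中文"                      # sample Traditional Chinese text (illustrative)
raw = text.encode("big5")          # the bytes as they would be stored in a Big5 file

# Wrong guess 1: treat the bytes as UTF-8. Invalid sequences are replaced
# with U+FFFD (the replacement character) instead of raising an error.
as_utf8 = raw.decode("utf-8", errors="replace")

# Wrong guess 2: treat the bytes as Latin-1. This never fails, but each byte
# becomes an unrelated Western European character.
as_latin1 = raw.decode("latin-1")

# Correct guess: decoding with the original character set restores the text.
restored = raw.decode("big5")

print(repr(as_utf8))
print(repr(as_latin1))
print(repr(restored))
```

Note that the bytes themselves are never damaged here; only the interpretation is wrong, so decoding again with the correct character set recovers the original text.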