UTF-8 (UCS Transformation Format 8) is the World Wide Web's most common character encoding. Each character is represented by one to four bytes. UTF-8 is backward-compatible with ASCII and can represent any standard Unicode character.
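As a rough illustration of the one-to-four-byte range, the following Python sketch encodes four sample characters and counts the resulting bytes. The sample characters and the variable names (`samples`, `lengths`) are chosen here for illustration and are not part of the original article.

```python
# Sample characters picked for illustration: one from each byte-length class.
samples = ["A", "\u00e9", "\u20ac", "\U0001F600"]  # A, e-acute, euro sign, grinning face

# Map each character to the number of bytes its UTF-8 encoding occupies.
lengths = {ch: len(ch.encode("utf-8")) for ch in samples}

for ch, n in lengths.items():
    encoded = ch.encode("utf-8")
    print(f"U+{ord(ch):04X} uses {n} byte(s): {encoded.hex(' ')}")
```

Code points below U+0080 fit in one byte, and the byte count grows as the code point value increases, up to four bytes.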

The first 128 UTF-8 characters precisely match the first 128 ASCII characters (numbered 0-127), meaning that existing ASCII text is already valid UTF-8. All other characters use two to four bytes. In every byte, the high-order bits are reserved for encoding purposes: the leading byte of a multi-byte sequence signals how many bytes the sequence contains, and each following byte is marked as a continuation byte. Since non-ASCII characters require more than one byte for storage, they risk being corrupted if their bytes are separated and not recombined.
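The points above can be sketched in Python: ASCII text produces identical bytes under both encodings, the reserved high-order bits are visible in a multi-byte character, and a sequence that loses one of its bytes no longer decodes. The euro sign and all variable names are illustrative choices, not taken from the original article.

```python
# ASCII text is byte-for-byte identical when encoded as UTF-8.
ascii_bytes = "Hello".encode("ascii")
same_as_utf8 = ascii_bytes == "Hello".encode("utf-8")
decoded_ascii = ascii_bytes.decode("utf-8")

# The euro sign (U+20AC) needs three bytes in UTF-8.
euro = "\u20ac".encode("utf-8")
lead_bits = f"{euro[0]:08b}"                  # leading byte starts with 1110 (3-byte sequence)
cont_bits = [f"{b:08b}" for b in euro[1:]]    # continuation bytes start with 10

# Dropping the last byte separates the sequence, so strict decoding fails.
truncated = euro[:2]
try:
    truncated.decode("utf-8")
    split_ok = True
except UnicodeDecodeError:
    split_ok = False

# A lenient decoder substitutes U+FFFD (the replacement character) instead.
replaced = truncated.decode("utf-8", errors="replace")

print(same_as_utf8, lead_bits, cont_bits, split_ok, repr(replaced))
```

Strict decoding is the default in Python; passing `errors="replace"` trades the exception for visible replacement characters, which is how corrupted text often surfaces on web pages.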

Learn more

General knowledge

Document Tags and Contributors

Contributors to this page: mdnwebdocs-bot, sideshowbarker, haingh, sebastien-bartoli, r-o-b, hbloomer, Andrew_Pfeiffer, Sheppy, klez, sandeepmishraxp
Last updated by: mdnwebdocs-bot,