Base64 encoding and decoding

This translation is incomplete. Please help translate this article from English

Base64 is a group of similar binary-to-text encoding schemes that represent binary data in an ASCII string format by translating it into a radix-64 representation. The term Base64 originates from a specific MIME content transfer encoding.

Base64 인코딩은 텍스트 데이터를 처리하도록 설계된 미디어를 통해 전송되거나 저장되야 하는 바이너리 데이터를 인코딩 해야할 때 일반적으로 사용된다.
Base64는 데이터가 전송중에 수정되지 않고 그대로 전송되기를 보장한다.
Base64는 일반적으로 MIME을 통한 전자 메일 또는 복잡한 데이터를 XML로 저장하는 등 여러 가지 응용 프로그램에서 사용된다.

JS에서는 두 개의 함수가 디코딩과 인코딩을 해준다.

base64 strings:

The atob() function decodes a string of data which has been encoded using base-64 encoding. Conversely, the btoa() function creates a base-64 encoded ASCII string from a "string" of binary data.

Both atob() and btoa() work on strings. If you want to work on ArrayBuffers, please, read this paragraph.


data URIs
data URIs, defined by RFC 2397, allow content creators to embed small files inline in documents.
Wikipedia article about Base64 encoding.
Decodes a string of data which has been encoded using base-64 encoding.
Creates a base-64 encoded ASCII string from a "string" of binary data.
The "Unicode Problem"
In most browsers, calling btoa() on a Unicode string will cause a Character Out Of Range exception. This paragraph shows some solutions.
List of Mozilla supported URI schemes
In this article is published a library of ours whose aims are:
  • creating a C-like interface for strings (i.e. array of characters codes — ArrayBufferView in JavaScript) based upon the JavaScript ArrayBuffer interface,
  • creating a collection of methods for such string-like objects (since now: stringViews) which work strictly on array of numbers rather than on immutable JavaScript strings,
  • working with other Unicode encodings, different from default JavaScript's UTF-16 DOMStrings,

View All...


View All...

The "Unicode Problem"

Since DOMStrings are 16-bit-encoded strings, in most browsers calling window.btoa on a Unicode string will cause a Character Out Of Range exception if a character exceeds the range of a 8-bit byte (0x00~0xFF). There are two possible methods to solve this problem:

  • the first one is to escape the whole string (with UTF-8, see encodeURIComponent) and then encode it;
  • the second one is to convert the UTF-16 DOMString to an UTF-8 array of characters and then encode it.

Here are the two possible methods.

Solution #1 – escaping the string before encoding it

function b64EncodeUnicode(str) {
    // first we use encodeURIComponent to get percent-encoded UTF-8,
    // then we convert the percent encodings into raw bytes which
    // can be fed into btoa.
    return btoa(encodeURIComponent(str).replace(/%([0-9A-F]{2})/g,
        function toSolidBytes(match, p1) {
            return String.fromCharCode('0x' + p1);

b64EncodeUnicode('✓ à la mode'); // "4pyTIMOgIGxhIG1vZGU="
b64EncodeUnicode('\n'); // "Cg=="

To decode the Base64-encoded value back into a String:

function b64DecodeUnicode(str) {
    // Going backwards: from bytestream, to percent-encoding, to original string.
    return decodeURIComponent(atob(str).split('').map(function(c) {
        return '%' + ('00' + c.charCodeAt(0).toString(16)).slice(-2);

b64DecodeUnicode('4pyTIMOgIGxhIG1vZGU='); // "✓ à la mode"
b64DecodeUnicode('Cg=='); // "\n"

Unibabel is a library which includes common conversions using this strategy.

Solution #2 – rewrite the DOMs atob() and btoa() using JavaScript's TypedArrays and UTF-8

Use a TextEncoder polyfill such as TextEncoding (also includes legacy windows, mac, and ISO encodings), TextEncoderLite, or Buffer and a Base64 polyfill such as base64-js.

The simplest, most light-weight solution would be to use TextEncoderLite and base64-js.

This function assumes using base64-js imported as minified <script type="text/javascript" src="base64js.min.js"/>

function base64EncodingUTF8(str) {
    var encoded = new TextEncoderLite('utf-8').encode(str);        
    var b64Encoded = base64js.fromByteArray(encoded);
    return b64Encoded;