Base64 Encoding Explained: How It Works and When to Use It


Base64 is one of those concepts that every developer encounters but few take the time to understand properly. You see it in data URIs, JWT tokens, API authentication headers and email attachments, yet the mechanism behind it often remains a black box. This guide explains Base64 from first principles — how the encoding algorithm works, when to use it, and critically, when not to.

What Is Base64?

Base64 is a binary-to-text encoding scheme that represents arbitrary binary data using an alphabet of 64 printable ASCII characters: A-Z (26), a-z (26), 0-9 (10), plus + and / (2), totalling 64. A 65th character, =, is used only for padding.

It is not encryption. Anyone can decode Base64 back to the original data without a key. Its purpose is purely practical: to transmit binary data safely through systems that only handle text. Email protocols, JSON payloads, URLs, HTTP headers and XML documents all have restrictions on what bytes they can carry. Base64 output uses only printable ASCII characters, so it survives transit through almost any text-based system; URLs are the main exception, because + and / have special meaning there, which is what the Base64URL variant addresses.

Every 3 bytes of input produce 4 characters of Base64 output, making encoded data approximately 33% larger than the original. This overhead is the price of compatibility.
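The 3-to-4 ratio can be turned into a small size calculation. The following is a minimal sketch; the function name base64Length is made up for illustration and is not part of any library.

```javascript
// Length of the padded Base64 output for an input of byteLength bytes.
// Every group of up to 3 input bytes becomes exactly 4 output characters.
// (base64Length is a hypothetical helper name used only for this example.)
function base64Length(byteLength) {
  return 4 * Math.ceil(byteLength / 3);
}

console.log(base64Length(3));   // 4
console.log(base64Length(2));   // 4 (three data characters plus one "=")
console.log(base64Length(300)); // 400
```

For inputs that are not a multiple of 3 bytes, padding rounds the output up to the next multiple of 4, so very short inputs see more than 33% overhead.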

How the Algorithm Works

The encoding process operates on groups of three bytes (24 bits) at a time:

Step 1: Take 3 input bytes and concatenate their binary representations into 24 bits.

Step 2: Split the 24 bits into four groups of 6 bits each.

Step 3: Map each 6-bit value (0-63) to a character using the Base64 alphabet.

Worked example with the string "Hi":

Input:    H         i
ASCII:    72        105
Binary:   01001000  01101001

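Concatenated: 0100100001101001
(16 bits; two zero bits are appended to reach 18, a multiple of 6)

Padded:       010010000110100100

Split into 6-bit groups:
010010  000110  100100

Decimal:  18    6    36
Base64:   S     G    k

Only three characters come from the data, so one = is appended to fill the four-character output group. The = is not the encoding of a zero group; the value 0 maps to A.

Result: "SGk="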

The = padding appears because the input (2 bytes) was not evenly divisible by 3. With 1 leftover byte, you get == padding. With 2 leftover bytes, you get =. With an input length divisible by 3, there is no padding.
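The three steps and the padding rule can be sketched as a small hand-written encoder. This is an illustrative implementation for understanding only; the names ALPHABET and encodeBase64 are assumptions of this example, and in practice you would use the built-in functions of your platform.

```javascript
const ALPHABET =
  "ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz0123456789+/";

// Encode an array of byte values (0-255) into a padded Base64 string.
function encodeBase64(bytes) {
  let out = "";
  for (let i = 0; i < bytes.length; i += 3) {
    const remaining = bytes.length - i; // 1, 2, or 3+ bytes left in this group
    const b0 = bytes[i];
    const b1 = remaining > 1 ? bytes[i + 1] : 0; // missing bytes are treated as zero bits
    const b2 = remaining > 2 ? bytes[i + 2] : 0;

    // Step 1: concatenate the three bytes into one 24-bit number.
    const group = (b0 << 16) | (b1 << 8) | b2;

    // Steps 2 and 3: take four 6-bit slices and map each to the alphabet.
    out += ALPHABET[(group >> 18) & 63];
    out += ALPHABET[(group >> 12) & 63];
    // Slices that contain no input data are emitted as "=" padding.
    out += remaining > 1 ? ALPHABET[(group >> 6) & 63] : "=";
    out += remaining > 2 ? ALPHABET[group & 63] : "=";
  }
  return out;
}

console.log(encodeBase64([72, 105])); // "SGk=" (the bytes of "Hi")
```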

The Base64 Alphabet

Values    Characters  Description
0-25      A-Z         Uppercase letters
26-51     a-z         Lowercase letters
52-61     0-9         Digits
62        +           Plus
63        /           Forward slash
Padding   =           Pad character
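Decoding is the same lookup in reverse: each character's position in the alphabet is its 6-bit value. The sketch below illustrates this; B64_CHARS, base64Value and decodeBase64 are hypothetical names chosen for the example, and it handles only the standard alphabet with optional trailing padding.

```javascript
const B64_CHARS =
  "ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz0123456789+/";

// Value (0-63) of a single Base64 character: its index in the alphabet.
function base64Value(ch) {
  const value = B64_CHARS.indexOf(ch);
  if (ch.length !== 1 || value === -1) {
    throw new Error(`Not a Base64 alphabet character: ${ch}`);
  }
  return value;
}

// Decode a Base64 string into an array of byte values.
function decodeBase64(text) {
  const bytes = [];
  const data = text.replace(/=+$/, ""); // padding carries no data
  let buffer = 0; // holds bits that have not yet formed a full byte
  let bits = 0;   // number of valid bits currently in buffer
  for (const ch of data) {
    buffer = (buffer << 6) | base64Value(ch);
    bits += 6;
    if (bits >= 8) {
      bits -= 8;
      bytes.push((buffer >> bits) & 0xff);
      buffer &= (1 << bits) - 1; // keep only the leftover bits
    }
  }
  return bytes;
}

console.log(decodeBase64("SGk=")); // [ 72, 105 ]
```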

Common Use Cases

Data URIs in HTML and CSS

Embed small images directly in your code without a separate HTTP request. This eliminates a round trip to the server, which is particularly beneficial for small icons and decorative images under 2KB.

<img src="data:image/png;base64,iVBORw0KGgo..." />

background-image: url(data:image/svg+xml;base64,PHN2Zy...);

The trade-off: Base64-encoded images are 33% larger than the binary original and cannot be cached independently by the browser. For images larger than a few kilobytes, a separate file served with proper cache headers is more efficient.
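As a rough sketch of how such a data URI is assembled, the helper below builds one from raw bytes. It assumes Node.js (Buffer is a Node global); toDataUri and the tiny SVG string are made up for this example.

```javascript
// Build a data URI from raw bytes and a MIME type (Node.js sketch).
function toDataUri(bytes, mimeType) {
  const base64 = Buffer.from(bytes).toString("base64");
  return `data:${mimeType};base64,${base64}`;
}

// A hypothetical, minimal SVG used only to demonstrate the helper.
const svg = '<svg xmlns="http://www.w3.org/2000/svg"/>';
const uri = toDataUri(Buffer.from(svg, "utf8"), "image/svg+xml");

console.log(uri.slice(0, 32)); // data:image/svg+xml;base64,PHN2Zy
```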

HTTP Basic Authentication

The Authorization header sends credentials as Base64-encoded username:password:

Authorization: Basic dXNlcm5hbWU6cGFzc3dvcmQ=
// Decodes to: username:password

This is why Basic Auth must always use HTTPS. Base64 is trivially decoded — it provides zero security. The encoding exists purely because HTTP headers are text-based and need a safe way to carry the colon-separated credentials.
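To make the point concrete, here is a minimal sketch that builds and then reverses a Basic Auth header value. It assumes Node.js for Buffer; basicAuthHeader and decodeBasicAuth are illustrative names, not part of any HTTP library.

```javascript
// Build the value of an Authorization header for Basic Auth.
function basicAuthHeader(username, password) {
  const credentials = `${username}:${password}`;
  return "Basic " + Buffer.from(credentials, "utf8").toString("base64");
}

// Reverse it: anyone who sees the header can do this without a key.
function decodeBasicAuth(headerValue) {
  const encoded = headerValue.replace(/^Basic\s+/i, "");
  const decoded = Buffer.from(encoded, "base64").toString("utf8");
  // Split on the first colon only, since the password may contain colons.
  const colon = decoded.indexOf(":");
  return {
    username: decoded.slice(0, colon),
    password: decoded.slice(colon + 1),
  };
}

const header = basicAuthHeader("username", "password");
console.log(header);                  // Basic dXNlcm5hbWU6cGFzc3dvcmQ=
console.log(decodeBasicAuth(header)); // { username: 'username', password: 'password' }
```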

JWT Tokens

JSON Web Tokens use Base64URL encoding (a URL-safe variant) for the header and payload sections. A JWT has three parts separated by dots, and the first two are Base64URL-encoded JSON:

eyJhbGciOiJIUzI1NiJ9.eyJ1c2VyIjoiYWxpY2UifQ.signature

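You can decode the first two parts to inspect the token contents without any key; only verifying the signature requires the secret (or, for asymmetric algorithms, the public key). This is by design: JWTs are self-describing tokens whose payload is readable but tamper-evident, because any change to the header or payload invalidates the signature.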
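A short sketch of inspecting the example token follows. It assumes Node.js for Buffer, the helper names decodeJwtPart and inspectJwt are made up for illustration, and it deliberately does not verify the signature, so the result must not be trusted for authorization decisions.

```javascript
// Decode one Base64URL-encoded JSON segment of a JWT.
function decodeJwtPart(part) {
  // Map the URL-safe characters back to the standard alphabet.
  const base64 = part.replace(/-/g, "+").replace(/_/g, "/");
  // Node's Base64 decoder tolerates the missing "=" padding.
  return JSON.parse(Buffer.from(base64, "base64").toString("utf8"));
}

// Read the header and payload only. No signature check happens here.
function inspectJwt(token) {
  const [header, payload] = token.split(".");
  return { header: decodeJwtPart(header), payload: decodeJwtPart(payload) };
}

const token = "eyJhbGciOiJIUzI1NiJ9.eyJ1c2VyIjoiYWxpY2UifQ.signature";
console.log(inspectJwt(token));
// { header: { alg: 'HS256' }, payload: { user: 'alice' } }
```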

Email Attachments (MIME)

Binary attachments in emails are Base64-encoded because SMTP was designed for 7-bit ASCII text only. When you attach a PDF or image to an email, your client encodes it as Base64, wraps it in MIME headers, and the recipient's client decodes it back. This is why attachments inflate email size by roughly a third.

API Request Bodies

When sending binary data — images, PDFs, audio files — in JSON payloads, Base64 encoding lets you include them as string values. This is common in APIs where multipart form data is not supported or where you want to keep everything in a single JSON document.
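A minimal sketch of this pattern is shown below. The payload shape (filename and contentBase64 fields) and the function names are assumptions made for the example; a real API defines its own field names. It assumes Node.js for Buffer.

```javascript
// Client side: wrap binary bytes in a JSON document.
// The field names here are hypothetical, not a standard.
function buildUploadBody(filename, bytes) {
  return JSON.stringify({
    filename,
    contentBase64: Buffer.from(bytes).toString("base64"),
  });
}

// Server side: recover the original bytes from the JSON document.
function readUploadBody(json) {
  const { filename, contentBase64 } = JSON.parse(json);
  return { filename, bytes: Buffer.from(contentBase64, "base64") };
}

const body = buildUploadBody("sample.bin", [0, 255, 16, 128]);
console.log(body); // {"filename":"sample.bin","contentBase64":"AP8QgA=="}
console.log(Array.from(readUploadBody(body).bytes)); // [ 0, 255, 16, 128 ]
```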

Encode & Decode Base64 Instantly

Convert text to Base64 and back, with URL-safe mode and UTF-8 support. Everything stays in your browser.

Open Base64 Encoder →

Base64 vs Base64URL

Feature           Standard Base64                   Base64URL
Characters 62-63  + and /                           - and _
Padding           Required (=)                      Often omitted
URL-safe          No (+ and / need URL escaping)    Yes
Used in           Email, data URIs, Basic Auth      JWTs, URL parameters, filenames

The standard Base64 characters + and / have special meaning in URLs and file paths. Base64URL replaces them with - and _, making the output safe to use directly in URLs without percent-encoding. The padding character = is also problematic in URLs, so Base64URL often omits it — the decoder can infer the padding from the string length.
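Converting between the two variants is a matter of swapping two characters and handling the padding. The following sketch uses plain string operations; toBase64Url and fromBase64Url are illustrative names.

```javascript
// Standard Base64 -> Base64URL: swap the two special characters, drop padding.
function toBase64Url(base64) {
  return base64.replace(/\+/g, "-").replace(/\//g, "_").replace(/=+$/, "");
}

// Base64URL -> standard Base64: swap back and restore the padding,
// which can be inferred from the length.
function fromBase64Url(base64url) {
  const base64 = base64url.replace(/-/g, "+").replace(/_/g, "/");
  if (base64.length % 4 === 1) {
    // No valid Base64 string has this length, padded or not.
    throw new Error("Invalid Base64URL length");
  }
  const padLength = (4 - (base64.length % 4)) % 4;
  return base64 + "=".repeat(padLength);
}

console.log(toBase64Url("+/8="));  // "-_8"
console.log(fromBase64Url("-_8")); // "+/8="
```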

Base64 in Different Languages

Language      Encode                                      Decode
JavaScript    btoa(string)                                atob(string)
Node.js       Buffer.from(str).toString('base64')         Buffer.from(b64, 'base64').toString()
Python        base64.b64encode(bytes)                     base64.b64decode(string)
C#            Convert.ToBase64String(bytes)               Convert.FromBase64String(string)
Java          Base64.getEncoder().encodeToString(bytes)   Base64.getDecoder().decode(string)
PHP           base64_encode($data)                        base64_decode($data)
Go            base64.StdEncoding.EncodeToString(bytes)    base64.StdEncoding.DecodeString(str)
Command line  echo -n "text" | base64                     echo "dGV4dA==" | base64 -d

A common gotcha in JavaScript: btoa() and atob() only handle Latin-1 characters (code points 0-255), and btoa() throws on anything else. For text containing characters outside that range, such as emojis, Chinese characters, or letters like ł and ő, you need to encode to UTF-8 bytes first. The modern approach uses TextEncoder and TextDecoder:

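// Encode a UTF-8 string to Base64 in the browser.
// The sample text is illustrative; any string with non-Latin-1 characters works.
const text = "Hello 🌍";
const utf8Bytes = new TextEncoder().encode(text);
const base64 = btoa(String.fromCodePoint(...utf8Bytes));

// Decode back: Base64 -> bytes -> UTF-8 string
const decoded = new TextDecoder().decode(
  Uint8Array.from(atob(base64), (c) => c.codePointAt(0))
);

// Or with Buffer in Node.js
const base64Node = Buffer.from(text).toString('base64');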

When NOT to Use Base64

Base64 is often used where better alternatives exist. Avoid it in these situations:

Large file transfers: Base64 adds 33% overhead. For files over a few kilobytes, use multipart form uploads, presigned URLs, or streaming uploads. Encoding a 10MB image as Base64 means transferring 13.3MB of text.

Security or obfuscation: Base64 is not encryption. Do not use it to hide passwords, API keys or sensitive data. It is trivially reversible and provides zero confidentiality.

Large inline images: Data URIs with Base64-encoded images larger than 2-3KB are counterproductive. They inflate HTML/CSS file size, cannot be cached independently, and block rendering until the entire document is parsed. Use separate image files with proper cache headers instead.

Database storage: Storing binary data as Base64 text in a database takes about 33% more storage and is typically slower to query than native BLOB or BYTEA columns. Most modern databases handle binary data natively.

Performance Considerations

Base64 encoding and decoding are fast operations: modern CPUs can process gigabytes per second. The performance concern is not the encoding itself but the consequences of the 33% size increase. In network-constrained environments, that extra third matters. In memory-constrained environments, holding both the Base64 string and the decoded binary simultaneously can more than double memory usage compared with holding the binary alone.

For web applications, the key decision is inline vs external. Small assets (icons under 1-2KB, simple SVGs) can benefit from Base64 inlining because eliminating an HTTP request outweighs the size penalty. Larger assets should generally be served as separate files, where the browser can cache them independently and load them in parallel. Text assets such as SVG also compress well with gzip/brotli when served raw, whereas their Base64-encoded form compresses noticeably worse.

Base32 and Base16 Alternatives

Base64 is not the only binary-to-text encoding. Base32 uses 32 characters (A-Z, 2-7), producing output that is 60% larger than the input but is case-insensitive and avoids ambiguous characters — useful for human-readable codes like TOTP secrets in two-factor authentication apps. When you set up an authenticator app and see a code like JBSWY3DPEHPK3PXP, that is Base32.

Base16 (hexadecimal) uses 16 characters (0-9, A-F), doubling the size but being the most straightforward encoding — every byte becomes exactly two hex digits. You encounter hex encoding in CSS colour codes (#FF5733), MAC addresses (00:1A:2B:3C:4D:5E), cryptographic hashes (SHA-256 outputs), and memory addresses in debugging.

Base85 (also called Ascii85) is less common but more space-efficient than Base64, producing only 25% overhead compared to 33%. It is used in PDF file encoding and some Git internal operations. The trade-off is a larger character set that includes characters problematic in some contexts.
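The overhead figures quoted above follow directly from how many output characters each encoding produces per group of input bytes. The sketch below checks that arithmetic for whole groups and shows hex encoding by hand; ENCODINGS, overheadPercent and toHex are names invented for this example.

```javascript
// Input bytes and output characters per full group for each encoding.
const ENCODINGS = {
  base16: { bytesIn: 1, charsOut: 2 },
  base32: { bytesIn: 5, charsOut: 8 },
  base64: { bytesIn: 3, charsOut: 4 },
  base85: { bytesIn: 4, charsOut: 5 },
};

// Percentage size increase, rounded, assuming the input fills whole groups.
function overheadPercent(name) {
  const { bytesIn, charsOut } = ENCODINGS[name];
  return Math.round((charsOut / bytesIn - 1) * 100);
}

// Hex-encode an array of byte values: each byte becomes two hex digits.
function toHex(bytes) {
  return bytes.map((b) => b.toString(16).padStart(2, "0")).join("");
}

for (const name of Object.keys(ENCODINGS)) {
  console.log(name, overheadPercent(name) + "%");
}
// base16 100%, base32 60%, base64 33%, base85 25%

console.log(toHex([72, 105])); // "4869"
```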

Debugging Base64

When working with Base64-encoded data, a few common issues come up regularly. If decoding produces garbage text, the data is probably binary (an image or PDF) rather than text — try decoding to a file instead of a string. If you get padding errors, check that the string length is a multiple of 4 and that padding characters have not been stripped. If characters look wrong after decoding, the original was likely UTF-8 text that was decoded as Latin-1 — ensure your decoder uses the correct character encoding.
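Some of these checks can be automated before attempting a decode. The helper below is a rough diagnostic sketch, not a validator from any library; diagnoseBase64 and its messages are invented for this example, and it only inspects the string's shape rather than the decoded content.

```javascript
// Return a list of likely problems with a Base64 string (empty if none found).
function diagnoseBase64(input) {
  const problems = [];
  if (/\s/.test(input)) {
    problems.push("contains whitespace or line breaks");
  }
  const compact = input.replace(/\s+/g, "");
  if (/[-_]/.test(compact)) {
    problems.push("uses Base64URL characters (- or _)");
  }
  if (/[^A-Za-z0-9+\/=_-]/.test(compact)) {
    problems.push("contains characters outside the Base64 alphabet");
  }
  if (compact.length % 4 !== 0) {
    problems.push("length is not a multiple of 4 (padding may have been stripped)");
  }
  if (/=[^=]/.test(compact)) {
    problems.push("padding appears before the end of the string");
  }
  return problems;
}

console.log(diagnoseBase64("dGVzdA==")); // []
console.log(diagnoseBase64("dGVzdA"));
// [ 'length is not a multiple of 4 (padding may have been stripped)' ]
```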

For quick debugging, most browsers include Base64 functions in the developer console. Open DevTools, switch to the Console tab, and use atob('dGVzdA==') to decode or btoa('test') to encode. This is faster than reaching for an external tool when you just need to inspect a token or verify a value.
