Understanding Base64

Shaik Hamzah Last Updated : 10 Jun, 2025

Base64 is a binary-to-text encoding scheme that represents binary data as an ASCII string. It’s often used to send data over channels designed for text, such as email and JSON-based APIs, so that binary data like images and files doesn’t get corrupted in transit. The name Base64 comes from the 64 characters it uses to represent data: A-Z, a-z, 0-9, +, and /. In recent years, it has been widely used in multimodal AI applications, embedded systems, cloud-based services, and web development. In this article, we’ll learn more about Base64 and how to use it.

Table of Contents

Why Base64?
How Does Base64 Work?
Python Implementation of Base64
- Encoding and Decoding Text
- Encoding and Decoding Images
Things to Keep in Mind While Using Base64
Conclusion

Why Base64?

Base64 is mostly used in cases where binary data (e.g., images, videos, model weights, etc.) needs to be passed through text-based infrastructures without being altered or corrupted. But why is it a popular choice amongst so many other types of encodings? Let’s try to understand.

Base64 is:

Text-safe: Can embed binary data in text-based formats like HTML, XML, JSON, etc.
Easy to transport: No issues with character encoding or data corruption.
Common for images: Often used in web development to embed images directly in HTML/CSS or JSON payloads.
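
As a rough illustration of the HTML embedding case, the sketch below builds a data URI from raw bytes. The helper name `to_data_uri` and the placeholder bytes are assumptions made for this example; in practice the bytes would come from a real image file.

```python
import base64

def to_data_uri(data: bytes, mime_type: str = "image/png") -> str:
    """Return a data URI that embeds `data` as Base64 text."""
    encoded = base64.b64encode(data).decode("ascii")
    return f"data:{mime_type};base64,{encoded}"

# Placeholder bytes standing in for real image content (hypothetical).
fake_image = b"\x89PNG\r\n\x1a\n"
uri = to_data_uri(fake_image)

# The URI can be dropped straight into an <img> tag.
html_tag = f'<img src="{uri}" alt="embedded image">'
print(html_tag)
```

Because the image travels inside the markup itself, the browser needs no extra request to fetch it, at the cost of a larger HTML document.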

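Here’s how Base64 compares with a couple of other common ways of transforming data. Note that Gzip is a compression method rather than a text encoding, so it is listed only for contrast.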

Encoding	Purpose	Use Case	Size Impact
Base64	Binary to text	Embedding images/files in HTML, JSON, etc.	~33% increase
Hex	Binary to Hexadecimal	Debugging, network traces	~100% increase
Gzip	Compression	Actual size reduction for text/binary	Compression ratio-dependent
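
The size figures in the table can be checked with a quick experiment. This is a minimal sketch using only the standard library; the sample data is arbitrary and chosen just to make the arithmetic easy to follow.

```python
import base64
import gzip

data = bytes(range(256)) * 12  # 3072 bytes of sample binary data

b64_len = len(base64.b64encode(data))
hex_len = len(data.hex())

print(f"raw:    {len(data)} bytes")
print(f"base64: {b64_len} bytes (+{(b64_len / len(data) - 1):.0%})")
print(f"hex:    {hex_len} bytes (+{(hex_len / len(data) - 1):.0%})")
print(f"gzip:   {len(gzip.compress(data))} bytes (depends on the data)")
```

Base64 turns every 3 bytes into 4 characters and hex turns every byte into 2 characters, which is where the ~33% and ~100% figures come from.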

How Does Base64 Work?

Now let’s try to understand how Base64 works. Here’s a walkthrough of the step-by-step conversion of the string “Hello” into its Base64 format.

Step 1: Convert the Text to ASCII Bytes

Character	ASCII Decimal Value	Binary Value (8 bits)
H	72	01001000
e	101	01100101
l	108	01101100
l	108	01101100
o	111	01101111

So now, our string “Hello” would look like 01001000 01100101 01101100 01101100 01101111.

That’s 5 characters × 8 bits = 40 bits.
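
Step 1 can be reproduced in a few lines of Python. This is only a sketch of the conversion shown in the table, using the built-in `format` function to render each byte as 8 binary digits.

```python
text = "Hello"
raw = text.encode("ascii")

# Format each byte as an 8-bit binary string.
bits_per_byte = [format(byte, "08b") for byte in raw]

print(list(raw))                # [72, 101, 108, 108, 111]
print(" ".join(bits_per_byte))  # 01001000 01100101 01101100 01101100 01101111
print(len(raw) * 8)             # 40
```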

Step 2: Break the Binary into 6-bit Groups

Base64 operates on 6-bit blocks, so we regroup the 40 bits, which are currently in chunks of 8, into chunks of 6:

01001000 01100101 01101100 01101100 01101111

Regrouped into chunks of 6, the same bits look like this:

010010 000110 010101 101100 011011 000110 1111

Since 40 isn’t divisible by 6, we end up with 6 full 6-bit blocks and 1 leftover 4-bit block. We pad the last block with 2 zero bits to make it a full 6-bit chunk:

010010 000110 010101 101100 011011 000110 111100
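
The regrouping and zero-padding in Step 2 can be sketched as follows. The variable names are illustrative; the bit string is derived from “Hello” rather than typed by hand to avoid transcription mistakes.

```python
# "Hello" as one continuous string of 40 bits.
bits = "".join(format(byte, "08b") for byte in b"Hello")

# Pad with zero bits on the right until the length is a multiple of 6.
padding_bits = (-len(bits)) % 6
padded = bits + "0" * padding_bits

# Slice the padded bit string into 6-bit groups.
groups = [padded[i:i + 6] for i in range(0, len(padded), 6)]

print(padding_bits)  # 2
print(groups)
# ['010010', '000110', '010101', '101100', '011011', '000110', '111100']
```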

Step 3: Convert 6-bit Groups to Decimal

Each 6-bit group can hold 2^6 = 64 distinct values, so every decimal value falls in the range 0 to 63.

6-bit binary	Decimal
010010	18
000110	6
010101	21
101100	44
011011	27
000110	6
111100	60
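
The same conversion can be checked in Python, where `int(text, 2)` parses a string of binary digits as a base-2 integer. This is just a small sketch of the table above.

```python
groups = ["010010", "000110", "010101", "101100", "011011", "000110", "111100"]

# Interpret each 6-bit group as an unsigned integer.
values = [int(group, 2) for group in groups]

print(values)  # [18, 6, 21, 44, 27, 6, 60]
```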

Step 4: Map to Base64 Characters

Following the standard Base64 character table, we will map our decimal values to the corresponding characters.

Decimal	Base64 Character
18	S
6	G
21	V
44	s
27	b
6	G
60	8

We get “SGVsbG8” as our Base64 encoding for our string “Hello”.
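
The lookup in Step 4 amounts to indexing into a 64-character string. The sketch below builds the standard alphabet from Python’s `string` constants; the name `ALPHABET` is just a label chosen for this example.

```python
import string

# Standard Base64 alphabet: A-Z, a-z, 0-9, then "+" and "/".
ALPHABET = string.ascii_uppercase + string.ascii_lowercase + string.digits + "+/"

values = [18, 6, 21, 44, 27, 6, 60]
encoded = "".join(ALPHABET[v] for v in values)

print(len(ALPHABET))  # 64
print(encoded)        # SGVsbG8
```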

Step 5: Add Padding

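Base64 processes input in groups of 3 bytes (24 bits), and each full group becomes 4 characters. Since our original string has 5 bytes (not a multiple of 3), the last group is incomplete, and Base64 pads the output with “=” to make its length a multiple of 4 characters.

5 bytes = 40 bits -> 6 full Base64 characters + 1 more character (from the zero-padded bits) = 7 characters -> plus 1 “=” padding character -> Total 8 characters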

Final Base64 encoded string: “Hello” -> SGVsbG8=
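
Putting the five steps together, a hand-rolled encoder might look like the sketch below. The function `manual_b64encode` is a teaching aid written for this walkthrough, not how the standard library implements it; the loop compares its output with `base64.b64encode` on a few inputs of different lengths.

```python
import base64
import string

ALPHABET = string.ascii_uppercase + string.ascii_lowercase + string.digits + "+/"

def manual_b64encode(data: bytes) -> str:
    """Encode bytes by following the five steps described above."""
    bits = "".join(format(byte, "08b") for byte in data)
    bits += "0" * ((-len(bits)) % 6)          # zero-pad to a multiple of 6 bits
    chars = [ALPHABET[int(bits[i:i + 6], 2)] for i in range(0, len(bits), 6)]
    chars.append("=" * ((-len(chars)) % 4))   # "=" pad to a multiple of 4 chars
    return "".join(chars)

for sample in (b"Hello", b"Hell", b"Hel", b""):
    expected = base64.b64encode(sample).decode("ascii")
    assert manual_b64encode(sample) == expected
    print(sample, "->", expected)
```

An input whose length is a multiple of 3 needs no padding, one leftover byte produces “==”, and two leftover bytes produce a single “=”. In every case the output length is 4 × ceil(n / 3) characters for n input bytes.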

Python Implementation of Base64

Now that you understand how Base64 works, let me show you how to implement it in Python. We’ll first try to encode and decode some text, and then do the same with an image.

Encoding and Decoding Text

Let’s encode this simple text using Base64 and then decode the encoded string back to its original form.

import base64

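# Text encoding: b64encode() works on bytes, so encode the string first
message = "Hello World"
encoded = base64.b64encode(message.encode())
print("Encoded:", encoded)

# Decoding it back: b64decode() returns bytes, so decode them to a string
decoded = base64.b64decode(encoded).decode()
print("Decoded:", decoded)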

Output

Encoded: b'SGVsbG8gV29ybGQ='
Decoded: Hello World

Encoding and Decoding Images

In vision-related applications, especially with Vision Language Models (VLMs), images are often encoded in Base64 when:

Transmitting images via JSON payloads to or from APIs.
Embedding images for training and serving multimodal models.
Calling CLIP, BLIP, LLaVA, or other vision-language models through serving APIs, which often accept images as serialized Base64 strings.

Here’s a simple Python snippet to encode an image.

from PIL import Image
import base64
import io

# Load the image and re-save it as JPEG into an in-memory buffer
img = Image.open("example.jpeg")
buffered = io.BytesIO()
img.save(buffered, format="JPEG")

# Encode the JPEG bytes as a Base64 string
img_bytes = buffered.getvalue()
img_base64 = base64.b64encode(img_bytes).decode('utf-8')

print("Base64 String:", img_base64[:100], "...")  # Truncated

Output

[Figure: Base64 for transmission of data]

We can also decode the Base64-encoded data back into an image using the code below.

from PIL import Image
import base64
import io
from IPython.display import display, Image as IPythonImage

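# Assume `img_base64` is the Base64 string from the previous snippet
img_data = base64.b64decode(img_base64)

# Parse the decoded bytes with Pillow for any further processing
img = Image.open(io.BytesIO(img_data))

# Render the image inline (works in a Jupyter/IPython notebook)
display(IPythonImage(data=img_data))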

Output

[Figure: Encoding and decoding images with Base64]
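
For the JSON payload use case mentioned earlier, a round trip might look like the sketch below. The field names `filename` and `image_base64` and the placeholder bytes are assumptions made for illustration; a real API will define its own schema.

```python
import base64
import json

# Placeholder bytes standing in for an image file's contents (hypothetical).
image_bytes = bytes([0xFF, 0xD8, 0xFF, 0xE0, 0x00, 0x10])

# Sender: JSON cannot hold raw bytes, so encode them as a Base64 string.
payload = json.dumps({
    "filename": "example.jpeg",
    "image_base64": base64.b64encode(image_bytes).decode("ascii"),
})

# Receiver: parse the JSON and decode the field back into bytes.
received = json.loads(payload)
restored = base64.b64decode(received["image_base64"])

assert restored == image_bytes
print(payload)
```

Calling `.decode("ascii")` on the encoded bytes matters here, because `json.dumps` raises a `TypeError` when handed a `bytes` value.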

To learn more about Base64 and find many more encoders and decoders, you can refer to this site.

Things to Keep in Mind While Using Base64

Although Base64 is useful across many domains, here are a few things to note while working with it.

Size Overhead (~33%): For every 3 bytes of binary, you output 4 bytes of text. On large batches (e.g., thousands of high‑res frames), this can quickly consume network bandwidth and storage. Consider compressing images (JPEG/PNG) before Base64 encoding and using streaming if possible.
Memory & CPU Load: Converting and buffering an entire image at once can spike overall memory usage during encoding. Similarly, decoding into raw bytes and then parsing via an image library also adds CPU overhead.
Not a Compression Algorithm: Base64 doesn’t reduce size, it inflates it. Always apply true compression (e.g., JPEG, WebP) on the binary data before encoding to Base64.
Security Considerations: Base64 is an encoding, not encryption or sanitization. If you blindly concatenate unvalidated Base64 input into HTML or JSON, you could open XSS or JSON‑injection vectors. Also, extremely large Base64 payloads can exhaust parsers, so enforce maximum payload sizes at the gateway.
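
One way to act on the security point is to validate Base64 input before using it. The sketch below is an assumption-laden example, not a complete defense: the helper `safe_b64decode` and the `MAX_ENCODED_CHARS` limit are hypothetical names and values chosen for illustration. It relies on the `validate=True` option of `base64.b64decode`, which rejects characters outside the Base64 alphabet instead of silently discarding them.

```python
import base64
import binascii

MAX_ENCODED_CHARS = 1_000_000  # hypothetical limit; tune for your service

def safe_b64decode(text: str) -> bytes:
    """Decode Base64 text, rejecting oversized or malformed input."""
    if len(text) > MAX_ENCODED_CHARS:
        raise ValueError("Base64 payload exceeds the configured size limit")
    try:
        # validate=True raises on characters outside the Base64 alphabet.
        return base64.b64decode(text, validate=True)
    except binascii.Error as exc:
        raise ValueError("input is not valid Base64") from exc

print(safe_b64decode("SGVsbG8="))  # b'Hello'

# Both malformed inputs below are rejected with a ValueError.
for bad in ("SGVs<script>", "SGVsbG8"):
    try:
        safe_b64decode(bad)
    except ValueError as err:
        print(f"rejected {bad!r}: {err}")
```

Checking the length before decoding keeps an oversized payload from being expanded in memory at all. The decoded bytes still need their own validation (for example, confirming they really are an image) before being trusted.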

Conclusion

In an era where models can “see” as well as “read”, Base64 has quietly become a cornerstone of multimodal systems. It plays a very important role in data encoding by bridging the gap between binary data and text‑only systems. In vision‑language workflows, it standardizes how images travel from mobile clients to cloud GPUs, while preserving reproducibility and easing integration.

Making images compatible with text-based infrastructure has always been a complex problem to solve. Base64 encoding provides a practical solution to this, enabling image transmission over APIs and packaging datasets for training.

