# Quick peek into Base64

Hello, fellow readers and tech geeks. You have probably heard of this famous encoding scheme, whether you are on the programming side or on the red team security side. Many of us have also seen a base64 encoder-decoder in hackbars. So, let's continue the hunt for BASE64.

## What is BASE64?

Basically, it is an encoding scheme that converts binary data (010110xx) into text made up of printable ASCII characters (abcdexx).
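As a quick illustration of that idea, here is a minimal sketch using the `base64` module from Python's standard library. The sample bytes in `raw` are made up for this example; any binary data would do.

```python
import base64

# Arbitrary binary data chosen for illustration; note the null byte at the start.
raw = bytes([0x00, 0xFF, 0x10, 0x80])

encoded = base64.b64encode(raw)      # bytes drawn only from the Base64 alphabet
decoded = base64.b64decode(encoded)  # back to the original bytes

print(encoded.decode("ascii"))  # AP8QgA==
print(decoded == raw)           # True
```

The encoded form contains only letters, digits and `=`, so it can travel through channels that expect text.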

## Why was BASE64 made?

Back in the days when email was first built, it was meant only for text-based conversation. As time passed, the need arose for attachments such as images and media files (audio, video). Attachments like these are binary data, and raw binary data is likely to get corrupted when it passes through systems that were designed to carry text. BASE64 was introduced to overcome this problem.

## Problem with Raw Binary Data

The main problem with binary data is that it can contain null bytes, which in languages like C mark the end of a character string. So when raw binary data containing null bytes is handled as text, reading may stop at the first null byte, leaving the file truncated and corrupted. Now let's dive into how BASE64 encoding works.

## BASE64 encoding and decoding

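#### Note: BASE64 processes the input in groups of 3 bytes (24 bits, which split into four 6-bit values), so an input whose length is a multiple of 3 needs no padding.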

### Example 1:

String to encode: “ace”, length = 3

1) Convert each character to its decimal equivalent.

[a = 97] [c = 99] [e = 101]

2) Replace each decimal value with its 8-bit binary representation.

[97 = 01100001] [99 = 01100011] [101 = 01100101], so we get [01100001 01100011 01100101]

3) Split the bits into groups of 6:

[011000 010110 001101 100101]

4) Convert each 6-bit group from binary to decimal:

[011000 = 24] [010110 = 22] [001101 = 13] [100101 = 37]

5) Convert these decimal values to base64 characters using the base64 chart (A-Z = 0-25, a-z = 26-51, 0-9 = 52-61, + = 62, / = 63):

[24 = Y] [22 = W] [13 = N] [37 = l]

So “ace” results in “YWNl”
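The five steps above can be sketched in Python as follows. This is a hand-rolled illustration rather than how a real library implements it, and the names `ALPHABET`, `codes`, `bits`, `groups` and `indexes` are just labels chosen for this example.

```python
import string

# Standard Base64 chart: A-Z, a-z, 0-9, '+', '/'
ALPHABET = string.ascii_uppercase + string.ascii_lowercase + string.digits + "+/"

text = "ace"
codes = [ord(ch) for ch in text]                          # step 1: decimal values
bits = "".join(format(c, "08b") for c in codes)           # step 2: 8-bit binary, joined
groups = [bits[i:i + 6] for i in range(0, len(bits), 6)]  # step 3: 6-bit groups
indexes = [int(g, 2) for g in groups]                     # step 4: back to decimal
encoded = "".join(ALPHABET[i] for i in indexes)           # step 5: chart lookup

print(codes)    # [97, 99, 101]
print(groups)   # ['011000', '010110', '001101', '100101']
print(indexes)  # [24, 22, 13, 37]
print(encoded)  # YWNl
```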

### Example 2:

String to encode: “abcd”, length = 4, which is not a multiple of 3. Here padding comes into play. The next multiple of 3 is 6, so the input is two bytes short, and each missing byte is represented in the output by one “=” padding character.

#### Note: One padding character (“=”) corresponds to two zero bits (00) added to the last group, so two padding characters correspond to four zero bits (0000).

So let's start the process:

1) Convert each character to its decimal equivalent.

[a = 97] [b = 98] [c = 99] [d = 100]

2) Replace each decimal value with its 8-bit binary representation.

[97 = 01100001] [98 = 01100010] [99 = 01100011] [100 = 01100100]

3) Split the bits into groups of 6:

[011000] [010110] [001001] [100011] [011001] [00]

The last group has only 2 bits, so we fill it up with four zero bits (“0000”) and get

[011000] [010110] [001001] [100011] [011001] [000000] ==

Now every group has 6 bits. We also added two equals signs at the end to show that four zero bits were added, which helps at decoding time.

4) Convert each 6-bit group from binary to decimal:

[011000 = 24] [010110 = 22] [001001 = 9] [100011 = 35] [011001 = 25] [000000 = 0] ==

5) Convert these decimal values to base64 characters using the base64 chart:

[24 = Y] [22 = W] [9 = J] [35 = j] [25 = Z] [0 = A] ==

So “abcd” results in “YWJjZA==”

Reversing these steps results in BASE64 decoding.
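To tie the padding rule together, here is a rough sketch of an encoder and decoder written by hand in Python. The function names `manual_b64encode` and `manual_b64decode` are hypothetical helpers invented for this post; in real code you would normally reach for the standard `base64` module instead.

```python
import string

# Standard Base64 chart: A-Z, a-z, 0-9, '+', '/'
ALPHABET = string.ascii_uppercase + string.ascii_lowercase + string.digits + "+/"

def manual_b64encode(data: bytes) -> str:
    """Encode bytes by following the manual steps, including padding."""
    bits = "".join(format(b, "08b") for b in data)
    # Fill the last group with zero bits so the total is a multiple of 6.
    bits += "0" * (-len(bits) % 6)
    chars = [ALPHABET[int(bits[i:i + 6], 2)] for i in range(0, len(bits), 6)]
    # Add '=' until the output length is a multiple of 4.
    return "".join(chars) + "=" * (-len(chars) % 4)

def manual_b64decode(text: str) -> bytes:
    """Reverse the steps: chart lookup, 6-bit groups, then 8-bit bytes."""
    stripped = text.rstrip("=")
    bits = "".join(format(ALPHABET.index(ch), "06b") for ch in stripped)
    # Ignore the trailing zero bits that do not make up a full byte.
    usable = len(bits) - len(bits) % 8
    return bytes(int(bits[i:i + 8], 2) for i in range(0, usable, 8))

print(manual_b64encode(b"abcd"))      # YWJjZA==
print(manual_b64decode("YWJjZA=="))   # b'abcd'
```

This sketch does no input validation, so it assumes the text passed to `manual_b64decode` is well-formed Base64.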

## Does BASE64 help with compression?

The base64 representation of a string of size n bytes has size ceil(n/3)*4.

Let's take a string of size 16 KB. Putting it into the formula, ceil(16*1024/3)*4 gives 21,848 bytes, which is about 21.3 KB. We clearly see that the size increases (by roughly a third), so base64 doesn't help with compression.
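The arithmetic can be checked with a few lines of Python. The helper name `base64_length` is made up for this example, and the all-zero input is just a stand-in for a 16 KB file.

```python
import base64

def base64_length(n: int) -> int:
    """Size in bytes of the padded Base64 output for n input bytes."""
    # (n + 2) // 3 is ceil(n / 3) done with integer arithmetic.
    return (n + 2) // 3 * 4

n = 16 * 1024                         # 16 KB of input
print(base64_length(n))               # 21848
print(len(base64.b64encode(bytes(n))))  # 21848, matches the formula
```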
