I took part in a Red Hat internal CTF challenge run by Keith Swagler and the Red Hat Information Security Team last year. There were many awesome and interesting challenges, but I’d like to write about the “bse64” challenge. Along the way I’ll take a closer look at the base64 encoding itself and explain some basics.

## Challenge description

Developer Paul said removing one specific letter from a Base64 encoded string was good enough to stop attackers for a few years. Can you prove him wrong?
ZxhZ3tOHRf24wdWdoXzBiZnUkPEB0TBufQo=

## First try: decoding the line

The first step for me was to try to decode the line and see if it works somehow. So, I ran this command in my Linux terminal and took a look at the decoded output:

\$ base64 -d <<< ZxhZ3tOHRf24wdWdoXzBiZnUkPEB0TBufQo=
gYE՝|Ԑ0n}

As expected, not much help here: no patterns, no clues, nothing. To solve this challenge we need to look closer at the base64 encoding itself and learn some fundamentals.
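The same check can be done from Python. The snippet below is a minimal sketch that uses only the standard library `base64` module; the variable names are chosen for illustration.

```python
import base64

challenge = "ZxhZ3tOHRf24wdWdoXzBiZnUkPEB0TBufQo="

# The damaged string still has a valid length (a multiple of 4),
# so it decodes without raising an error.
raw = base64.b64decode(challenge)

# Count how many of the decoded bytes are printable ASCII.
printable = sum(0x20 <= b < 0x7F for b in raw)
print(raw)
print(f"{printable} of {len(raw)} bytes are printable ASCII")
```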

## Base64 101

Base64 is a binary-to-text encoding scheme used to represent binary data as text, for example to send files over the Internet. A Base64 string can only contain the characters A-Z, a-z, 0-9, + and / (64 characters in total), plus = for padding. Each of these 64 characters can be represented by 6 bits. The encoding rule takes three 8-bit bytes (24 bits) and splits them into four 6-bit groups. Each 6-bit group is then stored in its own 8-bit byte, so the two leftmost bits are filled with 0s and only the six rightmost bits carry a value.

It probably sounds a bit complicated, but it really isn’t. Let’s take a look at an example and everything will become clearer.

## Base64 encoding example

Let’s take the string OIL as an example.

The first step is to convert each letter to its ASCII code representation:

O I L -> 79 73 76

Then convert all numbers to binary format (8 digits == 8 bits):

79 73 76 -> 01001111 01001001 01001100

Now we need to create four 6-bit groups. To do this, let’s merge all the bits together first:

01001111 01001001 01001100 -> 010011110100100101001100

Now we can break our line down into four 6-bit groups (6 digits == 6 bits):

010011110100100101001100 -> 010011 110100 100101 001100

Since each group should be 8 bits long, we have to add 0s to the left side of each group to fit the 8-bit format:

010011 110100 100101 001100 -> 00010011 00110100 00100101 00001100

The final part is to convert all the binary digits to letters. For this let’s convert the binary digits back to decimal first:

00010011 00110100 00100101 00001100 -> 19 52 37 12

The last step is to convert the decimals to letters using the Base64 encoding table (see below):

19 52 37 12 -> T 0 l M

Encoding is completed. We’ve just encoded the string OIL to the T0lM base64 string.
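The steps above can be sketched in Python. This is only an illustration for full 3-byte chunks; the function name `encode_three_bytes` and the `ALPHABET` constant are assumptions made for this example, and the result is compared with the standard library `base64` module.

```python
import base64
import string

# Standard base64 alphabet: the value 0-63 is the index of its character.
ALPHABET = string.ascii_uppercase + string.ascii_lowercase + string.digits + "+/"


def encode_three_bytes(chunk: bytes) -> str:
    """Encode exactly three bytes into four base64 characters."""
    if len(chunk) != 3:
        raise ValueError("this sketch only handles full 3-byte chunks")
    # Merge the three 8-bit values into one 24-character bit string.
    bits = "".join(f"{byte:08b}" for byte in chunk)
    # Split it into four 6-bit groups and convert each group to decimal.
    values = [int(bits[i:i + 6], 2) for i in range(0, 24, 6)]
    # Look each decimal value up in the alphabet.
    return "".join(ALPHABET[value] for value in values)


print(encode_three_bytes(b"OIL"))          # hand-made encoder
print(base64.b64encode(b"OIL").decode())   # standard library, for comparison
```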

## Base64 encoding table

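The base64 encoding table maps each value from 0 to 63 to one character: 0-25 are the letters A-Z, 26-51 are the letters a-z, 52-61 are the digits 0-9, 62 is + and 63 is /.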
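For reference, the standard table can be printed with a few lines of Python. This is a small sketch; the `ALPHABET` name and the four-column layout are choices made for this example.

```python
import string

# Values 0-25 are A-Z, 26-51 are a-z, 52-61 are 0-9, 62 is "+" and 63 is "/".
ALPHABET = string.ascii_uppercase + string.ascii_lowercase + string.digits + "+/"

# Print the table as 16 rows of 4 "value character" pairs.
for row in range(16):
    pairs = (f"{row + 16 * col:2d} {ALPHABET[row + 16 * col]}" for col in range(4))
    print("    ".join(pairs))
```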

## “Missing” bytes

Three letters give us exactly three 8-bit groups (24 bits), which split evenly into four 6-bit groups, like in the example above. But what if we want to encode 2 letters or even 1 letter? In this case we can’t fill four 6-bit groups: the last incomplete group is filled with 0 bits on the right, and the “=” character is put in the place of each missing group. It means that, whatever the length of our input, a base64 string can only have one or two “=” characters at the end (or none if the input length is a multiple of 3).
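A quick way to see the padding in action is to encode 1, 2 and 3 letters with the standard library `base64` module. The inputs below are just prefixes of the OIL example.

```python
import base64

for text in (b"O", b"OI", b"OIL"):
    encoded = base64.b64encode(text).decode()
    # One input byte leaves two groups missing, two input bytes leave one.
    print(f"{text.decode()!r} -> {encoded} ({encoded.count('=')} padding characters)")
```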

Different implementations, however, may use other characters for the last two entries of the table and for the padding character (“=”).
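One widely used variant is the URL-safe alphabet, which replaces + and / with - and _. The sketch below compares the two with the standard library; the two input bytes are picked only so that the values 62 and 63 show up in the output.

```python
import base64

data = b"\xfb\xff"  # chosen so that the values 62 and 63 appear in the result

standard = base64.b64encode(data).decode()
url_safe = base64.urlsafe_b64encode(data).decode()

print(standard)  # uses "+" and "/"
print(url_safe)  # uses "-" and "_" instead
```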

## Back to the challenge

Based on the information above we need to find the missing letter (or letters) and recover the string from the description. We don’t need to recover the whole string at once; instead, we can break the string down into groups of 4 characters and try to recover each small group.
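Each group of 4 characters decodes to 3 bytes on its own, so the groups can be inspected one at a time. A minimal sketch of that split, with variable names chosen for illustration:

```python
import base64

challenge = "ZxhZ3tOHRf24wdWdoXzBiZnUkPEB0TBufQo="

# Cut the string into consecutive groups of 4 characters.
groups = [challenge[i:i + 4] for i in range(0, len(challenge), 4)]

for group in groups:
    # Decode every group independently of its neighbours.
    print(group, base64.b64decode(group))
```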

## Finding the missing letter

The description says that there is a missing letter in the string, and finding it is the simplest part of this challenge. We know the format of the answer, which is “flag{}”. Let’s take the “flag{” part and try to encode it with base64 in the Linux terminal:

\$ base64 <<< flag{
ZmxhZ3sK

And then compare with the string from the description:

ZmxhZ3sK | ZxhZ3tOHRf24wdWdoXzBiZnUkPEB0TBufQo=

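Here we can see that our encoded line has an additional m as its second character: ZmxhZ3 versus ZxhZ3. The trailing sK differs only because our input continues with a newline instead of the rest of the flag. It means we’ve found the missing letter, and now we can try to recover the whole string.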
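The comparison can also be done in Python. This is a small sketch; note that the here-string in the shell command appends a newline, so the equivalent input is `flag{` followed by `\n`.

```python
import base64

challenge = "ZxhZ3tOHRf24wdWdoXzBiZnUkPEB0TBufQo="

# `base64 <<< flag{` encodes the text plus a trailing newline.
expected = base64.b64encode(b"flag{\n").decode()

# Find the first position where the two strings disagree.
first_diff = next(i for i, (a, b) in enumerate(zip(expected, challenge)) if a != b)

print(expected)
print(f"first difference at index {first_diff}: {expected[first_diff]!r}")
```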

## Recovering the string: the algorithm

I used a small Python script to brute-force the 4-letter groups in the line and figure out whether a specific group of 4 letters needs an additional m letter or not.

Note: The description tells us about a missing letter. It means it can be lowercase or uppercase (m or M).

To do that I took 3 random letters and encoded them to a base64 string. Then I compared the resulting base64 string with the group of 4 letters from the original base64 string. The possible results here are:

• A full match with the 4 letters in the original string.
• A partial match with the first 3 letters from the original base64 string, with m or M somewhere between these letters. In this case, we need to move the last letter from the 4-letter group of the original base64 string to the next round.
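The script itself is not reproduced here, so what follows is only a rough sketch of one way to automate the recovery, not the original automation. Instead of encoding random 3-letter strings, it uses a different technique: a backtracking search that tries to re-insert m or M at every position and keeps a 4-character group only if it decodes to printable ASCII (or a newline). The names `recover`, `group_ok` and `ALLOWED`, and the `max_inserts` limit, are assumptions made for this illustration.

```python
import base64

# Assumption: the flag consists of printable ASCII, optionally ending with a newline.
ALLOWED = set(range(0x20, 0x7F)) | {0x0A}


def group_ok(group: str) -> bool:
    """Return True if a complete 4-character group decodes to allowed bytes only."""
    try:
        decoded = base64.b64decode(group, validate=True)
    except ValueError:
        return False
    return all(byte in ALLOWED for byte in decoded)


def recover(damaged: str, letters: str = "mM", max_inserts: int = 5) -> list:
    """Collect every way to re-insert `letters` so that all groups decode cleanly."""
    results = []

    def walk(pos: int, built: str, inserts: int) -> None:
        # Validate each group of four characters as soon as it is complete.
        if built and len(built) % 4 == 0 and not group_ok(built[-4:]):
            return
        if pos == len(damaged):
            if len(built) % 4 == 0:
                results.append(built)
            return
        # Option 1: keep the next character of the damaged string.
        walk(pos + 1, built + damaged[pos], inserts)
        # Option 2: put one of the candidate letters back in front of it.
        if inserts < max_inserts:
            for letter in letters:
                walk(pos, built + letter, inserts + 1)

    walk(0, "", 0)
    return results


challenge = "ZxhZ3tOHRf24wdWdoXzBiZnUkPEB0TBufQo="
for candidate in recover(challenge):
    print(candidate, base64.b64decode(candidate))
```

The search may return more than one candidate, because other insertions can also happen to decode to printable text. Filtering the candidates on the known flag{ prefix narrows the list down.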

## Recovering the string: the process

I ran the script and got the results:

## Challenge results

In the end I got this full base64 string with all missing letters in place:

ZmxhZ3tOMHRfM24wdWdoXzBiZnUkPEB0MTBufQo=

And also the flag:

flag{N0t_3n0ugh_0bfu\$<@t10n}
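As a final sanity check, the recovered string can be decoded in Python and compared against the challenge string. This is a minimal sketch with variable names chosen for illustration.

```python
import base64

challenge = "ZxhZ3tOHRf24wdWdoXzBiZnUkPEB0TBufQo="
recovered = "ZmxhZ3tOMHRfM24wdWdoXzBiZnUkPEB0MTBufQo="

# Decoding the recovered string gives the flag followed by a newline.
flag = base64.b64decode(recovered).decode().strip()
print(flag)

# Removing every m and M from the recovered string gives back the challenge string.
print(recovered.replace("m", "").replace("M", "") == challenge)
```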