# Chapter 2 - Stack Overflows

## C and C++ Lack Bounds-Checking

```c
#include <stdio.h>
int main()
{
    int array[5] = {1, 2, 3, 4, 5};
    printf("%d\n", array[5]);
}
```

In this program, the programmer created a five-element array in C. The programmer made a mistake by forgetting that an array of size five begins with element zero, `array[0]`, and ends with element four, `array[4]`. The program therefore tries to print what the programmer believed to be the fifth element of the array, but in reality it reads beyond the array, into a nonexistent "sixth" element.

We can compile this program with `gcc`. Because we are compiling on a 64-bit system and the program in the book was compiled on a 32-bit one, we pass the `-m32` flag to build a 32-bit binary. To enable source-level debugging in `gdb`, we also pass the `-g` flag:

![](https://3889206050-files.gitbook.io/~/files/v0/b/gitbook-legacy-files/o/assets%2F-M6DIEHtstePxj4NCmCC%2F-M6bZjuQAWSTsDhJdNXv%2F-M6c6dRxEtAebVL0_Nal%2Fimage.png?alt=media\&token=9b80f5db-c068-44fb-9633-57b01bbce60a)

Notice that the `gcc` compiler doesn't report any errors, but when we execute the program the output is not what was expected.

![](https://3889206050-files.gitbook.io/~/files/v0/b/gitbook-legacy-files/o/assets%2F-M6DIEHtstePxj4NCmCC%2F-M6bZjuQAWSTsDhJdNXv%2F-M6c708sJBobVHMcGxxV%2Fimage.png?alt=media\&token=db211d2e-65c6-4f17-b5d7-708021c129aa)

This shows how easy it is to read past the end of a buffer. In a real-world scenario, this would be considered an *information disclosure vulnerability*.
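Because C does not check array indexes, any such check has to be written by hand. The sketch below is only an illustration and is not part of the original program: `checked_get` is a hypothetical helper that refuses to read an index outside the array instead of returning whatever lies beyond it.

```c
#include <stddef.h>

/* Hypothetical helper (not from the original program): copies arr[idx] into
 * *out and returns 1 when idx is inside the array; returns 0 and leaves *out
 * untouched when the index is out of bounds or a pointer is NULL. */
int checked_get(const int *arr, size_t len, size_t idx, int *out)
{
    if (arr == NULL || out == NULL || idx >= len) {
        return 0;   /* out of bounds: refuse to read */
    }
    *out = arr[idx];
    return 1;
}
```

With the five-element array from the example, `checked_get(array, 5, 4, &value)` succeeds, while `checked_get(array, 5, 5, &value)` returns 0 rather than leaking the adjacent word.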

{% hint style="info" %}
"A buffer is a limited, contiguously allocated set of memory, usually represented as an array in C."
{% endhint %}

We can now use `gdb` (GNU Debugger) to debug the program.

gdb commands:

* **list**: show source code
* **run**: execute program
* **break**: insert breakpoint
* **x**: examine memory
* **disassemble**: show assembly code
* **continue**: resume execution
* **info registers**: see registers
* **info proc mappings**: see memory map

![](https://3889206050-files.gitbook.io/~/files/v0/b/gitbook-legacy-files/o/assets%2F-M6DIEHtstePxj4NCmCC%2F-M6bZjuQAWSTsDhJdNXv%2F-M6bgG6BVwgUiTTaPavc%2Fimage.png?alt=media\&token=0d6c1de0-08f1-4729-962d-c57f4d56e39d)

In the image above we can see that the numbers 1 through 5 were stored on the stack, followed by an unrelated value that the program read as the "sixth" element of the array.

### Writing Past End of Array

```c
#include <stdio.h>
int main()
{
    int array[5];
    array[10000] = 1;
}
```

In this example, the programmer declares a five-element array and then writes 1 to `array[10000]`, far past the end of the array. Executing the program produces a "Segmentation fault" error. This means the program crashed, which is normally the desired outcome of a *Denial of Service* attack.

```
g666isildur@linux:~/cnit127/ch2$ ./ch2b 
Segmentation fault
```
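To get a feel for how far out of bounds that write lands, the sketch below computes the byte offset of an index from the start of an `int` array. The helper name `index_offset` is made up for illustration, and the result depends on `sizeof(int)`, which is 4 bytes on the 32-bit x86 target used here.

```c
#include <stddef.h>

/* Illustrative helper: number of bytes between the start of an int array and
 * the element at index idx. No memory is touched, so any index is safe here. */
size_t index_offset(size_t idx)
{
    return idx * sizeof(int);
}
```

With 4-byte integers, `index_offset(10000)` is 40000, while the whole array occupies only `5 * sizeof(int)`, or 20, bytes. The write therefore targets an address tens of thousands of bytes away from the array, which in this run was not writable memory, hence the crash.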

## The Stack

* LIFO (Last-In, First-Out)
* ESP (Extended Stack Pointer) register points to the top of the stack
* EBP is typically used as the base for calculating addresses on the stack (`mov eax, [ebp+0x10]` copies the data 16 bytes down the stack from EBP into the EAX register)
* PUSH puts items on the stack
* POP takes items off the stack
* The stack's purpose is to make the use of functions more efficient
* When a function is called, these things occur:
  * Calling routine stops processing its instructions
  * Saves its current state
  * Transfers control to the function
  * Function processes its instructions
  * Function exits
  * State of the calling function is restored
  * Calling routine's execution resumes
* In other words...
  * Push function's **arguments** onto the stack
  * Call function, which pushes the return address **RET** onto the stack, which is the **EIP** at the time the function is called
  * Before function starts, a **prolog** executes, pushing **EBP** onto the stack
  * It then copies **ESP** into **EBP**
  * Calculates size of local variables
  * Reserves that space on the stack, by subtracting the size from **ESP**
  * Pushes local variables onto stack
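The sequence above can be mimicked with a toy model in C. This is a sketch under stated assumptions, not how the CPU is implemented: the stack is an array of 4-byte words, ESP and EBP are byte offsets into it, arguments are pushed last-to-first as in the cdecl convention, and every name (`ToyStack`, `toy_call`, and so on) is invented for illustration. It performs no overflow checks, so callers must stay within `STACK_WORDS`.

```c
#include <stdint.h>

#define STACK_WORDS 64

/* Toy model of a 32-bit stack. ESP and EBP are byte offsets into mem. */
typedef struct {
    uint32_t mem[STACK_WORDS];
    uint32_t esp;   /* top of the stack (lowest used offset) */
    uint32_t ebp;   /* base of the current stack frame */
} ToyStack;

void toy_init(ToyStack *s)
{
    s->esp = STACK_WORDS * 4;   /* empty stack: ESP just past the last word */
    s->ebp = s->esp;
}

/* PUSH: move ESP down one word, then store the value there. */
void toy_push(ToyStack *s, uint32_t value)
{
    s->esp -= 4;
    s->mem[s->esp / 4] = value;
}

/* POP: read the word at ESP, then move ESP back up one word. */
uint32_t toy_pop(ToyStack *s)
{
    uint32_t value = s->mem[s->esp / 4];
    s->esp += 4;
    return value;
}

/* Caller side plus prolog for a two-argument function that needs
 * `locals` bytes of local variables. */
void toy_call(ToyStack *s, uint32_t a, uint32_t b, uint32_t ret, uint32_t locals)
{
    toy_push(s, b);        /* arguments, last one first (cdecl) */
    toy_push(s, a);
    toy_push(s, ret);      /* CALL pushes the return address */
    toy_push(s, s->ebp);   /* prolog: push ebp */
    s->ebp = s->esp;       /* prolog: mov ebp, esp */
    s->esp -= locals;      /* prolog: sub esp, locals */
}

/* Epilog and RET: discard the locals, restore the caller's EBP, and pop
 * the return address, which is where execution would resume. */
uint32_t toy_return(ToyStack *s)
{
    s->esp = s->ebp;
    s->ebp = toy_pop(s);
    return toy_pop(s);
}
```

After `toy_call`, the saved EBP sits at offset `ebp`, the return address at `ebp + 4`, and the arguments at `ebp + 8` and `ebp + 12`, which is the layout the gdb sessions in this chapter inspect.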

### Function Example

```c
#include <stdio.h>

/* Does no useful work: it exists only so we can inspect its stack frame. */
void function(int a, int b)
{
    int array[5] = {1, 2, 3, 4, 5};
}

int main()
{
    function(1, 2);
    printf("Returned from function\n");
}
```

This program does nothing besides calling `function()` and printing a message to the screen. Its purpose is to help us understand the stack frame.

We debug the program with `gdb`, set one breakpoint at the call to the function and another at the function's `ret` instruction, as shown in the screenshot, and then run the program.

![](https://3889206050-files.gitbook.io/~/files/v0/b/gitbook-legacy-files/o/assets%2F-M6DIEHtstePxj4NCmCC%2F-M6bZjuQAWSTsDhJdNXv%2F-M6bwj10TbRV9qBRSrUh%2Fimage.png?alt=media\&token=7da90762-9aaf-44ee-af19-4ecc30baba2a)

Then we can see that the program stops at the first breakpoint, before jumping to the function. Now we can type `info registers` to see where the stack frame of `main` starts and ends.

![](https://3889206050-files.gitbook.io/~/files/v0/b/gitbook-legacy-files/o/assets%2F-M6DIEHtstePxj4NCmCC%2F-M6bZjuQAWSTsDhJdNXv%2F-M6bxFoCKJV2sgaNnm-X%2Fimage.png?alt=media\&token=19cf250c-ebd8-4865-bd6b-f489f46450b4)

Now we know that the stack frame starts at `0xffffd618` and ends at `0xffffd610` (the stack frame goes from EBP to ESP and grows towards lower addresses).

We can now use `x` to examine the first 20 addresses of the memory in the stack frame of `main()`.

![](https://3889206050-files.gitbook.io/~/files/v0/b/gitbook-legacy-files/o/assets%2F-M6DIEHtstePxj4NCmCC%2F-M6bZjuQAWSTsDhJdNXv%2F-M6byx20B4DoeI2x2eWh%2Fimage.png?alt=media\&token=803bf02e-5dbf-4a78-94a0-82949f697dd0)

The stack frame of `main()` is the highlighted region.

Then we can analyze `function()` by typing `continue`, which lets the program run until the next breakpoint, at the function's return.

![](https://3889206050-files.gitbook.io/~/files/v0/b/gitbook-legacy-files/o/assets%2F-M6DIEHtstePxj4NCmCC%2F-M6bZjuQAWSTsDhJdNXv%2F-M6c-BgSvXoY6z4KfGxr%2Fimage.png?alt=media\&token=daf1dd32-bee8-4568-af0d-30f30c866ce2)

We can see that the stack frame goes from `0xffffd600` to `0xffffd5e0`. By examining `ESP` we can have a better look at `function()`'s stack frame, highlighted in the screenshot below.

![](https://3889206050-files.gitbook.io/~/files/v0/b/gitbook-legacy-files/o/assets%2F-M6DIEHtstePxj4NCmCC%2F-M6bZjuQAWSTsDhJdNXv%2F-M6c-vLBpSGNZzqenx8y%2Fimage.png?alt=media\&token=a59370a9-6144-48f7-b1c6-42b18fcd4858)

{% hint style="info" %}
To work out which words in the dump at ESP belong to the stack frame, remember that x86 is little-endian: the bytes of each multi-byte value are stored in reverse order, least significant byte first. The word at the top of `function()`'s frame holds the saved EBP of `main()`, `0xffffd618`. In memory it is laid out as the bytes `18 d6 ff ff`, from the lowest address to the highest, so a displayed word has to be read two hex digits (one byte) at a time from right to left to recover the order of the bytes in memory.
{% endhint %}
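The byte order described in the hint can be checked directly. The following is a minimal sketch with made-up helper names (`bytes_in_memory`, `is_little_endian`); it copies a 32-bit value into a byte array so that each byte can be inspected in address order.

```c
#include <stdint.h>
#include <string.h>

/* Copies the in-memory representation of value into out[0..3],
 * lowest address first. */
void bytes_in_memory(uint32_t value, unsigned char out[4])
{
    memcpy(out, &value, sizeof value);
}

/* Returns 1 when the machine stores the least significant byte first. */
int is_little_endian(void)
{
    unsigned char b[4];
    bytes_in_memory(1u, b);
    return b[0] == 1;
}
```

On a little-endian machine such as x86, `bytes_in_memory(0xffffd618, out)` fills `out` with `0x18, 0xd6, 0xff, 0xff`.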

By disassembling `main()` we can see how a function is called:

1. **push** arguments onto the stack
2. **call** the function

![](https://3889206050-files.gitbook.io/~/files/v0/b/gitbook-legacy-files/o/assets%2F-M6DIEHtstePxj4NCmCC%2F-M6bZjuQAWSTsDhJdNXv%2F-M6c3e7ngsqKDKDS2GJe%2Fimage.png?alt=media\&token=f20640ae-4f70-4cc9-ae3a-136a0a0aafc2)

We can also disassemble `function()` to see how the `prolog` is executed:

1. **push** EBP onto the stack
2. **mov** ESP into EBP, starting a new stack frame
3. **sub** from ESP, reserving room for local variables

![](https://3889206050-files.gitbook.io/~/files/v0/b/gitbook-legacy-files/o/assets%2F-M6DIEHtstePxj4NCmCC%2F-M6bZjuQAWSTsDhJdNXv%2F-M6c49A0xkR07bJNRLlG%2Fimage.png?alt=media\&token=7c4485b9-a852-42df-9d75-87fb20cd561d)

If we take a close look at the next word after `function()`'s stack frame, we can see that it holds the address of the next instruction to be executed in `main()`, that is, the return address.

![](https://3889206050-files.gitbook.io/~/files/v0/b/gitbook-legacy-files/o/assets%2F-M6DIEHtstePxj4NCmCC%2F-M6bZjuQAWSTsDhJdNXv%2F-M6c5fx-VmkRLDNESuZO%2Fimage.png?alt=media\&token=83b5edd9-f3af-4f6d-9f74-1ded6e2a95df)

### Stack Buffer Overflow Vulnerability

```c
#include <stdio.h>
void user_input(void)
{
    char buf[30];
    gets(buf);
    printf("%s\n", buf);
}
int main()
{
    user_input();
    return 0;
}
```

This function lets the user put as many characters into `buf` as they want, even though `buf` only has room for 30. When we compile the program, the compiler emits a warning.

![](https://3889206050-files.gitbook.io/~/files/v0/b/gitbook-legacy-files/o/assets%2F-M6DIEHtstePxj4NCmCC%2F-M6bZjuQAWSTsDhJdNXv%2F-M6fFnR_Iz3y_nme71pw%2Fimage.png?alt=media\&token=bb9e5476-52a6-4882-9723-8ed3b3b08bae)

The compiler says that the `gets` function is dangerous and should not be used. We can read the man page for `gets` to learn more about this function.

![](https://3889206050-files.gitbook.io/~/files/v0/b/gitbook-legacy-files/o/assets%2F-M6DIEHtstePxj4NCmCC%2F-M6bZjuQAWSTsDhJdNXv%2F-M6fGN0yh33I2GlN-RCt%2Fimage.png?alt=media\&token=baccb8b9-c4ce-4809-9ccc-6d7372cac147)

Now we know why we should never use `gets`. It is impossible to tell in advance how many characters it will read, and it will continue to store characters past the end of the buffer.
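The usual replacement is `fgets`, which takes the buffer size as an argument. The sketch below wraps it in a hypothetical helper, `read_line`, that is not part of the original program; it never stores more than `size - 1` characters plus the terminating NUL, and any excess input stays in the stream instead of spilling onto the stack.

```c
#include <limits.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical helper, shown as a bounded alternative to gets(): reads one
 * line from `in` into buf, storing at most size - 1 characters plus the
 * terminating NUL. Returns 1 on success, 0 on EOF, error, or bad arguments. */
int read_line(FILE *in, char *buf, size_t size)
{
    if (in == NULL || buf == NULL || size == 0 || size > INT_MAX) {
        return 0;
    }
    if (fgets(buf, (int)size, in) == NULL) {
        return 0;
    }
    buf[strcspn(buf, "\n")] = '\0';   /* drop the trailing newline, if present */
    return 1;
}
```

In `user_input()`, the call `gets(buf)` would become `read_line(stdin, buf, sizeof buf)`, so a long run of A's is truncated to 29 characters rather than overwriting the saved EBP and return address.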

By executing the program we can see that it takes user input and prints that same input back, but when we input a bunch of A's we get a segmentation fault error, indicating an illegal operation.

![](https://3889206050-files.gitbook.io/~/files/v0/b/gitbook-legacy-files/o/assets%2F-M6DIEHtstePxj4NCmCC%2F-M6bZjuQAWSTsDhJdNXv%2F-M6fHc1lPKRt_RS0IpKw%2Fimage.png?alt=media\&token=58152dfe-5b0b-499a-97dd-8da2ffc9a7e5)

Running the program in `gdb` and setting a breakpoint after `gets` lets us examine what is happening on the stack.

![](https://3889206050-files.gitbook.io/~/files/v0/b/gitbook-legacy-files/o/assets%2F-M6DIEHtstePxj4NCmCC%2F-M6bZjuQAWSTsDhJdNXv%2F-M6fJapIZPKOQ3jT7Xoh%2Fimage.png?alt=media\&token=56d2d326-1641-4d9f-b51f-4ae374809a8e)

In the screenshot above, the current stack frame is outlined in yellow; it runs from address `0xffffd648` (EBP) down to address `0xffffd620` (ESP). The current stack is highlighted, and within it, outlined in red, are the ASCII values for `HELLO` (stored in reverse, because x86 is little-endian). Just outside the stack frame, outlined in green, is the return address. Since the characters are shown as hexadecimal ASCII values, when we overwrite the stack with a bunch of A's we can expect to see `0x41`, the ASCII value of `A`. The ASCII table below confirms this.

![](https://3889206050-files.gitbook.io/~/files/v0/b/gitbook-legacy-files/o/assets%2F-M6DIEHtstePxj4NCmCC%2F-M6bZjuQAWSTsDhJdNXv%2F-M6fL0Oiw61TNi_zMNqP%2Fimage.png?alt=media\&token=e03f8fd8-1652-4a24-87af-694bb9c07556)

Running the program again, this time with a bunch of A's, we can see that the stack frame is filled with A characters and that the return address is overwritten with `0x41414141`.

![](https://3889206050-files.gitbook.io/~/files/v0/b/gitbook-legacy-files/o/assets%2F-M6DIEHtstePxj4NCmCC%2F-M6bZjuQAWSTsDhJdNXv%2F-M6fLgBPzQM57hJigIBT%2Fimage.png?alt=media\&token=47826786-4885-4591-ae18-92a97652401e)

By typing `continue` we can examine the crash and check the EIP value.

![](https://3889206050-files.gitbook.io/~/files/v0/b/gitbook-legacy-files/o/assets%2F-M6DIEHtstePxj4NCmCC%2F-M6bZjuQAWSTsDhJdNXv%2F-M6fM-7nxmMmGwtZR3cX%2Fimage.png?alt=media\&token=bee28153-7c38-475d-a566-1cf0992e1772)

The EIP value is now `0x41414141`, which means that it was controlled by user input.
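To see why a run of A's turns into exactly `0x41414141`, the following sketch reinterprets four input bytes as a 32-bit word, which is what happens when `ret` pops an overwritten return address into EIP. The helper name `word_from_input` is hypothetical, and the caller must pass at least four bytes.

```c
#include <stdint.h>
#include <string.h>

/* Interprets the first four bytes of input as a 32-bit word, the way a saved
 * return address overwritten by those bytes would be loaded into EIP.
 * The caller must supply at least four readable bytes. */
uint32_t word_from_input(const char *input)
{
    uint32_t word = 0;
    memcpy(&word, input, sizeof word);
    return word;
}
```

`word_from_input("AAAA")` is `0x41414141` on any machine, since all four bytes are identical. With distinct bytes the order matters: on little-endian x86, `"ABCD"` becomes `0x44434241`.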

{% content-ref url="chapter-2-stack-overflows/linux-buffer-overflow-with-command-injection" %}
[linux-buffer-overflow-with-command-injection](https://666isildur.gitbook.io/ethical-hacking/binary-exploitation-exploit-development/shellcoders-handbook/chapter-2-stack-overflows/linux-buffer-overflow-with-command-injection)
{% endcontent-ref %}

{% content-ref url="chapter-2-stack-overflows/linux-buffer-overflow-without-shellcode" %}
[linux-buffer-overflow-without-shellcode](https://666isildur.gitbook.io/ethical-hacking/binary-exploitation-exploit-development/shellcoders-handbook/chapter-2-stack-overflows/linux-buffer-overflow-without-shellcode)
{% endcontent-ref %}
