Intro to x86-64

Look at x86-64 assembly using radare2

Radare2 is a framework for reverse engineering and analyzing binaries. It can disassemble binaries (translate machine code into assembly, which is human-readable) and debug them (by letting a user step through the execution and view the state of the program).

We first execute the program intro by running ./intro.

From the execution we can see that the program creates two variables and swaps their values.

Examining the program with radare2.

To open the program in radare2 we type r2 -d intro.

This will open the binary in debugging mode. Once the binary is open, one of the first things to do is ask r2 to analyze the program with the command aa which is the most common analysis command. It analyses all symbols and entry points in the executable.

Then run e asm.syntax=intel to set the disassembly syntax to Intel.

In this case, the analysis involves extracting function names, flow control information and much more. r2 commands are usually built from single characters, so it is easy to get more information about them. For general help, run ?. For help on a specific area, for example analysis, run a?.

Once the analysis is complete, we want to know where to start analyzing from - most programs have an entry point defined as main. To find a list of the functions run afl.

Above we can see that there is a main function. To examine the assembly code at main we run the command pdf @main, where pdf means print disassembly function. Doing so will give us the following.

In the figure above, the values in the leftmost column are the memory addresses of the instructions; the instructions themselves live in the program's code (text) section, not on the stack. The middle column contains the instructions encoded as bytes (the machine code), and the last column contains the human-readable instructions.

The core of assembly language involves using registers to do the following:

  • Transfer data between memory and register, and vice versa

  • Perform arithmetic operations on registers and data

  • Transfer control to other parts of the program.

Since the architecture is x86-64, the registers are 64 bits wide, and Intel defines 16 general-purpose registers:

64 bit    32 bit
rax       eax
rbx       ebx
rcx       ecx
rdx       edx
rsi       esi
rdi       edi
rsp       esp
rbp       ebp
r8        r8d
r9        r9d
r10       r10d
r11       r11d
r12       r12d
r13       r13d
r14       r14d
r15       r15d

Even though the registers are 64 bit, meaning they can hold up to 64 bits of data, smaller parts of a register can also be referenced. The table shows the 32-bit names, but the same registers can also be referenced as 16-bit values and as 8-bit values (the high byte and the low byte of the 16-bit part).

64-bit -> rax; 32-bit -> eax; 16-bit -> ax; 8-bit (high byte of ax) -> ah; 8-bit (low byte of ax) -> al
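As a rough illustration of how these names alias parts of one register, the sketch below models rax as a Python integer and extracts the narrower views with masks and shifts. The function name sub_registers and the sample value are made up for this example.

```python
def sub_registers(rax):
    """Return the narrower views of a 64-bit rax value."""
    rax &= 0xFFFFFFFFFFFFFFFF      # keep only 64 bits
    eax = rax & 0xFFFFFFFF         # low 32 bits
    ax = rax & 0xFFFF              # low 16 bits
    ah = (rax >> 8) & 0xFF         # bits 8-15, the high byte of ax
    al = rax & 0xFF                # bits 0-7, the low byte of ax
    return {"rax": rax, "eax": eax, "ax": ax, "ah": ah, "al": al}

views = sub_registers(0x1122334455667788)
for name, value in views.items():
    print(f"{name} = {value:#x}")
```

Note that ah and al are each a full byte (8 bits) and together they make up ax.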

The first 6 registers are general purpose registers. rsp is the stack pointer and points to the top of the stack, that is, to the most recently pushed value. The stack is a region of memory that programs use for local variables, saved registers and return addresses. rbp is the frame pointer and points to the frame of the function currently being executed; every function call gets a new frame. To move data using registers, the following instruction is used:

mov destination, source

This involves:

  • Transferring constants (mov rax, 3 would move the constant 3 to the rax register)

  • Transferring values from a register (mov rbx, rax would move the value in rax to rbx)

  • Transferring values to and from memory, which is shown by putting a register inside brackets (mov [rbx], rax would move the value stored in rax into the memory location whose address is in rbx, while mov rax, [rbx] would load the value at that location into rax).
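The three forms of mov can be mimicked with a toy machine state. This is only a sketch: the dictionaries registers and memory and the address 0x1000 are invented for illustration and say nothing about how a real CPU stores its state.

```python
# Toy machine state: register names are real, the address is hypothetical.
registers = {"rax": 0, "rbx": 0}
memory = {}

registers["rax"] = 3                          # mov rax, 3      (constant)
registers["rbx"] = registers["rax"]           # mov rbx, rax    (register to register)
registers["rbx"] = 0x1000                     # mov rbx, 0x1000 (rbx now holds an address)
memory[registers["rbx"]] = registers["rax"]   # mov [rbx], rax  (store to memory)
registers["rax"] = 0                          # mov rax, 0
registers["rax"] = memory[registers["rbx"]]   # mov rax, [rbx]  (load from memory)

print(registers, memory)
```

The brackets are what turn rbx from "the value in rbx" into "the memory that rbx points to".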

Some other important instructions are:

  • lea destination, source: This instruction sets the destination to the address denoted by the expression in source.

  • add destination, source: destination = destination + source

  • sub destination, source: destination = destination - source

  • imul destination, source: destination = destination * source

  • sal destination, source: shift the bits of destination to the left by source positions

  • sar destination, source: shift the bits of destination to the right by source positions, preserving the sign bit

  • xor destination, source: destination = destination XOR source

  • and destination, source: destination = destination AND source

  • or destination, source: destination = destination OR source
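To make the list above concrete, here is a minimal sketch that models a few of these instructions on 64-bit values in Python. The helper names are invented for this example, flags are ignored, and the masking of shift counts that real hardware performs is not modelled.

```python
MASK64 = 0xFFFFFFFFFFFFFFFF  # registers wrap around at 64 bits


def to_signed(value):
    """Interpret a 64-bit pattern as a two's complement signed integer."""
    value &= MASK64
    return value - (1 << 64) if value >> 63 else value


def add(dst, src):
    return (dst + src) & MASK64


def sub(dst, src):
    return (dst - src) & MASK64


def imul(dst, src):
    return (to_signed(dst) * to_signed(src)) & MASK64


def sal(dst, count):
    return (dst << count) & MASK64


def sar(dst, count):
    # Python's >> on a negative int is arithmetic, so the sign is preserved.
    return (to_signed(dst) >> count) & MASK64


def lea(base, index, scale, disp):
    """lea dst, [base + index*scale + disp]: computes an address, reads no memory."""
    return (base + index * scale + disp) & MASK64


minus_eight = sub(0, 8)
print(to_signed(sar(minus_eight, 1)))   # arithmetic shift keeps the sign
print(hex(lea(0x1000, 2, 4, 8)))        # address arithmetic only
```

xor, and and or map directly onto Python's ^, & and | operators, so they are left out of the sketch.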

If Statements

The general format of an if statement is:

if(condition){

  do-stuff-here

}else if(condition){ //this is an optional condition

  do-stuff-here

}else {

  do-stuff-here

}

If statements rely on three kinds of instructions in assembly: cmp, test and the jump instructions.

  • cmp source1, source2: it is like computing source1 - source2 without storing the result (if both sources are equal the result is 0 and ZF, the zero flag, is set to 1)

  • test source1, source2: it is like computing source1 AND source2 without storing the result (if the two sources have no set bits in common the result is 0 and ZF is set to 1)

Jump instructions are used to transfer control to different instructions, and there are different types of jumps:

Jump Type    Description
jmp          Unconditional
je           Equal/Zero
jne          Not Equal/Not Zero
js           Negative
jns          Nonnegative
jg           Greater
jge          Greater or Equal
jl           Less
jle          Less or Equal
ja           Above (unsigned)
jb           Below (unsigned)

The last 2 rows of the table refer to unsigned integers. Unsigned integers cannot be negative, while signed integers represent both positive and negative values. Since the same bits can be read either way, there are separate jump instructions for each interpretation. Signed integers use the two’s complement representation, and unsigned integers use plain binary.
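The sketch below shows how one bit pattern gives different answers under the two interpretations, which is why jg and ja can disagree. The 32-bit width and the helper names are assumptions made for the example.

```python
BITS = 32
MASK = (1 << BITS) - 1


def as_unsigned(pattern):
    """Plain binary: every bit has a positive weight."""
    return pattern & MASK


def as_signed(pattern):
    """Two's complement: the top bit carries a weight of -2**(BITS-1)."""
    pattern &= MASK
    return pattern - (1 << BITS) if pattern >> (BITS - 1) else pattern


a, b = 0xFFFFFFFF, 0x00000001

jg_taken = as_signed(a) > as_signed(b)      # signed comparison, as jg uses
ja_taken = as_unsigned(a) > as_unsigned(b)  # unsigned comparison, as ja uses

print(as_signed(a), as_unsigned(a))  # -1 4294967295
print(jg_taken, ja_taken)            # False True
```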

Let's analyze a program with if statements:

In the figure above we can see the main function. To analyse it we first set a breakpoint on the jge and the jmp instructions using the commands:

db 0x55ae52836612 (which is the hex address of the jge instruction)

db 0x55ae52836618 (which is the hex address of the jmp instruction)

We have added breakpoints to stop the execution of the program at those points so that we can see the state of the program.

We now run dc to start the execution of the program and stop at the first breakpoint. Before the first breakpoint this is what happens:

  • The first 2 lines save the base pointer by pushing it onto the stack, then copy the value of the stack pointer into the base pointer.

  • The next 3 lines assign the values 3 and 4 to the local variables var_8h and var_4h, then load the value of var_8h into the eax register.

  • The cmp instruction compares the value of eax with var_4h.

To view the value of the registers we type dr. Below we have the value of the registers at the beginning of the program and before hitting the breakpoint.

We can see that the value in rax is 3 when we hit the breakpoint. After the compare, the instruction will jump if eax is greater than or equal to the value in var_4h. Looking at the main function, we can see that var_4h has the value 4 assigned to it.

So eax contains 3, and 3 is not greater than or equal to 4, which means the jump will not occur and execution moves to the next instruction. We can check this by stepping to the next instruction using ds.
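The decision made by the cmp and jge pair can be replayed in a few lines. The values come from the walkthrough above; the instruction text in the comments is an approximation of the disassembly, not a copy of it.

```python
# Variable names mirror radare2's labels for the locals.
var_8h = 3
var_4h = 4

eax = var_8h                 # mov eax, dword [var_8h]
jge_taken = eax >= var_4h    # cmp eax, dword [var_4h] followed by jge

print(f"eax = {eax}, jge taken: {jge_taken}")
```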

Solving the questions for the if2 binary

To answer the first question, let's first analyze the main function:

There are three variables:

  • var_ch that is assigned with value 0

  • var_8h that is assigned with value 0x63 which is the hex value for the decimal 99

  • var_4h that is assigned with the value 0x3e8 which is the hex value for the decimal 1000

The value of var_ch is loaded into eax, which is then compared with var_8h. If eax is greater than or equal to 99 (var_8h) the program jumps to an address further ahead, but we know that eax is 0, so it won't jump. Then the value of var_8h is loaded into eax and compared with 1000 (var_4h). Once again, if eax is greater than or equal to 1000 it jumps; eax is now 99, so it does not jump. Next comes the and instruction, which performs a bitwise AND between that value (0x63) and 0x64. To follow this and operation, we can look at how it works at the binary level.

0x63 = 1100011

0x64 = 1100100

    1100011
and 1100100
       =
    1100000

1100000 = 0x60 = 96
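The same calculation can be double-checked in Python, along with the hex-to-decimal conversions used above. This is just a quick sanity check, not part of the binary.

```python
var_8h = 0x63
mask = 0x64
result = var_8h & mask

print(f"{var_8h:07b} AND {mask:07b} = {result:07b}")
print(hex(result), result)
print(0x63, 0x3e8)   # decimal values of the two constants assigned in main
```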

Since this is the only operation done to var_8h before the pop and ret instructions, 96 is the answer to the first question.

Continuing with the examination of the program's flow, after the bitwise and operation the program jumps to the address 0x563ff0c00630, which subtracts 0x4b0 from var_4h. This is the last instruction related to the variables before the pop and ret instructions, meaning that the value of var_ch remains 0.

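As we have already seen, the instruction at 0x563ff0c00630 subtracts a constant from var_4h.

var_4h = 0x3e8 = 1000

The operand is given above as 0x4b0, but 0x4b0 is 1200 in decimal, which would leave 1000 - 1200 = -200. A final value of 1 requires subtracting 0x3e7 = 999:

1000 - 999 = 1

So the value of var_4h before the pop and ret instructions is 1, provided the operand shown in the disassembly is 0x3e7.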

The symbol that represents the and instruction is &.

Loops

Usually two types of loops are used: for loops and while loops. The general format of while loops is:

while(condition){

  Do-stuff-here

  Change value used in condition

}

The general format of a for loop is:

for(initialise value; condition; change value used in condition){

  do-stuff-here

}

Let's analyse the following binary.

We start by setting a breakpoint at the jmp instruction.

Doing this allows us to skip the first few instructions which, as we saw with if statements, just store values in local variables.

Once execution reaches the breakpoint at the jmp instruction, run ds to move to the next instruction. Since this is an unconditional jump, it will move to the cmp instruction.

Here the cmp instruction compares the value in the local variable var_ch with 8. To see what’s in var_ch, we look up its location at the start of the disassembled function and inspect that memory. In this case, it is rbp-0xc.

This shows that it contains 4. The next instruction is a jle, which checks whether the value in var_ch is less than or equal to 8. Since 4 is less than 8, it will jump to the add instruction.

The add instruction will add 2 to the value of var_ch and continue on to the cmp instruction. Since 2 was added to var_ch, var_ch will now contain 6, which is still less than 8, so it will jump back to the add instruction. This can be seen by continuing execution using the ds command. We know this is a loop because the add instruction is executed more than once, in combination with comparing the value of var_ch to 8. Because jle also jumps when the two values are equal, we can infer the structure of the loop to be:

while(var_ch <= 8){

 var_ch = var_ch + 2

}
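The loop can be replayed outside the debugger to see every value var_ch takes. This sketch assumes the starting value of 4 read from rbp-0xc and models jle as a less-than-or-equal test, so the body also runs when var_ch is exactly 8.

```python
var_ch = 4                 # initial value observed at rbp-0xc
trace = [var_ch]

while var_ch <= 8:         # cmp var_ch, 8 followed by jle back to the add
    var_ch += 2            # add var_ch, 2
    trace.append(var_ch)

print(trace)               # [4, 6, 8, 10]
```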

These questions are about the binary loop2. Let's get an overview of what the program does by analyzing its main function.

The program has three variables: var_ch, var_8h and var_4h. It starts by assigning the value 0x14 (decimal 20) to var_ch and the value 0x16 (decimal 22) to var_8h; it then zeroes var_4h and assigns it a value of 4. Next, the program jumps to a compare of var_4h with 0x63 (decimal 99). If var_4h is less than or equal to 99, it jumps to the address at ...61c, where var_ch is ANDed with the value 2. Then comes a sar instruction that shifts the bits of var_8h to the right. After that, the value of var_4h (4) is loaded into the edx register and copied to the eax register; eax is added to itself and then edx is added to it, and the result (now 12) is stored back into var_4h. Execution then returns to the compare and the loop runs again.

Before the first iteration of the loop, the variable var_8h has the value 0x16, or 0 0 0 1 0 1 1 0 in bits. Since the sar instruction shifts the bits to the right, the value of var_8h after the first iteration is 0 0 0 0 1 0 1 1 (decimal 11). On the second iteration, the bits are shifted again and become 0 0 0 0 0 1 0 1, which is the binary form of 5, the answer to the first question.
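The two shifts can be reproduced with Python's right-shift operator. The shift count of 1 is an assumption inferred from the bit patterns above, since the operand of the sar instruction is not quoted in the text.

```python
var_8h = 0x16
for iteration in (1, 2):
    var_8h >>= 1           # sar dword [var_8h], 1 (count assumed)
    print(f"after iteration {iteration}: {var_8h:08b} = {var_8h}")
```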

The value of var_ch at the beginning of the program is 0x14, the hexadecimal equivalent of 20. It is then ANDed with 2, so let's have a look:

0 0 0 1   0 1 0 0 
0 0 0 0   0 0 1 0 

0 0 0 0   0 0 0 0

A bitwise AND of 20 and 2 gives 0. Once var_ch is 0, every later iteration that ANDs it with 2 again will also give 0. This is the answer to the second question.

To answer this question, we can type ds until we get to the end of the iteration and check the value of var_8h with the command px @rbp-0x8.

We have already seen that the value of var_ch will always be 0 after the first iteration. We can confirm that by typing px @rbp-0xc.
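As a cross-check on the debugger output, the whole loop can be simulated from the description of main given earlier. This is a sketch under assumptions: the shift count is taken to be 1, and the add sequence is modelled as eax + eax + edx, which is what produces the value 12 mentioned in the overview.

```python
var_ch, var_8h, var_4h = 0x14, 0x16, 4
history = []

while var_4h <= 0x63:          # cmp var_4h, 0x63 followed by jle into the body
    var_ch &= 2                # and var_ch, 2
    var_8h >>= 1               # sar var_8h, 1 (shift count assumed to be 1)
    edx = var_4h               # mov edx, var_4h
    eax = edx                  # mov eax, edx
    eax += eax                 # add eax, eax
    eax += edx                 # add eax, edx, giving 3 * var_4h
    var_4h = eax               # mov var_4h, eax
    history.append((var_ch, var_8h, var_4h))

for number, state in enumerate(history, start=1):
    print(number, state)
```

Each tuple is (var_ch, var_8h, var_4h) at the end of an iteration, which is the same information px shows when reading rbp-0xc, rbp-0x8 and rbp-0x4.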

Crackme1

This crackme's password was the standard IP address of localhost.

Crackme2

This crackme has a lot of code, but most of it we can ignore. What is important is that it opens a secret file in the directory and then reverses the order of the string in the file. If the password is the string of that file in reverse order, it prints the "Correct Password" message.
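The check described above can be sketched in a few lines. Everything concrete here is hypothetical: the file name secret.txt, its contents, and the stripping of the trailing newline are assumptions for the demo and are not taken from the binary.

```python
import os
import tempfile


def check_password(candidate, secret_path):
    """Return True when candidate equals the file's contents reversed."""
    with open(secret_path) as handle:
        secret = handle.read().strip()   # newline handling is an assumption
    return candidate == secret[::-1]


# Demo with a throwaway file; the real file name and contents are not shown here.
with tempfile.TemporaryDirectory() as folder:
    path = os.path.join(folder, "secret.txt")   # hypothetical name
    with open(path, "w") as handle:
        handle.write("example\n")
    accepted = check_password("elpmaxe", path)
    rejected = check_password("example", path)

print(accepted, rejected)
```

In practice this means reading the secret file and typing its contents backwards as the password.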
