Wednesday, April 23, 2014

How a program works
A very early step for understanding Buffer Overflow


This is not a buffer overflow exploit, but the background you need to understand how the CPU and memory "collaborate" to execute a program.
I have read many articles about buffer overflows. Most of them start from a specific point and skip the basic knowledge one must have to deeply understand what is going on behind the scenes. I wrote this article to cover (I hope) this gap.

If at the end of this article you feel more comfortable with concepts like CALL and RETN, and with how a function uses memory (buffer, stack, etc.) while it executes, then I will feel that I succeeded... so, help me feel like a successful and nice person :))

First, I would like to point out that everything we say here is about the 32-bit x86 processor family. In addition, most memory addresses are written in decimal notation (for the sake of clarity for beginners) instead of the hexadecimal notation used on real-world systems.

Requirements in order to read this article:
1. A basic understanding of assembly language.
2. A basic understanding of the C language.
3. A basic understanding of a personal computer.
4. A basic understanding of English (I hope...).
5. None of the above... just an open mind, imagination and... frame.
Well... OK... I believe 4 and 5 are the most crucial, even though they contradict each other!

I hear you say: "Come on lamer, you have said too much!! Let's start..."

Ok then....
Every process is laid out in memory in three basic segments:
-Code Segment
-Data Segment (which includes the well-known BSS, the part that holds uninitialized globals)
-Stack Segment

CODE SEGMENT
------------
In this memory segment "live" all the instructions of our program. Nobody... (nobody? well OK, almost nobody) can write to this memory segment, i.e. it is a read-only segment.

For example:
All the instructions (shown here as C code rather than assembly) are located in the code segment:

/* Set the items on the main diagonal to 1, all others to 0 */
for (i = 0; i < 100; i++)
    for (j = 0; j < 100; j++)
        if (i != j)
            a[i][j] = 0;
        else
            a[i][j] = 1;


PS: The comments /*...*/ are not included in the code segment. The compiler does not produce code for comments.

DATA SEGMENT
------------
All global variables, initialized or uninitialized, are stored in this writable (non-read-only) segment.
For example:

int i;
int j = 0;
int a[100][100];
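
To connect the two segments, here is a minimal sketch in C. It assumes the three globals above and wraps the diagonal loop from the code segment example in a function. The names count_nonzero and fill_diagonal are made up for this illustration. The instructions of both functions would sit in the code segment, while i, j and a sit in the data segment, where uninitialized globals start out as zero.

```c
#include <assert.h>

/* Globals from the article: they live in the data segment. */
int i;               /* uninitialized: C guarantees it starts as 0 */
int j = 0;           /* explicitly initialized */
int a[100][100];     /* uninitialized array: every cell starts as 0 */

/* Hypothetical helper: counts the cells of a that are not 0. */
int count_nonzero(void)
{
    int count = 0;
    for (int row = 0; row < 100; row++)
        for (int col = 0; col < 100; col++)
            if (a[row][col] != 0)
                count++;
    return count;
}

/* The loop from the code segment example, wrapped in a function. */
void fill_diagonal(void)
{
    for (i = 0; i < 100; i++)
        for (j = 0; j < 100; j++)
            if (i != j)
                a[i][j] = 0;
            else
                a[i][j] = 1;

    /* Sanity check: exactly the 100 diagonal cells are set. */
    assert(count_nonzero() == 100);
}
```

Calling count_nonzero() before fill_diagonal() should report 0, because nothing has written to the array yet.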



STACK SEGMENT
-------------
All local variables, function arguments and return addresses are stored in this writable memory.
This segment is actually a stack data structure (for those who have attended a basic computer science course). This means that we put values on a stack in memory. The last value put (or pushed) is on the top of the stack, i.e. it is the first one available. This is the well-known LIFO (Last In First Out) data structure.

The processor register ESP (Extended Stack Pointer) holds the address of the top of the stack, i.e. of the last value pushed.

On the stack we can put (PUSH) and get (POP) values.
There are two important "secrets" here:
[1] PUSH and POP instructions work in 4-byte units because of the 32-bit architecture of the x86 processor family.
[2] The stack grows downward: if ESP=256, then just after a "PUSH EAX" instruction ESP becomes 252 and the value of EAX is placed at address 252.

For example:

STACK
adrs memory
---- ------------------
256 | xy |
252 | |
248 | |
244 | |
... .................
(ESP=256)

Instruction > PUSH EAX ; remark: suppose EAX = 34

STACK
256 | xy |
252 | 34 |
248 | |
244 | |
... .................
(ESP=252)

Instruction > POP EAX ; remark: Get the value from the top of the stack into the EAX register. The 34 stays in memory, but ESP moves back above it.

STACK
256 | xy |
252 | 34 |
248 | |
244 | |
... .................
(ESP=256)


Instruction > PUSH EAX ; remark: suppose EAX = 15
Instruction > PUSH EBX ; remark: suppose EBX = 16

STACK
256 | xy |
252 | 15 |
248 | 16 |
244 | |
... .................
(ESP=248)
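
The diagrams above can be replayed with a small toy model in C. This is only a sketch under stated assumptions: the ToyStack type and the toy_* functions are invented for this illustration, the "memory" is a plain array of 4-byte cells, and the addresses are the same decimal numbers used in the diagrams. It is not how a real CPU is implemented; it only mirrors the two rules (4-byte units, downward growth).

```c
#include <assert.h>
#include <stdint.h>

#define STACK_TOP 256u   /* same starting address as in the diagrams */

typedef struct {
    uint32_t mem[STACK_TOP / 4 + 1]; /* one 4-byte cell per address 0, 4, ..., 256 */
    uint32_t esp;                    /* address of the last pushed cell */
} ToyStack;

void toy_init(ToyStack *s)
{
    for (unsigned k = 0; k < sizeof s->mem / sizeof s->mem[0]; k++)
        s->mem[k] = 0;
    s->esp = STACK_TOP;
}

/* PUSH: first move ESP down by 4, then store the value there. */
void toy_push(ToyStack *s, uint32_t value)
{
    assert(s->esp >= 4);             /* no room left in the toy memory */
    s->esp -= 4;
    s->mem[s->esp / 4] = value;
}

/* POP: read the value at ESP, then move ESP up by 4.
   The old value is not erased, it just becomes unreachable. */
uint32_t toy_pop(ToyStack *s)
{
    assert(s->esp < STACK_TOP);      /* nothing of ours left to pop */
    uint32_t value = s->mem[s->esp / 4];
    s->esp += 4;
    return value;
}
```

After toy_init, pushing 34 moves esp to 252 and popping it moves esp back to 256, exactly as in the first two diagrams.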



What is behind a function-call
-------------------------------
Before we explain what happens behind the scenes, we must say a few words about EIP (Extended Instruction Pointer, or simply "Instruction Pointer"). This register holds the code segment address of the next instruction that the CPU will execute.

Every time the CPU executes an instruction, it stores into EIP the address of the instruction that follows the one currently executing.
But how does the CPU find the address of the next instruction?
Well... we have two cases here...
1. The address is immediately after the instruction currently executing.
2. There is a jump (a JMP, or a CALL to a function), so the instruction to execute next is at an address that is not next to the current one.

In case 1 the address is calculated by simply adding the length of the currently executing instruction to the current EIP value.
Example:
Suppose we have the following 2 instructions at addresses 100 and 101:

100 push EDX
101 mov ESP, 0

Suppose that at the starting point of our little program we have: EIP = 100
CPU executes the instruction at address 100.
CPU checks the instruction:
Is it a JUMP? No, so calculate its size. CPU knows that the push instruction is 1 byte long.
So,... the new value of
EIP = EIP + size(push EDX) =>
EIP = 100 + 1 =>
EIP = 101
So,.... CPU executes the instruction at address 101, and so forth...
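
The same arithmetic as a small C sketch. The function names next_eip and end_address are made up for this illustration, and the instruction lengths are simply passed in as numbers (a real CPU works them out while decoding the instruction bytes).

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Case 1: no jump, so the next EIP is the current EIP plus the
   length in bytes of the instruction that has just been executed. */
uint32_t next_eip(uint32_t eip, uint32_t instruction_length)
{
    assert(instruction_length > 0);
    return eip + instruction_length;
}

/* Walks over a straight run of instructions (no jumps taken) and
   returns the address that follows the last one. */
uint32_t end_address(uint32_t start, const uint32_t *lengths, size_t count)
{
    uint32_t eip = start;
    for (size_t k = 0; k < count; k++)
        eip = next_eip(eip, lengths[k]);
    return eip;
}
```

With the numbers from the example, next_eip(100, 1) gives 101. The same rule explains the address column of any disassembly listing: each address is the previous address plus the number of opcode bytes shown next to it.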

In case 2 we have a jump... and things are a bit different.
Conceptually, just before we jump to another address (i.e. call a function), we save the address of the next instruction somewhere, and before returning from the function we write that saved address back into EIP. On x86 that "somewhere" is the stack.

The CALL and RETN assembly instructions do exactly this:
CALL does 2 things:
1. It "remembers" the instruction that will be executed after the function returns (by pushing its address onto the stack) and
2. It writes into EIP the address of the called function, i.e. it performs the function call.

The RETN instruction is executed at the end of the function:
It pops (gets) the "return address" that CALL pushed onto the stack and writes it into EIP, so that execution continues right after the CALL.
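
A minimal sketch of these two instructions in C, under loud assumptions: the ToyCpu type and the toy_call and toy_ret functions are invented for this illustration, memory is a small array of 4-byte cells, and the length of the CALL instruction is passed in by hand.

```c
#include <assert.h>
#include <stdint.h>

typedef struct {
    uint32_t mem[65];   /* 4-byte cells for addresses 0, 4, ..., 256 */
    uint32_t esp;       /* stack pointer */
    uint32_t eip;       /* address of the instruction being executed */
} ToyCpu;

/* CALL: push the address of the instruction after the CALL,
   then continue execution at the target address. */
void toy_call(ToyCpu *cpu, uint32_t call_length, uint32_t target)
{
    uint32_t return_address = cpu->eip + call_length;
    assert(cpu->esp >= 4 && cpu->esp <= 256);
    cpu->esp -= 4;
    cpu->mem[cpu->esp / 4] = return_address;  /* 1. remember where to come back */
    cpu->eip = target;                        /* 2. perform the function call   */
}

/* RETN: pop the saved return address into EIP. */
void toy_ret(ToyCpu *cpu)
{
    assert(cpu->esp <= 252);
    cpu->eip = cpu->mem[cpu->esp / 4];
    cpu->esp += 4;
}
```

For instance, a 5-byte CALL located at address 0x00401245 pushes 0x0040124A, and a later toy_ret brings EIP back to exactly that address.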

The Base pointer (EBP)
----------------------
Each function in a program (even the main() function in C) has its own stack frame. A stack frame is a logical group of consecutive locations on the stack that holds the variables and addresses of a function that is currently executing.
Every address in a stack frame is a relative address. That means we address the data in our frame relative to some reference point. This reference point is EBP, which is the acronym for Extended Base Pointer.
On entry to a function, EBP still holds the base pointer of the caller. We PUSH this old EBP onto the stack and then copy ESP into EBP, so that EBP can be used to reference the arguments and local variables of the callee relative to it.
I hope the use of the base pointer will become clearer in the following example.

A REAL EXAMPLE C PROGRAM:

Consider the following C program:

void function1(int, int, int);
void main()
{
    function1(1, 2, 3);
}

void function1(int a, int b, int c)
{
    char z[4];
}


I compiled and linked the above program and used the OllyDbg debugger to inspect the generated assembly code.
Skipping the startup code added by the compiler and the operating system (which is about 90% of the assembly code), the rest is the code that corresponds to our little program:

0040123C /. 55 PUSH EBP
0040123D |. 8BEC MOV EBP,ESP
0040123F |. 6A 03 PUSH 3 ; /Arg3 = 00000003
00401241 |. 6A 02 PUSH 2 ; |Arg2 = 00000002
00401243 |. 6A 01 PUSH 1 ; |Arg1 = 00000001
00401245 |. E8 05000000 CALL bo1.0040124F ; \bo1.0040124F
0040124A |. 83C4 0C ADD ESP,0C
0040124D |. 5D POP EBP
0040124E \. C3 RETN

0040124F /$ 55 PUSH EBP
00401250 |. 8BEC MOV EBP,ESP
00401252 |. 51 PUSH ECX
00401253 |. 59 POP ECX
00401254 |. 5D POP EBP
00401255 \. C3 RETN


ANALYSIS:
---------
Addresses 0040123C to 0040124E are the main() function.
Addresses 0040124F to 00401255 are the function1() function.

0040123C /. 55 PUSH EBP
Backs up the caller's base pointer by pushing it onto the stack.

0040123D |. 8BEC MOV EBP,ESP
Copies the current stack pointer into the EBP register.
From then on, inside the function, we reference the function's arguments and local
variables relative to EBP. These two instructions are called the
"Procedure Prologue".

The stack has the EBP value:
[ebp]
STACK
256 | [ebp] |
... .................
(ESP=256)



0040123F |. 6A 03 PUSH 3 ; /Arg3 = 00000003
00401241 |. 6A 02 PUSH 2 ; |Arg2 = 00000002
00401243 |. 6A 01 PUSH 1 ; |Arg1 = 00000001
Here we push the arguments onto the stack, from the last one to the first.

The stack is:
STACK
256 | [ebp] |
252 | 3 |
248 | 2 |
244 | 1 |
... .................
(ESP=244)


00401245 |. E8 05000000 CALL bo1.0040124F ; \bo1.0040124F
Calls the function at address 0040124F. bo1 is the name of my executable.
The stack becomes:
STACK
256 | [ebp] |
252 | 3 |
248 | 2 |
244 | 1 |
240 | 0040124A | <- the return address when the function1 ends.
... .................
(ESP=240)

Let’s follow the execution, so go to address 0040124F (the function1):

0040124F /$ 55 PUSH EBP
00401250 |. 8BEC MOV EBP,ESP
Hmm... this is the "Procedure Prologue" again (remember, this is executed at the start of every function). It sets up the function's own stack frame. The EBP register is currently pointing at a location in main's stack frame. This value must be preserved, so EBP is pushed onto the stack. Then the contents of ESP are copied to EBP. This allows the arguments to be referenced as offsets from EBP and frees up the stack register ESP to do other things.

The stack now, is:
STACK
256 | [ebp] |
252 | 3 |
248 | 2 |
244 | 1 |
240 | 0040124A | <- the return address when the function1 ends.
236 | <main’s EBP> | <- Note that both ESP and EBP now point to this address.
... .................
(ESP=236)
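
With EBP = 236, the diagram above already tells us where everything is relative to EBP: the return address is at EBP+4 and the three arguments are at EBP+8, EBP+12 and EBP+16. A minimal sketch in C, assuming the toy decimal addresses of the diagrams; the names read_ebp_relative and load_example_stack are made up for this illustration, and the saved EBP of main is taken to be 256 because that is where main's own prologue left EBP in this toy numbering.

```c
#include <assert.h>
#include <stdint.h>

/* Reads the 4-byte cell at EBP + offset, the way an instruction such as
   "MOV EAX,[EBP+8]" would. mem holds one cell per address 0, 4, ..., 256. */
uint32_t read_ebp_relative(const uint32_t *mem, uint32_t ebp, int32_t offset)
{
    int64_t address = (int64_t)ebp + offset;
    assert(address >= 0 && address <= 256 && address % 4 == 0);
    return mem[address / 4];
}

/* Fills the toy memory with the stack contents of the diagram above. */
void load_example_stack(uint32_t *mem)
{
    mem[252 / 4] = 3;             /* Arg3 */
    mem[248 / 4] = 2;             /* Arg2 */
    mem[244 / 4] = 1;             /* Arg1 */
    mem[240 / 4] = 0x0040124Au;   /* return address pushed by CALL */
    mem[236 / 4] = 256;           /* main's EBP, saved by the prologue */
}
```

Reading at offsets 8, 12 and 16 from EBP = 236 returns 1, 2 and 3, i.e. the arguments a, b and c of function1, no matter how much ESP moves afterwards.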


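00401252 |. 51 PUSH ECX
This push is not about saving ECX: it is simply a short way for the compiler to reserve 4 bytes on the stack for the local variable z[4]. ESP becomes 232.

00401253 |. 59 POP ECX
00401254 |. 5D POP EBP
The first pop releases the 4 bytes of z and the second restores main's EBP. After the two pops the stack becomes:
STACK
256 | [ebp] |
252 | 3 |
248 | 2 |
244 | 1 |
240 | 0040124A | <- the return address
... .................
(ESP=240)

00401255 \. C3 RETN
The function ends: RETN pops the return address into EIP, so execution continues at 0040124A (remember our definition of the RETN instruction) and ESP becomes 244.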

0040124A |. 83C4 0C ADD ESP,0C
After the function returns, we add 12 (0C in hex) to the stack pointer, since we pushed 3 arguments onto the stack, each taking 4 bytes (integers). By increasing ESP we actually shrink the stack (remember that the stack fills downwards, from high to low memory addresses), i.e. ESP = 244 + 12 = 256.
STACK
256 | [ebp] |
... .................
(ESP=256)

Thus, ESP again has the value it had right after main's prologue, before the arguments were pushed for the function call.
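
To double check the whole walk, here is a toy replay of the listing in C that records ESP after every instruction, from main's PUSH EBP up to ADD ESP,0C. It is only a sketch: the ToyCpu type and the function run_example are invented for this illustration, the addresses are the decimal toy addresses of the diagrams, and ESP is assumed to start at 260 so that the first push lands on 256.

```c
#include <assert.h>
#include <stdint.h>

typedef struct {
    uint32_t mem[65];              /* 4-byte cells for addresses 0, 4, ..., 256 */
    uint32_t esp, ebp, ecx, eip;
} ToyCpu;

static void push(ToyCpu *c, uint32_t value)
{
    assert(c->esp >= 4 && c->esp <= 260);
    c->esp -= 4;
    c->mem[c->esp / 4] = value;
}

static uint32_t pop(ToyCpu *c)
{
    assert(c->esp <= 256);
    uint32_t value = c->mem[c->esp / 4];
    c->esp += 4;
    return value;
}

/* Replays the listing; stores ESP after each step and returns the step count. */
int run_example(uint32_t esp_log[], int capacity)
{
    ToyCpu c = { .esp = 260, .eip = 0x0040123Cu };
    int n = 0;
    assert(capacity >= 13);

    /* main() */
    push(&c, c.ebp);            esp_log[n++] = c.esp;   /* PUSH EBP      */
    c.ebp = c.esp;              esp_log[n++] = c.esp;   /* MOV EBP,ESP   */
    push(&c, 3);                esp_log[n++] = c.esp;   /* PUSH 3        */
    push(&c, 2);                esp_log[n++] = c.esp;   /* PUSH 2        */
    push(&c, 1);                esp_log[n++] = c.esp;   /* PUSH 1        */
    push(&c, 0x0040124Au);                              /* CALL: save return address */
    c.eip = 0x0040124Fu;        esp_log[n++] = c.esp;   /* ... and jump  */

    /* function1() */
    push(&c, c.ebp);            esp_log[n++] = c.esp;   /* PUSH EBP      */
    c.ebp = c.esp;              esp_log[n++] = c.esp;   /* MOV EBP,ESP   */
    push(&c, c.ecx);            esp_log[n++] = c.esp;   /* PUSH ECX (room for z) */
    c.ecx = pop(&c);            esp_log[n++] = c.esp;   /* POP ECX       */
    c.ebp = pop(&c);            esp_log[n++] = c.esp;   /* POP EBP       */
    c.eip = pop(&c);            esp_log[n++] = c.esp;   /* RETN          */

    /* back in main() */
    c.esp += 12;                esp_log[n++] = c.esp;   /* ADD ESP,0C    */
    return n;
}
```

The recorded values go 256, 256, 252, 248, 244, 240, 236, 236, 232, 236, 240, 244, 256: the stack pointer ends exactly where it was before the arguments were pushed.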

I hope that you now have a basic understanding of the use of the stack and the stack pointer.
In another article I will describe how nasty things can happen here. Hint: how about overwriting the stack item at address 240 in our example above, and through it the value of the Instruction Pointer (EIP)...

Please be impolite and as rude as possible because this is not my 1st article. In addition I don’t give a shit about it...

;-) To be serious... I suggest you try my program, or better, create your own and test, check, review, test, check, review, test, check, review!!

Happy Programming Guys!!

References:
[1] BUFFER OVERFLOWS DEMYSTIFIED by murat@enderunix.org
[2] C Function Call Conventions and the Stack (UMBC CMSC 313, Computer Organization & Assembly Language, Spring 2002, Section 0101)
[3] The Assembly Language Book for IBM PC by Peter Norton (ISBN 960-209-028-6)
[4] Analysis of Buffer Overflow Attacks from http://www.windowsecurity.com/articles/Analysis_of_Buffer_Overflow_Attacks.html
[5] 8088 8086 Programming and Applications for IBM PC/XT & Compatibles by Nikos Nasoufis

Comments

willeH on September 03 2006 - 00:31:42
Sounds, erm, interesting. You clearly put a lot of effort into this. It seems well constructed, written and packed full of content. Really good, shame I didnt read it, *whhooosshhh* right over my head.
Mr_Cheese on September 03 2006 - 01:18:16
great article has everything a great article needs and you included references which is a rare bonus. seems interesting as i was reading now and straight forward to understand. i'll give it a thorough read a bit later on.
wolfmankurd on September 03 2006 - 03:24:48
looks great, gave it a quick read only htoughm 3:23 am o.O makes me want to leanr ASM
BobbyB on September 06 2006 - 16:25:30
A near perfect article. Brilliant.
z3ro on September 07 2006 - 22:05:29
Fantastic article had to read it a few times to understand it all but it was great! Please write more articles :)
Thiseas on September 08 2006 - 13:30:08
thnx! people... I really appreciate your... taste.. :) Thnx!