CPU, Memory and Buffer Overflow
- https://ret2rop.blogspot.com/2018/08/basic-working-computers.html
- https://ret2rop.blogspot.com/2018/08/interacting-exploiting-ports.html
Programs, Libraries, Binaries
For ease and simplicity, we usually write a program in a high-level language first. The program has many constants, variables and instructions that the CPU must process for the program to work properly. These instructions are grouped into modules and functions. Fortunately, we don't need to write all of them in detail ourselves, because many are already available as functions in libraries. A compiler or interpreter then takes all the required instructions and converts them into a form the machine can execute.
What happens when you execute a code?
When you run a binary, it is first handled by the kernel, the core of the operating system that lets software talk to the hardware. The program is given space in memory (Random Access Memory) from the available pool and is loaded into that allocated space. The CPU then jumps to a specific memory address and starts processing instructions from there.
The computer consists of:
- CPU, which is the main processor
- Memory, also called RAM, into which the program is loaded
- A hard disk or some other non-volatile storage
| 16bit | 32bit | 64bit | Description |
|-------|-------|-------|-------------|
| AX | EAX | RAX | The accumulator register |
| BX | EBX | RBX | The base register |
| CX | ECX | RCX | The counter |
| DX | EDX | RDX | The data register |
| SP | ESP | RSP | Stack pointer |
| BP | EBP | RBP | Points to the base of the stack frame |
| | RnD | Rn | (n = 8...15) General purpose registers |
| SI | ESI | RSI | Source index for string operations |
| DI | EDI | RDI | Destination index for string operations |
| IP | EIP | RIP | Instruction Pointer |
| FLAGS | | | Condition codes |
The table above lists the main CPU registers (small, fast storage locations inside the processor) on 16-bit, x86 (32-bit) and x86-64 (64-bit) CPUs. The low 8 bits of the first four registers are accessible as AL, BL, CL and DL. The prefix "E" on 32-bit registers stands for "extended", and the prefix "R" is used for 64-bit registers. We will talk more about registers later.
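The relationship between the columns of the table can be illustrated with a few lines of Python. This is only a sketch: it models a 64-bit register as a plain integer and uses an arbitrary example value, but the masking reflects how the narrower names refer to the low bits of the same register.

```python
# Model RAX as a 64-bit integer; the value is an arbitrary example.
RAX = 0x1122334455667788

# The narrower names are views of the low bits of the same register.
EAX = RAX & 0xFFFFFFFF  # low 32 bits
AX = RAX & 0xFFFF       # low 16 bits
AL = RAX & 0xFF         # low 8 bits

print(hex(EAX))  # 0x55667788
print(hex(AX))   # 0x7788
print(hex(AL))   # 0x88
```

Writing to AL or AX on a real CPU changes only those low bits of the full register, which is why the same data can be reached under several names.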
Memory
Now let's move on to memory. Memory is where your program is loaded, along with its functions, local variables and parameters. The processor reads instructions from memory and executes them.
Memory Addresses
Everything in memory needs an address so that you can point to the specific location where a particular piece of data is stored. These addresses are usually written in hexadecimal. Hex is a number system with base 16, just as the decimal system we generally use has base 10 and binary has base 2. So 0x00000... is the lowest address in memory and 0xffffff... is the highest. While a program runs, the addresses it is working with are held in registers such as the stack pointer and the instruction pointer.
Memory Organization
- The topmost part of memory, i.e. the highest memory addresses, is reserved for the kernel. Just below it, at the top of the stack region, sit the command line arguments (argc, argv) and the environment variables.
- Starting from the bottom of memory, the first segment is the code segment. It holds the machine code, the byte patterns that the CPU executes. This part is read-only; any attempt to write to it leads to a memory access violation.
- The next part up is the DATA and BSS segments. They contain global variables. If you are familiar with programming, you will know that these can be accessed by any function in the program. The DATA segment holds initialised variables and the BSS segment holds uninitialised ones.
- Then comes the HEAP. The heap is used for dynamic memory allocation. You can allocate memory on the heap while the binary is running by means of the malloc family of functions. The heap is one of the target sites for buffer overflows. It grows towards higher memory addresses.
- Now let's discuss the STACK. The stack contains local variables, function parameters and some bookkeeping information for function calls. On most systems the stack begins at a high memory address and grows toward lower addresses. You will find most buffer overflows in this region. The base of the current stack frame is tracked by the base pointer register (BP) and the top of the stack by the stack pointer (SP). The stack works on the Last In First Out (LIFO) principle: the last object pushed onto the stack is the first to be popped off. When a function is called, the arguments are passed first and the call instruction pushes the return address onto the stack. The function then pushes the old base pointer, copies the stack pointer into the base pointer, and subtracts from the stack pointer (sub) to allocate space for local variables. This is called the function prologue. Its reverse, the epilogue, runs when the function finishes: the stack space is deallocated (with add, or by copying the base pointer back into the stack pointer) and the old base pointer is popped off the stack. A final pop, performed by the return instruction, copies the return address into the instruction pointer.
- The return address is the memory address to which the program should return control once the function has finished executing.
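The LIFO behaviour and the prologue/epilogue sequence can be traced with a toy model. The sketch below is not how a real CPU is implemented: the `Stack` class, the starting address and the return address `0x401136` are all made-up values for illustration. It follows the usual x86 sequence of pushing the return address, saving the old base pointer, and reserving space for locals.

```python
class Stack:
    """Toy downward-growing stack; every slot is one 8-byte word."""

    def __init__(self, top=0x7FFFFFFFE000):  # hypothetical start address
        self.mem = {}   # address -> stored value
        self.sp = top   # stack pointer: current top of the stack
        self.bp = top   # base pointer: base of the current frame

    def push(self, value):
        self.sp -= 8                # the stack grows toward lower addresses
        self.mem[self.sp] = value

    def pop(self):
        value = self.mem.pop(self.sp)
        self.sp += 8
        return value


def call(stack, return_address, local_bytes):
    stack.push(return_address)  # done by the call instruction
    stack.push(stack.bp)        # prologue: save the old base pointer
    stack.bp = stack.sp         # prologue: new frame base = current top
    stack.sp -= local_bytes     # prologue: reserve space for locals (sub)


def ret(stack):
    stack.sp = stack.bp         # epilogue: release the locals
    stack.bp = stack.pop()      # epilogue: restore the old base pointer
    return stack.pop()          # ret: return address goes to the instruction pointer


stack = Stack()
for value in (1, 2, 3):
    stack.push(value)
print([stack.pop() for _ in range(3)])  # [3, 2, 1]

call(stack, return_address=0x401136, local_bytes=32)
print(hex(ret(stack)))  # 0x401136
```

Note that the return address sits at a higher address than the local variables of the called function. That detail is what makes stack overflows dangerous.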
BUFFER
Most programs take some input and produce output based on it through specific functions. Where is this data stored in memory? Strings and arrays are stored in buffers. A buffer is a contiguous block of memory that holds objects of the same data type. The input can come from:
- Data typed at a prompt or into a GUI.
- Data sent to program over a network.
- Data provided in a file.
- Data provided in variables.
What's buffer overflow?
A buffer is like a water tank. You are given a fixed amount of space to fill. But what if you pour in more than it can hold? It overflows, and water spills into places it was never meant to reach. The same happens in a buffer overflow. Suppose a program allows 500 characters of user input. What happens if you enter 550 characters and the programmer performed no bounds checks and simply copied the data into the buffer? The input crosses the buffer boundary and overwrites the memory surrounding it. This often causes a segmentation fault. A segmentation fault (SIGSEGV) is raised when a program tries to access memory it is not allowed to access.
How can buffer overflows be used for exploitation?
As we have seen, buffer overflows overwrite the surrounding memory. This means they can be used to crash a program, corrupt memory and even execute arbitrary code on the system with the same privileges as the program. In a UNIX/Linux environment, a vulnerable program that is owned by root and has the SUID bit set can be used to execute root-level commands. The same goes for programs running with admin rights on Windows. An attacker can carefully craft the input to make the CPU perform a specific task or execute a program of the attacker's choosing.
There are many types of buffer overflows, occurring in different areas of memory. For example, in a stack-based buffer overflow the attacker makes the buffer overflow into the return address. Recall that the return address points to where control should be returned after the function finishes. The attacker can replace it with the address of injected CPU instructions, such as shellcode, which can spawn a command line shell for the attacker. Shellcode is simply a sequence of CPU instructions that executes a shell such as bash, sh, zsh, cmd or powershell.
Recall from our previous article how we could interact with open ports and the applications listening on them. A buffer overflow in such an application can lead to remote exploitation of the system. In fact, buffer overflows have been among the most widely used attacks and have been responsible for many major worm outbreaks and compromises of computers over networks and the internet. Most importantly, many of them don't require any user interaction.
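The effect of a stack-based overflow on the return address can be imitated safely in Python. This is a simulation under simplifying assumptions, not an exploit: the frame is just a byte array holding an 8-byte buffer followed directly by an 8-byte return address (a real frame would also hold the saved base pointer), and `BUF_SIZE`, the function names and both addresses are invented for the example.

```python
import struct

BUF_SIZE = 8  # hypothetical buffer capacity in bytes


def make_frame(return_address):
    """Toy frame: BUF_SIZE buffer bytes followed by an 8-byte return address."""
    return bytearray(BUF_SIZE) + bytearray(struct.pack("<Q", return_address))


def unchecked_copy(frame, data):
    """Copy every input byte without checking the buffer size."""
    for i, byte in enumerate(data):
        frame[i] = byte


def checked_copy(frame, data):
    """Refuse input that does not fit into the buffer."""
    if len(data) > BUF_SIZE:
        raise ValueError("input longer than buffer")
    frame[:len(data)] = data


def saved_return_address(frame):
    return struct.unpack_from("<Q", frame, BUF_SIZE)[0]


frame = make_frame(0x401136)
# 8 filler bytes fill the buffer; the next 8 land on the return address.
payload = b"A" * BUF_SIZE + struct.pack("<Q", 0xDEADBEEF)
unchecked_copy(frame, payload)
print(hex(saved_return_address(frame)))  # 0xdeadbeef
```

Calling `checked_copy` with the same payload raises `ValueError` and leaves the return address untouched, which is the role a bounds check plays in real code.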
Well, that's all for now. This was just an introduction. There's still a lot to learn on this topic. In the next articles we will learn how to craft buffer overflow exploits, starting with Smashing the Stack, then heap overflows, more about the CPU and memory, and a lot more interesting stuff. Keep researching, practicing and learning. Feel free to ask questions. Make sure you understand everything well. Share if you liked the article. Thanks for reading. :D
Next read: Stack based buffer overflow on 64 bit linux