Monthly Archives: January 2018

Protostar Heap3

Level Text

#include <stdlib.h>
#include <unistd.h>
#include <string.h>
#include <sys/types.h>
#include <stdio.h>

void winner()
{
  printf("that wasn't too bad now, was it? @ %d\n", time(NULL));
}

int main(int argc, char **argv)
{
  char *a, *b, *c;

  a = malloc(32);
  b = malloc(32);
  c = malloc(32);

  strcpy(a, argv[1]);
  strcpy(b, argv[2]);
  strcpy(c, argv[3]);

  free(c);
  free(b);
  free(a);

  printf("dynamite failed?\n");
}

Intro

This level introduces the Doug Lea Malloc (dlmalloc) and how heap meta data can be modified to change program execution. The goal for this level is to call the winner function.

First, let’s look at the difference in structure of free and allocated memory chunks (ones we allocate with malloc):

Screenshot from 2018-01-18 13-27-40.png

Image 1: Representation of chunks in memory [1]

 As a short reminder: dlmalloc keeps free chunks (chunks available for allocation) in a doubly linked list of chunks (right side of Image 1). Each free chunk has fields representing:

  • Prev_size: If previous chunk is allocated, the size field represents data from that chunk.
  • Size: Size of free chunk.
  • FD pointer: Pointer to next free chunk in doubly linked list.
  • BK pointer: Pointer to previous free chunk in doubly linked list.
  • Unused space: Space where user data would be stored if allocated.
  • Size: Size of chunk, this field is used for easier merging of chunks.

In contrast, the structure of allocated chunks is:

  • Previous size: If previous chunk is allocated this field represents last 4 bytes of that chunk. If previous chunk is free, this field represents the size of that chunk.
  • Size: Size of this chunk including chunk metadata (header). Last bit of this chunk (called P or “prev_inuse” bit) indicates if the previous chunk is in use or not.
  • User data: Space available for user data.

Important thing to note here are the relations between those two chunk types: Imagine you have a user allocated chunk of memory which looks as the left side of Image 1. If we convert this chunk into a free chunk type (right side of Image 1), the first 8 bytes of “User data” will be represented as FD and BK pointers of the free chunk. We will need this later because it is crucial for our exploit.

Let’s see how our level looks like in memory:

Screenshot from 2018-01-18 14-25-42.png

Image 2: Level representation in memory

As we can see, the three pointers (a, b and c) are located on the stack and point to their corresponding allocated memory chunks which are located on the heap.

The attack vector we use in this level is the vulnerable strcpy call which doesn’t check how many bytes get written into the buffer and thus allowing us to cause overflows.

Short reminder on how free works: It is important for memory allocator to avoid fragmentation, that is most of your memory is allocated in a large number of non-contiguous blocks. To avoid this, the allocator needs to coalesce/merge contiguous free chunks. I found a very good and short explanation of how free works on Sploitfun blog [2] and source code for dlmalloc at [3]. So merging forward (if next chunk is also free) or backward (if previous chunk is free) works as follows:

Merging backwards:

  1. Check if previous chunk is free: “Previous chunk is free, if current freed chunk’s PREV_INUSE (P) bit is unset.” [2]
  2. Merge if free: “unlink (remove) the previous chunk from its binlist, add previous chunk size to current size and change the chunk pointer to point to previous chunk.” [2]

Merging forward:

  1. Check if previous chunk is free: “Next chunk is free, if next-to-next chunk’s (from currently freed chunk) PREV_INUSE (P) bit is unset. To navigate to next-to-next chunk, add currently freed chunk’s size to its chunk pointer, then add next chunk’s size to next chunk pointer.” [2]
  2. Merge if free: “unlink (remove) the next chunk from its binlist and add next chunk size to current size.” [2]

We mentioned a “unlink” gets executed. Unlink is used to remove a chunk from doubly linked list of free chunks. This unlink call is very important and looks something like this:

Screenshot from 2018-01-18 14-58-58.png

Image 3: Unlink macro

is the chunk to unlink, BK and FD are variables used for redirection. Look at the right side of Image 1.

  • FD = P -> fd: pointer to the next free chunk is stored in FD
  • BK = P -> bk: pointer to the previous free chunk is stored in BK
  • FD -> bk = BK: we need to change the BK pointer of the free chunk following the chunk we are unlinking to point to the chunk before the one we are currently unlinking.
  • BK -> fd = FD: we need to change the FD pointer of the previous free chunk to point to the chunk following the chunk we want to unlink.

After this steps, the chunk pointed to by P will be unlinked from the doubly linked list of free chunks.

Now we can think about the idea of how to pass this level: Using unconstrained strcpy writes we could overflow data from one chunk to overwrite another chunk’s memory, thus allowing us to change its metadata (the allocated chunk’s prev_size and size fields from left part of Image 1). Arbitrary write to the size field allows us to change its prev_inuse bit, effectively changing the status of the previous chunk from allocated to free. By changing a neighboring chunk’s status to free, the free() function that gets called on the other neighboring chunk will trigger the unlink merging routine on the chunk we falsely set to be free. By manipulating the fields in the falsely set free chunk (which is actually allocated), the FD -> bk = BK line from unlink routine will give us an arbitrary memory write because we will control the addresses in FD and BK pointers. The details about this attack will be shown in the next chapter.

Solution

Solution steps:

  1. Find the address of winner.
  2. Find the redirection location, that is, a way which will redirect the program execution to the winner function.
  3. Put the address of winner at the redirection location.
  4. Test the exploit.

Note that I didn’t pass this level on my first try. My failed attempts are also documented and the errors I made are pointed out and explained.

Step 1: First, load our program in GDB using command: “gdb heap3”, next, set the disassembly output to Intel syntax using command: “set disassembly-flavor intel”. After this is done, we can begin. Get the winner function using command: “print winner” which will give you the address of the winner function. I explained in more detail (with pictures) how to do this in previous levels so I won’t do it again there to not repeat myself. The address of winner is: winner@08048864

Step 2: From discussions from previous levels, I picked the GOT way of redirecting program execution because it is the most reliable one. If we disassemble main function using “disass main”, by looking at line  we can see that the call to printf from 28th line in the given C source gets replaced with puts as an optimization. The puts call is located in a library so it needs to be resolved using Global Offset Table and Procedure Linkage Table, as explained in the previous levels. So we need to find the landing location from the PLT jump. Because PLT is just a bunch of code, we can disassemble it using comand: “disass 0x8048790” which gives us the below output:

Screenshot from 2018-01-18 15-59-09.png

Image 4: disassembling PLT

Output from Image 4 shows that the PLT trampoline jumps to the GOT entry located at redir_loc@0x0804b128 in the data section.

Step 3: Ok, now to the nasty part. We want to somehow write the address of winner from Step 1 to the redir_loc redirection location from Step 2. The way to do this is by modifying allocated chunk headers and forcing the unlink method to be called on a chunk we trick the program to think it’s free.

(FAILED ATTEMPT #1) My idea was to overflow the “b” allocated space from Image 2 to overwrite the chunk header metadata (prev_size and size fields) belonging to “c” allocated space and thus, when free gets called at line 24 from our C source, trick the program to think that the previous chunk (the “b” chunk) is free and unlink it, causing the “b” chunk’s first 8 bytes of “User data” field to be interpreted as FD and BK and thus giving us the arbitrary write we need to write the address of winner to redir_loc. Considering the code for backward merging from [2] I designed my exploit:

Screenshot from 2018-01-18 16-16-59.png

Image 5: backward merging code snippet

./heap3 AAAA `python -c ‘print “redir_loc_bytes” + “winner_addr_bytes” + (32 – 4 – 4) + chk_b_size_4B + even_size_val ‘` CCCC

  • AAAA: dummy bytes for “a”, not used.
  • `python -c ‘print “redir_loc_bytes” + “winner_addr_bytes” + (32 – 4 – 4) + chk_b_size_4B + even_size_val ‘`: First print 4 bytes representing the redir_loc, then write 4 bytes representing winner_addr, and then fill the rest of the “b” allocated space. After that write “b” chunk size (which has the value 0x29) and another 4 bytes which will represent size of the “c” allocated space with last bit set to 0 which indicates that the previous chunk is not in use (is freed).
  • CCCC: dummy bytes for “c”, not used

I will encode the memory structure of allocated and free chunks as follows (for easier understanding):

(A, chunk_name(a/b/c) | prev_sizesizeuser_data…)  => Allocated chunk

(F, chunk_name(a/b/c) | prev_sizesizeFD, BK, unused_space…) => Free chunk

After exploit code gets called, the heap would look something like this (before the first call to free):

(A, a | prev_size, size, AAAA“)

(F, b | prev_size, size, redir_loc, winner_addr, padding)

(A, c | 0x00000029, even_size_val, “CCCC”)

After first call to free(c), the program would get tricked that the “b” chunk is free because of the overwritten “c” chunk’s size field which will end with a 0 bit which indicates that previous chunk is not in use. After that, free(c) would call unlink on the “b” chunk and thus causing winner_addr to be written at redir_loc (Image 3, line 4). The value 0x00000029 is used for navigating to the start of the previous chunk by Image 5, line 4.

Reason this approach fails: The value 0x00000029 I needed to input via command line couldn’t be given because bash always strips the null bytes (because they represent string terminator or something like that). Second, even if I did input those bytes, line 4 from Image 3 (FD -> bk = BK) would actually write BK at (FD + 12), that will result in writing winner_addr at redir_loc+12.

(FAILED ATTEMPT #2) We failed miserably when trying the previous exploit, so what are the alternatives? It is shown, with a simple trick, that is possible to use negative numbers to make things work. I said previously that bash wouldn’t accept small numbers because they contain NULL bytes which bash would ignore. But what if we input really really large numbers, so large that they are actually interpreted as negative integers?

If we set the “c” chunk’s prev_size to -8 and its size to -4 (which is still even and indicates that the previous chunk is free), we can get the forwards merging unlink call to actually work. Consider the new heap structure (before free(c)) after the follwing shellcode:

./heap3 AAAA `python -c ‘print “B”*32 + 0xfffffff8 + 0xfffffffc ‘` `python -c ‘print “C”*8 + (redir_loc – 12) + winner_addr’`

(A, a | prev_size, size, AAAA“)

(A, b | prev_size, size, padding)

(A, c | 0xfffffff8, 0xfffffffc, “CCCCCCCC”, (redir_loc – 12), winner_addr)

What will happen here when free(c) gets called is the following: The program will see that the “c” chunk’s size field is set to 0xfffffffc which represents -4 and because it is an even value, it will think the previous chunk is free. Then, by Image 5, line 4, it will navigate to the previous free chunk by subtracting its prev_size field (0xfffffff8 = -8) from the start of “c” chunk pointer and thus actually adding +8 and moving into the “c” chunk by bytes, arriving at the beginning of the memory allocated by malloc(c). We can then forge an “virtual” chunk inside the chunk “c”‘s “User data” field which will be interpreted as a free chunk and unlinked. This looks like this:

Screenshot from 2018-01-18 18-44-28.png

Image 6: heap state after executing user exploit

The “CCCCCCCC” bytes represent fake prev_size and size bytes in the fake chunk made inside the “c” chunk “User data” space after which (redir_addr – 12) and winner_addr are located, which represend FD and BK because the fake chunk is considered free.

Reason this approach failed: This approach will actually overwrite redir_addr with winner_addr, but some code after that will try to write at some address inside .txt section which is write protected. Here is a snippet of that problem:

Screenshot from 2018-01-18 18-55-47.png

Image 7: writing to protected .txt section

Indeed, by executing GDB command “maintenance info sections” we can see that the address 0x08048864 + 0x8 belongs to .txt section.

(SUCCESSFUL ATTEMPT) Ok, now that I knew that the problem is writing to .txt protected section, and because we have plenty of space allocated on the heap, we could write a simple shellcode which will redirect our execution to the winner function. So instead of providing winner_addr, we will provide shellcode_addr which we will put inside the “a” chunk “User data” space.

There is a neat website that can compile shellcode for us: https://defuse.ca/online-x86-assembler.htm#disassembly

If you don’t want to use the site, you can compile it manually:

  1. make a file shellcode.asm containing the following instructions:
    • mov eax, 0x08048864
    • jmp eax
  2. compile the file using the command: nasm -f elf shellcode.asm
  3. link the file using the command: ld -m elf_i386 -o shellcode shellcode.o
  4. using objdump -D shellcode command, extract instruction bytes

Our working payload is now:

./heap3 `python -c ‘print “A”*16 + “\xb8\x64\x88\x04\x08\xff\xe0″‘` `python -c ‘print “A”*32 + “\xf8\xff\xff\xff” + “\xfc\xff\xff\xff”‘` `python -c ‘print “B”*8 + “\x1c\xb1\x04\x08” + “\x18\xc0\x04\x08″‘`

Parameters:

  • “A”*16: shift padding
  • “\xb8\x64\x88\x04\x08\xff\xe0”: shellcode
  • “A”*32: padding for “b” chunk
  • “\xf8\xff\xff\xff”: -8
  • “\xfc\xff\xff\xff”: -4
  • “B”*8: dummy bytes for fake prev_size and size for fake chunk
  • “\x1c\xb1\x04\x08”: fake chunk’s FD (redir_loc – 12)
  • “\x18\xc0\x04\x08”: shellcode address inside “a” chunk

One thing to note here is the first 16 padding “A” bytes which are there because when free(a) gets called, some other data gets written to the memory locations where “a” chunk “User data” begins and that would overwrite our shellcode, so we just need to shift it by a few bytes (in this case 16) for it to not get overwritten.

Proof of work:

Screenshot from 2018-01-18 19-19-02.png

Image 8: proof that exploit code works

That’s it! Thanks for reading!

References

[1] Heap Overflows and Double-Free Attacks, http://homes.soic.indiana.edu/yh33/Teaching/I433-2016/lec13-HeapAttacks.pdf

[2] Heap overflow using unlink, https://sploitfun.wordpress.com/2015/02/26/heap-overflow-using-unlink/

[3] dlmalloc source code, ftp://g.oswego.edu/pub/misc/malloc.c

Protostar Heap2

Level Text

#include <stdlib.h>
#include <unistd.h>
#include <string.h>
#include <sys/types.h>
#include <stdio.h>

struct auth {
  char name[32];
  int auth;
};

struct auth *auth;
char *service;

int main(int argc, char **argv)
{
  char line[128];

  while(1) {
      printf("[ auth = %p, service = %p ]\n", auth, service);

      if(fgets(line, sizeof(line), stdin) == NULL) break;
      
      if(strncmp(line, "auth ", 5) == 0) {
          auth = malloc(sizeof(auth));
          memset(auth, 0, sizeof(auth));
          if(strlen(line + 5) < 31) {
              strcpy(auth->name, line + 5);
          }
      }
      if(strncmp(line, "reset", 5) == 0) {
          free(auth);
      }
      if(strncmp(line, "service", 6) == 0) {
          service = strdup(line + 7);
      }
      if(strncmp(line, "login", 5) == 0) {
          if(auth->auth) {
              printf("you have logged in already!\n");
          } else {
              printf("please enter your password\n");
          }
      }
  }
}

Intro

This level was a bit messy. Messy in a way that it’s poorly written, either on purpose or in a hurry, which resulted in a lot of WTF-s in the process of understanding what was going on. The level itself is not hard at all, it introduces a new type of vulnerability: use-after-free. This vulnerability is created when referencing memory after it has been freed, resulting in a crash or undefined system behavior.

I want to mention a few things about this code that generated some problems along the way: The struct auth declared at line 7 and struct auth * auth at line 12 share the same name, which results in problems when using malloc at line 25, which (in my case at least) allocates 4 bytes of memory instead of 36 bytes because it gets the size of struct auth * auth from line 12 instead of struct auth at line 7. I’ll show this in more detail in the solution section of this post.

This program accepts some commands and acts accordingly. There are 4 commands: “auth”, “reset”, “service” and “login”. “auth” is used for user authentication that is in this case a definition of the struct auth (ideally). “reset is used to free the memory allocated for struct auth (ideally), “service” is used for setting up a fictional service that is in this case only a copy of a given string. “login” is used to perform authentication of the user by checking if the auth member of struct auth structure is set (another bad practice of writing code).

Solution

Solution steps:

  1. Allocate memory for struct auth using “auth” command.
  2. Free memory allocated in step 1 by using “reset” command.
  3. Provide sufficiently large input using “service” command such that it will overflow the auth member of struct auth allocated previously at the same space where service string will now reside.
  4. Perform authentication by issuing the “login” command which will use the pointer that points to memory which was previously unallocated (this is where use-after-free happens).

After loading the program in GDB and disassembling it using Intel syntax, the useful breakpoints we could set are: After “auth” command sets-up the user structure (struct auth ideally) at main + 205, after “reset” deallocates memory at main + 250, after “service” allocates memory for the new string and places it there at main + 297 and after “auth” which will log us in if we did everything right at main + 346.

Step 1: First we issue the command “auth USER” which will allocate memory for struct auth (ideally) and set it’s name to “USER”. Note that the auth integer attribute remains unchanged and it’s set to 0.

1_after_set_user.png

Image 1: Allocated “USER” sting in memory

Image 1 shows how the things look in memory after we issue the mentioned command. But something’s wrong. Allocated memory where we are allowed to write is located at 0x8189818 as saved in EAX register (return value of malloc), but after printing few bytes before that (starting at 0x8189810), the chunk header shows that there are only 16 bytes allocated (this is from 0x00000011 where the LS bit is a flag pointing out that the previous chunk is allocated) instead of required 36 bytes for the struct auth. The reason for this is because at line 25, the malloc call gets the size of the pointer named the same as the structure, and because every pointer is 4 bytes in size, malloc allocates 4 bytes plus the additional 8 bytes for the header and another 4 bytes required for alignment (I guess it’s for the alignment according to some minor research on Stackoverflow and such).

Step 2: In this step we will “reset” (deallocate) the allocated memory in the previous step.

Screenshot from 2018-01-10 16-22-22.png

Image 2: deallocating memory

Image 2 shows that when we deallocate the memory allocated in the previous step, it still contains the contents we had there before it was deallocated, that is the “USER” string. After deallocation, the freed memory is now available for allocating it again, and this is exactly what we will do in step 3.

Step 3: By issuing the “service” command 3 times, giving different arguments each time: “AAAABBBB”, “CCCCDDDD” and “EEEEFFFF” we can see that it got written in the deallocated space from the previous step (0x8189818). By 3 times calling the “service” command, we also get 3 chunk headers because those are 3 separate allocations (one for each “service” command).

Screenshot from 2018-01-10 16-33-19.png

Image 3: successful exploitation of use-after-free vulnerability

Image 3 shows how new strings are written in memory. Another programming error occurred at line 35 where the space character is also taken as the part of the string (that’s the reason why there are 0x20 bytes at the beginning of each word).

Step 4: Referencing to image 3, we can see that the login command was issued. By issuing the “login” command, we can see that the program tries to access address previously stored in struct auth * auth pointer, which still exists in that pointer, but it’s memory got deallocated (and later replaced with the strings we provided by issuing “service” command). The code in “login” command at line 38 tries to access address *(auth + 32) which will result at accessing content at address 0x8189834 and because this is the location of the heap chunk header size of the memory chunk that contains “EEEEFFFF”, it will be populated with some data (the size of allocated memory) and result in a value different than 0 which will result in the program granting us access to the account created earlier.

Thanks for reading!

Protostar Heap1

Level text

#include <stdlib.h>
#include <unistd.h>
#include <string.h>
#include <stdio.h>
#include <sys/types.h>

struct internet {
  int priority;
  char *name;
};

void winner()
{
  printf("and we have a winner @ %d\n", time(NULL));
}

int main(int argc, char **argv)
{
  struct internet *i1, *i2, *i3;

  i1 = malloc(sizeof(struct internet));
  i1->priority = 1;
  i1->name = malloc(8);

  i2 = malloc(sizeof(struct internet));
  i2->priority = 2;
  i2->name = malloc(8);

  strcpy(i1->name, argv[1]);
  strcpy(i2->name, argv[2]);

  printf("and that's a wrap folks!\n");
}

Intro

This level covers another heap-based vulnerability exploitation and unlike the previous level, which was just an introductory one, this level requires us to exploit a vulnerability using a technique similar to stack-based buffer overflows used in the previous levels to redirect execution of the program to our desired location, function winner.

First time I encountered malloc was in an introductory C programming class and used it rarely ever again. I just knew that it allocated memory and returned an address to that allocated space. Although not necessary for solving it, this level made me explore the inner workings and some implementation details of this important function.

malloc and free are two functions used in dynamic memory allocation. malloc is used for allocating memory space and free for deallocating it. It is necessary to say that those two functions are not system calls but convenient wrappers around system calls. For example, malloc will call system calls brk or mmap, depending on the situation, to request more free space from the kernel. Imagine them as wrappers which could (and usually are) implemented differently across different systems, but the general functionality remains the same. You can read more about the inner workings of malloc and heaps in general at [1] which describes the glibc implementation of malloc.

Ok, this level focuses on the heap and not the stack, but what are the differences between the two and what are the implications for our approach to the solution?

Stack is used for static memory allocation and heap for dynamic. The stack is a LIFO (last in first out) structure used for storing local variables, tracking and calling function calls with the most recently reserved block always freed first. Managing the stack is trivial by incrementing or decrementing a pointer while the heap management is more complex. The complexity of heap management follows from its structure because unlike the stack, the heap is not well structured (adjacency of reserved memory is not guaranteed). Heap is a memory space which can be arbitrarily populated via allocations of different sizes and subsequent allocations may not be trivial if (for example) there is not enough space for reserving desired amount of memory on the heap, in which cases additional interactions with the kernel are required. There is one stack for each existing thread, while the number of heaps can vary from one to many, but unlike the stack, maximum number of threads can be much greater than the number of heaps, meaning some threads will share heaps. From the malloc glibc implementation, the number of so called “arenas” (I imagine the arenas as a heap container which can store multiple heaps, read more at [1]) is determined by the number of processor cores and architecture (32 bit vs. 64 bit). This means that there is a need for synchronization because of contemporary thread allocations on the same heap(s), which will slow things up. In contrast, the same stack addresses are usually accessed much more often so they can be mapped (because we work in the virtual address space) to the processor’s cache and their access can be sped up significantly. Unlike the stack which has a fixed size determined when the thread is created, the heap size is set on application startup and can grow as it is needed by the application (by asking the kernel for more memory using aforementioned system calls).

The things we know for now is that we need to somehow redirect program execution. Ok, great, the first thing that comes to mind is overwriting the return address of main, but how? We can see from lines 31 and 32 from the code above that the first argument is copied to the first structure’s name attribute (it’s allocated space on the heap) which size is 8 bytes and then the next argument is copied to the second structure’s name allocated space. If we recall how strcpy works, we know that it copies byte array given as the second argument to the address given by the first argument, without making security checks if there is enough allocated space to accommodate the whole byte array given (in this case, by the user). This is our attack vector.

If we give a sufficiently large input (in this case, larger that 8 bytes) as the first argument, it will cause an overflow on the heap. Now, let’s recall what that means: We had a bunch of calls to malloc which allocates some space on the heap and then we copy user input to that space. Note that the memory allocated spaces don’t have to be adjacent, because malloc tries to find a sufficiently large space to accommodate the desired amount of bytes and nothing ensures that the space between two subsequent allocations is not occupied by some other allocation that happened before that. I will go more into detail in the “solution” section of this post.

The main idea behind this attack is to overflow allocated memory given by one call of malloc and for that overflowed data to overwrite some specific address at some location in an another heap allocated space given by another malloc call. The idea is the same as the one we did before on the stack based overflow levels, the only difference is that nothing ensures adjacency of interesting memory addresses (because we work on the heap and not the stack).

Exploit consists of overflowing the second allocated memory space (specifically the space pointed to by i1->name) into the memory space pointed to by i2, overwriting the memory address of i2->name pointer so that it points to some desired address that we want. That gives us the ability to redirect program flow to address given by the second user given argument, which will be written to the overwritten address of i2->name pointer. Details of this exploitation will be covered in the next part of this post.

Solution

Steps for exploitation:

  1. Find where i1->name is pointed at after memory allocation (where data will be stored).
  2. Find address of i2->name character pointer (which we need to overwrite).
  3. Find the offset from *(i1->name) to i2->name so we can tailor out input.
  4. Find address of winner (which we need to call).
  5. Find address of location from where the execution will be redirected from (location from which we will bounce our execution to winner).
  6. Execute exploit.

Step 1: using GDB, we first disassemble the binary using disassemble main, note that the standard disassembly syntax is set to AT&T, to set it to Intel which is in my opinion more readable, enter GDB command set disassembly-flavor intel. Find the malloc call which corresponds to the one at the 25th line in the given code (located at (main + 46)) and set a breakpoint line after that using command break *(main + 47). The result of malloc function call, which is the address of allocated space, will be stored in EAX register, which we can (after hitting our breakpoint) see with the command info registers eax or i r eax for short. After running the program with run command, we find that the address of memory allocated to i1->name is:  *(i1->name)@0x095BA018 (address may vary).

Screenshot from 2018-01-04 18-03-28.png

Step1: Finding address of *(i1->name)

Step 2: almost the same as in previous step, except now we need the actual address of i2->name struct member and not the memory allocated to it. This can be achieved by setting a breakpoint immediately after memory allocation for i2 is done by the malloc call at line 27 in given C code. After finding that call in disassembly, the breakpoint is set at *(main + 68), this gives us the address of memory allocated for accommodating the internet structure and pointed to by i2 which is: i2@0x095ba028. If we look at the internet structure, we can conclude that the address of name structure member (which is of type character pointer) will reside 4 bytes after the beginning of the memory allocated for the structure itself (after i2@0x095ba028), so the address of name member is: i2->name@0x095ba02c.

Step 3: we need to find out the offset from the address found at step 1 to the address found at step 2 so we know how many bytes we need to supply via user input. This is done simply by subtracting: 0x095ba02c – 0x095ba018 = 0x14 = 20 bytes. This means we need to write in 20 bytes as the first argument to overflow to the i2->name structure member and the bytes we write next will overwrite it. Note that this program is not big and that there is plenty of space on the heap. Although nothing guarantees that subsequent calls to malloc will result in adjacent memory allocations, by the described way the heap works (as a doubly linked list of free memory chunks) I can assume that at least first few allocations could result in adjacent memory spaces. Let’s see with GDB. Setting breakpoint after user input is written to allocated memory addresses, we examine the memory:

Let’s analyze the i1 structure. When structure corresponding to i1 was allocated, malloc returned address 0x800Screenshot from 2018-01-04 23-33-35.png4a008, at which is the content of the priority field (set to 1). Little before that is a field with 0x00000011. This field represents how many bytes does the allocation take and the last bit (PREV_INUSE flag bit is set, look at [1]) means that the previous chunk is allocated and because of that, the 0x00000000 bytes starting at address 0x804a000 represent user data (the same goes with the other chunks). We can also see that consecutive malloc calls gave consecutive accessible memory regions starting at: 0x804a008, 0x804a018, 0x804a028, 0x804a038.

View of an allocated chunk (from [1])

Step 4to get the address of winner we can enter in GDB the command print winner and get the address: winner@0x08048494.

Step 5: this is the address from which the program flow could be redirected. For example, this can be the location of the saved return address (also called saved eip), which, when overwritten to the address of winner function, will redirect our program flow to the winner function when the program attempts to return from main program function. We did this a lot in the stack-based buffer overflow levels. The more reliable way (which I recall I did only a few times before) is to overwrite the PLT (procedure linkage table) entry for some dynamically linked function from our program (such as printf function).

Image result for plt and got

Role of PLT in dynamically linked function calls (from Eli Bendersky’s article on PIC)

A short reminder on how PLT and GOT work (the details can be found at [2]): When you compile a program that uses some dynamically linked library (for example you have a printf call), you don’t write that function’s code yourself because it already exists somewhere on the system in some library. Because the compiler doesn’t know the exact address of that function (because it doesn’t know where exactly the library will be loaded during execution) it leaves a “blank” space reserved for the address of that function in the Global Offset Table (GOT). If we say GOT is all we need, we might get into trouble, because that means we need to resolve (find) all the addresses of all the functions provided by that dynamically linked library and put them into GOT because we don’t know which functions will get called and which wont, this means that if we have a small program with only one call to printf, we would still need to find all the addresses to all the other functions provided by C standard library (where printf is defined) and that takes time. To aid this problem another level of indirection is implemented in the form of a Procedure Linkage Table (PLT) which is basically just a “trampoline” that instructs the resolution only of those functions we need to call in our program. So when we call printf we won’t actually call printf directly, but corresponding PLT entry will be called instead , which is just a piece of code instructing the resolution of the function’s address which will be added to GOT. This is called “lazy” approach.

So we are left with two options (two that I know about at least): The address of the saved instruction pointer register (saved RETURN address) which we can get with the GDB command info framesaved_EIP@0xfff9767c. Another option is to find the address of the PLT entry from a known function call, for example printf. In disassembly we can see that there is no printf call, only a call to puts. This is just an optimization which will still work fine.

Screenshot from 2018-01-04 19-13-29.png

puts is called instead of printf

As I said before, PTL entries are just a bunch of code, which means we could disassemble it:

Screenshot from 2018-01-04 19-16-59.png

Disassembly of puts PLT entry

So the address we could overwrite is the address the PLT entry for puts will jump to when trying to resolve its address in GOT: puts_GOT@0x08049774.

Step 6: the exploit is executed by providing tailored input as the two command line arguments the program expects. The goal of the first argument is to overflow i1->name heap buffer into the i2 structure and overwrite the address of i2->name with the address of either the saved_EIP or puts_GOT address. The goal of the second argument is to simply provide the address to which the execution will be redirected.

Generally, exploit could look like this:

./heap1 `python -c ‘print “A” * struct_offset + exec_redir_addr + ” ” + winner_addr‘`

Where:

  • struct_offset = 0x14 = 20 bytes
  • exec_redir_addr = puts_GOT or saved_EIP
  • winner_addr = 0x08048494

Here is proof of work:

Screenshot from 2018-01-04 19-51-48.png

Exploited by overwriting saved_eip (return) address on Protostar VM

Screenshot from 2018-01-04 20-05-34.png

Exploited by overwriting puts_GOT address on Protostar VM

NOTE #1: Addresses may vary.

NOTE #2: Although the level can be exploited on the given virtual machine, I wanted to copy the heap1 level to my local Kali machine because I’m more used to it and wanted to exploit it there. From here I stumbled upon some problems. I tried with the saved_EIP exploit method and noticed that each time I tried to start the program using the same exploit from the same directory (so the environment variables would not change) the address of saved_EIP would change. This is because ASLR is enabled by default. To prove this, I temporarily disabled ASLR by editing /proc/sys/kernel/randoimze_va_space and changing its value from to 0 (which disables it). Next time I ran the program the address of saved_EIP didn’t change through consecutive program execution and the exploit was possible. Exploitation by overwriting puts_GOT didn’t result in GOT jump address (first instruction inside PLT) being randomized, even if ASLR is used and the GOT is inside the data segment which should be randomized by ASLR. I did a minor research and found that if the executable is not compiled as PIE (that is position independent executable) you don’t get much benefit from ASLR because not every memory region gets randomized, and I can’t find out why exactly. My guess is because maybe PLT couldn’t work with everything randomized as it needs to know the position of GOT . [3] states that if executable isn’t PIE, it surely hasn’t it’s text segment randomized, also this post  states that any reference from a non-PIC/PIE code to a function from a dynamically linked library needs PLT for address resolution because the non-PIC/PIE executable expects function addresses to be static/known (which they are, because the real function call is replaced with PLT entry address which is known prior to execution).

References

[1] https://sploitfun.wordpress.com/2015/02/10/understanding-glibc-malloc/
[2] https://eli.thegreenplace.net/2011/11/03/position-independent-code-pic-in-shared-libraries/
[3] https://securityetalii.es/2013/02/03/how-effective-is-aslr-on-linux-systems/

Thank you for reading!