UNIX and Stack Smashing

Posted by Janani Kehelwala on November 14, 2018 · 7 mins read

Let us look into what buffer overflow vulnerabilities are, which aspects of UNIX system design allow their (popular and continued) exploitation, and what precautions can be taken against them.

Design aspects of UNIX systems that allow Buffer Overflows to exist

  • UNIX file system permissions rely on the all-powerful root user, so certain essential programs must run with root privileges to function properly. This is accomplished through the SUID bit in the file permissions, which lets a non-root user run a program with the privileges of the file’s owner (usually root for such crucial programs). If one of these processes can be tricked into spawning a shell (or executing any other harmful code in privileged mode), the system can be compromised even from a non-root user’s account.
  • The C programming language leaves correct memory management to the developer, in favor of the simplicity and efficiency needed for interacting with hardware. It performs no automatic bounds checking and gives programs direct access to memory addresses, so if a running program can be induced to write outside a variable’s allocated space, stack smashing becomes possible.
  • Storing the return address on the stack, next to the local variables, is a major design aspect that enables buffer overflow exploits. This is demonstrated in the next section.

How is stack smashing done?

Stack smashing occurs when data is written to a buffer that is too small to hold it. Because of how local variables are laid out on the stack, the excess data runs past the end of the buffer and overwrites the function’s return address. If the overwritten return address holds an arbitrary value, it most likely points outside the process’s address space, and the program crashes with a segmentation fault when the function returns. However, if the address is made to point to a location that contains executable machine code, the overflow can be exploited to run code of the attacker’s choosing.

The buffers used for this exploit are dynamic (automatic) variables, which are allocated on the stack at run time. They are distinct from uninitialized static data, which lives in the BSS region and is zero-filled when the program is loaded. If the data region or the stack grows beyond the available memory, the process is blocked and rescheduled to run with a larger memory space.

In a regular function, the stack pointer points to the top of the stack, that is, the last element pushed. The base pointer (frame pointer) holds a fixed address within the function’s stack frame, so function parameters are found at positive offsets from it and local variables at negative offsets.

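void function(int a, int b, int c) {
	/* Local buffers are allocated in this function's stack frame. */
	char buffer1[5];
	char buffer2[10];
}

int main(void) {
	/* The arguments are pushed first, then the return address. */
	function(1,2,3);
	return 0;
}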

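On a classic x86 system, where the stack grows towards lower addresses, the frame of function() is laid out in memory as follows, from lower to higher addresses: buffer2, buffer1, the saved base pointer, the return address, and then the arguments a, b and c.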

Thus, if more data is written to one of the buffers than it can hold, the excess overwrites the saved base pointer and then the return address stored above it in memory.

Therefore, an attacker supplies a string containing shellcode (machine instructions encoded as a string of hexadecimal bytes, obtained with a debugger or reverse engineering tool) and overwrites the return address so that it points at that shellcode. Since the processor blindly jumps to whatever return address is stored when the function exits, arbitrary code can be executed.

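Knowing the memory address where the shellcode resides is crucial for a successful exploit. Since the exact address is rarely known in advance, two mechanisms are used to improve the odds of a correct guess.

  1. Padding the front of the shellcode with no-op (NOP) instructions, so that a jump anywhere into the padding slides into the shellcode
  2. Following the shellcode with many copies of the ‘guessed’ return address, so that one of them lands on the saved return address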

Prevention Mechanisms

  • Centralized Approaches
    • Modifying system libraries, operating system kernel
    • Cheap but unstable in the long term
  • Decentralized Approaches
    • Modifying privileged programs and C language compilers.
    • Expensive but durable

Modifying code

Use of the standard C byte-copy and concatenation functions (such as strcpy and strcat) can be restricted. Safer, length-limited alternatives (such as strncpy and strncat) can be used in their stead, or explicit bounds checking can be performed before the copy.

Validating input in general, including shell environment variables and excessively long command line arguments, is good practice and also defends against many other attacks, such as privileged shell escapes using special characters. If the input is invalid, the program can be terminated.

Programs such as CodeCenter, Purify, GCT and Electric Fence help programmers locate buffer overflows and illegal operations that standard C compilers miss. However, they are debugging tools and do not consider the underlying system design or permissions, so they will not detect every vulnerability.

While difficult, manual review and modification of the code may be the only foolproof solution to buffer overflow attacks.

Compiler Modifications

Compilers already generate warnings when dangerous functions are used. Various modifications have been proposed, such as checking the integrity of the stack before the return address is used, adding bounds checking, and representing pointers differently so that bounds checking can be performed. However, each has its disadvantages: performance hits, portability issues, failure to inspect programmer-defined functions, violation of C’s philosophy, and the need to upgrade system binaries on a global scale.

Stack execution privilege modifications

This method proposes modifying the kernel’s code segment limit so that it no longer covers the actual stack space. The stack is thereby marked as non-executable, so code placed on it cannot be run. Since running code on the stack is what these exploits depend on, this efficiently prevents such buffer overflow attacks. However, LISP and Objective-C compilers, which rely on trampoline functions, and programs that use nested functions do not work properly with this kernel patch. Furthermore, since signal handlers require an executable stack, an exception has to be made for them, and buffer overflow vulnerabilities could still arise in that section of memory.

In conclusion, the most effective solution is to eliminate the problem at the source, by properly auditing the code of privileged programs so that buffer overflows cannot occur within them.
