Comment: Buffers cause heaps of problems

Hackers have exploited buffer overflow weaknesses in stacks since the 1980s. Now a new variation involving memory heaps could catch many firms unawares, says Neil Barrett

The first hacking "magic wand" I saw was the buffer overflow exploit used by Robert Morris in the original Internet Worm back in the late 1980s.

Morris's father worked for the National Security Agency and noticed that many programs written in C had two interesting characteristics. First, variables within functions were created and accessed within a structured portion of memory called the stack. Second, array variables used as program buffers were not checked for remaining space before further characters were added to them.

The result was that characters could be written past the end of the buffer variable, overwriting the stack's structure information, including the return address: the location to which control passes when the function finishes. That in turn made it possible to have a function execute instructions that had been loaded into the buffer. A program with a 512-byte array declared to hold user input could have that buffer filled with a short sequence of instructions, and the return address stored beyond the buffer could be overwritten so that control jumped to the start of the buffered instructions rather than back into the main body of the program.
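To make the mechanism concrete, here is a minimal sketch in C. It does not attack a real stack: the struct fake_frame, its 16-byte buffer and the function names are all hypothetical stand-ins, chosen so that the demonstration stays within defined behaviour. Real stack layouts differ between compilers and processors.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical stand-in for a stack frame: a small input buffer followed
   directly by the saved return address. Real frames vary by compiler and CPU. */
struct fake_frame {
    unsigned char buffer[16];
    uintptr_t return_address;
};

/* Copies input without checking its length against buffer[], as the
   vulnerable programs did. The copy is capped at the size of the whole
   struct only so that this demonstration cannot damage anything else. */
void unchecked_copy(struct fake_frame *frame, const unsigned char *input,
                    size_t length)
{
    assert(frame != NULL && input != NULL);
    if (length > sizeof *frame) {
        length = sizeof *frame;
    }
    memcpy(frame, input, length);  /* anything past buffer[] lands on return_address */
}

/* Returns 1 if the saved return address no longer matches the original. */
int return_address_changed(const struct fake_frame *frame, uintptr_t original)
{
    assert(frame != NULL);
    return frame->return_address != original;
}
```

Copying exactly 16 bytes leaves return_address alone; copying sizeof(struct fake_frame) bytes replaces it with whatever the input supplied, which is the step an attacker uses to redirect the program.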

I marvelled at the cleverness of the trick. It was, of course, simple enough to stop it working. Programs could be patched to include checks on the buffer contents; and the architecture of the underlying computers could be altered to inhibit execution of data values on the stack. I thought the trick would have a shelf-life of two years, but 15 years later programs are still found that can be hacked in this way.
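The kind of check that such patches add can be sketched in a few lines of C. This is an illustration of the general idea rather than code from any particular patched program, and checked_copy is a hypothetical name.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Copies input into buffer without ever writing more than capacity bytes.
   The result is always terminated. Returns 0 if the whole input fitted,
   or -1 if it had to be truncated. */
int checked_copy(char *buffer, size_t capacity, const char *input)
{
    size_t length;

    assert(buffer != NULL && input != NULL && capacity > 0);

    length = strlen(input);
    if (length >= capacity) {
        /* Too long: keep only what fits and report the truncation. */
        memcpy(buffer, input, capacity - 1);
        buffer[capacity - 1] = '\0';
        return -1;
    }

    memcpy(buffer, input, length + 1);  /* includes the terminator */
    return 0;
}
```

The design choice here is to truncate and report rather than to abort; a caller handling untrusted input would normally treat the -1 result as a reason to reject the request.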

Many apps use library code that still contains unchecked buffers; and even new code exhibits the problem, introduced by careless programmers and undetected by inadequate testing. Now, a new version of this old problem is appearing: a version that does not rely on overflowing stack-held variables and so is not stopped by the architectural approach of inhibiting stack execution. In this new version, an alternative memory location is used: the program heap.

Variables that are declared within program functions are created on the stack, but memory that a program requests while it is running, for example with malloc in C, comes from a less rigidly structured portion of memory. This is called the heap. The memory allocator hands out as much of this memory as is required, typically arranging it as a linked list of blocks. Stored alongside each block is bookkeeping information, including pointer values that tell the allocator where the next blocks are to be found when more space is needed.

Just as buffers declared within the stack can be overwritten, so too can buffers that have been allocated within this heap. And when those buffers are overrun, it becomes possible to alter the pointer value stored next to them. In particular, it becomes possible to make that pointer hold an address in the body of the program. When another attempt to allocate space on the heap is then made, the space handed out lies within the program itself. This second buffer can then be filled with new instructions and the program is hacked.
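The same idea can be sketched for the heap. What follows is a toy simulation, not a real allocator: toy_block, toy_heap, toy_alloc and unchecked_write are hypothetical names, and production allocators keep their bookkeeping in more elaborate forms. The point is only that the allocator trusts a pointer stored next to user data.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define DATA_SIZE 16

/* Hypothetical heap block: user data followed directly by the allocator's
   pointer to the block it should hand out next. */
struct toy_block {
    unsigned char data[DATA_SIZE];
    struct toy_block *next;
};

struct toy_heap {
    struct toy_block *first;  /* block returned by the first allocation */
    struct toy_block *last;   /* most recently allocated block, or NULL */
};

/* Hands out the next block by following the pointer stored in the block
   allocated last. The pointer is trusted without any validation. */
unsigned char *toy_alloc(struct toy_heap *heap)
{
    struct toy_block *block;

    assert(heap != NULL);
    block = (heap->last == NULL) ? heap->first : heap->last->next;
    if (block == NULL) {
        return NULL;
    }
    heap->last = block;
    return block->data;
}

/* Writes into a block with no check against DATA_SIZE, so longer input
   spills into the next pointer. The copy is capped at the size of the
   whole block only to keep this demonstration well defined. */
void unchecked_write(struct toy_block *block, const unsigned char *input,
                     size_t length)
{
    assert(block != NULL && input != NULL);
    if (length > sizeof *block) {
        length = sizeof *block;
    }
    memcpy(block, input, length);
}
```

If the input written into the first block carries 16 filler bytes followed by the address of some other toy_block, the next call to toy_alloc returns that block's data instead of the one the allocator intended, which mirrors the redirected allocation described above.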

This trick is only just beginning to be exploited. Soon increasing numbers of vulnerable programs will be found and a new generation of system hacks will become evident.
