[C-prog-lang-l] heap corruption, oh my !

Vladimír Kotal vlada at kotalovi.cz
Sat Jul 1 16:52:46 CEST 2023


Hi all,

one of you encountered heap corruption when working on the task assignment. I talked a bit about how heap allocators work in one of the lectures. Basically, heap corruption happens when the internal data structures used by the heap allocator get corrupted.

There are a couple of things to realize:
  - heap allocators use various techniques to request memory from the operating system and another set of techniques to manage it. For the latter they can use a simple linked list but also much more complicated structures (trees, bitmaps etc.).
  - the behavior of a heap allocator can differ between versions of the same C library (e.g. because it was tuned or because the heap allocator implementation changed significantly).
  - some heap allocators offer techniques for debugging various problems.
  - lastly, different systems can have different C libraries and thus different heap allocators. This applies not only when crossing e.g. from BSD to macOS to Linux, but also between different Linux distributions (some use glibc, others musl). It also means that heap corruption in the same program might be reproducible on one system but not on another.

When something corrupts the heap, e.g. by scribbling over memory semi-randomly, it will often overwrite the header of a buffer, breaking the metadata and the linkage of the buffer in the heap allocator structures. The heap allocator may later interpret that data erroneously. The program can then crash (which might be a good thing for early detection) or it can happily continue, letting the problem propagate even further and making it much harder to find the root cause.
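To make the idea of a buffer header concrete, here is a toy sketch. The layout (struct toy_hdr), the magic value and the toy_* function names are all made up for illustration; real allocators keep different and more complex metadata. The point is only that the header sits right in front of the memory handed to the caller, so a stray write just before the buffer lands in the metadata.

```c
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical header layout, not that of any real allocator. */
struct toy_hdr {
    size_t size;    /* length of the user buffer that follows */
    size_t magic;   /* known value; if it changes, the header was overwritten */
};

#define TOY_MAGIC ((size_t)0xC0FFEE42u)

/* Allocate a buffer with the toy header placed right in front of it. */
void *toy_alloc(size_t size)
{
    struct toy_hdr *h = malloc(sizeof(*h) + size);
    if (h == NULL)
        return NULL;
    h->size = size;
    h->magic = TOY_MAGIC;
    return h + 1;   /* the caller only ever sees the memory after the header */
}

/* Return 1 if the header in front of ptr still looks intact, 0 otherwise. */
int toy_header_ok(const void *ptr)
{
    const struct toy_hdr *h = (const struct toy_hdr *)ptr - 1;
    return h->magic == TOY_MAGIC;
}

/* Release a buffer obtained from toy_alloc(). */
void toy_free(void *ptr)
{
    if (ptr != NULL)
        free((struct toy_hdr *)ptr - 1);
}
```

A write such as ((size_t *)p)[-1] = 0 on a pointer p returned by toy_alloc() clobbers the magic field, after which toy_header_ok(p) returns 0. An allocator without such a check would simply keep using the damaged header.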

The first thing you can do is to wrap the heap allocator APIs (malloc/calloc/realloc/free) and add extensive logging (i.e. log the action, the pointer address and the size, maybe also the contents of the buffer). By parsing the logs, some problems can be detected, e.g. a double free.
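A minimal sketch of such wrappers, covering only malloc and free, could look like the following. The names (log_malloc, log_free, LOG_MALLOC, LOG_FREE, double_frees) and the fixed-size table are assumptions made for this example. Besides writing the log to stderr, the sketch keeps a table of live pointers so that a suspicious free is flagged right away rather than only when the log is parsed.

```c
#include <stdio.h>
#include <stdlib.h>

/* Hypothetical tracking table of pointers that are currently allocated.
 * If more than MAX_LIVE buffers are live at once, the extra ones are not
 * tracked and their free() would be misreported; good enough for a sketch. */
#define MAX_LIVE 1024
static void *live[MAX_LIVE];
static int double_frees;    /* number of suspicious free() calls seen */

void *log_malloc(size_t size, const char *file, int line)
{
    void *p = malloc(size);
    fprintf(stderr, "%s:%d malloc(%zu) = %p\n", file, line, size, p);
    if (p != NULL) {
        for (int i = 0; i < MAX_LIVE; i++) {
            if (live[i] == NULL) {
                live[i] = p;    /* remember the pointer as live */
                break;
            }
        }
    }
    return p;
}

void log_free(void *p, const char *file, int line)
{
    fprintf(stderr, "%s:%d free(%p)\n", file, line, p);
    if (p == NULL)
        return;
    for (int i = 0; i < MAX_LIVE; i++) {
        if (live[i] == p) {
            live[i] = NULL;     /* no longer live */
            free(p);
            return;
        }
    }
    /* Not in the table: a double free or a pointer we never handed out.
     * Report it and do NOT pass it on to the real free(). */
    fprintf(stderr, "%s:%d free(%p): pointer is not live\n", file, line, p);
    double_frees++;
}

/* Convenience macros that record the call site. */
#define LOG_MALLOC(size) log_malloc((size), __FILE__, __LINE__)
#define LOG_FREE(ptr)    log_free((ptr), __FILE__, __LINE__)
```

Calling LOG_FREE(p) twice on the same pointer increments double_frees on the second call instead of corrupting the heap. calloc and realloc can be wrapped in the same way.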

Sometimes, the problem can be detected with tools. The first tool is usually the heap allocator itself, assuming it has debugging capabilities. For example, glibc has heap consistency checking (https://www.gnu.org/software/libc/manual/html_node/Heap-Consistency-Checking.html) that can be triggered by an environment variable and also by an explicit function call from within the program.

Other times, the problem can be detected by reading the source code. Incidentally, I happened to encounter such a thing in the last couple of weeks. Luckily, the environment I was using was 100% reproducible: it was single threaded, the program requested the memory in the same order on every run, and the system in question was handing out the memory deterministically. This meant that I could keep the pointer addresses in my head when debugging and drill down easily. In the end I found out that the culprit was the realloc() implementation in use, which looked basically like this (sans error checking and some details):

void *realloc(void *ptr, size_t size)
{
    void *newbuf = malloc(size);
    size_t old_len = *((size_t *)ptr - 1);  // grab the length from the buffer header
    memcpy(newbuf, ptr, old_len > size ? old_len : size);
    free(ptr);
    return newbuf;
}

The error can be easily spotted once one arrives at this function definition.
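For comparison, here is a minimal sketch of a variant that copies only the smaller of the two lengths, so that it neither reads past the end of the old buffer nor writes past the end of the new one. The header layout (union sk_hdr) and the sk_* names are assumptions made for this example and are not taken from the implementation I was debugging.

```c
#include <stddef.h>
#include <stdlib.h>
#include <string.h>

/* Assumed header: the requested length is stored right before the user
 * buffer, padded so that the user buffer keeps malloc's alignment. */
union sk_hdr {
    size_t len;
    max_align_t pad;
};

void *sk_malloc(size_t size)
{
    union sk_hdr *h = malloc(sizeof(*h) + size);
    if (h == NULL)
        return NULL;
    h->len = size;
    return h + 1;
}

void sk_free(void *ptr)
{
    if (ptr != NULL)
        free((union sk_hdr *)ptr - 1);
}

void *sk_realloc(void *ptr, size_t size)
{
    if (ptr == NULL)
        return sk_malloc(size);     /* same convention as realloc(NULL, n) */

    void *newbuf = sk_malloc(size);
    if (newbuf == NULL)
        return NULL;                /* the old buffer stays valid */

    size_t old_len = ((union sk_hdr *)ptr - 1)->len;
    /* Copy only what fits in BOTH buffers: the smaller of the two lengths. */
    memcpy(newbuf, ptr, old_len < size ? old_len : size);
    sk_free(ptr);
    return newbuf;
}
```

With this version, shrinking a buffer copies only size bytes, and growing it copies only old_len bytes; the rest of the new buffer is left uninitialized, as with the standard realloc().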

The trouble with detecting heap corruption is that one would have to check all memory accesses. This is hard to do because memory is accessed through pointers. In such a case Valgrind can help you, assuming the corruption can be reproduced readily.

Best regards,


V. Kotal