Author: Zheng Luo
Last update: 2024-08-31
When programming in native languages with manual memory management, developers at my company often run into memory issues such as use-after-free, double-free, reads of uninitialized data, and memory leaks, even though they are very experienced.
Fortunately, after decades of industry-wide experience combating these issues, we already have a full toolkit for tackling memory contract violations. These tools are used extensively everywhere I've worked and play an important role in code quality:
- clang-tidy for single-file checks and CodeChecker for cross-file checks. They catch only some memory issues and have a moderate false-positive rate. Users typically need to set up enforced checks to keep their codebase compliant.
- ASAN, MSAN, UBSAN, and valgrind: In my personal experience, they typically slow programs down by 2x-10x depending on memory access patterns, and they reliably catch use-after-free, double-free, and memory leaks. My current company runs all unit tests under sanitizers.
- Compiler hardening flags: std::vector subscripting is checked, stack overflow is partially detected, and bad malloc_chk calls can be caught. My current company plans to enable them on all production binaries that aren't too performance-critical. These flags typically cause a 5%-20% slowdown in production.
- With GLIBC_TUNABLES=glibc.malloc.perturb=125 set when running the program, glibc will initialize all malloc-ed memory blocks before returning them to users. This often helps users catch memory issues earlier, as a non-zero out-of-range value is easier to notice than random (largely zero) bytes from the heap. Depending on the malloc frequency, this can cause a 2%-20% slowdown.
Recently, I found another layer of defense that supplements the above guards against memory issues: secure memory allocators. These are memory allocator implementations that replace the native (usually glibc) malloc, free, and other primitives, providing better detection of memory issues and protecting the heap layout against malicious attackers. Moreover, they often come with minimal (2%-15%) slowdown, which makes them practical for production workloads.
Secure memory allocators are not a new concept. You may already be using them on your devices. Since Android 11, Scudo, a memory allocator designed specifically for safety, has been enabled by default for all native code. Chromium also has its own allocator designed with safety in mind: PartitionAlloc provides more security than the system-provided allocators.
What kind of protection do they provide? Let's dive into one of my favorite choices in this domain, GrapheneOS's hardened_malloc, to find out. The usage is simple:
Internally, this small script just preloads libhardened_malloc.so into the given program and replaces malloc, free, and other allocation primitives (e.g., posix_memalign, C++'s allocate, or memcpy for extra checks):
To test it, let's create a heap buffer overflow:
The program overflows if the first argument exceeds 16 bytes. valgrind reveals the issue:
However, if you run it directly, it triggers no alert and exits cleanly. This is a serious security concern, as attackers could exploit the overflow to overwrite internal program state and then take over the program's control flow. There's plenty of research on exploiting heap layouts (known by the cool name Heap Feng Shui, if you haven't heard of it).
Hardened malloc detects this memory issue pretty cleanly, preventing further escalation:
Judging from the error messages, we can guess the underlying methodology: it places a canary value at the end of the allocation and checks its value at deallocation. When the value mismatches, it will ruthlessly abort() the program, assuming that it has entered an unrecoverable state.
Let's try another use-after-free example:
Unfortunately, this time hardened_malloc didn't detect it, and the program exits normally. That's not ideal, as valgrind illustrates the problem:
That doesn't mean hardened_malloc does nothing in this case. First of all, it randomizes slot reuse for small memory allocations, so the overwritten memory chunk does not easily let attackers control the data flow deterministically. Secondly, hardened_malloc's UAF check only happens when a chunk is re-allocated, so it will complain correctly if we change the program to the below:
This detection isn't always triggered, as there's a chance that later allocation requests never touch the UAF-ed chunk. This example demonstrates the limits of secure allocators in guarding against memory misuse. Therefore, in a test environment, you should still resort to sanitizers if performance isn't too big of an issue.
The above failure doesn't mean that no secure allocator can detect this memory corruption. Hardened malloc deliberately chose a design that prioritizes low performance impact and heap security over detecting memory issues. Let's look at another secure allocator, Microsoft's hardened snmalloc. It correctly and reliably catches the UAF in the original program example:
Internally, our write-after-free corrupted snmalloc's free list. On program exit, snmalloc iterates the free list and verifies that the backward edge of each entry matches the internally encoded version. Since our example corrupted the free list, the edge mismatched and triggered this alert:
The above examples illustrate how secure allocators differ in capability. At a high level, I found ISOAlloc's feature matrix a good resource for comparing the feature sets of different secure allocators. It's worth noting that even if two secure allocators support the same feature (e.g., a canary after each heap allocation), the ability to detect memory corruption still depends on each allocator's implementation (e.g., the size of the canary, the canary value, and when the value is checked). Overall, I found hardened_malloc and hardened snmalloc to be the top two contenders in this domain.
Most secure allocators are designed to be used in production with minimal performance overhead, and I highly recommend enabling them by default in your own deployments. That said, I found a few tricky issues that prevent enabling them by default at the OS level and are worth noting:
Personally, I recommend the below setup, assuming that you can control both lab and production environments:
- -fno-omit-frame-pointer in the lab environment so that it's easier to reproduce the issue with acceptable slowdown.
Secure allocators are still in their early stages and have a lot of potential. In the coming years, we can expect better support from operating systems and hardware to help them catch memory issues more precisely:
- userfaultfd, since Linux 5.7, allows users to get notified when something attempts to write to a page. This can be used to detect unauthorized writes in userspace.
There's even more progress in this domain on the horizon. Academia has been working on CHERI, a set of instructions to explicitly tag memory access patterns. Surely we will see more secure, reliable, and easy-to-use memory corruption detection in the foreseeable future.