Custom memory allocators in C++ and you

“but lil abi why can’t we just rely on new????”

4 min readOct 10, 2022

Introduction

On Reddit, some time ago, I was involved in heated discussion about the importance of writing your own memory allocators. Now keep in mind, you don’t have to beat tcmalloc, jmalloc, or the Doug Lea allocators. Just know how the work under the hood and the vague implementation is good enough. Anyways, we were discussing about a good “beginner c++ project”, and I had recommend that the original poster should design and write out a custom memory allocator so they can start thinking about programming in terms of being cache efficient and being performant. As we start hitting the physical limits of Moore’s law, our computers are becoming no much faster then previous generations. Now to get gains in computing speeds, we must look to other areas of optimization.

What is a memory allocator though?

Memory allocators allows us to partition a huge block of system memory into smaller chunks of memory. Essentially what this means is that we can break down even more smaller chunks of memory and define some sort of data structure and algorithm that manages those smaller pieces of memory. But what is the purpose for going through all this work just to allocate memory? With custom memory allocators, we improve memory efficiency by limiting the amount of memory fragmentation and we also cut down the overhead incurred when making new calls. We also have an increase in performance and a more efficient use of the cache (lower cache miss) through better spatial locality (fancy talk for related objects being close together in memory) of similar objects. Custom memory allocators have their uses in game dev and embedded system programming.

Designing a memory allocator

Now that we have looked at what a memory allocator is and the purpose of building one, lets look at how we can design a memory allocator. In a high level view, a memory allocator can be defined by two different parts: the interface, and the data structure that manages the memory chunks.

A simple implementation of a memory allocator’s interface could look like this:

Any custom allocator that we build for a particular problem, should inherit from this base class and implement these functions. In our constructor function, we call malloc and reserve a buffer of memory for use later. In the Allocate function, we pass in the size of memory we want to “allocate” and an alignment, then find an available chunk of memory and return an aligned pointer to it. In the Free function, we pass in the pointer to a constructed object and “free” it. To manage the collection of chunks of memory we “allocate” we apply some sort of a data structure and algorithm for basic operations, such as: inserting chunks, remove chunks, and finding a certain chunk. It is important to choose the right data structure and algorithm as they can have a drastic effect on the performance.

Lets look at some allocators.

Linear Allocator

The linear allocator is one of the simplest allocator you can make. The idea of this allocator is to allocate memory in a linear order. Each subsequent call to the Allocate function will move an offset pointer ahead the amount requested. The offset pointer begins starting at the front of the buffer. The advantages of this buffer is that objects that are allocated are kept closer together, so that they end up on the same cache line. For this type of allocator, the Free function does not free individual chunks, the entire buffer of memory is wiped.

Stack Allocator

The Stack allocator is similar to the Linear allocator but we are able to free memory chunks only if . Keeping an offset pointer to the beginning off the buffer, we move it a linear fashion each time we call the Allocate function and attach a tag at the start of the memory chunk. The tag is an object that gives some information about the memory chunk. In this tag we note the size of the chunk. The Free function can only free memory chunks if it is on the top of the stack.

Pool Allocator

The Pool allocator takes the entire buffer of memory and splits it into equal size parts. Each chunk contains a tag with a flag (🔥) if the block is free or not. The Allocate function, looks for a free block, then marks it used and returns a pointer to it. The Free function takes the incoming chunk and marks it free.

Free List Allocator

The Free List allocator manages a list of free blocks and allocated blocks. The Allocate function searches for a free block matching the size needed or for a bigger chunk size to split. Upon selecting the right block the function returns a pointer to an allocated block. The Free function takes a chunk of memory and adds it back to the free list, if possible it will merge with one of the other free blocks.

You can improve the performance of a Free List allocator by using a red black tree as the internal data structure; we achieve a worst case time complexity of O(log n).

Conclusion

Check out my Github repo for a full implementation on these allocators.