Let's examine how the heap is used for dynamically allocated memory.
You request memory buffers or blocks of a particular size from the runtime environment by using malloc(), realloc(), or calloc(), and you release them back to the runtime environment when you no longer need them by using free(). The C++ new and delete operators are built on top of malloc() and free(), so this discussion applies to them as well.
The memory allocator satisfies your requests by managing a region of the program's memory known as the heap. Within the heap, the allocator tracks information about every block that it has allocated to your program, such as the block's original size, so that it can make the memory available again for later allocation requests. When a block is released, the allocator places it on a list of available blocks called a free list. It usually keeps the information about a block in a header that precedes the block itself in memory.
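To make the idea of a header that precedes the block concrete, here is a minimal sketch layered on top of malloc(). The names tracked_alloc(), tracked_size(), tracked_free(), and the block_header layout are hypothetical and exist only for this illustration; the real allocator's header format isn't described here and almost certainly differs.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical header layout, for illustration only. */
union block_header {
    size_t size;        /* size the caller originally asked for */
    max_align_t align;  /* keeps the user block suitably aligned */
};

/* Allocate `size` bytes and record the size in a header placed just
 * before the memory handed back to the caller. (Overflow of the sum
 * is not checked in this sketch.) */
void *tracked_alloc(size_t size)
{
    union block_header *h = malloc(sizeof *h + size);
    if (h == NULL)
        return NULL;
    h->size = size;
    return h + 1;       /* the user block starts right after the header */
}

/* Step back from the user pointer to the header to recover the size. */
size_t tracked_size(const void *ptr)
{
    assert(ptr != NULL);
    return ((const union block_header *)ptr - 1)->size;
}

/* Release the whole allocation, header included. */
void tracked_free(void *ptr)
{
    if (ptr != NULL)
        free((union block_header *)ptr - 1);
}
```

Because the size sits at a fixed offset before the user pointer, a release routine needs only the pointer to find out how large the block is, which is why free() takes no size argument.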
The runtime environment grows the size of the heap when it no longer has enough memory available to satisfy allocation requests, and it may return memory from the heap to the OS when the program releases memory.
The basic heap allocation mechanism is split into two pieces: a chunk-based small block allocator and a list-based large block allocator. By setting specific parameters, you can choose the chunk sizes used by the small block allocator as well as the boundary between the small and large allocators.
Both the small and large block allocators allocate and deallocate memory from the OS in the form of chunks known as arenas, by calling mmap() and munmap(). By default, the arena size is 32 KB. It must be a multiple of 4 KB and must currently be less than 256 KB. If your program requests a block that's larger than an arena, the allocator gets a block whose size is a multiple of the arena size from the process manager, gives your program a block of the requested size, and puts any remaining memory on a free list.
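The arena rules above come down to a little arithmetic. The following sketch checks whether a candidate arena size is acceptable and computes how much memory would be requested for an oversized block. The function and macro names are hypothetical, and the calculation ignores any per-block header overhead, which the text doesn't quantify.

```c
#include <assert.h>
#include <stddef.h>

#define ARENA_DEFAULT (32u * 1024u)   /* default arena size (from the text) */
#define ARENA_STEP    (4u * 1024u)    /* arena size must be a multiple of this */
#define ARENA_LIMIT   (256u * 1024u)  /* arena size must be less than this */

/* Nonzero if `arena` is a multiple of 4 KB and less than 256 KB. */
int arena_size_is_valid(size_t arena)
{
    return arena > 0 && arena % ARENA_STEP == 0 && arena < ARENA_LIMIT;
}

/* Round `request` up to the next multiple of the arena size. */
size_t arena_request_size(size_t request, size_t arena)
{
    assert(arena_size_is_valid(arena));
    return (request + arena - 1) / arena * arena;
}
```

For example, with the default 32 KB arena, a 100000-byte request rounds up to 131072 bytes (four arenas), which would leave roughly 31072 bytes to go onto the free list.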
You can configure the arena size by doing one of the following:
The allocator also attempts to cache recently freed blocks. In QNX Neutrino 6.6 or later, this cache is used only for blocks that are the current arena size or smaller. You can configure the arena cache by setting the following environment variables:
Alternatively, you can call:
mallopt(MALLOC_ARENA_CACHE_MAXSZ, size);
mallopt(MALLOC_ARENA_CACHE_MAXBLK, number);
To tell the allocator to never release memory back to the OS, you can set the MALLOC_MEMORY_HOLD environment variable to 1:
export MALLOC_MEMORY_HOLD=1
or call:
mallopt(MALLOC_MEMORY_HOLD, 1);
Once you've used mallopt() to change the values of MALLOC_ARENA_CACHE_MAXSZ and MALLOC_ARENA_CACHE_MAXBLK, you can call mallopt() again with a command of MALLOC_ARENA_CACHE_FREE_NOW to immediately adjust the arena cache. The behavior depends on the value argument:
If you don't use the MALLOC_ARENA_CACHE_FREE_NOW command, the changes made to the cache parameters take effect whenever memory is subsequently released to the cache.
You can preallocate and populate the arena cache by setting the MALLOC_MEMORY_PREALLOCATE environment variable to a value that specifies the size of the total arena cache. The cache is populated by multiple arena allocation calls in chunks whose size is specified by the value of MALLOC_ARENA_SIZE.
The preallocation option doesn't alter the MALLOC_ARENA_CACHE_MAXBLK and MALLOC_ARENA_CACHE_MAXSZ options. So if you preallocate 10 MB of memory in cache blocks, and you want to ensure that this memory stays in the application throughout the lifetime of the application, you should also set the values of MALLOC_ARENA_CACHE_MAXBLK and MALLOC_ARENA_CACHE_MAXSZ to something appropriate.
The large block allocator uses a free list to keep track of any available blocks. To minimize fragmentation, the allocator uses a first-fit algorithm to determine which block to use to service a request. If the allocator doesn't have a block that's large enough, it uses mmap() to get memory from the OS in multiples of the arena size, and then carves out the appropriate user pieces from this, putting the remaining memory onto the free list.
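A first-fit search can be sketched as a single walk over a singly linked free list that stops at the first block big enough for the request. The free_block structure and first_fit() function below are hypothetical names for illustration; the sketch doesn't split the chosen block or return the leftover to the list, as a real allocator would.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical free-list node, for illustration only. */
struct free_block {
    size_t size;              /* usable bytes in this free block */
    struct free_block *next;  /* next free block, or NULL */
};

/* Unlink and return the first block whose size is at least `request`,
 * or NULL if no block on the list is large enough. */
struct free_block *first_fit(struct free_block **head, size_t request)
{
    assert(head != NULL);
    for (struct free_block **link = head; *link != NULL;
         link = &(*link)->next) {
        if ((*link)->size >= request) {
            struct free_block *hit = *link;
            *link = hit->next;  /* remove the block from the list */
            hit->next = NULL;
            return hit;
        }
    }
    return NULL;
}
```

Given a list holding blocks of 64, 256, 128, and 512 bytes in that order, a 100-byte request gets the 256-byte block: it's the first one that fits, even though the 128-byte block further along would be a tighter match.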
If all the memory that makes up an arena is eventually freed, the arena is returned to the OS. In QNX Neutrino 6.6 or later, when you free a block of memory that's larger than the arena size, the allocator uses munmap() to immediately return the block to the OS (unless you've set MALLOC_MEMORY_HOLD to 1 to prevent the allocator from releasing freed memory back to the OS). Freed blocks that are the arena size or smaller are cached according to the cache settings.
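The release rules in this paragraph can be summarized as a small decision function. This is only a sketch of the described behavior; the enum and function names are hypothetical, and the memory_hold parameter stands in for the MALLOC_MEMORY_HOLD setting.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical outcomes for a freed block, for illustration only. */
enum free_action {
    RETURN_TO_OS,    /* unmapped immediately with munmap() */
    KEEP_IN_HEAP,    /* MALLOC_MEMORY_HOLD prevents the release */
    OFFER_TO_CACHE   /* handled according to the arena cache settings */
};

/* Decide what happens to a freed block of `block_size` bytes. */
enum free_action classify_free(size_t block_size, size_t arena_size,
                               int memory_hold)
{
    assert(arena_size > 0);
    if (block_size > arena_size)
        return memory_hold ? KEEP_IN_HEAP : RETURN_TO_OS;
    return OFFER_TO_CACHE;
}
```

With the default 32 KB arena, freeing a 64 KB block yields RETURN_TO_OS (or KEEP_IN_HEAP when the hold setting is on), while freeing a block of exactly 32 KB yields OFFER_TO_CACHE.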
The small block allocator manages a pool of memory blocks of different sizes. These blocks are arranged into linked lists called bands; each band contains blocks that are the same size. When your program allocates a small amount of memory, the small block allocator returns a block from the band that best fits your request. Allocations larger than the largest band size are serviced by the large allocator. If there are no more blocks available in the band, the allocator uses mmap() to get an arena from the OS and then divides it into blocks of the required size.
The allocator initially adjusts all band sizes to be multiples of _MALLOC_ALIGN (which is 8 for 32-bit architectures, and 16 for 64-bit architectures). The allocator normalizes the size of each pool so that each band has as many blocks as can be carved from a 4 KB piece of memory, taking into account alignment restrictions and overhead needed by the allocator to manage the blocks. The default band sizes and pool sizes are as follows:
Band size | Number of blocks (32-bit) | Number of blocks (64-bit) |
---|---|---|
16 | 167 | 123 |
24 [a] | 125 | n/a |
32 | 100 | 82 |
48 | 71 | 61 |
64 | 55 | 49 |
80 | 45 | 41 |
96 | 38 | 35 |
128 | 29 | 27 |
[a] This band is defined only when you compile for 32-bit architectures.
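Choosing a band for a request amounts to finding the smallest band whose blocks can hold it. The sketch below uses the default 32-bit band sizes from the table above; select_band() and default_bands_32 are hypothetical names, and a return value of -1 stands for "hand the request to the large block allocator."

```c
#include <assert.h>
#include <stddef.h>

/* Default 32-bit band sizes, taken from the table above. */
static const size_t default_bands_32[] = { 16, 24, 32, 48, 64, 80, 96, 128 };

/* Return the index of the smallest band that can hold `size` bytes,
 * or -1 if the request is larger than the largest band. The `bands`
 * array must be sorted in ascending order. */
int select_band(const size_t *bands, size_t nbands, size_t size)
{
    assert(bands != NULL);
    for (size_t i = 0; i < nbands; i++) {
        if (size <= bands[i])
            return (int)i;
    }
    return -1;  /* serviced by the large block allocator */
}
```

A 20-byte request lands in the 24-byte band (index 1), a 128-byte request in the last band, and a 129-byte request falls through to the large block allocator.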
When band preallocation is combined with the MALLOC_MEMORY_PREALLOCATE option for the arena cache, the allocator first populates the arena cache and then allocates the bands from it.
You can configure the bands by setting the MALLOC_BAND_CONFIG_STR environment variable to a string in this format:
N:s1,n1,p1:s2,n2,p2:s3,n3,p3: ... :sN,nN,pN
where the components are:
The parsing is simple and strict:
If the allocator can't parse the string or rejects it as invalid, it ignores the string completely.
For example, setting MALLOC_BAND_CONFIG_STR to:
8:2,32,60:15,32,60:29,32,60:55,24,60:100,24,60:130,24,60:260,8,60:600,4,60
specifies these bands, with 60 blocks preallocated for each band:
Band size | Number of blocks |
---|---|
2 | 32 |
15 | 32 |
29 | 32 |
55 | 24 |
100 | 24 |
130 | 24 |
260 | 8 |
600 | 4 |
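To illustrate the string format, here is a minimal parser sketch. It assumes, based on the example above, that the leading number is the band count and that each triple is the band size, the number of blocks, and the number of preallocated blocks. The strictness checks it applies (decimal digits only, no whitespace, band count must match exactly) are assumptions made for this sketch and may not match the allocator's actual rules; band_cfg and parse_band_config() are hypothetical names.

```c
#include <assert.h>
#include <stdlib.h>

/* One s,n,p triple from the configuration string. */
struct band_cfg {
    unsigned long size;      /* s: band (block) size in bytes */
    unsigned long nblocks;   /* n: number of blocks */
    unsigned long prealloc;  /* p: number of blocks to preallocate */
};

/* Read one unsigned decimal field that must be followed by `delim`
 * ('\0' means end of string). Returns 0 on success, -1 on error. */
static int read_field(const char **p, char delim, unsigned long *out)
{
    char *end;
    if (**p < '0' || **p > '9')
        return -1;               /* reject signs, spaces, empty fields */
    *out = strtoul(*p, &end, 10);
    if (*end != delim)
        return -1;
    *p = (delim == '\0') ? end : end + 1;
    return 0;
}

/* Parse "N:s1,n1,p1:...:sN,nN,pN" into `bands`. Returns the number
 * of bands, or -1 if the string is malformed or has more than `max`. */
int parse_band_config(const char *s, struct band_cfg *bands, int max)
{
    unsigned long n;
    assert(s != NULL && bands != NULL && max > 0);
    if (read_field(&s, ':', &n) != 0 || n == 0 || n > (unsigned long)max)
        return -1;
    for (unsigned long i = 0; i < n; i++) {
        char last = (i + 1 == n) ? '\0' : ':';
        if (read_field(&s, ',', &bands[i].size) != 0 ||
            read_field(&s, ',', &bands[i].nblocks) != 0 ||
            read_field(&s, last, &bands[i].prealloc) != 0)
            return -1;
    }
    return (int)n;
}
```

Parsing the example string yields eight entries, the fourth of which has a size of 55, a block count of 24, and a preallocation count of 60. A string such as "2:16,32,60", whose count doesn't match the number of triples, is rejected as a whole.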
For a 32-bit architecture, the allocator might normalize this configuration as follows:
Original band size | Adjusted band size | Number of blocks |
---|---|---|
2 | 8 | 251 |
15 | 16 | 167 |
29 | 32 | 100 |
55 | 56 | 62 |
100 | 104 | 35 |
130 | 136 | 27 |
260 | 264 | 14 |
600 | 600 | 6 |
For the above configuration, allocations larger than 600 bytes are serviced by the large block allocator.
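The normalization shown in the table can be approximated in two steps: round each band size up to a multiple of _MALLOC_ALIGN, then work out how many blocks fit in a 4 KB pool. The rounding step follows directly from the text. The block count depends on allocator overheads that aren't documented here, so the 8-byte per-block and 80-byte per-pool values below are assumptions, chosen only because they reproduce the 32-bit figures in the tables above; treat them as a fit, not as the allocator's real layout. All names are hypothetical.

```c
#include <assert.h>
#include <stddef.h>

#define POOL_BYTES     4096u  /* each pool is carved from 4 KB (from the text) */
#define ALIGN_32BIT    8u     /* _MALLOC_ALIGN on 32-bit targets (from the text) */
#define BLOCK_OVERHEAD 8u     /* ASSUMPTION: fitted to the tables, undocumented */
#define POOL_OVERHEAD  80u    /* ASSUMPTION: fitted to the tables, undocumented */

/* Round a requested band size up to the next multiple of `align`. */
size_t normalize_band_size(size_t size, size_t align)
{
    assert(size > 0 && align > 0);
    return (size + align - 1) / align * align;
}

/* Estimate how many blocks of an adjusted band size fit in one pool,
 * using the assumed overheads above. */
size_t blocks_per_pool(size_t band_size)
{
    assert(band_size > 0);
    return (POOL_BYTES - POOL_OVERHEAD) / (band_size + BLOCK_OVERHEAD);
}
```

Under these assumptions, a requested size of 55 becomes a 56-byte band with 62 blocks, and a requested size of 2 becomes an 8-byte band with 251 blocks, matching the table. The 64-bit figures imply different overheads and aren't covered by this sketch.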