4.2 Big Block Area Implementation

For memory allocations requests above 4KB, LeapHeap applies the algorithm of 3.2 List Algorithm to the data structures of 2.2 Big-Block Area Data Structure.

The addr field of a block descriptor is implemented as the 32-bit address of the corresponding block in heap memory. The allocd field is implemented as one of the low-order bits of the addr field, not otherwise used because of 8-byte block alignment. The penult field is implemented as a 16-bit count. The nxt field is implemented as the 16-bit zero-based index within the descriptor pool of the succeeding descriptor. The latter two fields are jointly modifiable by the 32-bit Intel CMPXCHG instruction.

When an allocator finds a free block it performs a block-fit operation if the size of the block meets the size of the request but does not exceed it by more than 6%. If the free block is at least double the size of the request it performs block-split. Otherwise the block is passed over. This rule is used for all but the very largest blocks.

Block descriptor usage is tracked by means of a 3-tier cascading bitmap of the type discussed in section 2.1 Compartment Area Data Structure. The bits in the bitmap are set by allocators that perform block-split and block-append operations; they are cleared by allocators that perform block-coalesce. The cascading bitmap allows a thread to rapidly find an unused or fresh descriptor.

A thread allocating with the block-fit operation issues one locked instruction. A thread allocating with the block-split operation normally issues two locked instructions but may require up to four. A thread allocating via block-append issues between four and six locked instructions; in addition it makes a NT kernel call to commit the memory underlying the allocation. Allocators use additionally between three and five locked instructions for each block-coalesce operation they perform. The freeing routine does not use any locked instructions. These numbers hold as long as the thread does not become a victim of contention.

The size of each allocation request is enlarged to give room for a 32-bit pointer at the start of the allocated block; this is set by the allocator to point to the corresponding block descriptor. The back pointer allows the free routine to navigate to the descriptor directly, from the address passed as parameter.

Allocators that perform the block-append operation call the NT kernel to commit fresh memory within the big-block address range. The memory used for the descriptor pool is also committed as needed, a page at a time; a small bitmap records which pages are committed. Once committed, neither of these memory types is ever decommitted.

The tiering of the cascading bitmap imposes a ceiling of two to the power fifteen descriptors, which in turn limits the number of blocks in the big-block area. Supposing an average block size of 8 KB, this amounts to some 256 MB of allocations. Few applications are likely to be troubled by this limit.

The big-block area follows on from the 4 KB compartment and by default is 100 MB in size, giving a grand total of 945 MB address range requirement for the LeapHeap.