
5.2 Memory Efficiency

The amount of memory tied up in a heap is almost always greater than the net total (aggregate allocations less aggregate frees) requested by the application. The difference is due to bookkeeping overhead and fragmentation. Some applications are heavy heap users without requiring large amounts of memory. Again, if the total active memory requirement of all running processes on a machine does not approach the amount of RAM fitted to the machine, the wasted memory does not impinge on performance. However, in the general case it must be assumed that memory efficiency matters.

The NT native heap stores memory allocations of all sizes together in memory, interposing eight bytes of bookkeeping data between each pair of allocations. This field is conveniently called a header, though it contains information about the preceding as well as the succeeding allocation. Supposing an average (arithmetic mean) allocation size of 40 bytes, the bookkeeping overhead amounts to 20%. The LeapHeap compartment area overhead, described in section 2.1 Compartment Area Data Structure, of between one and two bits per allocation, is a fraction of 1%. Both heap managers have other, fixed overheads, which become less significant in percentage terms as the heap grows larger. The first and last memory pages of the occupied portion of each LeapHeap compartment are likely to be only partially filled; this wastage increases 32-fold under the preferred-bit scheme introduced in LeapHeap version 1.3, and could amount to 30 MB. The native heap has no overhead of this magnitude.

Another form of wasted memory is fragmentation, the occurrence of unused areas of heap memory both between (external fragmentation) and within (internal fragmentation) allocations. External fragmentation arises when an allocation has been freed and the released memory block has not yet been incorporated into fresh allocations. A certain amount is inevitable, but some combinations of allocation pattern and algorithm lead to pathological fragmentation, with a large number of small free blocks that are not in practice reusable. Fragmentation within allocations happens when the algorithm chooses to recycle, without splitting, a free block that is larger than the allocation size requested, often a better policy than spending time looking for a perfect fit.

The degree of memory fragmentation an application may suffer is difficult to predict. The fragmentation overhead of the NT heap manager might lie between 10% and 20%, a figure derived from the literature on similar algorithms rather than measurement. LeapHeap fragmentation is immediately quantifiable in one sense: within each compartment it is at most the historic maximum number of allocations (the highwater mark) less the current number of allocations.

The LeapHeap bitmap algorithm cannot split a free cell to satisfy a request for a smaller allocation. Such inflexibility has a reputation for being punished by strongly-phased programs. For example, if a server application demands one hundred thousand 72-byte memory blocks when it starts up, but subsequently frees them all and enters a running phase in which it never uses that size again, that is 7 MB of permanently wasted memory. This does apply to LeapHeap, but the vital distinction between memory types comes to the rescue: the wasted memory is cheap pagefile memory which, never being accessed, is never brought into RAM. How LeapHeap reacts to a phased program depends on the precise allocation pattern of that program; in harness with a program that behaves like the model application, fragmentation will be predictably and reliably low. In the absence of overflow, LeapHeap has no internal fragmentation in the compartment area (save for alignment), but that is to some extent simply a trade-off against external fragmentation.

Alignment losses are a type of internal fragmentation, but worth treating separately. The Pentium processor suffers serious performance loss if data are not aligned to its preferences. The minimum credible alignment a general-purpose allocator could offer is 32-bit; otherwise numerous Intel dwords (used for Visual Basic 'long' and generally for C 'int') would be misaligned. The native heap provides 64-bit alignment, which means that Intel quadwords (C and Visual Basic 'double') are aligned. However, the Visual C++ compiler does not align quadword variables if they are on the stack, which indicates that 32-bit alignment would be tolerable. In any case, LeapHeap follows the same policy as the native heap, rounding up the requested size of an allocation to the next 8-byte boundary. The memory loss due to alignment, assuming a smooth size distribution with 40-byte mean, approaches 10% for both heaps.

The algorithm used in the LeapHeap big-block area does not add appreciably to bookkeeping overhead, since the overhead per block is twelve bytes while the blocks themselves exceed 4 KB. On the other hand, fragmentation within the big-block area will be significant, of the same order as that of any list-based scheme.