
6.3 Tuning

The registry interface may be used to tune LeapHeap per application to extract the best performance.

The most important thing to say in this respect is that LeapHeap has a range of sizing strategies built in, a feature obscured in earlier sections of the guide, which assume the default LeapHeap in the interests of simplicity.

The default LeapHeap uses a compartment at every 8-byte interval up to 4 KB. Towards the other extreme lies a power-of-two distribution, with compartments at 8, 16, 32, 64, 128 bytes and so on. The power-of-two configuration is obviously more profligate of memory, but greatly speeds up applications that are heavy users of the HeapReAlloc function. Scripting languages are pre-eminent in this, partly because of the method by which the script is interpreted, and also because they offer the programmer variable-length strings. Concatenating two strings involves calculating the size of the combined string and moving the string variable that is to receive it to an area of memory at least that size.

Scripting in languages such as Visual Basic for Applications (VBA) and Perl is widespread. It is well understood that these languages are less efficient than Java or C++, but because of the rapidity with which applications can be constructed and extended using them, they constitute the greatest processing load on many a server.

There also exists a host of proprietary scripting languages that each work with the products of a single vendor. They are often used in batch mode to migrate databases and repositories, tasks which can take many hours. Examples are the VBA-derived RoseScript from Rational Software Corporation, and the C-like DOORS Extension Language from Telelogic AB.

The following JavaScript code fragment constructs a string consisting of all the ASCII printable characters:
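    var s = String.fromCharCode();       // no arguments: yields the empty string
    for (var i = 32; i < 127; i++)       // character codes 32 to 126 inclusive
        s = s + String.fromCharCode(i);  // append one character; the string grows each time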
The browser might make 95 calls to HeapReAlloc when interpreting this code, one per loop iteration. Assuming 16-bit characters, the string will end up 190 bytes long. For both the default LeapHeap and the NT native heap, a realloc involves an allocate(new_size), copy data, free(old_cell) sequence every time an 8-byte boundary is crossed, which happens 23 times. For the power-of-two LeapHeap, only 5 such operations are necessary, at 8, 16, 32, 64 and 128 bytes. In the remaining calls, LeapReAlloc is quick to realise that the cell does not have to be moved, and the routine completes in 35 Pentium instructions.
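The counts of 23 and 5 can be checked with a small model. The sketch below is not LeapHeap code: the function names are hypothetical, and it assumes that the empty string starts in the smallest (8-byte) cell and that a move happens only when the new size exceeds the capacity of the current cell.

```javascript
// Hypothetical model of cell capacity under two compartment layouts.
// Assumption: the smallest cell is 8 bytes in both layouts.

// Default layout: a compartment at every 8-byte interval.
function capacityEvery8(bytes) {
  return Math.max(8, Math.ceil(bytes / 8) * 8);
}

// Power-of-two layout: compartments at 8, 16, 32, 64, 128 bytes and so on.
function capacityPowerOfTwo(bytes) {
  let capacity = 8;
  while (capacity < bytes) capacity *= 2;
  return capacity;
}

// Count how many reallocs force an allocate/copy/free sequence while a
// string grows one character at a time.
function countMoves(capacityFor, charCount, bytesPerChar) {
  let capacity = capacityFor(0); // the empty string sits in the smallest cell
  let moves = 0;
  for (let n = 1; n <= charCount; n++) {
    const size = n * bytesPerChar;
    if (size > capacity) {        // cell too small: the data must move
      capacity = capacityFor(size);
      moves++;
    }
  }
  return moves;
}

const movesEvery8 = countMoves(capacityEvery8, 95, 2);
const movesPowerOfTwo = countMoves(capacityPowerOfTwo, 95, 2);
console.log(movesEvery8, movesPowerOfTwo); // prints "23 5"
```

The remaining 72 and 90 calls respectively are the cheap case, in which the cell stays where it is.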

There is nothing special about the power-of-two distribution. LeapHeap compartments can be arranged any way the administrator pleases. The only restrictions are that there must be at least one compartment and that the big-block area must be at least 1 MB.

Here is how the CompartmentSizes rules look for power-of-two sizing, though the tiering can be altered to suit:
Rule01 sizeclass 1 tiers 4
Rule02 sizeclass 2 tiers 4
Rule03 sizeclass 4 tiers 4
Rule04 sizeclass 8 tiers 4
Rule05 sizeclass 16 tiers 4
Rule06 sizeclass 32 tiers 4
Rule07 sizeclass 64 tiers 3
Rule08 sizeclass 128 tiers 3
Rule09 sizeclass 256 tiers 3
Rule10 sizeclass 512 tiers 3
Rule11 bigblockarea 100
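The rule list above is regular enough to be generated rather than typed. The following is a minimal sketch with hypothetical helper names. It assumes that sizeclass n denotes cells of n × 8 bytes, which is inferred from the 8-byte interval and the 4 KB upper limit described earlier, and that the bigblockarea value is in megabytes.

```javascript
// Assumption: sizeclass n corresponds to cells of n * 8 bytes.
const BYTES_PER_SIZECLASS = 8;

function cellBytes(sizeclass) {
  return sizeclass * BYTES_PER_SIZECLASS;
}

// Build rule lines for a power-of-two layout.
// tiersFor is a caller-supplied function choosing the tiering per sizeclass.
function powerOfTwoRules(maxSizeclass, tiersFor, bigBlockAreaMB) {
  const lines = [];
  const nextLabel = () => "Rule" + String(lines.length + 1).padStart(2, "0");
  for (let sc = 1; sc <= maxSizeclass; sc *= 2) {
    lines.push(`${nextLabel()} sizeclass ${sc} tiers ${tiersFor(sc)}`);
  }
  lines.push(`${nextLabel()} bigblockarea ${bigBlockAreaMB}`);
  return lines;
}

// Reproduce the listing above: 4 tiers up to sizeclass 32, 3 tiers beyond.
const ruleLines = powerOfTwoRules(512, (sc) => (sc <= 32 ? 4 : 3), 100);
for (const line of ruleLines) console.log(line);
console.log("largest cell:", cellBytes(512), "bytes");
```

Changing the tiersFor argument is a convenient way to experiment with different tierings before entering the rules in the registry.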

The overlay principle by which sizing rules are interpreted means that fewer rules are needed. For example:
Rule01 sizeclass 1 to 256 tiers 2
Rule02 sizeclass 8 to 9 tiers 4
Rule03 sizeclass 16 tiers 4
As a result, all of the first 256 sizeclasses have 2 tiers except sizeclasses 8, 9 and 16, which have been overridden. Note also that this example expects no allocations above 2 KB. If any occur, they will go into the big-block area, which will have the default size of 10 MB (as distinct from the 100 MB specified in the default LeapHeap rules).
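The overlay behaviour can be illustrated with a short model. This is a sketch only: the parser is hypothetical, it handles just the two sizeclass rule forms shown in this section, and it assumes rules are applied in the order listed, with a later rule overriding an earlier one for any sizeclass they share. LeapHeap's own interpretation may differ in detail.

```javascript
// Hypothetical parser for the two rule forms shown in this section:
//   RuleNN sizeclass N tiers T
//   RuleNN sizeclass A to B tiers T
// Returns a Map from sizeclass number to tier count.
function applySizingRules(ruleLines) {
  const pattern = /^\S+\s+sizeclass\s+(\d+)(?:\s+to\s+(\d+))?\s+tiers\s+(\d+)$/;
  const tiers = new Map();
  for (const line of ruleLines) {
    const match = pattern.exec(line.trim());
    if (!match) continue; // other rule kinds (e.g. bigblockarea) are ignored here
    const first = Number(match[1]);
    const last = match[2] === undefined ? first : Number(match[2]);
    const tierCount = Number(match[3]);
    // Later rules overwrite earlier entries: this is the overlay step.
    for (let sc = first; sc <= last; sc++) tiers.set(sc, tierCount);
  }
  return tiers;
}

const overlaid = applySizingRules([
  "Rule01 sizeclass 1 to 256 tiers 2",
  "Rule02 sizeclass 8 to 9 tiers 4",
  "Rule03 sizeclass 16 tiers 4",
]);
console.log(overlaid.get(7), overlaid.get(8), overlaid.get(9), overlaid.get(16));
```

Three rules are enough to describe 256 compartments, of which only sizeclasses 8, 9 and 16 depart from the base tiering.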

Tuning LeapHeap is vital for applications that have problems meeting the 2 GB address range constraint. For LeapHeap to be usable here, it is a prerequisite that compartment usage does not retreat too far from the high-water marks, otherwise address range is being wasted. The analysis of section 5.2 Memory Efficiency showed that switching to LeapHeap from the NT native heap might be expected to save 20% of memory. In that case the LeapHeap must be run 80-85% full to store the same amount of information as the native heap.

First, if the application fails to initialise with the default LeapHeap, the compartment tiering must be reduced until the application does run. Then the compartment sizes are adjusted to match the size distribution of the allocations the program makes. Since the only adjustment possible for compartment size is up or down by a factor of 32, compartment overflow can be reduced but not eliminated. Merging adjacent compartments is a valuable technique for the larger sizes, where fractional memory loss will be low. For example, if sizeclasses 100 to 109 run 10% full, replace the separate compartments with a single compartment covering those sizeclasses. Make sure that the largest compartment of all has some free space, to prevent overflow into the big-block area.
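A rough calculation shows why merging suits the larger sizes. The sketch below rests on assumptions not stated in this guide: that sizeclass n denotes cells of n × 8 bytes, that a merged compartment stores every allocation in a cell of its largest sizeclass, and that per-cell overhead can be ignored. The function name is hypothetical.

```javascript
// Assumption: sizeclass n corresponds to cells of n * 8 bytes.
const BYTES_PER_SIZECLASS = 8;

// Worst-case fraction of a cell wasted when sizeclasses first..last share
// one compartment whose cells are all of the largest size (an assumption).
function worstCaseLoss(firstSizeclass, lastSizeclass) {
  const cellBytes = lastSizeclass * BYTES_PER_SIZECLASS;
  // Smallest request that still falls in the first sizeclass of the range.
  const smallestRequest = (firstSizeclass - 1) * BYTES_PER_SIZECLASS + 1;
  return (cellBytes - smallestRequest) / cellBytes;
}

const lossLarge = worstCaseLoss(100, 109); // 793-byte request in an 872-byte cell
const lossSmall = worstCaseLoss(10, 19);   // 73-byte request in a 152-byte cell
console.log(lossLarge.toFixed(3), lossSmall.toFixed(3));
```

Under these assumptions, merging sizeclasses 100 to 109 wastes at most about 9% of a cell, whereas merging the same number of sizeclasses at 10 to 19 could waste about 52%.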

As for the remaining registry settings, use of the HeapDestroy values is straightforward. If monitoring the application has shown that it requires heap identifier tagging, ExtendedDestroy should be turned on. Tag size should remain at eight bits unless monitoring has revealed large numbers of heaps concurrently in use (from the viewpoint of the application). FollowDestroyWithCompact releases pagefile memory back to the system after a call of HeapDestroy, but the amount released is unlikely to approach the sum of the allocations that the heap contained, and pagefile memory is cheap anyway. The top-level Validation setting may be turned on for single-threaded programs, but there is no point in doing so without access to the program source code and an understanding of why it makes HeapValidate calls.