Foundation · Optimization · Tips & Tricks · Uncategorized

Allocation-Free Collections

Say you have a method that creates a temporary list to gather some data. You calculate some statistics based on that data and then you destroy the list again. You call this method a lot, resulting in lots of memory allocations and deallocations and increased memory fragmentation. In addition, all this memory management takes time and may affect performance when used in time-critical sections of code.

For those situations, you may want to keep all your data on the stack instead and avoid memory allocations altogether. We show you a couple of ways you can achieve this.

Even if these use cases don’t apply to you, you may find this article useful since it uses some interesting concepts and Delphi language features.

Stack vs Heap, Value vs Reference Types

But first, lets get some terminology out of the way. You probably already know everything in this section, but let’s recap anyway.

There are two main types of internal memory: the stack and the heap. The stack is used to store local variables of methods and possibly other data (such as return addresses on many platforms). The heap stores dynamically allocated blocks of memory, including strings, dynamic arrays and objects.

When you learned about these types of memory, you probably read about value and reference types in the same paragraph or chapter, because the two are somewhat related. A value type stores some value directly at the memory location, while a reference type stores a pointer to a value that is located somewhere else (usually, but not necessarily on the heap). Examples of value types are integers, floating-point numbers, Booleans, enums, characters, records and static arrays. Examples of reference types are strings, objects, object interfaces, dynamic arrays and pointers.

There is some debate about the difference between classes and objects, and they are sometimes used interchangeably. I personally believe that the two are different: a class describes the contract (fields, methods, properties and events) and an object is a specific instance of a class. Feel free to disagree!

Only value types can be stored on the stack. When a reference type is declared on the stack, only its pointer value is stored on the stack; the actual value is allocated on the heap (if not nil). For example:

procedure StackVsHeap;
var
  A: Integer;
  B: TList<Integer>;
begin
  A := 42;
  B := TList<Integer>.Create;
  B.Add(42);
  B.Add(100);
end;

This results in the following memory layout (with arbitrary memory locations for reference):

stackvsheap

As you can see, creating a TList and adding some items results in at least two heap memory allocations: one to store the instance data of the list object (FItems, FCount and some other fields) and one to store a dynamic array of items. In this example, the dynamic array has room for 4 items, of which 2 are used. The dynamic array will grow as needed to accommodate new items.

Stack-Based Collections

You may run into situations where these heap memory allocations are not ideal. They can increase memory fragmentation which can result in increased memory usage. Also, allocating and freeing dynamic memory is not a cheap operation, and creating temporary lists many times per second may impact performance. On the other hand, “allocating” memory on the stack is usually a zero-cost operation. It just involves adjusting the stack pointer at the start of a method, which the compiler does most of the time anyway.

These issues could be addressed if you could create a collection entirely on the stack (that is, both the collection properties like FCount and the actual items should live on the stack). Actually, you probably already have used these kinds of collections before:

procedure StaticArray;
var
  Items: array [0..9] of Integer;
  Count: Integer;
begin
  ...
end;

A local static array is essentially a stack-based collection, albeit not a user-friendly one. To make this array more list-like, we could wrap it inside a record with methods.

A Simple Stack-Based List

A simple implementation of that could look like this:

type
  TStackList<T: record> = record
  private type
    P = ^T;
  private
    FData: array [0..255] of Byte;
    FCapacity: Integer;
    FCount: Integer;
    function GetItem(const AIndex: Integer): T;
  public
    procedure Initialize;
    procedure Clear;
    procedure Add(const AItem: T);

    property Count: Integer read FCount;
    property Items[const AIndex: Integer]: T read GetItem; default;
  end;

You can find a more documented version of this code (and all other code in this article) in our JustAddCode repository on GitHub in the AllocationFreeCollections directory.

You may find the name TStackList a bit confusing. Here, the word “stack” does not refer to a stack-like data structure, but to the fact that the list should live on the memory stack. If you were to create a stack-like collection that lives on the memory stack, then it could be called TStackStack.

A couple of things to note here:

  • I chose to create a generic list here. Nowadays there is little need for non-generic lists.
  • The type parameter T has a record constraint. This means that the list can only hold value types (for now). This simplifies the implementation of this list somewhat.
  • The nested type P = ^T; declaration may be new to you. It declares a typed pointer to the type of the items in the list. This is useful in the implementation to access list items.
  • The list holds its items in a static array of 256 bytes, meaning it always consumes 256 bytes of (stack) memory, regardless of type T. So if used to create a list of integers, then the list can hold at most 64 items (since an integer is 4 bytes in size).
  • You must call the Initialize method to initialize or create the list. This method acts like a constructor. Since records cannot have constructors without parameters (at least not in the current Delphi version), I chose to add an Initialize method. You could also opt for a static method that returns a list instead (as in class function Create: TStackList<T>; static;), but returning a large record from a function is not very efficient.
  • Although you can declare a TStackList as a field inside an object, this is not really the purpose of this list and just wastes memory. This type is designed to be declared as a local variable in a method only.

The implementation is pretty simple. The Initialize method just calculates the number of items the collection can hold and sets the FCount field to 0 (since fields in records are not initialized with 0 when declared on the stack).

procedure TStackList<T>.Initialize;
begin
  if IsManagedType(T) then
    raise EInvalidOperation.Create(
      'A stack based collection cannot contain managed types');

  FCapacity := SizeOf(FData) div SizeOf(T);
  FCount := 0;
end;

It also checks if the type parameter T is a managed type and raises an exception if so. This may deserve some explanation. Even though the type parameter T has a record constraint, this doesn’t mean that T cannot contain a managed type. For example, the record constraint prevents Delphi from compiling this:

var
  List: TStackList<String>;

But not from compiling this:

type
  TStringWrapper = record
    Value: String;
  end;
var
  List: TStackList<TStringWrapper>;

When the type parameter T is a reference type or a managed type, we need some additional code to prevent memory leaks. We’ll get to this later and keep things simple for now by not allowing this.

Adding an item works like this:

procedure TStackList<T>.Add(const AItem: T);
var
  Target: P;
begin
  if (FCount >= FCapacity) then
    raise EInvalidOperation.Create('List is full');

  Target := @FData[FCount * SizeOf(T)];
  Target^ := AItem;
  Inc(FCount);
end;

Since the capacity of this list is fixed, we need to raise an exception when the list is full. We’ll look at an alternative later.

Because the FData field is just an array of bytes, we need a trick to put items of type T into this array. This is where the type P = ^T; declaration comes in handy. We calculate the offset into the FData array and assign its address to the Target variable of type P. Then, we can just dereference this variable to assign the value. Retrieving an item works in a similar way:

function TStackList<T>.GetItem(const AIndex: Integer): T;
var
  Item: P;
begin
  if (AIndex < 0) or (AIndex >= FCount) then
    raise EArgumentOutOfRangeException.Create('List index out of range');

  Item := @FData[AIndex * SizeOf(T)];
  Result := Item^;
end;

You can use this stack-based list as follows (see the E01SimpleStackList example in the repo):

procedure SimpleStackListExample;
var
  List: TStackList<Integer>;
  Error: Boolean;
  I: Integer;
begin
  List.Initialize;

  { TStackList<Integer> can contain up to 256 bytes of data. An Integer
    is 4 bytes in size, meaning the list can contain up to 64 Integers. }
  for I := 0 to 63 do
    List.Add(I);

  { Adding 64'th item should raise an exception. }
  Error := False;
  try
    List.Add(0);
  except
    Error := True;
  end;
  Assert(Error);

  { Check contents }
  Assert(List.Count = 64);
  for I := 0 to List.Count - 1 do
    Assert(List[I] = I);
end;

A Stack List with a Configurable Size

You may find that 256 bytes of storage is too little or too much. It would be nice if this size was configurable. If Delphi was more like C++, we could use a template parameter like this:

type
  TStackList<T: record; N: Integer> = record
  private
    FData: array [0..N - 1] of Byte;
  end;

var
  List: TStackList<Double, 1024>;

But Delphi is not C++ and generics are not templates. So how can we accomplish something similar? Instead of a template parameter, we could use another type parameter whose sole purpose is to provide storage for the list items:

type
  TStackList<T: record; TSize: record> = record
  private
    FData: TSize;
    ...

We could then declare a stack list with a storage of 1024 bytes like this:

type
  T1024Bytes = record Data: array [0..1023] of Byte end;

var
  List: TStackList<Double, T1024Bytes>;

And its usage would be the same as the fixed size list presented earlier. You can find this version in the E02StackListWithSize example in the repository.

Note that the T1024Bytes type declares a static array wrapped inside a record. The record wrapper is needed because static arrays by themselves cannot be used as a TSize type because of the record constraint.

I’m old school and like my powers-of-two. There is no real reason you should use a T1024Bytes instead of a T1000Bytes.

The implementation of this record is also very similar to the previous one. But since FData is not a static array anymore, we need different code to calculate the address of an item:

procedure TStackList<T, TSize>.Add(const AItem: T);
var
  Target: P;
begin
  if (FCount >= FCapacity) then
    raise EInvalidOperation.Create('List is full');

  Target := @FData;
  Inc(Target, FCount);
  Target^ := AItem;

  Inc(FCount);
end;

Since P is a typed pointer, we can use simple pointer math. First we set Target to the address of the first item in the array. Then we just increment it with the current FCount. For reference, these are two alternative ways to accomplish the same calculation:

  {$POINTERMATH ON}
  Target := P(@FData) + FCount;

This adds the count directly to the address without a separate Inc statement. But this requires typecasting to P first and enabling the POINTERMATH directive.

Another way is to do the entire calculation manually and explicitly:

Item := P(IntPtr(@FData) + (FCount * SizeOf(T)));

Now, we first need to convert the address to an integral type, preferably of type IntPtr for compatibility on all platforms. After that, we need to increment the value with the count multiplied by the size of the list item type. Finally, we need to typecast the whole thing to type P. This is a much more elaborate way to achieve the same thing, but it is the way that the compiler would convert the first two versions behind the scenes. So it is good to know what is happening.

A List with Managed Types

But what if you want a list of strings or objects? The record constraint on the T type parameter currently does not allow this. We can remove this constraint, but then we need to make sure that we manage the managed types in the list. The E03StackListWithManagedTypes example shows a way to do this.

First, the FData field usually contains random data when the list is declared on the stack. If we interpret this data as items of a managed type, then this will most likely result in access violations because Delphi will try to update the reference counts of strings (or other managed types) with a random address.

So we need to clear FData in the Initialize method:

procedure TStackList<T, TSize>.Initialize;
begin
  ...
  if IsManagedType(T) then
    FillChar(FData, SizeOf(FData), 0);
end;

Note that this is only necessary when T is a managed type, hence the IsManagedType check.

IsManagedType is a “compiler-magic” function, which means that the if-condition is evaluated at compile-time instead of run-time. Thus when a TStackList<Integer> is compiled, the check and FillChar statement is completely removed from the code. On the other hand, when a TStackList<String> is compiled, the if-check is also removed from the code, but the FillChar statement remains. This compiler-magic function (and other compiler-magic functions like HasWeakRef and GetTypeKind) can be helpful in creating efficient code that avoids run-time checks.

Also, we need a Finalize method now that must be called (preferably in a finally section) to release the managed items in the list. This method is the equivalent of a destructor:

procedure TStackList<T, TSize>.Finalize;
begin
  Clear;
end;

procedure TStackList<T, TSize>.Clear;
begin
  if IsManagedType(T) then
  begin
    FinalizeArray(@FData, TypeInfo(T), FCount);
    FillChar(FData, FCount * SizeOf(T), 0);
  end;
  FCount := 0;
end;

It uses the FinalizeArray routine from the System unit to decrease the reference counts of the items in the array (depending on their type). Without this, the items in the array would never get released, which results in a memory leak.

The other parts of the code remain unchanged. You may wonder if this part of the code still works correctly:

var
  Target: P;
begin
  Target := @FData;
  Inc(Target, FCount);
  Target^ := AItem;
end;

Because P may be a pointer to a managed type, you may wonder if this code will circumvent any reference counting now. This is not the case. When Delphi compiles this code for a managed type, it will make sure that the assignment to Target^ will decrease the reference count of the original item (if any) and increment the reference count of the newly assigned item.

A Growable Stack List

The examples discussed above work fine if you know in advance the maximum number of items that the list will ever hold. Adding any more will result in an exception. But what if you want to use the advantage of the stack, but still be able to grow the collection if needed?

So for a final example, we look at how you could create a list that uses the stack up to a certain amount, but is able to grow into the heap if needed. This may provide the best of both worlds: if the number of items stays low, you avoid using dynamic memory altogether; but if the need arises, you can still grow the collection without any problems.

You can find this version in the E04GrowableStackList example in the repo. This version doesn’t allow managed types for its items, so that we can focus on just the “growable” part.

type
  TStackList<T: record; TSize: record> = record
  private
    FStackData: TSize;
    FStackCapacity: Integer;
    FHeapData: Pointer;
    FHeapCapacity: Integer;
    FCount: Integer;
    ...

Now, when we add an item, we check if it still fits in the stack, and if not, we will grow the collection on the heap:

procedure TStackList<T, TSize>.Add(const AItem: T);
var
  Target: P;
  Index: Integer;
begin
  if (FCount < FStackCapacity) then
    { We can still add this item to the memory stack. }
    Target := @FStackData;
    Inc(Target, FCount);
  else
  begin
    { We need to add this item to heap memory.
      First calculate the index into heap memory. }
    Index := FCount - FStackCapacity;

    { Grow heap memory if needed to accommodate Index }
    if (Index >= FHeapCapacity) then
      Grow;

    Target := FHeapData;
    Inc(Target, Index);
  end;

  Target^ := AItem;
  Inc(FCount);
end;

The FStackData field will hold the first FStackCapacity items. Any additional items are allocated on the heap. The FHeapData field is a pointer to an array that can hold at most FHeapCapacity items. When this capacity is reached, it will grow this memory block:

{$IF (RTLVersion < 33)}
procedure TStackList<T, TSize>.Grow;
begin
  { Pre-Rio growth strategy: double collection size }
  if (FHeapCapacity = 0) then
    FHeapCapacity := 4
  else
    FHeapCapacity := FHeapCapacity * 2;
  ReallocMem(FHeapData, FHeapCapacity * SizeOf(T));
end;
{$ELSE}
procedure TStackList<T, TSize>.Grow;
begin
  { Delphi Rio introduced a user-configurable growth strategy }
  FHeapCapacity := GrowCollection(FHeapCapacity, FHeapCapacity + 1);
  ReallocMem(FHeapData, FHeapCapacity * SizeOf(T));
end;
{$ENDIF}

Here we take advantage of the user-configurable growth strategy introduced with Delphi Rio. When an older Delphi version is used, we default to doubling the collection size.

When retrieving an item, we must also check now whether the item is on the stack or on the heap:

function TStackList<T, TSize>.GetItem(const AIndex: Integer): T;
var
  Item: P;
begin
  if (AIndex < 0) or (AIndex >= FCount) then
    raise EArgumentOutOfRangeException.Create('List index out of range');

  if (AIndex < FStackCapacity) then
  begin
    Item := @FStackData;
    Inc(Item, AIndex);
  end
  else
  begin
    Item := FHeapData;
    Inc(Item, AIndex - FStackCapacity);
  end;

  Result := Item^;
end;

Finally, we should make sure that we free any allocated heap data in the Finalize method.

When Not to Use Stack Collections

Stack-based collections certainly have their uses, but the benefits depend on the situation. If you only need a temporary list once in a while, then there is usually no point in using a stack list and you should take advantage of the flexibility of a “regular” list.

Also, a stack list consumes memory on the stack. This is usually not a problem since all platforms provide a pretty big stack nowadays. But if you use a stack list in a recursive routine, then you may run into a stack overflow at some point. So my advise is to keep the size of your stack lists to within a couple of KB. If you need more, then memory fragmentation and speed are probably not your bottleneck anyway and a regular list should suffice.

So as always, use if appropriate, not just because you can.

Let me know if you encountered any interesting use cases for stack-based collections.

Leave a Reply

Fill in your details below or click an icon to log in:

WordPress.com Logo

You are commenting using your WordPress.com account. Log Out /  Change )

Google photo

You are commenting using your Google account. Log Out /  Change )

Twitter picture

You are commenting using your Twitter account. Log Out /  Change )

Facebook photo

You are commenting using your Facebook account. Log Out /  Change )

Connecting to %s