Optimization · Tips & Tricks · Uncategorized

Experiments in Uniform Memory Management

We experiment with algorithms to make memory management in Delphi more consistent between desktop and mobile platforms. This eases debugging and finding memory leaks. We look into using object interfaces and semi-automatic memory management based on ownership. Along the way, we touch on using linked lists and Delphi’s messaging framework.

Warning: this post is more opinion than fact. Memory management is a controversial topic for many people and there are a lot of different opinions about how Delphi’s memory manager should work. Please feel free to disagree with me…

This post is accompanied by a sample application in our JustAddCode repository on GitHub. You will find it in the UniformMemoryManagement directory.

A Tale of Two Memory Management Models

With the advent of mobile compilers, Delphi introduced a new memory management model based on Automatic Reference Counting (ARC). This model promised an easier way to manage memory by not having keeping track of allocations and by (mostly) eliminating the danger of “dangling pointers”. No more need for create - try - use - finally - free blocks everywhere. This is being reduced to create - use and Delphi will take care of the rest.

Unfortunately, this promise was never fulfilled, for the following reasons:

  • When developing cross-platform apps that also need to run on Windows and/or macOS, you still need to use the “old” create - try - use - finally - free pattern of freeing your objects when you are done with them. So there is no gain in reduced and simplified code.
  • Even if you are developing for mobile platforms (or Linux) only, you will probably still to do the majority of testing and debugging on Windows, since it is so much faster and easier. So again, you need to use the “old” pattern.
  • Confusingly, Free behaves in opposite ways on ARC and non-ARC platforms. On ARC platforms, it just sets the reference to nil (which may or may not actually free the object), while on non-ARC platforms it actually frees the object (but does not set the reference to nil). If you need some form of deterministic finalization, then you need to use DisposeOf instead of Free.
  • Automatic Reference Counting still requires manual tweaking to avoid memory leaks. It is very easy to (inadvertently) create a reference cycle (aka circular reference) between two or more objects. These cycles will prevent these objects from being freed, resulting in a memory leak. To avoid this, you need to make some references [weak] (or [unsafe]). It is not always obvious if there is a reference cycle in you code, especially if the chain of references is long. Furthermore, it is very difficult to detect a reference cycle at run-time. You cannot use the ReportMemoryLeaksOnShutdown feature, since there is no real concept of “shutdown” with mobile apps. There is a CheckForCycles API, but this only works on a single instance and doesn’t report all live reference cycles in the app.

All in all, this doesn’t make memory management easier IMHO. In fact, it makes it harder.

The first item in the list above may be addressed by a future Delphi version. The current Delphi roadmap mentions the possibility of ARC being implemented on all platforms (under “Research Areas”). That would be great since it means we would finally have a consistent model. It would be even better if we could also optionally disable ARC on all platforms…

In addition to this, ARC adds quite a bit of CPU overhead. Every time you assign an object, Delphi automatically increases and/or decreases the reference count of 1 or 2 objects. This results in one or more atomic operations that lock the CPU for a fraction of time. Furthermore, to manage [weak] references, the RTL keeps track of all of these in multiple dictionaries. When an object is destroyed, these references are checked and set to nil.

For larger objects, this overhead is small. But if you use a lot of small objects, then the overhead becomes noticeable and it is not uncommon that 10-15% of application time is spend on just managing object lifetime. And having a lot of small objects is not uncommon. For example, loading an XML document into a DOM can easily result in the creation of many thousands of small objects. Also, with the “inheritance-vs-composition” scale tipping more and more in favor of composition, you end up with more (smaller) objects. Even creating a simple FireMonkey TLabel control on iOS results in the creation of 23 other objects (in Delphi 10.2). Adding it to a form using the default style creates another 108 objects (some of which only temporary). And in this process, the atomic TObject.__ObjAddRef method gets called over 1400 times, and __ObjRelease over 1300 times.

I know this is the price you pay for ARC, but unfortunately, you pay it on those platforms (with low-powered CPUs) that can least afford it…

Towards Uniform Memory Management

Ideally, we would like a memory model that works the same on all platforms. That way, you only have to target a single model and you only have to test your code on a single platform (as far debugging memory-related issues is concerned). If you have experience with debugging memory growth and reference cycles on mobile platforms, then you would probably appreciate that.

So you basically have 3 approaches:

  1. Disable ARC on all platforms.
  2. Enable ARC on all platforms.
  3. Find some middle ground.

The first option is very difficult or near impossible in practice. You can disable ARC on a per-class basis by overriding the __ObjAddRef and __ObjRelease methods on ARC platforms. If you just return -1 in those methods, then ARC for that class will effectively be disabled. You can probably even do this globally by hooking the VMTs of all classes you want to disable ARC for (as explained in my previous post on Cross-Platform Code Hooking). This method has big drawbacks though:

  • You cannot use Free to free an object, since Free just sets the reference to nil on ARC platforms. When ARC is disabled, this will not result in the release of this reference, and thus the instance will never be freed.
  • Also, you cannot use DisposeOf, since this only calls the destructor, but does not free up the memory for the object.
  • So you need to come up with a different way to destroy your objects. Adding yet another approach adds to the confusion and doesn’t make things easier.
  • But more importantly, you have to make sure that all variables containing instances of destroyed objects are set to nil. With ARC disabled, freeing an object leaves a dangling pointer. If you don’t set all references to this destroyed object to nil (including references in arrays or collections), then Delphi will try to call __ObjRelease on the dangling pointer at some point, most likely resulting in an Access Violation. That is a big burden and definitely doesn’t make things easier.

So the remainder of this post discusses the other two approaches.

Enable ARC on all platforms

Well, there is no way to enable ARC on all platforms. At least, not until support for it is added to the Delphi compilers (which, as I mentioned, is a research area on the roadmap).

But you can easily enable ARC on a per-class basis: just create an object interface with the public API of the class and implement this interface. Nick Hodges would say that you should do this anyway since it decouples your API from the actual implementation. And as an added benefit, you get (mostly) automatic memory management (as long as your class derives from TInterfacedObject or you implement the reference counting yourself). If you follow this model, then memory management on ARC and non-ARC platforms will be virtually the same, greatly simplifying the management of object lifetimes. Also, you can debug memory related issues (such as reference cycles) on Windows now, which is much easier than doing that on a mobile device.

Talking about reference cycles: when using object interfaces, you can also create a reference cycle between two interfaces, preventing both underlying objects from being destroyed. This used to be a challenge until Delphi 10.1 Berlin arrived. Since that version, you can also use [weak] and [unsafe] references for object interfaces on Windows and macOS. So you don’t need “hacks” anymore to manually break cycles, or to store interfaces in plain old pointers to avoid cycles.

But object interfaces are a bit heavy on the performance side. Since an interface is basically just a Virtual Method Table, this means that every method call is a virtual method call, using one additional level of indirection (and a VMT table lookup, which isn’t very cache friendly). In addition, if your class has simple properties that just access a field, then those would have to be implemented using getter and/or setter methods in the object interface. This means that accessing the property isn’t a simple matter of accessing a field anymore. Instead, it calls a method now (actually, a virtual method to make things worse), which adds quite a bit of overhead. This may be a bit nitpicky, and not an issue for your run-of-the-mill business logic classes. But it can become an issue in high-performance applications such as animated user interfaces or games (or in general when you don’t want to drain the battery).

Managed Types

For those scenarios, you may want to consider this model: use regular classes on ARC platforms and object interfaces on non-ARC platforms. This way, you won’t have the interface overhead on the devices that can least afford it, but you will have the interface overhead on Windows and macOS. But since those platforms are more powerful, the additional overhead is less of an issue.

Of course, having two different approaches is a maintenance nightmare. But we can try to keep the differences to a minimum with some clever naming. For example, you could use a different single-letter prefix to signify both classes on ARC platforms and interfaces on non-ARC platforms. Since these are both “managed” types, we can use the “M” prefix.

Lets see how this would work with an example that you will find in the GitHub repository for this post. We will call our example “class” MSample. On ARC platforms, MSample is just an alias for the TSample class, while on non-ARC platforms, MSample is actually an interface that is implemented in the TSample class. The declarations could look like this:

type
  {$IFDEF AUTOREFCOUNT}
  TSample = class;
  MSample = TSample;
  {$ELSE}
  MSample = interface
    { Property access methods }
    function GetLink: MSample;
    procedure SetLink(const AValue: MSample);

    { Properties }
    property Link: MSample read GetLink write SetLink;
  end;
  {$ENDIF}

  TSample = class{$IFNDEF AUTOREFCOUNT}(TInterfacedObject, MSample){$ENDIF}
  private
    [weak] FLink: MSample;
  protected
    { MSample implementation }
    function GetLink: MSample;
    procedure SetLink(const AValue: MSample);
  public
    property Link: MSample read FLink write FLink;
  end;

The actual code is still in one place: the TSample class. In this example, the MSample “class” just has a single property called Link. We can use this property to create an artificial reference cycle. Note that this property is implemented as a [weak] reference, so it will not result in a memory leak.

You could use this class like this:

var
  A, B: MSample;
begin
  A := TSample.Create;
  B := TSample.Create;
  A.Link := B;
  B.Link := A;
end;

This creates two instances and a reference cycle between them. Note that the A and B variables are of type MSample. You should always declare your variables of the “M” type and only use the “T” type (TSample) when creating the instance (or when accessing class methods or class properties). This will ensure that both instances will automatically be destroyed when they go out of scope.

Also note that on ARC platforms, accessing A.Link (or B.Link) is very efficient, since it just accesses the FLink field directly. On non-ARC platforms, this will go through the (virtual) GetLink and SetLink methods.

This approach puts a bit of a burden on the class designer. But using the class is very simple and straight-forward; you only have to remember to use “M” types in your declarations (like you would use “I” types when using regular object interfaces).

Still, this may not be for you. In that case, I would still recommend to use object interfaces instead of classes for all the reasons Nick advocates, as well as simplifying memory management.

Semi-automatic Memory Management

As an example of the third approach (“find some middle ground” between ARC and non-ARC), lets look at a way you can implement a form of semi-automatic memory management. Actually, you have been using this approach since Delphi 1, but we add a couple of twists.

This approach uses the concept of “object ownership”, and you use it every time you create an instance derived from TComponent. Every TComponent instance has an owner, and when the owner is destroyed, the instance is automatically destroyed as well (at least on non-ARC platforms). To achieve this, a TComponent maintains a list owned objects. When it is destroyed, it also destroys all objects in this list.

But TComponent is a bit on the heavy side. Let’s try to mimic this behavior in a more light-weight manner, so you can use it as a base classes for most other classes and take advantage of the semi-automatic memory management it provides. Let’s call this base class TNode, since you are basically creating a tree-shaped structure of objects and its owned children.

Blast from the past: Linked Lists

Instead of using a TList-type class to store children, let’s use a linked list. That saves us the construction of another object and allocating a dynamic array to store the items in the list. This makes linked lists very useful for these types of scenarios. Linked lists are pretty old-school though, and you may not have used them yourself. So we will go into a little detail on how to use them.

Linked lists have gotten a bit out of fashion since they don’t provide random access to items in the list. But since we don’t need that for this purpose, that’s OK.

Also, linked lists are criticized for not being cache friendly on modern CPUs. This is certainly the case if each node in the list contains a link to its data object. But in our case, the data object is the node, so this argument does not apply.

At a minimum, a TNode needs to have a link to its parent and a list of children. A linked list starts with a link to the head of the list, and every node has a link to the next sibling in the list. In a tree-like structure the head of the list is the link to its first child node. The following diagram clarifies this:

singly_linked_list

In the diagram, open circles represent links that have not been assigned (that have the value nil). Looking at the diagram, it is obvious that there is a reference cycle between the root node and its first child. This means that either the Parent or the FirstChild value should be a [weak] or [unsafe] reference to avoid memory leaks on ARC platforms. In parent/child relationships it is common (and often easiest) to apply this attribute to the parent value. So, the TNode class can look like this:

type
  TNode = class abstract
  strict private
    [unsafe] FParent: TNode;
    FFirstChild: TNode;
    FNextSibling: TNode;
  public
    constructor Create(const AParent: TNode);

    property Parent: TNode read FParent;
    property FirstChild: TNode read FFirstChild;
    property NextSibling: TNode read FNextSibling;
  end;

Note that I used the [unsafe] attribute here instead of [weak]. Both attributes serve the same purpose, but [unsafe] doesn’t set the reference to nil when the object is destroyed. This makes [unsafe] much more efficient since Delphi doesn’t have to perform the bookkeeping related to tracking and nullifying weak references. But, as the name implies, it can be unsafe since you will end up with a dangling pointer. But then again, every object in the non-ARC world is unsafe, since freeing it may leave dangling pointers everywhere. So as long as you know what you are doing, I think using [unsafe] is perfectly fine. Even inside the Delphi RTL and FMX libraries, more and more weak references are being replaced with unsafe references with every new release.

Adding a node

The constructor of TNode adds itself to the list of children of the parent. When you look at the diagram above, the code should make sense:

constructor TNode.Create(const AParent: TNode);
begin
  inherited Create;
  if Assigned(AParent) then
  begin
    FParent := AParent;
    FNextSibling := AParent.FFirstChild;
    AParent.FFirstChild := Self;
  end;
end;

This code adds itself in front of the list, meaning that the list will be in reverse addition order. But we usually don’t care about the order of the children in the list, so that’s OK.

Deleting a node

In the non-ARC world, when you free an object derived from TComponent, it will remove itself from its owner. In the ARC world though, it will do nothing since the owner keeps strong references to its owned objects. So freeing the object will only decrease its reference count, but not actually free it.

The same goes for our TNode based model. So we need a way to both remove a node from its parent and free it. Let’s create a Delete method that does just this:

procedure TNode.Delete;
var
  Cur: TNode;
begin
  if (FParent <> nil) then
  begin
    Cur := FParent.FFirstChild;
    if (Cur = Self) then
      { We are at the head of the linked list.
        We only need to update a link. }
      FParent.FFirstChild := FNextSibling
    else
    begin
      { We are somewhere in the middle of the linked list.
        Find ourself and update the link. }
      while (Cur <> nil) and (Cur.FNextSibling <> Self) do
        Cur := Cur.FNextSibling;

      if Assigned(Cur) then
        Cur.FNextSibling := FNextSibling;
    end;
  end;
  DisposeOf;
end;

Deleting an item from a linked list is a bit more complicated than adding an item. There are basically two possible situations. If we are at the head of the list of children of the parent (FParent.FirstChild = Self) then we can just update this head to point to the next item in the list, as the following diagram clarifies:

list_delete_head

Otherwise, we have search for the previous sibling and have it link to the sibling to the right of us:

list_delete_middle

Destroying all child nodes

Finally, when a node is destroyed, we should destroy its children as well. But we only have to do this on non-ARC platforms, since this is handled automatically on ARC platforms:

{$IFDEF AUTOREFCOUNT}
destructor TNode.Destroy;
begin
  inherited;
end;
{$ELSE !AUTOREFCOUNT}
destructor TNode.Destroy;
var
  Cur, Next: TNode;
begin
  Cur := FFirstChild;
  while Assigned(Cur) do
  begin
    Next := Cur.FNextSibling;
    Cur.Free;
    Cur := Next;
  end;
  inherited;
end;
{$ENDIF !AUTOREFCOUNT}

This code traverses the list of children and destroys each child (which may in turn destroy its children etc…).

Doubly Linked Lists

You may have noticed that the Delete method may need to perform a linear search to find the previous sibling in the linked list. In Big-O speak: this is a O(n) algorithm, meaning that it can take up to n (the number of items in the list) operations to find the node. This is a drawback of so-called singly linked lists. If performance is a concern, then you can use a doubly linked list instead. In a doubly linked list, each node also has a link to its previous sibling:

doubly_linked_list

This means that you don’t need to search for the previous sibling anymore since it is available in the PrevSibling field. So deleting a node becomes a O(1) algorithm, meaning that the number of operations is independent of the number of items in the list:

list_delete_doubly

Please refer to the sample project to see how this is done in code. To compile the code to use doubly linked lists instead of singly linked lists, set the DOUBLY_LINKED_LIST define or activate one of the *-DoublyLinkedList build configurations.

As you can see in the diagram, there are reference cycles everywhere since two siblings reference each other through the NextSibling and PrevSibling fields. We fixed this in the code by marking the FPrevSibling field as [unsafe].

Free Notifications

If you have ever created your own VCL components, then you know that you sometimes need to get notified when another component is destroyed, so you can set your reference to this component to nil. Basically, you are manually implementing a weak reference here.

You would do this by calling FreeNotification on the target object to let it know you want to be notified when the object is destroyed. Then you would override the Notification method. In that method you would check whether the object that is about to be destroyed is the one you have a reference to. If so, you would set that reference to nil. Finally, you should not forget to call RemoveFreeNotification in your destructor to make sure the target object removes you from its notification list.

To implement free notifications, every TComponent has a list of other components that get notified when the component is about to be destroyed.

In our TNode based model, you may also want to get notified when a node is about to be destroyed. But lets do this in a more light-weight way that doesn’t require lists and 3 methods at the TNode level. Instead, let’s take advantage of Delphi’s System.Messaging framework.

Messages

If you are not yet familiar with Delphi’s messaging framework, then you probably should, since it provides a nice way to decouple code and remove dependencies between classes. Basically, you can broadcast different types of messages and anyone can listen for messages of specific types. This means that sender and receiver(s) are totally independent of each other.

Usually, the sender creates a message object (derived from TMessage) and broadcasts it using TMessageManager.SendMessage. Listeners can subscribe to messages of a specific type by calling TMessageManager.SubscribeToMessage, passing the type of message to listen for and a regular or anonymous method that will handle the message.

For our purposes, let’s declare a message type called TFreeNotificationMessage:

type
  TFreeNotificationMessage = class(TMessage);

This type doesn’t declare any additional data or methods. It is just used so you can subscribe to messages of this particular type.

Receiving messages

Let’s start with the receiving side. In the sample application, we have a FireMonkey frame called TFrameFreeNotifications that listens for messages of the type we just declared. It does this by subscribing to the message type in the constructor, and unsubscribing in the destructor:

constructor TFrameFreeNotifications.Create(AOwner: TComponent);
begin
  inherited;
  TMessageManager.DefaultManager.SubscribeToMessage(
    TFreeNotificationMessage, FreeNotificationListener);
end;

destructor TFrameFreeNotifications.Destroy;
begin
  TMessageManager.DefaultManager.Unsubscribe(
    TFreeNotificationMessage, FreeNotificationListener);
  inherited;
end;

You can create your own message managers, but in this example, we just use the default message manager that is globally available. In the listener method you can then perform some custom actions based on the object that is about to be destroyed. That object is passed in the Sender parameter. In the sample app, we just log its name:

procedure TFrameFreeNotifications.FreeNotificationListener(
  const Sender: TObject; const M: TMessage);
begin
  Assert(M is TFreeNotificationMessage);
  Assert(Sender is TFreeNotificationObject);
  Log('TFreeNotificationMessage received for: '
    + TFreeNotificationObject(Sender).Name);
end;

The sample uses a TFreeNotificationObject class for demonstration purposes. This class is derived from TFreeNotificationBase, which provides the functionality to send notification messages on destruction.

Sending messages

Normally, when you want to send a message, you will create an instance of a class derived from TMessage. You would then send that message using the message manager, which would automatically free the message after it has been handled.

For our purposes, it is overkill to create a new message object every time we send the notification. Instead, we want to create a single TFreeNotificationMessage instance and reuse it for every notification. We can do this by making it a (static) class variable and create it in a class constructor (and destroy it in a class destructor):

type
  TFreeNotificationBase = class abstract(TNode)
  private class var
    FNotificationMessage: TFreeNotificationMessage;
  public
    class constructor Create;
    class destructor Destroy;
  end;

class constructor TFreeNotificationBase.Create;
begin
  FNotificationMessage:= TFreeNotificationMessage.Create;
end;

class destructor TFreeNotificationBase.Destroy;
begin
  FNotificationMessage.Free;
end;

Then, when the object is about to be destroyed, we broadcast this message. But we don’t want the message manager to destroy the message though, since we want to reuse it later. We can accomplish this by setting the third parameter to SendMessage to False (meaning not to destroy the message):

destructor TFreeNotificationBase.Destroy;
begin
  TMessageManager.DefaultManager.SendMessage(Self,
    FNotificationMessage, False);
  inherited;
end;

Note that sending messages does incur a bit of overhead. So in the actual sample code, we only send the message if a FNotifyOnFree flag is set. To enable this flag, you need to call the EnableFreeNotification method on the object you want to be notified about.

Conclusion

Actually, I don’t have a conclusion. I have been experimenting with many ways to simplify memory management and tried to come up with models that work the same way on all platforms. That isn’t easy. I have looked into disabling ARC, using records instead of classes and implementing some form of smart pointers. I have even experimented with “classic” objects (you know, those from the Turbo Pascal days declared using the object keyword, but still available today) and garbage collection on non-ARC platforms.

But in the end it would just be much easier if Delphi implemented a uniform model into the language. Until that day comes, I think the closest we can get to uniform memory management is to use object interfaces. But as I said in the beginning of this post: you may totally disagree with me. And that is OK!

3 thoughts on “Experiments in Uniform Memory Management

Leave a Reply

Fill in your details below or click an icon to log in:

WordPress.com Logo

You are commenting using your WordPress.com account. Log Out / Change )

Twitter picture

You are commenting using your Twitter account. Log Out / Change )

Facebook photo

You are commenting using your Facebook account. Log Out / Change )

Google+ photo

You are commenting using your Google+ account. Log Out / Change )

Connecting to %s