Inline Variables can increase performance

Delphi 10.3 Rio will add inline variables to the language. Besides the various benefits mentioned in Marco’s blog post that introduces the concept, inline variables can potentially improve performance of your code.

It’s all about scope

If you have been reading my articles, then you may know that I am a sucker for performance. That’s one of the reasons I am very excited about this addition to the Delphi language. Marco’s post does an excellent job of explaining the syntax and benefits, so I will not go into that here. Instead, let’s jump right into an example. Consider the following (“classical”) code:

type
  TFoo = record
    I: Integer;
    S: String;
  end;

procedure TestLocalVars(const ACondition: Boolean);
var
  S: String;
  I: IInterface;
  F: TFoo;
begin
  if (ACondition) then
  begin
    S := 'Local String';
    I := TInterfacedObject.Create;
    F.S := 'Managed Record';
  end;
end;

This code uses 3 local variables. In addition, all these variables are “managed” variables, meaning that Delphi adds some code behind the scenes to manage initialization, cleanup, and assignment of these variables. A managed variable is a variable that is (implicitly or explicitly) reference counted, such as a string, dynamic array, object interface or a plain object on ARC platforms (as long as we still have ARC). A record that contains one or more managed fields is also a managed type, such as the TFoo type in this example.

All Delphi compilers (including the upcoming 10.3 version) convert this example to the following (pseudo) code:

procedure TestLocalVars(const ACondition: Boolean);
var
  S: String;
  I: IInterface;
  F: TFoo;
begin
  S := nil;            // Added by compiler
  I := nil;            // Added by compiler
  InitializeRecord(F); // Added by compiler
  if (ACondition) then
  begin
    S := 'Local String';
    I := TInterfacedObject.Create;
    F.S := 'Managed Record';
  end;
  FinalizeRecord(F);   // Added by compiler
  IntfClear(I);        // Added by compiler
  UStrClr(S);          // Added by compiler
end;

As you can see, Delphi implicitly adds initialization to the beginning of the routine to clear out the string and interface, and to initialize the managed record. At the end of the routine, it cleans up the record, interface and string. These cleanup routines basically check the variables for nil and, if they are not nil, decrease their reference counts, which could result in releasing their memory. These routines are not very expensive, but they are not cheap either. In particular, FinalizeRecord is a bit expensive since it uses RTTI to traverse all managed fields in the record and recursively clean them up.

Note that InitializeRecord and FinalizeRecord are only called for “managed” records (that is, records with any managed fields). For POD records, this overhead does not exist.

As you can see, the initialization and finalization code is always executed, even if ACondition is False. It would be more efficient if this code were only executed when the if-statement evaluates to True. This is where inline variables come to the rescue.

Limiting Scope with Inline Variables

The same example using inline variables can be rewritten to this:

procedure TestInlineVars(const ACondition: Boolean);
begin
  if (ACondition) then
  begin
    var S := 'Inline String';
    var I: IInterface := TInterfacedObject.Create;
    var F: TFoo;
    F.S := 'Managed Record';
  end;
end;

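Note that for the S variable, we can take advantage of type inference. This is not possible in this example for the I and F variables: inferring the type of I from TInterfacedObject.Create would make it a TInterfacedObject rather than an IInterface, and F has no initializer to infer a type from.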

The new Delphi compiler converts this to the following (pseudo) code (at least ideally):

procedure TestInlineVars(const ACondition: Boolean);
begin
  if (ACondition) then
  begin
    var S: String := nil;     // Added by compiler
    S := 'Inline String';
    var I: IInterface := nil; // Added by compiler
    I := TInterfacedObject.Create;
    var F: TFoo;
    InitializeRecord(F);      // Added by compiler
    F.S := 'Managed Record';
    FinalizeRecord(F);        // Added by compiler
    IntfClear(I);             // Added by compiler
    UStrClr(S);               // Added by compiler
  end;
end;

Now, the initialization and finalization code is only executed if necessary. This improves performance, especially if ACondition is usually False.

The code that Delphi actually generates is different from the pseudo-code presented here. It is not as efficient as it should be, but hopefully this will improve in the future.
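Since the generated code may differ from the ideal, it is worth measuring on your own compiler version. The following is a minimal sketch of such a measurement, assuming Delphi 10.3 or later and the TStopwatch record from System.Diagnostics. The program name, the RunBenchmark routine and the iteration count are made up for illustration, and the numbers will vary by platform and compiler settings.

```delphi
program InlineVarBench;

{$APPTYPE CONSOLE}

uses
  System.SysUtils,
  System.Diagnostics;

type
  TFoo = record
    I: Integer;
    S: String;
  end;

// Classical version: managed locals are declared at routine scope.
procedure TestLocalVars(const ACondition: Boolean);
var
  S: String;
  I: IInterface;
  F: TFoo;
begin
  if (ACondition) then
  begin
    S := 'Local String';
    I := TInterfacedObject.Create;
    F.S := 'Managed Record';
  end;
end;

// Inline version: managed locals are scoped to the if-block.
procedure TestInlineVars(const ACondition: Boolean);
begin
  if (ACondition) then
  begin
    var S := 'Inline String';
    var I: IInterface := TInterfacedObject.Create;
    var F: TFoo;
    F.S := 'Managed Record';
  end;
end;

// Hypothetical helper: times both routines for the "condition is False" case.
procedure RunBenchmark;
const
  ITERATIONS = 10000000; // Arbitrary; adjust for your machine
begin
  var Watch := TStopwatch.StartNew;
  for var N := 1 to ITERATIONS do
    TestLocalVars(False);
  Writeln('Local vars:  ', Watch.ElapsedMilliseconds, ' ms');

  Watch := TStopwatch.StartNew;
  for var N := 1 to ITERATIONS do
    TestInlineVars(False);
  Writeln('Inline vars: ', Watch.ElapsedMilliseconds, ' ms');
end;

begin
  RunBenchmark;
end.
```

Passing False exercises the path where the classical version still pays for initialization and finalization. Run it in a Release build, and also try True to see the cost when the block actually executes.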

This may be a bit of a contrived example, but I have run into several situations in the past where inline variables would have (considerably) improved performance. For example, our JSON library contains code to parse JSON files. At some point, it uses a case statement to perform specific actions depending on the type of the data element currently read. Most cases use temporary strings or interface variables, which would go through initialization and finalization code. However, only one case in the case statement would actually be executed, unnecessarily running the initialization and finalization code for all other temporary variables that weren’t used. This had a measurable negative impact on parsing performance. I fixed that by having each case in the case statement call a different routine, and declare those variables inside those routines only. While this may be good practice in many cases, in this case it actually hurt readability of the code. Inline variables would have been a much better solution!
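As a rough illustration of that situation (not the actual code from our JSON library), the sketch below scopes each temporary to the case branch that needs it. TTokenKind and HandleToken are hypothetical names invented for this example, and it assumes Delphi 10.3 or later.

```delphi
program CaseInlineVars;

{$APPTYPE CONSOLE}

uses
  System.SysUtils;

type
  // Hypothetical token kinds; not taken from any real JSON library.
  TTokenKind = (tkNull, tkNumber, tkString);

procedure HandleToken(const AKind: TTokenKind; const AText: String);
begin
  case AKind of
    tkNull:
      // No managed locals are declared on this path.
      Writeln('null');
    tkNumber:
      begin
        // Integer is not a managed type, so it has no init/cleanup cost.
        var Value := StrToInt(AText);
        Writeln('number: ', Value);
      end;
    tkString:
      begin
        // Managed temporary, scoped to this branch only.
        var Upper: String := UpperCase(AText);
        Writeln('string: ', Upper);
      end;
  end;
end;

begin
  HandleToken(tkNull, '');
  HandleToken(tkNumber, '42');
  HandleToken(tkString, 'hello');
end.
```

With routine-level var declarations, the string temporary would be initialized and finalized on every call, including the tkNull and tkNumber paths. Whether the compiler fully avoids that work for the inline version depends on the code it actually generates.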

Implicit Local Variables

While it may not always be obvious, Delphi regularly creates temporary local variables behind the scenes. Consider this fragment:

procedure TestImplicitLocalVar(const AValue: Integer);
begin
  if (AValue <> 0) then
    ShowMessage(Format('Value: %d', [AValue]));
end;

The Delphi compiler will add a temporary string variable here to store the result of the Format call. The actual (pseudo) code that Delphi generates is:

procedure TestImplicitLocalVar(const AValue: Integer);
begin
  var Temp: String := nil; // Added by compiler
  if (AValue <> 0) then
  begin
    Temp := Format('Value: %d', [AValue]); // Added by compiler
    ShowMessage(Temp);
  end;
  UStrClr(Temp);          // Added by compiler
end;

Notice that UStrClr is always called, whether AValue is 0 or not. This is especially bad if AValue is almost always 0. For example, this routine could check an error code and show an error message in case the code is not 0. In the vast majority of cases, the error code will be 0 and calling this routine should be very cheap. However, it calls some implicit initialization and finalization code every time the routine is invoked, resulting in non-trivial overhead.

You can improve performance by making the implicitly added variable explicit, and turning it into an inline variable:

procedure TestExplicitInlineVar(const AValue: Integer);
begin
  if (AValue <> 0) then
  begin
    var Temp := Format('Value: %d', [AValue]);
    ShowMessage(Temp);
  end;
end;

Now, the string is only initialized and cleaned up when AValue is not 0.

Ideally, Delphi would inline all implicitly generated temporary managed variables behind the scenes, so we would not have to take care of this ourselves. That would be a nice feature for a future update…

Use with Care

Now I am not saying that you should start modifying your existing code to use inline variables everywhere. It can definitely improve performance for routines that are called a lot or inside tight loops, but there are some drawbacks too.

Currently, inline variables can increase the size of the generated code, especially if they are of a managed type. This extra code also makes inline managed variables not as efficient as they could be. But then again, this is version 1.0 of this feature, so we could see improvements in the future.

Inline variables have a lot of potential when used wisely. Not only can they improve performance, they can also improve readability, maintainability and in some cases reliability of your code. As with everything, use them if it makes sense, but not just because they exist.

14 thoughts on “Inline Variables can increase performance”

  1. Using RTTI to clean up records whose structure is known to the compiler at compile time is a compiler defect, as is placing UStrClr(Temp) outside the condition in which the temporary string is created.
    Introducing a new language feature to deal with the compiler defects is a bad decision in my opinion.

    1. Inline variables are not introduced to deal with compiler defects. They have a lot of purposes and benefits in their own right. They can be used to work around this particular compiler “defect” however. Maybe now that we have this language feature, it will be easier for the compiler team to apply it to implicitly generated local variables as well…

    2. I agree that using RTTI to clean up records is not correct; it just adds extra work!
      Erik van Bilsen, did you report this?

  2. I notice in my Rio editor that inline variables cause syntax checking to highlight them as errors but the code compiles correctly. Is there a setting to tell the editor to allow inline variables when checking syntax?

  3. It would be cool to summarize coding practices that improve performance:

    • Pass parameters as const when possible
    • Don’t raise exceptions within the method itself; call a dedicated routine that raises instead, e.g.
    F(param): if param = 0 then ErrorNotValidCommand()
    • Use inline vars carefully for managed types (as you mention)
    • Don’t inline functions that return managed types
    • Use [unsafe] for interfaces when possible so they behave as pure smart pointers (this one is tricky but yields lots of performance)

    others?

    1. Those are all good ways to improve performance. Some also serve as “self documenting”. For example, I almost always use const parameters, even for integers, since it makes it explicit that the parameter is an input parameter. [unsafe] also means that that variable is just a reference and not “owned”.

      There are probably other ways to improve performance as well, as long as you don’t lose sight of readability. May be worth a blog post…

      Thanks for pointing these out!

      1. This is an interesting one.

        It is about how the CPU reads memory in advance. To me, the “natural” way to loop over an array like D is J and then I, but the opposite seems much faster.

        procedure TForm25.FormCreate(Sender: TObject);
        var
          t1, t2: Cardinal;
          i, j, k: Integer;
          d: array[0..255, 0..255] of Integer;
        begin
          // Inner loop runs over the first (left-most) dimension
          t1 := GetTickCount;
          for k := 1 to 1000 do
            for j := 0 to 255 do
              for i := 0 to 255 do
                d[i, j] := 1;
          t1 := GetTickCount - t1;

          // Inner loop runs over the last (right-most) dimension
          t2 := GetTickCount;
          for k := 1 to 1000 do
            for i := 0 to 255 do
              for j := 0 to 255 do
                d[i, j] := 1;
          t2 := GetTickCount - t2;

          ShowMessage(Integer.ToString(t1) + '/' + Integer.ToString(t2));
        end;

        displaying 219/15 ms.

  4. Yes, this is a known optimization technique. Your innermost loop should always iterate over the inner (right-most) array dimension. This ensures that you are accessing consecutive data so you can take advantage of the CPU cache. By doing it the other way around, you are jumping all over the place and thrashing the cache on each access.
