Invalid pointer operation caused by mix of script initilization section and destructor

Issue #25 closed

Anders Melander created an issue 2015-09-04

I'm unsure about the precise cause of the problem. It occurs in a script complex involving eight units and several ~~hundred~~thousand lines of script code, but I believe the following two units reproduces the problem.

Order of initilization

As far as I have been able to determine the initialization section of the second unit is executed too late. Since the first unit has a dependency on the second unit, the initialization section of the second unit should execute before the first unit. What I'm seeing is that the initialization section of the second unit is executed after the first unit has called a function in the second unit. However, I don't think this is the actual problem.

Clearing references to self during destruction

The TMyStuff destructor clears an external reference to itself during destruction. Apparently this modifies the reference count of the internal TScriptObjInstance causing TInterfacedObject.BeforeDestruction to throw an Invalid pointer operation exception.

Main script file:

unit Test;

interface

uses
  Factory;

implementation

initialization
  MyStuff;
finalization
end;

Aux script file:

unit Factory;

interface

type
  TMyStuff = class
  public
    destructor Destroy; override;
  end;

function MyStuff: TMyStuff;

implementation

var
  FMyStuff: TMyStuff = nil;

destructor TMyStuff.Destroy;
begin
  FMyStuff := nil;
  inherited;
end;

function MyStuff: TMyStuff;
begin
  if (FMyStuff = nil) then
    FMyStuff := TMyStuff.Create;
  Result := FMyStuff;
end;

initialization
  FMyStuff := TMyStuff.Create;
finalization
  FMyStuff.Free;
end;

Comments (13)

Eric Grange repo owner
I could reproduce it with the following minimal test case.
```
type
    TTest = class
        Field : String = 'hello';
        destructor Destroy; override;
    end;

var ref := TTest.Create;

destructor TTest.Destroy;
begin
    ref := nil;
    PrintLn(Field);
end;
```
I could also reproduce the problem in Delphi for an Interface in a Variant: VarClear calls VariantClear (Windows API) which calls _Release before nil'ing the reference, so upon the next assignment to nil, it is called again, resulting in an invalid refcount.

IntfClear does it the other way (nil before calling _Release), which avoids the issue. I guess this means the VarClear calls to have variant that does not rely on the Windows API.
- 2015-09-04T12:56:18+00:00
Eric Grange repo owner
- changed status to resolved
Safer VarClear - Fixed Issue ~~#25~~

→ <<cset a85ebaebb704>>
- 2015-09-04T13:18:45+00:00
Anders Melander reporter
- changed status to open
I can confirm that your test case no longer reproduces the problem but unfortunately my test case still causes an Invalid pointer operation exception. I suspect there might be more than one problem in play here.
- 2015-09-05T15:12:23+00:00
Eric Grange repo owner
Do you have code in the destructor that acquires Self or copies the instance being destroyed (rather than nil'ing it)?

That's a case that also happens in Delphi, I am not sure yet how to best handle it without patching the rtl.

Code would be like
```
type
    TTest = class
        destructor Destroy; override;
    end;

var ref1, ref2 : TTest;

destructor TTest.Destroy;
begin
    ref2 := Self;
end;

ref1 := TTest.Create;
ref1 := nil;
```
IntfCopy does not check for refcount, so the above code copies the dying reference and increments its reference counter...
- 2015-09-07T06:36:47+00:00
Anders Melander reporter
- edited description
- 2015-09-07T09:18:56+00:00
Anders Melander reporter
No, I do not save the self pointer during destruction.

My actual code is very similar to the test case i posted, except that I only create the instance in the function - not in the initialization section. That said, my actual code produces another error, but I will create another issue for that once I have created a small test case.

Isn't the solution to this issue just that the assignment needs to clear the destination before it assigns the new value?

I.e. this:
```
Foo := TBar.Create;
```
should translate to:
```
var temp := TBar.Create;
Foo := nil;
Foo := temp;
```
- 2015-09-07T09:31:41+00:00
Eric Grange repo owner
Which issue would nil'ing before assignment solve?

The first issue comes from an improper nil'ing in the VariantClear function, the second issue is because code that assigns (in IntfCopy) does not check for reference count of the reference before copying and incrementing, so it can increase a refcount from zero back to 1. For instance the following code will crash witha stack overflow in Delphi when TTest is destroyed.
```
type
   TTest = class (TInterfacedObject)
      destructor Destroy; override;
   end;

procedure Boom(ref : IUnknown);
begin
   Assert(ref<>nil);
end;

destructor TTest.Destroy;
begin
   Boom(Self);
end;
```
Script engine can survive the basic variation of the above (as the destroyed object is getting wrapped), but there may be call chains that cannot be survived.
- 2015-09-07T09:51:50+00:00
Anders Melander reporter
I was referring to the issue that my test case demonstrates.

I'm away from the code so excuse me if I get it wrong but isn't this the sequence that causes the problem (and I'm still referencing my test case):
1. MyStuff function performs FMyStuff := TMyStuff.Create
  FMyStuff is nil. Create an instance and assign it to FMyStuff.
  No problem there.
2. Initialization section performs FMyStuff := TMyStuff.Create
  FMyStuff is already assigned.
  The new assignment causes the destructor to be called before the reference is actually cleared.
3. destructor TMyStuff.Destroy
  FMyStuff is still assigned.
  Assigning nil to FMyStuff juggles the internal reference count leading to an exception when the internal object is destroyed.
As far as I can tell (and I'm just guessing here) if an assignment internally was performed as a clear followed by a copy, then step 2 above would not cause the problem.
- 2015-09-07T12:04:58+00:00
Eric Grange repo owner
Ah, sorry, I had not seen it under that angle, I will look at it!
- 2015-09-07T12:24:13+00:00
Eric Grange repo owner
Ok spotted the issue, it's a mix of the two: VarCopy is doing a VarClear before the assignment (like in your suggestion), but using VariantClear, which does not nil the reference before calling _Release...

Variant copies in the code are hidden behind simple ":=" assignments, so hard to track them all to have them replace by explicit calls...
- 2015-09-07T12:48:18+00:00
Eric Grange repo owner
Committed a fix for the above case and hopefully some variant, the issue still likely exists in other cases.
- 2015-09-07T14:00:19+00:00
Anders Melander reporter
Excellent!
I can confirm that the test case no longer produces the problem.
- 2015-09-07T20:52:54+00:00
Eric Grange repo owner
- changed status to closed
Closing it for now, but phasing out all VarCopy calls I can find (explicit & implicit)
- 2015-09-08T09:20:48+00:00
Log in to comment

Assignee: –

Type: bug

Priority: major

Status: closed

Votes: 0

Watchers: 2