Watch, Follow, &
Connect with Us
Public Report
Report From: Delphi-BCB/RTL/Delphi/Thread support    [ Add a report in this area ]  
Report #:  22267   Status: Open
Synchronize and WaitFor methods of TThread can lead to a DeadLock in logically correct applications
Project:  Delphi Build #:  8.1
Version:    10.0 Submitted By:   Serdar S. Kacar
Report Type:  Crash / Data loss / Total failure Date Reported:  12/13/2005 5:59:31 AM
Severity:    Extreme corner case Last Updated: 3/20/2012 2:24:39 AM
Platform:    All platforms Internal Tracking #:   238648
Resolution: Deferred to Next Rel (Resolution Comments) Resolved in Build: : None
Duplicate of:  None
Voting and Rating
Overall Rating: (3 Total Ratings)
5.00 out of 5
Total Votes: 15
Description
Effected Compilers:
Delphi 7 (upto Update 1),
Delphi 2006 (upto Update 1), thanks to Sebastian Modersohn.
Status of this problem on the other compilers unknown - please test your compiler with the attached project and report.

Effected Applications:
  - that has more than two threads, and
  - that has a thread that calls Synchronize in its Execute and its WaitFor called in the Main thread as sequence of another thread's Synchronize call.

Symptoms:
Stalled applications.

This problem is very hard to detect;
- occurs in specific kind of applications,
- might occur even your application is logically correct,
- most probably occurs infrequently, and
- isolation is quite time consuming job.

Therefor, I guess many developers are either ignored it or made kind of fixes in their code without knowing the problem source. Not using "TThread.Synchronize" and/or "TThread.WaitFor" at all is one alternative.

In case your Delphi version does not exhibit this problem, an
affected application build with standard classes.pas of Delphi 7
update 1 is also added for your reference.
Steps to Reproduce:
How to reproduce?

See the attached project. It's ready; Single button click will show
you the problem.

How the problem occurs?

CheckSynchronize call process synchronization requests in batches.

If two or more Syncronization requests are in the SyncList (a
private list in Classes.pas), they form a batch in CheckSynchronize;
CheckSynchronize tries to handle them all in one call.
When WaitFor method of a thread called as consequence of one of the
synchronization requests, if that thread also has a synchronization
request in the same batch, it never gets an opportunity to complete
the request, and, exit the thread.

Most of the time we are lucky; The thread we are waiting has no
synchronization request in any CheckSynchronize call in the
CallStack when we call its WaitFor:

     MainThread         Thread1             Thread2
T0   ...                Calls Synchronize   ...
T1   CheckSynchronize   Wait                ...
T2   ...                                    Calls Synchronize
T3   (Somewhere in the sequence of CheckSynchronize)
       Thread2.Terminate;                   Wait
       Thread2.WaitFor;
         CheckSynchronize;
         (Thread2.WaitFor handles Thread2's synchronization request gracefully here)
T4                                          Noticed Terminated
                                            Finished

Other times not:
      
     MainThread         Thread1             Thread2
T0   ...                Calls Synchronize   ...
T1   ...                Wait                Calls Synchronize
T2   CheckSynchronize                       Wait
     (Thread1 and Thread2's synchronization requests formed a batch here.)
T3   (Somewhere in the sequence of CheckSynchronize)
       Thread2.Terminate;
       Thread2.WaitFor;
         And the application seems unresponsive from that point on.
Workarounds
Until Borland approves that this is a problem and fixes it,
do not WaitFor threads that use Synchronize method in any sequence
of any other threads' Synchronize calls.

The fix shall be avoiding batch processing of Syncronization
requests in CheckSynchronize as the one in "Fixed_Classes.pas" in
the attached files.


If you are curious and using a compiler other than Delphi 7 update 1, here is how to make a fixed classes.pas for your complier  :
(one for Delphi 7 update 1 already included in the attachments)

{
  FOR EXPERIMENTAL PURPOSES ONLY.
  YOU HAD BETTER NOT USE IN THE OTHER APPLICATIONS - WAIT for the offical Borland fix.
  ONLY IF your compiler does exhibit symptoms described.
}

1. Copy your current classes.pas to the directory of this project.
2. Add copied classes.pas to your project.
WARNING: From that point on double check / ensure that you are working on the local copy of classes.pas . Do not edit the original one.
3. Locate CheckSynchronize function definition.
4. We are trying to replace batch processing of synchronization requests with one-at-a-time. Relevant changes are as follows:

function CheckSynchronize(Timeout: Integer = 0): Boolean;
var
  ..
  // LocalSyncList: TList;
begin
  ..
  //LocalSyncList := nil;
  ..
    //Integer(LocalSyncList) := InterlockedExchange(Integer(SyncList), Integer(LocalSyncList));
    ..
      Result := (SyncList <> nil) and (SyncList.Count > 0);
      //Result := (LocalSyncList <> nil) and (LocalSyncList.Count > 0);
        ..
        while SyncList.Count > 0 do
        //while LocalSyncList.Count > 0 do
          ..
          SyncProc := SyncList[0];
          SyncList.Delete(0);
          (*
          SyncProc := LocalSyncList[0];
          LocalSyncList.Delete(0);
          *)

          { sskacar: BEWARE: following code segment is an addition
            for TThread.WaitFor when called in the MainThread. }
          if SyncList.Count > 0 then
            SignalSyncEvent;
      ..

      //LocalSyncList.Free;
  ..
end;


5. If you see
    - .....  The Pawn says "Wish you hear me!" , and
    - Woav! "The Pawn"s thread ended succesfully.
    lines in the memo box, then congratulations you have achived a kind of fix - though it may not be a complete fix!

    Name it to "Fixed_classes for Delphi [your complier here].pas" and post to the attachments section.
    WARN yourself and anybody that this version of classes.pas is for experimental purposes only!
Attachment
Synchronize And WaitFor.zip
Comments

Lieven Keersmaekers at 12/13/2005 11:27:31 PM -
You might want to read "Multithreading - The Delphi way" from Martin Harvey, especially the section about
"Have you spotted the bug? WaitFor and Synchronize: An introduction to deadlock."

It can be found at http://www.pergolesi.demon.co.uk/prog/threads/ToC.html

Serdar S. Kacar at 12/14/2005 4:09:11 AM -
Yes, I had read that article before ; while I was trying to understand what's going on. The mentioned bug is no longer the case for (not reproducable by) Delphi 6, Delphi 7, and (I believe) the next versions.

The article was written in 2000. I guess at that time WaitFor method had written by ignoring "Current Thread is the Main Thread" condition - something like,

function TThread.WaitFor: LongWord;
var
  H: THandle;
begin
  H := FHandle;
  WaitForSingleObject(H, INFINITE);
  CheckThreadError(GetExitCodeThread(H, Result));
end;

Only a method written like this can cause the bug M. Harvey mentioned. Now its WINDOWS portion is like

function TThread.WaitFor: LongWord;
{$IFDEF MSWINDOWS}
var
  H: array[0..1] of THandle;
  WaitResult: Cardinal;
  Msg: TMsg;
begin
  H[0] := FHandle;
  if GetCurrentThreadID = MainThreadID then
  begin
    WaitResult := 0;
    H[1] := SyncEvent;
    repeat
      { This prevents a potential deadlock if the background thread
        does a SendMessage to the foreground thread }
      if WaitResult = WAIT_OBJECT_0 + 2 then
        PeekMessage(Msg, 0, 0, 0, PM_NOREMOVE);
      WaitResult := MsgWaitForMultipleObjects(2, H, False, 1000, QS_SENDMESSAGE);
      CheckThreadError(WaitResult <> WAIT_FAILED);
      if WaitResult = WAIT_OBJECT_0 + 1 then
        CheckSynchronize;
    until WaitResult = WAIT_OBJECT_0;
  end else WaitForSingleObject(H[0], INFINITE);
  CheckThreadError(GetExitCodeThread(H[0], Result));
end;
{$ENDIF}

This Delphi 6/7[/later] version, when called in the main thread, handles
- PENDING synchronization requests and
- messages that require main thread action.

Lieven Keersmaekers at 12/14/2005 11:23:29 PM -
Impressive, you've definitly done your homework. I stand corrected.

Sebastian Modersohn at 1/14/2006 7:58:11 AM -
I've downloaded the sample project and compiled with D2006 Upd.1 and the project also kind of hang (the memo on the right side still got filled by the "good" thread, but I couldn't close the app; had to kill as described).
That's the same behaviour the original exe supplied gives as well, so I guess I have reproduced the problem?

I also tried to make a diff of classes.pas from D7 and BDS 4, but there are too many changes!

Serdar S. Kacar at 1/25/2006 4:10:59 AM -
Yes, thank you for your effort. I have added D2006 (upto Update 1) to the effected compliers list.

I have also updated Workaround section to describe how to make fixed classes.pas for your complier.

Server Response from: ETNACODE01