Public Report
Report From: Delphi-BCB/RTL/Delphi/Thread support
Report #: 83092
Status: Open
Deadlock in TThread.Synchronize
Project: Delphi
Build #: 8.1
Version: 7.0
Submitted By: Oleg Gusakov
Report Type: Crash / Data loss / Total failure
Date Reported: 3/17/2010 5:28:58 AM
Severity: Infrequently encountered problem
Last Updated: 3/20/2012 2:24:39 AM
Platform: All platforms
Internal Tracking #: 275583
Resolution: None (Resolution Comments)
Resolved in Build: None
Duplicate of: None
Voting and Rating
Overall Rating: No Ratings Yet
0.00 out of 5
Total Votes: None
Description
All compiler versions appear to be affected.

The problem is in the TThread.Synchronize method. Only the relevant code is shown:

class procedure TThread.Synchronize(ASyncRec: PSynchronizeRecord; QueueEvent: Boolean = False);
  . . .
      EnterCriticalSection(ThreadLock);
      try
        . . .
        if Assigned(WakeMainThread) then
          WakeMainThread(SyncProcPtr.SyncRec.FThread);
        if not QueueEvent then
{$IFDEF MSWINDOWS}
        begin
          LeaveCriticalSection(ThreadLock);
          try
            WaitForSingleObject(SyncProcPtr.Signal, INFINITE);
          finally
            EnterCriticalSection(ThreadLock);
          end;
        end;

The deadlock occurs when the OS switches to the main thread during or right after the WakeMainThread call. At that point WM_NULL has already been posted but ThreadLock is still held. In the main thread, TApplication.WndProc calls CheckSynchronize, which tries to enter ThreadLock, and we get a deadlock.
Steps to Reproduce:
Add Sleep(1) to TApplication.WakeMainThread, then call TThread.Synchronize from a secondary thread.
Workarounds
Perhaps ThreadLock should be released before the WakeMainThread event is called.
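
A minimal sketch of that idea, written as a standalone model rather than as a patch to Classes.pas. QueueLock, Pending, WakeConsumer and Enqueue are hypothetical stand-ins for ThreadLock, SyncList, the WakeMainThread handler and the queuing part of TThread.Synchronize. It only shows the ordering: the item is queued under the lock, and the wake-up happens after the lock is released. Whether the real RTL code can be reordered this way without side effects is not established here.

```pascal
program WakeOutsideLock;
{$APPTYPE CONSOLE}

uses
  Windows, Classes;

var
  QueueLock: TRTLCriticalSection;  // hypothetical stand-in for ThreadLock
  Pending: TList;                  // hypothetical stand-in for SyncList

// Hypothetical stand-in for the WakeMainThread event handler.
procedure WakeConsumer;
begin
  // A real handler would post WM_NULL to the main thread here.
  // If it ever blocks, it no longer does so while QueueLock is held.
end;

procedure Enqueue(Item: Pointer);
begin
  EnterCriticalSection(QueueLock);
  try
    Pending.Add(Item);
  finally
    LeaveCriticalSection(QueueLock);
  end;
  // Wake the consumer only after the lock has been released.
  WakeConsumer;
end;

begin
  InitializeCriticalSection(QueueLock);
  Pending := TList.Create;
  try
    Enqueue(Pointer(1));
    Writeln('Queued items: ', Pending.Count);
  finally
    Pending.Free;
    DeleteCriticalSection(QueueLock);
  end;
end.
```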
Attachment
Test.zip
Comments

Tomohiro Takahashi at 3/17/2010 6:14:00 AM -
Could you please attach a sample project that reproduces your issue?

Oleg Gusakov at 3/17/2010 7:11:51 AM -
Sorry, but I don't know how to reproduce it in a separate project. I caught it when my main form hung and I paused the program in the IDE debugger. The call stack of the second thread shows TApplication.WakeMainThread (on the PostMessage line) as the last execution point. The main thread hangs in CheckSynchronize on "EnterCriticalSection(ThreadLock)".

So I only assume that adding "Sleep(1)" to TApplication.WakeMainThread should always hang the application in TThread.Synchronize.

Oleg Gusakov at 3/17/2010 7:39:47 AM -
It seems the problem is something else. The deadlock can occur only if WakeMainThread waits for something from the main thread, but PostMessage does not. The Sleep actually changes nothing: a thread switch after PostMessage just keeps the lock held for a very short time rather than causing a deadlock.

Right now I have no idea how it happens.

Oleg Gusakov at 3/17/2010 9:43:18 AM -
I have attached a test project. I understand that it is very far-fetched, but I cannot come up with a better test case right now. In the project that hangs, one component (TRxTimer from the old RX library) also calls TThread.Suspend, though not as aggressively.

Jason Sprenger at 3/18/2010 11:35:43 AM -
TThread.Suspend and Resume for thread synchronization have proven to be problematic.  In the current release, they both have been deprecated with the intention of removing them from TThread entirely.

Design to avoid the use of the Suspend method.  There are legitimate reasons for creating suspended threads and starting them with "resume", but starting a thread and then suspending it, particularly when Synchronize may be in use, will lead to this sort of deadlock situation.
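
One common way to design around Suspend is to let the thread pause itself at a known point by waiting on an event. The following is a minimal sketch under that assumption; TPausableThread, Pause and Unpause are hypothetical names, not part of the RTL or of the RX library. Because the thread only ever stops inside WaitFor, it cannot be frozen while it owns ThreadLock or any other lock.

```pascal
program PauseWithEvent;
{$APPTYPE CONSOLE}

uses
  Windows, Classes, SyncObjs;

type
  // Hypothetical worker that pauses cooperatively instead of being suspended.
  TPausableThread = class(TThread)
  private
    FRunEvent: TEvent;  // signaled = run, non-signaled = paused
    FTicks: Integer;
  protected
    procedure Execute; override;
  public
    constructor Create;
    destructor Destroy; override;
    procedure Pause;
    procedure Unpause;
    property Ticks: Integer read FTicks;
  end;

constructor TPausableThread.Create;
begin
  // Manual-reset event, initially signaled. Created before the thread can run.
  FRunEvent := TEvent.Create(nil, True, True, '');
  inherited Create(False);
end;

destructor TPausableThread.Destroy;
begin
  Terminate;
  WaitFor;  // Execute re-checks Terminated at least every 100 ms
  FRunEvent.Free;
  inherited;
end;

procedure TPausableThread.Pause;
begin
  FRunEvent.ResetEvent;
end;

procedure TPausableThread.Unpause;
begin
  FRunEvent.SetEvent;
end;

procedure TPausableThread.Execute;
begin
  while not Terminated do
    // While paused, the thread blocks here, at a point where it holds no locks.
    if FRunEvent.WaitFor(100) = wrSignaled then
    begin
      Inc(FTicks);  // placeholder for the real work
      Sleep(10);
    end;
end;

var
  Worker: TPausableThread;
begin
  Worker := TPausableThread.Create;
  try
    Sleep(200);
    Worker.Pause;
    Writeln('Paused after ', Worker.Ticks, ' ticks');
    Worker.Unpause;
    Sleep(200);
    Writeln('Finished after ', Worker.Ticks, ' ticks');
  finally
    Worker.Free;
  end;
end.
```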

Oleg Gusakov at 3/22/2010 2:39:02 AM -
Agreed, Suspend is bad style. My investigation shows that the problem was exactly in TRxTimer, which suspends its internal thread forever after Enabled is set to False. If that thread has already entered ThreadLock, the main thread also blocks forever.

So you can close this report as "As designed" or "Third Party", but I can suggest a change to avoid this problem. This code:

function CheckSynchronize(Timeout: Integer = 0): Boolean;
begin
  . . .
  EnterCriticalSection(ThreadLock);
  try
    . . .

could be replaced with something like this:

function CheckSynchronize(Timeout: Integer = 0): Boolean;
begin
  . . .
  if not TryEnterCriticalSection(ThreadLock) then
    PostThreadMessage(GetCurrentThreadID, WM_NULL, 0, 0)
  else
    try
      . . .

This prevents the main thread from blocking on ThreadLock. The PostThreadMessage call is perhaps unnecessary and could put many messages in the queue (unfortunately, its presence cannot be checked with PeekMessage), but it guarantees that the sync call is processed as soon as possible.

Dick Boogaers at 3/22/2010 5:03:22 AM -
There's an obvious flaw in your work-around: the Result should be set to false.

Oleg Gusakov at 3/22/2010 7:00:45 AM -
It wasn't a final, correct workaround, only an idea (I can't foresee all possible side effects of this change). Actually, it should also call WakeMainThread() instead of PostThreadMessage().
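
Putting the two corrections together (Result set to False, and a wake-up request instead of PostThreadMessage), the idea can be sketched as a standalone model. This is not the RTL's CheckSynchronize: QueueLock, Pending, RequestRetry and DrainPending are hypothetical stand-ins for ThreadLock, SyncList, WakeMainThread and CheckSynchronize. Note that if the lock owner never releases the lock (for example a thread suspended forever), every attempt requests another retry, so the caller keeps polling instead of hanging.

```pascal
program TryEnterDrain;
{$APPTYPE CONSOLE}

uses
  Windows, Classes;

var
  QueueLock: TRTLCriticalSection;  // hypothetical stand-in for ThreadLock
  Pending: TList;                  // hypothetical stand-in for SyncList
  RetryRequested: Boolean;         // set when a drain attempt had to be skipped

// Hypothetical stand-in for WakeMainThread: ask for another attempt later.
procedure RequestRetry;
begin
  RetryRequested := True;
end;

// Non-blocking counterpart of CheckSynchronize. Returns True only if
// at least one queued item was processed.
function DrainPending: Boolean;
begin
  Result := False;  // also covers the "lock is busy" path
  if not TryEnterCriticalSection(QueueLock) then
  begin
    RequestRetry;
    Exit;
  end;
  try
    Result := Pending.Count > 0;
    Pending.Clear;  // placeholder for running the queued methods
  finally
    LeaveCriticalSection(QueueLock);
  end;
end;

begin
  InitializeCriticalSection(QueueLock);
  Pending := TList.Create;
  try
    Pending.Add(Pointer(1));
    if DrainPending then
      Writeln('Processed queued items')
    else
      Writeln('Nothing processed; retry requested: ', RetryRequested);
  finally
    Pending.Free;
    DeleteCriticalSection(QueueLock);
  end;
end.
```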

Oleg Gusakov at 3/18/2010 12:08:21 PM -
I have just caught the deadlock again, this time without any thread suspending.

- Stack of the main thread:
CheckSynchronize(0)
TApplication.Idle((1182830, 15, 0, 0, 33689593, (594, 384)))
TApplication.HandleMessage
TApplication.Run
Client

From ProcessXP:
ntkrnlpa.exe!KiSwapContext+0x26
ntkrnlpa.exe!KiSwapThread+0x2e5
ntkrnlpa.exe!KeWaitForSingleObject+0x346
ntkrnlpa.exe!KiSuspendThread+0x18
ntkrnlpa.exe!KiDeliverApc+0x117
ntkrnlpa.exe!KiSwapThread+0x300
ntkrnlpa.exe!KeWaitForSingleObject+0x346
ntkrnlpa.exe!NtWaitForSingleObject+0x9a
ntkrnlpa.exe!KiFastCallEntry+0xfc
ntdll.dll!KiFastSystemCallRet
ntdll.dll!ZwWaitForSingleObject+0xc
ntdll.dll!RtlpWaitOnCriticalSection+0x1a3
ntdll.dll!RtlEnterCriticalSection+0xa8
House.exe+0x313b4
House.exe+0xce1cb
House.exe+0xcd5af
House.exe+0xcd83b
House.exe+0x4cad50
kernel32.dll!BaseProcessStart+0x23

Execution point in Classes.pas:

function CheckSynchronize(Timeout: Integer = 0): Boolean;
var
  SyncProc: PSyncProc;
  LocalSyncList: TList;
begin
  if GetCurrentThreadID <> MainThreadID then
    raise EThread.CreateResFmt(@SCheckSynchronizeError, [GetCurrentThreadID]);
  if Timeout > 0 then
    WaitForSyncEvent(Timeout)
  else
    ResetSyncEvent;
  LocalSyncList := nil;
  EnterCriticalSection(ThreadLock);
  try      <<<<<<<<<< EIP
    Integer(LocalSyncList) := InterlockedExchange(Integer(SyncList), Integer(LocalSyncList));



- Stack of the TRxTimer thread:
TThread.Synchronize($1A04D20)
TThread.Synchronize($133B350)
TRxTimer.Synchronize($133B350)
TTimerThread.Execute
ThreadProc($1A04D00)
ThreadWrapper($13ABD60)

From ProcessXP:
ntkrnlpa.exe!KiSwapContext+0x26
ntkrnlpa.exe!KiSwapThread+0x2e5
ntkrnlpa.exe!KeWaitForSingleObject+0x346
ntkrnlpa.exe!KiSuspendThread+0x18
ntkrnlpa.exe!KiDeliverApc+0x117
hal.dll!HalpDispatchSoftwareInterrupt+0x49
hal.dll!HalpCheckForSoftwareInterrupt+0x81
hal.dll!HalEndSystemInterrupt+0x67
hal.dll!HalpIpiHandler+0xd2
ntdll.dll!RtlEnterCriticalSection+0x1d
House.exe+0x318ba
House.exe+0x319e7
House.exe+0xdc062
House.exe+0xdbbac
House.exe+0x31529
House.exe+0x53ea
kernel32.dll!BaseThreadStart+0x34

Execution point in Classes.pas:
class procedure TThread.Synchronize(ASyncRec: PSynchronizeRecord);
var
  SyncProc: TSyncProc;
begin
  if GetCurrentThreadID = MainThreadID then
    ASyncRec.FMethod
  else
  begin
{$IFDEF MSWINDOWS}
    SyncProc.Signal := CreateEvent(nil, True, False, nil);
    try
{$ENDIF}
{$IFDEF LINUX}
      FillChar(SyncProc, SizeOf(SyncProc), 0);  // This also initializes the cond_var
{$ENDIF}
      EnterCriticalSection(ThreadLock);
      try  <<<<<<<<<< EIP

Current ThreadLock value:
_RTL_CRITICAL_SECTION = record
    DebugInfo: $16A6E0
      Type_18: 0
      CreatorBackTraceIndex: 0
      CriticalSection: $8FEA98
      ProcessLocksList: ($169968, $169740)
      EntryCount: 0
      ContentionCount: 21
      Spare: array[0..1] of (0, 0)
    LockCount: -6
    RecursionCount: 0
    OwningThread: 0
    LockSemaphore: 1480
    Reserved: 0
  end;

OwningThread is zero, yet neither thread can enter the critical section. I still have no idea how this can happen. The TRxTimer is created by TRxGifAnimator, another RX Library component.

Dick Boogaers at 3/22/2010 4:56:23 AM -
Is this problem related to QC 22267?

Oleg Gusakov at 3/22/2010 7:09:45 AM -
No, it isn't. This problem (83092) occurs only when using TThread.Suspend calls.
