Crash Dump Analysis Patterns (Part 9c)
This is another variant of Deadlock pattern when we have mixed synchronization objects, for example, events and critical sections. An event may be used to signal the availability of some work item for processing it, the fact that the queue is not empty and a critical section may be used to protect some shared data.
The typical deadlock scenario here is when one thread resets an event by calling WaitForSingleObject and tries to acquire a critical section. In the mean time the second thread has already acquired that critical section and now is waiting for the event to be set:
Thread A | Thread B
.. | ..
reset Event | ..
.. | acquire CS
wait for CS | ..
| wait for Event
The classical fix to this bug is to acquire the critical section and wait for the event in the same order in both threads.
In our example crash dump we can easily identify the second thread that acquied the critical section and is waiting for the event 0×480:
0:000> !locks
CritSec ntdll!LdrpLoaderLock+0 at 7c889d94
WaiterWoken No
LockCount 9
RecursionCount 1
OwningThread 2038
EntryCount 0
ContentionCount 164
*** Locked
13 Id: 590.2038 Suspend: 1 Teb: 7ffaa000 Unfrozen
ChildEBP RetAddr Args to Child
0483fd5c 7c822124 77e6bad8 00000480 00000000 ntdll!KiFastSystemCallRet
0483fd60 77e6bad8 00000480 00000000 00000000 ntdll!NtWaitForSingleObject+0xc
0483fdd0 77e6ba42 00000480 ffffffff 00000000 kernel32!WaitForSingleObjectEx+0xac
0483fde4 776cfb30 00000480 ffffffff 777904f8 kernel32!WaitForSingleObject+0×12
0483fe00 776adfaa 00000480 00000000 00000080 ole32!CDllHost::ClientCleanupFinish+0×2a
0483fe2c 776adf1a 00000000 0483fe7c 77790828 ole32!DllHostProcessUninitialize+0×80
0483fe4c 776b063f 00000000 00000000 0c9ecee0 ole32!ApartmentUninitialize+0xf8
0483fe64 776b06e3 0483fe7c 00000000 00000001 ole32!wCoUninitialize+0×48
0483fe80 776e43f5 00000001 77670000 776afef0 ole32!CoUninitialize+0×65
0483fe8c 776afef0 0483feb4 776b5cb8 77670000 ole32!DoThreadSpecificCleanup+0×63
0483fe94 776b5cb8 77670000 00000003 00000000 ole32!ThreadNotification+0×37
0483feb4 776b5c1b 77670000 00000003 00000000 ole32!DllMain+0×176
0483fed4 7c82257a 77670000 00000003 00000000 ole32!_DllMainCRTStartup+0×52
0483fef4 7c83c195 776b5bd3 77670000 00000003 ntdll!LdrpCallInitRoutine+0×14
0483ffa8 77e661d6 00000000 00000000 0483ffec ntdll!LdrShutdownThread+0xd2
0483ffb8 77e66090 00000000 00000000 00000000 kernel32!ExitThread+0×2f
0483ffec 00000000 77c5de6d 0ab24f68 00000000 kernel32!BaseThreadStart+0×39
0:000> !handle 480 ff
Handle 00000480
Type Event
Attributes 0
GrantedAccess 0×1f0003:
Delete,ReadControl,WriteDac,WriteOwner,Synch
QueryState,ModifyState
HandleCount 2
PointerCount 4
Name <none>
No object specific information available
It is difficult to find the first thread, the one which has reset the event and is waiting for the critical section. In our dump we have 9 such threads from !locks command output:
LockCount 9
Event as a synchronization primitive doesn’t have an owner. Despite this we can try to find 0×480 and WaitForSingleObject address near on some other thread raw stack if that information wasn’t overwritten. Let’s do a memory search:
0:000> s -d 0 L4000000 00000480
000726ec 00000480 00000022 000004a4 00000056
008512a0 00000480 00000480 00000000 00000000
008512a4 00000480 00000000 00000000 01014220
0085ab68 00000480 00000480 00000092 00000000
0085ab6c 00000480 00000092 00000000 01014234
00eb12a0 00000480 00000480 00000000 00000000
00eb12a4 00000480 00000000 00000000 0101e614
00ebeb68 00000480 00000480 00000323 00000000
00ebeb6c 00000480 00000323 00000000 0101e644
03ffb4fc 00000480 d772c13b ce753966 00fa840f
040212a0 00000480 00000480 00000000 00000000
040212a4 00000480 00000000 00000000 01063afc
0402ab68 00000480 00000480 00000fb6 00000000
0402ab6c 00000480 00000fb6 00000000 01063b5c
041312a0 00000480 00000480 00000000 00000000
041312a4 00000480 00000000 00000000 01065b28
0413eb68 00000480 00000480 00001007 00000000
0413eb6c 00000480 00001007 00000000 01065b7c
043412a0 00000480 00000480 00000000 00000000
043412a4 00000480 00000000 00000000 01066b44
0434ab68 00000480 00000480 00001033 00000000
0434ab6c 00000480 00001033 00000000 01066b9c
0483fd68 00000480 00000000 00000000 00000000
0483fdd8 00000480 ffffffff 00000000 0483fe00
0483fdec 00000480 ffffffff 777904f8 77790738
0483fe08 00000480 00000000 00000080 776b0070
0483fe20 00000480 00000000 00000000 0483fe4c
05296f58 00000480 ffffffff ffffffff ffffffff
05297eb0 00000480 00000494 000004a4 000004c0
0557cf9c 00000480 00000000 00000000 00000000
05580adc 00000480 00000000 00000000 00000000
0558715c 00000480 00000000 00000000 00000000
0558d3cc 00000480 00000000 00000000 00000000
0559363c 00000480 00000000 00000000 00000000
0559ee0c 00000480 00000000 00000000 00000000
055a507c 00000480 00000000 00000000 00000000
056768ec 00000480 00000000 00000000 00000000
0568ef14 00000480 00000000 00000000 00000000
0581ff88 00000480 07ca7ee0 0581ff98 776cf2a3
05ed1260 00000480 00000480 00000000 00000000
05ed1264 00000480 00000000 00000000 01276efc
05ed8b68 00000480 00000480 00005c18 00000000
05ed8b6c 00000480 00005c18 00000000 01276f74
08f112a0 00000480 00000480 00000000 00000000
08f112a4 00000480 00000000 00000000 00000000
08f1ab68 00000480 00000480 00007732 00000000
08f1ab6c 00000480 00007732 00000000 01352db0
In blue color I highlighted the thread #13 raw stack occurrences and in red color I highlighted memory locations that belong to another thread raw stack. In fact, these are the only memory locations from search results that make any sense from code perspective. The only meaningful stack traces can be found in memory locations highlighted in blue and red above.
This can be seen if we feed search results to WinDbg dds command (not all output is shown for clarity):
0:000> .foreach (place { s-[1]d 0 L4000000 00000480 }) { dds place -30; .printf "\n" }
000726bc 00000390
000726c0 00000022
000726c4 000003b4
000726c8 00000056
000726cc 00000004
000726d0 6dc3f6fd
000726d4 0000040c
000726d8 0000001e
000726dc 0000042c
000726e0 00000052
000726e4 00000004
000726e8 eacb0f6d
000726ec 00000480
000726f0 00000022
000726f4 000004a4
000726f8 00000056
000726fc 00000004
00072700 62b796d2
00072704 000004fc
00072708 0000001e
0007270c 0000051c
00072710 00000052
00072714 00000004
00072718 2a615cff
0007271c 00000570
00072720 00000024
00072724 00000598
00072728 00000058
0007272c 00000004
00072730 51913e59
00072734 000005f0
00072738 00000016
...
...
...
0568eee4 05680008 xpsp2res+0x1b0008
0568eee8 01200000
0568eeec 00001010
0568eef0 00200001
0568eef4 00000468
0568eef8 00000121
0568eefc 00000000
0568ef00 00000028
0568ef04 00000030
0568ef08 00000060
0568ef0c 00040001
0568ef10 00000000
0568ef14 00000480
0568ef18 00000000
0568ef1c 00000000
0568ef20 00000000
0568ef24 00000000
0568ef28 00000000
0568ef2c 00800000
0568ef30 00008000
0568ef34 00808000
0568ef38 00000080
0568ef3c 00800080
0568ef40 00008080
0568ef44 00808080
0568ef48 00c0c0c0
0568ef4c 00ff0000
0568ef50 0000ff00
0568ef54 00ffff00
0568ef58 000000ff
0568ef5c 00ff00ff
0568ef60 0000ffff
0581ff58 0581ff70
0581ff5c 776b063f ole32!wCoUninitialize+0x48
0581ff60 00000001
0581ff64 00007530
0581ff68 77790438 ole32!gATHost
0581ff6c 00000000
0581ff70 0581ff90
0581ff74 776cf370 ole32!CDllHost::WorkerThread+0xdd
0581ff78 0581ff8c
0581ff7c 00000001
0581ff80 77e6ba50 kernel32!WaitForSingleObjectEx
0581ff84 0657cfe8
0581ff88 00000480
0581ff8c 07ca7ee0
0581ff90 0581ff98
0581ff94 776cf2a3 ole32!DLLHostThreadEntry+0xd
0581ff98 0581ffb8
0581ff9c 776b2307 ole32!CRpcThread::WorkerLoop+0×1e
0581ffa0 77790438 ole32!gATHost
0581ffa4 00000000
0581ffa8 0657cfe8
0581ffac 77670000 ole32!_imp__InstallApplication <PERF> (ole32+0×0)
0581ffb0 776b2374 ole32!CRpcThreadCache::RpcWorkerThreadEntry+0×20
0581ffb4 00000000
0581ffb8 0581ffec
0581ffbc 77e6608b kernel32!BaseThreadStart+0×34
0581ffc0 0657cfe8
0581ffc4 00000000
0581ffc8 00000000
0581ffcc 0657cfe8
0581ffd0 3cfb5963
0581ffd4 0581ffc4
05ed1230 0101f070
05ed1234 05ed1274
05ed1238 05ed1174
05ed123c 05ed0000
05ed1240 05ed1280
05ed1244 00000000
05ed1248 00000000
05ed124c 00000000
05ed1250 05ed8b80
05ed1254 05ed8000
05ed1258 00002000
05ed125c 00001000
05ed1260 00000480
05ed1264 00000480
05ed1268 00000000
05ed126c 00000000
05ed1270 01276efc
05ed1274 05ed12b4
05ed1278 05ed1234
05ed127c 05ed0000
05ed1280 05ed2d00
05ed1284 05ed1240
05ed1288 05ed1400
05ed128c 00000000
05ed1290 05edade0
05ed1294 05eda000
05ed1298 00002000
05ed129c 00001000
05ed12a0 00000220
05ed12a4 00000220
05ed12a8 00000000
05ed12ac 00000000
...
...
...
08f1ab3c 00000000
08f1ab40 00000000
08f1ab44 00000000
08f1ab48 00000000
08f1ab4c 00000000
08f1ab50 00000000
08f1ab54 00000000
08f1ab58 00000000
08f1ab5c 00000000
08f1ab60 abcdbbbb
08f1ab64 08f11000
08f1ab68 00000480
08f1ab6c 00000480
08f1ab70 00007732
08f1ab74 00000000
08f1ab78 01352db0
08f1ab7c dcbabbbb
08f1ab80 ffffffff
08f1ab84 c0c00ac1
08f1ab88 00000000
08f1ab8c c0c0c0c0
08f1ab90 c0c0c0c0
08f1ab94 c0c0c0c0
08f1ab98 c0c0c0c0
08f1ab9c c0c0c0c0
08f1aba0 c0c0c0c0
08f1aba4 ffffffff
08f1aba8 c0c00ac1
08f1abac 00000000
08f1abb0 c0c0c0c0
08f1abb4 c0c0c0c0
08f1abb8 c0c0c0c0
We see that address 0581ff88 is the most meaningful and it also has WaitForSingleObjectEx nearby. This address belongs to the raw stack of the following thread #16:
16 Id: 590.1a00 Suspend: 1 Teb: 7ffa9000 Unfrozen
ChildEBP RetAddr
0581fc98 7c822124 ntdll!KiFastSystemCallRet
0581fc9c 7c83970f ntdll!NtWaitForSingleObject+0xc
0581fcd8 7c839620 ntdll!RtlpWaitOnCriticalSection+0x19c
0581fcf8 7c83a023 ntdll!RtlEnterCriticalSection+0xa8
0581fe00 77e67bcd ntdll!LdrUnloadDll+0x35
0581fe14 776b46fb kernel32!FreeLibrary+0x41
0581fe20 776b470f ole32!CClassCache::CDllPathEntry::CFinishObject::Finish+0x2f
0581fe34 776b44a0 ole32!CClassCache::CFinishComposite::Finish+0x1d
0581ff0c 776b0bfd ole32!CClassCache::CleanUpDllsForApartment+0x1d0
0581ff38 776b0b1f ole32!FinishShutdown+0xd7
0581ff58 776b063f ole32!ApartmentUninitialize+0x94
0581ff70 776cf370 ole32!wCoUninitialize+0x48
0581ff90 776cf2a3 ole32!CDllHost::WorkerThread+0xdd
0581ff98 776b2307 ole32!DLLHostThreadEntry+0xd
0581ffac 776b2374 ole32!CRpcThread::WorkerLoop+0×1e
0581ffb8 77e6608b ole32!CRpcThreadCache::RpcWorkerThreadEntry+0×20
0581ffec 00000000 kernel32!BaseThreadStart+0×34
And if we disassemble ole32!CRpcThread::WorkerLoop function which is found below WaitForSingleObjectEx on both stack trace and raw stack data from search results we would see that the former function calls the latter function indeed:
0:000> uf ole32!CRpcThread::WorkerLoop
ole32!CRpcThread::WorkerLoop:
776b22e9 mov edi,edi
776b22eb push esi
776b22ec mov esi,ecx
776b22ee cmp dword ptr [esi+4],0
776b22f2 jne ole32!CRpcThread::WorkerLoop+0x67 (776b234d)
ole32!CRpcThread::WorkerLoop+0xb:
776b22f4 push ebx
776b22f5 push edi
776b22f6 mov edi,dword ptr [ole32!_imp__WaitForSingleObjectEx (77671304)]
776b22fc mov ebx,7530h
ole32!CRpcThread::WorkerLoop+0x18:
776b2301 push dword ptr [esi+0Ch]
776b2304 call dword ptr [esi+8]
776b2307 call dword ptr [ole32!_imp__GetCurrentThread (7767130c)]
776b230d push eax
776b230e call dword ptr [ole32!_imp__RtlCheckForOrphanedCriticalSections (77671564)]
776b2314 xor eax,eax
776b2316 cmp dword ptr [esi],eax
776b2318 mov dword ptr [esi+8],eax
776b231b mov dword ptr [esi+0Ch],eax
776b231e je ole32!CRpcThread::WorkerLoop+0x65 (776b234b)
ole32!CRpcThread::WorkerLoop+0x37:
776b2320 push esi
776b2321 mov ecx,offset ole32!gRpcThreadCache (7778fc28)
776b2326 call ole32!CRpcThreadCache::AddToFreeList (776de78d)
ole32!CRpcThread::WorkerLoop+0x55:
776b232b push 0
776b232d push ebx
776b232e push dword ptr [esi]
776b2330 call edi
776b2332 test eax,eax
776b2334 je ole32!CRpcThread::WorkerLoop+0×60 (776cf3be)
ole32!CRpcThread::WorkerLoop+0x44:
776b233a push esi
776b233b mov ecx,offset ole32!gRpcThreadCache (7778fc28)
776b2340 call ole32!CRpcThreadCache::RemoveFromFreeList (776e42de)
776b2345 cmp dword ptr [esi+4],0
776b2349 je ole32!CRpcThread::WorkerLoop+0x55 (776b232b)
ole32!CRpcThread::WorkerLoop+0x65:
776b234b pop edi
776b234c pop ebx
ole32!CRpcThread::WorkerLoop+0x67:
776b234d pop esi
776b234e ret
ole32!CRpcThread::WorkerLoop+0x60:
776cf3be cmp dword ptr [esi+4],eax
776cf3c1 je ole32!CRpcThread::WorkerLoop+0x18 (776b2301)
ole32!CRpcThread::WorkerLoop+0x69:
776cf3c7 jmp ole32!CRpcThread::WorkerLoop+0x65 (776b234b)
Therefore we have possibly identified the thread #16 that resets the event by calling WaitForSingleObjectEx and tries to acquire the critical section. We also know the second thread #13 that has already acquired that critical section and now is waiting for the event to be signaled.
- Dmitry Vostokov @ DumpAnalysis.org -
June 20th, 2008 at 4:34 pm
Regarding thread#16 we can see the module it was trying to free if we dump parameters for FreeLibrary:
0581fe14 776b46fb kernel32!FreeLibrary+0x41
In another dump I had this:
16 Id: 13d0.2780 Suspend: 1 Teb: 7ffa8000 Unfrozen
ChildEBP RetAddr Args to Child
0210fc98 7c942124 7c95970f 00000100 00000000 ntdll!KiFastSystemCallRet
0210fc9c 7c95970f 00000100 00000000 00000000 ntdll!NtWaitForSingleObject+0xc
0210fcd8 7c959620 00000000 00000004 00000000 ntdll!RtlpWaitOnCriticalSection+0x19c
0210fcf8 7c95a023 7c9a9d94 0210fe78 0210fea8 ntdll!RtlEnterCriticalSection+0xa8
0210fe00 7c827bcd 02a10000 0210fea8 0210fee4 ntdll!LdrUnloadDll+0x35
0210fe14 775346eb 02a10000 0210ff0c 775346ff kernel32!FreeLibrary+0×41
[…]
0:000> lm
start end module name
[...]
02a10000 02a19000 DLL_A DLL_A.dll
September 18th, 2008 at 8:53 am
[…] Crash Dump Analysis Patterns (Part 9c) […]
March 30th, 2010 at 1:57 pm
[…] we introduce an icon for Deadlock (mixed objects, user space) […]
October 27th, 2010 at 4:03 pm
[…] Deadlock (mixed objects, user space) […]