Crash Dump Analysis Patterns (Part 82)
One pattern I recently discovered is the partitioning of anomalous debugger output from debugging commands into several disjoint or weakly linked classes. I call the pattern Semantic Split. It is better characterized as a partition of the memory analysis pattern, for example, Blocked Thread, into classes with different semantics, for example, blocked display threads and blocked remote share threads. Here is one short example of it found in the output of !locks and !process 0 ff WinDbg commands from a complete memory dump forced on a hanging server. The output shows several blocked threads and wait chains of executive resources (some shared locks have been removed for clarity):
0: kd> !locks
**** DUMP OF ALL RESOURCE OBJECTS ****
Resource @ 0×88eeeaf0 Exclusively owned
Contention Count = 809254
NumberOfExclusiveWaiters = 4
Threads: 88a26db0-01<*>
Threads Waiting On Exclusive Access:
88c6b6d0 8877b020 88a7e020 88938db0
Resource @ 0×88badb20 Exclusively owned
Contention Count = 9195
NumberOfExclusiveWaiters = 4
Threads: 88842020-02<*>
Threads Waiting On Exclusive Access:
88a8b170 89069450 88c4d020 88a26db0
Resource @ 0x88859cc0 Exclusively owned
Contention Count = 51021
NumberOfExclusiveWaiters = 8
Threads: 886f1c50-01<*>
Threads Waiting On Exclusive Access:
88e04db0 886785c0 8851edb0 896ee890
8869fb50 886d6498 889aa918 88c2da38
Resource @ 0x881cc138 Exclusively owned
Contention Count = 173698
NumberOfExclusiveWaiters = 4
Threads: 87e72598-01<*>
Threads Waiting On Exclusive Access:
88392020 8918c8d8 88423020 880eba50
Resource @ 0x884ffab0 Exclusively owned
Contention Count = 3363
NumberOfExclusiveWaiters = 2
Threads: 8807c5b8-02<*>
Threads Waiting On Exclusive Access:
87e72598 881c12a8
Resource @ 0x87cd6d48 Exclusively owned
Contention Count = 242361
NumberOfExclusiveWaiters = 5
Threads: 87540718-01<*>
Threads Waiting On Exclusive Access:
878ceaf0 8785ac50 8884a7b8 87c4ca28
89ab5db0
Resource @ 0x87c44d08 Exclusively owned
Contention Count = 2560
NumberOfExclusiveWaiters = 1
Threads: 87540718-01<*>
Threads Waiting On Exclusive Access:
87c4e468
Resource @ 0×87bf51d8 Exclusively owned
Contention Count = 3
NumberOfSharedWaiters = 3
Threads: 89e76db0-01<*> 8739ac50-01 86f5d1c8-01 870f4db0-01
Resource @ 0×888bfc38 Exclusively owned
Contention Count = 3
NumberOfSharedWaiters = 3
Threads: 88a10db0-01<*> 86c94198-01 86dac598-01 86d85c50-01
The first group of locks (one of them shown in blue) shows various problems with ComponentA module:
0: kd> !thread 88842020 1f
THREAD 88842020 Cid 1cf8.1b28 Teb: 7ffdb000 Win32Thread: bc25e8c0 WAIT: (Unknown) KernelMode Non-Alertable
88842098 NotificationTimer
Not impersonating
DeviceMap e3813fd0
Owning Process 888c5d88 Image: ApplicationA.exe
Attached Process N/A Image: N/A
Wait Start TickCount 1163714 Ticks: 0
Context Switch Count 35781 LargeStack
UserTime 00:00:00.453
KernelTime 00:00:01.109
Win32 Start Address 0x77ec3ea5
Start Address kernel32!BaseThreadStartThunk (0x7c8217ec)
Stack Init b5cc4bd0 Current b5cc4614 Base b5cc5000 Limit b5cbf000 Call b5cc4bd8
Priority 9 BasePriority 9 PriorityDecrement 0
ChildEBP RetAddr
b5cc462c 80833ec5 nt!KiSwapContext+0x26
b5cc4658 80829069 nt!KiSwapThread+0x2e5
b5cc46a0 bf8981b3 nt!KeDelayExecutionThread+0x2ab
b5cc46c4 bf898422 ComponentA!LockGUIHandle+0×6d
[…]
b5cc49e8 80a63456 nt!KiFastCallEntry+0xcd
[…]
The second group of locks (shown in red) shows the problem with ComponentB module:
0: kd> !thread 89e76db0 1f
THREAD 89e76db0 Cid 0004.0624 Teb: 00000000 Win32Thread: 00000000 WAIT: (Unknown) KernelMode Alertable
89e76e28 NotificationTimer
Not impersonating
DeviceMap e1006e10
Owning Process 8b581648 Image: System
Attached Process N/A Image: N/A
Wait Start TickCount 1163714 Ticks: 0
Context Switch Count 545
UserTime 00:00:00.000
KernelTime 00:00:00.015
Start Address 0xb9003c20
Stack Init b9148000 Current b9147abc Base b9148000 Limit b9145000 Call 0
Priority 16 BasePriority 8 PriorityDecrement 0
ChildEBP RetAddr
b9147ad4 80833ec5 nt!KiSwapContext+0x26
b9147b00 80829069 nt!KiSwapThread+0x2e5
b9147b48 b8fc9353 nt!KeDelayExecutionThread+0x2ab
b9147b74 b8ff9460 ComponentB!DeleteShareConnection+0×203
[…]
b9147ddc 8088f61e nt!PspSystemThreadStartup+0×2e
00000000 00000000 nt!KiThreadStartup+0×16
Looking at the list of all threads we see another classes of blocked threads, one is involving ComponentC module in user space:
0: kd> !thread 86c21db0 1f
THREAD 86c21db0 Cid 0fac.5260 Teb: 7ff6a000 Win32Thread: 00000000 WAIT: (Unknown) UserMode Non-Alertable
869f2f68 SynchronizationEvent
IRP List:
87fe3148: (0006,0220) Flags: 00000830 Mdl: 00000000
Not impersonating
DeviceMap e1006e10
Owning Process 896ccc28 Image: ServiceA.exe
Attached Process N/A Image: N/A
Wait Start TickCount 1163714 Ticks: 0
Context Switch Count 22
UserTime 00:00:00.000
KernelTime 00:00:00.000
Win32 Start Address 0×005c1de0
LPC Server thread working on message Id 5c1de0
Start Address kernel32!BaseThreadStartThunk (0×7c8217ec)
Stack Init b9787000 Current b9786c60 Base b9787000 Limit b9784000 Call 0
Priority 9 BasePriority 8 PriorityDecrement 0
ChildEBP RetAddr
b9786c78 80833ec5 nt!KiSwapContext+0×26
b9786ca4 80829bc0 nt!KiSwapThread+0×2e5
b9786cec 8093b034 nt!KeWaitForSingleObject+0×346
b9786d50 8088ad3c nt!NtWaitForSingleObject+0×9a
b9786d50 7c9485ec nt!KiFastCallEntry+0xfc
03f8f984 7c821c8d ntdll!KiFastSystemCallRet
03f8f998 10097728 kernel32!WaitForSingleObject+0×12
03f8f9bc 10008164 ComponentC!ComponentB_Control+0×68
[…]
03f8ffec 00000000 kernel32!BaseThreadStart+0×34
This thread holds a mutant and blocks a dozen of other threads in ServiceA.exe, for example:
THREAD 8aa7cb40 Cid 0fac.0110 Teb: 7ffad000 Win32Thread: 00000000 WAIT: (Unknown) UserMode Non-Alertable
87764550 Mutant - owning thread 86c21db0
From the function name we can infer that ComponentC controls ComponentB and this makes both blocked threads weakly connected.
Another thread in ServiceB involves DriverA module and blocks a thread ServiceA:
0: kd> !thread 8899e778 1f
THREAD 8899e778 Cid 01b0.13e0 Teb: 7ffdc000 Win32Thread: 00000000 WAIT: (Unknown) KernelMode Non-Alertable
8aadb6e0 SynchronizationEvent
8899e7f0 NotificationTimer
IRP List:
86f21de0: (0006,0220) Flags: 00000884 Mdl: 00000000
Not impersonating
DeviceMap e1006e10
Owning Process 8ab3d020 Image: ServiceB.exe
Attached Process N/A Image: N/A
Wait Start TickCount 1163714 Ticks: 0
Context Switch Count 2
UserTime 00:00:00.000
KernelTime 00:00:00.000
Win32 Start Address 0x005c1a3c
LPC Server thread working on message Id 5c1a3c
Start Address 0x48589bb3
Stack Init aecee000 Current aeced768 Base aecee000 Limit aeceb000 Call 0
Priority 12 BasePriority 11 PriorityDecrement 0
ChildEBP RetAddr
aeced780 80833ec5 nt!KiSwapContext+0x26
aeced7ac 80829bc0 nt!KiSwapThread+0x2e5
aeced7f4 badffece nt!KeWaitForSingleObject+0x346
WARNING: Stack unwind information not available. Following frames may be wrong.
aeced824 bae00208 DriverA+0×1ece
aeced868 bae0e45a DriverA+0×2208
aeced8a0 8081e095 DriverA+0×1045a
aeced8b4 b946673b nt!IofCallDriver+0×45
[…]
0: kd> !thread 8776c220 1f
THREAD 8776c220 Cid 0fac.5714 Teb: 7ff66000 Win32Thread: 00000000 WAIT: (Unknown) UserMode Non-Alertable
8776c40c Semaphore Limit 0x1
Waiting for reply to LPC MessageId 005c1a3c:
Current LPC port e213b0c8
Not impersonating
DeviceMap e1006e10
Owning Process 896ccc28 Image: ServiceA.exe
Attached Process N/A Image: N/A
Wait Start TickCount 1163714 Ticks: 0
Context Switch Count 12
UserTime 00:00:00.000
KernelTime 00:00:00.000
Win32 Start Address 0x75fddd73
Start Address kernel32!BaseThreadStartThunk (0x7c8217ec)
Stack Init aecf2000 Current aecf1c08 Base aecf2000 Limit aecef000 Call 0
Priority 8 BasePriority 8 PriorityDecrement 0
ChildEBP RetAddr
aecf1c20 80833ec5 nt!KiSwapContext+0x26
aecf1c4c 80829bc0 nt!KiSwapThread+0x2e5
aecf1c94 80920f28 nt!KeWaitForSingleObject+0x346
aecf1d50 8088ad3c nt!NtRequestWaitReplyPort+0x776
aecf1d50 7c9485ec nt!KiFastCallEntry+0xfc
0408f594 75fde6b5 ntdll!KiFastSystemCallRet
0408f5fc 75fdd65e ComponentD!ServiceB_Request+0×1ae
[…]
0408ffec 00000000 kernel32!BaseThreadStart+0×34
In ServiceA we can also find several threads blocked by a RPC request to ServiceC:
0: kd> !thread 87397020 1f
THREAD 87397020 Cid 0fac.38cc Teb: 7ff80000 Win32Thread: 00000000 WAIT: (Unknown) UserMode Non-Alertable
87397098 NotificationTimer
Not impersonating
DeviceMap e1006e10
Owning Process 896ccc28 Image: ServiceA.exe
Attached Process N/A Image: N/A
Wait Start TickCount 1163714 Ticks: 0
Context Switch Count 7807
UserTime 00:00:00.125
KernelTime 00:00:00.109
Win32 Start Address 0x005c21a8
LPC Server thread working on message Id 5c21a8
Start Address kernel32!BaseThreadStartThunk (0x7c8217ec)
Stack Init b4ecf000 Current b4ecec80 Base b4ecf000 Limit b4ecc000 Call 0
Priority 13 BasePriority 8 PriorityDecrement 0
ChildEBP RetAddr
b4ecec98 80833ec5 nt!KiSwapContext+0x26
b4ececc4 80829069 nt!KiSwapThread+0x2e5
b4eced0c 80996d8a nt!KeDelayExecutionThread+0x2ab
b4eced54 8088ad3c nt!NtDelayExecution+0x84
b4eced54 7c9485ec nt!KiFastCallEntry+0xfc
03a1f178 7c8024ed ntdll!KiFastSystemCallRet
03a1f188 77c5e51a kernel32!Sleep+0xf
03a1f198 77c36a44 RPCRT4!OSF_BINDING_HANDLE::Unbind+0x3a
03a1f1b0 77c36a08 RPCRT4!OSF_BINDING_HANDLE::~OSF_BINDING_HANDLE+0x32
03a1f1bc 77c369f1 RPCRT4!OSF_BINDING_HANDLE::`scalar deleting destructor'+0xd
03a1f1cc 77c5250a RPCRT4!OSF_BINDING_HANDLE::BindingFree+0x30
03a1f1dc 77f48c00 RPCRT4!RpcBindingFree+0x4e
03a1f1e8 77f48be2 ADVAPI32!RpcpUnbindRpc+0x15
03a1f1f4 77c3688e ADVAPI32!PLSAPR_SERVER_NAME_unbind+0xd
03a1f21c 77c369bb RPCRT4!GenericHandleMgr+0xca
03a1f23c 77c36983 RPCRT4!GenericHandleUnbind+0x31
03a1f260 77cb31b2 RPCRT4!NdrpClientFinally+0x5b
03a1f26c 77cb317a RPCRT4!NdrClientCall2+0x324
03a1f64c 77f4a0a1 RPCRT4!NdrClientCall2+0x2ea
03a1f664 77f4a022 ComponentD!ServiceC_Request+0×1c
[…]
03a1f8f8 77cb33e1 RPCRT4!Invoke+0×30
03a1fcf8 77cb35c4 RPCRT4!NdrStubCall2+0×299
03a1fd14 77c4ff7a RPCRT4!NdrServerCall2+0×19
03a1fd48 77c5042d RPCRT4!DispatchToStubInCNoAvrf+0×38
03a1fd9c 77c50353 RPCRT4!RPC_INTERFACE::DispatchToStubWorker+0×11f
03a1fdc0 77c511dc RPCRT4!RPC_INTERFACE::DispatchToStub+0xa3
03a1fdfc 77c512f0 RPCRT4!LRPC_SCALL::DealWithRequestMessage+0×42c
03a1fe20 77c58678 RPCRT4!LRPC_ADDRESS::DealWithLRPCRequest+0×127
03a1ff84 77c58792 RPCRT4!LRPC_ADDRESS::ReceiveLotsaCalls+0×430
03a1ff8c 77c5872d RPCRT4!RecvLotsaCallsWrapper+0xd
03a1ffac 77c4b110 RPCRT4!BaseCachedThreadRoutine+0×9d
03a1ffb8 7c824829 RPCRT4!ThreadStartRoutine+0×1b
03a1ffec 00000000 kernel32!BaseThreadStart+0×34
In ServiceC we see several RPC processing threads blocked by ComponentE:
0: kd> !thread 873acb40 1f
THREAD 873acb40 Cid 023c.3a00 Teb: 7ff93000 Win32Thread: 00000000 WAIT: (Unknown) UserMode Non-Alertable
89f0aeb0 Semaphore Limit 0x1
873acbb8 NotificationTimer
IRP List:
89838a00: (0006,0094) Flags: 00000900 Mdl: 00000000
8705d4a0: (0006,0094) Flags: 00000800 Mdl: 00000000
88bc9440: (0006,0094) Flags: 00000900 Mdl: 00000000
87674af8: (0006,0094) Flags: 00000900 Mdl: 00000000
86f2aa48: (0006,0094) Flags: 00000900 Mdl: 00000000
87551290: (0006,0094) Flags: 00000900 Mdl: 00000000
Not impersonating
DeviceMap e1006e10
Owning Process 89dc0508 Image: ServiceC.exe
Attached Process N/A Image: N/A
Wait Start TickCount 1163714 Ticks: 0
Context Switch Count 16571
UserTime 00:00:00.250
KernelTime 00:00:00.703
Win32 Start Address RPCRT4!ThreadStartRoutine (0x77c4b0f5)
Start Address kernel32!BaseThreadStartThunk (0x7c8217ec)
Stack Init b2a9b000 Current b2a9ac60 Base b2a9b000 Limit b2a98000 Call 0
Priority 13 BasePriority 9 PriorityDecrement 0
ChildEBP RetAddr
b2a9ac78 80833ec5 nt!KiSwapContext+0x26
b2a9aca4 80829bc0 nt!KiSwapThread+0x2e5
b2a9acec 8093b034 nt!KeWaitForSingleObject+0x346
b2a9ad50 8088ad3c nt!NtWaitForSingleObject+0x9a
b2a9ad50 7c9485ec nt!KiFastCallEntry+0xfc
022cf8d0 7c821c8d ntdll!KiFastSystemCallRet
022cf8e4 741269e5 kernel32!WaitForSingleObject+0x12
022cf8f8 7412cdca ComponentE!Enumerate+0×37
[…]
022cf944 77cb33e1 RPCRT4!Invoke+0×30
022cfd44 77cb35c4 RPCRT4!NdrStubCall2+0×299
022cfd60 77c4ff7a RPCRT4!NdrServerCall2+0×19
022cfd94 77c5042d RPCRT4!DispatchToStubInCNoAvrf+0×38
022cfde8 77c50353 RPCRT4!RPC_INTERFACE::DispatchToStubWorker+0×11f
022cfe0c 77c38e0d RPCRT4!RPC_INTERFACE::DispatchToStub+0xa3
022cfe40 77c38cb3 RPCRT4!OSF_SCALL::DispatchHelper+0×149
022cfe54 77c38c2b RPCRT4!OSF_SCALL::DispatchRPCCall+0×10d
022cfe84 77c38b5e RPCRT4!OSF_SCALL::ProcessReceivedPDU+0×57f
022cfea4 77c3e8db RPCRT4!OSF_SCALL::BeginRpcCall+0×194
022cff04 77c3e7b4 RPCRT4!OSF_SCONNECTION::ProcessReceiveComplete+0×435
022cff18 77c4b799 RPCRT4!ProcessConnectionServerReceivedEvent+0×21
022cff84 77c4b9b5 RPCRT4!LOADABLE_TRANSPORT::ProcessIOEvents+0×1b8
022cff8c 77c5872d RPCRT4!ProcessIOEventsWrapper+0xd
022cffac 77c4b110 RPCRT4!BaseCachedThreadRoutine+0×9d
022cffb8 7c824829 RPCRT4!ThreadStartRoutine+0×1b
022cffec 00000000 kernel32!BaseThreadStart+0×34
Therefore we have 4 main groups of wait chain endpoints involving semantically disjoint ComponentA, ComponentB, ComponentE and DriverA modules. Although their module names do not infer disjointedness this example was abstracted from the real incident where respective modules were having different system functions and were even from different software vendors.
- Dmitry Vostokov @ DumpAnalysis.org -
June 16th, 2010 at 10:32 pm
[…] result from earlier problems? We search stack trace collection for any other anomalous activity (Semantic Split) and we find indeed a recurrent stack trace pattern across process […]
July 7th, 2010 at 4:34 pm
[…] share the same “keyboard” related functionality. So we don’t have an instance of semantic split pattern here and both module versions need to be checked and also removed for testing purposes if […]