Stack trace collection, missing threads, waiting time, critical section and LPC wait chains: pattern cooperation
Windows shutdown couldn’t progress and a complete memory dump was saved for later postmortem analysis. !stacks command showed reduced number of waiting threads in one important system service:
0: kd> !stacks
[...]
[89d6d8e8 ServiceA.exe]
1454.0014b0 89d36b60 0000163 Blocked DriverA!Check+0x177
[...]
Normally this service had at least a dozen waiting threads. If we switch to the process we see many threads missing and the process itself is in the process of exiting (three “process” nouns in one sentence):
0: kd> !process 89d6d8e8
PROCESS 89d6d8e8 SessionId: 0 Cid: 1454 Peb: 7ffd8000 ParentCid: 0234
DirBase: afa06000 ObjectTable: e5491278 HandleCount: 444.
Image: ServiceA.exe
VadRoot 89db18d8 Vads 213 Clone 0 Private 827. Modified 15. Locked 0.
DeviceMap e10028c8
Token e5556710
ElapsedTime 2 Days 02:59:39.285
UserTime 00:00:08.375
KernelTime 00:00:20.046
QuotaPoolUsage[PagedPool] 50660
QuotaPoolUsage[NonPagedPool] 9704
Working Set Sizes (now,min,max) (2523, 50, 345) (10092KB, 200KB, 1380KB)
PeakWorkingSetSize 2953
VirtualSize 76 Mb
PeakVirtualSize 78 Mb
PageFaultCount 19259
MemoryPriority BACKGROUND
BasePriority 8
CommitCharge 1522
THREAD 89d36b60 Cid 1454.14b0 Teb: 00000000 Win32Thread: 00000000 WAIT: (Unknown) KernelMode Alertable
8a8d7438 NotificationEvent
89d36bd8 NotificationTimer
Not impersonating
DeviceMap e10028c8
Owning Process 89d6d8e8 Image: ServiceA.exe
Attached Process N/A Image: N/A
Wait Start TickCount 11760358 Ticks: 355 (0:00:00:05.546)
Context Switch Count 4591
UserTime 00:00:00.000
KernelTime 00:00:00.000
Start Address DriverA!CheckProtocolStackThread (0xf762cfa0)
Stack Init f3e7b000 Current f3e7acc0 Base f3e7b000 Limit f3e78000 Call 0
Priority 8 BasePriority 8 PriorityDecrement 0
ChildEBP RetAddr
f3e7acd8 8083d5b1 nt!KiSwapContext+0x26
f3e7ad04 8083df9e nt!KiSwapThread+0x2e5
f3e7ad4c f762cf8d nt!KeWaitForSingleObject+0x346
[...]
f3e7adac 8092083b DriverA!CheckProtocolStackThread+0x5
f3e7addc 8083fe9f nt!PspSystemThreadStartup+0x2e
00000000 00000000 nt!KiThreadStartup+0x16
THREAD 89ce9580 Cid 1454.1630 Teb: 7ff9e000 Win32Thread: bc1e71f8 WAIT: (Unknown) UserMode Non-Alertable
893fae40 SynchronizationEvent
Not impersonating
DeviceMap e10028c8
Owning Process 89d6d8e8 Image: ServiceA.exe
Attached Process N/A Image: N/A
Wait Start TickCount 11048203 Ticks: 712510 (0:03:05:32.968)
Context Switch Count 1103 LargeStack
UserTime 00:00:00.281
KernelTime 00:00:01.484
Win32 Start Address DllA!OperationThread (0x1003b37e)
Start Address kernel32!BaseThreadStartThunk (0x77e617ec)
Stack Init f65a3000 Current f65a2c60 Base f65a3000 Limit f65a0000 Call 0
Priority 8 BasePriority 8 PriorityDecrement 0
Kernel stack not resident.
ChildEBP RetAddr
f65a2c78 8083d5b1 nt!KiSwapContext+0x26
f65a2ca4 8083df9e nt!KiSwapThread+0x2e5
f65a2cec 8092ae67 nt!KeWaitForSingleObject+0x346
f65a2d50 80833bef nt!NtWaitForSingleObject+0x9a
f65a2d50 7c82860c nt!KiFastCallEntry+0xfc (TrapFrame @ f65a2d64)
0293fd18 7c827d29 ntdll!KiFastSystemCallRet
0293fd1c 77e61d1e ntdll!ZwWaitForSingleObject+0xc
0293fd8c 77e61c8d kernel32!WaitForSingleObjectEx+0xac
0293fda0 724c705b kernel32!WaitForSingleObject+0x12
0293fdb4 724c6745 DllB!Cleanup+0x3b
[...]
0293fde0 7c81a352 DllB!DLLEntry+0x62
0293fe00 7c830e90 ntdll!LdrpCallInitRoutine+0x14
0293feb8 77e668ab ntdll!LdrShutdownProcess+0x182
0293ffa4 77e6690d kernel32!_ExitProcess+0×43
0293ffb8 77e792c1 kernel32!ExitProcess+0×14
0293ffec 00000000 kernel32!BaseThreadStart+0×5f
However, the brief scan of all other processes and threads from !process 0 ff command output shows that another important service ServiceB has critical section wait chains:
THREAD 89e1f890 Cid 09f4.1018 Teb: 7ff96000 Win32Thread: bc279160 WAIT: (Unknown) UserMode Non-Alertable
89d96c30 SynchronizationEvent
Not impersonating
DeviceMap e10028c8
Owning Process 8a453b18 Image: ServiceB.exe
Attached Process N/A Image: N/A
Wait Start TickCount 11750950 Ticks: 9763 (0:00:02:32.546)
Context Switch Count 327 LargeStack
UserTime 00:00:00.000
KernelTime 00:00:00.000
Win32 Start Address msvcrt!_endthreadex (0×77bcb4bc)
Start Address kernel32!BaseThreadStartThunk (0×77e617ec)
Stack Init f6113000 Current f6112c60 Base f6113000 Limit f6110000 Call 0
Priority 9 BasePriority 8 PriorityDecrement 1
Kernel stack not resident.
ChildEBP RetAddr
f6112c78 8083d5b1 nt!KiSwapContext+0×26
f6112ca4 8083df9e nt!KiSwapThread+0×2e5
f6112cec 8092ae67 nt!KeWaitForSingleObject+0×346
f6112d50 80833bef nt!NtWaitForSingleObject+0×9a
f6112d50 7c82860c nt!KiFastCallEntry+0xfc
09eafd98 7c827d29 ntdll!KiFastSystemCallRet
09eafd9c 7c83d266 ntdll!ZwWaitForSingleObject+0xc
09eafdd8 7c83d2b1 ntdll!RtlpWaitOnCriticalSection+0×1a3
09eafdf8 6738d489 ntdll!RtlEnterCriticalSection+0xa8
[…]
09eaffb8 77e6482f msvcrt!_endthreadex+0xa3
09eaffec 00000000 kernel32!BaseThreadStart+0×34
We switch to this process and find the owner of a critical section that blocks other threads:
0: kd> .process /r /p 8a453b18
Implicit process is now 8a453b18
0: kd> !cs -l -o -s
[...]
DebugInfo = 0x0a199ea0
Critical section = 0x0998ac80 (+0x998AC80)
LOCKED
LockCount = 0x5
WaiterWoken = No
OwningThread = 0x00001680
RecursionCount = 0x1
LockSemaphore = 0xE08
SpinCount = 0x00000000
OwningThread = .thread 89bfc4d8
[…]
0: kd> !thread 89bfc4d8 1f
THREAD 89bfc4d8 Cid 09f4.1680 Teb: 7ff70000 Win32Thread: bc1e79d8 WAIT: (Unknown) UserMode Non-Alertable
89bfc6c4 Semaphore Limit 0x1
Waiting for reply to LPC MessageId 00fbbc86:
Current LPC port e544f108
Not impersonating
DeviceMap e10028c8
Owning Process 8a453b18 Image: ServiceB.exe
Attached Process N/A Image: N/A
Wait Start TickCount 11049035 Ticks: 711678 (0:03:05:19.968)
Context Switch Count 455269 LargeStack
UserTime 00:00:45.312
KernelTime 00:00:10.531
Win32 Start Address msvcrt!_endthreadex (0×77bcb4bc)
Start Address kernel32!BaseThreadStartThunk (0×77e617ec)
Stack Init f3b8b000 Current f3b8ac08 Base f3b8b000 Limit f3b88000 Call 0
Priority 13 BasePriority 8 PriorityDecrement 0
Kernel stack not resident.
ChildEBP RetAddr
f3b8ac20 8083d5b1 nt!KiSwapContext+0×26
f3b8ac4c 8083df9e nt!KiSwapThread+0×2e5
f3b8ac94 8093edb1 nt!KeWaitForSingleObject+0×346
f3b8ad50 80833bef nt!NtRequestWaitReplyPort+0×776
f3b8ad50 7c82860c nt!KiFastCallEntry+0xfc (TrapFrame @ f3b8ad64)
0f13ebe8 7c827899 ntdll!KiFastSystemCallRet
0f13ebec 77c80a6e ntdll!ZwRequestWaitReplyPort+0xc
0f13ec38 77c7fcf0 RPCRT4!LRPC_CCALL::SendReceive+0×230
0f13ec44 77c80673 RPCRT4!I_RpcSendReceive+0×24
0f13ec58 77ce315a RPCRT4!NdrSendReceive+0×2b
0f13f040 771f40c4 RPCRT4!NdrClientCall2+0×22e
[…]
0f13ffb8 77e6482f msvcrt!_endthreadex+0xa3
0f13ffec 00000000 kernel32!BaseThreadStart+0×34
Following LPC chain we find that the blocking thread in ServiceB was waiting for a response from ServiceA:
0: kd> !lpc message 00fbbc86
[...]
Server process : 89d6d8e8 (ServiceA.exe)
[…]
Now the question arises: who was freezing first, ServiceA or ServiceB? We can compare waiting times to answer. We see that waiting time for ServiceB request thread is 3:05:19 and for ServiceA shutdown thread is 03:05:32 (from !thread and !process output above):
Owning Process 8a453b18 Image: ServiceB.exe
[...]
Wait Start TickCount 11049035 Ticks: 711678 (0:03:05:19.968)
Owning Process 89d6d8e8 Image: ServiceA.exe
[...]
Wait Start TickCount 11048203 Ticks: 712510 (0:03:05:32.968)
Therefore, we conclude that ServiceB was blocked after ServiceA was blocked. Such questions often arise in multivendor troubleshooting scenarious.
- Dmitry Vostokov @ DumpAnalysis.org -