Stack trace collection, missing threads, waiting time, critical section and LPC wait chains: pattern cooperation

Windows shutdown couldn’t progress and a complete memory dump was saved for later postmortem analysis. !stacks command showed reduced number of waiting threads in one important system service:

0: kd> !stacks
[...]
                            [89d6d8e8 ServiceA.exe]
1454.0014b0  89d36b60 0000163 Blocked    DriverA!Check+0x177
[...]

Normally this service had at least a dozen waiting threads. If we switch to the process we see many threads missing and the process itself is in the process of exiting (three “process” nouns in one sentence):

0: kd> !process 89d6d8e8
PROCESS 89d6d8e8  SessionId: 0  Cid: 1454    Peb: 7ffd8000  ParentCid: 0234
    DirBase: afa06000  ObjectTable: e5491278  HandleCount: 444.
    Image: ServiceA.exe
    VadRoot 89db18d8 Vads 213 Clone 0 Private 827. Modified 15. Locked 0.
    DeviceMap e10028c8
    Token                             e5556710
    ElapsedTime                       2 Days 02:59:39.285
    UserTime                          00:00:08.375
    KernelTime                        00:00:20.046
    QuotaPoolUsage[PagedPool]         50660
    QuotaPoolUsage[NonPagedPool]      9704
    Working Set Sizes (now,min,max)  (2523, 50, 345) (10092KB, 200KB, 1380KB)
    PeakWorkingSetSize                2953
    VirtualSize                       76 Mb
    PeakVirtualSize                   78 Mb
    PageFaultCount                    19259
    MemoryPriority                    BACKGROUND
    BasePriority                      8
    CommitCharge                      1522

THREAD 89d36b60  Cid 1454.14b0  Teb: 00000000 Win32Thread: 00000000 WAIT: (Unknown) KernelMode Alertable
    8a8d7438  NotificationEvent
    89d36bd8  NotificationTimer
Not impersonating
DeviceMap                 e10028c8
Owning Process            89d6d8e8       Image:         ServiceA.exe
Attached Process          N/A            Image:         N/A
Wait Start TickCount      11760358       Ticks: 355 (0:00:00:05.546)
Context Switch Count      4591            
UserTime                  00:00:00.000
KernelTime                00:00:00.000
Start Address DriverA!CheckProtocolStackThread (0xf762cfa0)
Stack Init f3e7b000 Current f3e7acc0 Base f3e7b000 Limit f3e78000 Call 0
Priority 8 BasePriority 8 PriorityDecrement 0
ChildEBP RetAddr 
f3e7acd8 8083d5b1 nt!KiSwapContext+0x26
f3e7ad04 8083df9e nt!KiSwapThread+0x2e5
f3e7ad4c f762cf8d nt!KeWaitForSingleObject+0x346
[...]
f3e7adac 8092083b DriverA!CheckProtocolStackThread+0x5
f3e7addc 8083fe9f nt!PspSystemThreadStartup+0x2e
00000000 00000000 nt!KiThreadStartup+0x16

THREAD 89ce9580  Cid 1454.1630  Teb: 7ff9e000 Win32Thread: bc1e71f8 WAIT: (Unknown) UserMode Non-Alertable
     893fae40  SynchronizationEvent
Not impersonating
DeviceMap                 e10028c8
Owning Process            89d6d8e8       Image:         ServiceA.exe
Attached Process          N/A            Image:         N/A
Wait Start TickCount      11048203       Ticks: 712510 (0:03:05:32.968)
Context Switch Count      1103                 LargeStack
UserTime                  00:00:00.281
KernelTime                00:00:01.484
Win32 Start Address DllA!OperationThread (0x1003b37e)
Start Address kernel32!BaseThreadStartThunk (0x77e617ec)
Stack Init f65a3000 Current f65a2c60 Base f65a3000 Limit f65a0000 Call 0
Priority 8 BasePriority 8 PriorityDecrement 0
Kernel stack not resident.
ChildEBP RetAddr 
f65a2c78 8083d5b1 nt!KiSwapContext+0x26
f65a2ca4 8083df9e nt!KiSwapThread+0x2e5
f65a2cec 8092ae67 nt!KeWaitForSingleObject+0x346
f65a2d50 80833bef nt!NtWaitForSingleObject+0x9a
f65a2d50 7c82860c nt!KiFastCallEntry+0xfc (TrapFrame @ f65a2d64)
0293fd18 7c827d29 ntdll!KiFastSystemCallRet
0293fd1c 77e61d1e ntdll!ZwWaitForSingleObject+0xc
0293fd8c 77e61c8d kernel32!WaitForSingleObjectEx+0xac
0293fda0 724c705b kernel32!WaitForSingleObject+0x12
0293fdb4 724c6745 DllB!Cleanup+0x3b
[...]
0293fde0 7c81a352 DllB!DLLEntry+0x62
0293fe00 7c830e90 ntdll!LdrpCallInitRoutine+0x14
0293feb8 77e668ab ntdll!LdrShutdownProcess+0x182
0293ffa4 77e6690d kernel32!_ExitProcess+0×43
0293ffb8 77e792c1 kernel32!ExitProcess+0×14

0293ffec 00000000 kernel32!BaseThreadStart+0×5f

However, the brief scan of all other processes and threads from !process 0 ff command output shows that another important service ServiceB has critical section wait chains:

THREAD 89e1f890  Cid 09f4.1018  Teb: 7ff96000 Win32Thread: bc279160 WAIT: (Unknown) UserMode Non-Alertable
    89d96c30  SynchronizationEvent
Not impersonating
DeviceMap                 e10028c8
Owning Process            8a453b18       Image:         ServiceB.exe
Attached Process          N/A            Image:         N/A
Wait Start TickCount      11750950       Ticks: 9763 (0:00:02:32.546)
Context Switch Count      327                 LargeStack
UserTime                  00:00:00.000
KernelTime                00:00:00.000
Win32 Start Address msvcrt!_endthreadex (0×77bcb4bc)
Start Address kernel32!BaseThreadStartThunk (0×77e617ec)
Stack Init f6113000 Current f6112c60 Base f6113000 Limit f6110000 Call 0
Priority 9 BasePriority 8 PriorityDecrement 1
Kernel stack not resident.
ChildEBP RetAddr 
f6112c78 8083d5b1 nt!KiSwapContext+0×26
f6112ca4 8083df9e nt!KiSwapThread+0×2e5
f6112cec 8092ae67 nt!KeWaitForSingleObject+0×346
f6112d50 80833bef nt!NtWaitForSingleObject+0×9a
f6112d50 7c82860c nt!KiFastCallEntry+0xfc
09eafd98 7c827d29 ntdll!KiFastSystemCallRet
09eafd9c 7c83d266 ntdll!ZwWaitForSingleObject+0xc
09eafdd8 7c83d2b1 ntdll!RtlpWaitOnCriticalSection+0×1a3
09eafdf8 6738d489 ntdll!RtlEnterCriticalSection+0xa8

[…]
09eaffb8 77e6482f msvcrt!_endthreadex+0xa3
09eaffec 00000000 kernel32!BaseThreadStart+0×34

We switch to this process and find the owner of a critical section that blocks other threads:

0: kd> .process /r /p 8a453b18
Implicit process is now 8a453b18

0: kd> !cs -l -o -s
[...]
DebugInfo          = 0x0a199ea0
Critical section   = 0x0998ac80 (+0x998AC80)
LOCKED
LockCount          = 0x5
WaiterWoken        = No
OwningThread       = 0x00001680
RecursionCount     = 0x1
LockSemaphore      = 0xE08
SpinCount          = 0x00000000
OwningThread       = .thread 89bfc4d8
[…]

0: kd> !thread 89bfc4d8 1f
THREAD 89bfc4d8  Cid 09f4.1680  Teb: 7ff70000 Win32Thread: bc1e79d8 WAIT: (Unknown) UserMode Non-Alertable
    89bfc6c4  Semaphore Limit 0x1
Waiting for reply to LPC MessageId 00fbbc86:
Current LPC port e544f108
Not impersonating
DeviceMap                 e10028c8
Owning Process            8a453b18       Image:         ServiceB.exe
Attached Process          N/A            Image:         N/A
Wait Start TickCount      11049035       Ticks: 711678 (0:03:05:19.968)
Context Switch Count      455269                 LargeStack
UserTime                  00:00:45.312
KernelTime                00:00:10.531
Win32 Start Address msvcrt!_endthreadex (0×77bcb4bc)
Start Address kernel32!BaseThreadStartThunk (0×77e617ec)
Stack Init f3b8b000 Current f3b8ac08 Base f3b8b000 Limit f3b88000 Call 0
Priority 13 BasePriority 8 PriorityDecrement 0
Kernel stack not resident.
ChildEBP RetAddr 
f3b8ac20 8083d5b1 nt!KiSwapContext+0×26
f3b8ac4c 8083df9e nt!KiSwapThread+0×2e5
f3b8ac94 8093edb1 nt!KeWaitForSingleObject+0×346
f3b8ad50 80833bef nt!NtRequestWaitReplyPort+0×776
f3b8ad50 7c82860c nt!KiFastCallEntry+0xfc (TrapFrame @ f3b8ad64)
0f13ebe8 7c827899 ntdll!KiFastSystemCallRet
0f13ebec 77c80a6e ntdll!ZwRequestWaitReplyPort+0xc
0f13ec38 77c7fcf0 RPCRT4!LRPC_CCALL::SendReceive+0×230
0f13ec44 77c80673 RPCRT4!I_RpcSendReceive+0×24
0f13ec58 77ce315a RPCRT4!NdrSendReceive+0×2b
0f13f040 771f40c4 RPCRT4!NdrClientCall2+0×22e
[…]
0f13ffb8 77e6482f msvcrt!_endthreadex+0xa3
0f13ffec 00000000 kernel32!BaseThreadStart+0×34

Following LPC chain we find that the blocking thread in ServiceB was waiting for a response from ServiceA:

0: kd> !lpc message 00fbbc86
[...]
    Server process  : 89d6d8e8 (ServiceA.exe)
[…]

Now the question arises: who was freezing first, ServiceA or ServiceB? We can compare waiting times to answer. We see that waiting time for ServiceB request thread is 3:05:19 and for ServiceA shutdown thread is 03:05:32 (from !thread and !process output above):

Owning Process            8a453b18       Image:         ServiceB.exe
[...]
Wait Start TickCount      11049035       Ticks: 711678 (0:03:05:19.968)

Owning Process            89d6d8e8       Image:         ServiceA.exe
[...]
Wait Start TickCount      11048203       Ticks: 712510 (0:03:05:32.968)

Therefore, we conclude that ServiceB was blocked after ServiceA was blocked. Such questions often arise in multivendor troubleshooting scenarious.

- Dmitry Vostokov @ DumpAnalysis.org -

Leave a Reply