ALPC wait chain, missing threads, message box, zombie and special processes: pattern cooperation

The purpose of this case study is to show how to choose what to include in a fiber bundle memory dump when x64 complete memory dumps are too large to deliver:

1: kd> !vm

*** Virtual Memory Usage ***
 Physical Memory:     5880464 (   23521856 Kb)
[…]

The dump we have is a kernel memory dump. When we list all processes and threads and search for “Waiting for”, we find many ALPC wait chains spanning 3 to 4 processes (sometimes semicircular), some of them originating from processes with missing threads (only one or two threads present where we would expect a dozen in a normal state):

1: kd> !process fffffa800b834c10
PROCESS fffffa800b834c10
    SessionId: 205  Cid: 13c40    Peb: 7fffffdb000  ParentCid: 133c0
    DirBase: 13b61d000  ObjectTable: fffff8800c2295b0  HandleCount:  58.
    Image: ProcessA.exe
    VadRoot fffffa8007d70c00 Vads 121 Clone 0 Private 497. Modified 0. Locked 0.
    DeviceMap fffff88000007450
    Token                             fffff8800c695560
    ElapsedTime                       00:03:42.083
    UserTime                          00:00:00.000
    KernelTime                        00:00:00.000
    QuotaPoolUsage[PagedPool]         65968
    QuotaPoolUsage[NonPagedPool]      11520
    Working Set Sizes (now,min,max)  (1274, 50, 345) (5096KB, 200KB, 1380KB)
    PeakWorkingSetSize                1278
    VirtualSize                       37 Mb
    PeakVirtualSize                   38 Mb
    PageFaultCount                    1286
    MemoryPriority                    BACKGROUND
    BasePriority                      13
    CommitCharge                      581

THREAD fffffa800b845bb0  Cid 13c40.1332c  Teb: 000007fffffde000 Win32Thread: fffff900c0076010 WAIT: (WrLpcReply) UserMode Non-Alertable
    fffffa800b845f40  Semaphore Limit 0x1
Waiting for reply to ALPC Message fffff88012527770 : queued at port fffffa80055bca60 : owned by process fffffa80054dfc10
Not impersonating
DeviceMap                 fffff88000007450
Owning Process            fffffa800b834c10       Image:         ProcessA.exe
Attached Process          N/A            Image:         N/A
Wait Start TickCount      10912787       Ticks: 14208 (0:00:03:42.000)
Context Switch Count      34                 LargeStack
UserTime                  00:00:00.000
KernelTime                00:00:00.015
Win32 Start Address 0x00000000fff60260
Stack Init fffffa600e8d5db0 Current fffffa600e8d5670
Base fffffa600e8d6000 Limit fffffa600e8ce000 Call 0
Priority 15 BasePriority 15 PriorityDecrement 0 IoPriority 2 PagePriority 5
Child-SP          RetAddr           Call Site
fffffa60`0e8d56b0 fffff800`016a36fa nt!KiSwapContext+0x7f
fffffa60`0e8d57f0 fffff800`0169835b nt!KiSwapThread+0x13a
fffffa60`0e8d5860 fffff800`016cd4e2 nt!KeWaitForSingleObject+0x2cb
fffffa60`0e8d58f0 fffff800`01916d14 nt!AlpcpSignalAndWait+0x92
fffffa60`0e8d5980 fffff800`019137a6 nt!AlpcpReceiveSynchronousReply+0x44
fffffa60`0e8d59e0 fffff800`0190330f nt!AlpcpProcessSynchronousRequest+0x24f
fffffa60`0e8d5b00 fffff800`016a0ef3 nt!NtAlpcSendWaitReceivePort+0x19f
fffffa60`0e8d5bb0 00000000`774d756a nt!KiSystemServiceCopyEnd+0x13 (TrapFrame @ fffffa60`0e8d5c20)
00000000`0026f038 00000000`00000000 0x774d756a

1: kd> !alpc /m fffff88012527770

Message @ fffff88012527770
  MessageID             : 0x10E8 (4328)
  CallbackID            : 0xC3416B (12796267)
  SequenceNumber        : 0x00000002 (2)
  Type                  : LPC_REQUEST
  DataLength            : 0x0040 (64)
  TotalLength           : 0x0068 (104)
  Canceled              : No
  Release               : No
  ReplyWaitReply        : No
  Continuation          : Yes
  OwnerPort             : fffffa80076e9660 [ALPC_CLIENT_COMMUNICATION_PORT]
  WaitingThread         : fffffa800b845bb0
  QueueType             : ALPC_MSGQUEUE_PENDING
  QueuePort             : fffffa80055bca60 [ALPC_CONNECTION_PORT]
  QueuePortOwnerProcess : fffffa80054dfc10 (ProcessB.exe)
  ServerThread          : fffffa800b711060
  QuotaCharged          : No
  CancelQueuePort       : 0000000000000000
  CancelSequencePort    : 0000000000000000
  CancelSequenceNumber  : 0x00000000 (0)
  ClientContext         : 00000000003fcf20
  ServerContext         : 0000000000000000
  PortContext           : 00000000029fda00
  CancelPortContext     : 0000000000000000
  SecurityData          : 0000000000000000
  View                  : 0000000000000000

1: kd> !thread fffffa800b711060
THREAD fffffa800b711060  Cid 032c.146e8  Teb: 000007fffff7c000 Win32Thread: 0000000000000000 WAIT: (WrLpcReply) UserMode Non-Alertable
    fffffa800b7113f0  Semaphore Limit 0x1
Waiting for reply to ALPC Message fffff8800e401200 : queued at port fffffa8005a32730 : owned by process fffffa8004c39040
Not impersonating
DeviceMap                 fffff88000007450
Owning Process            fffffa80054dfc10       Image:         ProcessB.exe
Attached Process          N/A            Image:         N/A
Wait Start TickCount      10916800       Ticks: 10195 (0:00:02:39.296)
Context Switch Count      401            
UserTime                  00:00:00.000
KernelTime                00:00:00.000
Win32 Start Address 0x000007fefe647780
Stack Init fffffa6001d33db0 Current fffffa6001d33670
Base fffffa6001d34000 Limit fffffa6001d2e000 Call 0
Priority 10 BasePriority 8 PriorityDecrement 1 IoPriority 2 PagePriority 5
Child-SP          RetAddr           : Call Site
fffffa60`01d336b0 fffff800`016a36fa : nt!KiSwapContext+0x7f
fffffa60`01d337f0 fffff800`0169835b : nt!KiSwapThread+0x13a
fffffa60`01d33860 fffff800`016cd4e2 : nt!KeWaitForSingleObject+0x2cb
fffffa60`01d338f0 fffff800`01916d14 : nt!AlpcpSignalAndWait+0x92
fffffa60`01d33980 fffff800`019137a6 : nt!AlpcpReceiveSynchronousReply+0x44
fffffa60`01d339e0 fffff800`0190330f : nt!AlpcpProcessSynchronousRequest+0x24f
fffffa60`01d33b00 fffff800`016a0ef3 : nt!NtAlpcSendWaitReceivePort+0x19f
fffffa60`01d33bb0 00000000`774d756a : nt!KiSystemServiceCopyEnd+0x13 (TrapFrame @ fffffa60`01d33c20)
00000000`03d8e458 00000000`00000000 : 0x774d756a

1: kd> !alpc /m fffff8800e401200

Message @ fffff8800e401200
  MessageID             : 0x0BA4 (2980)
  CallbackID            : 0xC3E68A (12838538)
  SequenceNumber        : 0x00021911 (137489)
  Type                  : LPC_REQUEST
  DataLength            : 0x00C0 (192)
  TotalLength           : 0x00E8 (232)
  Canceled              : No
  Release               : No
  ReplyWaitReply        : No
  Continuation          : Yes
  OwnerPort             : fffffa8005b119c0 [ALPC_CLIENT_COMMUNICATION_PORT]
  WaitingThread         : fffffa800b711060
  QueueType             : ALPC_MSGQUEUE_PENDING
  QueuePort             : fffffa8005a32730 [ALPC_CONNECTION_PORT]
  QueuePortOwnerProcess : fffffa8004c39040 (ProcessC.exe)
  ServerThread          : fffffa800a843bb0
  QuotaCharged          : No
  CancelQueuePort       : 0000000000000000
  CancelSequencePort    : 0000000000000000
  CancelSequenceNumber  : 0x00000000 (0)
  ClientContext         : 0000000002e2e810
  ServerContext         : 0000000000000000
  PortContext           : 00000000002f3eb0
  CancelPortContext     : 0000000000000000
  SecurityData          : 0000000000000000
  View                  : 0000000000000000

1: kd> !thread fffffa800a843bb0
THREAD fffffa800a843bb0  Cid 048c.fbec  Teb: 000007ffffdaa000 Win32Thread: 0000000000000000 WAIT: (UserRequest) UserMode Non-Alertable
    fffffa8006027d80  Semaphore Limit 0x7fffffff
    fffffa800a843c68  NotificationTimer
Not impersonating
DeviceMap                 fffff88001800ba0
Owning Process            fffffa8004c39040       Image:         ProcessC.exe
Attached Process          N/A            Image:         N/A
Wait Start TickCount      10916801       Ticks: 10194 (0:00:02:39.281)
Context Switch Count      239            
UserTime                  00:00:00.000
KernelTime                00:00:00.015
Win32 Start Address 0x000007fefe647780
Stack Init fffffa601b280db0 Current fffffa601b280940
Base fffffa601b281000 Limit fffffa601b27b000 Call 0
Priority 9 BasePriority 8 PriorityDecrement 0 IoPriority 2 PagePriority 5
Child-SP          RetAddr           : Call Site
fffffa60`1b280980 fffff800`016a36fa : nt!KiSwapContext+0x7f
fffffa60`1b280ac0 fffff800`0169835b : nt!KiSwapThread+0x13a
fffffa60`1b280b30 fffff800`019013e8 : nt!KeWaitForSingleObject+0x2cb
fffffa60`1b280bc0 fffff800`016a0ef3 : nt!NtWaitForSingleObject+0x98
fffffa60`1b280c20 00000000`774d6d5a : nt!KiSystemServiceCopyEnd+0x13 (TrapFrame @ fffffa60`1b280c20)
00000000`10b7e548 00000000`00000000 : 0x774d6d5a
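
The manual chain walk above can be partly automated when the process and thread listing has been saved to a text log. The following Python sketch is not part of the original analysis: it assumes the log contains the same "Waiting for reply to ALPC Message" and "Owning Process" lines shown above, and the function names are hypothetical. It links processes only (waiting process to port owner process); finding the exact server thread still requires !alpc /m as shown above.

```python
import re

# Patterns for the three line kinds used from the thread listing.
THREAD_RE = re.compile(r"^\s*THREAD\s+[0-9a-f]+", re.I)
WAIT_RE = re.compile(
    r"Waiting for reply to ALPC Message\s+[0-9a-f]+.*owned by process\s+([0-9a-f]+)",
    re.I,
)
OWNER_RE = re.compile(r"Owning Process\s+([0-9a-f]+)\s+Image:\s+(\S+)", re.I)


def alpc_edges(log_text):
    """Return (edges, images).

    edges maps a waiting process address to the set of process addresses
    that own the ports it waits on; images maps addresses to image names.
    """
    edges, images = {}, {}
    pending_target = None  # port owner seen for the current thread, if any
    for line in log_text.splitlines():
        if THREAD_RE.match(line):
            pending_target = None  # a new thread block starts
            continue
        m = WAIT_RE.search(line)
        if m:
            pending_target = m.group(1)
            continue
        m = OWNER_RE.search(line)
        if m:
            owner, image = m.group(1), m.group(2)
            images[owner] = image
            if pending_target is not None:
                edges.setdefault(owner, set()).add(pending_target)
                pending_target = None
    return edges, images


def wait_chains(start, edges, images):
    """Follow edges depth-first from start; return chains as image name lists.

    A chain stops when a process has no outgoing wait or when the next
    process is already in the chain (so a cycle does not loop forever).
    """
    chains = []

    def walk(node, path):
        targets = [t for t in sorted(edges.get(node, ())) if t not in path]
        if not targets:
            chains.append([images.get(p, p) for p in path])
            return
        for t in targets:
            walk(t, path + [t])

    walk(start, [start])
    return chains
```

Processes that appear at the end of many chains are natural candidates for user dumps.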

Some processes designed to be non-interactive have threads that wait for UI messages. These could be message box or dialog box threads waiting to be dismissed and blocking other threads:

THREAD fffffa8005a7aa20  Cid 061c.0778  Teb: 000007fffff9e000 Win32Thread: fffff900c079fd50 WAIT: (WrUserRequest) UserMode Non-Alertable
    fffffa8005a7a5a0  SynchronizationEvent
Not impersonating
DeviceMap                 fffff88000007450
Owning Process            fffffa80058f01b0       Image:         ProcessD.exe
Attached Process          N/A            Image:         N/A
Wait Start TickCount      10911798       Ticks: 15197 (0:00:03:57.453)
Context Switch Count      88939                 LargeStack
UserTime                  00:00:00.078
KernelTime                00:00:00.609
Win32 Start Address 0x000007fefa8238a0
Stack Init fffffa60046a8db0 Current fffffa60046a8720
Base fffffa60046a9000 Limit fffffa60046a0000 Call 0
Priority 10 BasePriority 8 PriorityDecrement 0 IoPriority 2 PagePriority 5
Child-SP          RetAddr           Call Site
fffffa60`046a8760 fffff800`016a36fa nt!KiSwapContext+0x7f
fffffa60`046a88a0 fffff800`0169835b nt!KiSwapThread+0x13a
fffffa60`046a8910 fffff960`0014c053 nt!KeWaitForSingleObject+0x2cb
fffffa60`046a89a0 fffff960`0014c0ea win32k!xxxRealSleepThread+0x25f
fffffa60`046a8a40 fffff960`0014bb3a win32k!xxxSleepThread+0x56
fffffa60`046a8a70 fffff960`0014bc39 win32k!xxxRealInternalGetMessage+0x72e
fffffa60`046a8b50 fffff960`0014d0d9 win32k!xxxInternalGetMessage+0x35
fffffa60`046a8b90 fffff800`016a0ef3 win32k!NtUserGetMessage+0x79
fffffa60`046a8c20 00000000`773dd58a nt!KiSystemServiceCopyEnd+0x13 (TrapFrame @ fffffa60`046a8c20)
00000000`03d2f7b8 00000000`00000000 0x773dd58a

We also have more than 30,000 zombie processes, including some special ones (such as WerFault.exe) that signify past faults:

1: kd> !vm
[...]
         15714 ProcessE.exe       0 (         0 Kb)
         15650 WerFault.exe       0 (         0 Kb)
         15644 ProcessF.exe       0 (         0 Kb)
         15640 ProcessE.exe       0 (         0 Kb)
         15610 ProcessG.exe       0 (         0 Kb)
         1560c ProcessE.exe       0 (         0 Kb)
         155f8 ProcessH.exe       0 (         0 Kb)
         155e8 ProcessE.exe       0 (         0 Kb)
         155c4 ProcessG.exe       0 (         0 Kb)
         155bc ProcessE.exe       0 (         0 Kb)
         155b8 ProcessH.exe       0 (         0 Kb)
         1559c WerFault.exe       0 (         0 Kb)
         15560 ProcessE.exe       0 (         0 Kb)
[…]
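
With tens of thousands of such entries, a tally by image name is easier to read than the raw list. The Python sketch below is an illustration only: it assumes the !vm process list has been saved as text in the layout shown above, treats a zero-commit entry as a zombie candidate (the heuristic used in this case study), and uses a hypothetical function name and a hypothetical set of "special" image names.

```python
import re
from collections import Counter

# One !vm process line with zero commit, e.g.
#          15714 ProcessE.exe       0 (         0 Kb)
# Image names containing spaces are not handled by this simple pattern.
ZERO_COMMIT_RE = re.compile(r"^\s*([0-9a-f]+)\s+(\S+)\s+0\s+\(\s*0\s+Kb\)\s*$", re.I)

# Assumption: image names treated as signs of past faults.
SPECIAL_IMAGES = {"werfault.exe"}


def zombie_tally(vm_text):
    """Count zero-commit entries per image name in saved !vm output."""
    tally = Counter()
    for line in vm_text.splitlines():
        m = ZERO_COMMIT_RE.match(line)
        if m:
            tally[m.group(2)] += 1
    return tally


def special_counts(tally):
    """Return only the entries whose image name is in SPECIAL_IMAGES."""
    return {name: n for name, n in tally.items() if name.lower() in SPECIAL_IMAGES}
```

Calling `zombie_tally(text).most_common()` lists the images that leave the most zombie candidates behind, which helps decide which parent processes to look at for leaked handles.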

We recommend saving user dumps of processes A, B, C and D and then forcing a kernel dump the next time the problem surfaces. We also recommend checking WER settings for any recorded faults and, because the system is W2K8, configuring LocalDumps registry keys to capture full user dumps.
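
For the LocalDumps part, the values involved are DumpFolder, DumpCount and DumpType under the Windows Error Reporting LocalDumps key, where DumpType 2 requests a full dump. The Python sketch below only builds the corresponding reg add command lines so they can be reviewed before being run from an elevated prompt. The function name, the default dump count and the example folder are assumptions, not settings taken from the case.

```python
BASE_KEY = r"HKLM\SOFTWARE\Microsoft\Windows\Windows Error Reporting\LocalDumps"


def local_dumps_commands(dump_folder, image_name=None, dump_count=10, dump_type=2):
    """Build reg add commands for WER LocalDumps.

    image_name=None targets the global key; otherwise a per-application
    subkey such as LocalDumps\\ProcessA.exe is used.
    dump_type=2 means full dump (1 is a minidump).
    dump_folder should not end with a backslash, which would escape the
    closing quote on the command line.
    """
    key = BASE_KEY if image_name is None else BASE_KEY + "\\" + image_name
    return [
        f'reg add "{key}" /v DumpFolder /t REG_EXPAND_SZ /d "{dump_folder}" /f',
        f'reg add "{key}" /v DumpCount /t REG_DWORD /d {dump_count} /f',
        f'reg add "{key}" /v DumpType /t REG_DWORD /d {dump_type} /f',
    ]


if __name__ == "__main__":
    # Hypothetical folder; print the commands for review instead of running them.
    for cmd in local_dumps_commands(r"C:\CrashDumps", image_name="ProcessA.exe"):
        print(cmd)
```

Note that LocalDumps captures dumps when a process faults, so it complements rather than replaces the manual user dumps of the hung processes.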

- Dmitry Vostokov @ DumpAnalysis.org -
