Archive for the ‘Crash Dump Patterns’ Category

Pattern-Driven Memory Analysis (Part 2)

Tuesday, April 21st, 2009

Before we explain stages of the analysis process shown in Part 1, let’s start with a brief overview of memory dumps, debuggers and logs. Recall that a memory dump is a snapshot of a process, system or physical memory state. This unifies post-mortem analysis and live debugging. Debuggers are tools that allow us to get and modify these memory snapshots. Other tools that allow us to get memory dump files are process dumpers like userdump.exe, Task Manager since Vista, WER, and system dumpers like LiveKd and Win32dd. We should not forget tools and methods that allow us to trigger Windows kernel ability to save consistent memory dump files: NMI button, keyboard method and various software bugcheck-triggers like Citrix SystemDump. Now coming back to debuggers. One of their essential features is to save a debugging session log, formatted textual output saved in a text file for further processing. One good example is !process 0 ff WinDbg command to output all processes and their thread stack traces (see Stack Trace Collection pattern for other variations). 

I’ve created a page to add all P-DMA parts as soon as I write them:

Pattern-Driven Memory Analysis

- Dmitry Vostokov @ DumpAnalysis.org -

Manual and early crash dump, stack trace collection, main thread, blocked threads and pass through functions: pattern cooperation

Thursday, April 16th, 2009

The system was hanging and a manual kernel dump file was generated:

0: kd> !analyze -v
[...]
NMI_HARDWARE_FAILURE (80)
This is typically due to a hardware malfunction.  The hardware supplier should
be called.
Arguments:
Arg1: 004f4454
Arg2: 00000000
Arg3: 00000000
Arg4: 00000000
[...]

Initially it looked like an earlier premature crash dump taken after the reboot:

0: kd> version
[...]
System Uptime: 0 days 0:00:54.203
[…]

However the incident description stated that the system was hanging during its startup so we went further to look at a stack trace collection of all its threads and found the main thread of spooler was hanging for about 23 seconds after 4 seconds since process creation:

0: kd> !process 0 ff

[...]

PROCESS 8cf06020  SessionId: 0  Cid: 04a4    Peb: 7ffde000  ParentCid: 0244
    DirBase: cff8c1a0  ObjectTable: d70086d0  HandleCount:  24.
    Image: spoolsv.exe
    VadRoot 8bb19130 Vads 32 Clone 0 Private 60. Modified 0. Locked 0.
    DeviceMap d66028f0
    Token                             d6e88988
    ElapsedTime                       00:00:27.921
    UserTime                          00:00:00.000
    KernelTime                        00:00:00.000
    QuotaPoolUsage[PagedPool]         27844
    QuotaPoolUsage[NonPagedPool]      1280
    Working Set Sizes (now,min,max)  (312, 50, 345) (1248KB, 200KB, 1380KB)
    PeakWorkingSetSize                312
    VirtualSize                       12 Mb
    PeakVirtualSize                   12 Mb
    PageFaultCount                    322
    MemoryPriority                    BACKGROUND
    BasePriority                      8
    CommitCharge                      80

THREAD 8cf22db0  Cid 04a4.04a8  Teb: 7ffdd000 Win32Thread: bc155948 WAIT: (Unknown) KernelMode Non-Alertable
    8beeca28  SynchronizationEvent
IRP List:
    8beecd08: (0006,0094) Flags: 00000884  Mdl: 00000000
Not impersonating
DeviceMap                 d66028f0
Owning Process            8cf06020       Image:         spoolsv.exe
Attached Process          N/A            Image:         N/A
Wait Start TickCount      1981           Ticks: 1488 (0:00:00:23.250)
Context Switch Count      9                 LargeStack
UserTime                  00:00:00.000
KernelTime                00:00:00.000
Win32 Start Address 0×0100468c
Start Address 0×77e617f8
Stack Init b6a63b30 Current b6a630cc Base b6a64000 Limit b6a60000 Call b6a63b34
Priority 9 BasePriority 8 PriorityDecrement 0
ChildEBP RetAddr 
b6a630e4 80833465 nt!KiSwapContext+0×26
b6a63110 80829a62 nt!KiSwapThread+0×2e5
b6a63158 b5dbc727 nt!KeWaitForSingleObject+0×346
b6a6318c b5dbd020 mrxdav!RxConstructSrvCall+0×187
b6a63200 b5dbd144 mrxdav!RxFindOrCreateConnections+0×58a
b6a6324c b5dbd552 mrxdav!RxConstructVirtualNetRoot+0×38
b6a632b8 b5db27fc mrxdav!RxFindOrConstructVirtualNetRoot+0×220
b6a63304 b5da7ff1 mrxdav!RxPrefixClaim+0×160
b6a6331c b5d93f72 mrxdav!RxCommonDevFCBIoCtl+0×71
b6a633ac b5da780c mrxdav!RxFsdCommonDispatch+0×320
b6a633cc b5da0fc4 mrxdav!RxFsdDispatch+0xd4
b6a63440 8081df65 mrxdav!MRxDAVFsdDispatch+0×1f0
b6a63454 bace232e nt!IofCallDriver+0×45
b6a63460 bace24f6 Mup!MupiIssueQueryRequest+0×18
b6a63488 bace2414 Mup!MupiResolvePrefix+0×11b
b6a634d4 bacdf4ed Mup!CreateRedirectedFile+0×385
b6a63530 8081df65 Mup!MupCreate+0×1cb
b6a63544 808f8f71 nt!IofCallDriver+0×45
b6a6362c 80937942 nt!IopParseDevice+0xa35
b6a636ac 80933a76 nt!ObpLookupObjectName+0×5b0
b6a63700 808ec76b nt!ObOpenObjectByName+0xea
b6a63884 8088978c nt!NtQueryAttributesFile+0×11d
b6a63884 7c8285ec nt!KiFastCallEntry+0xfc (TrapFrame @ b6a63894)

Looking at IRP we see that I/O request was redirected to the following file share:

0: kd> !irp 8beecd08
Irp is active with 1 stacks 1 is current (= 0x8beecd78)
 No Mdl: No System Buffer: Thread 8cf22db0:  Irp stack trace. 
     cmd  flg cl Device   File     Completion-Context
>[  0, 0]   0  0 8cfdcee8 b6a637f0 00000000-00000000   
        \FileSystem\Mup
   Args: b6a63570 01200000 00070000 00000000

0: kd> !fileobj b6a637f0

\SERVER-A\f$\AppFiles\Component.dll

Device Object: 0x8cfdcee8   \FileSystem\Mup
Vpb is NULL

Flags:  0x0

CurrentByteOffset: 0

When looking at other processes we see 2 threads blocked in svchost.exe in the same redirector component:

0: kd> !stacks
Proc.Thread  .Thread  Ticks   ThreadState Blocker
[...]
                            [8bb2a728 svchost.exe]
 3d0.0003d4  8bb2a290 0000701 Blocked    nt!NtReadFile+0x5d5
 3d0.0003d8  8bb2aba0 00006ff Blocked    nt!ObpWaitForMultipleObjects+0x202
 3d0.0003dc  8bb338c0 00005eb Blocked    nt!ObpWaitForMultipleObjects+0x202
 3d0.0003e8  8bb2cdb0 0000892 Blocked    nt!ObpWaitForMultipleObjects+0x202
 3d0.0003ec  8bb2c838 0000892 Blocked    nt!KiFastCallEntry+0xfc
 3d0.0003f0  8bb26cd0 0000892 Blocked    nt!KiFastCallEntry+0xfc
 3d0.000488  8ceffb88 00006fb Blocked    mrxdav!UMRxAssignWork+0xdf
 3d0.00048c  8cec3db0 00005d0 Blocked    mrxdav!UMRxAssignWork+0xdf

 3d0.000490  8cec3b40 00006fb Blocked    nt!KiFastCallEntry+0xfc
 3d0.0004d4  8bb02918 00005c7 Blocked    nt!KiFastCallEntry+0xfc
 3d0.0004d8  8bafdb40 00005c4 Blocked    nt!KiFastCallEntry+0xfc
[…]

In these two threads we see pass through functions in driverA and driverB:

0: kd> !thread 8ceffb88
THREAD 8ceffb88  Cid 03d0.0488  Teb: 7ffd8000 Win32Thread: 00000000 WAIT: (Unknown) UserMode Non-Alertable
    8cef1f28  QueueObject
Not impersonating
DeviceMap                 d6e2c298
Owning Process            8bb2a728       Image:         svchost.exe
Attached Process          N/A            Image:         N/A
Wait Start TickCount      1682           Ticks: 1787 (0:00:00:27.921)
Context Switch Count      2            
UserTime                  00:00:00.000
KernelTime                00:00:00.000
Win32 Start Address 0x5a005e80
Start Address 0x77e617ec
Stack Init b5ed8000 Current b5ed7a64 Base b5ed8000 Limit b5ed5000 Call 0
Priority 8 BasePriority 8 PriorityDecrement 0
ChildEBP RetAddr
b5ed7a7c 80833465 nt!KiSwapContext+0x26
b5ed7aa8 8082b60f nt!KiSwapThread+0x2e5
b5ed7af0 b5da714d nt!KeRemoveQueue+0x417
b5ed7b10 b5d9d7db mrxdav!UMRxAssignWork+0xdf
b5ed7b70 bae71bc5 mrxdav!MRxDAVFastIoDeviceControl+0x13d
b5ed7ba8 bae7f42d fltmgr!FltpPerformFastIoCall+0x153
b5ed7bf4 b5f57300 fltmgr!FltpFastIoDeviceControl+0xb1
b5ed7c20 b6e20f8f driverB!FS_FastIoDeviceControl+0×60
b5ed7c5c 808f5e2f driverA!FastIoDeviceControl+0xcf

b5ed7d00 808eed08 nt!IopXxxControlFile+0×255
b5ed7d34 8088978c nt!NtDeviceIoControlFile+0×2a
b5ed7d34 7c8285ec nt!KiFastCallEntry+0xfc (TrapFrame @ b5ed7d64)

- Dmitry Vostokov @ DumpAnalysis.org -

NULL Data Pointer Pattern: case study

Wednesday, April 15th, 2009

Here is the promised case study for the previous post about data NULL pointers. The complete dump has this bugcheck:

0: kd> !analyze -v

[...]

KERNEL_MODE_EXCEPTION_NOT_HANDLED (8e)
This is a very common bugcheck.  Usually the exception address pinpoints the driver/function that caused the problem.  Always note this address as well as the link date of the driver/image that contains this address. Some common problems are exception code 0x80000003.  This means a hard coded breakpoint or assertion was hit, but this system was booted /NODEBUG.  This is not supposed to happen as developers should never have hardcoded breakpoints in retail code, but ... If this happens, make sure a debugger gets connected, and the system is booted /DEBUG.  This will let us see why this breakpoint is happening.
Arguments:
Arg1: c0000005, The exception code that was not handled
Arg2: 8081c7c4, The address that the exception occurred at
Arg3: f1b5d730, Trap Frame
Arg4: 00000000

[...]

FAULTING_IP:
nt!IoIsOperationSynchronous+e
8081c7c4 f6412c02        test    byte ptr [ecx+2Ch],2

TRAP_FRAME:  f1b5d730 -- (.trap 0xfffffffff1b5d730)
[...]

0: kd> .trap 0xfffffffff1b5d730
ErrCode = 00000000
eax=8923b008 ebx=00000000 ecx=00000000 edx=8923b008 esi=891312d0 edi=89f0b300
eip=8081c7c4 esp=f1b5d7a4 ebp=f1b5d7a4 iopl=0 nv up ei ng nz ac pe nc
cs=0008 ss=0010 ds=0023 es=0023 fs=0030 gs=0000 efl=00010296
nt!IoIsOperationSynchronous+0xe:
8081c7c4 f6412c02   test    byte ptr [ecx+2Ch],2  ds:0023:0000002c=??

0: kd> kv 100
ChildEBP RetAddr  Args to Child             
f1b5d7a4 f42cdea9 8923b008 89f0b300 8923b008 nt!IoIsOperationSynchronous+0xe
f1b5d7bc 8081df85 89f0b300 8923b008 00000200 driveB!FsdDeviceIoControlFile+0×19
f1b5d7d0 808ed7a9 00000000 f1b5da84 f1b5db6c nt!IofCallDriver+0×45
f1b5da20 f3c3a521 89f0b300 f1b5da84 f1b5da84 nt!IoVolumeDeviceToDosName+0×89
WARNING: Stack unwind information not available. Following frames may be wrong.
f1b5da3c f3c3b58e 00000618 e4e00420 f1b5dad4 driverA+0×18531
[…]
f1b5dc3c 8081df85 89f48b48 87fa3008 89140d30 driverA+0×1df4

f1b5dc50 808f5437 87fa3078 89140d30 87fa3008 nt!IofCallDriver+0×45
f1b5dc64 808f61bf 89f48b48 87fa3008 89140d30 nt!IopSynchronousServiceTail+0×10b
f1b5dd00 808eed08 000000f0 00000000 00000000 nt!IopXxxControlFile+0×5e5
f1b5dd34 808897bc 000000f0 00000000 00000000 nt!NtDeviceIoControlFile+0×2a
f1b5dd34 7c8285ec 000000f0 00000000 00000000 nt!KiFastCallEntry+0xfc (TrapFrame @ f1b5dd64)
0856e154 7c826fcb 77e416f5 000000f0 00000000 ntdll!KiFastSystemCallRet
0856e158 77e416f5 000000f0 00000000 00000000 ntdll!NtDeviceIoControlFile+0xc
0856e1bc 6f050c6c 000000f0 5665824c 0856e234 kernel32!DeviceIoControl+0×137
[…]

From WDK help we know that the first parameter to IoIsOperationSynchronous is a pointer to an IRP structure:

0: kd> !irp 8923b008
Irp is active with 3 stacks 3 is current (= 0x8923b0c0)
 No Mdl: System buffer=878b7288: Thread 8758a020:  Irp stack trace. 
     cmd  flg cl Device   File     Completion-Context
 [  0, 0]   0  0 00000000 00000000 00000000-00000000   
                     Args: 00000000 00000000 00000000 00000000
 [  0, 0]   0  0 00000000 00000000 00000000-00000000   
                     Args: 00000000 00000000 00000000 00000000
>[  e, 0]   0  0 89f0b300 00000000 00000000-00000000   
              \FileSystem\DriverB
                     Args: 00000200 00000000 004d0008 00000000

Disassembling the function shows some pointer dereferencing and we can reconstruct it starting from EBP+8, a pointer to an IRP. 

0: kd> .asm no_code_bytes
Assembly options: no_code_bytes

0: kd> u nt!IoIsOperationSynchronous nt!IoIsOperationSynchronous+0xe
nt!IoIsOperationSynchronous:
8081c7b6 mov     edi,edi
8081c7b8 push    ebp
8081c7b9 mov     ebp,esp
8081c7bb mov     eax,dword ptr [ebp+8]
8081c7be mov     ecx,dword ptr [eax+60h]
8081c7c1 mov     ecx,dword ptr [ecx+18h]

EAX+60 seems to be a current stack location member of IRP and it is a pointer itself to _IO_STACK_LOCATION structure:

0: kd> dt -r _IRP 8923b008
ntdll!_IRP
   +0x000 Type             : 6
   +0x002 Size             : 0x268
   +0x004 MdlAddress       : (null)
   +0x008 Flags            : 0x70
[...]
   +0x038 CancelRoutine    : (null)
   +0x03c UserBuffer       : 0xf1b5d814
   +0×040 Tail             : __unnamed
      +0×000 Overlay          : __unnamed
         +0×000 DeviceQueueEntry : _KDEVICE_QUEUE_ENTRY
         +0×000 DriverContext    : [4] (null)
         +0×010 Thread           : 0×8758a020 _ETHREAD
         +0×014 AuxiliaryBuffer  : (null)
         +0×018 ListEntry        : _LIST_ENTRY [ 0×0 - 0×0 ]
         +0×020 CurrentStackLocation : 0×8923b0c0 _IO_STACK_LOCATION
[…]

ECX+18 is a pointer to a file object in _IO_STACK_LOCATION structure:

0: kd> dt _IO_STACK_LOCATION 8923b008+60
ntdll!_IO_STACK_LOCATION
   +0x000 MajorFunction    : 0xc0 ''
   +0x001 MinorFunction    : 0xb0 ''
   +0x002 Flags            : 0x23 '#'
   +0x003 Control          : 0x89 ''
   +0x004 Parameters       : __unnamed
   +0x014 DeviceObject     : (null)
   +0×018 FileObject       : (null)
   +0×01c CompletionRoutine : (null)
   +0×020 Context          : (null)

2C offset at the crash point test byte ptr [ecx+2Ch],2 is _FILE_OBJECT Flags member:

0: kd> dt _FILE_OBJECT
ntdll!_FILE_OBJECT
   +0x000 Type             : Int2B
   +0x002 Size             : Int2B
   +0x004 DeviceObject     : Ptr32 _DEVICE_OBJECT
   +0x008 Vpb              : Ptr32 _VPB
   +0x00c FsContext        : Ptr32 Void
   +0x010 FsContext2       : Ptr32 Void
   +0x014 SectionObjectPointer : Ptr32 _SECTION_OBJECT_POINTERS
   +0x018 PrivateCacheMap  : Ptr32 Void
   +0x01c FinalStatus      : Int4B
   +0x020 RelatedFileObject : Ptr32 _FILE_OBJECT
   +0x024 LockOperation    : UChar
   +0x025 DeletePending    : UChar
   +0x026 ReadAccess       : UChar
   +0x027 WriteAccess      : UChar
   +0x028 DeleteAccess     : UChar
   +0x029 SharedRead       : UChar
   +0x02a SharedWrite      : UChar
   +0x02b SharedDelete     : UChar
   +0×02c Flags            : Uint4B
   +0×030 FileName         : _UNICODE_STRING
   +0×038 CurrentByteOffset : _LARGE_INTEGER
   +0×040 Waiters          : Uint4B
   +0×044 Busy             : Uint4B
   +0×048 LastLock         : Ptr32 Void
   +0×04c Lock             : _KEVENT
   +0×05c Event            : _KEVENT
   +0×06c CompletionContext : Ptr32 _IO_COMPLETION_CONTEXT

So it looks like driverA passed an IRP with NULL File object address to driverB and this is also shown in the output of !irp command above.

- Dmitry Vostokov @ DumpAnalysis.org -

Crash Dump Analysis Patterns (Part 6b)

Tuesday, April 14th, 2009

NULL Data Pointer is a special version of the more general Invalid Pointer pattern like NULL Code Pointer. The effective address is below 0xFFFF and it is usually a register with 0 value and the small offset, for example:

0: kd> r
Last set context:
eax=8923b008 ebx=00000000 ecx=00000000 edx=8923b008 esi=891312d0 edi=89f0b300
eip=8081c7c4 esp=f1b5d7a4 ebp=f1b5d7a4 iopl=0 nv up ei ng nz ac pe nc
cs=0008 ss=0010 ds=0023 es=0023 fs=0030 gs=0000 efl=00010296
nt!IoIsOperationSynchronous+0xe:
8081c7c4 f6412c02  test    byte ptr [ecx+2Ch],2   ds:0023:0000002c=??

Here, after disassembling the function backwards, we see the succession of dereferences starting from [EBP+8] and this means that a pointer to a structure (an IRP here) was passed to the function and it had a data pointer in it, pointing to another structure and the latter contained an NULL pointer:  

0: kd> ub 8081c7c4
nt!IoIsOperationSynchronous:
8081c7b6 8bff            mov     edi,edi
8081c7b8 55              push    ebp
8081c7b9 8bec            mov     ebp,esp
8081c7bb 8b4508          mov     eax,dword ptr [ebp+8]
8081c7be 8b4860          mov     ecx,dword ptr [eax+60h]
8081c7c1 8b4918          mov     ecx,dword ptr [ecx+18h]

Next post will feature a full case study.

Note: pointers are discussed in great detail in my latest book: Windows Debugging: Practical Foundations

- Dmitry Vostokov @ DumpAnalysis.org -

Crash Dump Analysis Patterns (Part 83)

Tuesday, April 14th, 2009

When constantly looking at Stack Trace Collections from complete or kernel memory dumps we notice that certain processes are always present and remember them. They are no longer suspicious. The same about thread stacks. Some are always present and some are not suspicious because of their function or status, like Passive Threads or Passive System Threads. Going more fine-grained we can talk about components and their specific functions. For example, certain kernel space components have special filter functions, they get an IRP and pass it down the device stack. It doesn’t take much code to check an IRP and forward it. This is usually reflected in small function offsets, for example:

ChildEBP RetAddr
aeced780 80833ec5 nt!KiSwapContext+0x26
aeced7ac 80829bc0 nt!KiSwapThread+0x2e5
aeced7f4 badffece nt!KeWaitForSingleObject+0x346
WARNING: Stack unwind information not available. Following frames may be wrong.
aeced824 bae00208 AVFilterB+0×1ece
aeced868 bae0e45a AVFilterB+0×2208
aeced8a0 8081e095 AVFilterB+0×1045a

aeced8b4 b946673b nt!IofCallDriver+0×45
aeced8c4 b94626ee driverB!FS_Dispatch+0xfb
aeced8d4 8081e095 driverB!dispatch+0×6e

aeced8e8 b96e04e1 nt!IofCallDriver+0×45
aeced90c b96e0755 driverA!PassThrough+0xd1
aeced92c 8081e095 driverA!Create+0×155
aeced940 b882df08 nt!IofCallDriver+0×45
aeceda5c 8081e095 AVFilterA!DispatchPassThrough+0×48
aeceda70 808fb13b nt!IofCallDriver+0×45
aecedb58 80939c6a nt!IopParseDevice+0xa35
aecedbd8 80935d9e nt!ObpLookupObjectName+0×5b0
aecedc2c 808ece57 nt!ObOpenObjectByName+0xea
aecedca8 808ee0f1 nt!IopCreateFile+0×447
aecedd04 808f1e31 nt!IoCreateFile+0xa3
aecedd44 8088ad3c nt!NtOpenFile+0×27
[…]

Here, if the thread is blocked, AVFilterB is more suspicious than AVFilterA because it is on top of the stack, waiting and AVFilterA just passed an IRP to driverA. DriverA seems also relayed the IRP to driverB and the latter relayed it to AVFilterB.

Another x64 example shows how these filter functions can be identified. They have “Dispatch” or “PassThrough” in their function names:

Child-SP          RetAddr           Call Site
fffffa60`12610880 fffff800`01875f8a nt!KiSwapContext+0x7f
fffffa60`126109c0 fffff800`0187776a nt!KiSwapThread+0x2fa
fffffa60`12610a30 fffff800`01ab16d6 nt!KeWaitForSingleObject+0x2da
[...]
fffffa60`12610fe0 fffffa60`06c5191a rdbss!RxFsdCommonDispatch+0×786
fffffa60`126110d0 fffffa60`07e4f21f rdbss!RxFsdDispatch+0×21a

fffffa60`12611140 fffffa60`011e05f5 mrxsmb!MRxSmbFsdDispatch+0xbf
fffffa60`12611180 fffffa60`011e0130 mup!MupiCallUncProvider+0×159
fffffa60`126111f0 fffffa60`011e17af mup!MupStateMachine+0×120
fffffa60`12611240 fffffa60`00d200b4 mup!MupCreate+0×2c3
fffffa60`126112c0 fffffa60`06d332d6 fltmgr!FltpCreate+0xa4
fffffa60`12611370 fffffa60`06d786c7 driverB!FS_Dispatch+0×156
fffffa60`126113a0 fffffa60`06d7894d driverA!PassThrough+0×177

fffffa60`12611400 fffffa60`090b3f30 driverA!Create+0×14d
fffffa60`12611430 fffff800`01aef360 AVDriverA!LowerDevicePassThrough+0×5c
fffffa60`12611700 fffff800`01aefa59 nt!IopParseDevice+0×5e3
fffffa60`126118a0 fffff800`01af3944 nt!ObpLookupObjectName+0×5eb
fffffa60`126119b0 fffff800`01affee0 nt!ObOpenObjectByName+0×2f4
fffffa60`12611a80 fffff800`01b00a0c nt!IopCreateFile+0×290
fffffa60`12611b20 fffff800`0186fdf3 nt!NtCreateFile+0×78
[…]

- Dmitry Vostokov @ DumpAnalysis.org -

Blocked threads, message box and self-diagnosis: pattern cooperation

Friday, April 3rd, 2009

This is a very simple case study. One service was hanging and its user dump was taken. Inspection of its critical sections reveals the one thread blocking 32 other threads. The owner stack trace points to Message Box pattern:

0:000> !cs -l -o -s
[...]
DebugInfo          = 0x00151e20
Critical section   = 0x00455bd0 (ServiceA!g_CritSec+0x0)
LOCKED
LockCount          = 0×20
WaiterWoken        = No
OwningThread       = 0×00005774
RecursionCount     = 0×1
LockSemaphore      = 0×18
SpinCount          = 0×00000000
OwningThread DbgId = ~24s
OwningThread Stack =
ChildEBP RetAddr  Args to Child              
00d0d854 77e3bf53 77e3610a 00000000 00000000 ntdll!KiFastSystemCallRet
00d0d88c 77e2969d 07a1006e 00000000 00000001 user32!NtUserWaitMessage+0xc
00d0d8b4 77e3f762 77e20000 00181620 00000000 user32!InternalDialogBox+0xd0
00d0db74 77e3f047 00d0dcd0 00d0e810 00d0f3dc user32!SoftModalMessageBox+0×94b
00d0dcc4 77e3eec9 00d0dcd0 00000028 00000000 user32!MessageBoxWorker+0×2ba
00d0dd1c 77e3ee65 00000000 00d0de94 00d0dd84 user32!MessageBoxTimeoutW+0×7a
00d0dd3c 77e3ee41 00000000 00d0de94 00d0dd84 user32!MessageBoxExW+0×1b
00d0dd58 25010b67 00000000 00d0de94 00d0dd84 user32!MessageBoxW+0×45
WARNING: Stack unwind information not available. Following frames may be wrong.
00d0de84 25010372 00d0de94 00010010 00690044 moduleA!DllMain+0×8b27
00d0e298 25018e3d 00002754 25067c58 25066a74 moduleA!DllMain+0×8332
[…]
00d0f974 00403e83 00000008 00d0f98c 00d0fae0 ServiceA!Create+0×39
[…]

Inspecting message box parameters we see a self-diagnostic message:

0:000> du /c 100 00d0de94
00d0de94  “The installation of Software Product A was not successful because moduleA could not be installed.”

- Dmitry Vostokov @ DumpAnalysis.org -

March issue of Debugged! MZ/PE is available!

Sunday, March 29th, 2009

Finally it has been published and available for orders from Amazon and other bookstores:

http://www.dumpanalysis.org/Debugged+Magazine

I had to increase the number of pages for the first issue from 16, planned originally, to 28 and this is reflected in the retail price of $10 (originally planned $8) but bookstores should sell it with a discount between 0% and 55%.

More information about the next issue should be ready by the end of the next week.

- Dmitry Vostokov @ DumpAnalysis.org

Advanced Local Procedure Call WinDbg extension

Friday, March 20th, 2009

In Windows Vista and Server 2008 LPC is implemented using undocumented Advanced Local Procedure Calls (ALPC) and the new formatting of the !process 0 ff output shows the owner of the port: 

THREAD fffffa801afae6f0  Cid 033c.3c40  Teb: 000007fffffa0000 Win32Thread: 0000000000000000 WAIT: (WrLpcReply) UserMode Non-Alertable
    fffffa801afaea80  Semaphore Limit 0x1
Waiting for reply to ALPC Message fffff8800f4cc950 : queued at port fffffa8017dbb060 : owned by process fffffa801807b8f0

Instead of !lpc we should use !alpc extension command: 

0: kd> !lpc message fffff8800f4cc950
LPC is now emulated over ALPC. Use !alpc
Arguments:

   /?                   - Print help
   /P [/V] PortAddress  - Dump the ALPC port information
   /M [/V] MsgAddress   - Searches the message and dumps the message information
   /R [/V] ResAddress   - Dumps the information regarding resources
   /G [/V]              - Dumps the global information

0: kd> !alpc /M fffff8800f4cc950

Message @ fffff8800f4cc950
  MessageID             : 0x0BFC (3068)
  CallbackID            : 0x1FEF27 (2092839)
  SequenceNumber        : 0x0004D4D4 (316628)
  Type                  : LPC_REQUEST
  DataLength            : 0x0068 (104)
  TotalLength           : 0x0090 (144)
  Canceled              : No
  Release               : No
  ReplyWaitReply        : No
  Continuation          : Yes
  OwnerPort             : fffffa8018a95e60 [ALPC_CLIENT_COMMUNICATION_PORT]
  WaitingThread         : fffffa801afae6f0
  QueueType             : ALPC_MSGQUEUE_PENDING
  QueuePort             : fffffa8017dbb060 [ALPC_CONNECTION_PORT]
  QueuePortOwnerProcess : fffffa801807b8f0 (svchost.exe)
  ServerThread          : fffffa801aeec060
  QuotaCharged          : No
  CancelQueuePort       : 0000000000000000
  CancelSequencePort    : 0000000000000000
  CancelSequenceNumber  : 0×00000000 (0)
  ClientContext         : 00000000025599e0
  ServerContext         : 0000000000000000
  PortContext           : 000000000030d640
  CancelPortContext     : 0000000000000000
  SecurityData          : 0000000000000000
  View                  : 0000000000000000

- Dmitry Vostokov @ DumpAnalysis.org -

Debugged! MZ/PE soon to be available!

Wednesday, March 18th, 2009

Yesterday I submitted the magazine to print and distribution world-wide. If everything is right it should be available by the end of this month. This first issue features 12 page WinDbg command supplement to pattern-driven memory dump analysis methodology, an overview of Win32dd complete memory dumper and PowerDbg enhancements to debug ASP.NET code. The magazine will only be available in print.

- Dmitry Vostokov @ DumpAnalysis.org -

Manual dump, wait chain, blocked thread, dynamic memory corruption and historical information: pattern cooperation

Saturday, March 14th, 2009

For one manual user dump of a hanging service neither !analyze -v nor !analyze -v -hang command found the problem:

Loading Dump File [ServiceA.dmp]
User Mini Dump File with Full Memory: Only application data is available

Comment: 'Userdump generated complete user-mode minidump with Standalone function on COMPUTER'

0:000> !analyze -v -hang

DERIVED_WAIT_CHAIN: 

Dl Eid Cid     WaitType
-- --- ------- --------------------------
   0   678.67c File IO               

WAIT_CHAIN_COMMAND:  ~0s;k;;

BLOCKING_THREAD:  0000067c

DEFAULT_BUCKET_ID:  APPLICATION_HANG_BlockedOn_FileIO

PRIMARY_PROBLEM_CLASS:  APPLICATION_HANG_BlockedOn_FileIO

LAST_CONTROL_TRANSFER:  from 7c82776b to 7c8285ec

STACK_TEXT: 
0012fbc8 7c82776b ntdll!KiFastSystemCallRet
0012fbcc 77e218b2 ntdll!NtReadFile+0xc
0012fc34 77d85edb kernel32!ReadFile+0x16c
0012fc60 77d85f82 advapi32!ScGetPipeInput+0x2a
0012fcd4 77d71ed9 advapi32!ScDispatcherLoop+0x51
0012ff38 004025e0 advapi32!StartServiceCtrlDispatcherW+0xe3
0012ff60 0042c8fd ServiceA!main+0x110
0012ffc0 77e4f23b ServiceA!mainCRTStartup+0x143
0012fff0 00000000 kernel32!BaseProcessStart+0x23

As usual the critical section list was checked for any locked sections and to find wait chains:

0:000> !cs -l -o -s
-----------------------------------------
DebugInfo          = 0x7c8877c0
Critical section   = 0×7c8877a0 (ntdll!LdrpLoaderLock+0×0)
LOCKED
LockCount          = 0xA2

WaiterWoken        = No
OwningThread       = 0×00011f64
RecursionCount     = 0×1
LockSemaphore      = 0×188
SpinCount          = 0×00000000
OwningThread DbgId = ~20s
OwningThread Stack =
ChildEBP RetAddr  Args to Child             
0327f91c 7c827d0b 7c83d236 00000754 00000000 ntdll!KiFastSystemCallRet
0327f920 7c83d236 00000754 00000000 00000000 ntdll!NtWaitForSingleObject+0xc
0327f95c 7c83d281 00000754 00000004 00370000 ntdll!RtlpWaitOnCriticalSection+0×1a3
0327f97c 7c82a264 00370608 00000000 0000008c ntdll!RtlEnterCriticalSection+0xa8
0327fba4 7c3423aa 00370000 00000008 0000008c ntdll!RtlAllocateHeap+0×313
0327fbe4 7c3422cb 00000001 0000008c 00000000 msvcr71!calloc+0xe6
0327fbfc 7c81a352 7c340000 00000002 00000000 msvcr71!_CRTDLL_INIT+0×138
0327fc1c 7c82ed97 7c34229f 7c340000 00000002 ntdll!LdrpCallInitRoutine+0×14
0327fcb8 7c82ec9f 0327fd28 0327fd28 00000000 ntdll!LdrpInitializeThread+0×10d
0327fd14 7c8284c5 0327fd28 7c800000 00000000 ntdll!_LdrpInitialize+0×16f
00000000 00000000 00000000 00000000 00000000 ntdll!KiUserApcDispatcher+0×25
ntdll!RtlpStackTraceDataBase is NULL. Probably the stack traces are not enabled.
[…]
DebugInfo          = 0×7c887e20
Critical section   = 0×00370608 (+0×370608)
LOCKED
LockCount          = 0×7

WaiterWoken        = No
OwningThread       = 0×000027ac
RecursionCount     = 0×1
LockSemaphore      = 0×754
SpinCount          = 0×00000fa0
OwningThread DbgId = ~17s
OwningThread Stack =
ChildEBP RetAddr  Args to Child             
0170ebe0 7c82775b 77e62610 d0000144 00000004 ntdll!KiFastSystemCallRet
0170ebe4 77e62610 d0000144 00000004 00000000 ntdll!NtRaiseHardError+0xc
0170ee50 77e592a3 0170ee78 77e41ac1 0170ee80 kernel32!UnhandledExceptionFilter+0×51a

0170ee58 77e41ac1 0170ee80 00000000 0170ee80 kernel32!BaseThreadStart+0×4a
0170ee80 7c828752 0170ef64 0170ffdc 0170ef80 kernel32!_except_handler3+0×61
0170eea4 7c828723 0170ef64 0170ffdc 0170ef80 ntdll!ExecuteHandler2+0×26
0170ef4c 7c82855e 0170d000 0170ef80 0170ef64 ntdll!ExecuteHandler+0×24
0170ef4c 7c82a754 0170d000 0170ef80 0170ef64 ntdll!KiUserExceptionDispatcher+0xe (CONTEXT @ 0170ef80)
0170f258 7c82a82b 0185d6a8 00690043 0170f320 ntdll!RtlpCoalesceFreeBlocks+0×36e

0170f340 7c34218a 00370000 00000000 0185bad8 ntdll!RtlFreeHeap+0×38e
0170f388 004288df 0185bad8 05afd9c4 00000000 msvcr71!free+0xc3
0170f3b4 00428a16 05afd9c4 00001bc4 0185bad8 ServiceA!SetData+0×4df
ntdll!RtlpStackTraceDataBase is NULL. Probably the stack traces are not enabled.
[…]

We see that 162 threads (0xA2) are waiting for 0×7c8877a0 locked critical section whose owner thread #20 is waiting for the critical section 00370608 owned by the thread #17. The latter thread is blocked in an unhandled exception filter raising a hard error message box. Applying the new exception context we confirm heap corruption problem:

0:000> .cxr 0170ef80
eax=0185d6a8 ebx=00370000 ecx=00690043 edx=00000001 esi=0185d6a0 edi=01858030
eip=7c82a754 esp=0170f24c ebp=0170f258 iopl=0 nv up ei pl zr na pe nc
cs=001b ss=0023 ds=0023 es=0023 fs=003b gs=0000 efl=00010246
ntdll!RtlpCoalesceFreeBlocks+0×36e:
7c82a754 8b09            mov     ecx,dword ptr [ecx]  ds:0023:00690043=????????

0:000> kL
  *** Stack trace for last set context - .thread/.cxr resets it
ChildEBP RetAddr 
0170f258 7c82a82b ntdll!RtlpCoalesceFreeBlocks+0×36e
0170f340 7c34218a ntdll!RtlFreeHeap+0×38e
0170f388 004288df msvcr71!free+0xc3
0170f3b4 00428a16 ServiceA!SetData+0×4df
[…]
0170ffec 00000000 kernel32!BaseThreadStart+0×34

Therefore we suggest to enable full page heap either by using gflags or application verifier. Looking further at other locked critical sections omitted above we notice that one is inaccessible and from an unloaded module:

Cannot read structure field value at 0x5531953c, error 0
Cannot determine if the critical section is locked or not.
-----------------------------------------
Critical section   = 0x55319538 (<Unloaded_DetouredA.DLL>+0×29538)
DebugInfo          = 0×00147108
ntdll!RtlpStackTraceDataBase is NULL. Probably the stack traces are not enabled.

We see this execution residue in the list of unloaded modules too:

0:000> lmt
[...]
Unloaded modules:
[...]
697b0000 697c7000   faultrep.dll
    Timestamp: Sat Feb 17 15:12:57 2007 (45D71B79)
    Checksum:  0001D340
[...]
552f0000 55320000   DetouredA.DLL
    Timestamp: Wed Dec 19 09:38:46 2007 (4768E6A6)
    Checksum:  00030564

Recalling that this hooksware had problems before we also suggest to remove software package that injects DetouredA module to test the stability of the system.

- Dmitry Vostokov @ DumpAnalysis.org -

Coupled processes, wait chains, message box, waiting thread time, paged out data, incorrect stack trace, hidden exception, unknown component and execution residue: pattern cooperation

Wednesday, March 11th, 2009

Let’s look at one complete memory dump from a hanging system. It was reported that an application was hanging when trying to create a printer. In its process we can identify this blocked thread that was processing an LPC request and waiting for an LPC reply from a coupled process, a spooler service:

0: kd> !thread 88db89a8 1f
THREAD 88db89a8  Cid 0764.31b8  Teb: 7ffdc000 Win32Thread: bc2dc980 WAIT: (Unknown) UserMode Non-Alertable
    88db8b94  Semaphore Limit 0x1
Waiting for reply to LPC MessageId 0016346d:
Current LPC port e179d780
Not impersonating
DeviceMap                 e157b068
Owning Process            8a95f690       Image:         ApplicationA.exe
Attached Process          N/A            Image:         N/A
Wait Start TickCount      2750643        Ticks: 153453 (0:00:39:57.703)
Context Switch Count      2120                 LargeStack
UserTime                  00:00:00.031
KernelTime                00:00:00.078
Win32 Start Address 0×0015b4fd
LPC Server thread working on message Id 15b4fd
Start Address kernel32!BaseThreadStartThunk (0×77e617ec)
Stack Init b7848000 Current b7847c08 Base b7848000 Limit b7845000 Call 0
Priority 9 BasePriority 8 PriorityDecrement 0
Kernel stack not resident.
ChildEBP RetAddr 
b7847c20 80833ec5 nt!KiSwapContext+0×26
b7847c4c 80829bc0 nt!KiSwapThread+0×2e5
b7847c94 80920f28 nt!KeWaitForSingleObject+0×346
b7847d50 8088ad3c nt!NtRequestWaitReplyPort+0×776
b7847d50 7c8285ec nt!KiFastCallEntry+0xfc (TrapFrame @ b7847d64)
00e6e150 7c82787b ntdll!KiFastSystemCallRet
00e6e154 77c80a6e ntdll!NtRequestWaitReplyPort+0xc
00e6e1a8 77c7fcf0 RPCRT4!LRPC_CCALL::SendReceive+0×230
00e6e1b4 77c80673 RPCRT4!I_RpcSendReceive+0×24
00e6e1c8 77ce315a RPCRT4!NdrSendReceive+0×2b
00e6e5b0 7307ca61 RPCRT4!NdrClientCall2+0×22e
00e6e5c8 73082e15 WINSPOOL!RpcAddPrinter+0×1c
00e6ea50 0040dd93 WINSPOOL!AddPrinterW+0×102
00e6ef40 0040f535 ApplicationA!CreatePrinter+0×8d9

[…]
00e6f8b4 77ce33e1 RPCRT4!Invoke+0×30
00e6fcb4 77ce35c4 RPCRT4!NdrStubCall2+0×299
00e6fcd0 77c7ff7a RPCRT4!NdrServerCall2+0×19
00e6fd04 77c8042d RPCRT4!DispatchToStubInCNoAvrf+0×38
00e6fd58 77c80353 RPCRT4!RPC_INTERFACE::DispatchToStubWorker+0×11f
00e6fd7c 77c7e0d4 RPCRT4!RPC_INTERFACE::DispatchToStub+0xa3
00e6fdbc 77c7e080 RPCRT4!RPC_INTERFACE::DispatchToStubWithObject+0xc0
00e6fdfc 77c812f0 RPCRT4!LRPC_SCALL::DealWithRequestMessage+0×41e
00e6fe20 77c88678 RPCRT4!LRPC_ADDRESS::DealWithLRPCRequest+0×127
00e6ff84 77c88792 RPCRT4!LRPC_ADDRESS::ReceiveLotsaCalls+0×430
00e6ff8c 77c8872d RPCRT4!RecvLotsaCallsWrapper+0xd
00e6ffac 77c7b110 RPCRT4!BaseCachedThreadRoutine+0×9d
00e6ffb8 77e64829 RPCRT4!ThreadStartRoutine+0×1b

We follow an LPC wait chain and see a thread blocked by a message box:

0: kd> !lpc message 0016346d
Searching message 16346d in threads …
    Server thread 891e9db0 is working on message 16346d                        
Client thread 88db89a8 waiting a reply from 16346d                         
Searching thread 88db89a8 in port rundown queues …

0: kd> !thread 891e9db0 1f
THREAD 891e9db0  Cid 045c.3b3c  Teb: 7ffa7000 Win32Thread: bc2d4ea8 WAIT: (Unknown) UserMode Non-Alertable
    88f5c850  SynchronizationEvent
Impersonation token:  e76f5030 (Level Impersonation)
DeviceMap                 e157b068
Owning Process            8ab10460       Image:         spoolsv.exe
Attached Process          N/A            Image:         N/A
Wait Start TickCount      2414962        Ticks: 489134 (0:02:07:22.718)
Context Switch Count      6797                 LargeStack
UserTime                  00:00:00.359
KernelTime                00:00:01.437
Win32 Start Address 0×0016346d
LPC Server thread working on message Id 16346d
Start Address kernel32!BaseThreadStartThunk (0×77e617ec)
Stack Init b6aaa000 Current b6aa9c50 Base b6aaa000 Limit b6aa6000 Call 0
Priority 11 BasePriority 8 PriorityDecrement 0
Kernel stack not resident.
ChildEBP RetAddr 
b6aa9c68 80833ec5 nt!KiSwapContext+0×26
b6aa9c94 80829bc0 nt!KiSwapThread+0×2e5
b6aa9cdc bf89aacc nt!KeWaitForSingleObject+0×346
b6aa9d38 bf89d952 win32k!xxxSleepThread+0×1be
b6aa9d4c bf89e310 win32k!xxxRealWaitMessageEx+0×12
b6aa9d5c 8088ad3c win32k!NtUserWaitMessage+0×14
b6aa9d5c 7c8285ec nt!KiFastCallEntry+0xfc (TrapFrame @ b6aa9d64)
021267bc 7739bf53 ntdll!KiFastSystemCallRet
021267f4 7738965e USER32!NtUserWaitMessage+0xc
0212681c 7739f762 USER32!InternalDialogBox+0xd0
02126adc 7739f047 USER32!SoftModalMessageBox+0×94b
02126c2c 7739eec9 USER32!MessageBoxWorker+0×2ba
02126c84 773d7d0d USER32!MessageBoxTimeoutW+0×7a
02126cb8 773c42c8 USER32!MessageBoxTimeoutA+0×9c
02126cd8 773c42a4 USER32!MessageBoxExA+0×1b
02126cf4 6dfcf8c2 USER32!MessageBoxA+0×45
WARNING: Frame IP not in any known module. Following frames may be wrong.
02126d98 530a2935 0×6dfcf8c2
02126d9c 20656d6f 0×530a2935
[…]

Looking at this thread waiting time we see the problem happened about 1 hour 20 minutes before our hanging application attempted to print. We want to see other threads in the spooler but most of them are paged out:

0: kd> !process 8ab10460 1f

[...]

THREAD 8a6aedb0  Cid 045c.1090  Teb: 7ffab000 Win32Thread: bc33bbe8 WAIT: (Unknown) UserMode Non-Alertable
    8abb2f40  SynchronizationEvent
    IRP List:
        8a0a2ed0: (0006,0094) Flags: 00000800  Mdl: 00000000
Impersonation token:  e7c94760 (Level Impersonation)
DeviceMap                 e9194e68
Owning Process            8ab10460       Image:         spoolsv.exe
Attached Process          N/A            Image:         N/A
Wait Start TickCount      1897118        Ticks: 1006978 (0:04:22:14.031)
Context Switch Count      9711                 LargeStack
UserTime                  00:00:00.343
KernelTime                00:00:02.812
Win32 Start Address 0x00165a40
LPC Server thread working on message Id 165a40
Start Address kernel32!BaseThreadStartThunk (0x77e617ec)
Stack Init b7c28000 Current b7c27c60 Base b7c28000 Limit b7c24000 Call 0
Priority 13 BasePriority 8 PriorityDecrement 0
Kernel stack not resident.

[...] 

Coming back to our message box thread we inspect MessageBox parameters to see the message:

0: kd> .process /r /p 8ab10460
Implicit process is now 8ab10460
Loading User Symbols

0: kd> !thread 891e9db0
[...]
ChildEBP RetAddr  Args to Child             
[...]
02126cf4 6dfcf8c2 00000000 02126d4c 02126d2c USER32!MessageBoxA+0×45
[…]

0: kd> da /c 100 02126d4c
02126d4c  "Function address 0×77e63ce4 caused a protection fault. (exception code 0xc0000005).

0: kd> ln 0x77e63ce4
(77e63c87)   kernel32!GetModuleFileNameW+0×83   |  (77e63d7a)   kernel32!GetProcAddress

We see that the thread experienced an access violation in GetModuleFileName API call.

We also see that stack trace is incorrect and we try to reconstruct the point of exception by looking at thread raw stack and searching for any hidden exception

0: kd> .thread 891e9db0
Implicit thread is now 891e9db0

0: kd> !teb
TEB at 7ffa7000
    ExceptionList:        0212757c
    StackBase:            02130000
    StackLimit:           02123000

    SubSystemTib:         00000000
    FiberData:            00001e00
    ArbitraryUserPointer: 00000000
    Self:                 7ffa7000
    EnvironmentPointer:   00000000
    ClientId:             0000045c . 00003b3c
    RpcHandle:            00000000
    Tls Storage:          00000000
    PEB Address:          7ffde000
    LastErrorValue:       0
    LastStatusValue:      c0000034
    Count Owned Locks:    0
    HardErrorMode:        0

0: kd> dds 02123000 02130000
02123000  00000000
02123004  00000000
02123008  00000000
0212300c  00000000
[...]
021275a8  02127648
021275ac  0212aa0c
021275b0  7c8315c2 ntdll!RtlDispatchException+0×91
021275b4  02127648
021275b8  0212aa0c
021275bc  02127664

021275c0  02127624
021275c4  6dfd0a54
021275c8  063b0000
021275cc  02127648
021275d0  7c8877ec ntdll!PebLdr+0xc
021275d4  77e61a60 kernel32!_except_handler3
021275d8  77e69cf8 kernel32!`string’+0xc
021275dc  ffffffff
021275e0  00080000
021275e4  00089878
021275e8  77b92e40 VERSION!`string’
021275ec  00000000
021275f0  00000000
021275f4  00017608
021275f8  02127540
021275fc  77b92e40 VERSION!`string’
02127600  02127648
02127604  7c828270 ntdll!_except_handler3
02127608  7c829f60 ntdll!CheckHeapFillPattern+0×64
0212760c  ffffffff
02127610  7c829f59 ntdll!RtlFreeHeap+0×70f
02127614  77e62444 kernel32!LocalFree+0×27
02127618  00080000
0212761c  00000000
02127620  77e62457 kernel32!LocalFree+0×108
02127624  5dff1510
02127628  02130000
0212762c  00000000
02127630  0212797c
02127634  7c82855e ntdll!KiUserExceptionDispatcher+0xe
02127638  02123000
0212763c  02127664

02127640  02127648
02127644  02127664
02127648  c0000005
0212764c  00000000
02127650  00000000
02127654  77e63ce4 kernel32!GetModuleFileNameW+0×83
02127658  00000002
0212765c  00000000
02127660  00000018
02127664  0001003f
02127668  00000000
0212766c  00000000
02127670  00000000
[…]

0: kd> .cxr 02127664
eax=00000000 ebx=00000000 ecx=7c82d279 edx=7c8877a0 esi=7c8877ec edi=063b0000
eip=77e63ce4 esp=02127930 ebp=0212797c iopl=0  nv up ei ng nz ac pe cy
cs=001b  ss=0023  ds=0023  es=0023  fs=003b  gs=0000  efl=00010297
kernel32!GetModuleFileNameW+0x83:
001b:77e63ce4 3b7818          cmp     edi,dword ptr [eax+18h] ds:0023:00000018=????????

0: kd> kv
  *** Stack trace for last set context - .thread/.cxr resets it
ChildEBP RetAddr  Args to Child             
0212797c 77390f3d 063b0000 021279d0 00000104 kernel32!GetModuleFileNameW+0x83
02127990 77390f13 063b0000 021279d0 00000104 USER32!WowGetModuleFileName+0x14
02127bdc 7738bcbd 063b0000 00000097 00000003 USER32!LoadIcoCur+0x76
02127bfc 06425db5 063b0000 00000097 02127d28 USER32!LoadIconW+0×1b
WARNING: Frame IP not in any known module. Following frames may be wrong.
02127c38 7c82cefe 00000704 00001f10 02127c58 0×6425db5
02127c4c 7c82ceb0 00001f10 00002614 00000704 ntdll!ARRAY_FITS+0×29
02127c8c c0150008 00000000 03923f68 0000001d ntdll!RtlpLocateActivationContextSection+0×1da
02127d20 7c82e0ef 00000000 00000000 00071f10 0xc0150008

What we see is the fact that the thread tried to load an icon from 063b0000 module not listed in lmv command output list perhaps because it was not fully loaded yet. We see that the thread also holds a loader critical section:

0: kd> !cs -l -o -s
-----------------------------------------
DebugInfo          = 0x7c8877c0
Critical section   = 0×7c8877a0 (ntdll!LdrpLoaderLock+0×0)
LOCKED
LockCount          = 0xC
WaiterWoken        = No
OwningThread       = 0×00003b3c
RecursionCount     = 0×1
LockSemaphore      = 0×108
SpinCount          = 0×00000000
OwningThread       = .thread 891e9db0
ntdll!RtlpStackTraceDataBase is NULL. Probably the stack traces are not enabled.

However we can inspect the unknown module PE headers left in memory:

 0: kd> !dh 063b0000

File Type: DLL
FILE HEADER VALUES
     14C machine (i386)
       4 number of sections
42B9172C time date stamp Wed Jun 22 08:45:48 2005

       0 file pointer to symbol table
       0 number of symbols
      E0 size of optional header
    210E characteristics
            Executable
            Line numbers stripped
            Symbols stripped
            32 bit word machine
            DLL

[...]

Debug Directories(1)
 Type       Size     Address  Pointer
 cv           16       10198     f598 Format: NB10, 42b9172c, 1, c:\printerdriver123UI.pdb

[...]

We see that the module name is printerdriver123UI. So we should recommend to check the printer driver package containing this component. lmv command lists many printer drivers with the similar name prefix and we look at thread execution residue to get better stack trace and find the corresponding printer. We know that the application called AddPrinter API and therefore we try to find the corresponding call on the raw stack that we examined for hidden exceptions before:

[...]
0212ef2c  01f20390
0212ef30  0212f3f8
0212ef34  76144799 localspl!SplAddPrinter+0×5f3
0212ef38  01efb730
0212ef3c  00000003
[…]
0212f3f8  0212f424
0212f3fc  76144ab2 localspl!LocalAddPrinterEx+0×2e
0212f400  00000000
[…]
0212f424  0212f874
0212f428  7407018b SPOOLSS!AddPrinterExW+0×151
0212f42c  00000000
[…]
0212f874  0212f890
0212f878  74070254 SPOOLSS!AddPrinterW+0×17
0212f87c  00000000
[…]

0: kd> kv L=0212ef30
ChildEBP RetAddr  Args to Child             
0212ef30 76144799 01efb730 00000003 00000000 kernel32!GetModuleFileNameW+0x83
0212f3f8 76144ab2 00000000 00000002 011116e0 localspl!SplAddPrinter+0x5f3
0212f424 7407018b 00000000 00000002 011116e0 localspl!LocalAddPrinterEx+0x2e
0212f874 74070254 00000000 00000002 011116e0 SPOOLSS!AddPrinterExW+0x151
0212f890 0100792d 00000000 00000002 011116e0 SPOOLSS!AddPrinterW+0×17
0212f8ac 01006762 00000000 0118f968 00159180 spoolsv!YAddPrinter+0×75
0212f8d0 77c80193 00000000 0118f968 00159180 spoolsv!RpcAddPrinter+0×37
0212f8f8 77ce33e1 0100672b 0212fae0 00000005 RPCRT4!Invoke+0×30
0212fcf8 77ce35c4 00000000 00000000 000cefbc RPCRT4!NdrStubCall2+0×299
0212fd14 77c7ff7a 000cefbc 000bbd20 000cefbc RPCRT4!NdrServerCall2+0×19
0212fd48 77c8042d 01002c47 000cefbc 0212fdec RPCRT4!DispatchToStubInCNoAvrf+0×38
0212fd9c 77c80353 00000005 00000000 0100d228 RPCRT4!RPC_INTERFACE::DispatchToStubWorker+0×11f
0212fdc0 77c811dc 000cefbc 00000000 0100d228 RPCRT4!RPC_INTERFACE::DispatchToStub+0xa3
0212fdfc 77c812f0 000c9e80 000a7af0 0014ce70 RPCRT4!LRPC_SCALL::DealWithRequestMessage+0×42c
0212fe20 77c88678 000a7b28 0212fe38 000c9e80 RPCRT4!LRPC_ADDRESS::DealWithLRPCRequest+0×127
0212ff84 77c88792 0212ffac 77c8872d 000a7af0 RPCRT4!LRPC_ADDRESS::ReceiveLotsaCalls+0×430
0212ff8c 77c8872d 000a7af0 00000000 00000000 RPCRT4!RecvLotsaCallsWrapper+0xd
0212ffac 77c7b110 000885d8 0212ffec 77e64829 RPCRT4!BaseCachedThreadRoutine+0×9d
0212ffb8 77e64829 000a3750 00000000 00000000 RPCRT4!ThreadStartRoutine+0×1b
0212ffec 00000000 77c7b0f5 000a3750 00000000 kernel32!BaseThreadStart+0×34

 
Examination of the server-side AddPrinter arguments reveals the printer name:

0: kd> dpu 011116e0 l5
011116e0  00000000
011116e4  00158f48
011116e8  00000000
011116ec  00158fe4
011116f0  00159040 "PRINTER NAME ABCDEF

- Dmitry Vostokov @ DumpAnalysis.org -

Crash Dump Analysis Patterns (Part 82)

Monday, March 9th, 2009

One pattern I recently discovered is the partitioning of anomalous debugger output from debugging commands into several disjoint or weakly linked classes. I call the pattern Semantic Split. It is better characterized as a partition of the memory analysis pattern, for example, Blocked Thread, into classes with different semantics, for example, blocked display threads and blocked remote share threads. Here is one short example of it found in the output of !locks and !process 0 ff WinDbg commands from a complete memory dump forced on a hanging server. The output shows several blocked threads and wait chains of executive resources (some shared locks have been removed for clarity):

0: kd> !locks
**** DUMP OF ALL RESOURCE OBJECTS ****

Resource @ 0×88eeeaf0    Exclusively owned
    Contention Count = 809254
    NumberOfExclusiveWaiters = 4
     Threads: 88a26db0-01<*>
     Threads Waiting On Exclusive Access:
              88c6b6d0       8877b020       88a7e020       88938db0      

Resource @ 0×88badb20    Exclusively owned
    Contention Count = 9195
    NumberOfExclusiveWaiters = 4
     Threads: 88842020-02<*>
     Threads Waiting On Exclusive Access:
              88a8b170       89069450       88c4d020       88a26db0      

Resource @ 0x88859cc0    Exclusively owned
    Contention Count = 51021
    NumberOfExclusiveWaiters = 8
     Threads: 886f1c50-01<*>
     Threads Waiting On Exclusive Access:
              88e04db0       886785c0       8851edb0       896ee890      
              8869fb50       886d6498       889aa918       88c2da38      

Resource @ 0x881cc138    Exclusively owned
    Contention Count = 173698
    NumberOfExclusiveWaiters = 4
     Threads: 87e72598-01<*>
     Threads Waiting On Exclusive Access:
              88392020       8918c8d8       88423020       880eba50      

Resource @ 0x884ffab0    Exclusively owned
    Contention Count = 3363
    NumberOfExclusiveWaiters = 2
     Threads: 8807c5b8-02<*>
     Threads Waiting On Exclusive Access:
              87e72598       881c12a8      

Resource @ 0x87cd6d48    Exclusively owned
    Contention Count = 242361
    NumberOfExclusiveWaiters = 5
     Threads: 87540718-01<*>
     Threads Waiting On Exclusive Access:
              878ceaf0       8785ac50       8884a7b8       87c4ca28      
              89ab5db0      

Resource @ 0x87c44d08    Exclusively owned
    Contention Count = 2560
    NumberOfExclusiveWaiters = 1
     Threads: 87540718-01<*>
     Threads Waiting On Exclusive Access:
              87c4e468      

Resource @ 0×87bf51d8    Exclusively owned
    Contention Count = 3
    NumberOfSharedWaiters = 3
     Threads: 89e76db0-01<*> 8739ac50-01    86f5d1c8-01    870f4db0-01   

Resource @ 0×888bfc38    Exclusively owned
    Contention Count = 3
    NumberOfSharedWaiters = 3
     Threads: 88a10db0-01<*> 86c94198-01    86dac598-01    86d85c50-01    

The first group of locks (one of them shown in blue) shows various problems with ComponentA module:

0: kd> !thread 88842020 1f
THREAD 88842020  Cid 1cf8.1b28  Teb: 7ffdb000 Win32Thread: bc25e8c0 WAIT: (Unknown) KernelMode Non-Alertable
    88842098  NotificationTimer
Not impersonating
DeviceMap                 e3813fd0
Owning Process            888c5d88       Image:         ApplicationA.exe
Attached Process          N/A            Image:         N/A
Wait Start TickCount      1163714        Ticks: 0
Context Switch Count      35781                 LargeStack
UserTime                  00:00:00.453
KernelTime                00:00:01.109
Win32 Start Address 0x77ec3ea5
Start Address kernel32!BaseThreadStartThunk (0x7c8217ec)
Stack Init b5cc4bd0 Current b5cc4614 Base b5cc5000 Limit b5cbf000 Call b5cc4bd8
Priority 9 BasePriority 9 PriorityDecrement 0
ChildEBP RetAddr 
b5cc462c 80833ec5 nt!KiSwapContext+0x26
b5cc4658 80829069 nt!KiSwapThread+0x2e5
b5cc46a0 bf8981b3 nt!KeDelayExecutionThread+0x2ab
b5cc46c4 bf898422 ComponentA!LockGUIHandle+0×6d
[…]
b5cc49e8 80a63456 nt!KiFastCallEntry+0xcd
[…]

The second group of locks (shown in red) shows the problem with ComponentB module:

0: kd> !thread 89e76db0 1f
THREAD 89e76db0  Cid 0004.0624  Teb: 00000000 Win32Thread: 00000000 WAIT: (Unknown) KernelMode Alertable
    89e76e28  NotificationTimer
Not impersonating
DeviceMap                 e1006e10
Owning Process            8b581648       Image:         System
Attached Process          N/A            Image:         N/A
Wait Start TickCount      1163714        Ticks: 0
Context Switch Count      545            
UserTime                  00:00:00.000
KernelTime                00:00:00.015
Start Address 0xb9003c20
Stack Init b9148000 Current b9147abc Base b9148000 Limit b9145000 Call 0
Priority 16 BasePriority 8 PriorityDecrement 0
ChildEBP RetAddr 
b9147ad4 80833ec5 nt!KiSwapContext+0x26
b9147b00 80829069 nt!KiSwapThread+0x2e5
b9147b48 b8fc9353 nt!KeDelayExecutionThread+0x2ab
b9147b74 b8ff9460 ComponentB!DeleteShareConnection+0×203
[…]
b9147ddc 8088f61e nt!PspSystemThreadStartup+0×2e
00000000 00000000 nt!KiThreadStartup+0×16

Looking at the list of all threads we see another classes of blocked threads, one is involving ComponentC module in user space:

0: kd> !thread 86c21db0 1f
THREAD 86c21db0  Cid 0fac.5260  Teb: 7ff6a000 Win32Thread: 00000000 WAIT: (Unknown) UserMode Non-Alertable
    869f2f68  SynchronizationEvent
IRP List:
    87fe3148: (0006,0220) Flags: 00000830  Mdl: 00000000
Not impersonating
DeviceMap                 e1006e10
Owning Process            896ccc28       Image:         ServiceA.exe
Attached Process          N/A            Image:         N/A
Wait Start TickCount      1163714        Ticks: 0
Context Switch Count      22            
UserTime                  00:00:00.000
KernelTime                00:00:00.000
Win32 Start Address 0×005c1de0
LPC Server thread working on message Id 5c1de0
Start Address kernel32!BaseThreadStartThunk (0×7c8217ec)
Stack Init b9787000 Current b9786c60 Base b9787000 Limit b9784000 Call 0
Priority 9 BasePriority 8 PriorityDecrement 0
ChildEBP RetAddr 
b9786c78 80833ec5 nt!KiSwapContext+0×26
b9786ca4 80829bc0 nt!KiSwapThread+0×2e5
b9786cec 8093b034 nt!KeWaitForSingleObject+0×346
b9786d50 8088ad3c nt!NtWaitForSingleObject+0×9a
b9786d50 7c9485ec nt!KiFastCallEntry+0xfc
03f8f984 7c821c8d ntdll!KiFastSystemCallRet
03f8f998 10097728 kernel32!WaitForSingleObject+0×12
03f8f9bc 10008164 ComponentC!ComponentB_Control+0×68
[…]
03f8ffec 00000000 kernel32!BaseThreadStart+0×34

This thread holds a mutant and blocks a dozen of other threads in ServiceA.exe, for example:

THREAD 8aa7cb40  Cid 0fac.0110  Teb: 7ffad000 Win32Thread: 00000000 WAIT: (Unknown) UserMode Non-Alertable
       87764550  Mutant - owning thread 86c21db0

From the function name we can infer that ComponentC controls ComponentB and this makes both blocked threads weakly connected.

Another thread in ServiceB involves DriverA module and blocks a thread ServiceA:

0: kd> !thread 8899e778 1f
THREAD 8899e778  Cid 01b0.13e0  Teb: 7ffdc000 Win32Thread: 00000000 WAIT: (Unknown) KernelMode Non-Alertable
    8aadb6e0  SynchronizationEvent
    8899e7f0  NotificationTimer
IRP List:
    86f21de0: (0006,0220) Flags: 00000884  Mdl: 00000000
Not impersonating
DeviceMap                 e1006e10
Owning Process            8ab3d020       Image:         ServiceB.exe
Attached Process          N/A            Image:         N/A
Wait Start TickCount      1163714        Ticks: 0
Context Switch Count      2            
UserTime                  00:00:00.000
KernelTime                00:00:00.000
Win32 Start Address 0x005c1a3c
LPC Server thread working on message Id 5c1a3c
Start Address 0x48589bb3
Stack Init aecee000 Current aeced768 Base aecee000 Limit aeceb000 Call 0
Priority 12 BasePriority 11 PriorityDecrement 0
ChildEBP RetAddr 
aeced780 80833ec5 nt!KiSwapContext+0x26
aeced7ac 80829bc0 nt!KiSwapThread+0x2e5
aeced7f4 badffece nt!KeWaitForSingleObject+0x346
WARNING: Stack unwind information not available. Following frames may be wrong.
aeced824 bae00208 DriverA+0×1ece
aeced868 bae0e45a DriverA+0×2208
aeced8a0 8081e095 DriverA+0×1045a

aeced8b4 b946673b nt!IofCallDriver+0×45
[…]

0: kd> !thread 8776c220 1f
THREAD 8776c220  Cid 0fac.5714  Teb: 7ff66000 Win32Thread: 00000000 WAIT: (Unknown) UserMode Non-Alertable
    8776c40c  Semaphore Limit 0x1
Waiting for reply to LPC MessageId 005c1a3c:
Current LPC port e213b0c8
Not impersonating
DeviceMap                 e1006e10
Owning Process            896ccc28       Image:         ServiceA.exe
Attached Process          N/A            Image:         N/A
Wait Start TickCount      1163714        Ticks: 0
Context Switch Count      12            
UserTime                  00:00:00.000
KernelTime                00:00:00.000
Win32 Start Address 0x75fddd73
Start Address kernel32!BaseThreadStartThunk (0x7c8217ec)
Stack Init aecf2000 Current aecf1c08 Base aecf2000 Limit aecef000 Call 0
Priority 8 BasePriority 8 PriorityDecrement 0
ChildEBP RetAddr 
aecf1c20 80833ec5 nt!KiSwapContext+0x26
aecf1c4c 80829bc0 nt!KiSwapThread+0x2e5
aecf1c94 80920f28 nt!KeWaitForSingleObject+0x346
aecf1d50 8088ad3c nt!NtRequestWaitReplyPort+0x776
aecf1d50 7c9485ec nt!KiFastCallEntry+0xfc
0408f594 75fde6b5 ntdll!KiFastSystemCallRet
0408f5fc 75fdd65e ComponentD!ServiceB_Request+0×1ae
[…]
0408ffec 00000000 kernel32!BaseThreadStart+0×34

In ServiceA we can also find several threads blocked by a RPC request to ServiceC:

0: kd> !thread 87397020 1f
THREAD 87397020  Cid 0fac.38cc  Teb: 7ff80000 Win32Thread: 00000000 WAIT: (Unknown) UserMode Non-Alertable
    87397098  NotificationTimer
Not impersonating
DeviceMap                 e1006e10
Owning Process            896ccc28       Image:         ServiceA.exe
Attached Process          N/A            Image:         N/A
Wait Start TickCount      1163714        Ticks: 0
Context Switch Count      7807            
UserTime                  00:00:00.125
KernelTime                00:00:00.109
Win32 Start Address 0x005c21a8
LPC Server thread working on message Id 5c21a8
Start Address kernel32!BaseThreadStartThunk (0x7c8217ec)
Stack Init b4ecf000 Current b4ecec80 Base b4ecf000 Limit b4ecc000 Call 0
Priority 13 BasePriority 8 PriorityDecrement 0
ChildEBP RetAddr 
b4ecec98 80833ec5 nt!KiSwapContext+0x26
b4ececc4 80829069 nt!KiSwapThread+0x2e5
b4eced0c 80996d8a nt!KeDelayExecutionThread+0x2ab
b4eced54 8088ad3c nt!NtDelayExecution+0x84
b4eced54 7c9485ec nt!KiFastCallEntry+0xfc
03a1f178 7c8024ed ntdll!KiFastSystemCallRet
03a1f188 77c5e51a kernel32!Sleep+0xf
03a1f198 77c36a44 RPCRT4!OSF_BINDING_HANDLE::Unbind+0x3a
03a1f1b0 77c36a08 RPCRT4!OSF_BINDING_HANDLE::~OSF_BINDING_HANDLE+0x32
03a1f1bc 77c369f1 RPCRT4!OSF_BINDING_HANDLE::`scalar deleting destructor'+0xd
03a1f1cc 77c5250a RPCRT4!OSF_BINDING_HANDLE::BindingFree+0x30
03a1f1dc 77f48c00 RPCRT4!RpcBindingFree+0x4e
03a1f1e8 77f48be2 ADVAPI32!RpcpUnbindRpc+0x15
03a1f1f4 77c3688e ADVAPI32!PLSAPR_SERVER_NAME_unbind+0xd
03a1f21c 77c369bb RPCRT4!GenericHandleMgr+0xca
03a1f23c 77c36983 RPCRT4!GenericHandleUnbind+0x31
03a1f260 77cb31b2 RPCRT4!NdrpClientFinally+0x5b
03a1f26c 77cb317a RPCRT4!NdrClientCall2+0x324
03a1f64c 77f4a0a1 RPCRT4!NdrClientCall2+0x2ea
03a1f664 77f4a022 ComponentD!ServiceC_Request+0×1c
[…]
03a1f8f8 77cb33e1 RPCRT4!Invoke+0×30
03a1fcf8 77cb35c4 RPCRT4!NdrStubCall2+0×299
03a1fd14 77c4ff7a RPCRT4!NdrServerCall2+0×19
03a1fd48 77c5042d RPCRT4!DispatchToStubInCNoAvrf+0×38
03a1fd9c 77c50353 RPCRT4!RPC_INTERFACE::DispatchToStubWorker+0×11f
03a1fdc0 77c511dc RPCRT4!RPC_INTERFACE::DispatchToStub+0xa3
03a1fdfc 77c512f0 RPCRT4!LRPC_SCALL::DealWithRequestMessage+0×42c
03a1fe20 77c58678 RPCRT4!LRPC_ADDRESS::DealWithLRPCRequest+0×127
03a1ff84 77c58792 RPCRT4!LRPC_ADDRESS::ReceiveLotsaCalls+0×430
03a1ff8c 77c5872d RPCRT4!RecvLotsaCallsWrapper+0xd
03a1ffac 77c4b110 RPCRT4!BaseCachedThreadRoutine+0×9d
03a1ffb8 7c824829 RPCRT4!ThreadStartRoutine+0×1b
03a1ffec 00000000 kernel32!BaseThreadStart+0×34

In ServiceC we see several RPC processing threads blocked by ComponentE:   

0: kd> !thread 873acb40 1f
THREAD 873acb40  Cid 023c.3a00  Teb: 7ff93000 Win32Thread: 00000000 WAIT: (Unknown) UserMode Non-Alertable
    89f0aeb0  Semaphore Limit 0x1
    873acbb8  NotificationTimer
IRP List:
    89838a00: (0006,0094) Flags: 00000900  Mdl: 00000000
    8705d4a0: (0006,0094) Flags: 00000800  Mdl: 00000000
    88bc9440: (0006,0094) Flags: 00000900  Mdl: 00000000
    87674af8: (0006,0094) Flags: 00000900  Mdl: 00000000
    86f2aa48: (0006,0094) Flags: 00000900  Mdl: 00000000
    87551290: (0006,0094) Flags: 00000900  Mdl: 00000000
Not impersonating
DeviceMap                 e1006e10
Owning Process            89dc0508       Image:         ServiceC.exe
Attached Process          N/A            Image:         N/A
Wait Start TickCount      1163714        Ticks: 0
Context Switch Count      16571            
UserTime                  00:00:00.250
KernelTime                00:00:00.703
Win32 Start Address RPCRT4!ThreadStartRoutine (0x77c4b0f5)
Start Address kernel32!BaseThreadStartThunk (0x7c8217ec)
Stack Init b2a9b000 Current b2a9ac60 Base b2a9b000 Limit b2a98000 Call 0
Priority 13 BasePriority 9 PriorityDecrement 0
ChildEBP RetAddr 
b2a9ac78 80833ec5 nt!KiSwapContext+0x26
b2a9aca4 80829bc0 nt!KiSwapThread+0x2e5
b2a9acec 8093b034 nt!KeWaitForSingleObject+0x346
b2a9ad50 8088ad3c nt!NtWaitForSingleObject+0x9a
b2a9ad50 7c9485ec nt!KiFastCallEntry+0xfc
022cf8d0 7c821c8d ntdll!KiFastSystemCallRet
022cf8e4 741269e5 kernel32!WaitForSingleObject+0x12
022cf8f8 7412cdca ComponentE!Enumerate+0×37
[…]
022cf944 77cb33e1 RPCRT4!Invoke+0×30
022cfd44 77cb35c4 RPCRT4!NdrStubCall2+0×299
022cfd60 77c4ff7a RPCRT4!NdrServerCall2+0×19
022cfd94 77c5042d RPCRT4!DispatchToStubInCNoAvrf+0×38
022cfde8 77c50353 RPCRT4!RPC_INTERFACE::DispatchToStubWorker+0×11f
022cfe0c 77c38e0d RPCRT4!RPC_INTERFACE::DispatchToStub+0xa3
022cfe40 77c38cb3 RPCRT4!OSF_SCALL::DispatchHelper+0×149
022cfe54 77c38c2b RPCRT4!OSF_SCALL::DispatchRPCCall+0×10d
022cfe84 77c38b5e RPCRT4!OSF_SCALL::ProcessReceivedPDU+0×57f
022cfea4 77c3e8db RPCRT4!OSF_SCALL::BeginRpcCall+0×194
022cff04 77c3e7b4 RPCRT4!OSF_SCONNECTION::ProcessReceiveComplete+0×435
022cff18 77c4b799 RPCRT4!ProcessConnectionServerReceivedEvent+0×21
022cff84 77c4b9b5 RPCRT4!LOADABLE_TRANSPORT::ProcessIOEvents+0×1b8
022cff8c 77c5872d RPCRT4!ProcessIOEventsWrapper+0xd
022cffac 77c4b110 RPCRT4!BaseCachedThreadRoutine+0×9d
022cffb8 7c824829 RPCRT4!ThreadStartRoutine+0×1b
022cffec 00000000 kernel32!BaseThreadStart+0×34
 

Therefore we have 4 main groups of wait chain endpoints involving semantically disjoint ComponentA, ComponentB, ComponentE and DriverA modules. Although their module names do not infer disjointedness this example was abstracted from the real incident where respective modules were having different system functions and were even from different software vendors.

- Dmitry Vostokov @ DumpAnalysis.org -

Manual dump, dynamic memory corruption, blocked threads, stack trace collection, multiple exceptions, wait chains and deadlock: pattern cooperation

Sunday, March 8th, 2009

The following memory dump of a hanging process was manually generated:

Loading Dump File [Process.dmp]
User Mini Dump File with Full Memory: Only application data is available

Comment: 'Userdump generated complete user-mode minidump with Standalone function on COMPUTER'

Applying default analysis command shows the following stack traces:

0:000> .kframes 100
Default stack trace depth is 0n256 frames

0:000> !analyze -v 

[...] 

STACK_TEXT: 
009ef258 7c827d0b ntdll!KiFastSystemCallRet
009ef25c 7c83d236 ntdll!NtWaitForSingleObject+0xc
009ef298 7c83d281 ntdll!RtlpWaitOnCriticalSection+0×1a3
009ef2b8 7c82dabf ntdll!RtlEnterCriticalSection+0xa8

009ef358 7c82dab1 ntdll!LdrpGetProcedureAddress+0×128
009ef374 77e764ea ntdll!LdrGetProcedureAddress+0×18
009ef5d8 7c34c456 kernel32!UnhandledExceptionFilter+0×46f
009ef5f4 7c34957c msvcr71!_XcptFilter+0×15f
009ef600 7c34246e msvcr71!_endthreadex+0xb7
009ef628 7c828752 msvcr71!_except_handler3+0×61
009ef64c 7c828723 ntdll!ExecuteHandler2+0×26
009ef6f4 7c82855e ntdll!ExecuteHandler+0×24
009ef6f4 7c82be3e ntdll!KiUserExceptionDispatcher+0xe

009efa00 7c82a319 ntdll!RtlpFindEntry+0×68
009efc2c 7c3416b3 ntdll!RtlAllocateHeap+0×606
009efc6c 7c3416db msvcr71!_heap_alloc+0xe0
009efc74 7c360947 msvcr71!_nh_malloc+0×10
009efc80 0285f893 msvcr71!operator new+0xb

[…]
009effb8 77e64829 msvcr71!_endthreadex+0xa0
009effec 00000000 kernel32!BaseThreadStart+0×34

[...]

BUGCHECK_STR:  APPLICATION_FAULT_STATUS_BREAKPOINT_STATUS_BREAKPOINT

We see the exeption was dispatched because of heap corruption and the unhandled exception filter is blocked waiting for a critical section. We can immediately recommend to enable full page heap. However, let’s explore the dump file further. By listing all threads we find that there were 2 exceptions with the second one having the following stack trace:

0:000> ~*kb

[...]

  98  Id: be4.2ca4 Suspend: 1 Teb: 7ff68000 Unfrozen
ChildEBP RetAddr  Args to Child             
0f83c80c 7c827d0b 7c83d236 00000154 00000000 ntdll!KiFastSystemCallRet
0f83c810 7c83d236 00000154 00000000 00000000 ntdll!NtWaitForSingleObject+0xc
0f83c84c 7c83d281 00000154 00000004 0f83c8b0 ntdll!RtlpWaitOnCriticalSection+0×1a3
0f83c86c 7c82dabf 7c8877a0 00000000 00000000 ntdll!RtlEnterCriticalSection+0xa8

0f83c90c 7c82dab1 00000000 77e767cc 00000000 ntdll!LdrpGetProcedureAddress+0×128
0f83c928 77e764ea 00000000 77e767cc 00000000 ntdll!LdrGetProcedureAddress+0×18
0f83cb8c 77e792a3 0f83cbb4 77e61ac1 0f83cbbc kernel32!UnhandledExceptionFilter+0×46f
0f83cb94 77e61ac1 0f83cbbc 00000000 0f83cbbc kernel32!BaseThreadStart+0×4a
0f83cbbc 7c828752 0f83cca0 0f83ffdc 0f83ccbc kernel32!_except_handler3+0×61
0f83cbe0 7c828723 0f83cca0 0f83ffdc 0f83ccbc ntdll!ExecuteHandler2+0×26
0f83cc88 7c82855e 0f83c000 0f83ccbc 0f83cca0 ntdll!ExecuteHandler+0×24
0f83cc88 7c82be3e 0f83c000 0f83ccbc 0f83cca0 ntdll!KiUserExceptionDispatcher+0xe

0f83cf94 7c82a319 00340178 00000050 0b425f54 ntdll!RtlpFindEntry+0×68
0f83d1c0 7c3416b3 00340000 00000000 00000278 ntdll!RtlAllocateHeap+0×606
0f83d200 7c3416db 00000278 7c3416f8 00000278 msvcr71!_heap_alloc+0xe0
0f83d208 7c3416f8 00000278 00000000 003214fd msvcr71!_nh_malloc+0×10
0f83d214 003214fd 00000278 0f83d23c 023e5912 msvcr71!malloc+0xf

[…]
0f83f8b4 77ce33e1 0cf4ca00 0f83fa98 0000000c rpcrt4!Invoke+0×30
0f83fcb4 77ce35c4 00000000 00000000 0bf93d8c rpcrt4!NdrStubCall2+0×299
0f83fcd0 77c7ff7a 0bf93d8c 0beae6b8 0bf93d8c rpcrt4!NdrServerCall2+0×19
0f83fd04 77c8042d 0cf4f53e 0bf93d8c 0f83fdec rpcrt4!DispatchToStubInCNoAvrf+0×38
0f83fd58 77c80353 0000006f 00000000 0cf74358 rpcrt4!RPC_INTERFACE::DispatchToStubWorker+0×11f
0f83fd7c 77c7e0d4 0bf93d8c 00000000 0cf74358 rpcrt4!RPC_INTERFACE::DispatchToStub+0xa3
0f83fdbc 77c7e080 0bf93d8c 0bf93d44 00000000 rpcrt4!RPC_INTERFACE::DispatchToStubWithObject+0xc0
0f83fdfc 77c812f0 001c85c0 0bf0f510 00189d08 rpcrt4!LRPC_SCALL::DealWithRequestMessage+0×41e
0f83fe20 77c88678 0bf0f548 0f83fe38 001c85c0 rpcrt4!LRPC_ADDRESS::DealWithLRPCRequest+0×127
0f83ff84 77c88792 0f83ffac 77c8872d 0bf0f510 rpcrt4!LRPC_ADDRESS::ReceiveLotsaCalls+0×430
0f83ff8c 77c8872d 0bf0f510 00000000 00000000 rpcrt4!RecvLotsaCallsWrapper+0xd
0f83ffac 77c7b110 00145520 0f83ffec 77e64829 rpcrt4!BaseCachedThreadRoutine+0×9d
0f83ffb8 77e64829 0be74ce0 00000000 00000000 rpcrt4!ThreadStartRoutine+0×1b
0f83ffec 00000000 77c7b0f5 0be74ce0 00000000 kernel32!BaseThreadStart+0×34

[...] 

 102  Id: be4.2ac0 Suspend: 1 Teb: 7ff4f000 Unfrozen
ChildEBP RetAddr  Args to Child             
134df91c 7c827d0b 7c83d236 000004e4 00000000 ntdll!KiFastSystemCallRet
134df920 7c83d236 000004e4 00000000 00000000 ntdll!NtWaitForSingleObject+0xc
134df95c 7c83d281 000004e4 00000004 00340000 ntdll!RtlpWaitOnCriticalSection+0×1a3
134df97c 7c82a264 00340608 00000000 0000008c ntdll!RtlEnterCriticalSection+0xa8

134dfba4 7c3423aa 00340000 00000008 0000008c ntdll!RtlAllocateHeap+0×313
134dfbe4 7c3422cb 00000001 0000008c 00000000 msvcr71!calloc+0xe6

134dfbfc 7c81a352 7c340000 00000002 00000000 msvcr71!_CRTDLL_INIT+0×138
134dfc1c 7c82ed97 7c34229f 7c340000 00000002 ntdll!LdrpCallInitRoutine+0×14
134dfcb8 7c82ec9f 134dfd28 134dfd28 00000000 ntdll!LdrpInitializeThread+0×10d
134dfd14 7c8284c5 134dfd28 7c800000 00000000 ntdll!_LdrpInitialize+0×16f
00000000 00000000 00000000 00000000 00000000 ntdll!KiUserApcDispatcher+0×25

If we inspect the list of locked critical sections and compare stack trace parameters for threads 2ac0 and 2ca4 we would see them deadlocked:

0:000> !locks

CritSec ntdll!LdrpLoaderLock+0 at 7c8877a0
WaiterWoken        No
LockCount          6
RecursionCount     1
OwningThread       2ac0
EntryCount         0
ContentionCount    36
*** Locked

CritSec +340608 at 00340608
WaiterWoken        No
LockCount          32
RecursionCount     1
OwningThread       2ca4
EntryCount         0
ContentionCount    6f
*** Locked

[...]

Looking at other locks (omitted here) we can find additional wait chains but we employ default hang analysis command to find one chain for us:

0:000> !analyze -v -hang

[...]

LOADERLOCK_BLOCKED_API:  UnhandledExceptionFilter:LdrGetProcedureAddress: LdrpGetProcedureAddress:

LOADERLOCK_OWNER_API:  _LdrpInitialize:LdrpInitializeThread:LdrpCallInitRoutine:

DERIVED_WAIT_CHAIN: 

Dl Eid Cid     WaitType
-- --- ------- --------------------------
   2   be4.c20 Critical Section       -->
x  98  be4.2ca4 Critical Section       -->
x  102 be4.2ac0 Critical Section       --^

WAIT_CHAIN_COMMAND:  ~2s;k;;~98s;k;;~102s;k;;

BLOCKING_THREAD:  00002ca4

DEFAULT_BUCKET_ID:  APPLICATION_HANG_DEADLOCK_HeapCorruption

PRIMARY_PROBLEM_CLASS:  APPLICATION_HANG_DEADLOCK_HeapCorruption

[...]

- Dmitry Vostokov @ DumpAnalysis.org -

Busy system, blocked threads, wait chains and deadlock: pattern cooperation

Tuesday, March 3rd, 2009

In one kernel memory dump we can see the signs of a busy system where all processors are executing non-idle threads:

0: kd> !running

System Processors ff (affinity mask)
  Idle Processors 0

     Prcb      Current   Next   
  0  ffdff120  8b223928            ................
  1  f772f120  8a765380            ................
  2  f7737120  89365db0            ................
  3  f773f120  8833adb0            ................
  4  f7747120  889bbdb0            ................
  5  f774f120  8c085db0            ................
  6  f7757120  8aa79698            ................
  7  f775f120  896c0668            ................

When inspecting them we see that some are kernel worker threads without a process context, for example:

0: kd> !thread 8aa79698 1f
THREAD 8aa79698  Cid 0004.6edc  Teb: 00000000 Win32Thread: 00000000 RUNNING on processor 6
Not impersonating
DeviceMap                 d66008d0
Owning Process            8d15d648       Image:         System
Wait Start TickCount      2548878        Ticks: 3 (0:00:00:00.046)
Context Switch Count      248713            
UserTime                  00:00:00.000
KernelTime                00:00:00.906

Start Address nt!ExpWorkerThread (0×80881860)
Stack Init acfbc000 Current acfbbcec Base acfbc000 Limit acfb9000 Call 0
Priority 13 BasePriority 13 PriorityDecrement 0
ChildEBP RetAddr
WARNING: Stack unwind information not available. Following frames may be wrong.
acfbbaec b19483f3 driver+0×3f65
acfbbb6c 8081e095 driver+0×23f3
acfbbb80 af36044a nt!IofCallDriver+0×45
[…]
acfbbdac 8094bea4 nt!ExpWorkerThread+0xeb
acfbbddc 8088f57e nt!PspSystemThreadStartup+0×2e
00000000 00000000 nt!KiThreadStartup+0×16

some threads with an associated process context are running in kernel space:

0: kd> !thread 889bbdb0 1f
THREAD 889bbdb0  Cid 6c58.6f98  Teb: 00000000 Win32Thread: 00000000 RUNNING on processor 4
Not impersonating
DeviceMap                 d66008d0
Owning Process            89ad8b18       Image:         csrss.exe
Wait Start TickCount      2548880        Ticks: 1 (0:00:00:00.015)
Context Switch Count      129536            
UserTime                  00:00:00.000
KernelTime                00:00:00.312

Start Address displaydriver!Thread (0xbfad4a60)
Stack Init a439d000 Current a439cc70 Base a439d000 Limit a439a000 Call 0
Priority 13 BasePriority 10 PriorityDecrement 3
ChildEBP RetAddr
a439c004 bfad707f displaydriver!CalcRegion+0×30
[…]
a439cddc 8088f57e nt!PspSystemThreadStartup+0×2e
00000000 00000000 nt!KiThreadStartup+0×16

and some threads with an associated process context are running in user space:

0: kd> !thread 8c085db0 1f
THREAD 8c085db0  Cid 2318.231c  Teb: 7ffdd000 Win32Thread: b4b5ebe8 RUNNING on processor 5
Not impersonating
DeviceMap                 dc1a71f0
Owning Process            8b02e458       Image:         Application.EXE
Wait Start TickCount      2548881        Ticks: 0
Context Switch Count      725122                 LargeStack
UserTime                  00:00:01.625
KernelTime                00:00:03.062

Win32 Start Address 0×30001084
Start Address 0×7c8217f8
Stack Init ad648000 Current ad647c50 Base ad648000 Limit ad642000 Call 0
Priority 12 BasePriority 10 PriorityDecrement 0
ChildEBP RetAddr
0013fb7c 00000000 0×7c81b910

Because none of them consumed much CPU the patter of Spiking Thread is ruled out and CPU load can be explained by the number of active user sessions and this appears to be normal:

0: kd> !session
Sessions on machine: 50

However looking at ERESOURCE locks we see many blocked threads and signs of possible wait chains:

0: kd> !locks
**** DUMP OF ALL RESOURCE OBJECTS ****

Resource @ 0x8cbfaa68    Exclusively owned
    Contention Count = 22969
    NumberOfSharedWaiters = 1
    NumberOfExclusiveWaiters = 109
     Threads: 8a961db0-01<*> 8bf532b0-01   
     Threads Waiting On Exclusive Access:
              8b4532f0       884fc648       88c58a00       8a751360      
              88ed64f8       89aa6738       89870db0       881dedb0      
              8a6d7b40       8b4a4db0       89818ad0       8afcedb0      
              8a2ca020       88684db0       8b411020       89d595c0      
              8d1573f0       88d06020       8aed8b38       8a8c9020      
              8a5a0a50       8a1f63b0       89b66688       89bf1db0      
              880dab18       882e6730       895d8020       88e6d3f0      
              896e6748       89802100       8a604508       8907c5e8      
              8890a020       885e2300       8a061bd8       88445340      
              88113db0       8a680db0       89b53370       88c3f2a0      
              88a774f8       8834ddb0       89d78888       88386020      
              897ca8d8       8b3532d0       882341d0       8a4a9b80      
              87e7c4f8       895e5db0       8846f4e8       89df3db0      
              889b8b40       89d82db0       89e4b720       8aadadb0      
              8aa63020       88852020       8a249ba8       891b8c20      
              8b3f95f0       8aace760       8b470020       897ad388      
              8c07dba8       8a331628       896c74d0       8997cb40      
              88e133c8       886eddb0       8864e518       89ab5698      
              88d8bdb0       89996db0       8ac54d28       87f42020      
              882b1020       8857fdb0       895f3db0       88b0ab40      
              8a1aadb0       8b819020       8b3bf388       88315660      
              8a45db18       883fbdb0       88f53db0       87f209a0      
              8978ddb0       8840c868       8823c1c8       88277db0      
              89c0a8c8       88322940       8a475db0       8a6ad460      
              8a35a4c8       88e3da40       886b1b40       8886a2a0      
              8897d750       8b30bdb0       8a123020       8b0ad7f8      
              8a256930       885cedb0       88ec8db0       887d7ba8      
              88175b90      

Resource @ 0x8b8f09a8    Shared 1 owning threads
    Contention Count = 123597
    NumberOfSharedWaiters = 1
    NumberOfExclusiveWaiters = 6
     Threads: 88200840-01    8a92ddb0-01<*>
     Threads Waiting On Exclusive Access:
              8a317db0       8d151840       899acdb0       8a961db0
              891ac940       89ee5db0      

Resource @ 0x8ac79f08    Exclusively owned
    Contention Count = 717691
    NumberOfExclusiveWaiters = 12
     Threads: 8a5193f0-01<*>
     Threads Waiting On Exclusive Access:
              880e7b40       8a60adb0       8a543108       8a4be020      
              8a77c360       8a470730       87f12db0       8a4618d0      
              895c5600       8a942b98       8a453b40       8a3bf020      

Resource @ 0x8a73ed28    Exclusively owned
    Contention Count = 4
    NumberOfExclusiveWaiters = 2
     Threads: 8a45db18-01<*>
     Threads Waiting On Exclusive Access:
              8a412db0       8a542268      

Resource @ 0x8a621bf8    Exclusively owned
    Contention Count = 8532
    NumberOfExclusiveWaiters = 3
     Threads: 8a412db0-01<*>
     Threads Waiting On Exclusive Access:
              8a5193f0       8a60cdb0       8a595c78      

Resource @ 0x8a4c8b20    Exclusively owned
    Contention Count = 1
    NumberOfExclusiveWaiters = 1
     Threads: 8a92ddb0-01<*>
     Threads Waiting On Exclusive Access:
              89524a70      

Resource @ 0x8a43b0e8    Exclusively owned
    Contention Count = 1135854
    NumberOfSharedWaiters = 1
    NumberOfExclusiveWaiters = 9
     Threads: 8aaa3020-01<*> 88efb400-01   
     Threads Waiting On Exclusive Access:
              89f883b0       8a273a70       89f82c10       89fd9020      
              89ec0db0       89571290       89edcdb0       88930400      
              8845f4c8      

Resource @ 0x89f7dbe8    Exclusively owned
    Contention Count = 2
    NumberOfExclusiveWaiters = 2
     Threads: 891b8c20-01<*>
     Threads Waiting On Exclusive Access:
              89ecedb0       89fc3020      

Resource @ 0x89f82f28    Exclusively owned
    Contention Count = 26674
    NumberOfExclusiveWaiters = 2
     Threads: 89fc3020-01<*>
     Threads Waiting On Exclusive Access:
              8aaa3020       8a02db40      

Resource @ 0x89315320    Exclusively owned
    Contention Count = 509247
    NumberOfSharedWaiters = 1
    NumberOfExclusiveWaiters = 19
     Threads: 89261428-01<*> 89313a08-01   
     Threads Waiting On Exclusive Access:
              89cc7db0       8ad26528       8970db68       88ef64d0      
              8a629020       89450798       8825c9a8       89206378      
              8a7c7b90       89162890       8ae7c020       883318e0      
              88bd6358       89367db0       8952aaa0       8a817b40      
              881d65b8       8ab74db0       889202c0      

Resource @ 0x893872d8    Exclusively owned
    Contention Count = 5079
    NumberOfExclusiveWaiters = 3
     Threads: 896e6748-01<*>
     Threads Waiting On Exclusive Access:
              89261428       893bc3c8       892a88a8      

Resource @ 0x8924adf8    Exclusively owned
    Contention Count = 1
    NumberOfExclusiveWaiters = 1
     Threads: 88ed64f8-01<*>
     Threads Waiting On Exclusive Access:
              89146660      

Resource @ 0x890281b0    Exclusively owned
    Contention Count = 4
    NumberOfExclusiveWaiters = 4
     Threads: 88d06020-01<*>
     Threads Waiting On Exclusive Access:
              88b5c528       88c5aa98       87ef77b8       88c48b40      

Resource @ 0x88d40440    Exclusively owned
    Contention Count = 13
    NumberOfExclusiveWaiters = 1
     Threads: 899acdb0-01<*>
     Threads Waiting On Exclusive Access:
              895e6db0      

Resource @ 0x88ed0c20    Exclusively owned
    Contention Count = 2
    NumberOfExclusiveWaiters = 2
     Threads: 895e6db0-01<*>
     Threads Waiting On Exclusive Access:
              88ad7540       88b5f620      

Resource @ 0x894e7990    Exclusively owned
    Contention Count = 3852647
    NumberOfExclusiveWaiters = 12
     Threads: 881b14b8-01<*>
     Threads Waiting On Exclusive Access:
              88a13db0       87f12020       8aad7a20       8820a020      
              8824bdb0       88213db0       88eefdb0       88ab7550      
              889fe808       89df17a0       8aa83430       8a8f73c8      

Resource @ 0x88559288    Exclusively owned
    Contention Count = 7422
    NumberOfExclusiveWaiters = 3
     Threads: 880dab18-01<*>
     Threads Waiting On Exclusive Access:
              881b14b8       88311020       882ab660      

Resource @ 0x8aff12b0    Exclusively owned
    Contention Count = 6
    NumberOfExclusiveWaiters = 1
     Threads: 89524a70-01<*>
     Threads Waiting On Exclusive Access:
              8a92ddb0      

62174 total locks, 75 locks currently held

Starting with the thread 8a961db0 that blocks 109 other threads we can unravel the following deadlock:

109 threads -> 8a961db0 -> 8a92ddb0 -> 89524a70 -> 8a92ddb0 -> 89524a70 -> …

Looking at threads involved in the deadlock we see that they belong to the same process and deadlocked in kernel space when running through driverA.sys code:

0: kd> !thread 89524a70 1f
THREAD 89524a70  Cid 1fdc.26cc  Teb: 7ffdd000 Win32Thread: b4d0fea8 WAIT: (Unknown) KernelMode Non-Alertable
    89170648  SynchronizationEvent
    89524ae8  NotificationTimer
IRP List:
    88e7a008: (0006,0268) Flags: 00000000  Mdl: 00000000
    8a7cc228: (0006,0268) Flags: 00000000  Mdl: 00000000
    89e67b90: (0006,0268) Flags: 00000000  Mdl: 00000000 Not impersonating
DeviceMap                 e2e671d0
Owning Process            88c37020       Image:         ApplicationA.exe
Wait Start TickCount      2548760        Ticks: 121 (0:00:00:01.890)
Context Switch Count      4850                 LargeStack
UserTime                  00:00:00.734
KernelTime                00:00:01.718
Win32 Start Address 0×00404054
Start Address 0×7c8217f8
Stack Init 91971000 Current 91970278 Base 91971000 Limit 9196a000 Call 0
Priority 14 BasePriority 10 PriorityDecrement 4
ChildEBP RetAddr
91970290 80833ec5 nt!KiSwapContext+0×26
919702bc 80829bc0 nt!KiSwapThread+0×2e5
91970304 8087e0db nt!KeWaitForSingleObject+0×346
91970340 8087e2f5 nt!ExpWaitForResource+0xd5
91970360 b0a1cf6d nt!ExAcquireResourceExclusiveLite+0×8d
91970374 b0a08cef driverA+0×2ef6d
919703cc b0a089cc driverA+0×1acef
919703f4 b0a209d9 driverA+0×1a9cc
9197056c b0a20386 driverA+0×329d9
91970630 b0a1dc32 driverA+0×32386
919706e8 b0a20508 driverA+0×2fc32
919707ac b0a1eec1 driverA+0×32508
919708a0 b0a21e90 driverA+0×30ec1
91970930 b0a171c9 driverA+0×33e90
919709c4 b0a16c9d driverA+0×291c9
91970a38 b0a600b3 driverA+0×28c9d
91970a84 b0a45dda driverA+0×720b3
91970afc b0a4657a driverA+0×57dda
91970b48 8081e095 driverA+0×5857a

91970b5c f7876d28 nt!IofCallDriver+0×45
91970b88 8081e095 fltmgr!FltpDispatch+0×152
91970b9c f74fc6ca nt!IofCallDriver+0×45
91970bb4 f7876d28 driverB+0×56ca
91970be0 8081e095 fltmgr!FltpDispatch+0×152
91970bf4 b195a4e1 nt!IofCallDriver+0×45
91970c18 b195a5d0 driverC!PassThrough+0xd1
91970c28 8081e095 driverC!Dispatch+0×80
91970c3c f7876d28 nt!IofCallDriver+0×45
91970c68 8081e095 fltmgr!FltpDispatch+0×152
91970c7c 808f7601 nt!IofCallDriver+0×45
91970c90 808f5339 nt!IopSynchronousServiceTail+0×10b
91970d38 8088ac9c nt!NtWriteFile+0×663
91970d38 7c9485ec nt!KiFastCallEntry+0xfc

0: kd> !thread 8a92ddb0 1f
THREAD 8a92ddb0  Cid 1fdc.7b98  Teb: 7ffa9000 Win32Thread: b4deeae8 WAIT: (Unknown) KernelMode Non-Alertable
    8b422388  SynchronizationEvent
    8a92de28  NotificationTimer
Not impersonating
DeviceMap                 e2e671d0
Owning Process            88c37020       Image:         ApplicationA.exe
Wait Start TickCount      2548760        Ticks: 121 (0:00:00:01.890)
Context Switch Count      956                 LargeStack
UserTime                  00:00:00.000
KernelTime                00:00:00.015
Win32 Start Address 0×01381fa0
Start Address 0×7c8217ec
Stack Init 917c1000 Current 917c034c Base 917c1000 Limit 917bd000 Call 0
Priority 14 BasePriority 10 PriorityDecrement 4
ChildEBP RetAddr
917c0364 80833ec5 nt!KiSwapContext+0×26
917c0390 80829bc0 nt!KiSwapThread+0×2e5
917c03d8 8087e0db nt!KeWaitForSingleObject+0×346
917c0414 8087e2f5 nt!ExpWaitForResource+0xd5
917c0434 b0a1cf6d nt!ExAcquireResourceExclusiveLite+0×8d
917c0448 b0a08cef driverA+0×2ef6d
917c04a0 b0a089cc driverA+0×1acef
917c04c8 b0a13787 driverA+0×1a9cc
917c053c b0a0bfaa driverA+0×25787
917c057c b0a0c3b3 driverA+0×1dfaa
917c0858 b0a0ccaf driverA+0×1e3b3
917c0934 b0a6074c driverA+0×1ecaf
917c097c b0a4f9d2 driverA+0×7274c
917c0a18 b0a501f6 driverA+0×619d2
917c0a40 b0a5020c driverA+0×621f6
917c0a4c b0a50442 driverA+0×6220c
917c0a6c b0a50687 driverA+0×62442
917c0ac4 b0a50cb0 driverA+0×62687
917c0b08 b0a50ddd driverA+0×62cb0
917c0b18 8081e095 driverA+0×62ddd

917c0b2c f7876d28 nt!IofCallDriver+0×45
917c0b58 8081e095 fltmgr!FltpDispatch+0×152
917c0b6c f74fc6ca nt!IofCallDriver+0×45
917c0b84 f7876d28 driverB+0×56ca
917c0bb0 8081e095 fltmgr!FltpDispatch+0×152
917c0bc4 b195a4e1 nt!IofCallDriver+0×45
917c0be8 b195a5d0 driverC!PassThrough+0xd1
917c0bf8 8081e095 driverC!Dispatch+0×80
917c0c0c f7876d28 nt!IofCallDriver+0×45
917c0c38 8081e095 fltmgr!FltpDispatch+0×152
917c0c4c 808f7601 nt!IofCallDriver+0×45
917c0c60 808f1b45 nt!IopSynchronousServiceTail+0×10b
917c0c84 afdfebd5 nt!NtQueryDirectoryFile+0×5d
917c0cf8 afdff95d driverD+0×8bd5
917c0d30 8088ac9c driverD+0×995d
917c0d30 7c9485ec nt!KiFastCallEntry+0xfc

Other wait chains seem to be subordinate to the main deadlock chain. 

- Dmitry Vostokov @ DumpAnalysis.org -

Pattern-Driven Memory Analysis (Part 1)

Monday, March 2nd, 2009

Last week I had an opportunity to present a pattern-driven memory dump analysis methodology at a global engineering conference. Now in a series of articles I’m going to clarify certain points and extend it to wider domain of memory analysis including computer memory forensics and intelligence.

Today I post the reworked picture of a waterfall-like analysis process:

 

Various phases and their relationship will be discussed in subsequent parts together with examples. 

- Dmitry Vostokov @ DumpAnalysis.org -

Debugger Log Reading Techniques (Part 1)

Thursday, February 26th, 2009

Debugger logs (textual output) from commands like !process 0 ff and various scripts can be very long and consist of thousands of pages. I found the following reading technique useful for my daily memory dump analysis activities:

CSA-QSA

Checklists-Skim-Analyze—Questions-Survey-Analyze   

1. First, have a checklist

2. Skim through the log several times

3. Write analysis notes

4. Have a list of questions based on problem description and steps 1-3

5. Survey the log

6. Write analysis notes

Repeat steps 2,3 and 5,6 if necessary.

This technique can also be applied to reading any large logs, for example, voluminous CDF or ETW traces.

- Dmitry Vostokov @ DumpAnalysis.org -

Crash Dump Analysis Patterns (Part 81)

Thursday, February 19th, 2009

One analysis problem that happens frequently is the absence of stack traces due to kernel stack pages being paged out and therefore not present in a complete memory dump that only contains physical memory. This shouldn’t be a problem for kernel or process memory dumps because they contain virtual memory. The problem usually manifests itself either on busy systems utilizing almost all physical memory or on over-aged systems where certain processes hadn’t been used for a long time. It could also be the case when a problem happened some time ago and only diagnosed much later. For example, this LPC wait chain for coupled processes happened to be 2 days ago before the dump was saved:

0: kd> !thread fffffadfcf9e8bf0 1f
THREAD fffffadfcf9e8bf0  Cid 61f0.2c70  Teb: 000007fffffdd000 Win32Thread: fffff97ff381a480 WAIT: (Unknown) UserMode Non-Alertable
    fffffadfcf9e8f58  Semaphore Limit 0x1
Waiting for reply to LPC MessageId 01e2cb39:
Current LPC port fffffa800e5a9d10
Impersonation token:  fffffa80039cd050 (Level Impersonation)
Owning Process fffffadffc7c7c20      
Image:         applicationA.exe
Wait Start TickCount      12018444           
Ticks: 11312740 (2:01:06:01.562)
Context Switch Count      456                 LargeStack
UserTime                  00:00:00.046
KernelTime                00:00:00.078
Start Address applicationA (0×0000000100061411)
Stack Init fffffadc125d4e00 Current fffffadc125d48e0
Base fffffadc125d5000 Limit fffffadc125cc000 Call 0
Priority 15 BasePriority 15 PriorityDecrement 0
Kernel stack not resident.

0: kd> !lpc message 01e2cb39
Searching message 1e2cb39 in threads …
    Server thread fffffadff93c5bf0 is working on message 1e2cb39

0: kd> !thread fffffadff93c5bf0 1f
THREAD fffffadff93c5bf0  Cid 0218.5130  Teb: 000007ffffcbc000 Win32Thread: 0000000000000000 WAIT: (Unknown) UserMode Non-Alertable
    fffffadff6c71c70  SynchronizationEvent
Impersonation token:  fffffa803bde5060 (Level Impersonation)
Owning Process fffffadcde439280     
Image:         applicationB.exe
Wait Start TickCount      12018444        
Ticks: 11312740 (2:01:06:01.562)
Context Switch Count      12            
UserTime                  00:00:00.000
KernelTime                00:00:00.000
Win32 Start Address 0×0000000001e2cb39
LPC Server thread working on message Id 1e2cb39
Start Address kernel32 (0×0000000077d6b6a0)
Stack Init fffffadc28b19e00 Current fffffadc28b19950
Base fffffadc28b1a000 Limit fffffadc28b14000 Call 0
Priority 14 BasePriority 13 PriorityDecrement 0
Kernel stack not resident.

One of tricks I started to recommend in such cases is to save user dumps of processes that could possibly be paged out before forcing a complete memory dump but I haven’t yet got enough data to see whether this helps or not.

- Dmitry Vostokov @ DumpAnalysis.org -

Memory Dump Analysis Pattern: A Definition

Tuesday, February 17th, 2009

I found today that I never came up with a definition for a memory dump analysis pattern and only mentioned similar design pattern definition in the first pattern post. Now it is time to draft it:

a common recurrent identifiable problem together with a set of recommendations and possible solutions to apply in a specific context

In memory dump analysis pattern catalogue some identified patterns are left without a set of recommendations and I’m going to rectify this in the nearest future.

- Dmitry Vostokov @ DumpAnalysis.org -

Wait Chain Patterns

Tuesday, February 17th, 2009

While preparing a presentation for an engineering conference I found that a page to reference various wait chain patterns is necessary, so I created this post:

I’ll update it as soon as I add more similar patterns.

- Dmitry Vostokov @ DumpAnalysis.org -

DLL Link Patterns

Tuesday, February 17th, 2009

While preparing a presentation for an engineering conference I found that another different page to reference link time problems is necessary, so I created this post:

I’ll update it as soon as I add more similar patterns.

- Dmitry Vostokov @ DumpAnalysis.org -