Archive for December 3rd, 2011

Crash Dump Analysis Patterns (Part 157)

Saturday, December 3rd, 2011

In addition to functions we also have module variables like nt!MmPagedPoolCommit in Windows 7:

0: kd> x nt!MmPagedPool*
fffff800`031148d0 nt!MmPagedPoolInfo = <no type information>
fffff800`03092d20 nt!MmPagedPoolCommit = <no type information>
fffff800`031141a0 nt!MmPagedPoolEnd = <no type information>
fffff800`031175c0 nt!MmPagedPoolWs = <no type information>

If we are not sure whether we have a function or Module Variable we can try to disassemble:

0: kd> u nt!MmPagedPoolCommit
nt!MmPagedPoolCommit:
fffff800`03092d20 e3b2            jrcxz   nt!MmTotalNonPagedPoolQuota+0x4 (fffff800`03092cd4)
fffff800`03092d22 0000            add     byte ptr [rax],al
fffff800`03092d24 0000            add     byte ptr [rax],al
fffff800`03092d26 0000            add     byte ptr [rax],al
fffff800`03092d28 0000            add     byte ptr [rax],al
fffff800`03092d2a 0000            add     byte ptr [rax],al
fffff800`03092d2c 0000            add     byte ptr [rax],al
fffff800`03092d2e 0000            add     byte ptr [rax],al

Here the value is probably in pages so we multiply by 4 to get Kb value and compare to the output of !vm command:

0: kd> dp nt!MmPagedPoolCommit
fffff800`03092d20  00000000`0000b2e3 00000000`00000000
fffff800`03092d30  00000000`00000000 00000000`00000000
fffff800`03092d40  00000000`00000001 00000000`00000000
fffff800`03092d50  00000000`00000000 00000000`00060107
fffff800`03092d60  fffff800`03092d60 fffff800`03092d60
fffff800`03092d70  00000000`00000000 00000000`0001e972
fffff800`03092d80  fffff900`c0000000 00000000`00000002
fffff800`03092d90  fffff880`071dc0a8 fffff880`057340a8

0: kd> ? b2e3 * 4
Evaluate expression: 183180 = 00000000`0002cb8c

0: kd> !vm

*** Virtual Memory Usage ***
 Physical Memory:     1035228 (   4140912 Kb)
 Page File: \??\C:\pagefile.sys
   Current:   4448112 Kb  Free Space:   4448108 Kb
   Minimum:   4448112 Kb  Maximum:     12422736 Kb
Unimplemented error for MiSystemVaTypeCount
 Available Pages:      594029 (   2376116 Kb)
 ResAvail Pages:       889795 (   3559180 Kb)
 Locked IO Pages:           0 (         0 Kb)
 Free System PTEs:   33556870 ( 134227480 Kb)
 Modified Pages:        20079 (     80316 Kb)
 Modified PF Pages:     19441 (     77764 Kb)
 NonPagedPool Usage: 50865104 ( 203460416 Kb)
 NonPagedPoolNx Usage:  28163 (    112652 Kb)
 NonPagedPool Max:     763396 (   3053584 Kb)
 ********** Excessive NonPaged Pool Usage *****
 PagedPool 0 Usage:     39420 (    157680 Kb)
 PagedPool 1 Usage:      5194 (     20776 Kb)
 PagedPool 2 Usage:       367 (      1468 Kb)
 PagedPool 3 Usage:       338 (      1352 Kb)
 PagedPool 4 Usage:       440 (      1760 Kb)
 PagedPool Usage:       45759 (    183036 Kb)
 PagedPool Maximum:  33554432 ( 134217728 Kb)
 Session Commit:         8112 (     32448 Kb)
 Shared Commit:         31802 (    127208 Kb)
 Special Pool:              0 (         0 Kb)
 Shared Process:        10765 (     43060 Kb)
 PagedPool Commit:      45795 (    183180 Kb)
 Driver Commit:         13773 (     55092 Kb)
 Committed pages:      540998 (   2163992 Kb)
 Commit limit:        2146794 (   8587176 Kb)
[…]

Knowledge of available module variables is useful because some of them are not included in WinDbg extension command output. For their list please consult Windows Internals book. Also useful variables can be found in other modules as well, for example, srv!srvcomputername:

0: kd> dS srv!srvcomputername
fffff8a0`0344b090  "MYNOTEBOOK"

- Dmitry Vostokov @ DumpAnalysis.org + TraceAnalysis.org -

Crash Dump Analysis Patterns (Part 27c)

Saturday, December 3rd, 2011

Sometimes we need to narrow general stack trace collection to a few threads that satisfy some predicate, for example, all threads with kernel time spent greater than some value or all suspended threads or all threads that wait for a specific synchronization object type. We call this pattern variant Stack Trace Collection (predicate). This can be implemented using WinDbg scripts and / or debugger extensions.

- Dmitry Vostokov @ DumpAnalysis.org + TraceAnalysis.org -

2 WinDbg Scripts That Changed The World

Saturday, December 3rd, 2011

If not for you then definitely for me because I now diagnose Spiking Thread pattern much faster. One of the readers if this blog asked me whether there is !runaway command equivalent for kernel and complete memory dumps. So, after some thinking I gave it a try especially in the context of WinDbg scripting exercises designed for Advanced Windows Memory Dump Analysis training. As a result I wrote 2 scripts initially that you can try yourself. Their output here is taken from a complete memory dump I used for Fundamentals of Complete Crash and Hang Memory Dump Analysis presentation.

The first one dumps the most CPU consuming threads for user and kernel mode:

$$
$$ krunawaymost.wds
$$ Copyright (c) 2011 Memory Dump Analysis Services
$$ GNU GENERAL PUBLIC LICENSE
$$ http://www.gnu.org/licenses/gpl-3.0.txt
$$
r $t0 = 0
!for_each_thread “r $t1 = dwo( @#Thread + @@c++(#FIELD_OFFSET(nt!_KTHREAD, UserTime)) ); .if (@$t1 > @$t0) {r $t0 = @$t1; r $t2 = @#Thread}”
.echo “The largest UserTime value: ”
? @$t0
!thread @$t2 ff
r $t0 = 0
!for_each_thread “r $t1 = dwo( @#Thread + @@c++(#FIELD_OFFSET(nt!_KTHREAD, KernelTime)) ); .if (@$t1 > @$t0) {r $t0 = @$t1; r $t2 = @#Thread}”
.echo “The largest KernelTime value: ”
? @$t0
!thread @$t2 ff

0: kd> $$><c:\Scripts\krunawaymost.wds
The largest UserTime value:
Evaluate expression: 5470 = 00000000`0000155e

THREAD fffffa800451d720  Cid 1418.17fc  Teb: 000007fffffdc000 Win32Thread: 0000000000000000 RUNNING on processor 2
Not impersonating
DeviceMap                 fffff8a001ce6b90
Owning Process            fffffa800442ab30       Image:         ApplicationE.exe
Attached Process          N/A            Image:         N/A
Wait Start TickCount      22295          Ticks: 0
Context Switch Count      27960            
UserTime                  00:01:25.332
KernelTime                00:00:00.015
*** ERROR: Module load completed but symbols could not be loaded for ApplicationE.exe
Win32 Start Address ApplicationE (0×000000013f0f1578)
Stack Init fffff8800723cc70 Current fffff8800723c960
Base fffff8800723d000 Limit fffff88007237000 Call 0
Priority 8 BasePriority 8 UnusualBoost 0 ForegroundBoost 0 IoPriority 2 PagePriority 5
Child-SP          RetAddr           Call Site
00000000`0021f9e0 00000000`00000000 ApplicationE+0×6cd3

The largest KernelTime value:
Evaluate expression: 187 = 00000000`000000bb

THREAD fffffa80098d7b60  Cid 07bc.0a14  Teb: 000007fffffd7000 Win32Thread: fffff900c2ca0c20 WAIT: (UserRequest) KernelMode Non-Alertable
    fffffa8008a4a030  NotificationEvent
Not impersonating
DeviceMap                 fffff8a001ce6b90
Owning Process            fffffa80096beb30       Image:         dwm.exe
Attached Process          N/A            Image:         N/A
Wait Start TickCount      22294          Ticks: 1 (0:00:00:00.015)
Context Switch Count      15473                 LargeStack
UserTime                  00:00:06.801
KernelTime                00:00:02.917
Win32 Start Address dwmcore!CPartitionThread::ThreadMain (0×000007fef8a1f0d8)
Stack Init fffff8800d3d5c70 Current fffff8800d3d5740
Base fffff8800d3d6000 Limit fffff8800d3cf000 Call 0
Priority 15 BasePriority 15 UnusualBoost 0 ForegroundBoost 0 IoPriority 2 PagePriority 5
Child-SP          RetAddr           Call Site
fffff880`0d3d5780 fffff800`02ee6f32 nt!KiSwapContext+0×7a
fffff880`0d3d58c0 fffff800`02ee974f nt!KiCommitThreadWait+0×1d2
fffff880`0d3d5950 fffff880`0fef65b3 nt!KeWaitForSingleObject+0×19f
fffff880`0d3d59f0 fffff960`001fedea dxgkrnl!DxgkWaitForVerticalBlankEvent+0×53f
fffff880`0d3d5ab0 fffff800`02ee0ed3 win32k!NtGdiDdDDIWaitForVerticalBlankEvent+0×12
fffff880`0d3d5ae0 000007fe`ff1d143a nt!KiSystemServiceCopyEnd+0×13 (TrapFrame @ fffff880`0d3d5ae0)
00000000`0287f778 000007fe`f8791da1 GDI32!NtGdiDdDDIWaitForVerticalBlankEvent+0xa
00000000`0287f780 000007fe`f89e1b6e dxgi!CDXGIOutput::WaitForVBlank+0×51
00000000`0287f7c0 000007fe`f89e1ae9 dwmcore!CD3DDeviceLevel1::WaitForVBlank+0×1f9
00000000`0287f810 000007fe`f89e1a9d dwmcore!CHwDisplayRenderTarget::WaitForVBlank+0×39
00000000`0287f850 000007fe`f89e1a4c dwmcore!CDesktopRenderTarget::WaitForVBlank+0×40
00000000`0287f880 000007fe`f89d3513 dwmcore!CSlaveHWndRenderTarget::WaitForVBlank+0×2c
00000000`0287f8c0 000007fe`f89d3584 dwmcore!CRenderTargetManager::WaitForVBlank+0×7d
00000000`0287f900 000007fe`f89d2661 dwmcore!CPartitionVerticalBlankScheduler::WaitForVBlank+0×7c
00000000`0287f950 000007fe`f8a1f0f4 dwmcore!CPartitionVerticalBlankScheduler::Run+0xe5
00000000`0287f9b0 00000000`7719652d dwmcore!CPartitionThread::ThreadMain+0×1c
00000000`0287f9e0 00000000`772cc521 kernel32!BaseThreadInitThunk+0xd
00000000`0287fa10 00000000`00000000 ntdll!RtlUserThreadStart+0×1d

The second script takes two arguments and shows all threads that have UserTime and KernelTime ticks value greater than (you can have the idea of the maximum from the previous script):

$$
$$ krunawaygt.wds
$$ Copyright (c) 2011 Memory Dump Analysis Services
$$ GNU GENERAL PUBLIC LICENSE
$$ http://www.gnu.org/licenses/gpl-3.0.txt
$$
!for_each_thread “r $t1 = dwo( @#Thread + @@c++(#FIELD_OFFSET(nt!_KTHREAD, UserTime)) ); r $t0 = $arg1; .if (@$t1 > @$t0) {!thread @#Thread ff}”
!for_each_thread “r $t1 = dwo( @#Thread + @@c++(#FIELD_OFFSET(nt!_KTHREAD, KernelTime)) ); r $t0 = $arg2; .if (@$t1 > @$t0) {!thread @#Thread ff}”

Using hints from the previous script run (the largest UserTime ticks value is 0×155e) we now get threads that spent more than 0×100 ticks in user mode:

0: kd> $$>a<c:\Scripts\krunawaygt.wds 100 100
THREAD fffffa800843e060  Cid 03f4.0658  Teb: 000007fffff90000 Win32Thread: 0000000000000000 WAIT: (WrQueue) UserMode Non-Alertable
    fffffa800843c2c0  QueueObject
Not impersonating
DeviceMap                 fffff8a000008aa0
Owning Process            fffffa800916b060       Image:         MsMpEng.exe
Attached Process          N/A            Image:         N/A
Wait Start TickCount      21211          Ticks: 1084 (0:00:00:16.910)
Context Switch Count      6028            
UserTime                  00:00:10.140
KernelTime                00:00:00.296
Win32 Start Address msvcrt!endthreadex (0×000007feff5173fc)
Stack Init fffff88009d4bc70 Current fffff88009d4b660
Base fffff88009d4c000 Limit fffff88009d46000 Call 0
Priority 9 BasePriority 8 UnusualBoost 0 ForegroundBoost 0 IoPriority 2 PagePriority 5
*** ERROR: Symbol file could not be found.  Defaulted to export symbols for mprtp.dll -
Child-SP          RetAddr           Call Site
fffff880`09d4b6a0 fffff800`02ee6f32 nt!KiSwapContext+0×7a
fffff880`09d4b7e0 fffff800`02ee9f93 nt!KiCommitThreadWait+0×1d2
fffff880`09d4b870 fffff800`031ca647 nt!KeRemoveQueueEx+0×323
fffff880`09d4b930 fffff800`0319cae5 nt!IoRemoveIoCompletion+0×47
fffff880`09d4b9c0 fffff800`02ee0ed3 nt!NtRemoveIoCompletion+0×145
fffff880`09d4ba70 00000000`772f13aa nt!KiSystemServiceCopyEnd+0×13 (TrapFrame @ fffff880`09d4bae0)
00000000`0209fb08 000007fe`fd9e169d ntdll!ZwRemoveIoCompletion+0xa
00000000`0209fb10 00000000`7718a4e1 KERNELBASE!GetQueuedCompletionStatus+0×39
00000000`0209fb70 00000000`748f2c74 kernel32!GetQueuedCompletionStatusStub+0×11
00000000`0209fbb0 00000000`0045cbc0 mprtp!MpPluginSignatureChange+0×3e170
00000000`0209fbb8 000007fe`fbac25ff 0×45cbc0
00000000`0209fbc0 00000000`00466610 FLTLIB!FilterGetMessage+0×2b
00000000`0209fc20 00000000`00000000 0×466610

THREAD fffffa800845c060  Cid 03f4.065c  Teb: 000007fffff8e000 Win32Thread: 0000000000000000 WAIT: (WrQueue) UserMode Non-Alertable
    fffffa800843c2c0  QueueObject
Not impersonating
DeviceMap                 fffff8a000008aa0
Owning Process            fffffa800916b060       Image:         MsMpEng.exe
Attached Process          N/A            Image:         N/A
Wait Start TickCount      21520          Ticks: 775 (0:00:00:12.090)
Context Switch Count      4979            
UserTime                  00:00:04.149
KernelTime                00:00:00.156
Win32 Start Address msvcrt!endthreadex (0×000007feff5173fc)
Stack Init fffff88009d52c70 Current fffff88009d52660
Base fffff88009d53000 Limit fffff88009d4d000 Call 0
Priority 8 BasePriority 8 UnusualBoost 0 ForegroundBoost 0 IoPriority 2 PagePriority 5
*** ERROR: Symbol file could not be found.  Defaulted to export symbols for mprtp.dll -
Child-SP          RetAddr           Call Site
fffff880`09d526a0 fffff800`02ee6f32 nt!KiSwapContext+0×7a
fffff880`09d527e0 fffff800`02ee9f93 nt!KiCommitThreadWait+0×1d2
fffff880`09d52870 fffff800`031ca647 nt!KeRemoveQueueEx+0×323
fffff880`09d52930 fffff800`0319cae5 nt!IoRemoveIoCompletion+0×47
fffff880`09d529c0 fffff800`02ee0ed3 nt!NtRemoveIoCompletion+0×145
fffff880`09d52a70 00000000`772f13aa nt!KiSystemServiceCopyEnd+0×13 (TrapFrame @ fffff880`09d52ae0)
00000000`01ccf498 000007fe`fd9e169d ntdll!ZwRemoveIoCompletion+0xa
00000000`01ccf4a0 00000000`7718a4e1 KERNELBASE!GetQueuedCompletionStatus+0×39
00000000`01ccf500 00000000`748f2c74 kernel32!GetQueuedCompletionStatusStub+0×11
00000000`01ccf540 00000000`0045d030 mprtp!MpPluginSignatureChange+0×3e170
00000000`01ccf548 000007fe`fbac25ff 0×45d030
00000000`01ccf550 00000000`004666b0 FLTLIB!FilterGetMessage+0×2b
00000000`01ccf5b0 00000000`00000000 0×4666b0

THREAD fffffa80092b7060  Cid 03f4.1268  Teb: 000007fffff6a000 Win32Thread: 0000000000000000 WAIT: (WrQueue) UserMode Alertable
    fffffa8009299140  QueueObject
Not impersonating
DeviceMap                 fffff8a000008aa0
Owning Process            fffffa800916b060       Image:         MsMpEng.exe
Attached Process          N/A            Image:         N/A
Wait Start TickCount      7762           Ticks: 14533 (0:00:03:46.716)
Context Switch Count      3297            
UserTime                  00:00:06.489
KernelTime                00:00:00.499
Win32 Start Address ntdll!TppWorkerThread (0×00000000772bfbc0)
Stack Init fffff8800e620c70 Current fffff8800e620680
Base fffff8800e621000 Limit fffff8800e61b000 Call 0
Priority 8 BasePriority 8 UnusualBoost 0 ForegroundBoost 0 IoPriority 2 PagePriority 5
Child-SP          RetAddr           Call Site
fffff880`0e6206c0 fffff800`02ee6f32 nt!KiSwapContext+0×7a
fffff880`0e620800 fffff800`02ee9f93 nt!KiCommitThreadWait+0×1d2
fffff880`0e620890 fffff800`031ca647 nt!KeRemoveQueueEx+0×323
fffff880`0e620950 fffff800`02ecdb36 nt!IoRemoveIoCompletion+0×47
fffff880`0e6209e0 fffff800`02ee0ed3 nt!NtWaitForWorkViaWorkerFactory+0×285
fffff880`0e620ae0 00000000`772f2c1a nt!KiSystemServiceCopyEnd+0×13 (TrapFrame @ fffff880`0e620ae0)
00000000`0540f998 00000000`772bfe0b ntdll!ZwWaitForWorkViaWorkerFactory+0xa
00000000`0540f9a0 00000000`7719652d ntdll!TppWorkerThread+0×2c9
00000000`0540fca0 00000000`772cc521 kernel32!BaseThreadInitThunk+0xd
00000000`0540fcd0 00000000`00000000 ntdll!RtlUserThreadStart+0×1d

THREAD fffffa80098d7b60  Cid 07bc.0a14  Teb: 000007fffffd7000 Win32Thread: fffff900c2ca0c20 WAIT: (UserRequest) KernelMode Non-Alertable
    fffffa8008a4a030  NotificationEvent
Not impersonating
DeviceMap                 fffff8a001ce6b90
Owning Process            fffffa80096beb30       Image:         dwm.exe
Attached Process          N/A            Image:         N/A
Wait Start TickCount      22294          Ticks: 1 (0:00:00:00.015)
Context Switch Count      15473                 LargeStack
UserTime                  00:00:06.801
KernelTime                00:00:02.917
Win32 Start Address dwmcore!CPartitionThread::ThreadMain (0×000007fef8a1f0d8)
Stack Init fffff8800d3d5c70 Current fffff8800d3d5740
Base fffff8800d3d6000 Limit fffff8800d3cf000 Call 0
Priority 15 BasePriority 15 UnusualBoost 0 ForegroundBoost 0 IoPriority 2 PagePriority 5
Child-SP          RetAddr           Call Site
fffff880`0d3d5780 fffff800`02ee6f32 nt!KiSwapContext+0×7a
fffff880`0d3d58c0 fffff800`02ee974f nt!KiCommitThreadWait+0×1d2
fffff880`0d3d5950 fffff880`0fef65b3 nt!KeWaitForSingleObject+0×19f
fffff880`0d3d59f0 fffff960`001fedea dxgkrnl!DxgkWaitForVerticalBlankEvent+0×53f
fffff880`0d3d5ab0 fffff800`02ee0ed3 win32k!NtGdiDdDDIWaitForVerticalBlankEvent+0×12
fffff880`0d3d5ae0 000007fe`ff1d143a nt!KiSystemServiceCopyEnd+0×13 (TrapFrame @ fffff880`0d3d5ae0)
00000000`0287f778 000007fe`f8791da1 GDI32!NtGdiDdDDIWaitForVerticalBlankEvent+0xa
00000000`0287f780 000007fe`f89e1b6e dxgi!CDXGIOutput::WaitForVBlank+0×51
00000000`0287f7c0 000007fe`f89e1ae9 dwmcore!CD3DDeviceLevel1::WaitForVBlank+0×1f9
00000000`0287f810 000007fe`f89e1a9d dwmcore!CHwDisplayRenderTarget::WaitForVBlank+0×39
00000000`0287f850 000007fe`f89e1a4c dwmcore!CDesktopRenderTarget::WaitForVBlank+0×40
00000000`0287f880 000007fe`f89d3513 dwmcore!CSlaveHWndRenderTarget::WaitForVBlank+0×2c
00000000`0287f8c0 000007fe`f89d3584 dwmcore!CRenderTargetManager::WaitForVBlank+0×7d
00000000`0287f900 000007fe`f89d2661 dwmcore!CPartitionVerticalBlankScheduler::WaitForVBlank+0×7c
00000000`0287f950 000007fe`f8a1f0f4 dwmcore!CPartitionVerticalBlankScheduler::Run+0xe5
00000000`0287f9b0 00000000`7719652d dwmcore!CPartitionThread::ThreadMain+0×1c
00000000`0287f9e0 00000000`772cc521 kernel32!BaseThreadInitThunk+0xd
00000000`0287fa10 00000000`00000000 ntdll!RtlUserThreadStart+0×1d

THREAD fffffa800451d720  Cid 1418.17fc  Teb: 000007fffffdc000 Win32Thread: 0000000000000000 RUNNING on processor 2
Not impersonating
DeviceMap                 fffff8a001ce6b90
Owning Process            fffffa800442ab30       Image:         ApplicationE.exe
Attached Process          N/A            Image:         N/A
Wait Start TickCount      22295          Ticks: 0
Context Switch Count      27960            
UserTime                  00:01:25.332
KernelTime                00:00:00.015
*** ERROR: Module load completed but symbols could not be loaded for ApplicationE.exe
Win32 Start Address ApplicationE (0×000000013f0f1578)
Stack Init fffff8800723cc70 Current fffff8800723c960
Base fffff8800723d000 Limit fffff88007237000 Call 0
Priority 8 BasePriority 8 UnusualBoost 0 ForegroundBoost 0 IoPriority 2 PagePriority 5
Child-SP          RetAddr           Call Site
00000000`0021f9e0 00000000`00000000 ApplicationE+0×6cd3

Memory Dump Analysis Services is now working to incorporate client-side WinDbg scripting into their CARE2 architecture. 

- Dmitry Vostokov @ DumpAnalysis.org + TraceAnalysis.org