Archive for October 15th, 2007

TOC for Windows® Crash Dump Analysis

Monday, October 15th, 2007

Following up the announcement of the forthcoming book I’ve published a preliminary Table of Contents which can be found here:

http://www.dumpanalysis.org/index.php?q=TOC+for+Windows+Crash+Dump+Analysis+Book

In November I’m going to publish a sample chapter and by that time try to finalize TOC.

As a part of my preparatory work I also installed TinyMCE module for my Drupal installation and can now enjoy writing anytime and anywhere:

At the same time I’m trying to improve my writing by reading these two books:

Spring Into Technical Writing for Engineers and Scientists (Spring Into… Series)

Buy from Amazon

BUGS in Writing, Revised Edition: A Guide to Debugging Your Prose (2nd Edition)

Buy from Amazon

- Dmitry Vostokov @ DumpAnalysis.org -

Crash Dump Analysis Patterns (Part 30)

Monday, October 15th, 2007

When a system is unresponsive or sluggish we usually check _ERESOURCE locks in kernel or complete memory dumps to see deadlock or high resource contention. However there is some chance that reported locks are purely accidental and appear in a crash dump because they just happened at that time. We need to look at Contention Count, Ticks and KernelTime in both blocking and blocked threads to recognize an Accidental Lock. Also the current version of WinDbg doesn’t distinguish between prolonged and accidental locks when we use !analyze -v -hang command and merely reports some lock chain it finds among equal alternatives.

Here is an example. The system was reported hang and kernel memory dump was saved. WinDbg analysis command reports one thread blocking 3 other threads and the driver on top of the blocking thread stack is AVDriver.sys. The algorithm WinDbg uses to point to specific image name is described here and in our case it chooses AVDriver:  

BLOCKED_THREAD:  8089d8c0

BLOCKING_THREAD:  8aab4700

LOCK_ADDRESS:  8859a570 -- (!locks 8859a570)

Resource @ 0x8859a570    Exclusively owned
    Contention Count = 3
    NumberOfExclusiveWaiters = 3
     Threads: 8aab4700-01<*>
     Threads Waiting On Exclusive Access:
              885d0020       88a7c020       8aafc7d8

1 total locks, 1 locks currently held

BUGCHECK_STR:  LOCK_HELD

FAULTING_THREAD:  8aab4700

STACK_TEXT:
f592f698 80832f7a nt!KiSwapContext+0x26
f592f6c4 80828705 nt!KiSwapThread+0x284
f592f70c f720a394 nt!KeDelayExecutionThread+0x2ab
WARNING: Stack unwind information not available. Following frames may be wrong.
f592f734 f720ae35 AVDriver+0×1394
f592f750 f720b208 AVDriver+0×1e35
f592f794 f721945a AVDriver+0×2208
f592f7cc 8081dcdf AVDriver+0×1045a
f592f7e0 f5b9f76a nt!IofCallDriver+0×45
f592f7f0 f5b9c621 Driver!FS_Dispatch+0xa4
f592f7fc 8081dcdf Driver!Kernel_dispatch+0×53
f592f810 f5eb2856 nt!IofCallDriver+0×45
f592f874 8081dcdf AVFilter!QueryFullName+0×5c10
f592f888 f5e9eae3 nt!IofCallDriver+0×45
f592f8b8 f5e9eca4 DrvFilter!PassThrough+0×115
f592f8d4 8081dcdf DrvFilter!Create+0xda
f592f8e8 808f8275 nt!IofCallDriver+0×45
f592f9d0 808f86bc nt!IopParseDevice+0xa35
f592fa08 80936689 nt!IopParseFile+0×46
f592fa88 80932e04 nt!ObpLookupObjectName+0×11f
f592fadc 808ea231 nt!ObOpenObjectByName+0xea
f592fb58 808eb4cb nt!IopCreateFile+0×447
f592fbb4 f57c8efd nt!IoCreateFile+0xa3
f592fc24 f57c9f29 srv!SrvIoCreateFile+0×36d
f592fcf0 f57ca5e4 srv!SrvNtCreateFile+0×5cc
f592fd78 f57adbc6 srv!SrvSmbNtCreateAndX+0×15c
f592fd84 f57c3451 srv!SrvProcessSmb+0xb7
f592fdac 80948bd0 srv!WorkerThread+0×138
f592fddc 8088d4e2 nt!PspSystemThreadStartup+0×2e
00000000 00000000 nt!KiThreadStartup+0×16

STACK_COMMAND:  .thread 0xffffffff8aab4700 ; kb

FOLLOWUP_IP:
AVDriver+1394
f720a394 eb85            jmp     AVDriver+0x131b (f720a31b)

MODULE_NAME: AVDriver

IMAGE_NAME:  AVDriver.sys

Motivated by this “discovery” we want to see all locks:

0: kd> !locks
**** DUMP OF ALL RESOURCE OBJECTS ****
KD: Scanning for held locks...

Resource @ 0x895a62d8    Shared 1 owning threads
     Threads: 89570520-01<*>

Resource @ 0x897ceba8    Shared 1 owning threads
     Threads: 89584020-01<*>

Resource @ 0x8958e020    Shared 1 owning threads
     Threads: 89555020-01<*>

Resource @ 0x89590608    Shared 1 owning threads
     Threads: 89666020-01<*>

Resource @ 0x89efc398    Shared 1 owning threads
     Threads: 89e277c0-01<*>

Resource @ 0x88d70820    Shared 1 owning threads
     Threads: 88e43948-01<*>

Resource @ 0x89f2fb00    Shared 1 owning threads
     Threads: 89674688-01<*>

Resource @ 0x89c80370    Shared 1 owning threads
     Threads: 888496b8-01<*>

Resource @ 0x89bfdf08    Shared 1 owning threads
     Threads: 88b62910-01<*>

Resource @ 0x888b5488    Shared 1 owning threads
     Threads: 88536730-01<*>

Resource @ 0x89f2e348    Shared 1 owning threads
     Threads: 89295930-01<*>

Resource @ 0x891a0838    Shared 1 owning threads
     Threads: 88949020-01<*>

Resource @ 0x8825bf08    Shared 1 owning threads
     Threads: 882b9a08-01<*>

Resource @ 0x881a6510    Shared 1 owning threads
     Threads: 88a88338-01<*>

Resource @ 0x885c5890    Shared 1 owning threads
     Threads: 881ab020-01<*>

Resource @ 0x886633a8    Shared 1 owning threads
     Threads: 89b5f8b0-01<*>

Resource @ 0x88216390    Shared 1 owning threads
     Threads: 88820020-01<*>

Resource @ 0x88524490    Shared 1 owning threads
     Threads: 88073020-01<*>

Resource @ 0x88f6a020    Shared 1 owning threads
     Threads: 88e547b0-01<*>

Resource @ 0x88cf2020    Shared 1 owning threads
     Threads: 89af32d8-01<*>

Resource @ 0x889cea80    Shared 1 owning threads
     Threads: 88d18b40-01<*>

Resource @ 0x88486298    Shared 1 owning threads
     Threads: 88af7db0-01<*>

Resource @ 0x88b22270    Exclusively owned
    Contention Count = 4
    NumberOfExclusiveWaiters = 4
     Threads: 8aad07d8-01<*>
     Threads Waiting On Exclusive Access:
              8ad78020       887abdb0       88eb39a8       8aa1f668

Resource @ 0x88748c20    Exclusively owned
    Contention Count = 2
    NumberOfExclusiveWaiters = 2
     Threads: 8873c8d8-01<*>
     Threads Waiting On Exclusive Access:
              88477478       88db6020

Resource @ 0x8859a570    Exclusively owned
    Contention Count = 3
    NumberOfExclusiveWaiters = 3
     Threads: 8aab4700-01<*>
     Threads Waiting On Exclusive Access:
              885d0020       88a7c020       8aafc7d8

KD: Scanning for held locks...
18911 total locks, 25 locks currently held

We can ignore shared locks and concentrate on the last 3 exclusively owned resources. It looks suspicious that Contention Count has the same number as the number of threads waiting on exclusive access (NumberOfExclusiveWaiters). This means that these resources had never been used before. If we dump locks verbosely we would see that blocked threads had been waiting no more than 2 seconds, for example, for resource 0×8859a570:

0: kd> !thread 885d0020; !thread 88a7c020; !thread 8aafc7d8
THREAD 885d0020  Cid 0004.1c34  Teb: 00000000 Win32Thread: 00000000 WAIT: (Unknown) KernelMode Non-Alertable
    89908d50  SynchronizationEvent
    885d0098  NotificationTimer
Not impersonating
DeviceMap                 e10022c8
Owning Process            8ad80648 Image:         System
Wait Start TickCount      7689055  Ticks: 127 (0:00:00:01.984)
Context Switch Count      248
UserTime                  00:00:00.000
KernelTime                00:00:00.000
Start Address srv!WorkerThread (0xf57c3394)
Stack Init b4136000 Current b4135b74 Base b4136000 Limit b4133000 Call 0
Priority 9 BasePriority 9 PriorityDecrement 0
ChildEBP RetAddr
b4135b8c 80832f7a nt!KiSwapContext+0×26
b4135bb8 8082925c nt!KiSwapThread+0×284
b4135c00 8087c1ad nt!KeWaitForSingleObject+0×346
b4135c3c 8087c3a1 nt!ExpWaitForResource+0xd5
b4135c5c f57c9e95 nt!ExAcquireResourceExclusiveLite+0×8d
b4135cf0 f57ca5e4 srv!SrvNtCreateFile+0×510
b4135d78 f57adbc6 srv!SrvSmbNtCreateAndX+0×15c
b4135d84 f57c3451 srv!SrvProcessSmb+0xb7
b4135dac 80948bd0 srv!WorkerThread+0×138
b4135ddc 8088d4e2 nt!PspSystemThreadStartup+0×2e
00000000 00000000 nt!KiThreadStartup+0×16

THREAD 88a7c020  Cid 0004.3448  Teb: 00000000 Win32Thread: 00000000 WAIT: (Unknown) KernelMode Non-Alertable
    89908d50  SynchronizationEvent
    88a7c098  NotificationTimer
Not impersonating
DeviceMap                 e10022c8
Owning Process            8ad80648  Image:         System
Wait Start TickCount      7689112   Ticks: 70 (0:00:00:01.093)
Context Switch Count      210
UserTime                  00:00:00.000
KernelTime                00:00:00.000
Start Address srv!WorkerThread (0xf57c3394)
Stack Init b55dd000 Current b55dcb74 Base b55dd000 Limit b55da000 Call 0
Priority 9 BasePriority 9 PriorityDecrement 0
ChildEBP RetAddr
b55dcb8c 80832f7a nt!KiSwapContext+0×26
b55dcbb8 8082925c nt!KiSwapThread+0×284
b55dcc00 8087c1ad nt!KeWaitForSingleObject+0×346
b55dcc3c 8087c3a1 nt!ExpWaitForResource+0xd5
b55dcc5c f57c9e95 nt!ExAcquireResourceExclusiveLite+0×8d
b55dccf0 f57ca5e4 srv!SrvNtCreateFile+0×510
b55dcd78 f57adbc6 srv!SrvSmbNtCreateAndX+0×15c
b55dcd84 f57c3451 srv!SrvProcessSmb+0xb7
b55dcdac 80948bd0 srv!WorkerThread+0×138
b55dcddc 8088d4e2 nt!PspSystemThreadStartup+0×2e
00000000 00000000 nt!KiThreadStartup+0×16

THREAD 8aafc7d8  Cid 0004.058c  Teb: 00000000 Win32Thread: 00000000 WAIT: (Unknown) KernelMode Non-Alertable
    89908d50  SynchronizationEvent
    8aafc850  NotificationTimer
Not impersonating
DeviceMap                 e10022c8
Owning Process            8ad80648  Image:         System
Wait Start TickCount      7689171   Ticks: 11 (0:00:00:00.171)
Context Switch Count      310
UserTime                  00:00:00.000
KernelTime                00:00:00.000
Start Address srv!WorkerThread (0xf57c3394)
Stack Init f592c000 Current f592bb18 Base f592c000 Limit f5929000 Call 0
Priority 9 BasePriority 9 PriorityDecrement 0
ChildEBP RetAddr
f592bb30 80832f7a nt!KiSwapContext+0×26
f592bb5c 8082925c nt!KiSwapThread+0×284
f592bba4 8087c1ad nt!KeWaitForSingleObject+0×346
f592bbe0 8087c3a1 nt!ExpWaitForResource+0xd5
f592bc00 f57c8267 nt!ExAcquireResourceExclusiveLite+0×8d
f592bc18 f57ff0ed srv!UnlinkRfcbFromLfcb+0×33
f592bc34 f57ff2ea srv!SrvCompleteRfcbClose+0×1df
f592bc54 f57b5e8f srv!CloseRfcbInternal+0xb6
f592bc78 f57ce8a9 srv!SrvCloseRfcbsOnSessionOrPid+0×74
f592bc94 f57e2b22 srv!SrvCloseSession+0xb0
f592bcb8 f57aeb12 srv!SrvCloseSessionsOnConnection+0xa9
f592bcd4 f57c79ed srv!SrvCloseConnection+0×143
f592bd04 f5808c50 srv!SrvCloseConnectionsFromClient+0×17f
f592bdac 80948bd0 srv!WorkerThread+0×138
f592bddc 8088d4e2 nt!PspSystemThreadStartup+0×2e
00000000 00000000 nt!KiThreadStartup+0×16

Blocking threads themselves are not blocked and active: the number of ticks passed since their last wait or preemption is 0. This could be a sign of CPU spike pattern. However their accumulated KernelTime is less than a second:

0: kd> !thread 8aad07d8
THREAD 8aad07d8  Cid 0004.0580  Teb: 00000000 Win32Thread: 00000000 WAIT: (Unknown) KernelMode Non-Alertable
    8aad0850  NotificationTimer
IRP List:
    8927ade0: (0006,0220) Flags: 00000884  Mdl: 00000000
Impersonation token:  eafdc030 (Level Impersonation)
DeviceMap                 e5d69340
Owning Process            8ad80648       Image:         System
Wait Start TickCount      7689182        Ticks: 0
Context Switch Count      915582
UserTime                  00:00:00.000
KernelTime                00:00:00.125
Start Address srv!WorkerThread (0xf57c3394)
Stack Init f59d8000 Current f59d7680 Base f59d8000 Limit f59d5000 Call 0
Priority 9 BasePriority 9 PriorityDecrement 0

0: kd> !thread 8873c8d8
THREAD 8873c8d8  Cid 0004.2898  Teb: 00000000 Win32Thread: 00000000 WAIT: (Unknown) KernelMode Non-Alertable
    8873c950  NotificationTimer
IRP List:
    882a8de0: (0006,0220) Flags: 00000884  Mdl: 00000000
Impersonation token:  eafdc030 (Level Impersonation)
DeviceMap                 e5d69340
Owning Process            8ad80648       Image:         System
Wait Start TickCount      7689182        Ticks: 0
Context Switch Count      917832
UserTime                  00:00:00.000
KernelTime                00:00:00.031
Start Address srv!WorkerThread (0xf57c3394)
Stack Init ac320000 Current ac31f680 Base ac320000 Limit ac31d000 Call 0
Priority 9 BasePriority 9 PriorityDecrement 0

0: kd> !thread 8aab4700
THREAD 8aab4700  Cid 0004.0588  Teb: 00000000 Win32Thread: 00000000 WAIT: (Unknown) KernelMode Non-Alertable
    8aab4778  NotificationTimer
IRP List:
    88453008: (0006,0220) Flags: 00000884  Mdl: 00000000
Impersonation token:  e9a82728 (Level Impersonation)
DeviceMap                 eb45f108
Owning Process            8ad80648       Image:         System
Wait Start TickCount      7689182        Ticks: 0
Context Switch Count      1028220
UserTime                  00:00:00.000
KernelTime                00:00:00.765
Start Address srv!WorkerThread (0xf57c3394)
Stack Init f5930000 Current f592f680 Base f5930000 Limit f592d000 Call 0
Priority 9 BasePriority 9 PriorityDecrement 0

Based on this observation we could say that locks were accidental and indeed, when the problem happened again, the new dump didn’t show them.

- Dmitry Vostokov @ DumpAnalysis.org -