Archive for the ‘Common Mistakes’ Category

10 Common Mistakes in Memory Analysis (Part 6)

Tuesday, December 8th, 2009

Some debugger commands, or the commands they invoke, are context-sensitive: their diagnostic output can depend on the current thread or process set in the debugger, on the loaded debugger extensions, and even on the load order of those extensions. It is therefore advisable to stay context-conscious, or at least to know that this context sensitivity exists. For example, for one mmc.exe process memory dump, the default analysis command in x64 WinDbg doesn’t show the managed stack trace that a user had seen in a failure dialog box and reported:

0:000> !analyze -v

[...]

MANAGED_BITNESS_MISMATCH:
Managed code needs matching platform of sos.dll for proper analysis. Use 'x86' debugger.

PRIMARY_PROBLEM_CLASS:  STATUS_BREAKPOINT

BUGCHECK_STR:  APPLICATION_FAULT_STATUS_BREAKPOINT

STACK_TEXT: 
0007fc98 7c827d19 77e6202c 00000002 0007fce8 ntdll!KiFastSystemCallRet
0007fc9c 77e6202c 00000002 0007fce8 00000001 ntdll!NtWaitForMultipleObjects+0xc
0007fd44 7739bbd1 00000002 0007fd6c 00000000 kernel32!WaitForMultipleObjectsEx+0x11a
0007fda0 6c296601 00000001 0007fdd4 ffffffff user32!RealMsgWaitForMultipleObjectsEx+0x141
0007fdc0 6c29684b 000004ff ffffffff 00000001 duser!CoreSC::Wait+0x3a
0007fdf4 6c29693d 0007fe34 00000000 00000000 duser!CoreSC::xwProcessNL+0xab
0007fe14 773b0c02 0007fe34 00000000 00000000 duser!MphProcessMessage+0x2e
0007fe5c 7c828556 0007fe74 00000014 0007ffb0 user32!__ClientGetMessageMPH+0x30
0007fe84 7739c811 7739c844 01116894 00000000 ntdll!KiUserCallbackDispatcher+0x2e
0007fea4 7f072fd6 01116894 00000000 00000000 user32!NtUserGetMessage+0xc
0007fec0 010080ef 01116894 01116860 00000002 mfc42u!CWinThread::PumpMessage+0x16
0007fef0 7f072dda 01116860 01116860 ffffffff mmc!CAMCApp::PumpMessage+0x37
0007ff08 7f044d5b ffffffff 00000002 7ffd9000 mfc42u!CWinThread::Run+0x4a
0007ff1c 01034e19 01000000 00000000 00020710 mfc42u!AfxWinMain+0x7b
0007ffc0 77e6f23b 00000000 00000000 7ffd9000 mmc!wWinMainCRTStartup+0x19d
0007fff0 00000000 01034cb0 00000000 78746341 kernel32!BaseProcessStart+0x23

Instead of concluding that the dump file wasn’t saved at the time of the failure, we pay attention to all aspects of the default analysis output and see that we need a platform-specific debugger. We load the same dump file into x86 WinDbg:

0:000> !analyze -v

[...]

MANAGED_STACK: !dumpstack -EE
No export dumpstack found

Now we see that we need to load the SOS extension explicitly and retry:

0:000> .load C:\WINDOWS\Microsoft.NET\Framework\v2.0.50727\sos.dll

0:000> !analyze -v

[...]

MANAGED_STACK: !dumpstack -EE
OS Thread Id: 0x4ec (0)
Current frame:
ChildEBP RetAddr  Caller,Callee
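
Each !analyze -v run above carried a hint about the next step. As a minimal sketch (the helper name and the hint table are hypothetical, built only from the two messages shown in this post, and far from a complete list of analysis fields), a saved output could be scanned for such hints like this:

```python
# Hypothetical hint table: only the two markers seen in this post.
HINTS = {
    "MANAGED_BITNESS_MISMATCH":
        "reopen the dump in the debugger of the bitness named in the message",
    "No export dumpstack found":
        "load sos.dll with .load and rerun !analyze -v",
}

def find_hints(analysis_output):
    """Return (marker, suggested action) pairs for markers found in the text."""
    return [(marker, action) for marker, action in HINTS.items()
            if marker in analysis_output]

sample = """MANAGED_STACK: !dumpstack -EE
No export dumpstack found"""

for marker, action in find_hints(sample):
    print(f"{marker}: {action}")
```

A plain substring search is enough here because the markers are fixed strings. It only flags output for a human to read more carefully and does not replace reading it.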

The managed stack trace for thread 0 is empty, but if we look at all threads (we list full traces in order not to miss any module), we find one that shows a dialog box reporting a failure:

0:000> ~*kL 100

[...]

  17  Id: 658.7e4 Suspend: 1 Teb: 7ff48000 Unfrozen
ChildEBP RetAddr 
06b4f498 7739bf53 ntdll!KiFastSystemCallRet
06b4f4d0 7738965e user32!NtUserWaitMessage+0xc
06b4f4f8 773896a0 user32!InternalDialogBox+0xd0
06b4f518 773896e8 user32!DialogBoxIndirectParamAorW+0x37
06b4f53c 4afde2e1 user32!DialogBoxParamW+0x3f
06b4f584 4b05c4bc mmcndmgr!IsolationAwareDialogBoxParamW+0x4e
06b4f5a4 4b05c6eb mmcndmgr!ATL::CDialogImpl<CSnapInFailureReportDialog, ATL::CWindow>::DoModal+0x4f
06b4f68c 77c80193 mmcndmgr!CSnapInFailureReporter::ReportSnapInFailure+0x195
06b4f6b8 77ce33e1 rpcrt4!Invoke+0x30
06b4fab8 77ce2ed5 rpcrt4!NdrStubCall2+0x299
06b4fb10 7778d01b rpcrt4!CStdStubBuffer_Invoke+0xc6
06b4fb54 7778cfc8 ole32!SyncStubInvoke+0x37
06b4fb9c 776c120b ole32!StubInvoke+0xa7
06b4fc78 776c0bf5 ole32!CCtxComChnl::ContextInvoke+0xec
06b4fc94 776bc455 ole32!MTAInvoke+0x1a
06b4fcc0 7778ced5 ole32!STAInvoke+0x48
06b4fcf4 7778cd66 ole32!AppInvoke+0xa3
06b4fdc8 7778c24d ole32!ComInvokeWithLockAndIPID+0x2c5
06b4fdf0 776bc344 ole32!ComInvoke+0xca
06b4fe04 776bc30f ole32!ThreadDispatch+0x23
06b4fe1c 7739b6e3 ole32!ThreadWndProc+0xfe
06b4fe48 7739b874 user32!InternalCallWinProc+0x28
06b4fec0 7739ba92 user32!UserCallWinProcCheckWow+0x151
06b4ff28 7739bad0 user32!DispatchMessageWorker+0x327
06b4ff38 7768ffdc user32!DispatchMessageW+0xf
06b4ff6c 7768f366 ole32!CDllHost::STAWorkerLoop+0x5c
06b4ff88 7768f2a2 ole32!CDllHost::WorkerThread+0xc8
06b4ff90 776bbab4 ole32!DLLHostThreadEntry+0xd
06b4ffac 776b1704 ole32!CRpcThread::WorkerLoop+0x26
06b4ffb8 77e6482f ole32!CRpcThreadCache::RpcWorkerThreadEntry+0x20
06b4ffec 00000000 kernel32!BaseThreadStart+0x34

The previous thread #16 is a CLR thread loading an assembly:

  16  Id: 658.28c Suspend: 1 Teb: 7ffd4000 Unfrozen
ChildEBP RetAddr 
0714d024 7c827d29 ntdll!KiFastSystemCallRet
0714d028 77e61d1e ntdll!ZwWaitForSingleObject+0xc
0714d098 73ca790b kernel32!WaitForSingleObjectEx+0xac
0714d0c4 73ca485a cryptnet!CryptRetrieveObjectByUrlWithTimeout+0x12f
0714d0f0 73ca37ce cryptnet!CryptRetrieveObjectByUrlW+0x9b
0714d168 73ca4a60 cryptnet!RetrieveObjectByUrlValidForSubject+0x5b
0714d1b8 73ca3525 cryptnet!RetrieveTimeValidObjectByUrl+0xbc
0714d220 73ca3473 cryptnet!CTVOAgent::GetTimeValidObjectByUrl+0xc2
0714d2d0 73ca3314 cryptnet!CTVOAgent::GetTimeValidObject+0x2f1
0714d300 73ca2c00 cryptnet!FreshestCrlFromCrlGetTimeValidObject+0x2d
0714d344 73ca43a4 cryptnet!CryptGetTimeValidObject+0x58
0714d3a0 73ca3122 cryptnet!GetTimeValidCrl+0x1e0
0714d3e4 73ca3080 cryptnet!GetBaseCrl+0x34
0714d484 761d9033 cryptnet!MicrosoftCertDllVerifyRevocation+0x128
0714d514 761d8eef crypt32!I_CryptRemainingMilliseconds+0x21b
0714d584 761cf39f crypt32!CertVerifyRevocation+0xb7
0714d604 761c6966 crypt32!CChainPathObject::CalculateRevocationStatus+0x1f2
0714d64c 761c6771 crypt32!CChainPathObject::CalculateAdditionalStatus+0x147
0714d708 761c78bc crypt32!CCertChainEngine::CreateChainContextFromPathGraph+0x227
0714d738 761c783f crypt32!CCertChainEngine::GetChainContext+0x44
0714d760 76bb6d8f crypt32!CertGetCertificateChain+0x60
0714d7c4 76bb6bbc wintrust!_WalkChain+0x1a8
0714d800 76bb39ef wintrust!WintrustCertificateTrust+0xb7
0714d8f4 76bb31e2 wintrust!_VerifyTrust+0x144
0714d918 64025b1b wintrust!WinVerifyTrust+0x4e
0714d9bc 7a117c85 mscorsec!GetPublisher+0xe4
0714da14 79ebeccb mscorwks!PEFile::CheckSecurity+0xcb
0714da3c 79ebec14 mscorwks!PEAssembly::DoLoadSignatureChecks+0x3a
0714da64 79ebf05a mscorwks!PEAssembly::PEAssembly+0x109
0714dd00 79ebf155 mscorwks!PEAssembly::DoOpen+0x103
0714dd94 79eb8ff2 mscorwks!PEAssembly::Open+0x79
0714def8 79eb6a5e mscorwks!AppDomain::BindAssemblySpec+0x247
0714df90 79eb691c mscorwks!PEFile::LoadAssembly+0x95
0714e030 79eb68c0 mscorwks!Module::LoadAssembly+0xee
0714e06c 79e92873 mscorwks!Assembly::FindModuleByTypeRef+0x113
0714e0d8 79fc3dc8 mscorwks!ClassLoader::ResolveTokenToTypeDefThrowing+0x88
0714e12c 79fc953d mscorwks!CEEInfo::AddDependencyOnClassToken+0x103
0714e158 79fc61cf mscorwks!CEEInfo::ScanForModuleDependencies+0xa3
0714e1fc 7908bce1 mscorwks!CEEInfo::getArgType+0x256
0714e214 7908bc5b mscorjit!Compiler::eeGetArgType+0x23
0714e25c 79067745 mscorjit!Compiler::impInlineInitVars+0x3c3
0714e4fc 790673d5 mscorjit!Compiler::fgInvokeInlineeCompiler+0x95
0714e518 79067400 mscorjit!Compiler::fgMorphCallInline+0x41
0714e52c 79065272 mscorjit!Compiler::fgInline+0x30
0714e534 7906513e mscorjit!Compiler::fgMorph+0x45
0714e544 79065b8e mscorjit!Compiler::compCompile+0x83
0714e590 79065d33 mscorjit!Compiler::compCompile+0x44f
0714e618 79066448 mscorjit!jitNativeCode+0xef
0714e63c 79fc7198 mscorjit!CILJit::compileMethod+0x25
0714e6a8 79fc722d mscorwks!invokeCompileMethodHelper+0x72
0714e6ec 79fc72a0 mscorwks!invokeCompileMethod+0x31
0714e740 79fc7019 mscorwks!CallCompileMethodWithSEHWrapper+0x5b
0714eae8 79fc6ddb mscorwks!UnsafeJitFunction+0x31b
0714eb8c 79e811a3 mscorwks!MethodDesc::MakeJitWorker+0x1a8
0714ebe4 79e81363 mscorwks!MethodDesc::DoPrestub+0x41b
0714ec34 01c01efe mscorwks!PreStubWorker+0xf3
WARNING: Frame IP not in any known module. Following frames may be wrong.
0714ec4c 06b08f29 0x1c01efe
0714ecb4 06b088dc 0x6b08f29
0714edf0 79e71b4c 0x6b088dc
0714edf4 79e7e45d mscorwks!CallDescrWorker+0x33
0714eeac 79e968b0 mscorwks!MethodDesc::IsSharedByGenericInstantiations+0x1c
0714ef2c 79e9eeb2 mscorwks!MetaSig::MetaSig+0x3a
0714f258 00000000 mscorwks!JIT_MonReliableEnter+0x120

If we switch to that thread and rerun the default analysis, we get a managed stack:

0:000> ~16s
eax=0714ce90 ebx=048bb528 ecx=77e63d5b edx=7ffd4000 esi=0000073c edi=00000000
eip=7c82860c esp=0714d028 ebp=0714d098 iopl=0 nv up ei ng nz ac pe cy
cs=001b ss=0023 ds=0023 es=0023 fs=003b gs=0000 efl=00000297
ntdll!KiFastSystemCallRet:
7c82860c c3              ret

0:016> !analyze -v

[...]

MANAGED_STACK: !dumpstack -EE
OS Thread Id: 0x28c (16)
Current frame:
ChildEBP RetAddr  Caller,Callee
[...]
0714f264 792e0c3a (MethodDesc 0x79104344 +0xa System.Reflection.RuntimeMethodInfo.GetParametersNoCopy())
0714f28c 792d52d8 (MethodDesc 0x790c5058 +0x48 System.RuntimeMethodHandle.InvokeMethodFast(System.Object, System.Object[], System.Signature, System.Reflection.MethodAttributes, System.RuntimeTypeHandle))
0714f2dc 792d5086 (MethodDesc 0x791043a4 +0x106 System.Reflection.RuntimeMethodInfo.Invoke(System.Object, System.Reflection.BindingFlags, System.Reflection.Binder, System.Object[], System.Globalization.CultureInfo, Boolean))
0714f314 792d4f6e (MethodDesc 0x7910439c +0x1e System.Reflection.RuntimeMethodInfo.Invoke(System.Object, System.Reflection.BindingFlags, System.Reflection.Binder, System.Object[], System.Globalization.CultureInfo))
0714f338 7928ea4b (MethodDesc 0x79108798 +0x82b System.RuntimeType.InvokeMember(System.String, System.Reflection.BindingFlags, System.Reflection.Binder, System.Object, System.Object[], System.Reflection.ParameterModifier[], System.Globalization.CultureInfo, System.String[]))
0714f478 7973ea9d (MethodDesc 0x79108264 +0x1d System.Type.InvokeMember(System.String, System.Reflection.BindingFlags, System.Reflection.Binder, System.Object, System.Object[]))
[...]
0714f588 792d6cf6 (MethodDesc 0x791939dc +0x66 System.Threading.ThreadHelper.ThreadStart_Context(System.Object))
0714f594 792e019f (MethodDesc 0x7910276c +0x6f System.Threading.ExecutionContext.Run(System.Threading.ExecutionContext, System.Threading.ContextCallback, System.Object))
0714f5a8 792d6c74 (MethodDesc 0x790fbde4 +0x44 System.Threading.ThreadHelper.ThreadStart())
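
The manual scan of the `~*kL 100` output above (looking for the thread that shows a dialog box and the thread that runs CLR code) can also be done on a saved debugger log. The following is only a sketch: the function name is hypothetical, and the header pattern is an assumption based on the thread header lines shown in this post.

```python
import re

# Thread header as it appears above, e.g. "  16  Id: 658.28c Suspend: 1 ..."
# Leading "." or "#" markers for the current/event thread are tolerated.
HEADER = re.compile(r"^[.#\s]*(\d+)\s+Id:\s+([0-9a-fA-F]+)\.([0-9a-fA-F]+)")

def threads_with(text, needle):
    """Return debugger thread numbers whose stack lines contain needle."""
    hits, current = [], None
    for line in text.splitlines():
        m = HEADER.match(line)
        if m:
            current = int(m.group(1))
        elif current is not None and needle in line and current not in hits:
            hits.append(current)
    return hits

sample = """\
  16  Id: 658.28c Suspend: 1 Teb: 7ffd4000 Unfrozen
ChildEBP RetAddr
0714d918 64025b1b wintrust!WinVerifyTrust+0x4e
0714da14 79ebeccb mscorwks!PEFile::CheckSecurity+0xcb
  17  Id: 658.7e4 Suspend: 1 Teb: 7ff48000 Unfrozen
ChildEBP RetAddr
06b4f4f8 773896a0 user32!InternalDialogBox+0xd0
06b4f68c 77c80193 mmcndmgr!CSnapInFailureReporter::ReportSnapInFailure+0x195
"""

print(threads_with(sample, "mscorwks!"))   # threads with CLR frames
print(threads_with(sample, "DialogBox"))   # threads showing a dialog box
```

The thread numbers returned are the ones to pass to `~Ns` before rerunning a context-sensitive command such as !analyze -v.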

- Dmitry Vostokov @ DumpAnalysis.org -

Crash Dump Analysis Patterns (Part 88)

Friday, October 23rd, 2009

Some modules, such as drivers or runtime DLLs, are always present after some action has happened. I call them Effect Components. They are the last modules we should assume to be “Cause” components, the “Root Cause” or the so-called “culprit” components. A typical example is dump disk driver symbolic references found in the execution residue on the raw stack of a running bugchecking thread:

0: kd> !thread
THREAD fffffa8002bdebb0  Cid 03c4.03f0  Teb: 000007fffffde000 Win32Thread: fffff900c20f9810 RUNNING on processor 0
IRP List:
    fffffa8002b986f0: (0006,0118) Flags: 00060000  Mdl: 00000000
Not impersonating
DeviceMap                 fffff88005346920
Owning Process            fffffa80035bec10       Image:         Application.exe
Attached Process          N/A            Image:         N/A
Wait Start TickCount      35246          Ticks: 7 (0:00:00:00.109)
Context Switch Count      1595                 LargeStack
UserTime                  00:00:00.000
KernelTime                00:00:00.031
Win32 Start Address Application (0x0000000140002708)
Stack Init fffffa600495ddb0 Current fffffa600495d720
Base fffffa600495e000 Limit fffffa6004955000 Call 0
Priority 11 BasePriority 8 PriorityDecrement 1 IoPriority 2 PagePriority 5
Child-SP          RetAddr           : Call Site
fffffa60`0495d558 fffff800`0186e3ee : nt!KeBugCheckEx
fffffa60`0495d560 fffff800`0186d2cb : nt!KiBugCheckDispatch+0x6e
fffffa60`0495d6a0 fffffa60`03d5917a : nt!KiPageFault+0x20b (TrapFrame @ fffffa60`0495d6a0)
[…]

0: kd> dps fffffa6004955000 fffffa600495e000
fffffa60`04955000  00d4d0c8`00d4d0c8
fffffa60`04955008  00d4d0c8`00d4d0c8
fffffa60`04955010  00d4d0c8`00d4d0c8
[…]
fffffa60`0495c7e0  00000000`00000001
fffffa60`0495c7e8  fffffa60`02877f6f dump_SATA_Driver!RecordExecutionHistory+0xcf
fffffa60`0495c7f0  fffffa80`024c05a8
fffffa60`0495c7f8  fffffa60`02869ad4 dump_dumpata!IdeDumpNotification+0x1a4
fffffa60`0495c800  fffffa60`0495cb00
fffffa60`0495c808  fffff800`0182ff34 nt!output_l+0x6c0
fffffa60`0495c810  fffffa60`02860110 crashdmp!StrBeginningDump
fffffa60`0495c818  fffffa60`0495cb00
fffffa60`0495c820  00000000`00000000
fffffa60`0495c828  fffffa60`02869b18 dump_dumpata!IdeDumpNotification+0x1e8
fffffa60`0495c830  00000000`00000000
fffffa60`0495c838  fffffa60`0495c8c0
fffffa60`0495c840  00000000`00000000
fffffa60`0495c848  fffffa60`00000024
fffffa60`0495c850  00000000`ffffffff
fffffa60`0495c858  00000000`00000000
fffffa60`0495c860  00000000`00000000
fffffa60`0495c868  fffffa60`0495cb00
fffffa60`0495c870  fffffa80`00000000
fffffa60`0495c878  00000000`00000000
fffffa60`0495c880  00000000`00000101
fffffa60`0495c888  fffffa60`02877f6f dump_SATA_Driver!RecordExecutionHistory+0xcf
fffffa60`0495c890  fffffa60`0495cb0f
fffffa60`0495c898  fffff800`0182ff34 nt!output_l+0x6c0
fffffa60`0495c8a0  fffffa60`0495cb0f
fffffa60`0495c8a8  fffffa60`0495cb90
fffffa60`0495c8b0  00000000`00000040
fffffa60`0495c8b8  fffffa60`02877f6f dump_SATA_Driver!RecordExecutionHistory+0xcf
fffffa60`0495c8c0  fffffa80`024c0728
fffffa60`0495c8c8  fffffa80`024c0728
fffffa60`0495c8d0  00000001`00000000
fffffa60`0495c8d8  fffffa60`00000026
fffffa60`0495c8e0  00000000`ffffffff
fffffa60`0495c8e8  00000000`00000000
fffffa60`0495c8f0  fffffa80`00000000
fffffa60`0495c8f8  fffffa60`0495cb90
fffffa60`0495c900  00000000`00000000
fffffa60`0495c908  fffffa60`02877f6f dump_SATA_Driver!RecordExecutionHistory+0xcf
fffffa60`0495c910  00000000`00000000
fffffa60`0495c918  fffffa60`02877f6f dump_SATA_Driver!RecordExecutionHistory+0xcf
fffffa60`0495c920  fffff880`05311010
fffffa60`0495c928  00000000`00000002
fffffa60`0495c930  fffffa60`02875094 dump_SATA_Driver!AhciAdapterControl
fffffa60`0495c938  fffffa80`024c6018
fffffa60`0495c940  fffffa80`024c0728
fffffa60`0495c948  fffffa60`02877f6f dump_SATA_Driver!RecordExecutionHistory+0xcf
fffffa60`0495c950  fffffa80`024c0728
fffffa60`0495c958  00000000`00000000
fffffa60`0495c960  fffffa60`0495ca18
fffffa60`0495c968  00000000`00000000
fffffa60`0495c970  fffffa80`024c0728
fffffa60`0495c978  fffffa60`02876427 dump_SATA_Driver!AhciHwInitialize+0x337
fffffa60`0495c980  fffffa80`024c0be6
fffffa60`0495c988  fffffa60`0286a459 dump_dumpata!IdeDumpWaitOnRequest+0x79
fffffa60`0495c990  00000000`00000000
fffffa60`0495c998  00000000`0000023a
fffffa60`0495c9a0  20474e55`534d4153
fffffa60`0495c9a8  204a4831`36314448
fffffa60`0495c9b0  20202020`20202020
fffffa60`0495c9b8  20202020`20202020
fffffa60`0495c9c0  fffffa80`024c05a8
fffffa60`0495c9c8  fffffa60`02869b18 dump_dumpata!IdeDumpNotification+0x1e8
fffffa60`0495c9d0  00000000`00000000
fffffa60`0495c9d8  fffffa60`0495ca60
fffffa60`0495c9e0  00000000`00000001
fffffa60`0495c9e8  fffffa60`02869396 dump_dumpata!IdeDumpMiniportChannelInitialize+0x236
fffffa60`0495c9f0  fffffa80`024c05a8
fffffa60`0495c9f8  fffffa60`02869ad4 dump_dumpata!IdeDumpNotification+0x1a4
fffffa60`0495ca00  00000000`00000000
fffffa60`0495ca08  fffffa60`0495ca90
fffffa60`0495ca10  00000000`00000001
fffffa60`0495ca18  00000001`00000038
fffffa60`0495ca20  00000000`10010000
fffffa60`0495ca28  00000000`00000003
fffffa60`0495ca30  fffffa80`024c05a8
fffffa60`0495ca38  fffffa60`0286a954 dump_dumpata!AtaPortGetPhysicalAddress+0x2c
fffffa60`0495ca40  fffffa80`024c0728
fffffa60`0495ca48  fffffa60`02877f6f dump_SATA_Driver!RecordExecutionHistory+0xcf
fffffa60`0495ca50  00000000`00000001
fffffa60`0495ca58  0000003f`022a8856
fffffa60`0495ca60  fffffa80`0000000c
fffffa60`0495ca68  fffffa80`024c0728
fffffa60`0495ca70  00000000`00000200
fffffa60`0495ca78  fffffa60`02877f6f dump_SATA_Driver!RecordExecutionHistory+0xcf
fffffa60`0495ca80  fffffa80`024c0728
fffffa60`0495ca88  ffff6226`4f5f3eb8
fffffa60`0495ca90  00000000`00000010
fffffa60`0495ca98  fffffa60`02860370 crashdmp!Context+0x30
fffffa60`0495caa0  fffffa80`024c05a8
fffffa60`0495caa8  fffffa60`02875a0d dump_SATA_Driver!AhciHwStartIo+0x69d
fffffa60`0495cab0  fffffa80`024c0728
fffffa60`0495cab8  00000000`00000000
fffffa60`0495cac0  00000000`00000001
fffffa60`0495cac8  fffff800`018f3dfc nt!DisplayCharacter+0x5c
fffffa60`0495cad0  00000000`00000000
fffffa60`0495cad8  fffffa60`02877f6f dump_SATA_Driver!RecordExecutionHistory+0xcf
fffffa60`0495cae0  00000000`00010000
fffffa60`0495cae8  00000000`00000000
fffffa60`0495caf0  fffffa60`0495cd10
fffffa60`0495caf8  fffffa60`0495cc00
fffffa60`0495cb00  fffffa80`024c01c0
fffffa60`0495cb08  fffffa60`02875c3f dump_SATA_Driver!AhciHwInterrupt+0x2b
fffffa60`0495cb10  fffffa80`024c05a8
fffffa60`0495cb18  00000000`00000000
fffffa60`0495cb20  00000000`00000000
fffffa60`0495cb28  fffff800`01d406c9 hal!KeStallExecutionProcessor+0x25
fffffa60`0495cb30  00000000`00010000
fffffa60`0495cb38  00000000`00000000
fffffa60`0495cb40  fffffa60`0495cd10
fffffa60`0495cb48  fffffa60`0495cc00
fffffa60`0495cb50  00000000`00000000
fffffa60`0495cb58  fffffa60`0286a429 dump_dumpata!IdeDumpWaitOnRequest+0x49
fffffa60`0495cb60  fffffa60`02860370 crashdmp!Context+0x30
fffffa60`0495cb68  00000000`d8bda325
fffffa60`0495cb70  00000000`00000000
fffffa60`0495cb78  00000000`0000033e
fffffa60`0495cb80  00000000`00000000
fffffa60`0495cb88  fffffa60`028694d2 dump_dumpata!IdeDumpWritePending+0xee
fffffa60`0495cb90  fffffa80`024c0000
fffffa60`0495cb98  fffffa80`024c01c0
fffffa60`0495cba0  00000000`00000000
fffffa60`0495cba8  00000000`00000000
fffffa60`0495cbb0  fffffa80`024c01c0
fffffa60`0495cbb8  fffffa80`01e3c740
fffffa60`0495cbc0  00000000`00010000
fffffa60`0495cbc8  00000000`00000000
fffffa60`0495cbd0  00000000`0c01f000
fffffa60`0495cbd8  fffffa60`0285bca9 crashdmp!WritePageSpanToDisk+0x181
fffffa60`0495cbe0  00000000`83d81000
fffffa60`0495cbe8  00000000`00000000
fffffa60`0495cbf0  fffffa60`02860370 crashdmp!Context+0x30
fffffa60`0495cbf8  00000000`00000002
fffffa60`0495cc00  00000000`00000000
fffffa60`0495cc08  00000000`00030000
fffffa60`0495cc10  00000000`00000000
fffffa60`0495cc18  fffffa60`00441000
fffffa60`0495cc20  fffffa60`00441000
fffffa60`0495cc28  00000000`00010000
fffffa60`0495cc30  00000000`0000c080
fffffa60`0495cc38  00000000`0000c081
fffffa60`0495cc40  00000000`0000c082
fffffa60`0495cc48  00000000`0000c083
fffffa60`0495cc50  00000000`0000c084
fffffa60`0495cc58  00000000`0000c085
fffffa60`0495cc60  00000000`0000c086
fffffa60`0495cc68  00000000`0000c087
fffffa60`0495cc70  00000000`0000c088
fffffa60`0495cc78  00000000`0000c089
fffffa60`0495cc80  00000000`0000c08a
fffffa60`0495cc88  00000000`0000c08b
fffffa60`0495cc90  00000000`0000c08c
fffffa60`0495cc98  00000000`0000c08d
fffffa60`0495cca0  00000000`0000c08e
fffffa60`0495cca8  00000000`0000c08f
fffffa60`0495ccb0  00000000`00000000
fffffa60`0495ccb8  00000000`00000000
fffffa60`0495ccc0  00000000`00000000
fffffa60`0495ccc8  00000000`00000010
fffffa60`0495ccd0  00000000`0000c01d
fffffa60`0495ccd8  fffffa60`02860370 crashdmp!Context+0x30
fffffa60`0495cce0  00000000`0000bf80
fffffa60`0495cce8  00000000`00000001
fffffa60`0495ccf0  00000000`00000000
fffffa60`0495ccf8  fffffa80`01e353d0
fffffa60`0495cd00  fffffa80`01e353f8
fffffa60`0495cd08  fffffa60`0285bacc crashdmp!WriteFullDump+0x70
fffffa60`0495cd10  00000002`3a3d8000
fffffa60`0495cd18  00000000`0000c080
fffffa60`0495cd20  fffffa80`00000000
fffffa60`0495cd28  fffffa60`0285c9c0 crashdmp!CrashdmpWriteRoutine
fffffa60`0495cd30  fffff880`05311010
fffffa60`0495cd38  00000000`00000002
fffffa60`0495cd40  fffffa60`0495cf70
fffffa60`0495cd48  00000000`00000000
fffffa60`0495cd50  fffffa60`02860370 crashdmp!Context+0x30
fffffa60`0495cd58  fffffa60`0285b835 crashdmp!DumpWrite+0xc5
fffffa60`0495cd60  00000000`00000000
fffffa60`0495cd68  00000000`0000000f
fffffa60`0495cd70  00000000`00000001
fffffa60`0495cd78  fffffa60`00000001
fffffa60`0495cd80  fffffa80`02bdebb0
fffffa60`0495cd88  fffffa60`0285b153 crashdmp!CrashdmpWrite+0x57
fffffa60`0495cd90  00000000`00000000
fffffa60`0495cd98  fffffa60`028602f0 crashdmp!StrInitPortDriver
fffffa60`0495cda0  00000000`00000000
fffffa60`0495cda8  fffffa60`02860a00 crashdmp!ContextCopy
fffffa60`0495cdb0  00000000`00000000
fffffa60`0495cdb8  fffff800`01902764 nt!IoWriteCrashDump+0x3f4
fffffa60`0495cdc0  fffffa60`0495ce00
fffffa60`0495cdc8  00000028`00000025
fffffa60`0495cdd0  fffff800`018afd40 nt! ?? ::FNODOBFM::`string'
fffffa60`0495cdd8  00000000`000000d1
fffffa60`0495cde0  fffff880`05311010
fffffa60`0495cde8  00000000`00000002
fffffa60`0495cdf0  00000000`00000000
fffffa60`0495cdf8  fffffa60`03d5917a
fffffa60`0495ce00  202a2a2a`0a0d0a0d
fffffa60`0495ce08  7830203a`504f5453
fffffa60`0495ce10  31443030`30303030
fffffa60`0495ce18  46464646`78302820
fffffa60`0495ce20  31333530`30383846
fffffa60`0495ce28  fffff800`018f5f83 nt!VidDisplayString+0x143
fffffa60`0495ce30  30303030`30300030
fffffa60`0495ce38  2c323030`30303030
fffffa60`0495ce40  30303030`30307830
fffffa60`0495ce48  30303030`30303030
fffffa60`0495ce50  46464678`302c3030
fffffa60`0495ce58  fffff800`018fe040 nt!KiInvokeBugCheckEntryCallbacks+0x80
fffffa60`0495ce60  fffffa80`02bdebb0
fffffa60`0495ce68  fffff800`01921d52 nt!InbvDisplayString+0x72
fffffa60`0495ce70  fffff880`05311000
fffffa60`0495ce78  fffff800`01d406c9 hal!KeStallExecutionProcessor+0x25
fffffa60`0495ce80  00000000`00000001
fffffa60`0495ce88  00000000`0000000a
fffffa60`0495ce90  fffffa60`03d5917a
fffffa60`0495ce98  00000000`40000082
fffffa60`0495cea0  00000000`00000001
fffffa60`0495cea8  fffff800`01922c3e nt!KeBugCheck2+0x92e
fffffa60`0495ceb0  fffff800`000000d1
fffffa60`0495ceb8  00000000`000004d0
fffffa60`0495cec0  fffff800`01a43640 nt!KiProcessorBlock
fffffa60`0495cec8  00000000`0000000a
fffffa60`0495ced0  fffffa60`03d5917a
fffffa60`0495ced8  fffffa60`0495cf70
fffffa60`0495cee0  fffffa80`02bdebb0
fffffa60`0495cee8  00000000`00000000
fffffa60`0495cef0  00000000`00000000
fffffa60`0495cef8  fffffa80`02bdebb0
fffffa60`0495cf00  00000000`c21a6d00
fffffa60`0495cf08  00000000`00000000
fffffa60`0495cf10  fffff800`0198e7a0 nt!KiInitialPCR+0x2a0
fffffa60`0495cf18  fffff800`0198e680 nt!KiInitialPCR+0x180
fffffa60`0495cf20  fffffa80`02bb7320
fffffa60`0495cf28  00000000`00000000
fffffa60`0495cf30  00000000`00000000
fffffa60`0495cf38  fffff960`00000003
fffffa60`0495cf40  fffffa60`0495e000
fffffa60`0495cf48  fffffa60`04955000
fffffa60`0495cf50  00000001`c0643000
fffffa60`0495cf58  00000000`00000000
fffffa60`0495cf60  fffff900`c06ca53c
fffffa60`0495cf68  fffffa60`0495d090
fffffa60`0495cf70  00000000`00000000
fffffa60`0495cf78  00000000`00000000
fffffa60`0495cf80  00000000`00000000
fffffa60`0495cf88  00000000`00000000
fffffa60`0495cf90  00000000`00000000
fffffa60`0495cf98  00000000`00000000
fffffa60`0495cfa0  00001f80`0010000f
fffffa60`0495cfa8  0053002b`002b0010
fffffa60`0495cfb0  00000286`0018002b
fffffa60`0495cfb8  00000000`00000000
fffffa60`0495cfc0  00000000`00000000
fffffa60`0495cfc8  00000000`00000000
fffffa60`0495cfd0  00000000`00000000
fffffa60`0495cfd8  00000000`00000000
fffffa60`0495cfe0  00000000`00000000
fffffa60`0495cfe8  fffffa60`0495d660
fffffa60`0495cff0  00000000`0000000a
fffffa60`0495cff8  fffff880`05311010
fffffa60`0495d000  fffff880`05311010
fffffa60`0495d008  fffffa60`0495d558
fffffa60`0495d010  fffffa60`0495d720
fffffa60`0495d018  fffffa80`02b986f0
fffffa60`0495d020  fffffa80`02b98720
fffffa60`0495d028  00000000`00000002
fffffa60`0495d030  00000000`00000000
fffffa60`0495d038  fffffa60`03d5917a
fffffa60`0495d040  00000000`000001f1
fffffa60`0495d048  fffffa80`026a9df0
fffffa60`0495d050  00000000`00000001
fffffa60`0495d058  00000000`83360018
fffffa60`0495d060  fffffa80`02b3ee40
fffffa60`0495d068  fffff800`0186e650 nt!KeBugCheckEx
fffffa60`0495d070  00000000`00000000
fffffa60`0495d078  00000000`00000000
fffffa60`0495d080  00000000`00000000
fffffa60`0495d088  00000000`00000000
fffffa60`0495d090  00000000`00000000
fffffa60`0495d098  00000000`00000000
fffffa60`0495d0a0  00000000`00000000
[…]

If a BSOD was reported after installing new drivers, we shouldn’t suspect the SATA_Driver package here, because its components would almost always be referenced on the raw stack of any bugcheck thread: they run after the bugcheck cause, while the dump is being written. Their presence is the “effect”. This example might seem trivial and pointless, but I’ve seen some memory dump analysis conclusions based on the reversal of causes and effects.
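
To make the separation concrete, here is a minimal sketch that counts symbolic references in saved `dps` output and puts dump-path modules in their own bucket. The function name and the prefix list are assumptions for illustration, taken only from the module names visible in the raw stack above (`dump_*` and `crashdmp`). They are not a general rule for every system.

```python
import re
from collections import Counter

# Assumption: modules with these name prefixes belong to the dump write path.
EFFECT_PREFIXES = ("dump_", "crashdmp")

# Address, value, then a module!symbol reference (absent for plain data).
DPS_LINE = re.compile(r"^[0-9a-f`]+\s+[0-9a-f`]+\s+(\w+)!")

def module_counts(dps_output):
    """Count symbolic references per module, split into effect and other."""
    effect, other = Counter(), Counter()
    for line in dps_output.splitlines():
        m = DPS_LINE.match(line.strip())
        if not m:
            continue  # plain data value without a symbol
        module = m.group(1)
        bucket = effect if module.startswith(EFFECT_PREFIXES) else other
        bucket[module] += 1
    return effect, other

sample = """\
fffffa60`0495c7e0  00000000`00000001
fffffa60`0495c7e8  fffffa60`02877f6f dump_SATA_Driver!RecordExecutionHistory+0xcf
fffffa60`0495c7f8  fffffa60`02869ad4 dump_dumpata!IdeDumpNotification+0x1a4
fffffa60`0495c810  fffffa60`02860110 crashdmp!StrBeginningDump
fffffa60`0495cea8  fffff800`01922c3e nt!KeBugCheck2+0x92e
"""

effect, other = module_counts(sample)
print("effect:", dict(effect))
print("other:", dict(other))
```

Modules in the "effect" bucket are expected on any bugcheck thread that writes a dump, so only the "other" bucket is worth a closer look when searching for a cause.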

Forthcoming Memory Dump Analysis Anthology, Volume 3

Saturday, September 26th, 2009

This is a revised, edited, cross-referenced and thematically organized volume of selected DumpAnalysis.org blog posts about crash dump analysis and debugging written in October 2008 - June 2009 for software engineers developing and maintaining products on Windows platforms, quality assurance engineers testing software on Windows platforms and technical support and escalation engineers dealing with complex software issues. The third volume features:

- 15 new crash dump analysis patterns
- 29 new pattern interaction case studies
- Trace analysis patterns
- Updated checklist
- Fully cross-referenced with Volume 1 and Volume 2
- New appendixes

Product information:

  • Title: Memory Dump Analysis Anthology, Volume 3
  • Author: Dmitry Vostokov
  • Language: English
  • Product Dimensions: 22.86 x 15.24 cm
  • Paperback: 404 pages
  • Publisher: Opentask (20 December 2009)
  • ISBN-13: 978-1-906717-43-8
  • Hardcover: 404 pages
  • Publisher: Opentask (30 January 2010)
  • ISBN-13: 978-1-906717-44-5

Back cover features 3D computer memory visualization image.

10 Common Mistakes in Memory Analysis (Part 5)

Monday, August 31st, 2009

Sometimes not paying attention to all aspects of the default analysis makes it difficult to consider an alternative troubleshooting hypothesis. Here is a sample of !analyze -v output showing massive patching (hooked functions pattern) by the DriverA module:

KERNEL_MODE_EXCEPTION_NOT_HANDLED (8e)
This is a very common bugcheck.  Usually the exception address pinpoints the driver/function that caused the problem.  Always note this address as well as the link date of the driver/image that contains this address. Some common problems are exception code 0x80000003.  This means a hard coded breakpoint or assertion was hit, but this system was booted /NODEBUG.  This is not supposed to happen as developers should never have hardcoded breakpoints in retail code, but ... If this happens, make sure a debugger gets connected, and the system is booted /DEBUG.  This will let us see why this breakpoint is happening.
Arguments:
Arg1: c0000005, The exception code that was not handled
Arg2: 8092d47f, The address that the exception occurred at
Arg3: f5205b14, Trap Frame
Arg4: 00000000

[...]

CHKIMG_EXTENSION: !chkimg -lo 50 -d !nt
    80822a49-80822a4d  5 bytes - nt!NtYieldExecution
 [ 8b ff 55 8b ec:e9 14 3a 95 76 ]
    80823c11-80823c14  4 bytes - nt!KeFlushProcessTb+2c (+0x11c8)
 [ 69 76 82 80:88 ff ff ff ]
    80823c17-80823c1a  4 bytes - nt!KeFlushProcessTb+32 (+0x06)
 [ dd 40 01 00:b5 34 b3 76 ]
    8083771f-80837725  7 bytes - nt!KeAcquireQueuedSpinLockAtDpcLevel+1b (+0x13b08)
 [ f7 41 04 01 00 00 00:e9 c4 f9 b1 76 cc cc ]
    80840945-8084094a  6 bytes - nt!KxFlushEntireTb+9 (+0x9226)
 [ ff 15 1c 10 80 80:e9 65 67 b1 76 cc ]
    80845fe0-80845fe3  4 bytes - nt!KeFlushSingleTb+49 (+0x569b)
 [ 14 1d ff ff:dd 10 b1 76 ]
    80845fe5 - nt!KeFlushSingleTb+4e (+0x05)
 [ b9:c3 ]
    8084722d-80847230  4 bytes - nt!KeFlushMultipleTb+45 (+0x1248)
 [ 5e e3 82 80:14 00 00 00 ]
    80847233-80847236  4 bytes - nt!KeFlushMultipleTb+4b (+0x06)
 [ c1 0a ff ff:99 fe b0 76 ]
    808c039c-808c039e  3 bytes - nt!NtSetContextThread
 [ 8b ff 55:e9 31 5f ]
    808c03a0 - nt!NtSetContextThread+4 (+0x04)
 [ ec:76 ]
    808e3184-808e3188  5 bytes - nt!NtCreateProcess (+0x22de4)
 [ 8b ff 55 8b ec:e9 0b 31 89 76 ]
    808f6ad0-808f6ad6  7 bytes - nt!NtLoadKeyEx (+0x1394c)
 [ 6a 70 68 98 4b 81 80:e9 e7 f8 87 76 90 90 ]
    8090c66f-8090c675  7 bytes - nt!NtDeleteValueKey (+0x15b9f)
 [ 6a 44 68 60 f0 81 80:e9 c4 9c 86 76 90 90 ]
    8090e36c-8090e370  5 bytes - nt!NtTerminateProcess (+0x1cfd)
 [ 8b ff 55 8b ec:e9 34 81 86 76 ]
    80915342-80915346  5 bytes - nt!NtDeleteKey (+0x6fd6)
 [ 8b ff 55 8b ec:e9 c7 0f 86 76 ]
    80918114-80918118  5 bytes - nt!NtOpenThread (+0x2dd2)
 [ 68 c4 00 00 00:e9 53 e1 85 76 ]
    80921eac-80921eb2  7 bytes - nt!NtEnumerateKey (+0x9d98)
 [ 6a 48 68 f0 f9 82 80:e9 f5 44 85 76 90 90 ]
    80922578-8092257e  7 bytes - nt!NtEnumerateValueKey (+0x6cc)
 [ 6a 48 68 10 fc 82 80:e9 13 3e 85 76 90 90 ]
    80922efd-80922f01  5 bytes - nt!NtNotifyChangeKey (+0x985)
 [ 8b ff 55 8b ec:e9 e4 34 85 76 ]
    809246fb-809246ff  5 bytes - nt!NtOpenProcess (+0x17fe)
 [ 68 c8 00 00 00:e9 58 1b 85 76 ]
    8092c8a0-8092c8a4  5 bytes - nt!NtCreateKey (+0x81a5)
 [ 68 c0 00 00 00:e9 55 9a 84 76 ]
    8092f3a6-8092f3ac  7 bytes - nt!NtSetValueKey (+0x2b06)
 [ 6a 58 68 a0 f6 82 80:e9 a3 6f 84 76 90 90 ]
    8092fa88-8092fa8c  5 bytes - nt!NtCreateFile (+0x6e2)
 [ 8b ff 55 8b ec:e9 ab 69 84 76 ]
    80931311-80931315  5 bytes - nt!NtOpenKey (+0x1889)
 [ 68 ac 00 00 00:e9 d0 4f 84 76 ]
    809316ed-809316f3  7 bytes - nt!NtQueryValueKey (+0x3dc)
 [ 6a 60 68 80 90 84 80:e9 72 4c 84 76 90 90 ]
    8093470f-80934715  7 bytes - nt!NtQueryKey (+0x3022)
 [ 6a 58 68 c8 97 84 80:e9 0e 1d 84 76 90 90 ]
    809354fa-80935500  7 bytes - nt!NtMapViewOfSection (+0xdeb)
 [ 6a 38 68 80 a2 84 80:e9 77 0f 84 76 90 90 ]
    80935785-80935789  5 bytes - nt!NtUnmapViewOfSection (+0x28b)
 [ 8b ff 55 8b ec:e9 02 0d 84 76 ]
    8093ba96-8093ba9c  7 bytes - nt!NtProtectVirtualMemory (+0x6311)
 [ 6a 44 68 40 03 85 80:e9 b1 a9 83 76 90 90 ]
    8093c86d-8093c871  5 bytes - nt!NtSetInformationProcess (+0xdd7)
 [ 68 08 01 00 00:e9 4c 9a 83 76 ]
    8093ce6b-8093ce71  7 bytes - nt!NtCreateProcessEx (+0x5fe)
 [ 6a 0c 68 58 0e 85 80:e9 38 94 83 76 90 90 ]
    80978fef-80978ff5  7 bytes - nt!NtQueryMultipleValueKey (+0x3c184)
 [ 6a 48 68 f0 f9 86 80:e9 86 d3 7f 76 90 90 ]
    80979775-8097977b  7 bytes - nt!NtRenameKey (+0x786)
 [ 6a 3c 68 38 fa 86 80:e9 a8 cb 7f 76 90 90 ]
    80979caf-80979cb3  5 bytes - nt!NtRestoreKey (+0x53a)
 [ 8b ff 55 8b ec:e9 46 c7 7f 76 ]
    8097a11c-8097a120  5 bytes - nt!NtUnloadKey (+0x46d)
 [ 8b ff 55 8b ec:e9 b1 c2 7f 76 ]
    8097a139-8097a13d  5 bytes - nt!NtReplaceKey (+0x1d)
 [ 8b ff 55 8b ec:e9 d0 c2 7f 76 ]
197 errors : !nt (80822a49-8097a13d)

MODULE_NAME: DriverA

IMAGE_NAME:  DriverA.sys

MEMORY_CORRUPTOR:  PATCH_DriverA

FAILURE_BUCKET_ID:  MEMORY_CORRUPTION_PATCH_DriverA

BUCKET_ID:  MEMORY_CORRUPTION_PATCH_DriverA
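
Most of the patched function entries in the `!chkimg` output above start with byte `e9`, the x86 opcode for a near jump with a 32-bit relative displacement. As a worked sketch (the helper names are hypothetical), the destination of such a hook can be computed from the patch address and the patched bytes:

```python
def parse_chkimg_bytes(bracketed):
    """Split a '[ original:patched ]' line into two bytes objects."""
    original, patched = bracketed.strip("[] ").split(":")
    return bytes.fromhex(original), bytes.fromhex(patched)

def jmp_target(address, patched_bytes):
    """Return the destination of an x86 'jmp rel32' (opcode e9), else None."""
    if len(patched_bytes) < 5 or patched_bytes[0] != 0xE9:
        return None
    rel32 = int.from_bytes(patched_bytes[1:5], "little", signed=True)
    # The displacement is relative to the address of the next instruction.
    return (address + 5 + rel32) & 0xFFFFFFFF

# First entry above: nt!NtYieldExecution patched at 80822a49.
_, patched = parse_chkimg_bytes("[ 8b ff 55 8b ec:e9 14 3a 95 76 ]")
print(hex(jmp_target(0x80822A49, patched)))  # 0xf7176462
```

Whether a computed destination falls inside a particular driver would still have to be checked against the loaded module list, for example with `lm`. The sketch only does the arithmetic.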

However, when we look at the stack trace, we see that the BSOD happened when accessing the registry while updating drivers:

FAULTING_IP:
nt!HvpGetCellMapped+97
8092d47f 8b4604          mov     eax,dword ptr [esi+4]

TRAP_FRAME:  f5205b14 -- (.trap 0xfffffffff5205b14)
ErrCode = 00000000
eax=e1021000 ebx=e101a3b8 ecx=00000003 edx=89214988 esi=00000100 edi=00000000
eip=8092d47f esp=f5205b88 ebp=f5205bfc iopl=0         nv up ei pl nz na pe nc
cs=0008  ss=0010  ds=0023  es=0023  fs=0030  gs=0000             efl=00010206
nt!HvpGetCellMapped+0x97:
8092d47f 8b4604          mov     eax,dword ptr [esi+4] ds:0023:00000104=????????
Resetting default scope

PROCESS_NAME:  updatedrivers.exe

STACK_TEXT: 
f52056e0 8085bb9f 0000008e c0000005 8092d47f nt!KeBugCheckEx+0x1b
f5205aa4 808346b4 f5205ac0 00000000 f5205b14 nt!KiDispatchException+0x3a2
f5205b0c 80834668 f5205bfc 8092d47f badb0d00 nt!CommonDispatchException+0x4a
f5205b98 8092d559 e101a3b8 e63a8e40 0010fc18 nt!Kei386EoiHelper+0x186
f5205bfc 80920fcd e101a3b8 00610052 3b9aca07 nt!HvpGetCellMapped+0x36a
f5205c20 8092248b e63a8e40 e22b4794 00000000 nt!CmpGetValueKeyFromCache+0xa4
f5205cc0 80922649 e63a8e40 00000000 00000001 nt!CmEnumerateValueKey+0x45a
f5205d44 80833bdf 00000058 00000000 00000001 nt!NtEnumerateValueKey+0x1c9
f5205d44 7c9485ec 00000058 00000000 00000001 nt!KiFastCallEntry+0xfc
WARNING: Frame IP not in any known module. Following frames may be wrong.
001290fc 00000000 00000000 00000000 00000000 0×7c9485ec

So an alternative hypothesis to pursue would be some sort of registry corruption after driver updates.

- Dmitry Vostokov @ DumpAnalysis.org -

10 Common Mistakes in Memory Analysis (Part 4)

Friday, July 3rd, 2009

One of the common mistakes that I observe is habitually sticking to certain WinDbg commands to recognize patterns. One example is the !locks command, used to find wait chains and deadlock conditions among threads. Recently a service process was reported to hang, and the !locks command showed no blocked threads:

0:000> !locks
CritSec +18caf94 at 018CAF94
LockCount          -2
RecursionCount     1
OwningThread       58e8
EntryCount         0
ContentionCount    0
*** Locked

CritSec +18cc7c4 at 018CC7C4
LockCount          -2
RecursionCount     1
OwningThread       58e8
EntryCount         0
ContentionCount    0
*** Locked

The number of threads waiting for the lock is 0 (this calculation is explained in the MSDN article): 

0:000> ? ((-1) - (-2)) >> 2
Evaluate expression: 0 = 00000000
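
The same decoding can be sketched outside the debugger. This is only an illustration under the assumption that LockCount uses the bit layout documented for Windows Server 2003 SP1 and later (bit 0 clear means locked, bit 1 clear means a waiter has been woken, and the remaining bits hold the ones-complement of the number of waiting threads); the function name decode_lock_count is hypothetical:

```python
def decode_lock_count(lock_count: int) -> dict:
    """Decode a critical section LockCount value (assumed post-2003 SP1 layout)."""
    value = lock_count & 0xFFFFFFFF  # view the signed field as 32 raw bits
    return {
        "locked": (value & 1) == 0,        # bit 0: 0 = locked, 1 = not locked
        "waiter_woken": (value & 2) == 0,  # bit 1: 0 = a thread has been woken
        # Remaining bits: ones-complement of the waiter count,
        # the same arithmetic as ? ((-1) - (LockCount)) >> 2 in WinDbg.
        "waiting_threads": ((-1 - lock_count) & 0xFFFFFFFF) >> 2,
    }


print(decode_lock_count(-2))
```

For the value -2 shown by !locks above, the sketch reports a locked critical section with no waiting threads, which matches the debugger expression.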

In the past, memory dumps of this hanging service always showed !locks LockCount values corresponding to several waiting threads. Therefore, an engineer assumed that the dump had been taken at some random time, not while the service was hanging, and asked for a new, correct dump. The mistake here is that the engineer didn’t look at the stack trace of the owning thread, which shows the characteristic pattern of a blocked thread waiting for a reply from an LRPC call:

0:000> ~~[58e8]kc 100

ntdll!KiFastSystemCallRet
ntdll!NtRequestWaitReplyPort
RPCRT4!LRPC_CCALL::SendReceive
RPCRT4!I_RpcSendReceive
RPCRT4!NdrSendReceive
RPCRT4!NdrClientCall2

ServiceA!foo
[…]
ServiceA!bar
RPCRT4!NdrStubCall2
RPCRT4!NdrServerCall2
RPCRT4!DispatchToStubInCNoAvrf
RPCRT4!RPC_INTERFACE::DispatchToStubWorker
RPCRT4!RPC_INTERFACE::DispatchToStub
RPCRT4!RPC_INTERFACE::DispatchToStubWithObject
RPCRT4!LRPC_SCALL::DealWithRequestMessage
RPCRT4!LRPC_ADDRESS::DealWithLRPCRequest
RPCRT4!LRPC_ADDRESS::ReceiveLotsaCalls
RPCRT4!RecvLotsaCallsWrapper
RPCRT4!BaseCachedThreadRoutine
RPCRT4!ThreadStartRoutine
kernel32!BaseThreadStart

We don’t see other blocked threads and wait chains because the dump was saved as soon as the freezing condition was detected: the service didn’t allow a user connection to proceed. If more users tried to connect we would have seen critical section wait chains that are absent in this dump.

To prevent such mistakes, checklists are indispensable. For one example, see Crash Dump Analysis Checklist. You can also order it in print:

WinDbg: A Reference Poster and Learning Cards

- Dmitry Vostokov @ DumpAnalysis.org -

WinDbg In Use: Debugging Exercises

Wednesday, December 24th, 2008

The analogy between learning a complex tool with its own language and learning a foreign natural language has been developed further since the release of WinDbg Learning Cards and has finally culminated in the “WinDbg In Use” book series, with the first book to be published during the 1st quarter of 2009:

  • Title: WinDbg In Use: Debugging Exercises (Elementary and Intermediate Level)
  • Author: Dmitry Vostokov
  • Publisher: Opentask (15 March 2009)
  • Language: English
  • Product Dimensions: 23.5 x 19.1
  • ISBN-13: 978-1-906717-50-6
  • Paperback: 200 pages
  • Book Annotation: Includes 60 programmed exercises from real life debugging and crash dump analysis scenarios and multiple-choice questions with full answers, comments and suggestions for further reading.

Some example exercises will be published on this blog from time to time. I also plan a corresponding column in the forthcoming Debugged! magazine. 

- Dmitry Vostokov @ DumpAnalysis.org -

Debugged! Magazine

Tuesday, November 25th, 2008

As one of the new initiatives for the Year of Debugging, DumpAnalysis Portal will publish a bimonthly, full color, 16-page publication called:

Debugged! MZ/PE: MagaZine for/from Practicing Engineers
The only serial publication dedicated entirely to Windows® debugging

The first issue is planned for March 2009 and will have ISBN-13: 978-1-906717-38-4. If it goes well, I’m planning to have an ISSN assigned to it too. More details will be announced soon.

- Dmitry Vostokov @ DumpAnalysis.org -

10 Common Mistakes in Memory Analysis (Part 3)

Wednesday, October 29th, 2008

In part 1 we discussed the common mistake of not looking at full stack traces. In this part we discuss the common mistake of not looking at all stack traces. This is important when the dump is partially truncated or inconsistent. For example, in one complete memory dump from a hung system, the WinDbg !locks command cannot traverse the locks at all because the dump is truncated:

3: kd> !locks
**** DUMP OF ALL RESOURCE OBJECTS ****
KD: Scanning for held locks.......Error 1 in reading nt!_ERESOURCE.SystemResourcesList.Flink @ f71612a0

The common response, especially from beginners, would be to dismiss this dump and request a new one after increasing the page file size. However, dumping all thread stacks reveals resource contention around ERESOURCE objects similar to what was discussed in a mixed object deadlock example in kernel space:

3: kd> !stacks
Proc.Thread  .Thread  Ticks   ThreadState Blocker
[...]
                            [85973590 csrss.exe]
4138.0051e0  85961db0 00cb222 Blocked    driverA+0xec08
4138.0048c8  85d1d240 000006d Blocked    nt!ExEnterCriticalRegionAndAcquireResourceExclusive+0x19
4138.0054cc  85c8a840 00c0d50 Blocked    driverA+0xec08
4138.00227c  859be330 00c0d53 Blocked    driverA+0xec08
4138.0053d8  8590f458 00000df Blocked    nt!ExEnterCriticalRegionAndAcquireResourceExclusive+0x19
4138.003bb4  85b61020 00000e1 Blocked    nt!ExEnterCriticalRegionAndAcquireResourceExclusive+0x19
4138.002a08  85d1edb0 00000e1 Blocked    nt!ExEnterCriticalRegionAndAcquireResourceExclusive+0x19
4138.005e6c  85943020 00cc9cc Blocked    driverA+0xec08
4138.00575c  858eeb40 00c0d4e Blocked    driverA+0xec08
4138.003880  858ee5f8 00c0d51 Blocked    driverA+0xec08

                            [85bb9b18 winlogon.exe]
50e0.0054d4  85a8cb30 00c0d53 Blocked    driverA+0xec08
50e0.004b90  85b6c7b8 000001a Blocked    nt!ExEnterCriticalRegionAndAcquireResourceExclusive+0x19
50e0.0032cc  85a1f850 0000084 Blocked    nt!ExEnterCriticalRegionAndAcquireResourceExclusive+0x19
50e0.005450  85c43db0 0000014 Blocked    nt!ExEnterCriticalRegionAndAcquireResourceExclusive+0x19
50e0.005648  85a1f5e0 0000015 Blocked    nt!ExEnterCriticalRegionAndAcquireResourceExclusive+0x19
50e0.004a80  85a7abd8 000001b Blocked    nt!ExEnterCriticalRegionAndAcquireResourceExclusive+0x19
50e0.0036d8  85d886a8 000001b Blocked    nt!ExEnterCriticalRegionAndAcquireResourceExclusive+0x19
50e0.0055b0  85d88438 0000014 Blocked    nt!ExEnterCriticalRegionAndAcquireResourceExclusive+0x19
50e0.004380  85962020 00c0d53 Blocked    driverA+0xec08
50e0.005744  85a22db0 0000015 Blocked    nt!ExEnterCriticalRegionAndAcquireResourceExclusive+0x19
50e0.005dd4  8584c7a0 0000015 Blocked    nt!ExEnterCriticalRegionAndAcquireResourceExclusive+0x19
50e0.005e30  858902f0 0000018 Blocked    nt!ExEnterCriticalRegionAndAcquireResourceExclusive+0x19
50e0.005ce8  857bbdb0 00c0d53 Blocked    driverA+0xec08

                            [85914868 explorer.exe]
5fd8.005fdc  85911020 0000016 Blocked    nt!ExEnterCriticalRegionAndAcquireResourceExclusive+0x19
5fd8.005fec  8579d020 00bc253 Blocked    driverA+0xec08
5fd8.005ff8  857ce020 0000014 Blocked    nt!ExEnterCriticalRegionAndAcquireResourceExclusive+0x19
5fd8.003678  857ce8d0 00bc253 Blocked    driverA+0xec08
5fd8.00556c  857ce3f0 00b85d9 Blocked    driverA+0xec08
5fd8.005564  857e4db0 00bc253 Blocked    driverA+0xec08
5fd8.005548  86529380 00bc253 Blocked    driverA+0xec08
5fd8.006fd8  856095c8 00bc253 Blocked    driverA+0xec08
5fd8.001844  85d50020 00bc253 Blocked    driverA+0xec08
5fd8.0069cc  85ab8db0 000001a Blocked    nt!ExEnterCriticalRegionAndAcquireResourceExclusive+0x19
5fd8.0057c4  85fea2b0 00bc253 Blocked    driverA+0xec08
5fd8.00394c  85a475b8 00bc253 Blocked    driverA+0xec08
5fd8.004a8c  86090020 00bc253 Blocked    driverA+0xec08
5fd8.00583c  85990db0 00bc253 Blocked    driverA+0xec08

                            [858634a0 ApplicationA.EXE]
5b7c.005ad8  8597ddb0 0078325 Blocked    driverA+0xec08
5b7c.0058b4  85735020 00b6852 Blocked    driverA+0xec08
5b7c.00598c  8597db40 000001a Blocked    nt!ExEnterCriticalRegionAndAcquireResourceExclusive+0x19
5b7c.0059dc  85746a18 000001a Blocked    nt!ExEnterCriticalRegionAndAcquireResourceExclusive+0x19
5b7c.005b3c  85733ae8 0000016 Blocked    nt!ExEnterCriticalRegionAndAcquireResourceExclusive+0x19
5b7c.005934  85733878 0000018 Blocked    nt!ExEnterCriticalRegionAndAcquireResourceExclusive+0x19
5b7c.002b68  85bb8a40 0000016 Blocked    nt!ExEnterCriticalRegionAndAcquireResourceExclusive+0x19
5b7c.0016dc  85747438 0000018 Blocked    nt!ExEnterCriticalRegionAndAcquireResourceExclusive+0x19
5b7c.003fc0  8577ea60 00b6852 Blocked    driverA+0xec08
5b7c.0066a4  8595c2f8 0000016 Blocked    nt!ExEnterCriticalRegionAndAcquireResourceExclusive+0x19
5b7c.006b50  893d5660 0000018 Blocked    nt!ExEnterCriticalRegionAndAcquireResourceExclusive+0x19
5b7c.0066f4  8605f530 00b6852 Blocked    driverA+0xec08
5b7c.001554  85930cf0 00b6852 Blocked    driverA+0xec08
5b7c.006f28  86132db0 00b6852 Blocked    driverA+0xec08
5b7c.004448  85aa6890 0000016 Blocked    nt!ExEnterCriticalRegionAndAcquireResourceExclusive+0x19
5b7c.000fa8  859073c8 00b6852 Blocked    driverA+0xec08

                            [8595c928 ApplicationB.exe]
5990.0059a0  857c5508 000001a Blocked    nt!ExEnterCriticalRegionAndAcquireResourceExclusive+0x19
5990.005950  85ce7548 00b3b52 Blocked    driverA+0xec08
5990.005c10  856dc910 00b3b52 Blocked    driverA+0xec08
5990.005bd4  85767b40 00b3b52 Blocked    driverA+0xec08
5990.005e38  859b6a18 000001a Blocked    nt!ExEnterCriticalRegionAndAcquireResourceExclusive+0x19
5990.005f14  85a747a0 0000015 Blocked    nt!ExEnterCriticalRegionAndAcquireResourceExclusive+0x19
5990.005e68  85989020 0000015 Blocked    nt!ExEnterCriticalRegionAndAcquireResourceExclusive+0x19
5990.005f10  859f42d8 0000015 Blocked    nt!ExEnterCriticalRegionAndAcquireResourceExclusive+0x19
5990.005f0c  856ec5e8 00b3b52 Blocked    driverA+0xec08
5990.0045d0  856ec9a8 0000016 Blocked    nt!ExEnterCriticalRegionAndAcquireResourceExclusive+0x19
5990.004584  85728020 0000018 Blocked    nt!ExEnterCriticalRegionAndAcquireResourceExclusive+0x19
5990.004754  8572d818 0000016 Blocked    nt!ExEnterCriticalRegionAndAcquireResourceExclusive+0x19
5990.004b94  856cf020 00b3b52 Blocked    driverA+0xec08
5990.003374  85722db0 0000016 Blocked    nt!ExEnterCriticalRegionAndAcquireResourceExclusive+0x19
5990.000b1c  8647ddb0 00b3b52 Blocked    driverA+0xec08
5990.003bdc  85f812f0 00b3b52 Blocked    driverA+0xec08

                            [859bd598 dllhost.exe]
5e3c.00591c  8593e2f0 000001a Blocked    nt!ExEnterCriticalRegionAndAcquireResourceExclusive+0x19
5e3c.005e60  85777db0 000006e Blocked    nt!ExEnterCriticalRegionAndAcquireResourceExclusive+0x19
5e3c.005e64  85978b40 0000018 Blocked    nt!ExEnterCriticalRegionAndAcquireResourceExclusive+0x19
5e3c.0055c8  85903358 0000018 Blocked    nt!ExEnterCriticalRegionAndAcquireResourceExclusive+0x19

[...]

Threads Processed: 1500

Different methods of listing all thread stacks are described in the Stack Trace Collection pattern.
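
With 1500 threads, the contention is easier to see when the rows are counted per Blocker value. The following is a rough sketch that assumes the !stacks output has been saved as text in the column layout shown above; the regular expression and the function name tally_blockers are illustrative assumptions, not part of any debugger extension:

```python
import re
from collections import Counter

# One thread row: "PID.TID  ETHREAD  ticks  state  blocker"
ROW = re.compile(
    r"^\s*[0-9a-f]+\.[0-9a-f]+\s+[0-9a-f]+\s+[0-9a-f]+\s+(\w+)\s+(\S+)\s*$",
    re.IGNORECASE,
)


def tally_blockers(stacks_output: str) -> Counter:
    """Count blocked threads per Blocker column value."""
    counts = Counter()
    for line in stacks_output.splitlines():
        match = ROW.match(line)
        if match and match.group(1) == "Blocked":
            counts[match.group(2)] += 1
    return counts


sample = """\
                            [859bd598 dllhost.exe]
5e3c.00591c  8593e2f0 000001a Blocked    nt!ExEnterCriticalRegionAndAcquireResourceExclusive+0x19
4138.0051e0  85961db0 00cb222 Blocked    driverA+0xec08
4138.0054cc  85c8a840 00c0d50 Blocked    driverA+0xec08
"""
for blocker, count in tally_blockers(sample).most_common():
    print(f"{count:4d}  {blocker}")
```

Process header lines and the column header do not match the row pattern, so only thread rows are counted.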

- Dmitry Vostokov @ DumpAnalysis.org -

MDAA Volume 2 is available on Amazon and B&N

Saturday, October 18th, 2008

The paperback edition of Memory Dump Analysis Anthology, Volume 2 is finally available on Amazon and Barnes & Noble. Search Inside is also available on Amazon. In addition, I updated the list of recommended books:

Listmania! Crash Dump Analysis and Debugging

The hardcover edition will be available on Amazon and B&N in 2-3 weeks.

- Dmitry Vostokov @ DumpAnalysis.org -

Bugtation No.51

Wednesday, October 15th, 2008

The following bugtation is quite wise and dedicated to beginners learning WinDbg (see Common Mistakes and Coincidental Symbolic Information for some examples).

“You rule the” debugger, “not the” debugger “you”.

John Dryden, The Hind and the Panther

- Dmitry Vostokov @ DumpAnalysis.org -

Memory Dump Analysis Anthology, Volume 2

Friday, October 3rd, 2008

“Everything is memory dump.”

I’m very excited to announce that Volume 2 is available in paperback, hardcover and digital editions:

Memory Dump Analysis Anthology, Volume 2

In one or two weeks the paperback edition should also appear on Amazon and in other bookstores. The Amazon hardcover edition is planned to be available by the end of October.

I’m often asked when Volume 3 will be available, and I currently plan to release it in October - November, 2009. In the meantime I’m planning to concentrate on other publishing projects.

- Dmitry Vostokov @ DumpAnalysis.org -

MDAA Volume 2: Table of Contents

Wednesday, October 1st, 2008

The book is nearly finished and here is the final TOC:

Memory Dump Analysis Anthology, Volume 2: Table of Contents

- Dmitry Vostokov @ DumpAnalysis.org -

10 Common Mistakes in Memory Analysis (Part 2)

Tuesday, June 17th, 2008

Mistake #2 - Not seeing semantic and pragmatic inconsistencies

Why would FreeHeap need a file name? See Incorrect Stack Trace pattern case study for semantic inconsistency. Why is this function on the stack trace

dll!exit+0x10,834

67,636 bytes long (the offset 0x10,834 converted to decimal)?

The latter is an example of pragmatic inconsistency. The answer is that we don’t have symbols, so the name comes from the DLL export table. When proper symbols are applied, the code on the stack turns out to have nothing to do with an exit action.
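
This kind of check can be roughly automated. The sketch below scans stack trace text for module!symbol+0xoffset frames and reports those whose offset exceeds a threshold; the 0x4000 threshold, the regular expression, and the function name suspicious_frames are arbitrary assumptions chosen for illustration, and offsets must be written without digit-grouping commas:

```python
import re

FRAME = re.compile(
    r"(?P<module>[^!\s]+)!(?P<symbol>[^+\s]+)\+0x(?P<offset>[0-9a-fA-F]+)"
)


def suspicious_frames(stack_text: str, max_offset: int = 0x4000):
    """Yield (frame, offset) pairs whose offset from the symbol exceeds max_offset."""
    for match in FRAME.finditer(stack_text):
        offset = int(match.group("offset"), 16)
        if offset > max_offset:
            yield match.group(0), offset


trace = """\
dll!exit+0x10834
mfc71!CWnd::WindowProc+0x22
"""
for frame, offset in suspicious_frames(trace):
    print(f"{frame}: {offset:,} bytes from the nearest symbol")
```

A large offset does not prove that the symbol is wrong, but it is a cheap hint that the name may come from an export table rather than from proper symbols.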

Another example. The memory dump of a hanging process has only one thread and it is waiting for an event. Is this the problem in ThreadProc and application logic or in the fact that _endthreadex was called when the thread was created?

STACK_TEXT: 
0379fa50 7642dcea ntdll!NtWaitForMultipleObjects+0x15
0379faec 75e08f76 kernel32!WaitForMultipleObjectsEx+0x11d
0379fb40 75e08fbf user32!RealMsgWaitForMultipleObjectsEx+0x14d
0379fb5c 00f6b45d user32!MsgWaitForMultipleObjects+0x1f
0379fba8 752e29bb application!ThreadProc+0xad
0379fbe0 752e2a47 msvcr80!_endthreadex+0x3b
0379fbe8 7649e3f3 msvcr80!_endthreadex+0xc7
0379fbf4 7773cfed kernel32!BaseThreadInitThunk+0xe
0379fc34 7773d1ff ntdll!__RtlUserThreadStart+0x23
0379fc4c 00000000 ntdll!_RtlUserThreadStart+0x1b

The latter assumption is wrong. _endthreadex appears because the return address saved on the stack points to runtime code that calls it automatically when the user thread procedure returns normally:

0:000> u 752e29bb
msvcr80!_endthreadex+0x3b:
752e29bb 50              push    eax
752e29bc e8bfffffff      call    msvcr80!_endthreadex (752e2980)
752e29c1 8b45ec          mov     eax,dword ptr [ebp-14h]
752e29c4 8b08            mov     ecx,dword ptr [eax]
752e29c6 8b09            mov     ecx,dword ptr [ecx]
752e29c8 894de4          mov     dword ptr [ebp-1Ch],ecx
752e29cb 50              push    eax
752e29cc 51              push    ecx

A thread procedure passed to a thread creation API call can be any C function. How would a C/C++ compiler know that it needs to generate a call to the thread exit API, especially if ThreadProc is named FooBar and resides in a different compilation unit or a library? It seems logical that the runtime environment provides such an automatic return address dynamically. Also, why and how would _endthreadex know about our custom ThreadProc in order to call it? That looks like an inconsistency. The ability to see and reason about such inconsistencies is a very important skill in memory dump analysis and debugging. The lack of sufficient unmanaged code programming experience might partly explain many analysis mistakes.
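
The reasoning can be pictured with a toy model. This is not the CRT implementation; thread_start and end_thread are hypothetical stand-ins for a runtime wrapper and its thread exit routine. The point is only that the wrapper's frame sits below the user procedure on the stack while the procedure is still running, long before any exit routine is called:

```python
import traceback


def end_thread(exit_code):
    """Hypothetical stand-in for the runtime's thread exit routine."""
    return exit_code


def thread_start(thread_proc, argument):
    """Hypothetical stand-in for the runtime wrapper that owns the new thread."""
    exit_code = thread_proc(argument)  # the return address of this call stays on the stack
    return end_thread(exit_code)       # reached only after thread_proc returns


def my_thread_proc(argument):
    # Capture the call stack while the procedure is still running.
    return [frame.name for frame in traceback.extract_stack()]


frames = thread_start(my_thread_proc, None)
print(frames[-2:])
```

The last two captured frames are thread_start and my_thread_proc, mirroring how a runtime frame appears under application!ThreadProc in the trace above even though the thread has not exited.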

- Dmitry Vostokov @ DumpAnalysis.org -

10 Common Mistakes in Memory Analysis (Part 1)

Tuesday, May 27th, 2008

Mistake #1 - Not looking at full stack traces

By default WinDbg cuts off stack traces after the 20th line, and an analyst misses essential information when looking at a Stack Trace or Stack Trace Collection. Consider this thread stack trace taken from a user process dump where runaway information was not saved but customers reported CPU spikes:

0:000> ~3kvL
ChildEBP RetAddr
0290f864 773976f2 user32!_SEH_prolog+0xb
0290f86c 0047f9ec user32!EnableMenuItem+0xf
0290f884 00488f6d Application!Close+0x142c
0290f8a4 0047a9c6 Application!EnableMenu+0x5d
0290f8b8 0048890d Application!EnableWindows+0x106
0290f8d0 0048cc2b Application!SetHourGlass+0xbd
0290f8fc 0046327a Application!WriteDataStream+0x24b
0290f924 0048d8f9 Application!WriteDataStream+0x21a
0290fa68 00479811 Application!WriteDataStream+0xcb9
0290fadc 5b5e976c Application!OnWrite+0x3c1
0290fb70 5b60e0b0 mfc71!CWnd::OnWndMsg+0x4f2
0290fb90 5b60e14f mfc71!CWnd::WindowProc+0x22
0290fbf0 5b60e1b8 mfc71!AfxCallWndProc+0x91
0290fc10 00516454 mfc71!AfxWndProc+0x46
0290fc3c 7739c3b7 Application!ExitCheck+0x28f34
0290fc68 7739c484 user32!InternalCallWinProc+0x28
0290fce0 77395563 user32!UserCallWinProcCheckWow+0x151
0290fd10 773ad03f user32!CallWindowProcAorW+0x98
0290fd30 0047a59a user32!CallWindowProcA+0x1b

We can see that it uses MFC libraries and the window messaging API, but was it caught accidentally? Is it a typical message loop, like the idle message loops in the Passive Thread pattern that use GetMessage, or is it an active GUI message pump that uses PeekMessage? If we expand the stack trace, we see that the thread is actually an MFC GUI thread that spins according to the MFC source code:

int CWinThread::Run()
{
  for (;;)
  {
    while (bIdle &&
      !::PeekMessage(&(pState->m_msgCur), 
         NULL, NULL, NULL, PM_NOREMOVE))
     {

0:000> ~3kvL 100
ChildEBP RetAddr
0290f864 773976f2 user32!_SEH_prolog+0xb
0290f86c 0047f9ec user32!EnableMenuItem+0xf
0290f884 00488f6d Application!Close+0x142c
0290f8a4 0047a9c6 Application!EnableMenu+0x5d
0290f8b8 0048890d Application!EnableWindows+0x106
0290f8d0 0048cc2b Application!SetHourGlass+0xbd
0290f8fc 0046327a Application!WriteDataStream+0x24b
0290f924 0048d8f9 Application!WriteDataStream+0x21a
0290fa68 00479811 Application!WriteDataStream+0xcb9
0290fadc 5b5e976c Application!OnWrite+0x3c1
0290fb70 5b60e0b0 mfc71!CWnd::OnWndMsg+0x4f2
0290fb90 5b60e14f mfc71!CWnd::WindowProc+0x22
0290fbf0 5b60e1b8 mfc71!AfxCallWndProc+0x91
0290fc10 00516454 mfc71!AfxWndProc+0x46
0290fc3c 7739c3b7 Application!ExitCheck+0x28f34
0290fc68 7739c484 user32!InternalCallWinProc+0x28
0290fce0 77395563 user32!UserCallWinProcCheckWow+0x151
0290fd10 773ad03f user32!CallWindowProcAorW+0x98
0290fd30 0047a59a user32!CallWindowProcA+0x1b
0290fdb0 7739c3b7 Application!OnOK+0x77a
0290fddc 7739c484 user32!InternalCallWinProc+0x28
0290fe54 7739c73c user32!UserCallWinProcCheckWow+0x151
0290febc 7738e406 user32!DispatchMessageWorker+0x327
0290fecc 5b609076 user32!DispatchMessageA+0xf
0290fedc 5b60913e mfc71!AfxInternalPumpMessage+0x3e
0290fef8 004ba7cf mfc71!CWinThread::Run+0x54
0290ff04 5b61b30c Application!CMyThread::Run+0xf
0290ff84 5b869565 mfc71!_AfxThreadEntry+0x100
0290ffb8 77e66063 msvcr71!_endthreadex+0xa0
0290ffec 00000000 kernel32!BaseThreadStart+0x34

There is also the WinDbg .kframes meta-command, which changes the default stack trace depth:

2: kd> .kframes 0n100
Default stack trace depth is 0n100 frames
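
A quick sanity check for a saved trace is to see whether its oldest frame is a thread entry point. The sketch below is a heuristic only: the marker list is an assumption based on the entry-point names that appear in the traces in these posts plus kernel32!BaseProcessStart, and the function name looks_truncated is hypothetical:

```python
# Assumed thread and process entry-point names; extend as needed.
THREAD_START_MARKERS = ("BaseThreadStart", "RtlUserThreadStart", "BaseProcessStart")


def looks_truncated(frames):
    """Return True when the oldest frame is not a recognised entry point."""
    if not frames:
        return True
    oldest = frames[-1]
    return not any(marker in oldest for marker in THREAD_START_MARKERS)


short_trace = [
    "user32!_SEH_prolog+0xb",
    "user32!CallWindowProcA+0x1b",
]
full_trace = short_trace + [
    "msvcr71!_endthreadex+0xa0",
    "kernel32!BaseThreadStart+0x34",
]
print(looks_truncated(short_trace), looks_truncated(full_trace))
```

When the check reports a possibly truncated trace, rerunning the stack command with an explicit frame count, as in ~3kvL 100 above, is the obvious next step.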

- Dmitry Vostokov @ DumpAnalysis.org -