Interrupts and exceptions explained (Part 5)

July 20th, 2007

In this part I’ll show how to simulate the WinDbg .trap command by hand when you have an x64 Windows kernel or complete memory dump.

When a fault occurs, an x64 processor saves some registers on the current thread stack, as explained previously in Part 2. Then the interrupt handler builds _KTRAP_FRAME on the stack:

6: kd> uf nt!KiPageFault
nt!KiPageFault:
fffff800`0102d400 push    rbp
fffff800`0102d401 sub     rsp,158h
fffff800`0102d408 lea     rbp,[rsp+80h]
fffff800`0102d410 mov     byte ptr [rbp-55h],1
fffff800`0102d414 mov     qword ptr [rbp-50h],rax
fffff800`0102d418 mov     qword ptr [rbp-48h],rcx
fffff800`0102d41c mov     qword ptr [rbp-40h],rdx
fffff800`0102d420 mov     qword ptr [rbp-38h],r8
fffff800`0102d424 mov     qword ptr [rbp-30h],r9
fffff800`0102d428 mov     qword ptr [rbp-28h],r10
fffff800`0102d42c mov     qword ptr [rbp-20h],r11
...
...
...

6: kd> dt _KTRAP_FRAME
   +0x000 P1Home           : Uint8B
   +0x008 P2Home           : Uint8B
   +0x010 P3Home           : Uint8B
   +0x018 P4Home           : Uint8B
   +0x020 P5               : Uint8B
   +0x028 PreviousMode     : Char
   +0x029 PreviousIrql     : UChar
   +0x02a FaultIndicator   : UChar
   +0x02b ExceptionActive  : UChar
   +0x02c MxCsr            : Uint4B
   +0x030 Rax              : Uint8B
   +0x038 Rcx              : Uint8B
   +0x040 Rdx              : Uint8B
   +0x048 R8               : Uint8B
   +0x050 R9               : Uint8B
   +0x058 R10              : Uint8B
   +0x060 R11              : Uint8B
   +0x068 GsBase           : Uint8B
   +0x068 GsSwap           : Uint8B
   +0x070 Xmm0             : _M128A
   +0x080 Xmm1             : _M128A
   +0x090 Xmm2             : _M128A
   +0x0a0 Xmm3             : _M128A
   +0x0b0 Xmm4             : _M128A
   +0x0c0 Xmm5             : _M128A
   +0x0d0 FaultAddress     : Uint8B
   +0x0d0 ContextRecord    : Uint8B
   +0x0d0 TimeStamp        : Uint8B
   +0x0d8 Dr0              : Uint8B
   +0x0e0 Dr1              : Uint8B
   +0x0e8 Dr2              : Uint8B
   +0x0f0 Dr3              : Uint8B
   +0x0f8 Dr6              : Uint8B
   +0x100 Dr7              : Uint8B
   +0x108 DebugControl     : Uint8B
   +0x110 LastBranchToRip  : Uint8B
   +0x118 LastBranchFromRip : Uint8B
   +0x120 LastExceptionToRip : Uint8B
   +0x128 LastExceptionFromRip : Uint8B
   +0x108 LastBranchControl : Uint8B
   +0x110 LastBranchMSR    : Uint4B
   +0x130 SegDs            : Uint2B
   +0x132 SegEs            : Uint2B
   +0x134 SegFs            : Uint2B
   +0x136 SegGs            : Uint2B
   +0x138 TrapFrame        : Uint8B
   +0x140 Rbx              : Uint8B
   +0x148 Rdi              : Uint8B
   +0x150 Rsi              : Uint8B
   +0x158 Rbp              : Uint8B
   +0x160 ErrorCode        : Uint8B
   +0x160 ExceptionFrame   : Uint8B
   +0x168 Rip              : Uint8B
   +0x170 SegCs            : Uint2B
   +0x172 Fill1            : [3] Uint2B
   +0x178 EFlags           : Uint4B
   +0x17c Fill2            : Uint4B
   +0x180 Rsp              : Uint8B
   +0x188 SegSs            : Uint2B
   +0x18a Fill3            : [1] Uint2B
   +0x18c CodePatchCycle   : Int4B

Unfortunately, the technique of using the DS and ES pair to find the trap frame in an x86 Windows crash dump doesn’t work here because the KiPageFault interrupt handler doesn’t save them, as can be seen from its disassembly. Fortunately, the registers that an x64 processor pushes upon an interrupt are part of _KTRAP_FRAME: they are the last fields of the structure above, ending with Rip, SegCs, EFlags, Rsp and SegSs. Fill1, Fill2, Fill3 and CodePatchCycle are just dummy values that pad the 64-bit slots, because CS and SS are 16-bit registers and only the low 32 bits (EFLAGS) of the 64-bit RFLAGS are currently used. Remember that a processor in 64-bit mode pushes 64-bit values even if the values occupy only 16 or 32 bits. Therefore we can try to find CS and SS on the stack because they have the following constant values:

6: kd> r cs
cs=0010
6: kd> r ss
ss=0018

6: kd> k
Child-SP          RetAddr           Call Site
fffffadc`6e02b9e8 fffff800`013731b1 nt!KeBugCheckEx



fffffadc`6e02cd70 fffff800`010202d6 nt!PspSystemThreadStartup+0x3e
fffffadc`6e02cdd0 00000000`00000000 nt!KxStartSystemThread+0x16

6: kd> dqs fffffadc`6e02b9e8 fffffadc`6e02cd70
...
...
...
fffffadc`6e02c938 fffff800`0102d5e1 nt!KiPageFault+0x1e1
...
...
...
fffffadc`6e02ca70  fffff97f`f3937a8c
fffffadc`6e02ca78  fffff97f`ff57d28b driver+0x3028b
fffffadc`6e02ca80  00000000`00000000
fffffadc`6e02ca88  fffff97f`f3937030
fffffadc`6e02ca90  fffff97f`ff5c2990 driver+0x75990
fffffadc`6e02ca98  00000000`00000000
fffffadc`6e02caa0  00000000`00000000 ; ErrorCode
fffffadc`6e02caa8  fffff97f`ff591ed3 driver+0x44ed3 ; RIP
fffffadc`6e02cab0  00000000`00000010 ; CS
fffffadc`6e02cab8  00000000`00010282 ; RFLAGS
fffffadc`6e02cac0  fffffadc`6e02cad0 ; RSP
fffffadc`6e02cac8  00000000`00000018 ; SS
fffffadc`6e02cad0  fffff97f`f382b0e0
fffffadc`6e02cad8  fffffadc`6e02cbd0
fffffadc`6e02cae0  fffff97f`f3937a8c
fffffadc`6e02cae8  fffff97f`f3937030
fffffadc`6e02caf0  00000000`00000000
fffffadc`6e02caf8  00000000`00000001


Now we can calculate the trap frame address by subtracting the SegSs offset in the _KTRAP_FRAME structure (0x188) from the fffffadc`6e02cac8 address:

6: kd> ? fffffadc`6e02cac8-188
Evaluate expression: -5650331285184 = fffffadc`6e02c940

6: kd> .trap fffffadc`6e02c940
NOTE: The trap frame does not contain all registers.
Some register values may be zeroed.
rax=fffffadcdac27298 rbx=0000000000000000 rcx=fffffadcdb45a4c0
rdx=0000000000000555 rsi=fffff97fff5c2990 rdi=fffff97ff3937030
rip=fffff97fff591ed3 rsp=fffffadc6e02cad0 rbp=0000000000000000
 r8=fffffadcdac27250  r9=fffff97ff3824030 r10=0000000000000020
r11=fffffadcdac27250 r12=0000000000000000 r13=0000000000000000
r14=0000000000000000 r15=0000000000000000
iopl=0 nv up ei ng nz na pe nc
driver+0x44ed3:
fffff97f`ff591ed3 0fb74514  movzx eax,word ptr [rbp+14h] ss:0018:00000000`00000014=????

6: kd> k
Child-SP          RetAddr           Call Site
fffffadc`6e02cad0 fffff97f`ff5935f7 driver+0x44ed3
fffffadc`6e02cc40 fffff800`0124b972 driver+0x465f7
fffffadc`6e02cd70 fffff800`010202d6 nt!PspSystemThreadStartup+0x3e
fffffadc`6e02cdd0 00000000`00000000 nt!KxStartSystemThread+0x16
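
The manual search above can be sketched in a few lines of Python. This is only an illustration of the idea, not a WinDbg script: the function name find_trap_frames and the way the stack is passed in (a base address plus a list of 64-bit values) are my own assumptions, while the offsets and the CS/SS values come from the dt and r output shown earlier.

```python
# Offsets taken from the dt _KTRAP_FRAME output above.
SEGCS_OFFSET = 0x170
SEGSS_OFFSET = 0x188
KERNEL_CS = 0x10  # value printed by "r cs" in this dump
KERNEL_SS = 0x18  # value printed by "r ss" in this dump


def find_trap_frames(base, qwords):
    """Return candidate _KTRAP_FRAME addresses (hypothetical helper).

    base   -- address of the first stack slot in qwords
    qwords -- consecutive 64-bit stack values, lowest address first
    """
    gap = (SEGSS_OFFSET - SEGCS_OFFSET) // 8  # SS sits 3 slots above CS
    candidates = []
    for i in range(len(qwords) - gap):
        if qwords[i] == KERNEL_CS and qwords[i + gap] == KERNEL_SS:
            ss_address = base + 8 * (i + gap)
            candidates.append(ss_address - SEGSS_OFFSET)
    return candidates


# Stack slots copied from the dqs output above, starting at the ErrorCode slot.
stack = [
    0x0000000000000000,  # ErrorCode
    0xfffff97fff591ed3,  # RIP
    0x0000000000000010,  # CS
    0x0000000000010282,  # RFLAGS
    0xfffffadc6e02cad0,  # RSP
    0x0000000000000018,  # SS
    0xfffff97ff382b0e0,
    0xfffffadc6e02cbd0,
]

for frame in find_trap_frames(0xfffffadc6e02caa0, stack):
    print(f"candidate trap frame: {frame:#x}")
```

Every candidate still has to be checked with .trap, because the values 10 and 18 can appear on the stack by coincidence.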

Our example shows how to find a trap frame manually in an x64 kernel or complete memory dump. Usually WinDbg finds trap frames automatically (call arguments are removed from the verbose stack trace for clarity):

6: kd> kv
Child-SP          RetAddr           Call Site
fffffadc`6e02b9e8 fffff800`013731b1 nt!KeBugCheckEx
fffffadc`6e02b9f0 fffff800`010556ab nt!PspSystemThreadStartup+0x270
fffffadc`6e02ba40 fffff800`010549fd nt!_C_specific_handler+0x9b
fffffadc`6e02bad0 fffff800`01054f93 nt!RtlpExecuteHandlerForException+0xd
fffffadc`6e02bb00 fffff800`0100b901 nt!RtlDispatchException+0x2c0
fffffadc`6e02c1c0 fffff800`0102e76f nt!KiDispatchException+0xd9
fffffadc`6e02c7c0 fffff800`0102d5e1 nt!KiExceptionExit
fffffadc`6e02c940 fffff97f`ff591ed3 nt!KiPageFault+0x1e1 (TrapFrame @ fffffadc`6e02c940)
fffffadc`6e02cad0 fffff97f`ff5935f7 driver+0x44ed3
fffffadc`6e02cc40 fffff800`0124b972 driver+0x465f7
fffffadc`6e02cd70 fffff800`010202d6 nt!PspSystemThreadStartup+0x3e
fffffadc`6e02cdd0 00000000`00000000 nt!KxStartSystemThread+0x16

- Dmitry Vostokov @ DumpAnalysis.org -

Crash Dump Analysis Patterns (Part 18)

July 20th, 2007

Sometimes the page file size is less than the amount of physical memory. If this is the case and we have configured “Complete memory dump” in Startup and Recovery settings in Control Panel, we get truncated dumps. Therefore we can call our next pattern “Truncated Dump”. WinDbg prints a warning when we open such a dump:

************************************************************
WARNING: Dump file has been truncated.  Data may be missing.
************************************************************

We can double check this with !vm command:

kd> !vm

*** Virtual Memory Usage ***
       Physical Memory:      511859 (   2047436 Kb)
       Paging File Name paged out
         Current:   1536000 Kb  Free Space:   1522732 Kb
         Minimum:   1536000 Kb  Maximum:      1536000 Kb

We see that the page file size is 1.5 GB but the amount of physical memory is 2 GB. When a BSOD happens, the physical memory contents are saved to the page file, so the dump file can be no larger than 1.5 GB, which truncates data needed for crash dump analysis.
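
The same comparison can be made mechanically. The following is a minimal sketch that assumes !vm output in the layout shown above; the function name dump_may_be_truncated is hypothetical, and it only applies the rule from this post (page file smaller than physical memory), nothing more precise.

```python
import re


def dump_may_be_truncated(vm_output):
    """Compare page file size with physical memory in !vm text (hypothetical helper)."""
    physical = re.search(r"Physical Memory:\s+\d+\s+\(\s*(\d+)\s+Kb\)", vm_output)
    pagefile = re.search(r"Current:\s+(\d+)\s+Kb", vm_output)
    if physical is None or pagefile is None:
        raise ValueError("unexpected !vm output layout")
    return int(pagefile.group(1)) < int(physical.group(1))


# Lines copied from the !vm output above.
sample = """
       Physical Memory:      511859 (   2047436 Kb)
       Paging File Name paged out
         Current:   1536000 Kb  Free Space:   1522732 Kb
         Minimum:   1536000 Kb  Maximum:      1536000 Kb
"""

print(dump_may_be_truncated(sample))
```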

Sometimes you can still access some data in truncated dumps but pay attention to what WinDbg says. For example, in the truncated dump shown above the stack and driver code are not available:

kd> kv
ChildEBP RetAddr  Args to Child
WARNING: Stack unwind information not available. Following frames may be wrong.
f408b004 00000000 00000000 00000000 00000000 driver+0x19237

kd> r
Last set context:
eax=89d55230 ebx=89d21130 ecx=89d21130 edx=89c8cc20 esi=89e24ac0 edi=89c8cc20
eip=f7242237 esp=f408afec ebp=f408b004 iopl=0 nv up ei ng nz ac po nc
cs=0008 ss=0010 ds=0023 es=0023 fs=0030 gs=0000 efl=00010292
driver+0x19237:
f7242237 ??              ???

kd> dds esp
f408afec  ????????
f408aff0  ????????
f408aff4  ????????
f408aff8  ????????
f408affc  ????????
f408b000  ????????
f408b004  ????????
f408b008  ????????
f408b00c  ????????
f408b010  ????????
f408b014  ????????
f408b018  ????????
f408b01c  ????????
f408b020  ????????
f408b024  ????????
f408b028  ????????
f408b02c  ????????
f408b030  ????????
f408b034  ????????
f408b038  ????????
f408b03c  ????????
f408b040  ????????
f408b044  ????????
f408b048  ????????
f408b04c  ????????
f408b050  ????????
f408b054  ????????
f408b058  ????????
f408b05c  ????????
f408b060  ????????
f408b064  ????????
f408b068  ????????

kd> lmv m driver
start    end        module name
f7229000 f725f000   driver     T (no symbols)
    Loaded symbol image file: driver.sys
    Image path: driver.sys
    Image name: driver.sys
    Timestamp:        unavailable (FFFFFFFE)
    CheckSum:         missing
    ImageSize:        00036000

kd> dd f7229000
f7229000  ???????? ???????? ???????? ????????
f7229010  ???????? ???????? ???????? ????????
f7229020  ???????? ???????? ???????? ????????
f7229030  ???????? ???????? ???????? ????????
f7229040  ???????? ???????? ???????? ????????
f7229050  ???????? ???????? ???????? ????????
f7229060  ???????? ???????? ???????? ????????
f7229070  ???????? ???????? ???????? ????????

If for some reason you cannot increase the size of your page file, then just configure “Kernel memory dump” in Startup and Recovery. A kernel memory dump is sufficient for almost all bugchecks, except for manual crash dumps where you need to inspect user process space.

- Dmitry Vostokov @ DumpAnalysis.org -

Crash Dump Analysis Patterns (Part 17)

July 20th, 2007

.NET programs also crash, either from defects in the .NET runtime (Common Language Runtime, CLR) or from unhandled runtime exceptions in managed code executed by the .NET virtual machine. The latter exceptions are re-thrown from the .NET runtime to be handled by the operating system and intercepted by native debuggers. Therefore our next crash dump analysis pattern is called Managed Code Exception.

When you get a dump from a .NET application, it is a dump of a native process. The !analyze -v output can usually tell you that the exception is actually a CLR exception and give you other hints to look at the managed code stack (CLR stack):

FAULTING_IP:
kernel32!RaiseException+53
77e4bee7 5e              pop     esi

EXCEPTION_RECORD:  ffffffff -- (.exr 0xffffffffffffffff)
ExceptionAddress: 77e4bee7 (kernel32!RaiseException+0x00000053)
   ExceptionCode: e0434f4d (CLR exception)
   ExceptionFlags: 00000001
NumberParameters: 1
   Parameter[0]: 80131604

DEFAULT_BUCKET_ID:  CLR_EXCEPTION

PROCESS_NAME:  mmc.exe

ERROR_CODE: (NTSTATUS) 0xe0434f4d - <Unable to get error code text>

MANAGED_STACK: !dumpstack -EE
No export dumpstack found

STACK_TEXT:
05faf3d8 79f97065 e0434f4d 00000001 00000001 kernel32!RaiseException+0x53
WARNING: Stack unwind information not available. Following frames may be wrong.
05faf438 7a0945a4 023f31e0 00000000 00000000 mscorwks!DllCanUnloadNowInternal+0x37a9
05faf4fc 00f2f00a 02066be4 02085ee8 023d0df0 mscorwks!CorLaunchApplication+0x12005
05faf500 02066be4 02085ee8 023d0df0 023d0e2c 0xf2f00a
05faf504 02085ee8 023d0df0 023d0e2c 05e00dfa 0x2066be4
05faf508 023d0df0 023d0e2c 05e00dfa 023d0e10 0x2085ee8
05faf50c 023d0e2c 05e00dfa 023d0e10 05351d30 0x23d0df0
05faf510 05e00dfa 023d0e10 05351d30 023d0e10 0x23d0e2c

FOLLOWUP_IP:
mscorwks!DllCanUnloadNowInternal+37a9
79f97065 c745fcfeffffff  mov     dword ptr [ebp-4],0FFFFFFFEh

SYMBOL_NAME:  mscorwks!DllCanUnloadNowInternal+37a9

MODULE_NAME: mscorwks

IMAGE_NAME:  mscorwks.dll

PRIMARY_PROBLEM_CLASS:  CLR_EXCEPTION

BUGCHECK_STR:  APPLICATION_FAULT_CLR_EXCEPTION
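
The exception code e0434f4d in the output above is easier to recognize once you notice that its low three bytes are ASCII characters. Here is a small sketch that unpacks them; the function name exception_tag is hypothetical, and treating the E0 top byte as a marker for such software exceptions is my reading of this value rather than something the !analyze output states.

```python
def exception_tag(code):
    """Return the three ASCII characters in the low 24 bits, or None (hypothetical helper)."""
    if code >> 24 != 0xE0:
        return None
    tail = (code & 0xFFFFFF).to_bytes(3, "big")
    try:
        return tail.decode("ascii")
    except UnicodeDecodeError:
        return None


# ExceptionCode copied from the EXCEPTION_RECORD above.
print(exception_tag(0xE0434F4D))
```

For the code above the three bytes 43 4f 4d spell COM.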

Sometimes you can see mscorwks.dll on raw stack or see it loaded and find it on other thread stacks than the current one.

When you get such hints you might want to get managed code stack as well. First you need to load the appropriate WinDbg SOS extension (Son of Strike) corresponding to .NET runtime version. This can be done by the following command:

0:015> .loadby sos mscorwks

You can check which SOS extension version was loaded by using the .chain command:

0:015> .chain
Extension DLL search Path:
...
...
...
Extension DLL chain:
    C:\WINDOWS\Microsoft.NET\Framework\v2.0.50727\sos: image 2.0.50727.42, API 1.0.0, built Fri Sep 23 08:27:26 2005
        [path: C:\WINDOWS\Microsoft.NET\Framework\v2.0.50727\sos.dll]

    dbghelp: image 6.6.0007.5, API 6.0.6, built Sat Jul 08 21:11:32 2006
        [path: C:\Program Files\Debugging Tools for Windows\dbghelp.dll]
    ext: image 6.6.0007.5, API 1.0.0, built Sat Jul 08 21:10:52 2006
        [path: C:\Program Files\Debugging Tools for Windows\winext\ext.dll]
    exts: image 6.6.0007.5, API 1.0.0, built Sat Jul 08 21:10:48 2006
        [path: C:\Program Files\Debugging Tools for Windows\WINXP\exts.dll]
    uext: image 6.6.0007.5, API 1.0.0, built Sat Jul 08 21:11:02 2006
        [path: C:\Program Files\Debugging Tools for Windows\winext\uext.dll]
    ntsdexts: image 6.0.5457.0, API 1.0.0, built Sat Jul 08 21:29:38 2006
        [path: C:\Program Files\Debugging Tools for Windows\WINXP\ntsdexts.dll]

Then you can use !dumpstack to dump the current stack or !EEStack command to dump all thread stacks. The native stack trace would be mixed with managed stack trace:

0:015> !dumpstack
OS Thread Id: 0x16e8 (15)
Current frame: kernel32!RaiseException+0x53
ChildEBP RetAddr Caller,Callee
05faf390 77e4bee7 kernel32!RaiseException+0x53, calling ntdll!RtlRaiseException
05faf3a8 79e814da mscorwks!Binder::RawGetClass+0x23, calling mscorwks!Module::LookupTypeDef
05faf3bc 79e87ff4 mscorwks!Binder::IsClass+0x21, calling mscorwks!Binder::RawGetClass
05faf3c8 79f958b8 mscorwks!Binder::IsException+0x13, calling mscorwks!Binder::IsClass
05faf3d8 79f97065 mscorwks!RaiseTheExceptionInternalOnly+0x226, calling kernel32!RaiseException
05faf438 7a0945a4 mscorwks!JIT_Throw+0xd0, calling mscorwks!RaiseTheExceptionInternalOnly
05faf4ac 7a0944ea mscorwks!JIT_Throw+0x1e, calling mscorwks!LazyMachStateCaptureState
05faf4c8 793d424e (MethodDesc 0x7924ad68 +0x2e System.Threading.WaitHandle.WaitOne(Int64, Boolean)), calling mscorwks!WaitHandleNative::CorWaitOneNative
05faf4fc 00f2f00a (MethodDesc 0x4f97500 +0x9a Ironring.Management.MMC.SnapinBase+MmcWindow.Invoke(System.Delegate, System.Object[])), calling mscorwks!JIT_Throw
05faf510 05e00dfa (MethodDesc 0x4f98fd8 +0xca MyNamespace.MyClass.MyMethod(Boolean)), calling 05fc7124
05faf55c 00f62fbc (MethodDesc 0x4f95f90 +0x16f4 MyNamespace.MyClass.MyMethod.Initialise(System.Object))

05faf740 793d912f (MethodDesc 0x7925fc70 +0x2f System.Threading._ThreadPoolWaitCallback.WaitCallback_Context(System.Object))
05faf748 793683dd (MethodDesc 0x7913f3d0 +0x81 System.Threading.ExecutionContext.Run(System.Threading.ExecutionContext, System.Threading.ContextCallback, System.Object))
05faf75c 793d9218 (MethodDesc 0x7925fc80 +0x6c System.Threading._ThreadPoolWaitCallback.PerformWaitCallback(System.Object)), calling (MethodDesc 0x7913f3d0 +0 System.Threading.ExecutionContext.Run(System.Threading.ExecutionContext, System.Threading.ContextCallback, System.Object))
05faf774 79e88f63 mscorwks!CallDescrWorker+0x33
05faf784 79e88ee4 mscorwks!CallDescrWorkerWithHandler+0xa3, calling mscorwks!CallDescrWorker
05faf804 79f20212 mscorwks!DispatchCallBody+0x1e, calling mscorwks!CallDescrWorkerWithHandler
05faf824 79f201bc mscorwks!DispatchCallDebuggerWrapper+0x3d, calling mscorwks!DispatchCallBody
05faf888 79f2024b mscorwks!DispatchCallNoEH+0x51, calling mscorwks!DispatchCallDebuggerWrapper
05faf8bc 7a07bdf0 mscorwks!Holder,2>::~Holder,2>+0xbb, calling mscorwks!DispatchCallNoEH
05faf90c 77e61d1e kernel32!WaitForSingleObjectEx+0xac, calling ntdll!ZwWaitForSingleObject
05faf91c 79ecb4a4 mscorwks!Thread::UserResumeThread+0xfb
05faf92c 79ecb442 mscorwks!Thread::DoADCallBack+0x355, calling mscorwks!Thread::UserResumeThread+0xae
05faf950 79e74afe mscorwks!Thread::EnterRuntimeNoThrow+0x9b, calling mscorwks!_EH_epilog3
05faf988 79e77fe8 mscorwks!PEImage::LoadImage+0x1e1, calling mscorwks!_SEH_epilog4
05faf9c0 79ecb364 mscorwks!Thread::DoADCallBack+0x541, calling mscorwks!Thread::DoADCallBack+0x2a5
05faf9fc 7a0e1b7e mscorwks!Thread::DoADCallBack+0x575, calling mscorwks!Thread::DoADCallBack+0x4d4
05fafa24 7a0e1bab mscorwks!ManagedThreadBase::ThreadPool+0x13, calling mscorwks!Thread::DoADCallBack+0x550
05fafa38 7a07cae8 mscorwks!QueueUserWorkItemCallback+0x9d, calling mscorwks!ManagedThreadBase::ThreadPool
05fafa54 7a07ca48 mscorwks!QueueUserWorkItemCallback, calling mscorwks!UnwindAndContinueRethrowHelperAfterCatch
05fafa90 7a110f08 mscorwks!ThreadpoolMgr::ExecuteWorkRequest+0x40
05fafaa8 7a112328 mscorwks!ThreadpoolMgr::WorkerThreadStart+0x1f2, calling mscorwks!ThreadpoolMgr::ExecuteWorkRequest
05fafad0 79e7839d mscorwks!EEHeapFreeInProcessHeap+0x21, calling mscorwks!EEHeapFree
05fafae0 79e782dc mscorwks!operator delete[]+0x30, calling mscorwks!EEHeapFreeInProcessHeap
05fafb14 79ecb00b mscorwks!Thread::intermediateThreadProc+0x49
05fafb48 77e65512 kernel32!FlsSetValue+0xc7, calling kernel32!_SEH_epilog
05fafb6c 75da14d0 sxs!_calloc_crt+0x19, calling sxs!calloc
05fafb80 77e65512 kernel32!FlsSetValue+0xc7, calling kernel32!_SEH_epilog
05fafb88 75da1401 sxs!_CRT_INIT+0x17e, calling sxs!_initptd
05fafb8c 75da1408 sxs!_CRT_INIT+0x185, calling kernel32!GetCurrentThreadId
05fafb9c 30403805 MMCFormsShim!DllMain+0x15, calling MMCFormsShim!PrxDllMain
05fafbb0 30418b69 MMCFormsShim!__DllMainCRTStartup+0x7a, calling MMCFormsShim!DllMain
05fafbdc 75de0e4c sxs!_SxsDllMain+0x87, calling sxs!DllStartup_CrtInit
05fafbf0 30418bf9 MMCFormsShim!__DllMainCRTStartup+0x10a, calling MMCFormsShim!__SEH_epilog4
05fafbf4 30418c22 MMCFormsShim!_DllMainCRTStartup+0x1d, calling MMCFormsShim!__DllMainCRTStartup
05fafbfc 7c81a352 ntdll!LdrpCallInitRoutine+0x14
05fafc24 7c82ee8b ntdll!LdrpInitializeThread+0x1a5, calling ntdll!RtlLeaveCriticalSection
05fafc2c 7c82edec ntdll!LdrpInitializeThread+0x18f, calling ntdll!_SEH_epilog
05fafc7c 7c82ed71 ntdll!LdrpInitializeThread+0xd8, calling ntdll!RtlActivateActivationContextUnsafeFast
05fafc80 7c82ed35 ntdll!LdrpInitializeThread+0x12c, calling ntdll!RtlDeactivateActivationContextUnsafeFast
05fafcb4 7c82edec ntdll!LdrpInitializeThread+0x18f, calling ntdll!_SEH_epilog
05fafcb8 7c827c3b ntdll!NtTestAlert+0xc
05fafcbc 7c82ecb1 ntdll!_LdrpInitialize+0x1de, calling ntdll!_SEH_epilog
05fafd10 7c82ecb1 ntdll!_LdrpInitialize+0x1de, calling ntdll!_SEH_epilog
05fafd14 7c826d9b ntdll!NtContinue+0xc
05fafd18 7c8284da ntdll!KiUserApcDispatcher+0x3a, calling ntdll!NtContinue
05faffa4 79ecaff9 mscorwks!Thread::intermediateThreadProc+0x37, calling mscorwks!_alloca_probe_16
05faffb8 77e64829 kernel32!BaseThreadStart+0x34

.NET language symbolic names are usually reconstructed from .NET assembly metadata. 
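
When the mixed trace is long, it can help to pull out only the managed frames. The sketch below does this for saved !dumpstack text. It assumes managed frames are exactly the lines containing a "(MethodDesc ...)" entry in the layout shown above; the function name managed_frames and the regular expression are my own and have not been checked against other SOS versions.

```python
import re

# Matches "(MethodDesc 0x7924ad68 +0x2e Namespace.Type.Method(Args))".
MANAGED_FRAME = re.compile(
    r"\(MethodDesc 0x[0-9a-fA-F]+ \+(?:0x)?([0-9a-fA-F]+) (.*?\))\)"
)


def managed_frames(dumpstack_lines):
    """Return (method, offset) pairs for lines that start with a managed frame."""
    frames = []
    for line in dumpstack_lines:
        match = MANAGED_FRAME.search(line)  # first MethodDesc on the line only
        if match:
            frames.append((match.group(2), int(match.group(1), 16)))
    return frames


# Two lines copied from the !dumpstack output above.
lines = [
    "05faf438 7a0945a4 mscorwks!JIT_Throw+0xd0, calling mscorwks!RaiseTheExceptionInternalOnly",
    "05faf4c8 793d424e (MethodDesc 0x7924ad68 +0x2e System.Threading.WaitHandle.WaitOne(Int64, Boolean)), calling mscorwks!WaitHandleNative::CorWaitOneNative",
]

for method, offset in managed_frames(lines):
    print(f"{method} +{offset:#x}")
```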

You can examine a CLR exception and get managed stack trace by using !PrintException and !CLRStack commands, for example:

0:014> !PrintException
Exception object: 02320314
Exception type: System.Reflection.TargetInvocationException
Message: Exception has been thrown by the target of an invocation.
InnerException: System.Runtime.InteropServices.COMException, use !PrintException 023201a8 to see more
StackTrace (generated):
    SP       IP       Function
    075AF4FC 016BFD9A Ironring.Management.MMC.SnapinBase+MmcWindow.Invoke(System.Delegate, System.Object[])
    ...
    ...
    ...
    075AF740 793D87AF System.Threading._ThreadPoolWaitCallback.WaitCallback_Context(System.Object)
    075AF748 793608FD System.Threading.ExecutionContext.Run(System.Threading.ExecutionContext, System.Threading.ContextCallback, System.Object)
    075AF760 793D8898 System.Threading._ThreadPoolWaitCallback.PerformWaitCallback(System.Object)

StackTraceString: <none>
HResult: 80131604

0:014> !PrintException 023201a8
Exception object: 023201a8
Exception type: System.Runtime.InteropServices.COMException
Message: Error HRESULT E_FAIL has been returned from a call to a COM component.
InnerException: <none>
StackTrace (generated):
    SP       IP       Function
    00000000 00000001 Ironring.Management.MMC.IMMCFormsShim.HostUserControl3(System.Object, System.Object, System.String, System.String, Int32, Int32)
    0007F724 073875B9 Ironring.Management.MMC.FormNode.SetShimControl(System.Object)
    0007F738 053D9DDE Ironring.Management.MMC.FormNode.set_ControlType(System.Type)
    ...
    ...
    ...

StackTraceString: <none>
HResult: 80004005

0:014> !CLRStack
OS Thread Id: 0x11ec (14)
ESP       EIP
075af4fc 016bfd9a Ironring.Management.MMC.SnapinBase+MmcWindow.Invoke(System.Delegate, System.Object[])
...
...
...
075af740 793d87af System.Threading._ThreadPoolWaitCallback.WaitCallback_Context(System.Object)
075af748 793608fd System.Threading.ExecutionContext.Run(System.Threading.ExecutionContext, System.Threading.ContextCallback, System.Object)
075af760 793d8898 System.Threading._ThreadPoolWaitCallback.PerformWaitCallback(System.Object)
075af8f0 79e7be1b [GCFrame: 075af8f0]
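
The HResult values printed by !PrintException above (80131604 and 80004005) follow the standard HRESULT bit layout: a severity bit, a facility number and a 16-bit code. The following sketch splits a value into these parts; the function name decode_hresult and the dictionary keys are hypothetical.

```python
def decode_hresult(value):
    """Split a 32-bit HRESULT into severity, facility and code (hypothetical helper)."""
    return {
        "failure": bool((value >> 31) & 1),  # bit 31: severity
        "facility": (value >> 16) & 0x7FF,   # bits 16-26: facility
        "code": value & 0xFFFF,              # bits 0-15: code
    }


# HResult values copied from the !PrintException output above.
for hresult in (0x80131604, 0x80004005):
    parts = decode_hresult(hresult)
    print(f"{hresult:#010x}: failure={parts['failure']} "
          f"facility={parts['facility']:#x} code={parts['code']:#x}")
```

Facility 0x13 in the first value is the one used by the .NET runtime, while the second value (E_FAIL, as the message says) has facility 0.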

!help command gives the list of other available SOS extension commands:

0:014> !help

Object Inspection

DumpObj (do)
DumpArray (da)
DumpStackObjects (dso)
DumpHeap
DumpVC
GCRoot
ObjSize
FinalizeQueue
PrintException (pe)
TraverseHeap

Examining code and stacks

Threads
CLRStack
IP2MD
U
DumpStack
EEStack
GCInfo
EHInfo
COMState
BPMD

Examining CLR data structures

DumpDomain
EEHeap
Name2EE
SyncBlk
DumpMT
DumpClass
DumpMD
Token2EE
EEVersion
DumpModule
ThreadPool
DumpAssembly
DumpMethodSig
DumpRuntimeTypes
DumpSig
RCWCleanupList
DumpIL

Diagnostic Utilities

VerifyHeap
DumpLog
FindAppDomain
SaveModule
GCHandles
GCHandleLeaks
VMMap
VMStat
ProcInfo
StopOnException (soe)
MinidumpMode

Other

FAQ

If you are new to .NET and interested in .NET debugging I would recommend the following books:

Essential .NET, Volume I: The Common Language Runtime

Debugging Microsoft .NET 2.0 Applications

Advanced .NET Debugging

Expert .NET 2.0 IL Assembler

- Dmitry Vostokov @ DumpAnalysis.org -

COM+ crash dumps

July 16th, 2007

If you have problems with COM+ components you can configure Component Services in Control Panel to save a dump.

Refer to the following article for details:

http://msdn.microsoft.com/msdnmag/issues/01/08/ComXP/

If you want to use userdump.exe to save a crash dump when a failing COM+ application displays an error dialog box refer to the following article:

http://support.microsoft.com/kb/287643

If you want dumps to be automatically collected after some timeout value refer to the following article for details:

http://support.microsoft.com/kb/910904/ 

If you have an exception the following article describes how to get a stack trace from a saved process dump:

http://support.microsoft.com/kb/317317

This article explains how COM+ handles application faults:

Fault Isolation and Failfast Policy

Now I’ll show how to get the error message that was written to the event log when a COM+ application was terminated due to an error code other than an access violation. If you get a dump from a COM+ process, look at all threads and find the one that runs through comsvcs.dll:

0:000> ~*k
...
...
...
6 Id: 8d4.1254 Suspend: 0 Teb: 7ffd9000 Unfrozen
ChildEBP RetAddr
0072ee30 7c822124 ntdll!KiFastSystemCallRet
0072ee34 77e6baa8 ntdll!NtWaitForSingleObject+0xc
0072eea4 77e6ba12 kernel32!WaitForSingleObjectEx+0xac
0072eeb8 75c2b250 kernel32!WaitForSingleObject+0x12
0072f340 75c2bb91 comsvcs!FF_RunCmd+0xa2
0072f60c 75c2bc76 comsvcs!FF_DumpProcess_MD+0x21a
0072f850 75c2be83 comsvcs!FF_DumpProcess+0x39
0072fdc0 75c2c351 comsvcs!FailFastStr+0x2ce
0072fe20 75bf31fa comsvcs!CError::WriteToLog+0x198
0072fe8c 75bf3d48 comsvcs!CSurrogateServices::FireApplicationLaunch+0x13b
0072fee0 75bf3e19 comsvcs!CApplication::AsyncApplicationLaunch+0x101
0072feec 7c81a3c5 comsvcs!CApplication::AppLaunchThreadProc+0x18
0072ff44 7c8200fc ntdll!RtlpWorkerCallout+0x71
0072ff64 7c81a3fa ntdll!RtlpExecuteWorkerRequest+0x4f
0072ff78 7c82017f ntdll!RtlpApcCallout+0x11
0072ffb8 77e66063 ntdll!RtlpWorkerThread+0x61
0072ffec 00000000 kernel32!BaseThreadStart+0x34
...
...
...

0:000> ~*kL
...
...
...
   6  Id: 8d4.1254 Suspend: 0 Teb: 7ffd9000 Unfrozen
ChildEBP RetAddr  Args to Child
0072ee30 7c822124 77e6baa8 00000394 00000000
ntdll!KiFastSystemCallRet
0072ee34 77e6baa8 00000394 00000000 00000000
ntdll!NtWaitForSingleObject+0xc
0072eea4 77e6ba12 00000394 ffffffff 00000000
kernel32!WaitForSingleObjectEx+0xac
0072eeb8 75c2b250 00000394 ffffffff 0072f640
kernel32!WaitForSingleObject+0x12
0072f340 75c2bb91 75b8e7fc 75b8e810 000008d4
comsvcs!FF_RunCmd+0xa2
0072f60c 75c2bc76 0072f640 75c6c5c0 0072fe44
comsvcs!FF_DumpProcess_MD+0x21a
0072f850 75c2be83 00000000 77ce21ce 0bd5f0f0
comsvcs!FF_DumpProcess+0x39
0072fdc0 75c2c351 75c6c5c0 75b8b008 00000142
comsvcs!FailFastStr+0x2ce
0072fe20 75bf31fa 0072fe44 75b8b008 00000142
comsvcs!CError::WriteToLog+0x198
0072fe8c 75bf3d48 0bcf5d0c 00000000 0bcf5cf8
comsvcs!CSurrogateServices::FireApplicationLaunch+0x13b
0072fee0 75bf3e19 75bf3e01 0072ff44 7c81a3c5
comsvcs!CApplication::AsyncApplicationLaunch+0x101
0072feec 7c81a3c5 0bcf5cf8 7c889880 0bcf5d50
comsvcs!CApplication::AppLaunchThreadProc+0x18
0072ff44 7c8200fc 75bf3e01 0bcf5cf8 00000000
ntdll!RtlpWorkerCallout+0x71
0072ff64 7c81a3fa 00000000 0bcf5cf8 0bcf5d50
ntdll!RtlpExecuteWorkerRequest+0x4f
0072ff78 7c82017f 7c8200bb 00000000 0bcf5cf8
ntdll!RtlpApcCallout+0x11
0072ffb8 77e66063 00000000 00000000 00000000
ntdll!RtlpWorkerThread+0x61
0072ffec 00000000 7c83ad38 00000000 00000000
kernel32!BaseThreadStart+0x34


The FF_DumpProcess function is an indication that the process was being dumped. There is no ComSvcsExceptionFilter function on the thread stack, but we can still get the error message if we look at the FailFastStr function arguments:

0:000> du 75c6c5c0 75c6c5c0+400
75c6c5c0  "{646F1874-46B6-4149-BD55-8C317FB"
75c6c600  "71CC0}....Server Application ID:"
75c6c640  " {646F1874-46B6-4149-BD55-8C317F"
75c6c680  "B71CC0}..Server Application Inst"
75c6c6c0  "ance ID:..{7A39BC48-78DA-4FBB-A7"
75c6c700  "46-EEA7E42CDAC7}..Server Applica"
75c6c740  "tion Name: My Server"
75c6c780  "..The serious nature of this err"
75c6c7c0  "or has caused the process to ter"
75c6c800  "minate...Error Code = 0x80131600"
75c6c840  " : ..COM+ Services Internals Inf"
75c6c880  "ormation:..File: d:\nt\com\compl"
75c6c8c0  "us\src\comsvcs\srgtapi\csrgtserv"
75c6c900  ".cpp, Line: 322..Comsvcs.dll fil"
75c6c940  "e version: ENU 2001.12.4720.2517"
75c6c980  " shp"

Also, if we examine the parameters of the FF_RunCmd call, we can see which application was used to dump the process:

ChildEBP RetAddr Args to Child
0072f340 75c2bb91 75b8e7fc 75b8e810 000008d4
comsvcs!FF_RunCmd+0xa2

0:000> du 75b8e7fc
75b8e7fc  "%s %d %s"
0:000> du 75b8e810
75b8e810  "RunDll32 comsvcs.dll,MiniDump"

We can guess that the first parameter is a format string, the second one is the command line for a process dumper, the third one is PID and the fourth one should be the name of a dump file to save. We can double check this from the raw stack:

ChildEBP RetAddr Args to Child
0072f340 75c2bb91 75b8e7fc 75b8e810 000008d4
comsvcs!FF_RunCmd+0xa2

0:000> dd 0072f340
0072f340  0072f60c 75c2bb91 75b8e7fc 75b8e810
     ; saved EBP, return EIP, 1st param, 2nd param
0072f350  000008d4 0072f640 0072f84a 00000000
     ; 3rd param, 4th param

0:000> du 0072f640
0072f640  "C:\WINDOWS\system32\com\dmp\{646"
0072f680  "F1874-46B6-4149-BD55-8C317FB71CC"
0072f6c0  "0}_2007_07_16_12_05_08.dmp"

We can actually find the already formatted command that was passed to CreateProcess call on the raw stack:

0:006> du 0072ef2c
0072ef2c  "RunDll32 comsvcs.dll,MiniDump 22"
0072ef6c  "60 C:\WINDOWS\system32\com\dmp\{"
0072efac  "646F1874-46B6-4149-BD55-8C317FB7"
0072efec  "1CC0}_2007_07_16_12_05_08.dmp"
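
As a cross-check of the guess about the parameters, we can apply the format string to the other three values ourselves. This is only a sketch using Python's % operator, which treats %s and %d the same way as the C format string here; the variable names are mine.

```python
# Values copied from the du and dd output above.
format_string = "%s %d %s"
dumper = "RunDll32 comsvcs.dll,MiniDump"
pid = 0x8d4
dump_path = ("C:\\WINDOWS\\system32\\com\\dmp\\"
             "{646F1874-46B6-4149-BD55-8C317FB71CC0}_2007_07_16_12_05_08.dmp")

command_line = format_string % (dumper, pid, dump_path)
print(command_line)
```

The PID 8d4 is 2260 in decimal, which is the number that appears after MiniDump in the command found on the raw stack.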

- Dmitry Vostokov @ DumpAnalysis.org -

Crash Dump Analysis Patterns (Part 13b)

July 15th, 2007

Sometimes handle leaks also result in insufficient memory, especially if handles point to structures allocated by the OS. Here is a typical example of a handle leak that resulted in several frozen servers. The complete memory dump shows exhausted non-paged pool:

0: kd> !vm

*** Virtual Memory Usage ***
 Physical Memory:     1048352 (   4193408 Kb)
 Page File: \??\C:\pagefile.sys
   Current:   4190208 Kb  Free Space:   3749732 Kb
   Minimum:   4190208 Kb  Maximum:      4190208 Kb
 Available Pages:      697734 (   2790936 Kb)
 ResAvail Pages:       958085 (   3832340 Kb)
 Locked IO Pages:          95 (       380 Kb)
 Free System PTEs:     199971 (    799884 Kb)
 Free NP PTEs:            105 (       420 Kb)
 Free Special NP:           0 (         0 Kb)
 Modified Pages:          195 (       780 Kb)
 Modified PF Pages:       195 (       780 Kb)
 NonPagedPool Usage:    65244 (    260976 Kb)
 NonPagedPool Max:      65503 (    262012 Kb)
 ********** Excessive NonPaged Pool Usage *****

 PagedPool 0 Usage:      6576 (     26304 Kb)
 PagedPool 1 Usage:       629 (      2516 Kb)
 PagedPool 2 Usage:       624 (      2496 Kb)
 PagedPool 3 Usage:       608 (      2432 Kb)
 PagedPool 4 Usage:       625 (      2500 Kb)
 PagedPool Usage:        9062 (     36248 Kb)
 PagedPool Maximum:     66560 (    266240 Kb)

********** 184 pool allocations have failed **********

 Shared Commit:          7711 (     30844 Kb)
 Special Pool:              0 (         0 Kb)
 Shared Process:        10625 (     42500 Kb)
 PagedPool Commit:       9102 (     36408 Kb)
 Driver Commit:          1759 (      7036 Kb)
 Committed pages:      425816 (   1703264 Kb)
 Commit limit:        2052560 (   8210240 Kb)
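
To put a number on how exhausted the pool is, we can compare the NonPagedPool Usage and NonPagedPool Max lines. A minimal sketch, assuming !vm output in the layout shown above; the function name nonpaged_pool_percent is hypothetical.

```python
import re


def nonpaged_pool_percent(vm_output):
    """Return non-paged pool usage as a percentage of its maximum (in pages)."""
    usage = re.search(r"NonPagedPool Usage:\s+(\d+)", vm_output)
    limit = re.search(r"NonPagedPool Max:\s+(\d+)", vm_output)
    if usage is None or limit is None:
        raise ValueError("unexpected !vm output layout")
    return 100.0 * int(usage.group(1)) / int(limit.group(1))


# Lines copied from the !vm output above.
sample = """
 NonPagedPool Usage:    65244 (    260976 Kb)
 NonPagedPool Max:      65503 (    262012 Kb)
"""

print(f"{nonpaged_pool_percent(sample):.1f}% of non-paged pool is in use")
```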

Looking at non-paged pool consumption reveals an excessive number of thread objects:

0: kd> !poolused 3
   Sorting by  NonPaged Pool Consumed

  Pool Used:
            NonPaged
 Tag    Allocs    Frees     Diff     Used
 Thre   772672   463590   309082 192867168  Thread objects , Binary: nt!ps
 MmCm       42        9       33 12153104   Calls made to MmAllocateContiguousMemory , Binary: nt!mm
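
The columns of the !poolused output can be checked against each other. The sketch below assumes the column order shown above (Tag, Allocs, Frees, Diff, Used); the function name parse_poolused_line is hypothetical.

```python
def parse_poolused_line(line):
    """Return (tag, allocs, frees, diff, used) from one !poolused row."""
    fields = line.split()
    tag = fields[0]
    allocs, frees, diff, used = (int(value) for value in fields[1:5])
    return tag, allocs, frees, diff, used


# Row copied from the !poolused output above.
row = " Thre   772672   463590   309082 192867168  Thread objects , Binary: nt!ps"

tag, allocs, frees, diff, used = parse_poolused_line(row)
print(f"{tag}: {diff} outstanding allocations, {used // diff} bytes each")
```

Here Diff equals Allocs minus Frees, and dividing Used by Diff gives 624 bytes for each outstanding Thre allocation.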


The next logical step would be to list processes and find their handle usage. Indeed there is such a process:

0: kd> !process 0 0



PROCESS 88b75020  SessionId: 7  Cid: 172e4    Peb: 7ffdf000  ParentCid: 17238
    DirBase: c7fb6bc0  ObjectTable: e17f50a0  HandleCount: 143428.
    Image: iexplore.exe


Making the process current and listing its handles shows contiguously allocated handles to thread objects:

0: kd> .process 88b75020
Implicit process is now 88b75020
0: kd> .reload /user

0: kd> !handle



0d94: Object: 88a6b020  GrantedAccess: 001f03ff Entry: e35e1b28
Object: 88a6b020  Type: (8b780c68) Thread
    ObjectHeader: 88a6b008
        HandleCount: 1  PointerCount: 1

0d98: Object: 88a97320  GrantedAccess: 001f03ff Entry: e35e1b30
Object: 88a97320  Type: (8b780c68) Thread
    ObjectHeader: 88a97308
        HandleCount: 1  PointerCount: 1

0d9c: Object: 88b2b020  GrantedAccess: 001f03ff Entry: e35e1b38
Object: 88b2b020  Type: (8b780c68) Thread
    ObjectHeader: 88b2b008
        HandleCount: 1  PointerCount: 1

0da0: Object: 88b2a730  GrantedAccess: 001f03ff Entry: e35e1b40
Object: 88b2a730  Type: (8b780c68) Thread
    ObjectHeader: 88b2a718
        HandleCount: 1  PointerCount: 1

0da4: Object: 88b929a0  GrantedAccess: 001f03ff Entry: e35e1b48
Object: 88b929a0  Type: (8b780c68) Thread
    ObjectHeader: 88b92988
        HandleCount: 1  PointerCount: 1

0da8: Object: 88a57db0  GrantedAccess: 001f03ff Entry: e35e1b50
Object: 88a57db0  Type: (8b780c68) Thread
    ObjectHeader: 88a57d98
        HandleCount: 1  PointerCount: 1

0dac: Object: 88b92db0  GrantedAccess: 001f03ff Entry: e35e1b58
Object: 88b92db0  Type: (8b780c68) Thread
    ObjectHeader: 88b92d98
        HandleCount: 1  PointerCount: 1

0db0: Object: 88b4a730  GrantedAccess: 001f03ff Entry: e35e1b60
Object: 88b4a730  Type: (8b780c68) Thread
    ObjectHeader: 88b4a718
        HandleCount: 1  PointerCount: 1

0db4: Object: 88a7e730  GrantedAccess: 001f03ff Entry: e35e1b68
Object: 88a7e730  Type: (8b780c68) Thread
    ObjectHeader: 88a7e718
        HandleCount: 1  PointerCount: 1

0db8: Object: 88a349a0  GrantedAccess: 001f03ff Entry: e35e1b70
Object: 88a349a0  Type: (8b780c68) Thread
    ObjectHeader: 88a34988
        HandleCount: 1  PointerCount: 1

0dbc: Object: 88a554c0  GrantedAccess: 001f03ff Entry: e35e1b78
Object: 88a554c0  Type: (8b780c68) Thread
    ObjectHeader: 88a554a8
        HandleCount: 1  PointerCount: 1
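
With more than a hundred thousand handles it is impractical to read this output by eye. The following Python sketch, written under the assumption that the !handle text looks exactly like the excerpt above, groups handle values by object type and checks that consecutive thread handles differ by 4, which is what I mean by contiguously allocated. The helper name and regular expressions are illustrative only.

```python
import re

# A short excerpt of the !handle output above.
SAMPLE = """\
0d94: Object: 88a6b020  GrantedAccess: 001f03ff Entry: e35e1b28
Object: 88a6b020  Type: (8b780c68) Thread
    ObjectHeader: 88a6b008
        HandleCount: 1  PointerCount: 1

0d98: Object: 88a97320  GrantedAccess: 001f03ff Entry: e35e1b30
Object: 88a97320  Type: (8b780c68) Thread
    ObjectHeader: 88a97308
        HandleCount: 1  PointerCount: 1

0d9c: Object: 88b2b020  GrantedAccess: 001f03ff Entry: e35e1b38
Object: 88b2b020  Type: (8b780c68) Thread
    ObjectHeader: 88b2b008
        HandleCount: 1  PointerCount: 1
"""

HANDLE_RE = re.compile(r"^([0-9a-f]+): Object: ([0-9a-f]+)", re.M)
TYPE_RE = re.compile(r"^Object: ([0-9a-f]+)\s+Type: \([0-9a-f]+\) (\S+)", re.M)

def handles_by_type(text):
    """Return {type name: [handle values]} for a !handle text dump."""
    types = {obj: type_name for obj, type_name in TYPE_RE.findall(text)}
    result = {}
    for handle, obj in HANDLE_RE.findall(text):
        result.setdefault(types.get(obj, "?"), []).append(int(handle, 16))
    return result

by_type = handles_by_type(SAMPLE)
thread_handles = by_type["Thread"]
# Set of differences between neighbouring handle values.
steps = {b - a for a, b in zip(thread_handles, thread_handles[1:])}

print(len(thread_handles), "thread handles, steps between them:", sorted(steps))
```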


Examination of these threads shows their stack traces and start address:

0: kd> !thread 88b4a730
THREAD 88b4a730  Cid 0004.1885c  Teb: 00000000 Win32Thread: 00000000 TERMINATED
Not impersonating
DeviceMap                 e1000930
Owning Process            8b7807a8       Image:         System
Wait Start TickCount      975361         Ticks: 980987 (0:04:15:27.921)
Context Switch Count      1
UserTime                  00:00:00.0000
KernelTime                00:00:00.0000
Start Address mydriver!StatusWaitThread (0xf5c5d128)
Stack Init 0 Current f3c4cc98 Base f3c4d000 Limit f3c4a000 Call 0
Priority 8 BasePriority 8 PriorityDecrement 0
ChildEBP RetAddr  Args to Child
f3c4ccac 8083129e ffdff5f0 8697ba00 a674c913 hal!KfLowerIrql+0x62
f3c4ccc8 00000000 808ae498 8697ba00 00000000 nt!KiExitDispatcher+0x130

0: kd> !thread 88a554c0
THREAD 88a554c0  Cid 0004.1888c  Teb: 00000000 Win32Thread: 00000000 TERMINATED
Not impersonating
DeviceMap                 e1000930
Owning Process            8b7807a8       Image:         System
Wait Start TickCount      975380         Ticks: 980968 (0:04:15:27.625)
Context Switch Count      1
UserTime                  00:00:00.0000
KernelTime                00:00:00.0000
Start Address mydriver!StatusWaitThread (0xf5c5d128)
Stack Init 0 Current f3c4cc98 Base f3c4d000 Limit f3c4a000 Call 0
Priority 8 BasePriority 8 PriorityDecrement 0
ChildEBP RetAddr  Args to Child
f3c4ccac 8083129e ffdff5f0 8697ba00 a674c913 hal!KfLowerIrql+0x62
f3c4ccc8 00000000 808ae498 8697ba00 00000000 nt!KiExitDispatcher+0x130

We can see that they have been terminated and that their start address belongs to mydriver.sys. A terminated thread object cannot be freed while a handle to it remains open, so the mydriver code has to be examined to find the source of the handle leak.
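
When there are many threads to check, the same conclusion can be reached by tallying thread state and start module. This is a minimal sketch, assuming the state is the last word of the THREAD line and the symbol is the third word of the Start Address line, as in the two !thread outputs above; other thread states print differently and would need extra handling.

```python
from collections import Counter

# The two relevant lines from each !thread output above.
SAMPLE = """\
THREAD 88b4a730  Cid 0004.1885c  Teb: 00000000 Win32Thread: 00000000 TERMINATED
Start Address mydriver!StatusWaitThread (0xf5c5d128)
THREAD 88a554c0  Cid 0004.1888c  Teb: 00000000 Win32Thread: 00000000 TERMINATED
Start Address mydriver!StatusWaitThread (0xf5c5d128)
"""

def summarize_threads(text):
    """Count threads per (state, start module) pair (hypothetical helper)."""
    counts = Counter()
    state = None
    for line in text.splitlines():
        if line.startswith("THREAD "):
            state = line.split()[-1]            # e.g. TERMINATED
        elif line.startswith("Start Address ") and state is not None:
            symbol = line.split()[2]            # e.g. mydriver!StatusWaitThread
            module = symbol.split("!")[0]
            counts[(state, module)] += 1
            state = None
    return counts

summary = summarize_threads(SAMPLE)
for (state, module), count in summary.items():
    print(state, module, count)
```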

- Dmitry Vostokov @ DumpAnalysis.org -

Music for Debugging

July 15th, 2007

Debugging and understanding multithreaded programs is hard and sometimes it requires running several execution paths mentally. Listening to composers who use multithreading in music can help here. My favourite is J.S. Bach and I recently purchased his complete works (155 CD box from Brilliant Classics). Virtuoso music helps me in live debugging too and here my favourites are Chopin and Liszt. I recently purchased 30 CD box of complete Chopin works (also from Brilliant Classics).

Many software engineers listen to music when writing code and I’m no exception. However, I have found that not all music suitable for programming helps me during debugging sessions.

Music for relaxation, quiet classical or modern music helps me to think about program design and write solid code. Music with several melodies played simultaneously, heroic and virtuoso works help me to achieve breakthrough and find a bug. The latter kind of music also suits me for listening when doing crash dump analysis or problem troubleshooting.

In 1997 I read a wonderful book “Zen of Windows 95 Programming: Master the Art of Moving to Windows 95 and Creating High-Performance Windows Applications” written by Lou Grinzo and this book provides some music suggestions to listen to while doing programming.

Recently I’ve found a research project related to audio debugging that maps source code to musical structures:

http://www.wsu.edu/~stefika/ProgramAuralization.html

- Dmitry Vostokov @ DumpAnalysis.org -

Interrupts and exceptions explained (Part 4)

July 15th, 2007

The previous part discussed processor interrupts in user mode. In this part I will explain the WinDbg .trap command and show how to simulate it manually.

Upon an interrupt a processor saves the current instruction pointer and transfers execution to an interrupt handler as explained in the first part of this series. This interrupt handler has to save the full thread context before calling other functions to do complex interrupt processing. For example, if we disassemble the KiTrap0E handler from an x86 Windows 2003 crash dump we see that it saves a lot of registers, including segment registers:

3: kd> uf nt!KiTrap0E
...
...
...
nt!KiTrap0E:
e088bb2c mov     word ptr [esp+2],0
e088bb33 push    ebp
e088bb34 push    ebx
e088bb35 push    esi
e088bb36 push    edi
e088bb37 push    fs
e088bb39 mov     ebx,30h
e088bb3e mov     fs,bx
e088bb41 mov     ebx,dword ptr fs:[0]
e088bb48 push    ebx
e088bb49 sub     esp,4
e088bb4c push    eax
e088bb4d push    ecx
e088bb4e push    edx
e088bb4f push    ds
e088bb50 push    es
e088bb51 push    gs
e088bb53 mov     ax,23h
e088bb57 sub     esp,30h
e088bb5a mov     ds,ax
e088bb5d mov     es,ax
e088bb60 mov     ebp,esp
e088bb62 test    dword ptr [esp+70h],20000h
e088bb6a jne     nt!V86_kite_a (e088bb04)
...
...
...

The saved processor state information (context) forms the so-called Windows kernel trap frame:

3: kd> dt _KTRAP_FRAME
+0x000 DbgEbp           : Uint4B
+0x004 DbgEip           : Uint4B
+0x008 DbgArgMark       : Uint4B
+0x00c DbgArgPointer    : Uint4B
+0x010 TempSegCs        : Uint4B
+0x014 TempEsp          : Uint4B
+0x018 Dr0              : Uint4B
+0x01c Dr1              : Uint4B
+0x020 Dr2              : Uint4B
+0x024 Dr3              : Uint4B
+0x028 Dr6              : Uint4B
+0x02c Dr7              : Uint4B
+0x030 SegGs            : Uint4B
+0x034 SegEs            : Uint4B
+0x038 SegDs            : Uint4B
+0x03c Edx              : Uint4B
+0x040 Ecx              : Uint4B
+0x044 Eax              : Uint4B
+0x048 PreviousPreviousMode : Uint4B
+0x04c ExceptionList    : Ptr32 _EXCEPTION_REGISTRATION_RECORD
+0x050 SegFs            : Uint4B
+0x054 Edi              : Uint4B
+0x058 Esi              : Uint4B
+0x05c Ebx              : Uint4B
+0x060 Ebp              : Uint4B
+0x064 ErrCode          : Uint4B
+0x068 Eip              : Uint4B
+0x06c SegCs            : Uint4B
+0x070 EFlags           : Uint4B
+0x074 HardwareEsp      : Uint4B
+0x078 HardwareSegSs    : Uint4B
+0x07c V86Es            : Uint4B
+0x080 V86Ds            : Uint4B
+0x084 V86Fs            : Uint4B
+0x088 V86Gs            : Uint4B

This Windows trap frame is not the same as the interrupt frame a processor saves on the current thread stack when an interrupt occurs in kernel mode. The latter frame is very small and consists only of EIP, CS, EFLAGS and, for exceptions that supply one (such as a page fault), an error code. When an interrupt occurs in user mode an x86 processor additionally saves the current stack pointer SS:ESP.

The .trap command finds the trap frame on a current thread stack and sets the current thread register context using the values from that saved structure. You can see that command in action for certain bugchecks when you use !analyze -v:

3: kd> !analyze -v
KERNEL_MODE_EXCEPTION_NOT_HANDLED (8e)
...
...
...
Arguments:
Arg1: c0000005, The exception code that was not handled
Arg2: de65190c, The address that the exception occurred at
Arg3: f24f8a74, Trap Frame
Arg4: 00000000



TRAP_FRAME:  f24f8a74 -- (.trap fffffffff24f8a74)
.trap fffffffff24f8a74
ErrCode = 00000000
eax=dbc128c0 ebx=dbe4a010 ecx=f24f8ac4 edx=00000001 esi=46525356 edi=00000000
eip=de65190c esp=f24f8ae8 ebp=f24f8b18 iopl=0 nv up ei pl nz na pe nc
cs=0008 ss=0010 ds=0023 es=0023 fs=0030 gs=0000 efl=00010206
driver!foo+0x16:
de65190c 837e1c00         cmp     dword ptr [esi+1Ch],0 ds:0023:46525372=????????


If we look at the trap frame we would see the same register values that WinDbg reports above:

3: kd> dt _KTRAP_FRAME f24f8a74
+0x000 DbgEbp           : 0xf24f8b18
+0x004 DbgEip           : 0xde65190c
+0x008 DbgArgMark       : 0xbadb0d00
+0x00c DbgArgPointer    : 1
+0x010 TempSegCs        : 0xb0501cd
+0x014 TempEsp          : 0xdcc01cd0
+0x018 Dr0              : 0xf24f8aa8
+0x01c Dr1              : 0xde46c90a
+0x020 Dr2              : 0
+0x024 Dr3              : 0
+0x028 Dr6              : 0xdbe4a000
+0x02c Dr7              : 0
+0x030 SegGs            : 0
+0x034 SegEs            : 0x23
+0x038 SegDs            : 0x23
+0x03c Edx              : 1
+0x040 Ecx              : 0xf24f8ac4
+0x044 Eax              : 0xdbc128c0
+0x048 PreviousPreviousMode : 0xdbe4a010
+0x04c ExceptionList    : 0xffffffff _EXCEPTION_REGISTRATION_RECORD
+0x050 SegFs            : 0x30
+0x054 Edi              : 0
+0x058 Esi              : 0x46525356
+0x05c Ebx              : 0xdbe4a010
+0x060 Ebp              : 0xf24f8b18
+0x064 ErrCode          : 0
+0x068 Eip              : 0xde65190c ; driver!foo+0x16
+0x06c SegCs            : 8
+0x070 EFlags           : 0x10206
+0x074 HardwareEsp      : 0xdbc171b0
+0x078 HardwareSegSs    : 0xde667677
+0x07c V86Es            : 0xdbc128c0
+0x080 V86Ds            : 0xdbc171c4
+0x084 V86Fs            : 0xf24f8bc4
+0x088 V86Gs            : 0
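
To make the mapping from the structure to the register display explicit, here is a small Python sketch that reads the same fields by offset. The offsets and values are copied from the dt output above. The rule for ESP is my reading of the .trap output in this post (esp=f24f8ae8, which is the address of the HardwareEsp field, not its contents) together with the fact that the processor pushes SS:ESP only when coming from user mode. It is not a documented description of what WinDbg does, and V86 mode is ignored.

```python
# Field offsets copied from "dt _KTRAP_FRAME" (x86) above.
OFFSETS = {
    "SegGs": 0x30, "SegEs": 0x34, "SegDs": 0x38,
    "Edx": 0x3c, "Ecx": 0x40, "Eax": 0x44,
    "SegFs": 0x50, "Edi": 0x54, "Esi": 0x58, "Ebx": 0x5c, "Ebp": 0x60,
    "ErrCode": 0x64, "Eip": 0x68, "SegCs": 0x6c, "EFlags": 0x70,
    "HardwareEsp": 0x74, "HardwareSegSs": 0x78,
}

FRAME = 0xf24f8a74

# Field values copied from "dt _KTRAP_FRAME f24f8a74" above.
VALUES = {
    "SegGs": 0x0, "SegEs": 0x23, "SegDs": 0x23,
    "Edx": 0x1, "Ecx": 0xf24f8ac4, "Eax": 0xdbc128c0,
    "SegFs": 0x30, "Edi": 0x0, "Esi": 0x46525356, "Ebx": 0xdbe4a010,
    "Ebp": 0xf24f8b18, "ErrCode": 0x0, "Eip": 0xde65190c, "SegCs": 0x8,
    "EFlags": 0x10206, "HardwareEsp": 0xdbc171b0, "HardwareSegSs": 0xde667677,
}

# Stand-in for dump memory: address -> 32-bit value.
MEMORY = {FRAME + OFFSETS[name]: value for name, value in VALUES.items()}

def decode_trap_frame(read_dword, frame):
    """Rebuild a register set from a trap frame (illustrative helper)."""
    regs = {name: read_dword(frame + off) for name, off in OFFSETS.items()}
    if regs["SegCs"] & 1:
        # Trap came from user mode: the processor pushed SS:ESP.
        regs["Esp"] = regs["HardwareEsp"]
    else:
        # Trap came from kernel mode: no SS:ESP was pushed, so the
        # interrupted stack pointer is the address just above EFLAGS.
        regs["Esp"] = frame + OFFSETS["HardwareEsp"]
    return regs

regs = decode_trap_frame(MEMORY.__getitem__, FRAME)
print("eip=%08x esp=%08x ebp=%08x esi=%08x" % (
    regs["Eip"], regs["Esp"], regs["Ebp"], regs["Esi"]))
```

This also explains why the HardwareEsp and HardwareSegSs values in this frame look like ordinary stack contents: for a kernel-mode trap those slots simply hold whatever the interrupted code had on its stack.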

It is good to know how to find a trap frame manually in case the stack is corrupt or WinDbg cannot find a trap frame automatically. In this case we can take advantage of the fact that the DS and ES segment registers have the same value in the Windows flat memory model:

+0x034 SegEs            : 0x23
+0x038 SegDs            : 0x23

We need to find 2 consecutive 0x23 values on the stack. There may be several such places, but usually the correct one lies between the KiTrapXX return address on the stack and the initial processor trap frame (EIP, CS and EFLAGS at f24f8adc through f24f8ae4 in the output below, preceded by the error code at f24f8ad8). This is because KiTrapXX calls other functions to further process an interrupt, so its return address is saved on the stack.

3: kd> r
eax=f535713c ebx=de65190c ecx=00000000 edx=e088e1d2 esi=f5357120 edi=00000000
eip=e0827451 esp=f24f8628 ebp=f24f8640 iopl=0 nv up ei ng nz na pe nc
cs=0008 ss=0010 ds=0023 es=0023 fs=0030 gs=0000 efl=00000286
nt!KeBugCheckEx+0x1b:
e0827451 5d              pop     ebp

3: kd> dds f24f8628 f24f8628+1000
...
...
...
f24f8784  de4b2995 win32k!NtUserQueryWindow
f24f8788  00000000
f24f878c  fe76a324
f24f8790  f24f8d64
f24f8794  0006e43c
f24f8798  e087c041 nt!ExReleaseResourceAndLeaveCriticalRegion+0x5
f24f879c  83f3b801
f24f87a0  f24f8a58
f24f87a4  0000003b
f24f87a8  00000000
f24f87ac  00000030
f24f87b0  00000023
f24f87b4  00000023

f24f87b8  00000000



f24f8a58  00000111
f24f8a5c  f24f8a74
f24f8a60  e088bc08 nt!KiTrap0E+0xdc
f24f8a64  00000000
f24f8a68  46525372
f24f8a6c  00000000
f24f8a70  e0889686 nt!Kei386EoiHelper+0x186
f24f8a74  f24f8b18
f24f8a78  de65190c driver!foo+0x16
f24f8a7c  badb0d00
f24f8a80  00000001
f24f8a84  0b0501cd
f24f8a88  dcc01cd0
f24f8a8c  f24f8aa8
f24f8a90  de46c90a win32k!HANDLELOCK::vLockHandle+0x80
f24f8a94  00000000
f24f8a98  00000000
f24f8a9c  dbe4a000
f24f8aa0  00000000
f24f8aa4  00000000
f24f8aa8  00000023
f24f8aac  00000023

f24f8ab0  00000001
f24f8ab4  f24f8ac4
f24f8ab8  dbc128c0
f24f8abc  dbe4a010
f24f8ac0  ffffffff
f24f8ac4  00000030
f24f8ac8  00000000
f24f8acc  46525356
f24f8ad0  dbe4a010
f24f8ad4  f24f8b18
f24f8ad8  00000000
f24f8adc  de65190c driver!foo+0x16
f24f8ae0  00000008
f24f8ae4  00010206

f24f8ae8  dbc171b0
f24f8aec  de667677 driver!bar+0x173
f24f8af0  dbc128c0
f24f8af4  dbc171c4
f24f8af8  f24f8bc4
f24f8afc  00000000


Subtracting the offset 0x38 from the address of the second 00000023 value (f24f8aac) and using the dt command, we can check the _KTRAP_FRAME structure and apply the .trap command afterwards:

3: kd> dt _KTRAP_FRAME f24f8aac-38
+0x000 DbgEbp           : 0xf24f8b18
+0x004 DbgEip           : 0xde65190c
+0x008 DbgArgMark       : 0xbadb0d00
+0x00c DbgArgPointer    : 1
+0x010 TempSegCs        : 0xb0501cd
+0x014 TempEsp          : 0xdcc01cd0
+0x018 Dr0              : 0xf24f8aa8
+0x01c Dr1              : 0xde46c90a
+0x020 Dr2              : 0
+0x024 Dr3              : 0
+0x028 Dr6              : 0xdbe4a000
+0x02c Dr7              : 0
+0x030 SegGs            : 0
+0x034 SegEs            : 0x23
+0x038 SegDs            : 0x23
+0x03c Edx              : 1
+0x040 Ecx              : 0xf24f8ac4
+0x044 Eax              : 0xdbc128c0
+0x048 PreviousPreviousMode : 0xdbe4a010
+0x04c ExceptionList    : 0xffffffff _EXCEPTION_REGISTRATION_RECORD
+0x050 SegFs            : 0x30
+0x054 Edi              : 0
+0x058 Esi              : 0x46525356
+0x05c Ebx              : 0xdbe4a010
+0x060 Ebp              : 0xf24f8b18
+0x064 ErrCode          : 0
+0x068 Eip              : 0xde65190c
+0x06c SegCs            : 8
+0x070 EFlags           : 0x10206
+0x074 HardwareEsp      : 0xdbc171b0
+0x078 HardwareSegSs    : 0xde667677
+0x07c V86Es            : 0xdbc128c0
+0x080 V86Ds            : 0xdbc171c4
+0x084 V86Fs            : 0xf24f8bc4
+0x088 V86Gs            : 0

3: kd> ? f24f8aac-38
Evaluate expression: -229668236 = f24f8a74

3: kd> .trap f24f8a74
ErrCode = 00000000
eax=dbc128c0 ebx=dbe4a010 ecx=f24f8ac4 edx=00000001 esi=46525356 edi=00000000
eip=de65190c esp=f24f8ae8 ebp=f24f8b18 iopl=0 nv up ei pl nz na pe nc
cs=0008 ss=0010 ds=0023 es=0023 fs=0030 gs=0000 efl=00010206
driver!foo+0x16:
de65190c 837e1c00        cmp     dword ptr [esi+1Ch],0 ds:0023:46525372=????????
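
The manual search described above can be sketched in a few lines of Python. The stack contents are a small excerpt of the dds output in this post, stored as an address-to-value dictionary. The function names are mine. The extra check for badb0d00 at offset 8 (DbgArgMark) is only a heuristic based on the value seen in this particular dump, so treat it as a hint for ranking candidates, not as a guarantee.

```python
SEG_ES_OFFSET = 0x34          # from dt _KTRAP_FRAME
DBG_ARG_MARK_OFFSET = 0x08    # from dt _KTRAP_FRAME
FLAT_SELECTOR = 0x23          # DS/ES value in this dump
ARG_MARK_SEEN_HERE = 0xbadb0d00

# Excerpt of "dds f24f8628 f24f8628+1000": address -> dword.
STACK = {
    0xf24f8784: 0xde4b2995,
    0xf24f87ac: 0x00000030,
    0xf24f87b0: 0x00000023,
    0xf24f87b4: 0x00000023,
    0xf24f87b8: 0x00000000,
    0xf24f8a74: 0xf24f8b18,
    0xf24f8a78: 0xde65190c,
    0xf24f8a7c: 0xbadb0d00,
    0xf24f8aa4: 0x00000000,
    0xf24f8aa8: 0x00000023,
    0xf24f8aac: 0x00000023,
    0xf24f8ab0: 0x00000001,
}

def candidate_trap_frames(stack):
    """Return frame addresses implied by two consecutive 0x23 dwords."""
    candidates = []
    for addr in sorted(stack):
        if stack[addr] == FLAT_SELECTOR and stack.get(addr + 4) == FLAT_SELECTOR:
            # addr would be SegEs, addr + 4 would be SegDs.
            candidates.append(addr - SEG_ES_OFFSET)
    return candidates

def to_signed32(value):
    """Show a 32-bit value the way '?' prints its decimal form."""
    return value - (1 << 32) if value & 0x80000000 else value

candidates = candidate_trap_frames(STACK)
marked = [frame for frame in candidates
          if STACK.get(frame + DBG_ARG_MARK_OFFSET) == ARG_MARK_SEEN_HERE]

for frame in candidates:
    note = "  <- has badb0d00 at +8" if frame in marked else ""
    print("%08x (%d)%s" % (frame, to_signed32(frame), note))
```

Subtracting 0x34 from the first 0x23 is the same as subtracting 0x38 from the second one, which is the form used in the dt command above.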

In complete memory dumps we can see that _KTRAP_FRAME is saved when calling system services too:

3: kd> kL
ChildEBP RetAddr
f24f8ae8 de667677 driver!foo+0x16
f24f8b18 de667799 driver!bar+0x173
f24f8b90 de4a853e win32k!GreSaveScreenBits+0x69
f24f8bd8 de4922bd win32k!CreateSpb+0x167
f24f8c40 de490bb8 win32k!zzzChangeStates+0x448
f24f8c88 de4912de win32k!zzzBltValidBits+0xe2
f24f8ce0 de4926c6 win32k!xxxEndDeferWindowPosEx+0x13a
f24f8cfc de49aa8f win32k!xxxSetWindowPos+0xb1
f24f8d34 de4acf4d win32k!xxxShowWindow+0x201
f24f8d54 e0888c6c win32k!NtUserShowWindow+0x79
f24f8d54 7c94ed54 nt!KiFastCallEntry+0xfc (TrapFrame @ f24f8d64)
0006e48c 77e34f1d ntdll!KiFastSystemCallRet
0006e53c 77e2f12f USER32!NtUserShowWindow+0xc
0006e570 77e2b0fe USER32!InternalDialogBox+0xa9
0006e590 77e29005 USER32!DialogBoxIndirectParamAorW+0x37
0006e5b4 0103d569 USER32!DialogBoxParamW+0x3f
0006e5d8 0102d2f5 winlogon!Fusion_DialogBoxParam+0x24

and we can get the current thread context before its transition to kernel mode:

3: kd> .trap f24f8d64
ErrCode = 00000000
eax=7ffff000 ebx=00000000 ecx=00000000 edx=7c94ed54 esi=00532e68 edi=0002002c
eip=7c94ed54 esp=0006e490 ebp=0006e53c iopl=0 nv up ei pl zr na pe nc
cs=001b ss=0023 ds=0023 es=0023 fs=003b gs=0000 efl=00000246
ntdll!KiFastSystemCallRet:
001b:7c94ed54 c3              ret

3: kd> kL
ChildEBP RetAddr
0006e48c 77e34f1d ntdll!KiFastSystemCallRet
0006e53c 77e2f12f USER32!NtUserShowWindow+0xc
0006e570 77e2b0fe USER32!InternalDialogBox+0xa9
0006e590 77e29005 USER32!DialogBoxIndirectParamAorW+0x37
0006e5b4 0103d569 USER32!DialogBoxParamW+0x3f
0006e5d8 0102d2f5 winlogon!Fusion_DialogBoxParam+0x24

In the next part I’ll show an example from an x64 crash dump.

- Dmitry Vostokov @ DumpAnalysis.org -

Reading Windows-based Code (Part 1)

July 13th, 2007

As promised, here is the first introductory part of the Code Reading (The Windows Perspective) training. You might need to download and install Microsoft Office Animation Runtime if you don’t have PowerPoint installed:

PowerPoint 2003/2002 Add-in: Office Animation Runtime 

The HTML version of the presentation is located here:

Reading Windows-based Code (Part 1)

- Dmitry Vostokov @ DumpAnalysis.org -

StressPrinters update

July 12th, 2007

The new version 1.3.1 has been published and can be downloaded from Citrix technical support:

StressPrinters 1.3.1 for 32-bit and 64-bit platforms

What’s new:

  1. Configurable timeout to mark potential printer drivers in the log 

  2. The log structure and warnings are documented in the article with an example

  3. AddPrinter command line section in the article for fine-tuning tests

  4. The option to execute a post-processing command after tests 

The motivation behind the creation of this tool is explained in the previous post:

StressPrinters: Stressing Printer Autocreation 

- Dmitry Vostokov @ DumpAnalysis.org -

Troubleshooting as debugging

July 11th, 2007

This post is motivated by the TRAFFIC steps introduced by Andreas Zeller in his book “Why Programs Fail?”. This book is wonderful and it gives practical debugging skills a coherent and solid systematic foundation.

However, these steps are for fixing defects in code, which is the traditional view of the software debugging process. Based on an analogy with systems theories where we have different levels of abstraction like psychology, biology, chemistry and physics, I would say that debugging starts when you have a failure at the system level.

If we compare systems to applications and troubleshooting to source code debugging, the question we ask at the higher level is “Who caused the product to fail?”, which also has a business and political flavor. Therefore I propose a different acronym: VERSION. If you always try to fix system problems at the code level you will get huge “traffic” in every sense, but if you troubleshoot them first you get a different system / subsystem / component version and get your problem solved faster. This is why we have technical support departments in organizations.

There are some parallels between TRAFFIC and VERSION steps:

TRAFFIC                   VERSION
Track                     View the problem
Reproduce                 Environment/repro steps
Automate (and simplify)   Relevant description
Find origins              Subsystem/component identification
Focus                     Identify the origin (subsystem/component)
Isolate (defect in code)  Obtain the solution (replace/eliminate subsystem/component)
Correct (defect in code)  New case study (document, postmortem analysis)

Troubleshooting doesn’t eliminate the need to look at source code. In many cases a support engineer has to be proficient in code reading to be able to map from traces to source code. This helps in component identification, especially if your product has an extensive tracing facility. I have started developing “Code Reading” training targeted at Windows environments and will post some presentations soon. The first one will be available tomorrow, so stay tuned.

- Dmitry Vostokov @ DumpAnalysis.org -

WinDbg is privacy-aware

July 8th, 2007

This is a quick follow-up to my previous post about privacy issues related to crash dumps. WinDbg has two options for the .dump command that remove potentially sensitive user data (from WinDbg help):

  • r  - Deletes from the minidump those portions of the stack and store memory that are not useful for recreating the stack trace. Local variables and other data type values are deleted as well. This option does not make the minidump smaller (because these memory sections are simply zeroed), but it is useful if you want to protect the privacy of other applications. 

  • R  - Deletes the full module paths from the minidump. Only the module names will be included. This is a useful option if you want to protect the privacy of the user’s directory structure.

Therefore it is possible to configure CDB or WinDbg as the default postmortem debugger and prevent process data from being included in the dump. Most of the stack is zeroed except the frame data pointers and return addresses used to reconstruct the stack trace. Therefore string constants like passwords are eliminated. I made the following test with CDB configured as the default postmortem debugger on my Windows x64 server:

HKEY_LOCAL_MACHINE\SOFTWARE\Wow6432Node\Microsoft\Windows NT\CurrentVersion\AeDebug
Debugger="C:\Program Files (x86)\Debugging Tools for Windows\cdb.exe" -p %ld -e %ld -g -c ".dump /o /mrR /u c:\temp\safedump.dmp; q"

I got the following stack trace from TestDefaultDebugger (module names and function offsets are removed for visual clarity):

0:000> kvL 100
ChildEBP RetAddr  Args to Child
002df868 00403263 00425ae8 00000000 002df8a8 OnBnClickedButton1
002df878 00403470 002dfe90 00000000 00000000 _AfxDispatchCmdMsg
002df8a8 00402a27 00000000 00000000 00000000 OnCmdMsg
002df8cc 00408e69 00000000 00000000 00000000 OnCmdMsg
002df91c 004098d9 00000000 00580a9e 00000000 OnCommand
002df9b8 00406258 00000000 00000000 00580a9e OnWndMsg
002df9d8 0040836d 00000000 00000000 00580a9e WindowProc
002dfa40 004083f4 00000000 00000000 00000000 AfxCallWndProc
002dfa60 7d9472d8 00000000 00000000 00000000 AfxWndProc
002dfa8c 7d9475c3 004083c0 00000000 00000000 InternalCallWinProc
002dfb04 7d948626 00000000 004083c0 00000000 UserCallWinProcCheckWow
002dfb48 7d94868d 00000000 00000000 00000000 SendMessageWorker
002dfb6c 7dbf87b3 00000000 00000000 00000000 SendMessageW
002dfb8c 7dbf8895 00000000 00000000 00000000 Button_NotifyParent
002dfba8 7dbfab9a 00000000 00000000 002dfcb0 Button_ReleaseCapture
002dfc38 7d9472d8 00580a9e 00000000 00000000 Button_WndProc
002dfc64 7d9475c3 7dbfa313 00580a9e 00000000 InternalCallWinProc
002dfcdc 7d9477f6 00000000 7dbfa313 00580a9e UserCallWinProcCheckWow
002dfd54 7d947838 00000000 00000000 002dfd90 DispatchMessageWorker
002dfd64 7d956ca0 00000000 00000000 002dfe90 DispatchMessageW
002dfd90 0040568b 00000000 00000000 002dfe90 IsDialogMessageW
002dfda0 004065d8 00000000 00402a07 00000000 IsDialogMessageW
002dfda8 00402a07 00000000 00000000 00000000 PreTranslateInput
002dfdb8 00408041 00000000 00000000 002dfe90 PreTranslateMessage
002dfdc8 00403ae3 00000000 00000000 00000000 WalkPreTranslateTree
002dfddc 00403c1e 00000000 00403b29 00000000 AfxInternalPreTranslateMessage
002dfde4 00403b29 00000000 00403c68 00000000 PreTranslateMessage
002dfdec 00403c68 00000000 00000000 002dfe90 AfxPreTranslateMessage
002dfdfc 00407920 00000000 002dfe90 002dfe6c AfxInternalPumpMessage
002dfe20 004030a1 00000000 00000000 0042ec18 CWnd::RunModalLoop
002dfe6c 0040110d 00000000 0042ec18 0042ec18 CDialog::DoModal
002dff18 004206fb 00000000 00000000 00000000 InitInstance
002dff28 0040e852 00400000 00000000 00000000 AfxWinMain
002dffc0 7d4e992a 00000000 00000000 00000000 __tmainCRTStartup
002dfff0 00000000 0040e8bb 00000000 00000000 BaseProcessStart

We can see that most arguments are zeroes. Those that are not either do not point to valid data or are related to function return addresses and frame pointers. This can be seen from the raw stack data as well:

0:000> dds esp
002df86c  00403263 TestDefaultDebugger!_AfxDispatchCmdMsg+0x43
002df870  00425ae8 TestDefaultDebugger!CTestDefaultDebuggerApp::`vftable'+0x154
002df874  00000000
002df878  002df8a8
002df87c  00403470 TestDefaultDebugger!CCmdTarget::OnCmdMsg+0x118
002df880  002dfe90
002df884  00000000
002df888  00000000
002df88c  004014f0 TestDefaultDebugger!CTestDefaultDebuggerDlg::OnBnClickedButton1
002df890  00000000
002df894  00000000
002df898  00000000
002df89c  002dfe90
002df8a0  00000000
002df8a4  00000000
002df8a8  002df8cc
002df8ac  00402a27 TestDefaultDebugger!CDialog::OnCmdMsg+0x1b
002df8b0  00000000
002df8b4  00000000
002df8b8  00000000
002df8bc  00000000
002df8c0  00000000
002df8c4  002dfe90
002df8c8  00000000
002df8cc  002df91c
002df8d0  00408e69 TestDefaultDebugger!CWnd::OnCommand+0x90
002df8d4  00000000
002df8d8  00000000
002df8dc  00000000
002df8e0  00000000
002df8e4  002dfe90
002df8e8  002dfe90

We can compare it with a normal full dump or a minidump saved with other /m options. The output below shows the original values of the data that the /mr option zeroes (module names and function offsets are removed for visual clarity):

0:000> kvL 100
ChildEBP RetAddr Args to Child
002df868 00403263 00425ae8 00000111 002df8a8 OnBnClickedButton1
002df878 00403470 002dfe90 000003e8 00000000 _AfxDispatchCmdMsg
002df8a8 00402a27 000003e8 00000000 00000000 OnCmdMsg
002df8cc 00408e69 000003e8 00000000 00000000 OnCmdMsg
002df91c 004098d9 00000000 00271876 d5b6c7f7 OnCommand
002df9b8 00406258 00000111 000003e8 00271876 OnWndMsg
002df9d8 0040836d 00000111 000003e8 00271876 WindowProc
002dfa40 004083f4 00000000 00561878 00000111 AfxCallWndProc
002dfa60 7d9472d8 00561878 00000111 000003e8 AfxWndProc
002dfa8c 7d9475c3 004083c0 00561878 00000111 InternalCallWinProc
002dfb04 7d948626 00000000 004083c0 00561878 UserCallWinProcCheckWow
002dfb48 7d94868d 00aec860 00000000 00000111 SendMessageWorker
002dfb6c 7dbf87b3 00561878 00000111 000003e8 SendMessageW
002dfb8c 7dbf8895 002ec9e0 00000000 0023002c Button_NotifyParent
002dfba8 7dbfab9a 002ec9e0 00000001 002dfcb0 Button_ReleaseCapture
002dfc38 7d9472d8 00271876 00000202 00000000 Button_WndProc
002dfc64 7d9475c3 7dbfa313 00271876 00000202 InternalCallWinProc
002dfcdc 7d9477f6 00000000 7dbfa313 00271876 UserCallWinProcCheckWow
002dfd54 7d947838 002e77f8 00000000 002dfd90 DispatchMessageWorker
002dfd64 7d956ca0 002e77f8 00000000 002dfe90 DispatchMessageW
002dfd90 0040568b 00561878 00000000 002dfe90 IsDialogMessageW
002dfda0 004065d8 002e77f8 00402a07 002e77f8 IsDialogMessageW
002dfda8 00402a07 002e77f8 002e77f8 00561878 PreTranslateInput
002dfdb8 00408041 002e77f8 002e77f8 002dfe90 PreTranslateMessage
002dfdc8 00403ae3 00561878 002e77f8 002e77f8 WalkPreTranslateTree
002dfddc 00403c1e 002e77f8 00403b29 002e77f8 AfxInternalPreTranslateMessage
002dfde4 00403b29 002e77f8 00403c68 002e77f8 PreTranslateMessage
002dfdec 00403c68 002e77f8 00000000 002dfe90 AfxPreTranslateMessage
002dfdfc 00407920 00000004 002dfe90 002dfe6c AfxInternalPumpMessage
002dfe20 004030a1 00000004 d5b6c023 0042ec18 RunModalLoop
002dfe6c 0040110d d5b6c037 0042ec18 0042ec18 DoModal
002dff18 004206fb 00000ece 00000002 00000001 InitInstance
002dff28 0040e852 00400000 00000000 001d083e AfxWinMain
002dffc0 7d4e992a 00000000 00000000 7efdf000 __tmainCRTStartup
002dfff0 00000000 0040e8bb 00000000 000000c8 BaseProcessStart

0:000> dds esp
002df86c 00403263 TestDefaultDebugger!_AfxDispatchCmdMsg+0x43
002df870 00425ae8 TestDefaultDebugger!CTestDefaultDebuggerApp::`vftable'+0x154
002df874 00000111
002df878 002df8a8
002df87c 00403470 TestDefaultDebugger!CCmdTarget::OnCmdMsg+0x118
002df880 002dfe90
002df884 000003e8
002df888 00000000
002df88c 004014f0 TestDefaultDebugger!CTestDefaultDebuggerDlg::OnBnClickedButton1
002df890 00000000
002df894 00000038
002df898 00000000
002df89c 002dfe90
002df8a0 000003e8
002df8a4 00000000
002df8a8 002df8cc
002df8ac 00402a27 TestDefaultDebugger!CDialog::OnCmdMsg+0x1b
002df8b0 000003e8
002df8b4 00000000
002df8b8 00000000
002df8bc 00000000
002df8c0 000003e8
002df8c4 002dfe90
002df8c8 00000000
002df8cc 002df91c
002df8d0 00408e69 TestDefaultDebugger!CWnd::OnCommand+0x90
002df8d4 000003e8
002df8d8 00000000
002df8dc 00000000
002df8e0 00000000
002df8e4 002dfe90
002df8e8 002dfe90
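
The difference between the two dumps can also be found mechanically. The sketch below compares the first eight stack slots from the two dds listings in this post and reports the slots that held data in the normal dump but are zero in the /mr dump. The dictionaries and the helper name are illustrative. A real comparison would first need the raw stack exported from both dumps.

```python
# First eight slots of "dds esp" from the normal dump: address -> dword.
FULL = {
    0x002df86c: 0x00403263,
    0x002df870: 0x00425ae8,
    0x002df874: 0x00000111,
    0x002df878: 0x002df8a8,
    0x002df87c: 0x00403470,
    0x002df880: 0x002dfe90,
    0x002df884: 0x000003e8,
    0x002df888: 0x00000000,
}

# The same slots from the dump saved with /mrR.
SCRUBBED = {
    0x002df86c: 0x00403263,
    0x002df870: 0x00425ae8,
    0x002df874: 0x00000000,
    0x002df878: 0x002df8a8,
    0x002df87c: 0x00403470,
    0x002df880: 0x002dfe90,
    0x002df884: 0x00000000,
    0x002df888: 0x00000000,
}

def zeroed_slots(full, scrubbed):
    """List (address, original value) pairs that the scrubbed dump zeroed."""
    return [(addr, full[addr]) for addr in sorted(full)
            if full[addr] != 0 and scrubbed.get(addr) == 0]

zeroed = zeroed_slots(FULL, SCRUBBED)
for addr, value in zeroed:
    print("%08x: %08x -> 00000000" % (addr, value))
```

In this excerpt the zeroed slots are the message and control identifiers (00000111 and 000003e8), while return addresses, frame pointers and the pointer into the module image survive.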

- Dmitry Vostokov @ DumpAnalysis.org -

WinDbg update 6.7.5.1

July 8th, 2007

The new WinDbg has been released this week: 6.7.5.1. It contains some enhancements over 6.7.5.0, released in April.

What’s New for Debugging Tools for Windows

One improvement is for handling mini-dumps:

When loading modules from a user-mode minidump provide available misc and CV record info from dump. This can allow symbols to be loaded in some cases when PE image file is not available.

- Dmitry Vostokov @ DumpAnalysis.org -

Citrixware

July 8th, 2007

Citrix is a global leader in application delivery and access infrastructure solutions including application streaming and virtualization. There are so many great products developed by this company including WinFrame, MetaFrame, Presentation Server and its clients, Desktop Server, XenServer, XenApp, XenDesktop, Receiver and Dazzle, Access Gateway, Application Firewall, Application Gateway, NetScaler, WANScaler, GoToMeeting, GoToMyPC, GoToWebinar, GoToAssist, EdgeSight and Password Manager.

Citrix is no longer tied to Windows platforms because its products run on Linux, Solaris, FreeBSD, HP-UX, AIX, Symbian and Mac OS X as well. To bind them all together I propose to use the word “Citrixware”.

With more than 180,000 organizations in the world using Citrixware, the chances are that you use it too. This is a more encompassing word than just simple “accessware”.

- Dmitry Vostokov @ DumpAnalysis.org -

Resolving security issues with crash dumps

July 8th, 2007

It is a well-known fact that crash dumps may contain sensitive and private information. Crash reports that contain binary process extracts may contain it too. There is a conflict here between the desire to get full memory contents for debugging purposes and the possible security implications. The solution would be for postmortem debuggers and user mode process dumpers to implement an option to save only activity data, like stack traces, in text form. Some problems on a system level can be corrected just by looking at thread stack traces, the critical section list, full module information, thread times and so on. This can help to identify components that cause process crashes, hangs or CPU spikes.

Users or system administrators can review text data before sending it outside their environment. This was already implemented as Dr. Watson logs. However these logs don’t usually have sufficient information required for crash dump analysis compared to information we can extract from a dump using WinDbg, for example. If you need to analyze kernel and all process activities you can use scripts to convert your kernel and complete memory dumps to text files:

Dmp2Txt: Solving Security Problem

Similar scripts can be applied to user dumps:

Using scripts to process hundreds of user dumps

Generating good scripts in a production environment has one problem: the conversion tool or debugger needs to know about symbols. This can be easily done for Microsoft modules because of the Microsoft public symbol server. Other companies like Citrix offer public symbols for download:

Debug Symbols for Citrix Presentation Server

Alternatively one can write a WinDbg extension that loads a text file with stack traces, appropriate module images, finds the right PDB files and presents stack traces with full symbolic information. This can also be a separate program that uses Visual Studio DIA (Debug Interface Access) SDK to access PDB files later after receiving a text file from a customer.

I’m currently experimenting with some approaches and will write about them later. Text files will also be used in Internet Based Crash Dump Analysis Service because it is much easier to process text files than crash dumps. Although it is feasible to submit small mini dumps for this purpose they don’t contain much information and require writing specialized OS specific code to parse them. Also text files having the same file size can contain much more useful information without exposing private and sensitive information.

I would appreciate any comments and suggestions regarding this problem. 

- Dmitry Vostokov @ DumpAnalysis.org -

Coping with missing symbolic information

July 5th, 2007

Sometimes there is no private PDB file available for a module in a crash dump although we know that private PDB files exist for different versions of the same module. A typical example is when a public PDB file is loaded automatically and we need access to structure definitions, for example, _TEB or _PEB. In this case we need to force WinDbg to load an additional PDB file just to be able to use these structure definitions. This can be achieved by loading an additional module at a different address and forcing it to use another private PDB file. At the same time we want the original module to keep referencing the correct PDB file, albeit the public one. Let’s look at one concrete example.

I was trying to get stack limits for a thread by using !teb command:

0:000> !teb
TEB at 7efdd000
*** Your debugger is not using the correct symbols
***
*** In order for this command to work properly, your symbol path
*** must point to .pdb files that have full type information.
***
*** Certain .pdb files (such as the public OS symbols) do not
*** contain the required information.  Contact the group that
*** provided you with these symbols if you need this command to
*** work.
***
*** Type referenced: ntdll!_TEB
***
error InitTypeRead( TEB )…
0:000> dt ntdll!*

The lm command showed that the symbol file was loaded and that it was the correct one, so perhaps it was a public symbol file or the _TEB definition was missing from it:

0:000> lm m ntdll
start    end      module name
7d600000 7d6f0000 ntdll (pdb symbols) c:\websymbols\wntdll.pdb\40B574C84D5C42708465A7E4A1E4D7CC2\wntdll.pdb

I looked at the size of wntdll.pdb and it was 1,091 KB. I searched for other ntdll.pdb files, found a bigger one (1,187 KB) and appended its folder to my symbol search path:

0:000> .sympath+ C:\websymbols\ntdll.pdb\DCE823FCF71A4BF5AA489994520EA18F2
Symbol search path is: SRV*c:\websymbols*http://msdl.microsoft.com/download/symbols;C:\websymbols\ntdll.pdb\DCE823FCF71A4BF5AA489994520EA18F2

Then I looked in my symbol cache folder for ntdll.dll, picked one at random and loaded it at an address not occupied by other modules, forcing WinDbg to load symbol files and ignore any mismatch:

0:000> .reload /f /i C:\websymbols\ntdll.dll\45D709FFf0000\ntdll.dll=7E000000
0:000> lm
start    end        module name
...
...
...
7d600000 7d6f0000   ntdll      (pdb symbols)          c:\websymbols\wntdll.pdb\40B574C84D5C42708465A7E4A1E4D7CC2\wntdll.pdb
7d800000 7d890000   GDI32      (deferred)
7d8d0000 7d920000   Secur32    (deferred)
7d930000 7da00000   USER32     (deferred)
7da20000 7db00000   RPCRT4     (deferred)
7e000000 7e000000   ntdll_7e000000   (pdb symbols)          C:\websymbols\ntdll.pdb\DCE823FCF71A4BF5AA489994520EA18F2\ntdll.pdb

The additional ntdll.dll was loaded at address 7e000000 and its module name became ntdll_7e000000. Because I knew the TEB address I could see the values of the _TEB structure fields immediately:

0:000> dt -r1 ntdll_7e000000!_TEB 7efdd000
   +0x000 NtTib            : _NT_TIB
      +0x000 ExceptionList    : 0x0012fec0 _EXCEPTION_REGISTRATION_RECORD
      +0x004 StackBase        : 0x00130000
      +0x008 StackLimit       : 0x0011c000
      +0x00c SubSystemTib     : (null)
      +0x010 FiberData        : 0x00001e00
      +0x010 Version          : 0x1e00
      +0x014 ArbitraryUserPointer : (null)
      +0x018 Self             : 0x7efdd000 _NT_TIB
   +0x01c EnvironmentPointer : (null)
   +0x020 ClientId         : _CLIENT_ID
      +0x000 UniqueProcess    : 0x00000e0c
      +0x004 UniqueThread     : 0x000013dc
   +0x028 ActiveRpcHandle  : (null)
   +0x02c ThreadLocalStoragePointer : (null)
   +0x030 ProcessEnvironmentBlock : 0x7efde000 _PEB
      +0x000 InheritedAddressSpace : 0 ''
      +0x001 ReadImageFileExecOptions : 0x1 ''
      +0x002 BeingDebugged    : 0x1 ''
      +0x003 BitField         : 0 ''
      +0x003 ImageUsesLargePages : 0y0
      +0x003 SpareBits        : 0y0000000 (0)
      +0x004 Mutant           : 0xffffffff
      +0x008 ImageBaseAddress : 0x00400000
      +0x00c Ldr              : 0x7d6a01e0 _PEB_LDR_DATA
      +0x010 ProcessParameters : 0x00020000 _RTL_USER_PROCESS_PARAMETERS
      +0x014 SubSystemData    : (null)
      +0x018 ProcessHeap      : 0x00210000
      +0x01c FastPebLock      : 0x7d6a00e0 _RTL_CRITICAL_SECTION
      +0x020 AtlThunkSListPtr : (null)
      +0x024 SparePtr2        : (null)
      +0x028 EnvironmentUpdateCount : 1
      +0x02c KernelCallbackTable : 0x7d9419f0
      +0x030 SystemReserved   : [1] 0
      +0x034 SpareUlong       : 0
      +0x038 FreeList         : (null)
      +0x03c TlsExpansionCounter : 0
      +0x040 TlsBitmap        : 0x7d6a2058
      +0x044 TlsBitmapBits    : [2] 0xf
      +0x04c ReadOnlySharedMemoryBase : 0x7efe0000
      +0x050 ReadOnlySharedMemoryHeap : 0x7efe0000
      +0x054 ReadOnlyStaticServerData : 0x7efe0cd0  -> (null)
      +0x058 AnsiCodePageData : 0x7efb0000
      +0x05c OemCodePageData  : 0x7efc1000
      +0x060 UnicodeCaseTableData : 0x7efd2000
      +0x064 NumberOfProcessors : 8
      +0x068 NtGlobalFlag     : 0x70
      +0x070 CriticalSectionTimeout : _LARGE_INTEGER 0xffffe86d`079b8000
      +0x078 HeapSegmentReserve : 0x100000
      +0x07c HeapSegmentCommit : 0x2000
      +0x080 HeapDeCommitTotalFreeThreshold : 0x10000
      +0x084 HeapDeCommitFreeBlockThreshold : 0x1000
      +0x088 NumberOfHeaps    : 5
      +0x08c MaximumNumberOfHeaps : 0x10
      +0x090 ProcessHeaps     : 0x7d6a06a0  -> 0x00210000
      +0x094 GdiSharedHandleTable : (null)
      +0x098 ProcessStarterHelper : (null)
      +0x09c GdiDCAttributeList : 0
      +0x0a0 LoaderLock       : 0x7d6a0180 _RTL_CRITICAL_SECTION
      +0x0a4 OSMajorVersion   : 5
      +0x0a8 OSMinorVersion   : 2
      +0x0ac OSBuildNumber    : 0xece
      +0x0ae OSCSDVersion     : 0x200
      +0x0b0 OSPlatformId     : 2
      +0x0b4 ImageSubsystem   : 2
      +0x0b8 ImageSubsystemMajorVersion : 4
      +0x0bc ImageSubsystemMinorVersion : 0
      +0x0c0 ImageProcessAffinityMask : 0
      +0x0c4 GdiHandleBuffer  : [34] 0
      +0x14c PostProcessInitRoutine : (null)
      +0x150 TlsExpansionBitmap : 0x7d6a2050
      +0x154 TlsExpansionBitmapBits : [32] 1
      +0x1d4 SessionId        : 1
      +0x1d8 AppCompatFlags   : _ULARGE_INTEGER 0x0
      +0x1e0 AppCompatFlagsUser : _ULARGE_INTEGER 0x0
      +0x1e8 pShimData        : (null)
      +0x1ec AppCompatInfo    : (null)
      +0x1f0 CSDVersion       : _UNICODE_STRING "Service Pack 2"
      +0x1f8 ActivationContextData : (null)
      +0x1fc ProcessAssemblyStorageMap : (null)
      +0x200 SystemDefaultActivationContextData : 0x00180000 _ACTIVATION_CONTEXT_DATA
      +0x204 SystemAssemblyStorageMap : (null)
      +0x208 MinimumStackCommit : 0
      +0x20c FlsCallback      : 0x002137b0  -> (null)
      +0x210 FlsListHead      : _LIST_ENTRY [ 0x2139c8 - 0x2139c8 ]
      +0x218 FlsBitmap        : 0x7d6a2040
      +0x21c FlsBitmapBits    : [4] 0x33
      +0x22c FlsHighIndex     : 5
   +0x034 LastErrorValue   : 0
   +0x038 CountOfOwnedCriticalSections : 0
   +0x03c CsrClientThread  : (null)
   +0x040 Win32ThreadInfo  : (null)
   +0x044 User32Reserved   : [26] 0
   +0x0ac UserReserved     : [5] 0
   +0x0c0 WOW32Reserved    : 0x78b81910
   +0x0c4 CurrentLocale    : 0x409
   +0x0c8 FpSoftwareStatusRegister : 0
   +0x0cc SystemReserved1  : [54] (null)
   +0x1a4 ExceptionCode    : 0
   +0x1a8 ActivationContextStackPointer : 0x00211ea0 _ACTIVATION_CONTEXT_STACK
      +0x000 ActiveFrame      : (null)
      +0x004 FrameListCache   : _LIST_ENTRY [ 0x211ea4 - 0x211ea4 ]
      +0x00c Flags            : 0
      +0x010 NextCookieSequenceNumber : 1
      +0x014 StackId          : 0x9444f8
   +0x1ac SpareBytes1      : [40]  ""
   +0x1d4 GdiTebBatch      : _GDI_TEB_BATCH
      +0x000 Offset           : 0
      +0x004 HDC              : 0
      +0x008 Buffer           : [310] 0
   +0x6b4 RealClientId     : _CLIENT_ID
      +0x000 UniqueProcess    : 0x00000e0c
      +0x004 UniqueThread     : 0x000013dc
   +0x6bc GdiCachedProcessHandle : (null)
   +0x6c0 GdiClientPID     : 0
   +0x6c4 GdiClientTID     : 0
   +0x6c8 GdiThreadLocalInfo : (null)
   +0x6cc Win32ClientInfo  : [62] 0
   +0x7c4 glDispatchTable  : [233] (null)
   +0xb68 glReserved1      : [29] 0
   +0xbdc glReserved2      : (null)
   +0xbe0 glSectionInfo    : (null)
   +0xbe4 glSection        : (null)
   +0xbe8 glTable          : (null)
   +0xbec glCurrentRC      : (null)
   +0xbf0 glContext        : (null)
   +0xbf4 LastStatusValue  : 0xc0000135
   +0xbf8 StaticUnicodeString : _UNICODE_STRING "mscoree.dll"
      +0x000 Length           : 0x16
      +0x002 MaximumLength    : 0x20a
      +0x004 Buffer           : 0x7efddc00  "mscoree.dll"
   +0xc00 StaticUnicodeBuffer : [261] 0x6d
   +0xe0c DeallocationStack : 0x00030000
   +0xe10 TlsSlots         : [64] (null)
   +0xf10 TlsLinks         : _LIST_ENTRY [ 0x0 - 0x0 ]
      +0x000 Flink            : (null)
      +0x004 Blink            : (null)
   +0xf18 Vdm              : (null)
   +0xf1c ReservedForNtRpc : (null)
   +0xf20 DbgSsReserved    : [2] (null)
   +0xf28 HardErrorMode    : 0
   +0xf2c Instrumentation  : [14] (null)
   +0xf64 SubProcessTag    : (null)
   +0xf68 EtwTraceData     : (null)
   +0xf6c WinSockData      : (null)
   +0xf70 GdiBatchCount    : 0x7efdb000
   +0xf74 InDbgPrint       : 0 ''
   +0xf75 FreeStackOnTermination : 0 ''
   +0xf76 HasFiberData     : 0 ''
   +0xf77 IdealProcessor   : 0x3 ''
   +0xf78 GuaranteedStackBytes : 0
   +0xf7c ReservedForPerf  : (null)
   +0xf80 ReservedForOle   : (null)
   +0xf84 WaitingOnLoaderLock : 0
   +0xf88 SparePointer1    : 0
   +0xf8c SoftPatchPtr1    : 0
   +0xf90 SoftPatchPtr2    : 0
   +0xf94 TlsExpansionSlots : (null)
   +0xf98 ImpersonationLocale : 0
   +0xf9c IsImpersonating  : 0
   +0xfa0 NlsCache         : (null)
   +0xfa4 pShimData        : (null)
   +0xfa8 HeapVirtualAffinity : 0
   +0xfac CurrentTransactionHandle : (null)
   +0xfb0 ActiveFrame      : (null)
   +0xfb4 FlsData          : 0x002139c8
   +0xfb8 SafeThunkCall    : 0 ''
   +0xfb9 BooleanSpare     : [3]  ""

Of course, if I had known in advance that StackBase and StackLimit were the second and the third double words, I could have just dumped the first 3 double words at the TEB address:

0:000> dd 7efdd000 l3
7efdd000  0012fec0 00130000 0011c000

- Dmitry Vostokov @ DumpAnalysis.org -

Citrixofication

July 5th, 2007

Following the invention of one of the greatest technological thinkers of our civilization, Thomas Edison, at the beginning of the 20th century, that prompted the electrification of our world, I would like to introduce and coin the word “Citrixofication”, the electrifying power of Citrix that transforms our lives in the 21st century - instant access to any software application whenever and wherever we want. I am very proud that I work for Citrix, the company that changes the way we work at the beginning of the 21st century, like electricity transformed our lives a century ago. The moment you remotely access your Windows application you use solutions and technology invented by Citrix. Every Windows computer in the world has the code developed by Citrix, ICA protocol code! To be honest I was thinking about “Citrification” but I found that it is already used in chemistry, so I tried “Citrixification” but this was already used in one French document. The only word left was “Citrixofication” or just “Citrixfication”, whatever you prefer.

- Dmitry Vostokov @ DumpAnalysis.org -

Crash Dump Analysis Patterns (Part 9b)

July 3rd, 2007

This is a follow up to the previous Deadlock pattern post. In this part I’m going to show an example of ERESOURCE deadlock in the Windows kernel.

ERESOURCE (executive resource) is a Windows synchronization object that has ownership semantics. 

An executive resource can be owned exclusively or can have a shared ownership. This is similar to the following file sharing analogy: when a file is opened for writing others can’t write or read it; if you have that file opened for reading others can read from it but can’t write to it.

ERESOURCE structures are linked into a list and have threads as owners, which allows us to quickly find deadlocks using the !locks command in kernel and complete memory dumps. Here are the definitions of _ERESOURCE from x86 and x64 Windows:

0: kd> dt -r1 _ERESOURCE
   +0x000 SystemResourcesList : _LIST_ENTRY
      +0x000 Flink            : Ptr32 _LIST_ENTRY
      +0x004 Blink            : Ptr32 _LIST_ENTRY
   +0x008 OwnerTable       : Ptr32 _OWNER_ENTRY
      +0x000 OwnerThread      : Uint4B
      +0x004 OwnerCount       : Int4B
      +0x004 TableSize        : Uint4B
   +0x00c ActiveCount      : Int2B
   +0x00e Flag             : Uint2B
   +0x010 SharedWaiters    : Ptr32 _KSEMAPHORE
      +0x000 Header           : _DISPATCHER_HEADER
      +0x010 Limit            : Int4B
   +0x014 ExclusiveWaiters : Ptr32 _KEVENT
      +0x000 Header           : _DISPATCHER_HEADER
   +0x018 OwnerThreads     : [2] _OWNER_ENTRY
      +0x000 OwnerThread      : Uint4B
      +0x004 OwnerCount       : Int4B
      +0x004 TableSize        : Uint4B
   +0x028 ContentionCount  : Uint4B
   +0x02c NumberOfSharedWaiters : Uint2B
   +0x02e NumberOfExclusiveWaiters : Uint2B
   +0x030 Address          : Ptr32 Void
   +0x030 CreatorBackTraceIndex : Uint4B
   +0x034 SpinLock         : Uint4B

0: kd> dt -r1  _ERESOURCE
nt!_ERESOURCE
   +0x000 SystemResourcesList : _LIST_ENTRY
      +0x000 Flink            : Ptr64 _LIST_ENTRY
      +0x008 Blink            : Ptr64 _LIST_ENTRY
   +0x010 OwnerTable       : Ptr64 _OWNER_ENTRY
      +0x000 OwnerThread      : Uint8B
      +0x008 OwnerCount       : Int4B
      +0x008 TableSize        : Uint4B
   +0x018 ActiveCount      : Int2B
   +0x01a Flag             : Uint2B
   +0x020 SharedWaiters    : Ptr64 _KSEMAPHORE
      +0x000 Header           : _DISPATCHER_HEADER
      +0x018 Limit            : Int4B
   +0x028 ExclusiveWaiters : Ptr64 _KEVENT
      +0x000 Header           : _DISPATCHER_HEADER
   +0x030 OwnerThreads     : [2] _OWNER_ENTRY
      +0x000 OwnerThread      : Uint8B
      +0x008 OwnerCount       : Int4B
      +0x008 TableSize        : Uint4B
   +0x050 ContentionCount  : Uint4B
   +0x054 NumberOfSharedWaiters : Uint2B
   +0x056 NumberOfExclusiveWaiters : Uint2B
   +0x058 Address          : Ptr64 Void
   +0x058 CreatorBackTraceIndex : Uint8B
   +0x060 SpinLock         : Uint8B

If we have a list of resources from !locks output we can start following threads that own these resources. Owner threads are marked with a star character (*):

0: kd> !locks
**** DUMP OF ALL RESOURCE OBJECTS ****
KD: Scanning for held locks......

Resource @ 0x8815b928    Exclusively owned
    Contention Count = 6234751
    NumberOfExclusiveWaiters = 53
     Threads: 89ab8db0-01<*>
     Threads Waiting On Exclusive Access:
        8810fa08       880f5b40       88831020       87e33020
        880353f0       88115020       88131678       880f5db0
        89295420       88255378       880f8b40       8940d020
        880f58d0       893ee500       880edac8       880f8db0
        89172938       879b3020       88091510       88038020
        880407b8       88051020       89511db0       8921f020
        880e9db0       87c33020       88064cc0       88044730
        8803f020       87a2a020       89529380       8802d330
        89a53020       89231b28       880285b8       88106b90
        8803cbc8       88aa3020       88093400       8809aab0
        880ea540       87d46948       88036020       8806e198
        8802d020       88038b40       8826b020       88231020
        890a2020       8807f5d0
     

We see that 53 threads are waiting for _KTHREAD 89ab8db0 to release _ERESOURCE 8815b928. Searching for this thread address reveals the following: 

Resource @ 0x88159560    Exclusively owned
    Contention Count = 166896
    NumberOfExclusiveWaiters = 1
     Threads: 8802a790-01<*>
     Threads Waiting On Exclusive Access:
              89ab8db0
  

We see that thread 89ab8db0 is waiting for 8802a790 to release resource 88159560. We continue searching for thread 8802a790 waiting for another thread, but we skip occurrences where this thread is an owner and is not waiting:

Resource @ 0x881f7b60    Exclusively owned
     Threads: 8802a790-01<*>

Resource @ 0x8824b418    Exclusively owned
    Contention Count = 34
     Threads: 8802a790-01<*>
 

Resource @ 0x8825e5a0    Exclusively owned
     Threads: 8802a790-01<*>

Resource @ 0x88172428    Exclusively owned
    Contention Count = 5
    NumberOfExclusiveWaiters = 1
     Threads: 8802a790-01<*>
     Threads Waiting On Exclusive Access:
              880f5020

Searching further we see that thread 8802a790 is waiting for thread 880f5020 to release resource 89bd7bf0:

Resource @ 0x89bd7bf0    Exclusively owned
    Contention Count = 1
    NumberOfExclusiveWaiters = 1
     Threads: 880f5020-01<*>
     Threads Waiting On Exclusive Access:
              8802a790

If we look carefully we see that we have already seen thread 880f5020 above, so I repeat the fragment:

Resource @ 0x88172428    Exclusively owned
    Contention Count = 5
    NumberOfExclusiveWaiters = 1
     Threads: 8802a790-01<*>
     Threads Waiting On Exclusive Access:
              880f5020

We see that thread 880f5020 is waiting for thread 8802a790 and thread 8802a790 is waiting for thread 880f5020.

Therefore we have identified a classical deadlock. What we have to do now is look at the stack traces of these threads to see the components involved.

- Dmitry Vostokov @ DumpAnalysis.org -

PDBFinder (public version 3.5)

July 1st, 2007

Version 3.5 uses the new binary database format and achieves the following results compared to the previous version 3.0.1:

  • 2 times smaller database size
  • 5 times faster database load time on startup!

It is fully backwards compatible with 3.0.1 and 2.x database formats and silently converts your old database to the new format on the first load.

Additionally, the new version fixes a bug in version 3.0.1 that sometimes manifested when removing and then adding folders before building a new database, which resulted in an incorrectly built database.

The next version 4.0 is currently under development and it will have the following features:

  • The ability to open multiple databases
  • The ability to exclude certain folders during build to avoid excessive search results output
  • Fully configurable OS and language search options (which are currently disabled for public version)

PDBFinder upgrade is available for download from Citrix support.

If you still use version 2.x there is some additional information about features in version 3.5:

http://www.dumpanalysis.org/blog/index.php/2007/05/04/pdbfinder-public-version-301/

- Dmitry Vostokov @ DumpAnalysis.org -

GDB for WinDbg Users (Part 5)

July 1st, 2007

Displaying a thread stack trace is the most used action in crash or core dump analysis and debugging. To show various available GDB commands I created the next version of the test program with the following source code:

#include <stdio.h>

void func_1(int param_1, char param_2, int *param_3, char *param_4);
void func_2(int param_1, char param_2, int *param_3, char *param_4);
void func_3(int param_1, char param_2, int *param_3, char *param_4);
void func_4();

int val_1;
char val_2;
int *pval_1 = &val_1;
char *pval_2 = &val_2;

int main()
{
  val_1 = 1;
  val_2 = '1';
  func_1(val_1, val_2, (int *)pval_1, (char *)pval_2);
  return 0;
}

void func_1(int param_1, char param_2, int *param_3, char *param_4)
{
  val_1 = 2;
  val_2 = '2';
  func_2(param_1, param_2, param_3, param_4);
}

void func_2(int param_1, char param_2, int *param_3, char *param_4)
{
  val_1 = 3;
  val_2 = '3';
  func_3(param_1, param_2, param_3, param_4);
}

void func_3(int param_1, char param_2, int *param_3, char *param_4)
{
  *pval_1 += param_1;
  *pval_2 += param_2;
  func_4();
}

void func_4()
{
  puts("Hello World!");
}

I compiled it with the gcc -g option to generate symbolic information, which GDB needs to display function arguments and local variables.

C:\MinGW\examples>..\bin\gcc -g -o test.exe test.c

If you have a crash in func_4 then you can examine the stack trace (backtrace) once you open a core dump. Because we don’t have a core dump of our test program we will simulate the stack trace by putting a breakpoint on func_4. In GDB this can be done with the break command:

C:\MinGW\examples>..\bin\gdb test.exe
...
...
...
(gdb) break func_4
Breakpoint 1 at 0x40141d
(gdb) run
Starting program: C:\MinGW\examples/test.exe
Breakpoint 1, 0x0040141d in func_4 ()
(gdb)

In WinDbg the breakpoint command is bp:

CommandLine: C:\dmitri\test\release\test.exe
Symbol search path is: SRV*c:\websymbols*http://msdl.microsoft.com/download/symbols
Executable search path is:
ModLoad: 00400000 0040f000   test.exe
ModLoad: 7d4c0000 7d5f0000   NOT_AN_IMAGE
ModLoad: 7d600000 7d6f0000   C:\W2K3\SysWOW64\ntdll32.dll
ModLoad: 7d4c0000 7d5f0000   C:\W2K3\syswow64\kernel32.dll
(103c.17d8): Break instruction exception - code 80000003 (first chance)
eax=7d600000 ebx=7efde000 ecx=00000005 edx=00000020 esi=7d6a01f4 edi=00221f38
eip=7d61002d esp=0012fb4c ebp=0012fcac iopl=0 nv up ei pl nz na po nc
cs=0023 ss=002b ds=002b es=002b fs=0053 gs=002b efl=00000202
ntdll32!DbgBreakPoint:
7d61002d cc              int     3

0:000> bp func_4

0:000> g
ModLoad: 71c20000 71c32000   C:\W2K3\SysWOW64\tsappcmp.dll
ModLoad: 77ba0000 77bfa000   C:\W2K3\syswow64\msvcrt.dll
ModLoad: 77f50000 77fec000   C:\W2K3\syswow64\ADVAPI32.dll
ModLoad: 7da20000 7db00000   C:\W2K3\syswow64\RPCRT4.dll
Breakpoint 0 hit
eax=0040c9d0 ebx=7d4d8dc9 ecx=0040c9d0 edx=00000064 esi=00000002 edi=00000ece
eip=00408be0 esp=0012ff24 ebp=0012ff28 iopl=0 nv up ei pl nz na po nc
cs=0023 ss=002b ds=002b es=002b fs=0053 gs=002b efl=00000202
test!func_4:
00408be0 55              push    ebp

I had to disable optimization in the project properties, otherwise the Visual C++ compiler optimizes away all function calls and produces the following short code:

0:000> uf main
00401000 push    offset test!`string' (004020f4)
00401005 mov     dword ptr [test!val_1 (0040337c)],4
0040100f mov     byte ptr [test!val_2 (00403378)],64h
00401016 call    dword ptr [test!_imp__puts (004020a0)]
0040101c add     esp,4
0040101f xor     eax,eax
00401021 ret

I will talk about setting breakpoints in another part; here I’m going to concentrate only on commands that examine the call stack. The backtrace (or bt) command shows the stack trace. backtrace <N> (or bt <N>) shows only the innermost N stack frames, and backtrace -<N> (or bt -<N>) shows only the outermost N stack frames. backtrace full (or bt full) additionally shows local variables, and it accepts the same frame counts: backtrace full <N> (or bt full <N>) and backtrace full -<N> (or bt full -<N>):

(gdb) backtrace
#0  func_4 () at test.c:48
#1  0x00401414 in func_3 (param_1=1, param_2=49 '1', param_3=0x404080,
    param_4=0x404070 "d") at test.c:43
#2  0x004013da in func_2 (param_1=1, param_2=49 '1', param_3=0x404080,
    param_4=0x404070 "d") at test.c:35
#3  0x0040139a in func_1 (param_1=1, param_2=49 '1', param_3=0x404080,
    param_4=0x404070 "d") at test.c:27
#4  0x00401355 in main () at test.c:18

(gdb) bt
#0  func_4 () at test.c:48
#1  0x00401414 in func_3 (param_1=1, param_2=49 '1', param_3=0x404080,
    param_4=0x404070 "d") at test.c:43
#2  0x004013da in func_2 (param_1=1, param_2=49 '1', param_3=0x404080,
    param_4=0x404070 "d") at test.c:35
#3  0x0040139a in func_1 (param_1=1, param_2=49 '1', param_3=0x404080,
    param_4=0x404070 "d") at test.c:27
#4  0x00401355 in main () at test.c:18

(gdb) bt 2
#0  func_4 () at test.c:48
#1  0x00401414 in func_3 (param_1=1, param_2=49 '1', param_3=0x404080,
    param_4=0x404070 "d") at test.c:43
(More stack frames follow...)

(gdb) bt -2
#3  0x0040139a in func_1 (param_1=1, param_2=49 '1', param_3=0x404080,
    param_4=0x404070 "d") at test.c:27
#4  0x00401355 in main () at test.c:18

(gdb) bt full
#0  func_4 () at test.c:48
No locals.
#1  0x00401414 in func_3 (param_1=1, param_2=49 '1', param_3=0x404080,
    param_4=0x404070 "d") at test.c:43
        param_2 = 49 '1'
#2  0x004013da in func_2 (param_1=1, param_2=49 '1', param_3=0x404080,
    param_4=0x404070 "d") at test.c:35
        param_2 = 49 '1'
#3  0x0040139a in func_1 (param_1=1, param_2=49 '1', param_3=0x404080,
    param_4=0x404070 "d") at test.c:27
        param_2 = 49 '1'
#4  0x00401355 in main () at test.c:18
No locals.

(gdb) bt full 2
#0  func_4 () at test.c:48
No locals.
#1  0x00401414 in func_3 (param_1=1, param_2=49 '1', param_3=0x404080,
    param_4=0x404070 "d") at test.c:43
        param_2 = 49 '1'
(More stack frames follow...)

(gdb) bt full -2
#3  0x0040139a in func_1 (param_1=1, param_2=49 '1', param_3=0x404080,
    param_4=0x404070 "d") at test.c:27
        param_2 = 49 '1'
#4  0x00401355 in main () at test.c:18
No locals.

(gdb)

In WinDbg there is only the k command, but it has many options, for example:

- default stack trace with source code lines

0:000> k
ChildEBP RetAddr
0012ff20 00408c30 test!func_4 [c:\dmitri\test\test\test.cpp @ 47]
0012ff28 00408c69 test!func_3+0x30 [c:\dmitri\test\test\test.cpp @ 44]
0012ff40 00408c99 test!func_2+0x29 [c:\dmitri\test\test\test.cpp @ 35]
0012ff58 00408cd3 test!func_1+0x29 [c:\dmitri\test\test\test.cpp @ 27]
0012ff70 00401368 test!main+0x33 [c:\dmitri\test\test\test.cpp @ 18]
0012ffc0 7d4e992a test!__tmainCRTStartup+0x15f [f:\sp\vctools\crt_bld\self_x86\crt\src\crt0.c @ 327]
0012fff0 00000000 kernel32!BaseProcessStart+0x28

- stack trace without source code lines

0:000> kL
ChildEBP RetAddr
0012ff20 00408c30 test!func_4
0012ff28 00408c69 test!func_3+0x30
0012ff40 00408c99 test!func_2+0x29
0012ff58 00408cd3 test!func_1+0x29
0012ff70 00401368 test!main+0x33
0012ffc0 7d4e992a test!__tmainCRTStartup+0x15f
0012fff0 00000000 kernel32!BaseProcessStart+0x28

- full stack trace without source code lines showing 3 stack arguments for every stack frame, calling convention and optimization information

0:000> kvL
ChildEBP RetAddr  Args to Child
0012ff20 00408c30 0012ff40 00408c69 00000001 test!func_4 (CONV: cdecl)
0012ff28 00408c69 00000001 00000031 0040c9d4 test!func_3+0x30 (CONV: cdecl)
0012ff40 00408c99 00000001 00000031 0040c9d4 test!func_2+0x29 (CONV: cdecl)
0012ff58 00408cd3 00000001 00000031 0040c9d4 test!func_1+0x29 (CONV: cdecl)
0012ff70 00401368 00000001 004230e0 00423120 test!main+0x33 (CONV: cdecl)
0012ffc0 7d4e992a 00000000 00000000 7efde000 test!__tmainCRTStartup+0x15f (FPO: [Non-Fpo]) (CONV: cdecl)
0012fff0 00000000 004013bf 00000000 00000000 kernel32!BaseProcessStart+0x28 (FPO: [Non-Fpo])

- stack trace without source code lines showing all function parameters

0:000> kPL
ChildEBP RetAddr
0012ff20 00408c30 test!func_4(void)
0012ff28 00408c69 test!func_3(
   int param_1 = 1,
   char param_2 = 49 '1',
   int * param_3 = 0x0040c9d4,
   char * param_4 = 0x0040c9d0 "d")+0x30
0012ff40 00408c99 test!func_2(
   int param_1 = 1,
   char param_2 = 49 '1',
   int * param_3 = 0x0040c9d4,
   char * param_4 = 0x0040c9d0 "d")+0x29
0012ff58 00408cd3 test!func_1(
   int param_1 = 1,
   char param_2 = 49 '1',
   int * param_3 = 0x0040c9d4,
   char * param_4 = 0x0040c9d0 "d")+0x29
0012ff70 00401368 test!main(void)+0x33
0012ffc0 7d4e992a test!__tmainCRTStartup(void)+0x15f
0012fff0 00000000 kernel32!BaseProcessStart+0x28

- stack trace without source code lines showing stack frame numbers

0:000> knL
 # ChildEBP RetAddr
00 0012ff20 00408c30 test!func_4
01 0012ff28 00408c69 test!func_3+0x30
02 0012ff40 00408c99 test!func_2+0x29
03 0012ff58 00408cd3 test!func_1+0x29
04 0012ff70 00401368 test!main+0x33
05 0012ffc0 7d4e992a test!__tmainCRTStartup+0x15f
06 0012fff0 00000000 kernel32!BaseProcessStart+0x28

- stack trace without source code lines showing the distance between stack frames in bytes

0:000> knfL
 #   Memory  ChildEBP RetAddr
00           0012ff20 00408c30 test!func_4
01         8 0012ff28 00408c69 test!func_3+0x30
02        18 0012ff40 00408c99 test!func_2+0x29
03        18 0012ff58 00408cd3 test!func_1+0x29
04        18 0012ff70 00401368 test!main+0x33
05        50 0012ffc0 7d4e992a test!__tmainCRTStartup+0x15f
06        30 0012fff0 00000000 kernel32!BaseProcessStart+0x28

- stack trace without source code lines showing the innermost 2 frames

0:000> kL 2
ChildEBP RetAddr
0012ff20 00408c30 test!func_4
0012ff28 00408c69 test!func_3+0x30

If you want to see stack traces from all threads in a process use the following command: 

(gdb) thread apply all bt

Thread 1 (thread 728.0xc0c):
#0  func_4 () at test.c:48
#1  0x00401414 in func_3 (param_1=1, param_2=49 '1', param_3=0x404080,
    param_4=0x404070 "d") at test.c:43
#2  0x004013da in func_2 (param_1=1, param_2=49 '1', param_3=0x404080,
    param_4=0x404070 "d") at test.c:35
#3  0x0040139a in func_1 (param_1=1, param_2=49 '1', param_3=0x404080,
    param_4=0x404070 "d") at test.c:27
#4  0x00401355 in main () at test.c:18
(gdb)

In WinDbg it is ~*k. Any parameter shown above can be used, for example:

0:000> ~*kL

.  0  Id: 103c.17d8 Suspend: 1 Teb: 7efdd000 Unfrozen
ChildEBP RetAddr
0012ff20 00408c30 test!func_4
0012ff28 00408c69 test!func_3+0x30
0012ff40 00408c99 test!func_2+0x29
0012ff58 00408cd3 test!func_1+0x29
0012ff70 00401368 test!main+0x33
0012ffc0 7d4e992a test!__tmainCRTStartup+0x15f
0012fff0 00000000 kernel32!BaseProcessStart+0x28

Therefore, our next version of the map contains these new commands:

Action                      | GDB                 | WinDbg
----------------------------------------------------------
Start the process           | run                 | g
Exit                        | (q)uit              | q
Disassemble (forward)       | (disas)semble       | uf, u
Disassemble N instructions  | x/<N>i              | -
Disassemble (backward)      | -                   | ub
Stack trace                 | backtrace (bt)      | k
Full stack trace            | bt full             | kv
Partial trace (innermost)   | bt <N>              | k <N>
Partial trace (outermost)   | bt -<N>             | -
Stack trace for all threads | thread apply all bt | ~*k
Breakpoint                  | break               | bp

- Dmitry Vostokov @ DumpAnalysis.org -

GDB for WinDbg Users (Part 4)

July 1st, 2007

If you are looking for debugging tutorials with a wider scope than just listing various debugger commands then the following books will be useful:

Both use GDB for debugging case studies and will be useful for engineers with any level of debugging experience.

- Dmitry Vostokov @ DumpAnalysis.org -