Archive for July 15th, 2007

Crash Dump Analysis Patterns (Part 13b)

Sunday, July 15th, 2007

Sometimes handle leaks also result in insufficient memory especially if handles point to structures allocated by OS. Here is the typical example of the handle leak resulted in freezing several servers. The complete memory dump shows exhausted non-paged pool:

0: kd> !vm

*** Virtual Memory Usage ***
 Physical Memory:     1048352 (   4193408 Kb)
 Page File: \??\C:\pagefile.sys
   Current:   4190208 Kb  Free Space:   3749732 Kb
   Minimum:   4190208 Kb  Maximum:      4190208 Kb
 Available Pages:      697734 (   2790936 Kb)
 ResAvail Pages:       958085 (   3832340 Kb)
 Locked IO Pages:          95 (       380 Kb)
 Free System PTEs:     199971 (    799884 Kb)
 Free NP PTEs:            105 (       420 Kb)
 Free Special NP:           0 (         0 Kb)
 Modified Pages:          195 (       780 Kb)
 Modified PF Pages:       195 (       780 Kb)
 NonPagedPool Usage:    65244 (    260976 Kb)
 NonPagedPool Max:      65503 (    262012 Kb)
 ********** Excessive NonPaged Pool Usage *****

 PagedPool 0 Usage:      6576 (     26304 Kb)
 PagedPool 1 Usage:       629 (      2516 Kb)
 PagedPool 2 Usage:       624 (      2496 Kb)
 PagedPool 3 Usage:       608 (      2432 Kb)
 PagedPool 4 Usage:       625 (      2500 Kb)
 PagedPool Usage:        9062 (     36248 Kb)
 PagedPool Maximum:     66560 (    266240 Kb)

********** 184 pool allocations have failed **********

 Shared Commit:          7711 (     30844 Kb)
 Special Pool:              0 (         0 Kb)
 Shared Process:        10625 (     42500 Kb)
 PagedPool Commit:       9102 (     36408 Kb)
 Driver Commit:          1759 (      7036 Kb)
 Committed pages:      425816 (   1703264 Kb)
 Commit limit:        2052560 (   8210240 Kb)

Looking at non-paged pool consumption reveals excessive number of thread objects:

0: kd> !poolused 3
   Sorting by  NonPaged Pool Consumed

  Pool Used:
            NonPaged
 Tag    Allocs    Frees     Diff     Used
 Thre   772672   463590   309082 192867168  Thread objects , Binary: nt!ps
 MmCm       42        9       33 12153104   Calls made to MmAllocateContiguousMemory , Binary: nt!mm


The next logical step would be to list processes and find their handle usage. Indeed there is such a process:

0: kd> !process 0 0



PROCESS 88b75020  SessionId: 7  Cid: 172e4    Peb: 7ffdf000  ParentCid: 17238
    DirBase: c7fb6bc0  ObjectTable: e17f50a0  HandleCount: 143428.
    Image: iexplore.exe


Making the process current and listing its handles shows contiguously allocated handles to thread objects:

0: kd> .process 88b75020
Implicit process is now 88b75020
0: kd> .reload /user

0: kd> !handle



0d94: Object: 88a6b020  GrantedAccess: 001f03ff Entry: e35e1b28
Object: 88a6b020  Type: (8b780c68) Thread
    ObjectHeader: 88a6b008
        HandleCount: 1  PointerCount: 1

0d98: Object: 88a97320  GrantedAccess: 001f03ff Entry: e35e1b30
Object: 88a97320  Type: (8b780c68) Thread
    ObjectHeader: 88a97308
        HandleCount: 1  PointerCount: 1

0d9c: Object: 88b2b020  GrantedAccess: 001f03ff Entry: e35e1b38
Object: 88b2b020  Type: (8b780c68) Thread
    ObjectHeader: 88b2b008
        HandleCount: 1  PointerCount: 1

0da0: Object: 88b2a730  GrantedAccess: 001f03ff Entry: e35e1b40
Object: 88b2a730  Type: (8b780c68) Thread
    ObjectHeader: 88b2a718
        HandleCount: 1  PointerCount: 1

0da4: Object: 88b929a0  GrantedAccess: 001f03ff Entry: e35e1b48
Object: 88b929a0  Type: (8b780c68) Thread
    ObjectHeader: 88b92988
        HandleCount: 1  PointerCount: 1

0da8: Object: 88a57db0  GrantedAccess: 001f03ff Entry: e35e1b50
Object: 88a57db0  Type: (8b780c68) Thread
    ObjectHeader: 88a57d98
        HandleCount: 1  PointerCount: 1

0dac: Object: 88b92db0  GrantedAccess: 001f03ff Entry: e35e1b58
Object: 88b92db0  Type: (8b780c68) Thread
    ObjectHeader: 88b92d98
        HandleCount: 1  PointerCount: 1

0db0: Object: 88b4a730  GrantedAccess: 001f03ff Entry: e35e1b60
Object: 88b4a730  Type: (8b780c68) Thread
    ObjectHeader: 88b4a718
        HandleCount: 1  PointerCount: 1

0db4: Object: 88a7e730  GrantedAccess: 001f03ff Entry: e35e1b68
Object: 88a7e730  Type: (8b780c68) Thread
    ObjectHeader: 88a7e718
        HandleCount: 1  PointerCount: 1

0db8: Object: 88a349a0  GrantedAccess: 001f03ff Entry: e35e1b70
Object: 88a349a0  Type: (8b780c68) Thread
    ObjectHeader: 88a34988
        HandleCount: 1  PointerCount: 1

0dbc: Object: 88a554c0  GrantedAccess: 001f03ff Entry: e35e1b78
Object: 88a554c0  Type: (8b780c68) Thread
    ObjectHeader: 88a554a8
        HandleCount: 1  PointerCount: 1


Examination of these threads shows their stack traces and start address:

0: kd> !thread 88b4a730
THREAD 88b4a730  Cid 0004.1885c  Teb: 00000000 Win32Thread: 00000000 TERMINATED
Not impersonating
DeviceMap                 e1000930
Owning Process            8b7807a8       Image:         System
Wait Start TickCount      975361         Ticks: 980987 (0:04:15:27.921)
Context Switch Count      1
UserTime                  00:00:00.0000
KernelTime                00:00:00.0000
Start Address mydriver!StatusWaitThread (0xf5c5d128)
Stack Init 0 Current f3c4cc98 Base f3c4d000 Limit f3c4a000 Call 0
Priority 8 BasePriority 8 PriorityDecrement 0
ChildEBP RetAddr  Args to Child
f3c4ccac 8083129e ffdff5f0 8697ba00 a674c913 hal!KfLowerIrql+0×62
f3c4ccc8 00000000 808ae498 8697ba00 00000000 nt!KiExitDispatcher+0×130

0: kd> !thread 88a554c0
THREAD 88a554c0  Cid 0004.1888c  Teb: 00000000 Win32Thread: 00000000 TERMINATED
Not impersonating
DeviceMap                 e1000930
Owning Process            8b7807a8       Image:         System
Wait Start TickCount      975380         Ticks: 980968 (0:04:15:27.625)
Context Switch Count      1
UserTime                  00:00:00.0000
KernelTime                00:00:00.0000
Start Address mydriver!StatusWaitThread (0xf5c5d128)
Stack Init 0 Current f3c4cc98 Base f3c4d000 Limit f3c4a000 Call 0
Priority 8 BasePriority 8 PriorityDecrement 0
ChildEBP RetAddr  Args to Child
f3c4ccac 8083129e ffdff5f0 8697ba00 a674c913 hal!KfLowerIrql+0×62
f3c4ccc8 00000000 808ae498 8697ba00 00000000 nt!KiExitDispatcher+0×130

We can see that they have been terminated and their start address belongs to mydriver.sys. Therefore we can say that mydriver code has to be examined to find the source of the handle leak.

- Dmitry Vostokov @ DumpAnalysis.org -

Music for Debugging

Sunday, July 15th, 2007

Debugging and understanding multithreaded programs is hard and sometimes it requires running several execution paths mentally. Listening to composers who use multithreading in music can help here. My favourite is J.S. Bach and I recently purchased his complete works (155 CD box from Brilliant Classics). Virtuoso music helps me in live debugging too and here my favourites are Chopin and Liszt. I recently purchased 30 CD box of complete Chopin works (also from Brilliant Classics).

Many software engineers listen to music when writing code and I’m not the exception. However, I have found that not all music suitable for programming helps me during debugging sessions.

Music for relaxation, quiet classical or modern music helps me to think about program design and write solid code. Music with several melodies played simultaneously, heroic and virtuoso works help me to achieve breakthrough and find a bug. The latter kind of music also suits me for listening when doing crash dump analysis or problem troubleshooting.

In 1997 I read a wonderful book “Zen of Windows 95 Programming: Master the Art of Moving to Windows 95 and Creating High-Performance Windows Applications” written by Lou Grinzo and this book provides some music suggestions to listen to while doing programming.

Recently I’ve found one research project related to audio debugging, mapping source code to musical structures

http://www.wsu.edu/~stefika/ProgramAuralization.html

- Dmitry Vostokov @ DumpAnalysis.org -

Interrupts and exceptions explained (Part 4)

Sunday, July 15th, 2007

The previous part discussed processor interrupts in user mode. In this part I will explain WinDbg .trap command and show how to simulate it manually.

Upon an interrupt a processor saves the current instruction pointer and transfers execution to an interrupt handler as explained in the first part of these series. This interrupt handler has to save a full thread context before calling other functions to do complex interrupt processing. For example, if we disassemble KiTrap0E handler from x86 Windows 2003 crash dump we would see that it saves a lot of registers including segment registers:

3: kd> uf nt!KiTrap0E
...
...
...
nt!KiTrap0E:
e088bb2c mov     word ptr [esp+2],0
e088bb33 push    ebp
e088bb34 push    ebx
e088bb35 push    esi
e088bb36 push    edi
e088bb37 push    fs
e088bb39 mov     ebx,30h
e088bb3e mov     fs,bx
e088bb41 mov     ebx,dword ptr fs:[0]
e088bb48 push    ebx
e088bb49 sub     esp,4
e088bb4c push    eax
e088bb4d push    ecx
e088bb4e push    edx
e088bb4f push    ds
e088bb50 push    es
e088bb51 push    gs
e088bb53 mov     ax,23h
e088bb57 sub     esp,30h
e088bb5a mov     ds,ax
e088bb5d mov     es,ax
e088bb60 mov     ebp,esp
e088bb62 test    dword ptr [esp+70h],20000h
e088bb6a jne     nt!V86_kite_a (e088bb04)
...
...
...

The saved processor state information (context) forms the so called Windows kernel trap frame:

3: kd> dt _KTRAP_FRAME
+0x000 DbgEbp           : Uint4B
+0x004 DbgEip           : Uint4B
+0x008 DbgArgMark       : Uint4B
+0x00c DbgArgPointer    : Uint4B
+0x010 TempSegCs        : Uint4B
+0x014 TempEsp          : Uint4B
+0x018 Dr0              : Uint4B
+0x01c Dr1              : Uint4B
+0x020 Dr2              : Uint4B
+0x024 Dr3              : Uint4B
+0x028 Dr6              : Uint4B
+0x02c Dr7              : Uint4B
+0x030 SegGs            : Uint4B
+0x034 SegEs            : Uint4B
+0x038 SegDs            : Uint4B
+0x03c Edx              : Uint4B
+0x040 Ecx              : Uint4B
+0x044 Eax              : Uint4B
+0x048 PreviousPreviousMode : Uint4B
+0x04c ExceptionList    : Ptr32 _EXCEPTION_REGISTRATION_RECORD
+0x050 SegFs            : Uint4B
+0x054 Edi              : Uint4B
+0x058 Esi              : Uint4B
+0x05c Ebx              : Uint4B
+0x060 Ebp              : Uint4B
+0x064 ErrCode          : Uint4B
+0x068 Eip              : Uint4B
+0x06c SegCs            : Uint4B
+0x070 EFlags           : Uint4B
+0x074 HardwareEsp      : Uint4B
+0x078 HardwareSegSs    : Uint4B
+0x07c V86Es            : Uint4B
+0x080 V86Ds            : Uint4B
+0x084 V86Fs            : Uint4B
+0x088 V86Gs            : Uint4B

This Windows trap frame is not the same as an interrupt frame a processor saves on a current thread stack when an interrupt occurs in kernel mode. The latter frame is very small and consists only of EIP, CS, EFLAGS and ErrorCode. When an interrupt occurs in user mode an x86 processor additionally saves the current stack pointer SS:ESP.

The .trap command finds the trap frame on a current thread stack and sets the current thread register context using the values from that saved structure. You can see that command in action for certain bugchecks when you use !analyze -v:

3: kd> !analyze -v
KERNEL_MODE_EXCEPTION_NOT_HANDLED (8e)
...
...
...
Arguments:
Arg1: c0000005, The exception code that was not handled
Arg2: de65190c, The address that the exception occurred at
Arg3: f24f8a74, Trap Frame
Arg4: 00000000



TRAP_FRAME:  f24f8a74 — (.trap fffffffff24f8a74)
.trap fffffffff24f8a74
ErrCode = 00000000
eax=dbc128c0 ebx=dbe4a010 ecx=f24f8ac4 edx=00000001 esi=46525356 edi=00000000
eip=de65190c esp=f24f8ae8 ebp=f24f8b18 iopl=0 nv up ei pl nz na pe nc
cs=0008 ss=0010 ds=0023 es=0023 fs=0030 gs=0000 efl=00010206
driver!foo+0×16:
de65190c 837e1c00         cmp     dword ptr [esi+1Ch],0 ds:0023:46525372=????????


If we look at the trap frame we would see the same register values that WinDbg reports above:

3: kd> dt _KTRAP_FRAME f24f8a74
+0x000 DbgEbp           : 0xf24f8b18
+0x004 DbgEip           : 0xde65190c
+0x008 DbgArgMark       : 0xbadb0d00
+0x00c DbgArgPointer    : 1
+0x010 TempSegCs        : 0xb0501cd
+0x014 TempEsp          : 0xdcc01cd0
+0x018 Dr0              : 0xf24f8aa8
+0x01c Dr1              : 0xde46c90a
+0x020 Dr2              : 0
+0x024 Dr3              : 0
+0x028 Dr6              : 0xdbe4a000
+0x02c Dr7              : 0
+0x030 SegGs            : 0
+0x034 SegEs            : 0x23
+0x038 SegDs            : 0x23
+0x03c Edx              : 1
+0x040 Ecx              : 0xf24f8ac4
+0x044 Eax              : 0xdbc128c0
+0x048 PreviousPreviousMode : 0xdbe4a010
+0x04c ExceptionList    : 0xffffffff _EXCEPTION_REGISTRATION_RECORD
+0x050 SegFs            : 0x30
+0x054 Edi              : 0
+0x058 Esi              : 0x46525356
+0x05c Ebx              : 0xdbe4a010
+0x060 Ebp              : 0xf24f8b18
+0x064 ErrCode          : 0
+0x068 Eip              : 0xde65190c ; driver!foo+0x16
+0x06c SegCs            : 8
+0x070 EFlags           : 0x10206
+0x074 HardwareEsp      : 0xdbc171b0
+0x078 HardwareSegSs    : 0xde667677
+0x07c V86Es            : 0xdbc128c0
+0x080 V86Ds            : 0xdbc171c4
+0x084 V86Fs            : 0xf24f8bc4
+0x088 V86Gs            : 0

It is good to know how to find a trap frame manually in the case the stack is corrupt or WinDbg cannot find a trap frame automatically. In this case we can take the advantage of the fact that DS and ES segment registers have the same value in the Windows flat memory model:

   +0x034 SegEs            : 0x23
+0x038 SegDs            : 0x23

We need to find 2 consecutive 0×23 values on the stack. There may be several such places but usually the correct one comes between KiTrapXX address on the stack and the initial processor trap frame shown below in red. This is because KiTrapXX obviously calls other functions to further process an interrupt so its return address is saved on the stack.

3: kd> r
eax=f535713c ebx=de65190c ecx=00000000 edx=e088e1d2 esi=f5357120 edi=00000000
eip=e0827451 esp=f24f8628 ebp=f24f8640 iopl=0 nv up ei ng nz na pe nc
cs=0008 ss=0010 ds=0023 es=0023 fs=0030 gs=0000 efl=00000286
nt!KeBugCheckEx+0×1b:
e0827451 5d              pop     ebp

3: kd> dds f24f8628 f24f8628+1000
...
...
...
f24f8784  de4b2995 win32k!NtUserQueryWindow
f24f8788  00000000
f24f878c  fe76a324
f24f8790  f24f8d64
f24f8794  0006e43c
f24f8798  e087c041 nt!ExReleaseResourceAndLeaveCriticalRegion+0x5
f24f879c  83f3b801
f24f87a0  f24f8a58
f24f87a4  0000003b
f24f87a8  00000000
f24f87ac  00000030
f24f87b0  00000023
f24f87b4  00000023

f24f87b8  00000000



f24f8a58  00000111
f24f8a5c  f24f8a74
f24f8a60  e088bc08 nt!KiTrap0E+0xdc
f24f8a64  00000000
f24f8a68  46525372
f24f8a6c  00000000
f24f8a70  e0889686 nt!Kei386EoiHelper+0×186
f24f8a74  f24f8b18
f24f8a78  de65190c driver!foo+0×16
f24f8a7c  badb0d00
f24f8a80  00000001
f24f8a84  0b0501cd
f24f8a88  dcc01cd0
f24f8a8c  f24f8aa8
f24f8a90  de46c90a win32k!HANDLELOCK::vLockHandle+0×80
f24f8a94  00000000
f24f8a98  00000000
f24f8a9c  dbe4a000
f24f8aa0  00000000
f24f8aa4  00000000
f24f8aa8  00000023
f24f8aac  00000023

f24f8ab0  00000001
f24f8ab4  f24f8ac4
f24f8ab8  dbc128c0
f24f8abc  dbe4a010
f24f8ac0  ffffffff
f24f8ac4  00000030
f24f8ac8  00000000
f24f8acc  46525356
f24f8ad0  dbe4a010
f24f8ad4  f24f8b18
f24f8ad8  00000000
f24f8adc  de65190c driver!foo+0×16
f24f8ae0  00000008
f24f8ae4  00010206

f24f8ae8  dbc171b0
f24f8aec  de667677 driver!bar+0×173
f24f8af0  dbc128c0
f24f8af4  dbc171c4
f24f8af8  f24f8bc4
f24f8afc  00000000


Subtracting the offset 0×38 from the address of the 00000023 value (f24f8aac) and using dt command we can check _KTRAP_FRAME structure and apply .trap command afterwards:

3: kd> dt _KTRAP_FRAME f24f8aac-38
+0x000 DbgEbp           : 0xf24f8b18
+0x004 DbgEip           : 0xde65190c
+0x008 DbgArgMark       : 0xbadb0d00
+0x00c DbgArgPointer    : 1
+0x010 TempSegCs        : 0xb0501cd
+0x014 TempEsp          : 0xdcc01cd0
+0x018 Dr0              : 0xf24f8aa8
+0x01c Dr1              : 0xde46c90a
+0x020 Dr2              : 0
+0x024 Dr3              : 0
+0x028 Dr6              : 0xdbe4a000
+0x02c Dr7              : 0
+0x030 SegGs            : 0
+0x034 SegEs            : 0x23
+0x038 SegDs            : 0x23
+0x03c Edx              : 1
+0x040 Ecx              : 0xf24f8ac4
+0x044 Eax              : 0xdbc128c0
+0x048 PreviousPreviousMode : 0xdbe4a010
+0x04c ExceptionList    : 0xffffffff _EXCEPTION_REGISTRATION_RECORD
+0x050 SegFs            : 0x30
+0x054 Edi              : 0
+0x058 Esi              : 0x46525356
+0x05c Ebx              : 0xdbe4a010
+0x060 Ebp              : 0xf24f8b18
+0x064 ErrCode          : 0
+0x068 Eip              : 0xde65190c
+0x06c SegCs            : 8
+0x070 EFlags           : 0x10206
+0x074 HardwareEsp      : 0xdbc171b0
+0x078 HardwareSegSs    : 0xde667677
+0x07c V86Es            : 0xdbc128c0
+0x080 V86Ds            : 0xdbc171c4
+0x084 V86Fs            : 0xf24f8bc4
+0x088 V86Gs            : 0

3: kd> ? f24f8aac-38
Evaluate expression: -229668236 = f24f8a74

3: kd> .trap f24f8a74
ErrCode = 00000000
eax=dbc128c0 ebx=dbe4a010 ecx=f24f8ac4 edx=00000001 esi=46525356 edi=00000000
eip=de65190c esp=f24f8ae8 ebp=f24f8b18 iopl=0 nv up ei pl nz na pe nc
cs=0008 ss=0010 ds=0023 es=0023 fs=0030 gs=0000 efl=00010206
driver!foo+0x16:
de65190c 837e1c00        cmp     dword ptr [esi+1Ch],0 ds:0023:46525372=????????

In complete memory dumps we can see that _KTRAP_FRAME is saved when calling system services too:

3: kd> kL
ChildEBP RetAddr
f24f8ae8 de667677 driver!foo+0x16
f24f8b18 de667799 driver!bar+0x173
f24f8b90 de4a853e win32k!GreSaveScreenBits+0x69
f24f8bd8 de4922bd win32k!CreateSpb+0x167
f24f8c40 de490bb8 win32k!zzzChangeStates+0x448
f24f8c88 de4912de win32k!zzzBltValidBits+0xe2
f24f8ce0 de4926c6 win32k!xxxEndDeferWindowPosEx+0x13a
f24f8cfc de49aa8f win32k!xxxSetWindowPos+0xb1
f24f8d34 de4acf4d win32k!xxxShowWindow+0x201
f24f8d54 e0888c6c win32k!NtUserShowWindow+0x79
f24f8d54 7c94ed54 nt!KiFastCallEntry+0xfc (TrapFrame @ f24f8d64)
0006e48c 77e34f1d ntdll!KiFastSystemCallRet
0006e53c 77e2f12f USER32!NtUserShowWindow+0xc
0006e570 77e2b0fe USER32!InternalDialogBox+0xa9
0006e590 77e29005 USER32!DialogBoxIndirectParamAorW+0×37
0006e5b4 0103d569 USER32!DialogBoxParamW+0×3f
0006e5d8 0102d2f5 winlogon!Fusion_DialogBoxParam+0×24

and we can get the current thread context before its transition to kernel mode:

3: kd> .trap f24f8d64
ErrCode = 00000000
eax=7ffff000 ebx=00000000 ecx=00000000 edx=7c94ed54 esi=00532e68 edi=0002002c
eip=7c94ed54 esp=0006e490 ebp=0006e53c iopl=0 nv up ei pl zr na pe nc
cs=001b ss=0023 ds=0023 es=0023 fs=003b gs=0000 efl=00000246
ntdll!KiFastSystemCallRet:
001b:7c94ed54 c3              ret

3: kd> kL
ChildEBP RetAddr
0006e48c 77e34f1d ntdll!KiFastSystemCallRet
0006e53c 77e2f12f USER32!NtUserShowWindow+0xc
0006e570 77e2b0fe USER32!InternalDialogBox+0xa9
0006e590 77e29005 USER32!DialogBoxIndirectParamAorW+0x37
0006e5b4 0103d569 USER32!DialogBoxParamW+0x3f
0006e5d8 0102d2f5 winlogon!Fusion_DialogBoxParam+0x24

In the next part I’ll show an example from an x64 crash dump.

- Dmitry Vostokov @ DumpAnalysis.org -