Truncated dump, stack trace collection, waiting thread time and wait chains: pattern cooperation

September 10th, 2009

It was reported that functionality of some applications was not available until ServiceA was killed and restarted. Complete memory dump was forced. For that memory dump WinDbg !stacks command shows every thread bugchecking the system, for example:

0: kd> !stacks
Proc.Thread  .Thread  Ticks   ThreadState Blocker
[...]
                            [8ca72020 ApplicationA.exe]
 4bc.0004c4  8bf76af0 0000cc4 Blocked    nt!KeBugCheckEx+0x1b
 4bc.0004d0  8bf53650 000055d Blocked    nt!KeBugCheckEx+0x1b
 4bc.0004e8  8bf33c88 0000094 Blocked    nt!KeBugCheckEx+0x1b
 4bc.001ffc  8ba86020 0000eed Blocked    nt!KeBugCheckEx+0x1b
 4bc.001e30  8b7a0ca8 00003ac Blocked    nt!KeBugCheckEx+0x1b
 4bc.001934  8b664020 0000eed Blocked    nt!KeBugCheckEx+0x1b
 4bc.001af0  8ca6d3b0 0000094 Blocked    nt!KeBugCheckEx+0x1b
 4bc.001cac  8bf41af0 0001412 Blocked    nt!KeBugCheckEx+0x1b
 4bc.00141c  8b6458d0 0000094 Blocked    nt!KeBugCheckEx+0x1b
[...]

Seeing this nonsense I checked that complete dump was truncated by half because page file was 4Gb but the amount of physical memory was 8Gb:

0: kd> !vm
*** Virtual Memory Usage ***
 Physical Memory:     2096553 (   8386212 Kb)
 Paging File Name paged out
   Current:   4195328 Kb  Free Space:   4182336 Kb
   Minimum:   4195328 Kb  Maximum:      4195328 Kb
[…]

However it was possible to get stack trace collection using !process 0 ff command with most data from _ETHREAD structures present but most of kernel and user space stack traces not present in the output. ServiceA has many threads waiting for LPC requests during last 5 minutes (the bugcheck was forced after 3 - 4 minutes the issue manifested itself):

THREAD 8cc9b590  Cid 053c.0668  Teb: 7ffaf000 Win32Thread: b84e5770 WAIT: (Unknown) UserMode Non-Alertable
    8cc9b77c  Semaphore Limit 0x1
Waiting for reply to LPC MessageId 0005bbdf:
Current LPC port e2cee338
IRP List:
    8b6548b8: (0006,0268) Flags: 00000000  Mdl: 00000000
Not impersonating
DeviceMap                 e1003860
Owning Process            8bf81d88       Image:         ServiceA.exe
Attached Process          N/A            Image:         N/A
Wait Start TickCount      193823         Ticks: 13752 (0:00:03:34.875)
Context Switch Count      389                 LargeStack
UserTime                  00:00:00.015
KernelTime                00:00:00.000
Win32 Start Address 0×7d1f5e70
Start Address 0×77e617ec
Stack Init b3a44000 Current b3a43c08 Base b3a44000 Limit b3a41000 Call 0
Priority 9 BasePriority 8 PriorityDecrement 0
Kernel stack not resident.

The similar waits were found for ApplicationB and ApplicationC processes. Therefore it was advised to limit the amount of physical memory to 4Gb and reproduce the issue or if the liveliness of the system needs to be preserved to manually dump the following processes on the next occasion:

ServiceA.exe
ApplicationB.exe
ApplicationC.exe

- Dmitry Vostokov @ DumpAnalysis.org -

DebugWare Patterns (Part 7)

September 10th, 2009

Trace Expert pattern came to my mind when I was writing about software trace patterns. It is a very lightweight expert system relying on trace collector and trace formatter (patterns to be written about soon). It is a module that takes a preformatted software trace message file or a buffer and a set of built in rules and uses simple search (peharps involving regular expressions) to dig out diagnostic information and provide troubleshooting and debugging directions.

This module is schematically depicted on the following UML component diagram:

- Dmitry Vostokov @ DumpAnalysis.org -

Electronic Version of Debugged! Magazine

September 8th, 2009

Responding to numerous requests and suggestions I plan to make magazine interior excluding promotional vouchers available for free download. If someone needs covers including back covers where I put tips and tables to be used as posters or certification vouchers printed inside then they should buy the magazine from Amazon or other bookshops.

This initiative will be accompanied by a smart marketing trick that I plan to unveil in a few days together with the magazine website.

- Dmitry Vostokov @ DumpAnalysis.org -

Metaphorical Bijectionism: A Method of Inquiry

September 7th, 2009

Consider this example mapping (taken metaphorically from the mathematical notion of an injection) of one domain of knowledge to another:

This mapping between concepts and ideas was once called “bijectivism” but was trivially described either as one to one mapping between two domains (like physical vs. mathematical) or fusing different concepts together to get another emerging concept. I myself proposed the similar mapping and called it a metaphorical bijection.  

Now consider another mapping metaphorically equivalent to a mathematical notion of a surjection where all constituents of the second domain are covered metaphorically by the first domain:

What we strive for is to establish the complete bijective mapping and reorganize our knowledge of both domains to achieve that:

In diagrams above small boxes can represent sets of ideas, methods, etc. or individual ideas, methods, etc. The established metaphorical bijection can divide sets or combine them if needed. There can be several such bijections, of course, and we can use other methods of inquiry (for example, the scientific method) to choose between competing metaphorical bijections.

Useful mnemonic:

BEIS (B=I+S or to BE IS …)

Bijectionism Equals Injection + Surjection

Another mnemonic:

BET (B=T or to BE Transformation…)

Bijectionism Equals Transformation 

Note also the second letter of Alef-Beis or Alef-Bet, the letter of Light that has interpretation of Creation in Biblical Hebrew.   

More on this later as I need to come back to DebugWare patterns.

- Dmitry Vostokov @ DumpAnalysis.org -

Manual parameter reconstruction on x64 Windows systems

September 4th, 2009

Although the first 2 parameters are passed via registers RCX and RDX they are saved on a stack as the part of a function prolog (as can be seen in many examples from my book x64 Windows Debugging: Practical Foundations):

0:000> uf arithmetic
FunctionParameters!arithmetic [c:\dumps\wdpf-x64\functionparameters\arithmetic.cpp @ 2]:
    2 00000001`40001020 mov     dword ptr [rsp+10h],edx
    2 00000001`40001024 mov     dword ptr [rsp+8],ecx

    2 00000001`40001028 push    rdi
    3 00000001`40001029 mov     eax,dword ptr [rsp+10h]
    3 00000001`4000102d mov     ecx,dword ptr [rsp+18h]
    3 00000001`40001031 add     ecx,eax
    3 00000001`40001033 mov     eax,ecx
    3 00000001`40001035 mov     dword ptr [rsp+18h],eax
    4 00000001`40001039 mov     eax,dword ptr [rsp+10h]
    4 00000001`4000103d add     eax,1
    4 00000001`40001040 mov     dword ptr [rsp+10h],eax
    5 00000001`40001044 mov     eax,dword ptr [rsp+10h]
    5 00000001`40001048 imul    eax,dword ptr [rsp+18h]
    5 00000001`4000104d mov     dword ptr [rsp+18h],eax
    7 00000001`40001051 mov     eax,dword ptr [rsp+18h]
    8 00000001`40001055 pop     rdi
    8 00000001`40001056 ret

Notice that RDI is saved too. This helps us later in a more complex case. If we put a breakpoint at arithmetic entry we see that WinDbg is not able to get parameters from RCX and RDX:

0:000> bp 00000001`40001020

0:000> g
ModLoad: 000007fe`ff4d0000 000007fe`ff5d8000   C:\Windows\system32\ADVAPI32.DLL
ModLoad: 000007fe`fef80000 000007fe`ff0c3000   C:\Windows\system32\RPCRT4.dll
Breakpoint 0 hit
FunctionParameters!arithmetic:
00000001`40001020 89542410        mov     dword ptr [rsp+10h],edx ss:00000000`0012fe88=cccccccc

0:000> kv
Child-SP          RetAddr           : Args to Child                                                           : Call Site
00000000`0012fe78 00000001`400010a5 : cccccccc`cccccccc cccccccc`cccccccc cccccccc`cccccccc cccccccc`cccccccc : FunctionParameters!arithmetic
00000000`0012fe80 00000001`4000137c : 00000000`00000001 00000000`00282960 00000000`00000000 00000000`00000000 : FunctionParameters!main+0×35
00000000`0012fec0 00000001`4000114e : 00000000`00000000 00000000`00000000 00000000`00000000 00000000`00000000 : FunctionParameters!__tmainCRTStartup+0×21c
00000000`0012ff30 00000000`7776be3d : 00000000`00000000 00000000`00000000 00000000`00000000 00000000`00000000 : FunctionParameters!mainCRTStartup+0xe
00000000`0012ff60 00000000`778a6a51 : 00000000`00000000 00000000`00000000 00000000`00000000 00000000`00000000 : kernel32!BaseThreadInitThunk+0xd
00000000`0012ff90 00000000`00000000 : 00000000`00000000 00000000`00000000 00000000`00000000 00000000`00000000 : ntdll!RtlUserThreadStart+0×1d

This seems correct approach in general because at the time of any other breakpoint in the middle of the code parameter passing registers could be already overwritten, for example, RCX at 0000000140001031. However, as soon as we execute the first two MOV instruction one by one, parameters appear on kv output one by one too:

0:000> t
ModLoad: 000007fe`fd810000 000007fe`fd845000   C:\Windows\system32\apphelp.dll
FunctionParameters!arithmetic+0x4:
00000001`40001024 894c2408        mov     dword ptr [rsp+8],ecx ss:00000000`0012fe80=cccccccc

0:000> kv
Child-SP          RetAddr           : Args to Child                                                           : Call Site
00000000`0012fe78 00000001`400010a5 : cccccccc`cccccccc cccccccc`00000001 cccccccc`cccccccc cccccccc`cccccccc : FunctionParameters!arithmetic+0×4
00000000`0012fe80 00000001`4000137c : 00000000`00000001 00000000`00282960 00000000`00000000 00000000`00000000 : FunctionParameters!main+0×35
00000000`0012fec0 00000001`4000114e : 00000000`00000000 00000000`00000000 00000000`00000000 00000000`00000000 : FunctionParameters!__tmainCRTStartup+0×21c
00000000`0012ff30 00000000`7776be3d : 00000000`00000000 00000000`00000000 00000000`00000000 00000000`00000000 : FunctionParameters!mainCRTStartup+0xe
00000000`0012ff60 00000000`778a6a51 : 00000000`00000000 00000000`00000000 00000000`00000000 00000000`00000000 : kernel32!BaseThreadInitThunk+0xd
00000000`0012ff90 00000000`00000000 : 00000000`00000000 00000000`00000000 00000000`00000000 00000000`00000000 : ntdll!RtlUserThreadStart+0×1d

0:000> t
FunctionParameters!arithmetic+0x8:
00000001`40001028 57              push    rdi

0:000> kv
Child-SP          RetAddr           : Args to Child                                                           : Call Site
00000000`0012fe78 00000001`400010a5 : cccccccc`00000001 cccccccc`00000001 cccccccc`cccccccc cccccccc`cccccccc : FunctionParameters!arithmetic+0×8
00000000`0012fe80 00000001`4000137c : 00000000`00000001 00000000`00282960 00000000`00000000 00000000`00000000 : FunctionParameters!main+0×35
00000000`0012fec0 00000001`4000114e : 00000000`00000000 00000000`00000000 00000000`00000000 00000000`00000000 : FunctionParameters!__tmainCRTStartup+0×21c
00000000`0012ff30 00000000`7776be3d : 00000000`00000000 00000000`00000000 00000000`00000000 00000000`00000000 : FunctionParameters!mainCRTStartup+0xe
00000000`0012ff60 00000000`778a6a51 : 00000000`00000000 00000000`00000000 00000000`00000000 00000000`00000000 : kernel32!BaseThreadInitThunk+0xd
00000000`0012ff90 00000000`00000000 : 00000000`00000000 00000000`00000000 00000000`00000000 00000000`00000000 : ntdll!RtlUserThreadStart+0×1d

Now we come to the more complex example:

1: kd> kv
  *** Stack trace for last set context - .thread/.cxr resets it
Child-SP          RetAddr           : Args to Child                                                           : Call Site
fffffa60`166a7020 fffffa60`07e9dbf2 : fffffa80`1dbd8820 00000000`00000000 fffffa80`1ec3b7a8 fffffa60`166a7278 : Driver!DeviceWrite+0xae
fffffa60`166a7050 fffffa60`062ae7cb : 00000000`00000000 fffffa80`1dbd8820 fffffa60`166a7340 fffffa80`1df4f520 : Driver!RawWrite+0×8a
[…]

1: kd> r
Last set context:
rax=000000000083a03b rbx=fffffa801ec3b800 rcx=fffffa8018cdc000
rdx=0000000000000004 rsi=fffffa801ec3b9f0 rdi=0000000005040000
rip=fffffa6007ea006e rsp=fffffa60166a7020 rbp=fffffa801ec3b7a8
 r8=fffff6fd400f61e0  r9=000000000083a03b r10=fffffa801ec3b9f8
r11=fffffa801ec3b9f8 r12=fffffa801e7c9000 r13=0000000000000000
r14=000000000038011b r15=fffffa8019891670
iopl=0         nv up ei ng nz na po cy
cs=0010 ss=0018 ds=002b es=002b fs=0053 gs=002b efl=00010287
Driver!DeviceWrite+0xae:
fffffa60`07ea006e mov     rax,qword ptr [rdi+10h] ds:002b:00000000`05040010=????????????????

We know that the first parameter to Write function is a pointer to some structure we want to explore because we see from the disassembly that some member from that structure was used ([rcx+320h]) and it was used as a pointer (assigned to RDI) that was trapping ([rdi+10h]):

1: kd> .asm no_code_bytes
Assembly options: no_code_bytes

1: kd> u Driver!DeviceWrite Driver!DeviceWrite+0xae+10
Driver!DeviceWrite:
fffffa60`07e9ffc0 mov     qword ptr [rsp+8],rbx
fffffa60`07e9ffc5 mov     qword ptr [rsp+10h],rbp
fffffa60`07e9ffca mov     qword ptr [rsp+18h],rsi
fffffa60`07e9ffcf push    rdi
fffffa60`07e9ffd0 sub     rsp,20h
fffffa60`07e9ffd4 mov     rdi,qword ptr [rcx+320h]
[…]
fffffa60`07ea006e mov     rax,qword ptr [rdi+10h] ; TRAP
[…]

Unfortunately RCX was not saved on the stack and fffffa80`1dbd8820 from kv was just the value of the saved RBX. This can be double-checked by verifying that parameter+320 doesn’t point to RDI value (05040000) at the time of the trap:

1: kd> dq fffffa80`1dbd8820+320 l1
fffffa80`1dbd8b40  00000000`00020000

Looking at DeviceWrite caller we see that RCX was initialized from RDI:

1: kd> ub fffffa60`07e9dbf2
Driver!RawWrite+0x66:
fffffa60`07e9dbce mov     rax,qword ptr [rdi+258h]
fffffa60`07e9dbd5 mov     qword ptr [rcx-18h],rax
fffffa60`07e9dbd9 mov     rax,qword ptr [rdi+260h]
fffffa60`07e9dbe0 mov     qword ptr [rcx-20h],rax
fffffa60`07e9dbe4 mov     dword ptr [rdx+10h],ebp
fffffa60`07e9dbe7 mov     rdx,rsi
fffffa60`07e9dbea mov     rcx,rdi
fffffa60`07e9dbed call    Driver!DeviceWrite (fffffa60`07e9ffc0)

We also see that RDI was saved at the function prolog so we can get our real first parameter from the raw stack bearing in mind that 0×20 was subtracted from RSP too:

1: kd> dq esp
fffffa60`166a7020  fffffa80`1ec3b800 00000000`05040000 ; SUB RSP, 20H
fffffa60`166a7030  fffffa60`07edc5dd fffffa60`07ee6f8f ;
fffffa60`166a7040  fffffa80`1ec3b520 fffffa60`07e9dbf2 ; Saved RDI - Return Address
fffffa60`166a7050  fffffa80`1dbd8820 00000000`00000000 ; Saved RBX and RBP
fffffa60`166a7060  fffffa80`1ec3b7a8 fffffa60`166a7278 ; Saved RSI 
fffffa60`166a7070  fffffa60`166a7250 fffffa60`01ab5180
fffffa60`166a7080  fffffa80`1e2937c8 fffff800`018a928a
fffffa60`166a7090  00000003`0004000d 00000026`0024000d

We see that saved RDI value +320 points to the right expected address:

1: kd> dq fffffa80`1ec3b520+320 l1
fffffa80`1ec3b840  00000000`05040000

Now we can investigate the structure but this is beyond the scope of this post. 

- Dmitry Vostokov @ DumpAnalysis.org -

Wild code and partial stack reconstruction

September 4th, 2009

I recently got a chance to see an instance of Wild Code pattern in kernel mode:

ATTEMPTED_EXECUTE_OF_NOEXECUTE_MEMORY (fc)
An attempt was made to execute non-executable memory. The guilty driver is on the stack trace (and is typically the current instruction pointer). When possible, the guilty driver's name (Unicode string) is printed on the bugcheck screen and saved in KiBugCheckDriver.
Arguments:
Arg1: a98ccfc4, Virtual address for the attempted execute.
Arg2: 17b1b963, PTE contents.
Arg3: a98ccf38, (reserved)
Arg4: 00000001, (reserved)

3: kd> .trap 0xffffffffa98ccf38
ErrCode = 00000011
eax=00000000 ebx=bf8c16eb ecx=bf855770 edx=00000026 esi=be4da0e8 edi=0000029e
eip=a98ccfc4 esp=a98ccfac ebp=a98ccfa0 iopl=0 nv up ei ng nz ac pe nc
cs=0008 ss=0010 ds=0023 es=0023 fs=0030 gs=0000 efl=00010296
a98ccfc4 dcda  fcomp3  st(2)

3: kd> !pte a98ccfc4
VA a98ccfc4
PDE at 00000000C0602A60    PTE at 00000000C054C660
contains 000000000B40A863  contains 8000000017B1B963
pfn b40a       ---DA--KWEV    pfn 17b1b      -G-DA--KW-V

3: kd> k
*** Stack trace for last set context - .thread/.cxr resets it
ChildEBP RetAddr
WARNING: Frame IP not in any known module. Following frames may be wrong.
a98ccff0 80889db4 0xa98ccfc4
a98ccff0 00000000 nt!KiCallbackReturn+0×84

We see that the execution address belongs to thread kernel stack range:

3: kd> !thread
THREAD 8959f548  Cid 20c4.0dd8  Teb: 7ffdf000 Win32Thread: bc1a9550 RUNNING on processor 3
Not impersonating
DeviceMap                 e183a628
Owning Process            890af4b0       Image:         ApplicationA.exe
Attached Process          N/A            Image:         N/A
Wait Start TickCount      67969974       Ticks: 0
Context Switch Count      2833569                 LargeStack
UserTime                  00:01:18.171
KernelTime                00:00:50.468
Win32 Start Address 0x00401c94
Start Address 0x7c8217f8
Stack Init a98cd260 Current a98ccc48 Base a98ce000 Limit a98c3000 Call a98cd268
Priority 13 BasePriority 8 PriorityDecrement 0
ChildEBP RetAddr  Args to Child
a98ccea8 8085eced 000000fc a98ccfc4 17b1b963 nt!KeBugCheckEx+0×1b
a98ccf20 8088c798 00000008 a98ccfc4 00000000 nt!MmAccessFault+0xb25
a98ccf20 a98ccfc4 00000008 a98ccfc4 00000000 nt!KiTrap0E+0xdc
WARNING: Frame IP not in any known module. Following frames may be wrong.
a98ccff0 80889db4 0013e3d4 0000000c 00000000 0xa98ccfc4
a98ccff0 00000000 0013e3d4 0000000c 00000000 nt!KiCallbackReturn+0×84

Something must have gone wrong after the return from KiCallbackReturn. On x86 systems this is an IDT entry (2b). See an example output I did while writing down notes on Windows Internals. Windows NT/2000 Native API Reference states that it is the same function as ZwCallBackReturn and it is used to return from win32k.sys user-mode and space callbacks, for example, to send a window message (p. 408). We find the following call on raw stack and use an extended version of k command to get partial stack trace before user-mode callback call:

3: kd> dds a98c3000 a98ce000
[...]
a98cd244 a98cd270
a98cd248 80833485 nt!KiSwapThread+0x305
a98cd24c 8959f548
a98cd250 8959f5f0
a98cd254 00000001
a98cd258 8959f548
a98cd25c 00000000
a98cd260 ffffffff
a98cd264 00000001
a98cd268 a98cd8a0
a98cd26c a98cd604
a98cd270 a98cd8a8
a98cd274 0013f2e4
a98cd278 7ffdf000
a98cd27c 0013e3ac
a98cd280 a98cd2d8
a98cd284 8091d6d1 nt!KeUserModeCallback+0×8f

a98cd288 a98cd388
a98cd28c a98cd384
a98cd290 a98cd370
a98cd294 bc1a9550
a98cd298 006fa0e8
a98cd29c 0013f2e4
a98cd2a0 7ffdf000
a98cd2a4 00000018
a98cd2a8 8948c7f8
[…]

3: kd> k L=a98cd280
ChildEBP RetAddr
WARNING: Frame IP not in any known module. Following frames may be wrong.
a98cd280 8091d6d1 0xa98ccfc4
a98cd2d8 bf858a9a nt!KeUserModeCallback+0×8f
a98cd3b4 bf8a244e win32k!SfnINOUTNCCALCSIZE+0×10b
a98cd3fc bf8a13fa win32k!xxxSendMessageToClient+0×176
a98cd448 bf8a130f win32k!xxxSendMessageTimeout+0×1a6
a98cd46c bf85cd68 win32k!xxxSendMessage+0×1b
a98cd530 bf85daca win32k!xxxCalcValidRects+0×3bc
a98cd58c bf85def4 win32k!xxxEndDeferWindowPosEx+0xf2
a98cd5a8 bf85061c win32k!xxxSetWindowPos+0xb1
a98cd5cc bf8e3df8 win32k!xxxRedrawFrame+0×16
a98cd5d8 bf8a2b69 win32k!xxxDrawMenuBar+0×19
a98cd5f4 8088978c win32k!NtUserCallHwndLock+0×6b
a98cd5f4 7c9485ec nt!KiFastCallEntry+0xfc

0013e438 00000000 0×7c9485ec

- Dmitry Vostokov @ DumpAnalysis.org -

Bugtation No.102

September 4th, 2009

I don’t read mere books. I analyze memory dumps. Books are memory dumps. Memory dumps are books.

Dmitry Vostokov, Variation on a theme “A book is a memory dump”

- Dmitry Vostokov @ DumpAnalysis.org -

SAD Events (Debugging Slang, Part 3)

September 3rd, 2009
SAD event

System or Application Dump event

- Dmitry Vostokov @ DumpAnalysis.org -

Stack Space and Program Database Types

September 2nd, 2009

Readers of my book Windows Debugging: Practical Foundations asked why so much space is allocated on a stack when they run the sample from Chapter 10, “Frame Pointer and Local Variables”. They expected the following instruction for two integer local variables:

sub esp, 8

What they see is:

sub esp, d8h

Originally I thought that it was the effect of certain optimization or verification options for default Debug configuration in Visual C++. To check that I created a small console project with the following C/C++ code:

int _tmain(int argc, _TCHAR* argv[])
{
    int a = 1;

    return 0;
}

Default Debug configuration in Visual C++ 2008 Express Edition generates this code:

0:000:x86> uf wmain
StackAllocation!wmain:
    7 01211370 push    ebp
    7 01211371 mov     ebp,esp
    7 01211373 sub     esp,0CCh
    7 01211379 push    ebx
    7 0121137a push    esi
    7 0121137b push    edi
    7 0121137c lea     edi,[ebp-0CCh]
    7 01211382 mov     ecx,33h
    7 01211387 mov     eax,0CCCCCCCCh
    7 0121138c rep stos dword ptr es:[edi]
    8 0121138e mov     dword ptr [ebp-8],1
   10 01211395 xor     eax,eax
   11 01211397 pop     edi
   11 01211398 pop     esi
   11 01211399 pop     ebx
   11 0121139a mov     esp,ebp
   11 0121139c pop     ebp
   11 0121139d ret

Default Release configuration with disabled optimization and whole program optimization properties also generates the code with large stack space reservation:

0:000:x86> uf wmain
StackAllocation!wmain:
    7 00021000 push    ebp
    7 00021001 mov     ebp,esp
    7 00021003 sub     esp,44h
    7 00021006 push    ebx
    7 00021007 push    esi
    7 00021008 push    edi
    8 00021009 mov     dword ptr [ebp-4],1
   10 00021010 xor     eax,eax
   11 00021012 pop     edi
   11 00021013 pop     esi
   11 00021014 pop     ebx
   11 00021015 mov     esp,ebp
   11 00021017 pop     ebp
   11 00021018 ret

We still have 0×44 bytes allocated instead of 4. Playing with further options I found that choosing Program Database (/Zi) instead of default Program Database for Edit & Continue (/ZI) Debug Information Format general property reduces allocated stack space down to exact 4 bytes:

0:000:x86> uf wmain
StackAllocation!wmain:
    7 00c71000 push    ebp
    7 00c71001 mov     ebp,esp
    7 00c71003 push    ecx
    8 00c71004 mov     dword ptr [ebp-4],1
   10 00c7100b xor     eax,eax
   11 00c7100d mov     esp,ebp
   11 00c7100f pop     ebp
   11 00c71010 ret

We see here push ecx instead of sub esp, 4 but the result is equivalent in terms of space reservation of 4 bytes. Going back to Debug configuration and changing Debug Information Format we reduce space allocation too:

0:000:x86> uf wmain
StackAllocation!wmain:
    7 01361010 push    ebp
    7 01361011 mov     ebp,esp
    7 01361013 push    ecx
    7 01361014 mov     dword ptr [ebp-4],0CCCCCCCCh
    8 0136101b mov     dword ptr [ebp-4],1
   10 01361022 xor     eax,eax
   11 01361024 mov     esp,ebp
   11 01361026 pop     ebp
   11 01361027 ret

We also see redundant filling of 4 bytes with 0xCC pattern but this is the effect of Basic Runtime Checks in Code Generation properties. If we change them to Default we eliminate filling and the code becomes identical to Release configuration:

0:000:x86> uf wmain
StackAllocation!wmain:
    7 010e1010 push    ebp
    7 010e1011 mov     ebp,esp
    7 010e1013 push    ecx
    8 010e1014 mov     dword ptr [ebp-4],1
   10 010e101b xor     eax,eax
   11 010e101d mov     esp,ebp
   11 010e101f pop     ebp
   11 010e1020 ret

- Dmitry Vostokov @ DumpAnalysis.org -

Exception Processing Of Crash Hypothesis (EPOCH)

September 2nd, 2009

Our Universe is Unhandled Exception Processing saving one huge Memory Dump from a runaway HUC (Big Bang of Hyper-Universal Computation, or simply HUge Computation). The idea came to me some months ago but I decided to publish it after learning today about a “Mathematical Universe Hypothesis” (MUH) from Max Tegmark stating that our Universe is Mathematics.

EPOCH

Exception Processing Of Crash Hypothesis (or Memory Dump Universe Hypothesis)

Note: what a sad feeling I had while simultaneously listening to the Light track from Thin Red Line soundtrack while writing this post…

- Dmitry Vostokov @ DumpAnalysis.org -

Dao of Debugging

September 2nd, 2009

Dao De Jing bugging

Bang the Debugger, explains his extraordinary skill in debugging:

“The Thread is the Way, the flow I follow. When I started I could only use !analyze -v command. After 4 years I could see through memory bits without lenses of a debugger program because Dao is beyond computer understanding and more than technical skills.”

Based on the famous story from Ding the Butcher.

- Dmitry Vostokov @ DumpAnalysis.org -

Trace Analysis Patterns (Part 10)

September 1st, 2009

Recently I came upon two software traces from working and non-working software environments that were a perfect example of Bifurcation Point pattern (the name is borrowed from catastrophe theory):

Working (issue is absent):

#   PID  TID  Message
[…]
25  2768 3056 Trace Statement A 
26  3756 2600 Trace Statement B 
27  3756 2600 Trace Statement C 
[…]

149 3756  836 Trace Statement X (Query result: XXX)
150 3756  836 Trace Statement 150.1 
151 3756  836 Trace Statement 151.1
152 3756  836 Trace Statement 152.1 
153 3756  836 Trace Statement 153.1 
[…]

Non-working (issue is present):

#   PID  TID  Message
[…]
27  2768 3056 Trace Statement A 
28  3756 2176 Trace Statement B 
29  3756 2176 Trace Statement C 
[…]

151 3756 5940 Trace Statement Y (Query result: YYY)
152 3756 5940 Trace Statement 152.2 
153 3756 5940 Trace Statement 153.2 
154 3756 5940 Trace Statement 154.2 
155 3756 5940 Trace Statement 155.2 
[…]

First, we notice that in both traces PID are the same (2768 and 3756) and we can conclude that most likely both traces came from the same environment and session. Second, messages A, B, C and further are identical up to messages X and Y. The latter two messages differ greatly in their query results XXX and YYY. After that, message distribution differs greatly in both size and content. Despite the same tracing time, 15 seconds, statement current is 155 msg/s for working and 388 msg/s for non-working case.

Bifurcation points are easily observed when tracing noise ratio is small and, for example, in the case of Citrix terminal services environments, could be achieved by selecting appropriate tracing modules based on problem description or filtering irrelevant ones from full CDF traces.

- Dmitry Vostokov @ TraceAnalysis.org -

Music for Debugging: The Duet of Threads

September 1st, 2009

When listening for the first time to all Bach Cantatas from Complete Works, I found Aria Duetto “Wir danken, wir preisen” (”We thank you, we praise”) from BWV 134 (#4) very stimulating to think about multithreading while debugging or analyzing memory dumps and confirmed my feelings during second listening to all Cantatas.

Bach Edition: Complete Works (155 CD Box Set)

Buy from Amazon

Whom do we thank and praise then? Of course, the creator of the correct and maintainable multithreaded code.

- Dmitry Vostokov @ DumpAnalysis.org -

Bugtation No.101

August 31st, 2009

Most bugs are permanent.

Dmitry Vostokov, Variation on a theme “Most objects are temporary”

- Dmitry Vostokov @ DumpAnalysis.org -

10 Common Mistakes in Memory Analysis (Part 5)

August 31st, 2009

Sometimes not paying attention to all aspects of default analysis makes it difficult to consider an alternative troubleshooting hypothesis. Here is a sample of !analyze -v output showing massive patching (hooked functions pattern) by DriverA module:

KERNEL_MODE_EXCEPTION_NOT_HANDLED (8e)
This is a very common bugcheck.  Usually the exception address pinpoints the driver/function that caused the problem.  Always note this address as well as the link date of the driver/image that contains this address. Some common problems are exception code 0x80000003.  This means a hard coded breakpoint or assertion was hit, but this system was booted /NODEBUG.  This is not supposed to happen as developers should never have hardcoded breakpoints in retail code, but ... If this happens, make sure a debugger gets connected, and the system is booted /DEBUG.  This will let us see why this breakpoint is happening.
Arguments:
Arg1: c0000005, The exception code that was not handled
Arg2: 8092d47f, The address that the exception occurred at
Arg3: f5205b14, Trap Frame
Arg4: 00000000

[...]

CHKIMG_EXTENSION: !chkimg -lo 50 -d !nt
    80822a49-80822a4d  5 bytes - nt!NtYieldExecution
 [ 8b ff 55 8b ec:e9 14 3a 95 76 ]
    80823c11-80823c14  4 bytes - nt!KeFlushProcessTb+2c (+0x11c8)
 [ 69 76 82 80:88 ff ff ff ]
    80823c17-80823c1a  4 bytes - nt!KeFlushProcessTb+32 (+0x06)
 [ dd 40 01 00:b5 34 b3 76 ]
    8083771f-80837725  7 bytes - nt!KeAcquireQueuedSpinLockAtDpcLevel+1b (+0x13b08)
 [ f7 41 04 01 00 00 00:e9 c4 f9 b1 76 cc cc ]
    80840945-8084094a  6 bytes - nt!KxFlushEntireTb+9 (+0x9226)
 [ ff 15 1c 10 80 80:e9 65 67 b1 76 cc ]
    80845fe0-80845fe3  4 bytes - nt!KeFlushSingleTb+49 (+0x569b)
 [ 14 1d ff ff:dd 10 b1 76 ]
    80845fe5 - nt!KeFlushSingleTb+4e (+0x05)
 [ b9:c3 ]
    8084722d-80847230  4 bytes - nt!KeFlushMultipleTb+45 (+0x1248)
 [ 5e e3 82 80:14 00 00 00 ]
    80847233-80847236  4 bytes - nt!KeFlushMultipleTb+4b (+0x06)
 [ c1 0a ff ff:99 fe b0 76 ]
    808c039c-808c039e  3 bytes - nt!NtSetContextThread
 [ 8b ff 55:e9 31 5f ]
    808c03a0 - nt!NtSetContextThread+4 (+0x04)
 [ ec:76 ]
    808e3184-808e3188  5 bytes - nt!NtCreateProcess (+0x22de4)
 [ 8b ff 55 8b ec:e9 0b 31 89 76 ]
    808f6ad0-808f6ad6  7 bytes - nt!NtLoadKeyEx (+0x1394c)
 [ 6a 70 68 98 4b 81 80:e9 e7 f8 87 76 90 90 ]
    8090c66f-8090c675  7 bytes - nt!NtDeleteValueKey (+0x15b9f)
 [ 6a 44 68 60 f0 81 80:e9 c4 9c 86 76 90 90 ]
    8090e36c-8090e370  5 bytes - nt!NtTerminateProcess (+0x1cfd)
 [ 8b ff 55 8b ec:e9 34 81 86 76 ]
    80915342-80915346  5 bytes - nt!NtDeleteKey (+0x6fd6)
 [ 8b ff 55 8b ec:e9 c7 0f 86 76 ]
    80918114-80918118  5 bytes - nt!NtOpenThread (+0x2dd2)
 [ 68 c4 00 00 00:e9 53 e1 85 76 ]
    80921eac-80921eb2  7 bytes - nt!NtEnumerateKey (+0x9d98)
 [ 6a 48 68 f0 f9 82 80:e9 f5 44 85 76 90 90 ]
    80922578-8092257e  7 bytes - nt!NtEnumerateValueKey (+0x6cc)
 [ 6a 48 68 10 fc 82 80:e9 13 3e 85 76 90 90 ]
    80922efd-80922f01  5 bytes - nt!NtNotifyChangeKey (+0x985)
 [ 8b ff 55 8b ec:e9 e4 34 85 76 ]
    809246fb-809246ff  5 bytes - nt!NtOpenProcess (+0x17fe)
 [ 68 c8 00 00 00:e9 58 1b 85 76 ]
    8092c8a0-8092c8a4  5 bytes - nt!NtCreateKey (+0x81a5)
 [ 68 c0 00 00 00:e9 55 9a 84 76 ]
    8092f3a6-8092f3ac  7 bytes - nt!NtSetValueKey (+0x2b06)
 [ 6a 58 68 a0 f6 82 80:e9 a3 6f 84 76 90 90 ]
    8092fa88-8092fa8c  5 bytes - nt!NtCreateFile (+0x6e2)
 [ 8b ff 55 8b ec:e9 ab 69 84 76 ]
    80931311-80931315  5 bytes - nt!NtOpenKey (+0x1889)
 [ 68 ac 00 00 00:e9 d0 4f 84 76 ]
    809316ed-809316f3  7 bytes - nt!NtQueryValueKey (+0x3dc)
 [ 6a 60 68 80 90 84 80:e9 72 4c 84 76 90 90 ]
    8093470f-80934715  7 bytes - nt!NtQueryKey (+0x3022)
 [ 6a 58 68 c8 97 84 80:e9 0e 1d 84 76 90 90 ]
    809354fa-80935500  7 bytes - nt!NtMapViewOfSection (+0xdeb)
 [ 6a 38 68 80 a2 84 80:e9 77 0f 84 76 90 90 ]
    80935785-80935789  5 bytes - nt!NtUnmapViewOfSection (+0x28b)
 [ 8b ff 55 8b ec:e9 02 0d 84 76 ]
    8093ba96-8093ba9c  7 bytes - nt!NtProtectVirtualMemory (+0x6311)
 [ 6a 44 68 40 03 85 80:e9 b1 a9 83 76 90 90 ]
    8093c86d-8093c871  5 bytes - nt!NtSetInformationProcess (+0xdd7)
 [ 68 08 01 00 00:e9 4c 9a 83 76 ]
    8093ce6b-8093ce71  7 bytes - nt!NtCreateProcessEx (+0x5fe)
 [ 6a 0c 68 58 0e 85 80:e9 38 94 83 76 90 90 ]
    80978fef-80978ff5  7 bytes - nt!NtQueryMultipleValueKey (+0x3c184)
 [ 6a 48 68 f0 f9 86 80:e9 86 d3 7f 76 90 90 ]
    80979775-8097977b  7 bytes - nt!NtRenameKey (+0x786)
 [ 6a 3c 68 38 fa 86 80:e9 a8 cb 7f 76 90 90 ]
    80979caf-80979cb3  5 bytes - nt!NtRestoreKey (+0x53a)
 [ 8b ff 55 8b ec:e9 46 c7 7f 76 ]
    8097a11c-8097a120  5 bytes - nt!NtUnloadKey (+0x46d)
 [ 8b ff 55 8b ec:e9 b1 c2 7f 76 ]
    8097a139-8097a13d  5 bytes - nt!NtReplaceKey (+0x1d)
 [ 8b ff 55 8b ec:e9 d0 c2 7f 76 ]
197 errors : !nt (80822a49-8097a13d)

MODULE_NAME: DriverA

IMAGE_NAME:  DriverA.sys

MEMORY_CORRUPTOR:  PATCH_DriverA

FAILURE_BUCKET_ID:  MEMORY_CORRUPTION_PATCH_DriverA

BUCKET_ID:  MEMORY_CORRUPTION_PATCH_DriverA

However, when we look at the stack trace, we would see that BSOD happened when accessing registry while updating drivers:

FAULTING_IP:
nt!HvpGetCellMapped+97
8092d47f 8b4604          mov     eax,dword ptr [esi+4]

TRAP_FRAME:  f5205b14 -- (.trap 0xfffffffff5205b14)
ErrCode = 00000000
eax=e1021000 ebx=e101a3b8 ecx=00000003 edx=89214988 esi=00000100 edi=00000000
eip=8092d47f esp=f5205b88 ebp=f5205bfc iopl=0         nv up ei pl nz na pe nc
cs=0008  ss=0010  ds=0023  es=0023  fs=0030  gs=0000             efl=00010206
nt!HvpGetCellMapped+0×97:
8092d47f 8b4604          mov     eax,dword ptr [esi+4] ds:0023:00000104=????????
Resetting default scope

PROCESS_NAME:  updatedrivers.exe

STACK_TEXT: 
f52056e0 8085bb9f 0000008e c0000005 8092d47f nt!KeBugCheckEx+0x1b
f5205aa4 808346b4 f5205ac0 00000000 f5205b14 nt!KiDispatchException+0x3a2
f5205b0c 80834668 f5205bfc 8092d47f badb0d00 nt!CommonDispatchException+0x4a
f5205b98 8092d559 e101a3b8 e63a8e40 0010fc18 nt!Kei386EoiHelper+0x186
f5205bfc 80920fcd e101a3b8 00610052 3b9aca07 nt!HvpGetCellMapped+0×36a
f5205c20 8092248b e63a8e40 e22b4794 00000000 nt!CmpGetValueKeyFromCache+0xa4
f5205cc0 80922649 e63a8e40 00000000 00000001 nt!CmEnumerateValueKey+0×45a
f5205d44 80833bdf 00000058 00000000 00000001 nt!NtEnumerateValueKey+0×1c9
f5205d44 7c9485ec 00000058 00000000 00000001 nt!KiFastCallEntry+0xfc
WARNING: Frame IP not in any known module. Following frames may be wrong.
001290fc 00000000 00000000 00000000 00000000 0×7c9485ec

So an alternative hypothesis to pursue would be some sort of registry corruption after driver updates.

- Dmitry Vostokov @ DumpAnalysis.org -

x64 book becomes a debugging bestseller

August 19th, 2009

Shortly after being published, x64 Windows Debugging: Practical Foundations book rises to the top of Amazon debugging bestesellers list (on 22:30 19.08.09):

- Dmitry Vostokov @ DumpAnalysis.org -

Free Stack Traces

August 18th, 2009

By analogy with the free verse and the anthropologist John Tedlock’s written narratives of Native American Zuni where different font size was used for different levels I tried today the similar technique with a raw stack data from the previous case study of registry corruption:

[...]
    f690a3dc f7a21a06 BOOTVID!ReadWriteMode+0×42
    f690a3e0 f7a219a7 BOOTVID!__outpw+0×17
    f690a3ec f7a21a76 BOOTVID!SetPixel+0×6a
    f690a404 f7a21c1b BOOTVID!DisplayCharacter+0×47
    f690a420 b42e14db dump_iaStor+0×3a4db
    f690a468 b4364080 dump_iaStor+0xbd080
    f690a480 f6249983 ati2mtag+0×1b6983
    f690a488 804f2ee6 nt!IopWritePageToDisk+0xe4
    f690a4e0 804f2fb6 nt!IopWriteSummaryDump+0×7e
    f690a4e4 b42e12d8 dump_iaStor+0×3a2d8
    f690a50c 804f3c8d nt!IoWriteCrashDump+0×42d
    f690a514 b42e12d8 dump_iaStor+0×3a2d8
    f690a584 804f8fa7 nt!KiDumpParameterImages+0×5f
    f690a594 f74764bb sptd+0×664bb
    f690a598 f74764a0 sptd+0×664a0
    f690a59c b42e162a dump_iaStor+0×3a62a
    f690a5a8 f7a22394 BOOTVID!PreserveRow+0×7c
    f690a5c0 b42e12d8 dump_iaStor+0×3a2d8
    f690a5cc 804f9ecd nt!KeBugCheck2+0xa4d
    f690a6e0 804f9f43 nt!KeBugCheckEx+0×1b
    f690a950 80545d00 nt!KiSwapProcess+0×60
    f690a9a0 80522d45 nt!MiDecrementReferenceCount+0×65
    f690a9ac 805067ea nt!MiDeferredUnlockPages+0×1c8

f690a9c8 804f9f43 nt!KeBugCheckEx+0×1b
f690a9e8 80548c2d nt!MiFreePoolPages+0×8b
    f690aa04 80564d20 nt!NonPagedPoolDescriptor
f690aa28 8054b49a nt!ExFreePoolWithTag+0×1ba
    f690aa3c 8062bc17 nt!CmpPinCmView+0xab
    f690aa5c 80637e13 nt!HvpDelistBinFreeCells+0xad

f690aa68 8063bf19 nt!CmpFree+0×17
f690aa78 8063eb20 nt!HvpRecoverData+0×3ec
f690aad4 8063ef05 nt!HvMapHive+0×133
    f690ab10 80539ac0 nt!_except_handler3
    f690ab14 804e0e38 nt!`string’+0×258

f690ab20 8063087e nt!HvInitializeHive+0×416
f690ab38 806383a9 nt!CmpInitializeHive+0×26d
    f690ab54 8063bf02 nt!CmpFree
    f690ab58 8063b918 nt!CmpFileSetSize
    f690ab5c 8063c466 nt!CmpFileWrite
    f690ab60 8063c33e nt!CmpFileRead
    f690ab64 8063c1fc nt!CmpFileFlush

f690aba4 80625bf9 nt!CmpInitHiveFromFile+0xa3
f690abfc 8062ad8b nt!CmpCmdHiveOpen+0×21
f690ac24 80631f24 nt!CmLoadKey+0×90
    f690ac98 80622053 nt!CmConvertHandleToKernelHandle+0×55
f690acb0 806257b4 nt!NtLoadKey2+0×1fc
    f690acc8 806259ac nt!NtLoadKey
    f690acd8 805bc33f nt!ObpCloseHandleTableEntry+0×14d
    f690ad24 805bc401 nt!ObpCloseHandle+0xab
    f690ad34 80539ac0 nt!_except_handler3
    f690ad38 804e0bd0 nt!`string’+0×364

f690ad44 806259be nt!NtLoadKey+0×12
f690ad58 8054162c nt!KiFastCallEntry+0xfc
    f690ade0 805460ee nt!KiThreadStartup+0×16
    f690ade4 80626dee nt!CmpLoadHiveThread
    f690aec0 bf875fb4 win32k!WatchdogDrvStretchBlt+0×92
    f690aee4 bf988527 win32k!_except_handler3
    f690aee8 bf995f40 win32k!`string’+0×124
    f690aef0 bf875fb4 win32k!WatchdogDrvStretchBlt+0×92
    f690aef4 bf873ec2 win32k!EngStretchBltROP+0×3a9

where the larger font size indicates the stack trace from kv command and the smaller font size indicates symbolic information found between call frames that may or may not correspond to partial stack traces left from intermediate nested function calls of the current call sequence or past stack traces and their frames.

- Dmitry Vostokov @ DumpAnalysis.org -

Experiments on Poor Bugs

August 18th, 2009

This is the real photo of tracing experiments used for the front cover of September 2009 issue of Debugged! magazine:

- Dmitry Vostokov @ DumpAnalysis.org -

September Issue of Debugged! MZ/PE

August 18th, 2009

Finally, the front cover has materialized for this issue of Debugged! magazine:

Debugged! MZ/PE: Software Tracing (ISBN: 978-1906717797)

If you have an article idea or if you’d like to write an article for us please use the following contact form:

http://www.dumpanalysis.org/contact

The deadline for this issue is 25th of September.

- Dmitry Vostokov @ DumpAnalysis.org -

Thinking about New Directions

August 17th, 2009

This Sunday while drinking Turkish coffee (after 3 years of blogging) I was thinking about the new publishing year starting in September, the new year of reading (already started) and, in particular, about the cover of the forthcoming September issue of Debugged! MZ/PE magazine (to be revealed tonight).

- Dmitry Vostokov @ DumpAnalysis.org -