Archive for September, 2009

Wild code and partial stack reconstruction

Friday, September 4th, 2009

I recently got a chance to see an instance of Wild Code pattern in kernel mode:

ATTEMPTED_EXECUTE_OF_NOEXECUTE_MEMORY (fc)
An attempt was made to execute non-executable memory. The guilty driver is on the stack trace (and is typically the current instruction pointer). When possible, the guilty driver's name (Unicode string) is printed on the bugcheck screen and saved in KiBugCheckDriver.
Arguments:
Arg1: a98ccfc4, Virtual address for the attempted execute.
Arg2: 17b1b963, PTE contents.
Arg3: a98ccf38, (reserved)
Arg4: 00000001, (reserved)

3: kd> .trap 0xffffffffa98ccf38
ErrCode = 00000011
eax=00000000 ebx=bf8c16eb ecx=bf855770 edx=00000026 esi=be4da0e8 edi=0000029e
eip=a98ccfc4 esp=a98ccfac ebp=a98ccfa0 iopl=0 nv up ei ng nz ac pe nc
cs=0008 ss=0010 ds=0023 es=0023 fs=0030 gs=0000 efl=00010296
a98ccfc4 dcda  fcomp3  st(2)

3: kd> !pte a98ccfc4
VA a98ccfc4
PDE at 00000000C0602A60    PTE at 00000000C054C660
contains 000000000B40A863  contains 8000000017B1B963
pfn b40a       ---DA--KWEV    pfn 17b1b      -G-DA--KW-V

3: kd> k
*** Stack trace for last set context - .thread/.cxr resets it
ChildEBP RetAddr
WARNING: Frame IP not in any known module. Following frames may be wrong.
a98ccff0 80889db4 0xa98ccfc4
a98ccff0 00000000 nt!KiCallbackReturn+0×84

We see that the execution address belongs to thread kernel stack range:

3: kd> !thread
THREAD 8959f548  Cid 20c4.0dd8  Teb: 7ffdf000 Win32Thread: bc1a9550 RUNNING on processor 3
Not impersonating
DeviceMap                 e183a628
Owning Process            890af4b0       Image:         ApplicationA.exe
Attached Process          N/A            Image:         N/A
Wait Start TickCount      67969974       Ticks: 0
Context Switch Count      2833569                 LargeStack
UserTime                  00:01:18.171
KernelTime                00:00:50.468
Win32 Start Address 0x00401c94
Start Address 0x7c8217f8
Stack Init a98cd260 Current a98ccc48 Base a98ce000 Limit a98c3000 Call a98cd268
Priority 13 BasePriority 8 PriorityDecrement 0
ChildEBP RetAddr  Args to Child
a98ccea8 8085eced 000000fc a98ccfc4 17b1b963 nt!KeBugCheckEx+0×1b
a98ccf20 8088c798 00000008 a98ccfc4 00000000 nt!MmAccessFault+0xb25
a98ccf20 a98ccfc4 00000008 a98ccfc4 00000000 nt!KiTrap0E+0xdc
WARNING: Frame IP not in any known module. Following frames may be wrong.
a98ccff0 80889db4 0013e3d4 0000000c 00000000 0xa98ccfc4
a98ccff0 00000000 0013e3d4 0000000c 00000000 nt!KiCallbackReturn+0×84

Something must have gone wrong after the return from KiCallbackReturn. On x86 systems this is an IDT entry (2b). See an example output I did while writing down notes on Windows Internals. Windows NT/2000 Native API Reference states that it is the same function as ZwCallBackReturn and it is used to return from win32k.sys user-mode and space callbacks, for example, to send a window message (p. 408). We find the following call on raw stack and use an extended version of k command to get partial stack trace before user-mode callback call:

3: kd> dds a98c3000 a98ce000
[...]
a98cd244 a98cd270
a98cd248 80833485 nt!KiSwapThread+0x305
a98cd24c 8959f548
a98cd250 8959f5f0
a98cd254 00000001
a98cd258 8959f548
a98cd25c 00000000
a98cd260 ffffffff
a98cd264 00000001
a98cd268 a98cd8a0
a98cd26c a98cd604
a98cd270 a98cd8a8
a98cd274 0013f2e4
a98cd278 7ffdf000
a98cd27c 0013e3ac
a98cd280 a98cd2d8
a98cd284 8091d6d1 nt!KeUserModeCallback+0×8f

a98cd288 a98cd388
a98cd28c a98cd384
a98cd290 a98cd370
a98cd294 bc1a9550
a98cd298 006fa0e8
a98cd29c 0013f2e4
a98cd2a0 7ffdf000
a98cd2a4 00000018
a98cd2a8 8948c7f8
[…]

3: kd> k L=a98cd280
ChildEBP RetAddr
WARNING: Frame IP not in any known module. Following frames may be wrong.
a98cd280 8091d6d1 0xa98ccfc4
a98cd2d8 bf858a9a nt!KeUserModeCallback+0×8f
a98cd3b4 bf8a244e win32k!SfnINOUTNCCALCSIZE+0×10b
a98cd3fc bf8a13fa win32k!xxxSendMessageToClient+0×176
a98cd448 bf8a130f win32k!xxxSendMessageTimeout+0×1a6
a98cd46c bf85cd68 win32k!xxxSendMessage+0×1b
a98cd530 bf85daca win32k!xxxCalcValidRects+0×3bc
a98cd58c bf85def4 win32k!xxxEndDeferWindowPosEx+0xf2
a98cd5a8 bf85061c win32k!xxxSetWindowPos+0xb1
a98cd5cc bf8e3df8 win32k!xxxRedrawFrame+0×16
a98cd5d8 bf8a2b69 win32k!xxxDrawMenuBar+0×19
a98cd5f4 8088978c win32k!NtUserCallHwndLock+0×6b
a98cd5f4 7c9485ec nt!KiFastCallEntry+0xfc

0013e438 00000000 0×7c9485ec

- Dmitry Vostokov @ DumpAnalysis.org -

Bugtation No.102

Friday, September 4th, 2009

I don’t read mere books. I analyze memory dumps. Books are memory dumps. Memory dumps are books.

Dmitry Vostokov, Variation on a theme “A book is a memory dump”

- Dmitry Vostokov @ DumpAnalysis.org -

SAD Events (Debugging Slang, Part 3)

Thursday, September 3rd, 2009
SAD event

System or Application Dump event

- Dmitry Vostokov @ DumpAnalysis.org -

Stack Space and Program Database Types

Wednesday, September 2nd, 2009

Readers of my book Windows Debugging: Practical Foundations asked why so much space is allocated on a stack when they run the sample from Chapter 10, “Frame Pointer and Local Variables”. They expected the following instruction for two integer local variables:

sub esp, 8

What they see is:

sub esp, d8h

Originally I thought that it was the effect of certain optimization or verification options for default Debug configuration in Visual C++. To check that I created a small console project with the following C/C++ code:

int _tmain(int argc, _TCHAR* argv[])
{
    int a = 1;

    return 0;
}

Default Debug configuration in Visual C++ 2008 Express Edition generates this code:

0:000:x86> uf wmain
StackAllocation!wmain:
    7 01211370 push    ebp
    7 01211371 mov     ebp,esp
    7 01211373 sub     esp,0CCh
    7 01211379 push    ebx
    7 0121137a push    esi
    7 0121137b push    edi
    7 0121137c lea     edi,[ebp-0CCh]
    7 01211382 mov     ecx,33h
    7 01211387 mov     eax,0CCCCCCCCh
    7 0121138c rep stos dword ptr es:[edi]
    8 0121138e mov     dword ptr [ebp-8],1
   10 01211395 xor     eax,eax
   11 01211397 pop     edi
   11 01211398 pop     esi
   11 01211399 pop     ebx
   11 0121139a mov     esp,ebp
   11 0121139c pop     ebp
   11 0121139d ret

Default Release configuration with disabled optimization and whole program optimization properties also generates the code with large stack space reservation:

0:000:x86> uf wmain
StackAllocation!wmain:
    7 00021000 push    ebp
    7 00021001 mov     ebp,esp
    7 00021003 sub     esp,44h
    7 00021006 push    ebx
    7 00021007 push    esi
    7 00021008 push    edi
    8 00021009 mov     dword ptr [ebp-4],1
   10 00021010 xor     eax,eax
   11 00021012 pop     edi
   11 00021013 pop     esi
   11 00021014 pop     ebx
   11 00021015 mov     esp,ebp
   11 00021017 pop     ebp
   11 00021018 ret

We still have 0×44 bytes allocated instead of 4. Playing with further options I found that choosing Program Database (/Zi) instead of default Program Database for Edit & Continue (/ZI) Debug Information Format general property reduces allocated stack space down to exact 4 bytes:

0:000:x86> uf wmain
StackAllocation!wmain:
    7 00c71000 push    ebp
    7 00c71001 mov     ebp,esp
    7 00c71003 push    ecx
    8 00c71004 mov     dword ptr [ebp-4],1
   10 00c7100b xor     eax,eax
   11 00c7100d mov     esp,ebp
   11 00c7100f pop     ebp
   11 00c71010 ret

We see here push ecx instead of sub esp, 4 but the result is equivalent in terms of space reservation of 4 bytes. Going back to Debug configuration and changing Debug Information Format we reduce space allocation too:

0:000:x86> uf wmain
StackAllocation!wmain:
    7 01361010 push    ebp
    7 01361011 mov     ebp,esp
    7 01361013 push    ecx
    7 01361014 mov     dword ptr [ebp-4],0CCCCCCCCh
    8 0136101b mov     dword ptr [ebp-4],1
   10 01361022 xor     eax,eax
   11 01361024 mov     esp,ebp
   11 01361026 pop     ebp
   11 01361027 ret

We also see redundant filling of 4 bytes with 0xCC pattern but this is the effect of Basic Runtime Checks in Code Generation properties. If we change them to Default we eliminate filling and the code becomes identical to Release configuration:

0:000:x86> uf wmain
StackAllocation!wmain:
    7 010e1010 push    ebp
    7 010e1011 mov     ebp,esp
    7 010e1013 push    ecx
    8 010e1014 mov     dword ptr [ebp-4],1
   10 010e101b xor     eax,eax
   11 010e101d mov     esp,ebp
   11 010e101f pop     ebp
   11 010e1020 ret

- Dmitry Vostokov @ DumpAnalysis.org -

Exception Processing Of Crash Hypothesis (EPOCH)

Wednesday, September 2nd, 2009

Our Universe is Unhandled Exception Processing saving one huge Memory Dump from a runaway HUC (Big Bang of Hyper-Universal Computation, or simply HUge Computation). The idea came to me some months ago but I decided to publish it after learning today about a “Mathematical Universe Hypothesis” (MUH) from Max Tegmark stating that our Universe is Mathematics.

EPOCH

Exception Processing Of Crash Hypothesis (or Memory Dump Universe Hypothesis)

Note: what a sad feeling I had while simultaneously listening to the Light track from Thin Red Line soundtrack while writing this post…

- Dmitry Vostokov @ DumpAnalysis.org -

Dao of Debugging

Wednesday, September 2nd, 2009

Dao De Jing bugging

Bang the Debugger, explains his extraordinary skill in debugging:

“The Thread is the Way, the flow I follow. When I started I could only use !analyze -v command. After 4 years I could see through memory bits without lenses of a debugger program because Dao is beyond computer understanding and more than technical skills.”

Based on the famous story from Ding the Butcher.

- Dmitry Vostokov @ DumpAnalysis.org -

Trace Analysis Patterns (Part 10)

Tuesday, September 1st, 2009

Recently I came upon two software traces from working and non-working software environments that were a perfect example of Bifurcation Point pattern (the name is borrowed from catastrophe theory):

Working (issue is absent):

#   PID  TID  Message
[…]
25  2768 3056 Trace Statement A 
26  3756 2600 Trace Statement B 
27  3756 2600 Trace Statement C 
[…]

149 3756  836 Trace Statement X (Query result: XXX)
150 3756  836 Trace Statement 150.1 
151 3756  836 Trace Statement 151.1
152 3756  836 Trace Statement 152.1 
153 3756  836 Trace Statement 153.1 
[…]

Non-working (issue is present):

#   PID  TID  Message
[…]
27  2768 3056 Trace Statement A 
28  3756 2176 Trace Statement B 
29  3756 2176 Trace Statement C 
[…]

151 3756 5940 Trace Statement Y (Query result: YYY)
152 3756 5940 Trace Statement 152.2 
153 3756 5940 Trace Statement 153.2 
154 3756 5940 Trace Statement 154.2 
155 3756 5940 Trace Statement 155.2 
[…]

First, we notice that in both traces PID are the same (2768 and 3756) and we can conclude that most likely both traces came from the same environment and session. Second, messages A, B, C and further are identical up to messages X and Y. The latter two messages differ greatly in their query results XXX and YYY. After that, message distribution differs greatly in both size and content. Despite the same tracing time, 15 seconds, statement current is 155 msg/s for working and 388 msg/s for non-working case.

Bifurcation points are easily observed when tracing noise ratio is small and, for example, in the case of Citrix terminal services environments, could be achieved by selecting appropriate tracing modules based on problem description or filtering irrelevant ones from full CDF traces.

- Dmitry Vostokov @ TraceAnalysis.org -

Music for Debugging: The Duet of Threads

Tuesday, September 1st, 2009

When listening for the first time to all Bach Cantatas from Complete Works, I found Aria Duetto “Wir danken, wir preisen” (”We thank you, we praise”) from BWV 134 (#4) very stimulating to think about multithreading while debugging or analyzing memory dumps and confirmed my feelings during second listening to all Cantatas.

Bach Edition: Complete Works (155 CD Box Set)

Buy from Amazon

Whom do we thank and praise then? Of course, the creator of the correct and maintainable multithreaded code.

- Dmitry Vostokov @ DumpAnalysis.org -