Archive for April, 2007

Who calls the postmortem debugger?

Monday, April 30th, 2007

I was trying to understand who calls dwwin.exe, the part of Windows Error Reporting on my Windows XP system when the crash happens. To find an answer I launched TestDefaultDebugger application and after pushing its crash button the following familiar WER dialog box appeared:

I repeated the same when running ProcessHistory in the background and looking at its log I found that the parent process for dwwin.exe and postmortem debugger (if we click on Debug button) was TestDefaultDebugger.exe. In my case the default postmortem debugger was drwtsn32.exe. To look inside I attached WinDbg to TestDefaultDebugger process when WER dialog box above was displayed and got the following stack trace:

0:000> k
ChildEBP RetAddr
0012d318 7c90e9ab ntdll!KiFastSystemCallRet
0012d31c 7c8094e2 ntdll!ZwWaitForMultipleObjects+0xc
0012d3b8 7c80a075 kernel32!WaitForMultipleObjectsEx+0x12c
0012d3d4 6945763c kernel32!WaitForMultipleObjects+0x18
0012dd68 694582b1 faultrep!StartDWException+0×5df
0012eddc 7c8633e9 faultrep!ReportFault+0×533
0012f47c 00411eaa kernel32!UnhandledExceptionFilter+0×587
0012f49c 0040e879 TestDefaultDebugger+0×11eaa
0012ffc0 7c816fd7 TestDefaultDebugger+0xe879
0012fff0 00000000 kernel32!BaseProcessStart+0×23

The combination of StartDWException and WaitForMultipleObjects suggests that dwwin.exe process is started there. Indeed, when I disassembled StartDWException function I saw CreateProcess call just before the wait call:

0:000> uf faultrep!StartDWException
...
...
...
69457585 8d8568f7ffff    lea     eax,[ebp-898h]
6945758b 50              push    eax
6945758c 8d8524f7ffff    lea     eax,[ebp-8DCh]
69457592 50              push    eax
69457593 8d85d4fbffff    lea     eax,[ebp-42Ch]
69457599 50              push    eax
6945759a 33c0            xor     eax,eax
6945759c 50              push    eax
6945759d 6820000004      push    4000020h
694575a2 6a01            push    1
694575a4 50              push    eax
694575a5 50              push    eax
694575a6 ffb5a4f7ffff    push    dword ptr [ebp-85Ch]
694575ac 50              push    eax
694575ad ff1558114569    call    dword ptr [faultrep!_imp__CreateProcessW (69451158)]


The second parameter of CreateProcess, [ebp-85Ch], is the address of the process command line and we know EBP value from the call stack above, 0012dd68. Just to remind that parameters for Win32 API functions are pushed from right to left. So we get the command line straight away:

0:000> dpu 0012dd68-85Ch l1
0012d50c  0012d3ec "C:\WINDOWS\system32\dwwin.exe -x -s 208"

If we dismiss WER dialog by clicking on Debug button then the postmortem debugger starts. Also it starts without WER dialog displayed if we rename faultrep.dll beforehand. So it looks like the obvious place to look for postmortem debugger launch is UnhandledExceptionFilter. Indeed, we see it there:

0:000> uf kernel32!UnhandledExceptionFilter
...
...
...
7c8636a8 8d850cfaffff    lea     eax,[ebp-5F4h]
7c8636ae 50              push    eax
7c8636af 8d857cf9ffff    lea     eax,[ebp-684h]
7c8636b5 50              push    eax
7c8636b6 33c0            xor     eax,eax
7c8636b8 50              push    eax
7c8636b9 50              push    eax
7c8636ba 50              push    eax
7c8636bb 6a01            push    1
7c8636bd 50              push    eax
7c8636be 50              push    eax
7c8636bf 53              push    ebx
7c8636c0 50              push    eax
7c8636c1 e86cecf9ff      call    kernel32!CreateProcessW (7c802332)


Because this is the code that yet to be executed we need to put a breakpoint at 7c8636c1, continue execution, and when the breakpoint is hit dump the second parameter to CreateProcess that is the memory EBX points to:

0:000> bp 7c8636c1
0:000> g
Breakpoint 0 hit
eax=00000000 ebx=0012ed78 ecx=0012ec70 edx=7c90eb94 esi=0000003a edi=00000000
eip=7c8636c1 esp=0012ed50 ebp=0012f47c iopl=0 nv up ei pl zr na pe nc
cs=001b  ss=0023  ds=0023  es=0023  fs=003b  gs=0000 efl=00000246
kernel32!UnhandledExceptionFilter+0x84b:
7c8636c1 e86cecf9ff      call    kernel32!CreateProcessW (7c802332)
0:000> du @ebx
0012ed78  “c:\drwatson\drwtsn32 -p 656 -e 1″
0012edb8  “72 -g”

You see that UnhandledExceptionFilter is about to launch my custom Dr. Watson postmortem debugger.  

If you look further at UnhandledExceptionFilter disassembled code you would see that after creating postmortem debugger process and waiting for it to finish saving the process dump the function calls NtTerminateProcess.

Therefore all error reporting, calling the postmortem debugger and final process termination are done in the same process that had the exception. The latter two parts are also described in Matt Pietrek’s article as I found afterwards.

One addition to that article written in 1997: starting from Windows XP UnhandledExceptionFilter locates and loads faultrep.dll which launches dwwin.exe to report an error:

kernel32!UnhandledExceptionFilter+0x4f7:
7c863359 8d85acfaffff    lea     eax,[ebp-554h]
7c86335f 50              push    eax
7c863360 8d8570faffff    lea     eax,[ebp-590h]
7c863366 50              push    eax
7c863367 56              push    esi
7c863368 56              push    esi
7c863369 e8fbacfaff      call    kernel32!LdrLoadDll (7c80e069)
0:000> dt _UNICODE_STRING 0012f47c-590
 "C:\WINDOWS\system32\faultrep.dll"
   +0x000 Length           : 0x40
   +0x002 MaximumLength    : 0x100
   +0x004 Buffer           : 0x0012f360  "C:\WINDOWS\system32\faultrep.dll"

You can also see that all processing is done using the same thread stack. So if something is wrong with that stack then you have silent process termination and no error is reported. In Vista there are some improvements that I’m going to cover in detail in a separate post.

- Dmitry Vostokov -

Spammers got my attention

Sunday, April 29th, 2007

I got an email this morning whose subject begins with “We can define a private “stack pointer”…”. It got my attention because of the phrase “stack pointer”. The full subject was even more intriguing:

“We can define a private “stack pointer” that keeps track of which stack level we’re at, and addresses the corresponding register.”

When I opened that email I was utterly disappointed: the same spam advertisement that we get with less clever subjects. However I was curious to understand whether the spammer knew about low-level stuff or built the subject sentence automatically from keywords found on my blog. The latter assumption looked less plausible because it seemed to me that the first part of the sentence was written by a human or AI caught the wave again while I was immersed in crash dump analysis. On the other hand, the sentence as the whole was confusing to me because pointers do not address registers on Intel architecture, it is the way around. Finally I found that the sentence was written in 1988 by a human indeed. I googled it and it was the exact sentence from “LET’S BUILD A COMPILER!” article series:

http://compilers.iecc.com/crenshaw/tutor2.txt

I don’t know whether there was an engine behind the spam that picks subject text from sentences found on web sites based on keywords found on a target audience site or that was just a coincidence and everyone got the same e-mail. :-)

- Dmitry Vostokov -

Interrupts and exceptions explained (Part 1)

Saturday, April 28th, 2007

So far we have seen various bugchecks depicted. What I left there is the explanation of how exceptions happen in the first place and how the execution flow reaches KiDispatchException. When some abnormal condition happens such as breakpoint, division by zero or memory protection violation then normal CPU execution flow (code stream) is interrupted (therefore I use the terms “interrupt” and “exception” interchangeably here). The type of interrupt is specified by a number called interrupt vector number. Obviously CPU has to transfer execution to some procedure in memory to handle that interrupt. CPU has to find that procedure, theoretically either having one procedure for all interrupts and specifying interrupt vector number as a parameter or having a table containing pointers to various procedures that correspond to different interrupt vectors. Intel x86 and x64 CPUs use the latter approach which is depicted on the following diagram:

 

When an exception happens (divide by zero, for example) CPU gets the address of the procedure table from IDTR (Interrupt Descriptor Table Register). This IDT (Interrupt Descriptor Table) is a zero-based array of 8-byte descriptors (x86) or 16-byte descriptors (x64). CPU calculates the location of the necessary procedure to call and does some necessary steps like saving appropriate registers before calling it.

The same happens when some external I/O device interrupts. We will talk about external interrupts later. For I/O devices the term “interrupt” is more appropriate.  On the picture above I/O hardware interrupt vector numbers were taken from some dump. These are OS and user-defined numbers. The first 32 vectors are reserved by Intel. 

Before Windows switches CPU to protected mode during boot process it creates IDT table in memory and sets IDTR to point to it by using SIDT instruction.

Let me now illustrate this by using a UML class diagram annotated by pseudo-code that shows what CPU does before calling the appropriate procedure. The pseudo-code on the diagram below is valid for interrupts and exceptions happening when the current CPU execution mode is kernel. For interrupts and exceptions generated when CPU executes code in user mode the picture is a little more complicated because the processor has to switch the current user-mode stack to kernel mode stack.

The following diagram is for 32-bit x86 processor (x64 will be illustrated later):

Let’s see all this in some dump. The address of IDT can be found by using !pcr [processor number] command. Every processor on a multiprocessor machine has its own IDT:

0: kd> !pcr 0
KPCR for Processor 0 at ffdff000:
    Major 1 Minor 1
 NtTib.ExceptionList: f2178b8c
     NtTib.StackBase: 00000000
    NtTib.StackLimit: 00000000
  NtTib.SubSystemTib: 80042000
       NtTib.Version: 0005c645
   NtTib.UserPointer: 00000001
       NtTib.SelfTib: 7ffdf000
             SelfPcr: ffdff000
                Prcb: ffdff120
                Irql: 0000001f
                 IRR: 00000000
                 IDR: ffffffff
       InterruptMode: 00000000
                 IDT: 8003f400
                 GDT: 8003f000
                 TSS: 80042000
       CurrentThread: 88c1d3c0
          NextThread: 00000000
          IdleThread: 808a68c0
           DpcQueue:

Every entry in IDT has the type _KIDTENTRY and we can get the first entry for divide by zero exception which has vector number 0:

0: kd> dt _KIDTENTRY 8003f400
   +0x000 Offset           : 0x47ca
   +0x002 Selector         : 8
   +0x004 Access           : 0x8e00
   +0x006 ExtendedOffset   : 0x8083

By gluing together ExtendedOffset and Offset fields we get the address of the interrupt handling procedure (0×808347ca) which is KiTrap00:

0: kd> ln 0x808347ca
(808347ca)   nt!KiTrap00   |  (808348a5)   nt!Dr_kit1_a
Exact matches:
    nt!KiTrap00

We can also see the interrupt trace in raw stack. For example, we can open the kernel dump and see the following stack trace and registers in the output of !analyze -v command:

ErrCode = 00000000
eax=00001b00 ebx=00001b00 ecx=00000000 edx=00000000 esi=f2178cb4 edi=bc15a838
eip=bf972586 esp=f2178c1c ebp=f2178c90 iopl=0 nv up ei ng nz ac po cy
cs=0008 ss=0010 ds=0023 es=0023 fs=0030 gs=0000 efl=00010293
driver!foo+0xf9:
bf972586 f77d10 idiv eax,dword ptr [ebp+10h] ss:0010:f2178ca0=00000000
STACK_TEXT:
f2178b44 809989be nt!KeBugCheck+0×14
f2178b9c 8083484f nt!Ki386CheckDivideByZeroTrap+0×41
f2178b9c bf972586 nt!KiTrap00+0×88
f2178c90 bf94c23c driver!foo+0xf9
f2178d54 80833bdf driver!bar+0×11c

The dumping memory around ESP value (f2178c1c) shows the values processor pushes when divide by zero exception happens:

0: kd> dds f2178c1c-100 f2178c1c+100
...
...
...
f2178b80  00000000
f2178b84  f2178b50
f2178b88  00000000
f2178b8c  f2178d44
f2178b90  8083a8cc nt!_except_handler3
f2178b94  80870828 nt!`string'+0xa4
f2178b98  ffffffff
f2178b9c  f2178ba8
f2178ba0  8083484f nt!KiTrap00+0x88
f2178ba4  f2178ba8
f2178ba8  f2178c90
f2178bac  bf972586 driver!foo+0xf9
f2178bb0  badb0d00
f2178bb4  00000000
f2178bb8  0000006d
f2178bbc  bf842315
f2178bc0  f2178c6c
f2178bc4  f2178bec
f2178bc8  bf844154
f2178bcc  f2178c88
f2178bd0  f2178c88
f2178bd4  00000000
f2178bd8  bc150000
f2178bdc  f2170023
f2178be0  f2170023
f2178be4  00000000
f2178be8  00000000
f2178bec  00001b00
f2178bf0  f2178c0c
f2178bf4  f2178d44
f2178bf8  f2170030
f2178bfc  bc15a838
f2178c00  f2178cb4
f2178c04  00001b00
f2178c08  f2178c90
f2178c0c  00000000 ; ErrorCode
f2178c10  bf972586 driver!foo+0xf9 ; EIP
f2178c14  00000008 ; CS
f2178c18  00010293 ; EFLAGS

f2178c1c  00000000 ; <- ESP before interrupt
f2178c20  0013c220
f2178c24  00000000
f2178c28  60000000
f2178c2c  00000001
f2178c30  00000000
f2178c34  00000000


ErrorCode is not the same as interrupt vector number although it is the same number here (0). I won’t cover interrupt error codes here. If you are interested please consult Intel Architecture Software Developer’s Manual.

If we try to execute !idt extension command it will show you only user-defined hardware interrupt vectors:

0: kd> !idt
Dumping IDT:
37: 80a7d1ac hal!PicSpuriousService37
50: 80a7d284 hal!HalpApicRebootService
51: 89495044 serial!SerialCIsrSw (KINTERRUPT 89495008)
52: 89496044 i8042prt!I8042MouseInterruptService (KINTERRUPT 89496008)
53: 894ea044 USBPORT!USBPORT_InterruptService (KINTERRUPT 894ea008)
63: 894f2044 USBPORT!USBPORT_InterruptService (KINTERRUPT 894f2008)
72: 89f59044 atapi!IdePortInterrupt (KINTERRUPT 89f59008)
73: 89580044 NDIS!ndisMIsr (KINTERRUPT 89580008)
83: 899e7824 NDIS!ndisMIsr (KINTERRUPT 899e77e8)
92: 89f56044 atapi!IdePortInterrupt (KINTERRUPT 89f56008)
93: 89f1e044 SCSIPORT!ScsiPortInterrupt (KINTERRUPT 89f1e008)
a3: 894fa044 USBPORT!USBPORT_InterruptService (KINTERRUPT 894fa008)
a4: 894a3044 cpqcidrv+0×22AE (KINTERRUPT 894a3008)
b1: 89f697dc ACPI!ACPIInterruptServiceRoutine (KINTERRUPT 89f697a0)
b3: 89498824 i8042prt!I8042KeyboardInterruptService (KINTERRUPT 894987e8)
b4: 894a2044 cpqasm2+0×5B99 (KINTERRUPT 894a2008)
c1: 80a7d410 hal!HalpBroadcastCallService
d1: 80a7c754 hal!HalpClockInterrupt
e1: 80a7d830 hal!HalpIpiHandler
e3: 80a7d654 hal!HalpLocalApicErrorService
fd: 80a7e11c hal!HalpProfileInterrupt

In the next part I’m going to cover x64 specifics. 

- Dmitry Vostokov -

Bugchecks: SYSTEM_THREAD_EXCEPTION_NOT_HANDLED

Wednesday, April 25th, 2007

Another bugcheck that is similar to KMODE_EXCEPTION_NOT_HANDLED and KERNEL_MODE_EXCEPTION_NOT_HANDLED is SYSTEM_THREAD_EXCEPTION_NOT_HANDLED (0×7E).

This bugcheck happens when you have an exception in a system thread and there is no exception handler to catch it, i.e. no __try/__except. System threads are created by calling PsCreateSystemThread function. Here is its description from DDK:

The PsCreateSystemThread routine creates a system thread that executes in kernel mode and returns a handle for the thread.

By default PspUnhandledExceptionInSystemThread function is set as a default exception handler and its purpose is to call KeBugCheckEx.

The typical call stack in dumps with 7E bugcheck is:

1: kd> k
ChildEBP RetAddr
f70703cc 809a1eb2 nt!KeBugCheckEx+0x1b (FPO: [Non-Fpo])
f70703e8 80964a94 nt!PspUnhandledExceptionInSystemThread+0x1a
f7070ddc 80841a96 nt!PspSystemThreadStartup+0x56
00000000 00000000 nt!KiThreadStartup+0x16

To see how this bugcheck is generated from processor trap we need to look at raw stack. Let’s look at some example. !analyze -v command gives us the following output:

SYSTEM_THREAD_EXCEPTION_NOT_HANDLED (7e)
This is a very common bugcheck. Usually the exception address pinpoints the driver/function that caused the problem. Always note this address as well as the link date of the driver/image that contains this address.
Arguments:
Arg1: 80000003, The exception code that was not handled
Arg2: f69d9dd7, The address that the exception occurred at
Arg3: f70708c0, Exception Record Address
Arg4: f70705bc, Context Record Address
EXCEPTION_RECORD:  f70708c0 -- (.exr fffffffff70708c0)
ExceptionAddress: f69d9dd7 (driver+0x00014dd7)
   ExceptionCode: 80000003 (Break instruction exception)
  ExceptionFlags: 00000000
NumberParameters: 3
   Parameter[0]: 00000000
   Parameter[1]: 8a784020
   Parameter[2]: 00000044
CONTEXT:  f70705bc -- (.cxr fffffffff70705bc)
eax=00000001 ebx=00000032 ecx=8a37c000 edx=00000044 esi=8a37c000 edi=000000c7
eip=f69d9dd7 esp=f7070988 ebp=00004000 iopl=0 nv up ei pl zr na pe nc
cs=0008  ss=0010  ds=0023  es=0023  fs=0030  gs=0000 efl=00000246
driver+0x14dd7:
f69d9dd7 cc              int     3
STACK_TEXT:
WARNING: Stack unwind information not available. Following frames may be wrong.
f7070998 f69dfbd1 80800000 00000000 8a37c000 driver+0x14dd7
00000000 00000000 00000000 00000000 00000000 driver+0x1abd1

We see that the thread encountered the breakpoint instruction and we also see that the call stack that led to breakpoint exception is incomplete. Here we must dump the raw stack data and try to reconstruct the stack manually. System threads are started with the execution of KiThreadStartup function. So let’s dump the stack starting from ESP register and up to some value, find startup function there and try to walk EBP chain:

1: kd> dds esp esp+1000
...
...
...
f7070a04  f7070a40
f7070a08  f71172ec NDIS!ndisMCommonHaltMiniport+0×375
f7070a0c  8a37c000
f7070a10  8a60f5c0
f7070a14  8a610748
f7070a18  8a610748
f7070a1c  00000058
f7070a20  00000005
f7070a24  00000004
f7070a28  f7527000
f7070a2c  00000685
f7070a30  f710943a NDIS!ndisMDummyIndicatePacket
f7070a34  00000000
f7070a38  8a610778
f7070a3c  00010748
f7070a40  f7070b74
f7070a44  f7117640 NDIS!ndisMHaltMiniport+0×21
f7070a48  8a610990

… (intermediate frames are omitted to save space)

f7070d7c  e104a338
f7070d80  f7070dac
f7070d84  8083f72e nt!ExpWorkerThread+0xeb
f7070d88  894d0ed8
f7070d8c  00000000
f7070d90  8a784020
f7070d94  00000000
f7070d98  00000000
f7070d9c  00000000
f7070da0  00000001
f7070da4  00000000
f7070da8  808f0cb7 nt!PiWalkDeviceList
f7070dac  f7070ddc
f7070db0  8092ccff nt!PspSystemThreadStartup+0×2e
f7070db4  894d0ed8
f7070db8  00000000
f7070dbc  00000000
f7070dc0  00000000
f7070dc4  f7070db8
f7070dc8  f7070410
f7070dcc  ffffffff
f7070dd0  8083b9bc nt!_except_handler3
f7070dd4  808408f8 nt!ObWatchHandles+0×5f4
f7070dd8  00000000
f7070ddc  00000000
f7070de0  80841a96 nt!KiThreadStartup+0×16
f7070de4  8083f671 nt!ExpWorkerThread

The last EBP value we are able to get is f7070a04 and we can specify it to the extended version of k command together with ESP and EIP value of crash point:

1: kd> k L=f7070a04 f7070988 f69d9dd7
ChildEBP RetAddr
f7070998 f69dfbd1 driver+0x14dd7
f7070a40 f7117640 NDIS!ndisMCommonHaltMiniport+0x375
f7070a48 f711891a NDIS!ndisMHaltMiniport+0x21
f7070b74 f71196e5 NDIS!ndisPnPRemoveDevice+0x189
f7070ba4 8083f9d0 NDIS!ndisPnPDispatch+0x15d
f7070bb8 808f6a25 nt!IofCallDriver+0x45
f7070be4 808e20b5 nt!IopSynchronousCall+0xbe
f7070c38 8080beae nt!IopRemoveDevice+0x97
f7070c60 808e149b nt!IopRemoveLockedDeviceNode+0x160
f7070c78 808e18cc nt!IopDeleteLockedDeviceNode+0x50
f7070cac 808e1732 nt!IopDeleteLockedDeviceNodes+0x3f
f7070d40 808e19b6 nt!PiProcessQueryRemoveAndEject+0x7ad
f7070d5c 808e7879 nt!PiProcessTargetDeviceEvent+0x2a
f7070d80 8083f72e nt!PiWalkDeviceList+0x1d2
f7070dac 8092ccff nt!ExpWorkerThread+0xeb
f7070ddc 80841a96 nt!PspSystemThreadStartup+0x2e
00000000 00000000 nt!KiThreadStartup+0x16

This was the execution path before the exception. However when int 3 instruction was hit then processor generated a trap with interrupt vector 3 (4th entry in interrupt descriptor table, zero-based) and the corresponding function in IDT, KiTrap03, was called. Therefore we need to find this function on the same raw stack and try to build stack trace for our exception handling code path:

1: kd> dds esp esp+1000
f70703b4  0000007e
f70703b8  80000003
f70703bc  f69d9dd7 driver+0x14dd7
f70703c0  f70708c0
f70703c4  f70705bc
f70703c8  00000000
f70703cc  f70703e8
f70703d0  809a1eb2 nt!PspUnhandledExceptionInSystemThread+0x1a
f70703d4  0000007e
...
...
...
f7070418 f707043c
f707041c 8083b578 nt!ExecuteHandler2+0×26
f7070420 f70708c0
f7070424 f7070dcc
f7070428 f70705bc
f707042c f70704d8
f7070430 f7070894
f7070434 8083b58c nt!ExecuteHandler2+0×3a
f7070438 f7070dcc
f707043c f70704ec
f7070440 8083b54a nt!ExecuteHandler+0×24
f7070444 f70708c0

… (intermediate frames are omitted to save space)

f7070890  f7070410
f7070894  f7070dcc
f7070898  8083b9bc nt!_except_handler3
f707089c  80850c10 nt!`string’+0×10c
f70708a0  ffffffff
f70708a4  f7070914
f70708a8  808357a4 nt!CommonDispatchException+0×4a
f70708ac  f70708c0
f70708b0  00000000
f70708b4  f7070914
f70708b8  00000000
f70708bc  00000001
f70708c0  80000003
f70708c4  00000000
f70708c8  00000000
f70708cc  f69d9dd7 driver+0×14dd7
f70708d0  00000003
f70708d4  00000000
f70708d8  8a784020
f70708dc  00000044
f70708e0  ffffffff
f70708e4  00000001
f70708e8  f7737a7c
f70708ec  f7070934
f70708f0  8083f164 nt!KeWaitForSingleObject+0×346
f70708f4  000000c7
f70708f8  00000000
f70708fc  00000031
f7070900  00000000
f7070904  f707091c
f7070908  80840569 nt!KeSetTimerEx+0×179
f707090c  000000c7
f7070910  80835f39 nt!KiTrap03+0xb0
f7070914  00004000

The last EBP value we are able to get is f7070418 so the following stack trace emerges albeit not accurate as it doesn’t show that KeBugCheckEx was called from PspUnhandledExceptionInSystemThread as can be seen from disassembly. However it does show that the main function that orchestrates stack unwinding is KiDispatchException:

1: kd> k L=f7070418
ChildEBP RetAddr
f7070418 8083b578 nt!KeBugCheckEx+0x1b
f707043c 8083b54a nt!ExecuteHandler2+0x26
f70704ec 80810664 nt!ExecuteHandler+0x24
f70708a4 808357a4 nt!KiDispatchException+0x131
f707090c 80835f39 nt!CommonDispatchException+0x4a
f707090c f69d9dd8 nt!KiTrap03+0xb0
1: kd> uf nt!PspUnhandledExceptionInSystemThread
nt!PspUnhandledExceptionInSystemThread:
809a1e98 8bff            mov     edi,edi
809a1e9a 55              push    ebp
809a1e9b 8bec            mov     ebp,esp
809a1e9d 8b4d08          mov     ecx,dword ptr [ebp+8]
809a1ea0 ff7104          push    dword ptr [ecx+4]
809a1ea3 8b01            mov     eax,dword ptr [ecx]
809a1ea5 50              push    eax
809a1ea6 ff700c          push    dword ptr [eax+0Ch]
809a1ea9 ff30            push    dword ptr [eax]
809a1eab 6a7e            push    7Eh
809a1ead e8f197edff      call    nt!KeBugCheckEx (8087b6a3)

At the end to illustrate this bugcheck I created the following UML sequence diagram (trap processing that leads to KiDispatchException will be depicted in another post):

- Dmitry Vostokov -

Bugchecks: KMODE_EXCEPTION_NOT_HANDLED

Wednesday, April 25th, 2007

This bugcheck (0×1E) is essentially the same as KERNEL_MODE_EXCEPTION_NOT_HANDLED (0×8E) although parameters are different:

KMODE_EXCEPTION_NOT_HANDLED (1e)
This is a very common bugcheck. Usually the exception address pinpoints the driver/function that caused the problem. Always note this address as well as the link date of the driver/image that contains this address.
Arguments:
Arg1: c0000005, The exception code that was not handled
Arg2: 8046ce72, The address that the exception occurred at
Arg3: 00000000, Parameter 0 of the exception
Arg4: 00000000, Parameter 1 of the exception

KERNEL_MODE_EXCEPTION_NOT_HANDLED (8e)
This is a very common bugcheck. Usually the exception address pinpoints the driver/function that caused the problem. Always note this address as well as the link date of the driver/image that contains this address. Some common problems are exception code 0×80000003. This means a hard coded breakpoint or assertion was hit, but this system was booted /NODEBUG. This is not supposed to happen as developers should never have hardcoded breakpoints in retail code, but … If this happens, make sure a debugger gets connected, and the system is booted /DEBUG. This will let us see why this breakpoint is happening.
Arguments:
Arg1: c0000005, The exception code that was not handled
Arg2: 808cbb8d, The address that the exception occurred at
Arg3: f5a84638, Trap Frame
Arg4: 00000000

Bugcheck 0×1E is called from the same routine KiDispatchException on x64 W2K3 and on x86 W2K whereas 0×8E is called on x86 W2K3 and Vista platforms. I haven’t checked this with x64 Vista. Here is the modified UML diagram showing both bugchecks:

    

- Dmitry Vostokov -

Crash Dump Analysis Poster v1.1 (HTML version)

Sunday, April 22nd, 2007

Here is an HTML version of Crash Dump Analysis Poster with hyperlinks. Command links launch WinDbg Help for corresponding topic. If you click on !heap, for example, WinDbg Help window for that command will open. In order to have this functionality you need to save source code of the following HTML file below to your disk and launch it locally.

http://www.dumpanalysis.org/CDAPoster.html

Your WinDbg Help file must be in the default installation path, i.e.

C:\Program Files\Debugging Tools for Windows\debugger.chm

If you installed WinDbg to a different folder then you can simply create the default folder and copy debugger.chm there.

I keep this HTML file open locally on a second monitor and found it very easy to jump to an appropriate command help when I need its parameter description.

This HTML poster was created and edited in Notepad.

I’m working on the second version and will announce it as soon as it is ready.

- Dmitry Vostokov -

Bugchecks: KERNEL_MODE_EXCEPTION_NOT_HANDLED

Sunday, April 22nd, 2007

Here is the next depicted bugcheck: 0×8E. It is very common in kernel crash dumps and it means that:

  1. If access violation exception happened the read or write address was in user space

  2. Frame-based exception handling was allowed, kernel debugger (if any) didn’t handle the exception (first chance), then no exception handlers were willing to process the exception and at last kernel debugger (if any) didn’t handle the exception (second chance)

  3. Frame-based exception handling wasn’t allowed and kernel debugger (if any) didn’t handle the exception

The second option is depicted on the following UML sequence diagram:

Note: if you have an access violation and read or write address is in kernel space you get a different bugcheck as explained in Invalid Pointer Pattern 

I assumed that you know about structured and frame based exception handling (SEH). If you don’t know how it is implemented please read Matt Pietrek’s article: A Crash Course on the Depths of Win32 Structured Exception Handling

References used:

  1. “Windows NT/2000 Native API Reference” book by Gary Nebbett
  2. Local kernel debugging on Windows XP to check that the flow on the diagram above is correct

- Dmitry Vostokov -

BSOD Shortcut Icon

Saturday, April 21st, 2007

Created the most simple blue screen icon and put it on my site. Now you can observe it when you run your browser:

- Dmitry Vostokov -

More WinDbg Scripts

Saturday, April 21st, 2007

The good collection of WinDbg scripts was posted this month on Roberto Farah’s blog. Most of his scripts use DML (Debugger Markup Language). I also looked over my previous posts and put my old crash dump analysis scripts into one collection.

Found one problem. I was writing a script yesterday to call some external programs and found that I couldn’t easily pass parameters to the script. I mean something like this would be great:

$$><myscript.txt param1 param2

and I could possibly reference parameters inside the script perhaps by some pseudo-register. The current approach is to set up pseudo-registers before running the script and I don’t like it. I’ll post a better solution if I find it.

- Dmitry Vostokov -

Crash Dump Analysis Patterns (Part 12)

Friday, April 20th, 2007

Another pattern that happens so often in crash dumps: No Component Symbols. In this case we can guess what a component does by looking at its name, overall thread stack where it is called and also its import table. Here is an example. We have component.sys driver visible on some thread stack in a kernel dump but we don’t know what that component can potentially do. Because we don’t have symbols we cannot see its imported functions:

kd> x component!*
kd>

We use !dh command to dump its image headers:

kd> lmv m component
start             end                 module name
fffffadf`e0eb5000 fffffadf`e0ebc000   component   (no symbols)
    Loaded symbol image file: component.sys
    Image path: \??\C:\Component\x64\component.sys
    Image name: component.sys
    Timestamp:        Sat Jul 01 19:06:16 2006 (44A6B998)
    CheckSum:         000074EF
    ImageSize:        00007000
    Translations:     0000.04b0 0000.04e0 0409.04b0 0409.04e0
kd> !dh fffffadf`e0eb5000
File Type: EXECUTABLE IMAGE
FILE HEADER VALUES
    8664 machine (X64)
       6 number of sections
44A6B998 time date stamp Sat Jul 01 19:06:16 2006
       0 file pointer to symbol table
       0 number of symbols
      F0 size of optional header
      22 characteristics
            Executable
            App can handle >2gb addresses
OPTIONAL HEADER VALUES
     20B magic #
    8.00 linker version
     C00 size of code
     A00 size of initialized data
       0 size of uninitialized data
    5100 address of entry point
    1000 base of code
         ----- new -----
0000000000010000 image base
    1000 section alignment
     200 file alignment
       1 subsystem (Native)
    5.02 operating system version
    5.02 image version
    5.02 subsystem version
    7000 size of image
     400 size of headers
    74EF checksum
0000000000040000 size of stack reserve
0000000000001000 size of stack commit
0000000000100000 size of heap reserve
0000000000001000 size of heap commit
       0 [       0] address [size] of Export Directory
    51B0 [      28] address [size] of Import Directory
    6000 [     3B8] address [size] of Resource Directory
    4000 [      6C] address [size] of Exception Directory
       0 [       0] address [size] of Security Directory
       0 [       0] address [size] of Base Relocation Directory
    2090 [      1C] address [size] of Debug Directory
       0 [       0] address [size] of Description Directory
       0 [       0] address [size] of Special Directory
       0 [       0] address [size] of Thread Storage Directory
       0 [       0] address [size] of Load Configuration Directory
       0 [       0] address [size] of Bound Import Directory
    2000 [      88] address [size] of Import Address Table Directory
       0 [       0] address [size] of Delay Import Directory
       0 [       0] address [size] of COR20 Header Directory
       0 [       0] address [size] of Reserved Directory


Then we display the contents of Import Address Table Directory using dps command:

kd> dps fffffadf`e0eb5000+2000 fffffadf`e0eb5000+2000+88
fffffadf`e0eb7000  fffff800`01044370 nt!IoCompleteRequest
fffffadf`e0eb7008  fffff800`01019700 nt!IoDeleteDevice
fffffadf`e0eb7010  fffff800`012551a0 nt!IoDeleteSymbolicLink
fffffadf`e0eb7018  fffff800`01056a90 nt!MiResolveTransitionFault+0x7c2
fffffadf`e0eb7020  fffff800`0103a380 nt!ObDereferenceObject
fffffadf`e0eb7028  fffff800`0103ace0 nt!KeWaitForSingleObject
fffffadf`e0eb7030  fffff800`0103c570 nt!KeSetTimer
fffffadf`e0eb7038  fffff800`0102d070 nt!IoBuildPartialMdl+0x3
fffffadf`e0eb7040  fffff800`012d4480 nt!PsTerminateSystemThread
fffffadf`e0eb7048  fffff800`01041690 nt!KeBugCheckEx
fffffadf`e0eb7050  fffff800`010381b0 nt!KeInitializeTimer
fffffadf`e0eb7058  fffff800`0103ceb0 nt!ZwClose
fffffadf`e0eb7060  fffff800`012b39f0 nt!ObReferenceObjectByHandle
fffffadf`e0eb7068  fffff800`012b7380 nt!PsCreateSystemThread
fffffadf`e0eb7070  fffff800`01251f90 nt!FsRtlpIsDfsEnabled+0x114
fffffadf`e0eb7078  fffff800`01275160 nt!IoCreateDevice
fffffadf`e0eb7080  00000000`00000000
fffffadf`e0eb7088  00000000`00000000

We see that this driver under certain circumstances could bugcheck the system using KeBugCheckEx, it creates system thread(s) (PsCreateSystemThread) and uses timer(s) (KeInitializeTimer, KeSetTimer).

If you see name+offset in import table (I think this is an effect of OMAP code optimization) you can get the function by using ln command (list nearest symbols):

kd> ln fffff800`01056a90
(fffff800`01056760)   nt!MiResolveTransitionFault+0x7c2   |  (fffff800`01056a92)   nt!RtlInitUnicodeString
kd> ln fffff800`01251f90
(fffff800`01251e90)   nt!FsRtlpIsDfsEnabled+0×114   |  (fffff800`01251f92)   nt!IoCreateSymbolicLink

This technique is useful if you have a bugcheck that happens when a driver calls certain functions or must call certain function in pairs, like bugcheck 0×20:

kd> !analyze -show 0x20
KERNEL_APC_PENDING_DURING_EXIT (20)
The key data item is the thread's APC disable count. If this is non-zero, then this is the source of the problem. The APC disable count is decremented each time a driver calls KeEnterCriticalRegion, KeInitializeMutex, or FsRtlEnterFileSystem. The APC disable count is incremented each time a driver calls KeLeaveCriticalRegion, KeReleaseMutex, or FsRtlExitFileSystem.  Since these calls should always be in pairs, this value should be zero when a thread exits. A negative value indicates that a driver has disabled APC calls without re-enabling them. A positive value indicates that the reverse is true. If you ever see this error, be very suspicious of all drivers installed on the machine — especially unusual or non-standard drivers. Third party file system redirectors are especially suspicious since they do not generally receive the heavy duty testing that NTFS, FAT, RDR, etc receive. This current IRQL should also be 0.  If it is not, that a driver’s cancelation routine can cause this bugcheck by returning at an elevated IRQL.  Always attempt to note what you were doing/closing at the time of the crash, and note all of the installed drivers at the time of the crash.  This symptom is usually a severe bug in a third party driver.

Then you can see at least whether the suspicious driver could have potentially used those functions and if it imports one of them you can see whether it imports the corresponding counterpart function. 

- Dmitry Vostokov @ DumpAnalysis.org -

Crash Dump Analysis Patterns (Part 5b)

Friday, April 20th, 2007

This is a follow up to Optimized Code pattern written previously. Now I discuss the following feature that often bewilders beginners. It is called OMAP code optimization. It is used to make code that needs to be present in memory smaller. So instead of flat address space for compiled function you have pieces of it scattered here and there. This leads to an ambiguity when you try to disassemble OMAP code at its address because WinDbg doesn’t know whether it should treat address range as a function offset (starting from the beginning of the function source code) or just a memory layout offset (starting from the address of that function). Let me illustrate this on IoCreateDevice function code.

Let’s first evaluate a random address starting from the first address of the function (memory layout offset):

kd> ? nt!IoCreateDevice
Evaluate expression: -8796073668256 = fffff800`01275160
kd> ? nt!IoCreateDevice+0×144
Evaluate expression: -8796073667932 = fffff800`012752a4
kd> ? fffff800`012752a4-fffff800`01275160
Evaluate expression: 324 = 00000000`00000144

If we try to disassemble code at the same address the expression will also be evaluated as the memory layout offset:

kd> u nt!IoCreateDevice+0×144
nt!IoCreateDevice+0×1a3:
fffff800`012752a4 83c810          or      eax,10h
fffff800`012752a7 898424b0000000  mov     dword ptr [rsp+0B0h],eax
fffff800`012752ae 85ed            test    ebp,ebp
fffff800`012752b0 8bdd            mov     ebx,ebp
fffff800`012752b2 0f858123feff    jne     nt!IoCreateDevice+0×1b3
fffff800`012752b8 035c2454        add     ebx,dword ptr [rsp+54h]
fffff800`012752bc 488b1585dcf2ff  mov     rdx,qword ptr [nt!IoDeviceObjectType]
fffff800`012752c3 488d8c2488000000 lea     rcx,[rsp+88h]

You see the difference: we give +0×144 offset but the code is shown from +0×1a3! This is because OMAP optimization moved the code from the function offset +0×1a3 to memory locations starting from +0×144. The following picture illustrates this:

If you see this when disassembling a function name+offset address from a thread stack trace you can use raw address instead:

kd> k
Child-SP          RetAddr           Call Site
fffffadf`e3a18d30 fffff800`012b331e component!function+0×72
fffffadf`e3a18d70 fffff800`01044196 nt!PspSystemThreadStartup+0×3e
fffffadf`e3a18dd0 00000000`00000000 nt!KxStartSystemThread+0×16
kd> u fffff800`012b331e
nt!PspSystemThreadStartup+0×3e:
fffff800`012b331e 90              nop
fffff800`012b331f f683fc03000040  test    byte ptr [rbx+3FCh],40h
fffff800`012b3326 0f8515d30600    jne     nt!PspSystemThreadStartup+0×4c
fffff800`012b332c 65488b042588010000 mov   rax,qword ptr gs:[188h]
fffff800`012b3335 483bd8          cmp     rbx,rax
fffff800`012b3338 0f85a6d30600    jne     nt!PspSystemThreadStartup+0×10c
fffff800`012b333e 838bfc03000001  or      dword ptr [rbx+3FCh],1
fffff800`012b3345 33c9            xor     ecx,ecx

You also see OMAP in action also when you try to disassemble the function body using uf command:

kd> uf nt!IoCreateDevice
nt!IoCreateDevice+0×34d:
fffff800`0123907d 834f3008        or      dword ptr [rdi+30h],8
fffff800`01239081 e955c30300      jmp     nt!IoCreateDevice+0×351



nt!IoCreateDevice+0×14c:
fffff800`0126f320 6641be0002      mov     r14w,200h
fffff800`0126f325 e92f5f0000      jmp     nt!IoCreateDevice+0×158
nt!IoCreateDevice+0×3cc:
fffff800`01270bd0 488d4750        lea     rax,[rdi+50h]
fffff800`01270bd4 48894008        mov     qword ptr [rax+8],rax
fffff800`01270bd8 488900          mov     qword ptr [rax],rax
fffff800`01270bdb e95b480000      jmp     nt!IoCreateDevice+0×3d7
nt!IoCreateDevice+0xa4:
fffff800`01273eb9 41b801000000    mov     r8d,1
fffff800`01273ebf 488d154a010700  lea     rdx,[nt!`string’]
fffff800`01273ec6 488d8c24d8000000 lea     rcx,[rsp+0D8h]
fffff800`01273ece 440fc10522f0f2ff xadd    dword ptr [nt!IopUniqueDeviceObjectNumber],r8d
fffff800`01273ed6 41ffc0          inc     r8d
fffff800`01273ed9 e8d236deff      call    nt!swprintf
fffff800`01273ede 4584ed          test    r13b,r13b
fffff800`01273ee1 0f85c1a70800    jne     nt!IoCreateDevice+0xce


- Dmitry Vostokov @ DumpAnalysis.org -

Finding a needle in a hay

Thursday, April 19th, 2007

Found a good WinDbg command to list unique threads in a process. Some processes have so many threads that it is difficult to find anomalies in the output of ~*kv command especially when most threads are similar like waiting for LPC reply, etc. In this case we can use !uniqstack command to list only threads with unique call stacks and then list duplicate thread numbers.

0:046> !uniqstack
Processing 51 threads, please wait
.  0  Id: 1d50.1dc0 Suspend: 1 Teb: 7fffe000 Unfrozen
      Priority: 0  Priority class: 32
ChildEBP RetAddr
0012fbcc 7c821b84 ntdll!KiFastSystemCallRet
0012fbd0 77e4189f ntdll!NtReadFile+0xc
0012fc38 77f795ab kernel32!ReadFile+0×16c
0012fc64 77f7943c ADVAPI32!ScGetPipeInput+0×2a
0012fcd8 77f796c1 ADVAPI32!ScDispatcherLoop+0×51
0012ff3c 004018fb ADVAPI32!StartServiceCtrlDispatcherW+0xe3



. 26  Id: 1d50.44ec Suspend: 1 Teb: 7ffaf000 Unfrozen
      Priority: 1  Priority class: 32
ChildEBP RetAddr
0752fea0 7c822124 ntdll!KiFastSystemCallRet
0752fea4 77e6bad8 ntdll!NtWaitForSingleObject+0xc
0752ff14 77e6ba42 kernel32!WaitForSingleObjectEx+0xac
0752ff28 1b00999e kernel32!WaitForSingleObject+0×12
0752ff34 1b009966 msjet40!Semaphore::Wait+0xe
0752ff5c 1b00358c msjet40!Queue::GetMessageW+0xc9
0752ffb8 77e6608b msjet40!System::WorkerThread+0×41
0752ffec 00000000 kernel32!BaseThreadStart+0×34



Total threads: 51
Duplicate callstacks: 31 (windbg thread #s follow):
3, 4, 5, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 19, 21, 22, 23, 27, 28, 29, 33, 39, 40, 41, 42, 43, 44, 47, 49, 50
0:046> ~49kL
ChildEBP RetAddr
0c58fe18 7c821c54 ntdll!KiFastSystemCallRet
0c58fe1c 77c7538c ntdll!ZwReplyWaitReceivePortEx+0xc
0c58ff84 77c5778f RPCRT4!LRPC_ADDRESS::ReceiveLotsaCalls+0×198
0c58ff8c 77c5f7dd RPCRT4!RecvLotsaCallsWrapper+0xd
0c58ffac 77c5de88 RPCRT4!BaseCachedThreadRoutine+0×9d
0c58ffb8 77e6608b RPCRT4!ThreadStartRoutine+0×1b
0c58ffec 00000000 kernel32!BaseThreadStart+0×34
0:046> ~47kL
ChildEBP RetAddr
0b65fe18 7c821c54 ntdll!KiFastSystemCallRet
0b65fe1c 77c7538c ntdll!ZwReplyWaitReceivePortEx+0xc
0b65ff84 77c5778f RPCRT4!LRPC_ADDRESS::ReceiveLotsaCalls+0×198
0b65ff8c 77c5f7dd RPCRT4!RecvLotsaCallsWrapper+0xd
0b65ffac 77c5de88 RPCRT4!BaseCachedThreadRoutine+0×9d
0b65ffb8 77e6608b RPCRT4!ThreadStartRoutine+0×1b
0b65ffec 00000000 kernel32!BaseThreadStart+0×34

- Dmitry Vostokov -

Reading Chinese

Wednesday, April 18th, 2007

I’m very pleased that Da-Chang Guan started translating Crash Dump Analysis Patterns into Chinese. I’m personally thinking about translating them to Russian too, my native language. 

Please visit his blog http://windriver.polar.tw/blog/ where he also mentions Windows Academic Program I didn’t know about before. Definitely it is good to learn about Windows Internals by studying Windows source code. Now there is an alternative to studying free Linux source code in university operating system courses. 

If you are interested in learning Chinese language I’d like also to recommend some books that will help you to learn to read traditional Chinese. I’m learning it now and here I would recommend the Chinese Reader series (I own and I’m studying the first two at the moment):

My personal opinion and belief is that if you ignore writing and speaking then learning to read Chinese becomes less arduous task. If you like linguistics and languages like me I would also recommend the following two popular books that I read some time ago:

and the more linguistic one I read too: 

I have another book I haven’t read yet but it is on my reading list:

I became interested in English grammar and eventually in other languages and linguistics during my work for Programming Research Ltd., global leader in software quality and coding standards, 4 years ago when I was working on extending and writing new C++ semantics components for their C++ static analysis product. 

- Dmitry Vostokov -

Crash Dump Analysis Poster v1.0

Sunday, April 15th, 2007

In December when I announced Crash Dump Analysis Card I talked about my plans to make a poster. Here is what I came to in the first version: A4 format poster with the following goals in mind:

  • Have it easy to stick nearby or have it handy
  • Foldable into two halves (user / kernel dumps)
  • Possibility to make it a background image on a PC desktop
  • Have it displayed on a second monitor
  • To facilitate mastering commands and their options
  • Encourage to look in WinDbg Help

You can download large JPEG file (1701×1208) for free (PDF file will be available later):

Download Crash Dump Analysis Poster

In a couple of months I’m going to release the new version after using and playing with the current one and collecting feedback. I have some extension commands missing in the first version of this poster like !list command, various scripting and meta-commands and I will add them in the next version. The current choice of commands is based on my previous Crash Dump Analysis Card and my personal day-to-day crash dump analysis work.

Originally I wanted to call it like something like WinDbg Cheat Sheet or WinDbg Poster but then I realized that I had to omit various live debugging commands and options and there are already several similar cheat sheets for live debugging. 

- Dmitry Vostokov -

Race conditions on a uniprocessor machine

Saturday, April 14th, 2007

It is a known fact that hidden race conditions in code are manifested more frequently on a multiprocessor machine than on a uniprocessor machine. I was trying to create an example to illustrate this and wrote the following code which was motivated by the similar kernel level code and the discussion on Russian Software Development Network forum:

volatile bool b;
void thread_true(void *)
{
 while(true)
 {
  b = true;
 }
}
void thread_false(void *)
{
 while(true)
 {
  b = false;
 }
}
int _tmain(int argc, _TCHAR* argv[])
{
 _beginthread(thread_true, 0, NULL);
 _beginthread(thread_false, 0, NULL);
 while(true)
 {
  assert (b == false || b == true);
 }
 return 0;
}

The program has three threads. Two of them are trying to set the same boolean variable b to different values and the main thread checks that its value is either true or false. The assertion should fail in the following scenario: the first thread (thread_true) sets b variable to true value so the first comparison in assertion fails and we expect the second comparison to succeed but the main thread is preempted by the second thread (thread_false) that sets that value to false and therefore the second comparison fails too. We get an assertion dialog in debug build showing that boolean variable b is neither true nor false!

I compiled and ran that program and it wasn’t failing for hours on my uniprocessor laptop. On a multiprocessor machine it was failing in a couple of minutes. If we look at assertion assembly language code we would see that it is very short so statistically speaking the chances that our main thread is preempted in the middle of the assertion are very low. This is because on a uniprocessor machine two threads are running not in parallel but until their quantum is expired. So we should make the assertion code longer to exceed the quantum. To simulate this I added a call to SwitchToThread API. When the assertion code yields execution to another thread then perhaps that thread would be thread_false and as soon as it is preempted by main thread again we get the assertion failure:

volatile bool b;
bool SlowOp()
{
 SwitchToThread();
 return false;
}
void thread_true(void *)
{
 while(true)
 {
  b = true;
 }
}
void thread_false(void *)
{
 while(true)
 {
  b = false;
 }
}
int _tmain(int argc, _TCHAR* argv[])
{
 _beginthread(thread_true, 0, NULL);
 _beginthread(thread_false, 0, NULL);
 while(true)
 {
  assert (b == false || SlowOp() || b == true);
 }
 return 0;
}

I compiled and ran the program again and I couldn’t see any failure for a long time. It looks like thread_false is always running before the main thread and when the main thread is running then due to short-circuit operator || evaluation rule we don’t have a chance to execute SlowOp(). Then I added a fourth thread called thread_true_2 to make the number of threads setting b variable to true value as twice as many as the number of threads setting b variable to false value (2 to 1) so we could have more chances to set b variable to true value before executing the assertion:

volatile bool b;
bool SlowOp()
{
 SwitchToThread();
 return false;
}
void thread_true(void *)
{
 while(true)
 {
  b = true;
 }
}
void thread_true_2(void *)
{
 while(true)
 {
  b = true;
 }
}
void thread_false(void *)
{
 while(true)
 {
  b = false;
 }
}
int _tmain(int argc, _TCHAR* argv[])
{
 _beginthread(thread_true, 0, NULL);
 _beginthread(thread_false, 0, NULL);
 _beginthread(thread_true_2, 0, NULL);
 while(true)
 {
  assert (b == false || SlowOp() || b == true);
 }
 return 0;
}

Now when I ran the new program I got the assertion failure in a couple of minutes! It is hard to make race conditions manifest themselves on a uniprocessor machine.

- Dmitry Vostokov -

Yet another look at Zw* and Nt* functions

Tuesday, April 10th, 2007

While reading the new book “Professional Rootkits” by Ric Vieler I encountered the following macro definition to get function index in system service table:

#define HOOK_INDEX(function2hook) *(PULONG)((PUCHAR)function2hook+1)

Couldn’t understand the code until looked at disassembly of a typical ntdll!Zw and nt!Zw function (x86 W2K3):

lkd> u ntdll!ZwCreateProcess
ntdll!NtCreateProcess:
7c821298 b831000000      mov     eax,31h
7c82129d ba0003fe7f      mov     edx,offset SharedUserData!SystemCallStub (7ffe0300)
7c8212a2 ff12            call    dword ptr [edx]
7c8212a4 c22000          ret     20h
7c8212a7 90              nop
ntdll!ZwCreateProcessEx:
7c8212a8 b832000000      mov     eax,32h
7c8212ad ba0003fe7f      mov     edx,offset SharedUserData!SystemCallStub (7ffe0300)
7c8212b2 ff12            call    dword ptr [edx]

lkd> u nt!ZwCreateProcess
nt!ZwCreateProcess:
8083c2a3 b831000000      mov     eax,31h
8083c2a8 8d542404        lea     edx,[esp+4]
8083c2ac 9c              pushfd
8083c2ad 6a08            push    8
8083c2af e8c688ffff      call    nt!KiSystemService (80834b7a)
8083c2b4 c22000          ret     20h
nt!ZwCreateProcessEx:
8083c2b7 b832000000      mov     eax,32h
8083c2bc 8d542404        lea     edx,[esp+4]

You can notice that user space ntdll!Nt and ntdll!Zw variants are the same. This is not the case in kernel space:

lkd> u nt!NtCreateProcess
nt!NtCreateProcess:
808f80ea 8bff            mov     edi,edi
808f80ec 55              push    ebp
808f80ed 8bec            mov     ebp,esp
808f80ef 33c0            xor     eax,eax
808f80f1 f6451c01        test    byte ptr [ebp+1Ch],1
808f80f5 0f8549d10600    jne     nt!NtCreateProcess+0xd (80965244)
808f80fb f6452001        test    byte ptr [ebp+20h],1
808f80ff 0f8545d10600    jne     nt!NtCreateProcess+0×14 (8096524a)

nt!Zw functions are dispatched through service table. nt!Nt functions are actual code. 

For completeness let’s look at AMD x64 W2K3. User space x64 call:

0:001> u ntdll!ZwCreateProcess
ntdll!NtCreateProcess:
00000000`78ef1ab0 4c8bd1          mov     r10,rcx
00000000`78ef1ab3 b882000000      mov     eax,82h
00000000`78ef1ab8 0f05            syscall
00000000`78ef1aba c3              ret
00000000`78ef1abb 666690          xchg    ax,ax
00000000`78ef1abe 6690            xchg    ax,ax
ntdll!NtCreateProfile:
00000000`78ef1ac0 4c8bd1          mov     r10,rcx
00000000`78ef1ac3 b883000000      mov     eax,83h

User space x86 call in x64 W2K3:

0:001> u ntdll!ZwCreateProcess
ntdll!ZwCreateProcess:
7d61d428 b882000000      mov     eax,82h
7d61d42d 33c9            xor     ecx,ecx
7d61d42f 8d542404        lea     edx,[esp+4]
7d61d433 64ff15c0000000  call    dword ptr fs:[0C0h]
7d61d43a c22000          ret     20h
7d61d43d 8d4900          lea     ecx,[ecx]
ntdll!ZwCreateProfile:
7d61d440 b883000000      mov     eax,83h
7d61d445 33c9            xor     ecx,ecx

Kernel space call in x64 W2K3:

kd> u nt!ZwCreateProcess nt!ZwCreateProcess+20
nt!ZwCreateProcess:
fffff800`0103dd70 488bc4          mov     rax,rsp
fffff800`0103dd73 fa              cli
fffff800`0103dd74 4883ec10        sub     rsp,10h
fffff800`0103dd78 50              push    rax
fffff800`0103dd79 9c              pushfq
fffff800`0103dd7a 6a10            push    10h
fffff800`0103dd7c 488d057d380000  lea     rax,[nt!KiServiceLinkage (fffff800`01041600)]
fffff800`0103dd83 50              push    rax
fffff800`0103dd84 b882000000      mov     eax,82h
fffff800`0103dd89 e972310000      jmp     nt!KiServiceInternal (fffff800`01040f00)
fffff800`0103dd8e 6690            xchg    ax,ax

kd> u nt!NtCreateProcess
nt!NtCreateProcess:
fffff800`01245832 53               push    rbx
fffff800`01245833 4883ec50         sub     rsp,50h
fffff800`01245837 4c8b9c2488000000 mov     r11,qword ptr [rsp+88h]
fffff800`0124583f b801000000       mov     eax,1
fffff800`01245844 488bd9           mov     rbx,rcx
fffff800`01245847 488b8c2490000000 mov     rcx,qword ptr [rsp+90h]
fffff800`0124584f 41f6c301         test    r11b,1
fffff800`01245853 41ba00000000     mov     r10d,0

Here is the same as in kernel x86: Zw functions are dispatched but Nt functions are actual code. If you want to remember which function variant is dispatched and which is actual code I propose the mnemonic “Z-dispatch”.

- Dmitry Vostokov -

Programmer Universalis

Monday, April 9th, 2007

Just a short observation: it’s very good to be able to understand and even write everything from GUI down to machine language instructions or up. Certainly understanding how software works at every level is very helpful in memory dump analysis. Seeing thread stacks in memory dumps helps in understanding software. The more you know the better you are at dump analysis and debugging. Debugging is not about stepping through the code. This is a very narrow view of a specialist programmer. Programmer Universalis can do debugging at every possible level and therefore can write any possible software layer.

- Dmitry Vostokov @ DumpAnalysis.org

Analyzing Dr. Watson logs

Sunday, April 8th, 2007

The main problem with Dr. Watson logs is lack of symbol information but this can be alleviated by using WinDbg if you have the same binary that crashed and produced the log entry. I’m going to illustrate this by using TestDefaultDebugger tool. Its main purpose is to crash. I use this tool here just to show you how to reconstruct stack trace.

If you run it and Dr. Watson is your default postmortem debugger you will get this event recoded in your Dr. Watson log:

*** ERROR: Module load completed but symbols could not be loaded for C:\Work\TestDefaultDebugger.exe
function: TestDefaultDebugger
        004014e6 cc              int     3
        004014e7 cc              int     3
        004014e8 cc              int     3
        004014e9 cc              int     3
        004014ea cc              int     3
        004014eb cc              int     3
        004014ec cc              int     3
        004014ed cc              int     3
        004014ee cc              int     3
        004014ef cc              int     3
FAULT ->004014f0 c7050000000000000000 mov dword ptr ds:[0],0  ds:0023:00000000=????????
        004014fa c3              ret
        004014fb cc              int     3
        004014fc cc              int     3
        004014fd cc              int     3
        004014fe cc              int     3
        004014ff cc              int     3
        00401500 0fb7542404      movzx   edx,word ptr [esp+4]
        00401505 89542404        mov     dword ptr [esp+4],edx
        00401509 e98e1c0000      jmp     TestDefaultDebugger+0×319c (0040319c)
        0040150e cc              int     3
*—-> Stack Back Trace <----*
*** ERROR: Symbol file could not be found.  Defaulted to export symbols for C:\WINDOWS\system32\ntdll.dll -
ChildEBP RetAddr  Args to Child
WARNING: Stack unwind information not available. Following frames may be wrong.
TestDefaultDebugger+0x14f0
TestDefaultDebugger+0x3470
TestDefaultDebugger+0x2a27
TestDefaultDebugger+0x8e69
TestDefaultDebugger+0x98d9
TestDefaultDebugger+0x6258
TestDefaultDebugger+0x836d

You see that when the log entry was saved there were no symbols available and this is the most common case. If you have such a log and no corresponding user dump (perhaps it was overwritten) then you can still reconstruct stack trace. To do this run WinDbg, set path to your application symbol files and load your application as a crash dump:

Microsoft (R) Windows Debugger  Version 6.6.0007.5
Copyright (c) Microsoft Corporation. All rights reserved.
Loading Dump File [C:\Work\TestDefaultDebugger.exe]
Symbol search path is: SRV*c:\websymbols*http://msdl.microsoft.com/download/symbols;c:\work
Executable search path is:
ModLoad: 00400000 00435000   C:\Work\TestDefaultDebugger.exe
eax=00000000 ebx=00000000 ecx=00000000 edx=00000000 esi=00000000 edi=00000000
eip=0040e8bb esp=00000000 ebp=00000000 iopl=0         nv up di pl nz na po nc
cs=0000  ss=0000  ds=0000  es=0000  fs=0000  gs=0000             efl=00000000
TestDefaultDebugger!wWinMainCRTStartup:
0040e8bb e876440000      call    TestDefaultDebugger!__security_init_cookie (00412d36)

Now use ln command to find the nearest symbol:

0:000> ln TestDefaultDebugger+0×14f0
c:\testdefaultdebugger\testdefaultdebuggerdlg.cpp(155)
(004014f0)   TestDefaultDebugger!CTestDefaultDebuggerDlg::OnBnClickedButton1   |  (00401500)   TestDefaultDebugger!CDialog::Create
Exact matches:
    TestDefaultDebugger!CTestDefaultDebuggerDlg::OnBnClickedButton1 (void)
0:000> ln TestDefaultDebugger+0×3470
f:\rtm\vctools\vc7libs\ship\atlmfc\src\mfc\cmdtarg.cpp(381)+0×18
(00403358)   TestDefaultDebugger!CCmdTarget::OnCmdMsg+0×118   |  (00403472)   TestDefaultDebugger!CCmdTarget::IsInvokeAllowed
0:000> ln TestDefaultDebugger+0×2a27
f:\rtm\vctools\vc7libs\ship\atlmfc\src\mfc\dlgcore.cpp(85)+0×17
(00402a0c)   TestDefaultDebugger!CDialog::OnCmdMsg+0×1b   |  (00402a91)   TestDefaultDebugger!CDialog::`scalar deleting destructor’
0:000> ln TestDefaultDebugger+0×8e69
f:\rtm\vctools\vc7libs\ship\atlmfc\src\mfc\wincore.cpp(2299)+0xd
(00408dd9)   TestDefaultDebugger!CWnd::OnCommand+0×90   |  (00408e70)   TestDefaultDebugger!CWnd::OnNotify
0:000> ln TestDefaultDebugger+0×98d9
f:\rtm\vctools\vc7libs\ship\atlmfc\src\mfc\wincore.cpp(1755)+0xe
(004098a3)   TestDefaultDebugger!CWnd::OnWndMsg+0×36   |  (00409ecf)   TestDefaultDebugger!CWnd::ReflectChildNotify
0:000> ln TestDefaultDebugger+0×6258
f:\rtm\vctools\vc7libs\ship\atlmfc\src\mfc\wincore.cpp(1741)+0×17
(00406236)   TestDefaultDebugger!CWnd::WindowProc+0×22   |  (0040627a)   TestDefaultDebugger!CTestCmdUI::CTestCmdUI
0:000> ln TestDefaultDebugger+0×836d
f:\rtm\vctools\vc7libs\ship\atlmfc\src\mfc\wincore.cpp(243)
(004082d3)   TestDefaultDebugger!AfxCallWndProc+0×9a   |  (004083c0)   TestDefaultDebugger!AfxWndProc

So we reconstructed the stack trace:

TestDefaultDebugger!CTestDefaultDebuggerDlg::OnBnClickedButton1
TestDefaultDebugger!CCmdTarget::OnCmdMsg+0×118
TestDefaultDebugger!CDialog::OnCmdMsg+0×1b
TestDefaultDebugger!CWnd::OnCommand+0×90
TestDefaultDebugger!CWnd::OnWndMsg+0×36
TestDefaultDebugger!CWnd::WindowProc+0×22
TestDefaultDebugger!AfxCallWndProc+0×9a

To check it we disassemble the top and see that it corresponds to our crash point from Dr. Watson log:

0:000> u TestDefaultDebugger!CTestDefaultDebuggerDlg::OnBnClickedButton1
TestDefaultDebugger!CTestDefaultDebuggerDlg::OnBnClickedButton1 [c:\testdefaultdebugger\testdefaultdebuggerdlg.cpp @ 155]:
004014f0 c7050000000000000000 mov dword ptr ds:[0],0
004014fa c3              ret
004014fb cc              int     3
004014fc cc              int     3
004014fd cc              int     3
004014fe cc              int     3
004014ff cc              int     3

Although I haven’t tried it yet I believe you can also apply this technique to old Windows 98 or Windows Me Dr. Watson logs.

- Dmitry Vostokov -

ScreenHistory 1.0

Sunday, April 8th, 2007

After working with many customer issues where I needed good screenshots I decided to write a screen or window capture tool to make troubleshooting and reading other logs/traces easier. Here is ScreenHistory tool with familiar History-like GUI interface if you have seen WindowHistory, MessageHistory and ProcessHistory tools.

The tool captures the whole screen (currently the primary monitor) after specified interval (default is 1 second) or the contents of a current foreground window (multi-monitor independent) and saves its screenshot in JPEG, GIF (default) or PNG file. Additionally an HTML file is generated with links to screenshots. New forthcoming versions of WindowHistory and MessageHistory will reference these screenshots. Windows Mobile version will be released soon too.

Instead of forming a mental picture about screen when you look at messages or relating them to arbitrary screenshots sent by your customers you can easily check real-time screenshots when you look at message traces, for example, MessageHistory trace:

13:12:24:944 S WM_ACTIVATEAPP (0x1c) wParam: 0x0 lParam: 0x12ec Deactivated / TID of activated window: 0x12ec

[Screen]
13:12:47:268 S WM_ACTIVATEAPP (0×1c) wParam: 0×1 lParam: 0×0 Activated / TID of deactivated window: 0×0

[Screen]

or WindowHistory trace

Handle: 000300E4 Class: "MyClass" Title: "My Application"
Captured at: 13:11:47:983
Process ID: 6c4
Thread ID: 1054
Parent: 0
Screen position (l,t,r,b): (264,161,1032,691)
Visible: true
Window placement command: SW_SHOWNORMAL
Foreground: false
Foreground changed at 13:12:20:626 to true
[Screen]
Foreground changed at 13:12:24:959 to false
[Screen]
Foreground changed at 13:12:47:284 to true
[Screen]
Foreground changed at 13:12:51:852 to false
[Screen]

The following ScreenHistory screenshot was saved by the tool itself:

If you save an HTML file and load it in IE you would see formatted screen log (screenshot was saved by ScreenHistory):

- Dmitry Vostokov @ DumpAnalysis.org -

Upgrading Dr. Watson

Saturday, April 7th, 2007

I’ve been using NTSD as a default debugger on my laptop for a while and decided to revert it to Dr. Watson to get a couple of logs. Unfortunately Dr. Watson itself crashed in dbghelp.dll. Loading drwtsn32.exe dump reveals that it depends on both dbghelp.dll and dbgeng.dll. I tried to replace these DLLs with newer versions from the latest Debugging Tools for Windows and found that this change in system32 folder is immediately reverted back to original file versions. Instead of battling against Windows I decided to create a completely separate Dr. Watson folder and copy drwtsn32.exe, the latest dbghelp.dll and dbgeng.dll from Debugging Tools for Windows there. Then I altered “Debugger” value under the following key to include the full path to drwtsn32.exe:

HKEY_LOCAL_MACHINE\SOFTWARE\Microsoft\Windows NT\CurrentVersion\AeDebug
Debugger=c:\drwatson\drwtsn32 -p %ld -e %ld -g
 

This solved the problem. Dr. Watson now uses the latest debugging engine to save dumps and logs.

- Dmitry Vostokov -