Archive for August, 2009

Bugtation No.101

Monday, August 31st, 2009

Most bugs are permanent.

Dmitry Vostokov, Variation on a theme “Most objects are temporary”

- Dmitry Vostokov @ -

10 Common Mistakes in Memory Analysis (Part 5)

Monday, August 31st, 2009

Sometimes not paying attention to all aspects of default analysis makes it difficult to consider an alternative troubleshooting hypothesis. Here is a sample of !analyze -v output showing massive patching (hooked functions pattern) by DriverA module:

This is a very common bugcheck.  Usually the exception address pinpoints the driver/function that caused the problem.  Always note this address as well as the link date of the driver/image that contains this address. Some common problems are exception code 0x80000003.  This means a hard coded breakpoint or assertion was hit, but this system was booted /NODEBUG.  This is not supposed to happen as developers should never have hardcoded breakpoints in retail code, but ... If this happens, make sure a debugger gets connected, and the system is booted /DEBUG.  This will let us see why this breakpoint is happening.
Arg1: c0000005, The exception code that was not handled
Arg2: 8092d47f, The address that the exception occurred at
Arg3: f5205b14, Trap Frame
Arg4: 00000000


CHKIMG_EXTENSION: !chkimg -lo 50 -d !nt
    80822a49-80822a4d  5 bytes - nt!NtYieldExecution
 [ 8b ff 55 8b ec:e9 14 3a 95 76 ]
    80823c11-80823c14  4 bytes - nt!KeFlushProcessTb+2c (+0x11c8)
 [ 69 76 82 80:88 ff ff ff ]
    80823c17-80823c1a  4 bytes - nt!KeFlushProcessTb+32 (+0x06)
 [ dd 40 01 00:b5 34 b3 76 ]
    8083771f-80837725  7 bytes - nt!KeAcquireQueuedSpinLockAtDpcLevel+1b (+0x13b08)
 [ f7 41 04 01 00 00 00:e9 c4 f9 b1 76 cc cc ]
    80840945-8084094a  6 bytes - nt!KxFlushEntireTb+9 (+0x9226)
 [ ff 15 1c 10 80 80:e9 65 67 b1 76 cc ]
    80845fe0-80845fe3  4 bytes - nt!KeFlushSingleTb+49 (+0x569b)
 [ 14 1d ff ff:dd 10 b1 76 ]
    80845fe5 - nt!KeFlushSingleTb+4e (+0x05)
 [ b9:c3 ]
    8084722d-80847230  4 bytes - nt!KeFlushMultipleTb+45 (+0x1248)
 [ 5e e3 82 80:14 00 00 00 ]
    80847233-80847236  4 bytes - nt!KeFlushMultipleTb+4b (+0x06)
 [ c1 0a ff ff:99 fe b0 76 ]
    808c039c-808c039e  3 bytes - nt!NtSetContextThread
 [ 8b ff 55:e9 31 5f ]
    808c03a0 - nt!NtSetContextThread+4 (+0x04)
 [ ec:76 ]
    808e3184-808e3188  5 bytes - nt!NtCreateProcess (+0x22de4)
 [ 8b ff 55 8b ec:e9 0b 31 89 76 ]
    808f6ad0-808f6ad6  7 bytes - nt!NtLoadKeyEx (+0x1394c)
 [ 6a 70 68 98 4b 81 80:e9 e7 f8 87 76 90 90 ]
    8090c66f-8090c675  7 bytes - nt!NtDeleteValueKey (+0x15b9f)
 [ 6a 44 68 60 f0 81 80:e9 c4 9c 86 76 90 90 ]
    8090e36c-8090e370  5 bytes - nt!NtTerminateProcess (+0x1cfd)
 [ 8b ff 55 8b ec:e9 34 81 86 76 ]
    80915342-80915346  5 bytes - nt!NtDeleteKey (+0x6fd6)
 [ 8b ff 55 8b ec:e9 c7 0f 86 76 ]
    80918114-80918118  5 bytes - nt!NtOpenThread (+0x2dd2)
 [ 68 c4 00 00 00:e9 53 e1 85 76 ]
    80921eac-80921eb2  7 bytes - nt!NtEnumerateKey (+0x9d98)
 [ 6a 48 68 f0 f9 82 80:e9 f5 44 85 76 90 90 ]
    80922578-8092257e  7 bytes - nt!NtEnumerateValueKey (+0x6cc)
 [ 6a 48 68 10 fc 82 80:e9 13 3e 85 76 90 90 ]
    80922efd-80922f01  5 bytes - nt!NtNotifyChangeKey (+0x985)
 [ 8b ff 55 8b ec:e9 e4 34 85 76 ]
    809246fb-809246ff  5 bytes - nt!NtOpenProcess (+0x17fe)
 [ 68 c8 00 00 00:e9 58 1b 85 76 ]
    8092c8a0-8092c8a4  5 bytes - nt!NtCreateKey (+0x81a5)
 [ 68 c0 00 00 00:e9 55 9a 84 76 ]
    8092f3a6-8092f3ac  7 bytes - nt!NtSetValueKey (+0x2b06)
 [ 6a 58 68 a0 f6 82 80:e9 a3 6f 84 76 90 90 ]
    8092fa88-8092fa8c  5 bytes - nt!NtCreateFile (+0x6e2)
 [ 8b ff 55 8b ec:e9 ab 69 84 76 ]
    80931311-80931315  5 bytes - nt!NtOpenKey (+0x1889)
 [ 68 ac 00 00 00:e9 d0 4f 84 76 ]
    809316ed-809316f3  7 bytes - nt!NtQueryValueKey (+0x3dc)
 [ 6a 60 68 80 90 84 80:e9 72 4c 84 76 90 90 ]
    8093470f-80934715  7 bytes - nt!NtQueryKey (+0x3022)
 [ 6a 58 68 c8 97 84 80:e9 0e 1d 84 76 90 90 ]
    809354fa-80935500  7 bytes - nt!NtMapViewOfSection (+0xdeb)
 [ 6a 38 68 80 a2 84 80:e9 77 0f 84 76 90 90 ]
    80935785-80935789  5 bytes - nt!NtUnmapViewOfSection (+0x28b)
 [ 8b ff 55 8b ec:e9 02 0d 84 76 ]
    8093ba96-8093ba9c  7 bytes - nt!NtProtectVirtualMemory (+0x6311)
 [ 6a 44 68 40 03 85 80:e9 b1 a9 83 76 90 90 ]
    8093c86d-8093c871  5 bytes - nt!NtSetInformationProcess (+0xdd7)
 [ 68 08 01 00 00:e9 4c 9a 83 76 ]
    8093ce6b-8093ce71  7 bytes - nt!NtCreateProcessEx (+0x5fe)
 [ 6a 0c 68 58 0e 85 80:e9 38 94 83 76 90 90 ]
    80978fef-80978ff5  7 bytes - nt!NtQueryMultipleValueKey (+0x3c184)
 [ 6a 48 68 f0 f9 86 80:e9 86 d3 7f 76 90 90 ]
    80979775-8097977b  7 bytes - nt!NtRenameKey (+0x786)
 [ 6a 3c 68 38 fa 86 80:e9 a8 cb 7f 76 90 90 ]
    80979caf-80979cb3  5 bytes - nt!NtRestoreKey (+0x53a)
 [ 8b ff 55 8b ec:e9 46 c7 7f 76 ]
    8097a11c-8097a120  5 bytes - nt!NtUnloadKey (+0x46d)
 [ 8b ff 55 8b ec:e9 b1 c2 7f 76 ]
    8097a139-8097a13d  5 bytes - nt!NtReplaceKey (+0x1d)
 [ 8b ff 55 8b ec:e9 d0 c2 7f 76 ]
197 errors : !nt (80822a49-8097a13d)


IMAGE_NAME:  DriverA.sys




However, when we look at the stack trace, we would see that BSOD happened when accessing registry while updating drivers:

8092d47f 8b4604          mov     eax,dword ptr [esi+4]

TRAP_FRAME:  f5205b14 -- (.trap 0xfffffffff5205b14)
ErrCode = 00000000
eax=e1021000 ebx=e101a3b8 ecx=00000003 edx=89214988 esi=00000100 edi=00000000
eip=8092d47f esp=f5205b88 ebp=f5205bfc iopl=0         nv up ei pl nz na pe nc
cs=0008  ss=0010  ds=0023  es=0023  fs=0030  gs=0000             efl=00010206
8092d47f 8b4604          mov     eax,dword ptr [esi+4] ds:0023:00000104=????????
Resetting default scope

PROCESS_NAME:  updatedrivers.exe

f52056e0 8085bb9f 0000008e c0000005 8092d47f nt!KeBugCheckEx+0x1b
f5205aa4 808346b4 f5205ac0 00000000 f5205b14 nt!KiDispatchException+0x3a2
f5205b0c 80834668 f5205bfc 8092d47f badb0d00 nt!CommonDispatchException+0x4a
f5205b98 8092d559 e101a3b8 e63a8e40 0010fc18 nt!Kei386EoiHelper+0x186
f5205bfc 80920fcd e101a3b8 00610052 3b9aca07 nt!HvpGetCellMapped+0×36a
f5205c20 8092248b e63a8e40 e22b4794 00000000 nt!CmpGetValueKeyFromCache+0xa4
f5205cc0 80922649 e63a8e40 00000000 00000001 nt!CmEnumerateValueKey+0×45a
f5205d44 80833bdf 00000058 00000000 00000001 nt!NtEnumerateValueKey+0×1c9
f5205d44 7c9485ec 00000058 00000000 00000001 nt!KiFastCallEntry+0xfc
WARNING: Frame IP not in any known module. Following frames may be wrong.
001290fc 00000000 00000000 00000000 00000000 0×7c9485ec

So an alternative hypothesis to pursue would be some sort of registry corruption after driver updates.

- Dmitry Vostokov @ -

x64 book becomes a debugging bestseller

Wednesday, August 19th, 2009

Shortly after being published, x64 Windows Debugging: Practical Foundations book rises to the top of Amazon debugging bestesellers list (on 22:30 19.08.09):

- Dmitry Vostokov @ -

Free Stack Traces

Tuesday, August 18th, 2009

By analogy with the free verse and the anthropologist John Tedlock’s written narratives of Native American Zuni where different font size was used for different levels I tried today the similar technique with a raw stack data from the previous case study of registry corruption:

    f690a3dc f7a21a06 BOOTVID!ReadWriteMode+0×42
    f690a3e0 f7a219a7 BOOTVID!__outpw+0×17
    f690a3ec f7a21a76 BOOTVID!SetPixel+0×6a
    f690a404 f7a21c1b BOOTVID!DisplayCharacter+0×47
    f690a420 b42e14db dump_iaStor+0×3a4db
    f690a468 b4364080 dump_iaStor+0xbd080
    f690a480 f6249983 ati2mtag+0×1b6983
    f690a488 804f2ee6 nt!IopWritePageToDisk+0xe4
    f690a4e0 804f2fb6 nt!IopWriteSummaryDump+0×7e
    f690a4e4 b42e12d8 dump_iaStor+0×3a2d8
    f690a50c 804f3c8d nt!IoWriteCrashDump+0×42d
    f690a514 b42e12d8 dump_iaStor+0×3a2d8
    f690a584 804f8fa7 nt!KiDumpParameterImages+0×5f
    f690a594 f74764bb sptd+0×664bb
    f690a598 f74764a0 sptd+0×664a0
    f690a59c b42e162a dump_iaStor+0×3a62a
    f690a5a8 f7a22394 BOOTVID!PreserveRow+0×7c
    f690a5c0 b42e12d8 dump_iaStor+0×3a2d8
    f690a5cc 804f9ecd nt!KeBugCheck2+0xa4d
    f690a6e0 804f9f43 nt!KeBugCheckEx+0×1b
    f690a950 80545d00 nt!KiSwapProcess+0×60
    f690a9a0 80522d45 nt!MiDecrementReferenceCount+0×65
    f690a9ac 805067ea nt!MiDeferredUnlockPages+0×1c8

f690a9c8 804f9f43 nt!KeBugCheckEx+0×1b
f690a9e8 80548c2d nt!MiFreePoolPages+0×8b
    f690aa04 80564d20 nt!NonPagedPoolDescriptor
f690aa28 8054b49a nt!ExFreePoolWithTag+0×1ba
    f690aa3c 8062bc17 nt!CmpPinCmView+0xab
    f690aa5c 80637e13 nt!HvpDelistBinFreeCells+0xad

f690aa68 8063bf19 nt!CmpFree+0×17
f690aa78 8063eb20 nt!HvpRecoverData+0×3ec
f690aad4 8063ef05 nt!HvMapHive+0×133
    f690ab10 80539ac0 nt!_except_handler3
    f690ab14 804e0e38 nt!`string’+0×258

f690ab20 8063087e nt!HvInitializeHive+0×416
f690ab38 806383a9 nt!CmpInitializeHive+0×26d
    f690ab54 8063bf02 nt!CmpFree
    f690ab58 8063b918 nt!CmpFileSetSize
    f690ab5c 8063c466 nt!CmpFileWrite
    f690ab60 8063c33e nt!CmpFileRead
    f690ab64 8063c1fc nt!CmpFileFlush

f690aba4 80625bf9 nt!CmpInitHiveFromFile+0xa3
f690abfc 8062ad8b nt!CmpCmdHiveOpen+0×21
f690ac24 80631f24 nt!CmLoadKey+0×90
    f690ac98 80622053 nt!CmConvertHandleToKernelHandle+0×55
f690acb0 806257b4 nt!NtLoadKey2+0×1fc
    f690acc8 806259ac nt!NtLoadKey
    f690acd8 805bc33f nt!ObpCloseHandleTableEntry+0×14d
    f690ad24 805bc401 nt!ObpCloseHandle+0xab
    f690ad34 80539ac0 nt!_except_handler3
    f690ad38 804e0bd0 nt!`string’+0×364

f690ad44 806259be nt!NtLoadKey+0×12
f690ad58 8054162c nt!KiFastCallEntry+0xfc
    f690ade0 805460ee nt!KiThreadStartup+0×16
    f690ade4 80626dee nt!CmpLoadHiveThread
    f690aec0 bf875fb4 win32k!WatchdogDrvStretchBlt+0×92
    f690aee4 bf988527 win32k!_except_handler3
    f690aee8 bf995f40 win32k!`string’+0×124
    f690aef0 bf875fb4 win32k!WatchdogDrvStretchBlt+0×92
    f690aef4 bf873ec2 win32k!EngStretchBltROP+0×3a9

where the larger font size indicates the stack trace from kv command and the smaller font size indicates symbolic information found between call frames that may or may not correspond to partial stack traces left from intermediate nested function calls of the current call sequence or past stack traces and their frames.

- Dmitry Vostokov @ -

Experiments on Poor Bugs

Tuesday, August 18th, 2009

This is the real photo of tracing experiments used for the front cover of September 2009 issue of Debugged! magazine:

- Dmitry Vostokov @ -

September Issue of Debugged! MZ/PE

Tuesday, August 18th, 2009

Finally, the front cover has materialized for this issue of Debugged! magazine:

Debugged! MZ/PE: Software Tracing (ISBN: 978-1906717797)

If you have an article idea or if you’d like to write an article for us please use the following contact form:

The deadline for this issue is 25th of September.

- Dmitry Vostokov @ -

Thinking about New Directions

Monday, August 17th, 2009

This Sunday while drinking Turkish coffee (after 3 years of blogging) I was thinking about the new publishing year starting in September, the new year of reading (already started) and, in particular, about the cover of the forthcoming September issue of Debugged! MZ/PE magazine (to be revealed tonight).

- Dmitry Vostokov @ -

Registry Corruption: A Case Study

Monday, August 17th, 2009

A friend of mine couldn’t start Windows XP on his notebook. As soon as he entered his credentials in a logon window the system experienced a BSOD event. He booted from another media and collected mini-dumps. All of them were consistent in resisting to my attempts to load symbols and modules. Even explicit downloading the symbol package from Microsoft didn’t help. All bugcheck info and stack traces were like this pointing to pool corruption:

0: kd> !analyze -v
The current thread is making a bad pool request. Typically this is at a bad IRQL level or double freeing the same allocation, etc.
Arg1: 00000043, Attempt to free a virtual address which was never in any pool
Arg2: c9c00000, Address being freed.
Arg3: 00000000, 0
Arg4: 00000000, 0

1: kd> kv 100
ChildEBP RetAddr  Args to Child             
WARNING: Stack unwind information not available. Following frames may be wrong.
f6cc09e4 80548c2d 000000c2 00000043 c9c00000 nt+0x22f43
f6cc0a24 8054b49a c9c00000 e2039410 e23fd000 nt+0x71c2d
f6cc0a64 8063bf19 c9c00000 00000000 f6cc0ad0 nt+0x7449a
f6cc0a74 8063eb20 c9c00000 00002000 00000000 nt+0x164f19
f6cc0ad0 8063ef05 e1f6e008 00000000 00000000 nt+0x167b20
f6cc0b1c 8063087e e1f6e008 00000000 00000001 nt+0x167f05
f6cc0b34 806383a9 e1f6e101 00000005 00000000 nt+0x15987e
f6cc0ba0 80625bf9 f6cc0bdc 00000005 00000000 nt+0x1613a9
f6cc0bf8 8062ad8b f6cc0d04 00000000 f6cc0c64 nt+0x14ebf9
f6cc0c20 80631f24 f6cc0ccc f6cc0c6c f6cc0c5c nt+0x153d8b
f6cc0cac 806257b4 f6cc0ce4 f6cc0ccc 00000000 nt+0x15af24
f6cc0d40 806259be 0006dcc4 0006dcac 00000000 nt+0x14e7b4
f6cc0d54 8054162c 0006dcc4 0006dcac 0006dcf0 nt+0x14e9be
f6cc0d64 7c91e514 badb0d00 0006dc98 00000000 nt+0x6a62c
f6cc0d68 badb0d00 0006dc98 00000000 00000000 0x7c91e514
f6cc0d6c 0006dc98 00000000 00000000 00000090 0xbadb0d00
f6cc0d70 00000000 00000000 00000090 000000a4 0x6dc98

Portions of raw stack data available in minidump didn’t have any traces of other modules and drivers except nt:

1: kd> !thread
GetPointerFromAddress: unable to read from 80562134
86485da8: Unable to get thread contents

1: kd> dps f6cc09cc-3000 f6cc09cc+3000
f6cc095c  ????????
f6cc0960  ????????
f6cc0964  ????????
f6cc0968  00000000
f6cc096c  00000000
f6cc0970  003d0058
f6cc0974  f6cc09a8
f6cc0978  00000000
f6cc097c  0000c000
f6cc0980  00000000
f6cc0984  00000000
f6cc0988  8648b4d8
f6cc098c  863eb240
f6cc0990  00000000
f6cc0994  01ffffff
f6cc0998  f6cc093c
f6cc099c  00000000
f6cc09a0  f6cc0a14
f6cc09a4  80539ac0 nt+0x62ac0
f6cc09a8  804d8228 nt+0x1228
f6cc09ac  ffffffff
f6cc09b0  00000002
f6cc09b4  80506653 nt+0x2f653
f6cc09b8  f78a9548
f6cc09bc  c9c00000
f6cc09c0  0000bb40
f6cc0fcc  00000000
f6cc0fd0  00000000
f6cc0fd4  00000000
f6cc0fd8  00000000
f6cc0fdc  00000000
f6cc0fe0  7c91d5aa
f6cc0fe4  7c940574
f6cc0fe8  0015fd80
f6cc0fec  00100020
f6cc0ff0  00000000
f6cc0ff4  00000000
f6cc0ff8  00000000
f6cc0ffc  00000000
f6cc1000  ????????
f6cc1004  ????????
f6cc1008  ????????

So I advised to give me a kernel dump and fortunately there was one available too. It was more amenable for analysis and showed the involvement of registry:

0: kd> kv 100
ChildEBP RetAddr  Args to Child             
f690a9e4 80548c2d 000000c2 00000043 dcf40000 nt!KeBugCheckEx+0x1b
f690aa24 8054b49a dcf40000 e1294410 e17c6000 nt!MiFreePoolPages+0x8b
f690aa64 8063bf19 dcf40000 00000000 f690aad0 nt!ExFreePoolWithTag+0x1ba
f690aa74 8063eb20 dcf40000 00002000 00000000 nt!CmpFree+0×17
f690aad0 8063ef05 e11c4b60 00000000 00000000 nt!HvpRecoverData+0×3ec
f690ab1c 8063087e e11c4b60 00000000 00000001 nt!HvMapHive+0×133
f690ab34 806383a9 e11c4c01 00000005 00000000 nt!HvInitializeHive+0×416
f690aba0 80625bf9 f690abdc 00000005 00000000 nt!CmpInitializeHive+0×26d
f690abf8 8062ad8b f690ad04 00000000 f690ac64 nt!CmpInitHiveFromFile+0xa3
f690ac20 80631f24 f690accc f690ac6c f690ac5c nt!CmpCmdHiveOpen+0×21
f690acac 806257b4 f690ace4 f690accc 00000000 nt!CmLoadKey+0×90
f690ad40 806259be 0006dcc4 0006dcac 00000000 nt!NtLoadKey2+0×1fc
f690ad54 8054162c 0006dcc4 0006dcac 0006dcf0 nt!NtLoadKey+0×12

f690ad54 7c91e514 0006dcc4 0006dcac 0006dcf0 nt!KiFastCallEntry+0xfc
WARNING: Frame IP not in any known module. Following frames may be wrong.
0006dcf0 00000000 00000000 00000000 00000000 0×7c91e514

Examination of parameters on raw stack pointed to a user hive for MyFriend user:

0: kd> dpu f690ace4
f690ace4  00000018
f690ace8  80000ce0
f690acec  f690ad0c “Z\..(”
f690acf0  00000240
f690acf4  00000000
f690acf8  00000000
f690acfc  00660066
f690ad00  00eddea0
f690ad04  00660066
f690ad08  e10b1e60 “\??\C:\Documents and Settings\MyFriend\ntuser.dat”
f690ad14  00000028

So the solution was to log as Administrator and recreate the user.

- Dmitry Vostokov @ -

x64 WDPF book is available on Amazon

Sunday, August 16th, 2009

Finally the book came through the publishing process and is available on Amazon and other bookstores:

x64 Windows Debugging: Practical Foundations

Buy from Amazon

- Dmitry Vostokov @ -

3 Years of Blogging!

Friday, August 14th, 2009

Today I celebrate 3 years of blogging that resulted in 1,430 posts across 8 blogs. I would like to thank everyone for their continuing support!

The updated timeline

This blog post belongs to the 4th year of blogging. 

- Dmitry Vostokov @ -

Assembling code in WinDbg

Thursday, August 13th, 2009

I was recently asked why the following code used byte ptr modifier for MOV instruction when assigning a number to a memory location pointed to by a register:

C/C++ code:

int a;
int *pa = &a;

void foo()
        // ...
        mov eax,   [pa]
        mov [eax], 1
        // ...

Generated x86 assembly language code:

0:000:x86> uf foo
0042d64e c60001 mov byte ptr [eax],1

It looks like by default Visual C++ inline assembler treats MOV as “byte ptr” because it doesn’t know about C or C++ language semantics. Originally I thought that was the sign of a code optimization because the resulted binary code is smaller than the one generated by dword ptr. In order to check that I used a WinDbg command:

0:000> a
77067dfe mov dword ptr [eax], 1
mov dword ptr [eax], 1

0:000> u 77067dfe
77067dfe c70001000000    mov     dword ptr [eax],1
77067e04 0c8b            or      al,8Bh
77067e06 54              push    esp
77067e07 2408            and     al,8
77067e09 c70200000000    mov     dword ptr [edx],0
77067e0f 897a04          mov     dword ptr [edx+4],edi
77067e12 0bff            or      edi,edi
77067e14 741e            je      ntdll!RtlInitString+0×34 (77067e34)

This could be possible because the variable “a” is global, initialized to 0 during the program startup, so it is safe to change just one byte. If “a” was a local variable (on stack) than other 3 bytes of DWORD could contain garbage from the previously used stack memory. However, I noticed that the program was compiled as Debug target with all optimization turned off. If Visual C++ compiler was used it should have assumed that the variable “a” could have been referenced from other compilation units and no longer contained 0 before the assignment in foo function. I recreated the same code in C/C++, built the new Debug executable, and indeed, it used dword ptr instead of byte ptr as expected from C/C++ semantics.

- Dmitry Vostokov @ -

The Debugging Verses (1)

Thursday, August 13th, 2009

Studying poetry and reading books about Stalin certainly influenced this first verse:

Welcome, Doctor DebugLove!
Your name, pronounced, fixes bugs!

- Dmitry Vostokov @ -

Forthcoming Advanced .NET Debugging book

Wednesday, August 12th, 2009

Pre-ordered today on Amazon this forthcoming book:

Advanced .NET Debugging (Addison-Wesley Microsoft Technology Series)

Buy from Amazon

I was able to find TOC on InformIt. Looking forward to reading it. .NET crash dump (mixed managed and unmanaged code) and software trace analysis is a sizable part of my day-to-day activities.

When ordering I recalled that I’m was also working on a .NET debugging and memory dump analysis book:

Unmanaged Code: Escaping the Matrix of .NET

but I had to postpone it due to other commitments. It is now planned for the next year after I accumulate more material and real-world case studies.

Taking the opportunity, I also created a category .NET Debugging where I put some old blog posts and patterns related to managed code.

- Dmitry Vostokov @ -

Einstein’s Mistakes

Wednesday, August 12th, 2009

I finished reading Dirac’s biography The Strangest Man 3 months ago and started to read this book. Its title intrigued me when I was browsing recent physics releases on Amazon and I bought it. It looks to me like the mix of brief biographical notes with explanation of physical theories. Here learning from mistakes undoubtedly helps to understand special and general relativity better. I also liked the short and clear explanation of EPR paradox in just one page, “revisionist” and unusual biographical notes on other scientists and their faults, like Galileo and Newton, and notes about Einstein’s private life. This makes him really human (he was like an ideal scientist from Plato Universe for me before). When I was reading Not Even Wrong and the Trouble With Physics books I thought of the possible “yellow press physics” (which is not bad, and doesn’t mean bad quality for me, I like to read yellow press sometimes and listen to pop music) and one day, at lunch, when reading about Newton madness and other peculiar character traits I thought about “yellow press physics” again. Was the choice of this book hardcover and jacket colors (yellow) made deliberate? Anyway, while approaching the end of the book and reading about how Einstein wasted 20-30 years on his idée fixe unified theories I immediately recalled String Theory, and indeed, the author voiced the same thoughts a few moments later when I turned a page over. I also liked the discussion on how General Relativity might have been discovered if it wasn’t formulated by Einstein. The author tells us that it would have been done via a QFT route. Einstein has fallen in my eyes, and now, after reading this book, he is not quite the hero of science like I imagined before. Nevertheless, his stature from McDonald’s is still on my shelves.

Einstein’s Mistakes: The Human Failings of Genius
Buy from Amazon

I don’t want to repeat Einstein’s mistakes… 

- Dmitry Vostokov @ -

RADII Process Illustrated

Wednesday, August 12th, 2009

Previously introduced RADII software development process acquires definite shape as a product supportability driven software support tools development process. In summary, supportability of a product gives rise to Requirements, they expand into Architecture segments, then into Design segments, then into Implementation segments, and finally, into several Improvement phases. In short, RADII:

Every segment is a separate troubleshooting or debugging tool. All segments share elements of RADII via DebugWare patterns and can be further refined via iterative and incremental SDLC if needed.

- Dmitry Vostokov @ -

Stack trace collection, suspended threads, not my version, special process, main thread and blocked LPC chain threads: pattern cooperation

Tuesday, August 11th, 2009

It was reported that one server was hanging during automated reboot. Stack trace collection shows a few suspended and frozen threads. They all belong to the same process, ServiceA:

PROCESS 8545eb18  SessionId: 0  Cid: 0fec    Peb: 7ffd4000  ParentCid: 0fdc
    DirBase: 3fbeb8e0  ObjectTable: e19dd1d0  HandleCount: 169.
    Image: ServiceA.exe

THREAD 859cc900  Cid 0fec.0ff0  Teb: 7ffdf000 Win32Thread: bc1738d0 WAIT: (Unknown) KernelMode Non-Alertable
SuspendCount 1
FreezeCount 1

       859cca90  Semaphore Limit 0×2

THREAD 858c6480  Cid 0fec.0ff4  Teb: 7ffde000 Win32Thread: bc178c40 WAIT: (Unknown) KernelMode Non-Alertable
SuspendCount 1
       f55747d8  SynchronizationEvent

THREAD 859f2338  Cid 0fec.0ff8  Teb: 7ffdd000 Win32Thread: 00000000 WAIT: (Unknown) KernelMode Non-Alertable
SuspendCount 1
FreezeCount 1

       859f24c8  Semaphore Limit 0×2

THREAD 859be2b8  Cid 0fec.0ffc  Teb: 7ffdc000 Win32Thread: bc1915d8 WAIT: (Unknown) KernelMode Non-Alertable
SuspendCount 1
FreezeCount 1

       859be448  Semaphore Limit 0×2


When zooming into this process we see that one thread was processing an exception:

0: kd> .process /r /p 8545eb18
Implicit process is now 8545eb18
Loading User Symbols

0: kd> !process 8545eb18
THREAD 858c6480  Cid 0fec.0ff4  Teb: 7ffde000 Win32Thread: bc178c40 WAIT: (Unknown) KernelMode Non-Alertable
SuspendCount 1
    f55747d8  SynchronizationEvent
Not impersonating
DeviceMap                 e10008e8
Owning Process            8545eb18       Image:         ServiceA.exe
Attached Process          N/A            Image:         N/A
Wait Start TickCount      6927           Ticks: 89866 (0:00:23:24.156)
Context Switch Count      156                 LargeStack
UserTime                  00:00:00.031
KernelTime                00:00:00.000
Win32 Start Address 0x611054cb
Start Address kernel32!BaseThreadStartThunk (0x7c8217ec)
Stack Init f5575000 Current f557471c Base f5575000 Limit f5571000 Call 0
Priority 10 BasePriority 8 PriorityDecrement 0
ChildEBP RetAddr 
f5574734 80833ec5 nt!KiSwapContext+0x26
f5574760 80829c14 nt!KiSwapThread+0x2e5
f55747a8 809a25c8 nt!KeWaitForSingleObject+0x346
f5574888 809a3739 nt!DbgkpQueueMessage+0x178
f55748ac 809a386e nt!DbgkpSendApiMessage+0x45
f5574938 8082d7ec nt!DbgkForwardException+0x90
f5574cf4 8088bed2 nt!KiDispatchException+0×1ea
f5574d5c 8088be86 nt!CommonDispatchException+0×4a
f5574da0 7c829c3a nt!Kei386EoiHelper+0×186
f5574dd0 00000000 kernel32!LoadResource+0×5d

We zoom into its parameters in search of semantically consistent output of .exr, .cxr and .trap commands:

0: kd> .thread 858c6480
Implicit thread is now 858c6480

0: kd> kv 100
ChildEBP RetAddr  Args to Child             
f5574734 80833ec5 858c6480 858c6528 00000200 nt!KiSwapContext+0x26
f5574760 80829c14 00000000 858c6480 f55747d0 nt!KiSwapThread+0x2e5
f55747a8 809a25c8 f55747d8 00000000 00000000 nt!KeWaitForSingleObject+0x346
f5574888 809a3739 8545eb18 00000000 f55748c0 nt!DbgkpQueueMessage+0x178
f55748ac 809a386e f55748c0 00000001 f5574d64 nt!DbgkpSendApiMessage+0x45
f5574938 8082d7ec f5574d10 00000001 00000000 nt!DbgkForwardException+0x90
f5574cf4 8088bed2 f5574d10 00000000 f5574d64nt!KiDispatchException+0×1ea
f5574d5c 8088be86 005bf4b4 61213267 badb0d00 nt!CommonDispatchException+0×4a
f5574da0 7c829c3a 71c22898 00000001 ffffffff nt!Kei386EoiHelper+0×186
f5574dd0 00000000 005bf448 00000023 00000000 kernel32!LoadResource+0×5d

After probing parameters for KiDispatchException we get these results pointing to ModuleA:

0: kd> .exr f5574d10
ExceptionAddress: 61213267 (ModuleA!GetData+0×0000b57f)
   ExceptionCode: c0000005 (Access violation)
  ExceptionFlags: 00000000
NumberParameters: 2
   Parameter[0]: 00000000
   Parameter[1]: 71c22898
Attempt to read from address 71c22898

0: kd> .trap f5574d64
ErrCode = 00000004
eax=71c22898 ebx=0073a7a8 ecx=7c829c3a edx=71c1c000 esi=00000104 edi=00000000
eip=61213267 esp=005bf448 ebp=005bf4b4 iopl=0 nv up ei pl nz na po nc
cs=001b ss=0023 ds=0023 es=0023 fs=003b gs=0000 efl=00000202
001b:61213267 0fb700 movzx   eax,word ptr [eax]  ds:0023:71c22898=????

We check its data using lmv WinDbg command and find out that it is old and needs to be updated. But we don’t stop our investigation here. The fact that ServiceA was suspended means that it was probably being debugged or memory dumped. And indeed, we see NTSD in a process list:

0: kd> !process 0 0
PROCESS 8619d5d0  SessionId: none  Cid: 0004    Peb: 00000000  ParentCid: 0000
    DirBase: 3fbeb020  ObjectTable: e1001e08  HandleCount: 1651.
    Image: System

PROCESS 85e95d88  SessionId: none  Cid: 019c    Peb: 7ffdf000  ParentCid: 0004
    DirBase: 3fbeb040  ObjectTable: e16d5f18  HandleCount:  19.
    Image: smss.exe

PROCESS 85e4fd88  SessionId: 0  Cid: 01cc    Peb: 7ffd4000  ParentCid: 019c
    DirBase: 3fbeb060  ObjectTable: e1561d70  HandleCount: 907.
    Image: csrss.exe

PROCESS 85e42d88  SessionId: 0  Cid: 01e4    Peb: 7ffde000  ParentCid: 019c
    DirBase: 3fbeb080  ObjectTable: e16a97b0  HandleCount: 504.
    Image: winlogon.exe


PROCESS 85a4dd18  SessionId: 0  Cid: 0fdc    Peb: 7ffda000  ParentCid: 0214
    DirBase: 3fbeb520  ObjectTable: e1aa5b38  HandleCount: 121.
    Image: ntsd.exe


If we zoom into NTSD process we would see that its main thread was waiting for a console input:

0: kd> !process 0fdc ff
THREAD 859f8768  Cid 0fdc.0fe0  Teb: 7ffdf000 Win32Thread: bc14cb38 WAIT: (Unknown) UserMode Non-Alertable
    859f8954  Semaphore Limit 0x1
Waiting for reply to LPC MessageId 00001f98:
Current LPC port e19f03a0
Not impersonating
DeviceMap                 e10008e8
Owning Process            85a4dd18       Image:         ntsd.exe
Attached Process          N/A            Image:         N/A
Wait Start TickCount      6932           Ticks: 89861 (0:00:23:24.078)
Context Switch Count      450                 LargeStack
UserTime                  00:00:00.000
KernelTime                00:00:00.078
Win32 Start Address ntsd!mainCRTStartup (0×0100845a)
Start Address kernel32!BaseProcessStartThunk (0×7c8217f8)
Stack Init f55c5000 Current f55c4c08 Base f55c5000 Limit f55c1000 Call 0
Priority 13 BasePriority 13 PriorityDecrement 0
Kernel stack not resident.
ChildEBP RetAddr 
f55c4c20 80833ec5 nt!KiSwapContext+0×26
f55c4c4c 80829c14 nt!KiSwapThread+0×2e5
f55c4c94 80920fba nt!KeWaitForSingleObject+0×346
f55c4d50 8088b3fc nt!NtRequestWaitReplyPort+0×776
f55c4d50 7c94860c nt!KiFastCallEntry+0xfc
0006ece0 7c947899 ntdll!KiFastSystemCallRet
0006ece4 7c94ec4a ntdll!ZwRequestWaitReplyPort+0xc
0006ed04 7c80cf8c ntdll!CsrClientCallServer+0×8c
0006edfc 7c872904 kernel32!ReadConsoleInternal+0×1b8
0006ee84 7c8018f4 kernel32!ReadConsoleA+0×3b
0006eedc 01005141 kernel32!ReadFile+0×64

0006ef04 01006974 ntsd!ConIn+0×183
0006ff38 010082d1 ntsd!MainLoop+0×1eb
0006ff44 01008589 ntsd!main+0×149
0006ffc0 7c82f23b ntsd!mainCRTStartup+0×12f
0006fff0 00000000 kernel32!BaseProcessStart+0×23

We follow LPC chain to csrss.exe to find out another blocked thread there:

0: kd> !lpc message 00001f98
Searching message 1f98 in threads …
Client thread 859f8768 waiting a reply from 1f98                         
Searching thread 859f8768 in port rundown queues …

Server communication port 0xe19b6b08
    Handles: 1   References: 1
    The LpcDataInfoChainHead queue is empty
        Connected port: 0xe19f03a0      Server connection port: 0xe1361d20

Client communication port 0xe19f03a0
    Handles: 1   References: 4
    The LpcDataInfoChainHead queue is empty

Server connection port e1361d20  Name: ServiceAPort
    Handles: 1   References: 233
    Server process  : 85e4fd88 (csrss.exe)
    Queue semaphore : 85e9b078
    Semaphore state 0 (0×0)
    The message queue is empty
    The LpcDataInfoChainHead queue is empty

0: kd> !process 85e4fd88 ff
THREAD 8549db60  Cid 01cc.1390  Teb: 7ffad000 Win32Thread: bc15aea8 WAIT: (Unknown) UserMode Non-Alertable
    8549dd4c  Semaphore Limit 0×1
Waiting for reply to LPC MessageId 00004feb:
Pending LPC Reply Message:
e191b6d0: [e1a162e8,e19ffc18]
Not impersonating
DeviceMap                 e10008e8
Owning Process            85e4fd88       Image:         csrss.exe
Attached Process          N/A            Image:         N/A
Wait Start TickCount      12095          Ticks: 84698 (0:00:22:03.406)
Context Switch Count      35                 LargeStack
UserTime                  00:00:00.000
KernelTime                00:00:00.000
Start Address 0×75943b55
Stack Init f5625000 Current f5624bf0 Base f5625000 Limit f5622000 Call 0
Priority 15 BasePriority 13 PriorityDecrement 0
Kernel stack not resident.
ChildEBP RetAddr 
f5624c08 80833ec5 nt!KiSwapContext+0×26
f5624c34 80829c14 nt!KiSwapThread+0×2e5
f5624c7c 809222f6 nt!KeWaitForSingleObject+0×346
f5624d38 8088b3fc nt!NtSecureConnectPort+0×6ce
f5624d38 7c94860c nt!KiFastCallEntry+0xfc
015ff778 7c947939 ntdll!KiFastSystemCallRet
015ff77c 77c2e7c3 ntdll!NtSecureConnectPort+0xc
015ff8a0 77c4607b RPCRT4!LRPC_CASSOCIATION::OpenLpcPort+0×21e
015ff8e0 77c45ffb RPCRT4!LRPC_CASSOCIATION::ActuallyDoBinding+0×55
015ff958 77c4f6a5 RPCRT4!LRPC_CASSOCIATION::AllocateCCall+0×190
015ff98c 77c4f5d1 RPCRT4!LRPC_BINDING_HANDLE::AllocateCCall+0×1f2
015ff9b8 77c4f201 RPCRT4!LRPC_BINDING_HANDLE::NegotiateTransferSyntax+0xd3
015ff9d0 77c4ed14 RPCRT4!I_RpcGetBufferWithObject+0×5b
015ff9e0 77c4f464 RPCRT4!I_RpcGetBuffer+0xf
015ff9f0 77cb30e4 RPCRT4!NdrGetBuffer+0×2e
015ffddc 779b4695 RPCRT4!NdrClientCall2+0×197

We follow LPC chain again to see that csrss.exe thread was waiting for a reply from our suspended and frozen ServiceA: 

0: kd> !lpc message 00004feb
Searching message 4feb in threads …
Client thread 8549db60 waiting a reply from 4feb                         
Searching thread 8549db60 in port rundown queues …

Server connection port e19a50e0  Name: ServiceAPort
    Handles: 1   References: 20
    Server process  : 8545eb18 (ServiceA.exe)
    Queue semaphore : 85443320
    Semaphore state 9 (0×9)
        Messages in queue:
        0000 e1a866e0 - Busy  Id=000022e7  From: 01e4.01e8  Context=80060004  [e19a50f0 . e1878688]
                   Length=011800e8  Type=0000000a (LPC_CONNECTION_REQUEST)
                   Data: 00000000 00000000 00000000 00000000 00000000 00000000
        0000 e1878688 - Busy  Id=00003297  From: 0ac0.0b54  Context=804d0045  [e1a866e0 . e1036740]
                   Length=011800e8  Type=0000000a (LPC_CONNECTION_REQUEST)
                   Data: 00000000 00000000 00000000 00000000 00000000 00000000
        0000 e1036740 - Busy  Id=00003986  From: 0ce4.0ce8  Context=00000042  [e1878688 . e1441228]
                   Length=011800e8  Type=0000000a (LPC_CONNECTION_REQUEST)
                   Data: 00000000 00000000 00000000 00000000 00000000 00000000
        0000 e1441228 - Busy  Id=00003a32  From: 0db4.0e14  Context=00000050  [e1036740 . e1a162e8]
                   Length=011800e8  Type=0000000a (LPC_CONNECTION_REQUEST)
                   Data: 00000000 00000000 00000000 00000000 00000000 00000000
        0000 e1a162e8 - Busy  Id=00004c75  From: 059c.05ac  Context=00000051  [e1441228 . e191b6d0]
                   Length=011800e8  Type=0000000a (LPC_CONNECTION_REQUEST)
                   Data: 00000000 00000000 00000000 00000000 00000000 00000000
        0000 e191b6d0 - Busy  Id=00004feb  From: 01cc.1390  Context=00000051  [e1a162e8 . e19ffc18]
                   Length=011800e8  Type=0000000a (LPC_CONNECTION_REQUEST)
                   Data: 00000000 00000000 00000000 00000000 00000000 00000000
        0000 e19ffc18 - Busy  Id=000055e3  From: 13fc.05b4  Context=800d0009  [e191b6d0 . e19f4ea0]
                   Length=011800e8  Type=0000000a (LPC_CONNECTION_REQUEST)
                   Data: 00000000 00000000 00000000 00000000 00000000 00000000
        0000 e19f4ea0 - Busy  Id=00006844  From: 0b00.0f20  Context=006b3d60  [e19ffc18 . e19a50f0]
                   Length=011800e8  Type=0000000a (LPC_CONNECTION_REQUEST)
                   Data: 00000000 00000000 00000000 00000000 00000000 00000000
    The message queue contains 8 messages
    The LpcDataInfoChainHead queue is empty

It doesn’t look as a deadlock because, although we have a cyclic process wait chain ServiceA -> NTSD -> CSRSS -> ServiceA, NTSD was waiting for a different thread in CSRSS than the one in CSRSS waiting for a reply from ServiceA. If these threads are unrelated then we don’t have a deadlock, strictly speaking, because the latter involves activity chains with ownership, not a container dependency (a process is a container for threads). I illustrated all this on the following diagram:

- Dmitry Vostokov @ -

The Strange Love of Dr. DebugLove

Monday, August 10th, 2009

I’m very delighted to be a Dr. DebugLove! There are many Dr. Debug out there (Google shows 1,840,000 hits) but do they really love debugging like I do? Of course, they do, but I’m the first to acknowledge my strange love publicly by accepting a pseudonym.

- Dmitry Vostokov @ -

Memory Dumps as Posets

Sunday, August 9th, 2009

Last week I was comparing the existing collection of memory dump analysis patterns to the collection of trace analysis patterns (in formation) in the search of isomorphism (or more correctly, general morphism) similar to Missing Component pattern. It is not a coincidence that such pattern pairs can be formed. For example, it is possible to discern deadlocks from both crash dumps and software traces (if appropriate information is available there). Fundamentally, it is implied by the definition of a software trace as some sort of a memory dump. And we can see traces in memory dumps too, for example, Execution Residue pattern. Because raw stack data resides in stack pages and in contemporary operating systems they are created from zero pages (metaphorically, out of the void) we can say that stack regions of threads are sorted by their creation time, for example, in this process user memory dump:

0:017> !runaway 4
 Elapsed Time
  Thread       Time
   0:49c       0 days 5:16:31.076
   4:4d8       0 days 5:16:30.967
   3:4d0       0 days 5:16:30.967
   2:4cc       0 days 5:16:30.967
   1:4c8       0 days 5:16:30.967
   5:4e8       0 days 5:16:30.936
   6:b6c       0 days 5:16:15.695
   7:b70       0 days 5:16:15.679
   9:b88       0 days 5:16:15.586
   8:b84       0 days 5:16:15.586
  11:348       0 days 5:16:12.934
  10:bfc       0 days 5:16:12.934
  12:1200      0 days 5:15:16.528
  15:1298      0 days 5:15:15.220
  14:1290      0 days 5:15:15.220
  13:128c      0 days 5:15:15.220
  17:12e4      0 days 5:15:13.257
  16:12dc      0 days 5:15:13.257
  18:12ec      0 days 5:15:13.117
  20:12f4      0 days 5:15:13.085
  19:12f0      0 days 5:15:13.085
  21:17a0      0 days 5:13:16.321
  22:1628      0 days 5:13:15.729
  24:1778      0 days 1:35:50.773
  23:17ec      0 days 1:35:50.773
  25:1570      0 days 1:27:54.190
  26:1724      0 days 1:27:10.151
  27:1490      0 days 0:05:46.732
  28:1950      0 days 0:02:28.153
  29:19b4      0 days 0:00:58.108
  30:177c      0 days 0:00:38.358
  31:1798      0 days 0:00:23.351
  32:1a7c      0 days 0:00:08.343

If we have complete memory dumps we can also account for other processes and their elapsed time. Within stack pages we have partial stack traces but do not have exact timing information between them except for stack frames from the current frozen thread stack trace or, if we are lucky, from a partial stack trace from the past execution. However, the timing between frames from different stacks is undefined and we can only guess it from higher level considerations like semantics of procedure calls and other information.

These considerations and the notion of a poset (partially ordered set) let me thinking about memory dumps as posets. I even created my interpretation of POSET abbreviation for this occasion:


Partially Ordered Software Execution Trace   

- Dmitry Vostokov @ -

Errata for WDPF book

Sunday, August 9th, 2009

Errata for the previous book Windows Debugging: Practical Foundations has been published:


Next week the updated version (revision 2.0) should be available on Amazon and other stores for both paperback and hardback titles. Digital version on Lulu has already been updated.

- Dmitry Vostokov @ -

x64 Windows Debugging: Practical Foundations

Saturday, August 8th, 2009

The digital version of the book is finally available:

x64 Windows Debugging: Practical Foundations

Paperback should be available in 1-2 weeks on Amazon and other stores. When working on the book I fixed errors in the previous x86 version. Errata file for it should be available tomorrow.

- Dmitry Vostokov @ -