Archive for the ‘Crash Dump Patterns’ Category

Platformorphism

Thursday, June 18th, 2009

This is a kind of a “faultomorphism”, a fault, a crash point and stack trace shape preserving map between two platforms (such as 32-bit and 64-bit). This new word was derived from the concatenation of platform and morphism. Here is an example:

; 64-bit crash dump

0: kd> r
Last set context:
rax=0000000063537852 rbx=0000000000000000 rcx=0000000000000009
rdx=0000000000000000 rsi=0000000000000000 rdi=0000000000000000
rip=fffffadf262760da rsp=fffffadf15973968 rbp=0000000070537852
 r8=fffffadf31614b00  r9=fffffadffe9fa7b0 r10=000000000000000a
r11=fffffadf31614bf0 r12=0000000000000000 r13=0000000000000000
r14=0000000000000000 r15=0000000000000000
iopl=0 nv up ei pl nz na po nc
cs=0010 ss=0018 ds=0000 es=0000 fs=0000 gs=0000 efl=00010206
rdbss!RxIsThisACscAgentOpen+0×30:
fffffadf`262760da f3a6 repe cmps byte ptr [rsi],byte ptr [rdi]

0: kd> kL 100
Child-SP          RetAddr           Call Site
fffffadf`15973968 fffffadf`2629e768 rdbss!RxIsThisACscAgentOpen+0x30
fffffadf`15973970 fffffadf`262988f5 rdbss!RxInitializeVNetRootParameters+0x31d
fffffadf`159739f0 fffffadf`2629bcfd rdbss!RxFindOrConstructVirtualNetRoot+0x180
fffffadf`15973ad0 fffffadf`26297a6c rdbss!RxCanonicalizeNameAndObtainNetRoot+0x223
fffffadf`15973b70 fffffadf`26272a77 rdbss!RxCommonCreate+0x470
fffffadf`15973c80 fffffadf`261be3e8 rdbss!RxFsdCommonDispatch+0x51c
fffffadf`15973d80 fffffadf`29314db3 mrxsmb!MRxSmbFsdDispatch+0x211
[...]

; 32-bit crash dump

0: kd> r
eax=00000000 ebx=b6a23a80 ecx=00000009 edx=00000000 esi=00000008 edi=b6a23a80
eip=b6a23a5f esp=b3ce800c ebp=b3ce801c iopl=0 nv up ei pl zr na pe nc
cs=0008 ss=0010 ds=0023 es=0023 fs=0030 gs=0000 efl=00010246
rdbss!RxIsThisACscAgentOpen+0×38:
b6a23a5f f3a6 repe cmps byte ptr [esi],byte ptr es:[edi]

0: kd> kL 100
b3ce801c b6a2b431 rdbss!RxIsThisACscAgentOpen+0x38
b3ce803c b6a2bbf7 rdbss!RxInitializeVNetRootParameters+0x282
b3ce809c b6a2e6cd rdbss!RxFindOrConstructVirtualNetRoot+0xdc
b3ce80d0 b6a2ae15 rdbss!RxCanonicalizeNameAndObtainNetRoot+0x197
b3ce8134 b6a20d51 rdbss!RxCommonCreate+0x2c3
b3ce81cc b6a2acc2 rdbss!RxFsdCommonDispatch+0x353
b3ce81f4 b69ac317 rdbss!RxFsdDispatch+0xda
b3ce8214 804e13d9 mrxsmb!MRxSmbFsdDispatch+0x134
[...]

We can see that stack traces are almost the same, function offsets are very close and faulted instruction is the same up to an opcode. Not to mention that bugchecks are identical:

RDR_FILE_SYSTEM (27)
    If you see RxExceptionFilter on the stack then the 2nd and 3rd parameters are the
    exception record and context record. Do a .cxr on the 3rd parameter and then kb to
    obtain a more informative stack trace.
    The high 16 bits of the first parameter is the RDBSS bugcheck code, which is defined
    as follows:
     RDBSS_BUG_CHECK_CACHESUP  = 0xca550000,
     RDBSS_BUG_CHECK_CLEANUP   = 0xc1ee0000,
     RDBSS_BUG_CHECK_CLOSE     = 0xc10e0000,
     RDBSS_BUG_CHECK_NTEXCEPT  = 0xbaad0000

Therefore, we can also say that these crashes are platformorphic. Obviously, this stems from the fact that source code was identical or almost identical for both platforms.

- Dmitry Vostokov @ DumpAnalysis.org -

Software Defect Construction

Tuesday, June 16th, 2009

This is the main topic of the forthcoming next issue of Debugged MZ/PE magazine. The most close term is called “fault injection” but I rediscovered it as a “software defect construction”, “software defect simulation” or “software defect modeling”. The latter term is also used to refer to construction of mathematical models related to software product quality and corresponding statistics but “modeling software defects” seems appropriate subtitle for the magazine front cover picture… Software defect construction is more general term than fault injection. The latter is used for testing but we want to simulate bugs and abnormal system conditions to study debugging and memory dump analysis techniques or to build reproduction environments. I actually recently found and bought the used copy of this book:

Software Fault Injection: Inoculating Programs Against Errors

Buy from Amazon

and plan to write my own book with the following working title later:

Software Defect Construction: Simulation and Modeling of Software Bugs (ISBN: 978-1906717759)

- Dmitry Vostokov @ DumpAnalysis.org -

The Importance of Symbols

Sunday, June 14th, 2009

This post looks at incorrect stack traces in more detail and provides an example and explanation of why WinDbg is not able to get them right and why we need symbol files to beautify them. Consider the following stack trace from !analyze -v command:

STACK_TEXT: 
00c0ecc8 0042fa36 0190483c 4d445443 0032422c msvcrt!wcscmp+0×18
WARNING: Stack unwind information not available. Following frames may be wrong.
00c0ecec 0043045d Application!Array::operator=+0×2ea1b
00c0ed08 0042d7fb Application!Array::operator=+0×2f442
00c0ed38 0042d87a Application!Array::operator=+0×2c7e0
00c0ed58 0040f393 Application!Array::operator=+0×2c85f
00c0efc8 0040b2c1 Application!Array::operator=+0xe378
00c0eff8 0040c104 Application!Array::operator=+0xa2a6
00c0f038 004192e0 Application!Array::operator=+0xb0e9
00c0f0f0 7c829f5d Application!Array::operator=+0×182c5
01c3f138 00000000 00000000 00000000 00000000 ntdll!RtlFreeHeap+0×20e

Although string comparison function could be called from Application code, it has a semantic inconsistency: FreeHeap surely wouldn’t call Application code. So why it appears on the stack trace? To find out, we look at the thread raw stack data from ESP up to the base and try to do a manual descent to reconstruct that stack trace, starting from EBP value (EIP points to wcscmp):

0:000> r
eax=4d445443 ebx=00000013 ecx=018fe8a0 edx=0190483c esi=4d445443 edi=018fa148
eip=77bd8e21 esp=00c0ecc0 ebp=00c0ecc8 iopl=0 nv up ei pl nz na po nc
cs=001b ss=0023 ds=0023 es=0023 fs=003b gs=0000 efl=00010202
msvcrt!wcscmp+0×18:
77bd8e21 668b0e          mov     cx,word ptr [esi]        ds:0023:4d445443=????

0:000> !teb
TEB at 7ffdc000
    ExceptionList:        00c0efbc
    StackBase:            00c10000
    StackLimit:           00c0a000
    SubSystemTib:         00000000
    FiberData:            00001e00
    ArbitraryUserPointer: 00000000
    Self:                 7ffdc000
    EnvironmentPointer:   00000000
    ClientId:             00005d68 . 0000503c
    RpcHandle:            00000000
    Tls Storage:          00000000
    PEB Address:          7ffd7000
    LastErrorValue:       0
    LastStatusValue:      0
    Count Owned Locks:    0
    HardErrorMode:        0

0:000> dds 00c0ecc0 00c10000
00c0ecc0  018fa148
00c0ecc4  018f9ff0
00c0ecc8  00c0ecec
00c0eccc  0042fa36 Application!Array::operator=+0×2ea1b
00c0ecd0  0190483c
[…]
00c0ecec  00c0ed08
00c0ecf0  0043045d Application!Array::operator=+0×2f442
00c0ecf4  0190483c
[…]
00c0ed08  00c0ed38
00c0ed0c  0042d7fb Application!Array::operator=+0×2c7e0
00c0ed10  01bb4914
[…]
00c0ed38  00c0ed58
00c0ed3c  0042d87a Application!Array::operator=+0×2c85f
00c0ed40  00000000
[…]
00c0ed58  00c0efc8
00c0ed5c  0040f393 Application!Array::operator=+0xe378
00c0ed60  01bb467c
[…]
00c0efc8  00c0eff8
00c0efcc  0040b2c1 Application!Array::operator=+0xa2a6
00c0efd0  00c0f014
[…]
00c0eff8  00c0f038
00c0effc  0040c104 Application!Array::operator=+0xb0e9
00c0f000  00c0f014
[…]
00c0f038  00c0f0d8
00c0f03c  004192e0 Application!Array::operator=+0×182c5
00c0f040  01902380
[…]
00c0f0d8  01c3f138
00c0f0dc  01c037a8
00c0f0e0  00000002
00c0f0e4  01c316d8
00c0f0e8  001402f8
00c0f0ec  01c3eab0
00c0f0f0  00c0f1d0
00c0f0f4  7c829f5d ntdll!RtlFreeHeap+0×20e
00c0f0f8  00140c28
[…]
00c0f1d0  00000000
00c0f1d4  00000001

What we can see that 00c0f0d8 points to 01c3f138 which is outside stack data range so WinDbg searches for the next symbol and it is ntdll!RtlFreeHeap+0×20e.

After applying correct symbols the magic occurs:

0:000> kl 100
ChildEBP RetAddr 
00c0ecc8 0042fa36 msvcrt!wcscmp+0x18
00c0ecec 0043045d Application!foo17+0x5d
00c0ed08 0042d7fb Application!foo16+0x69
00c0ed38 0042d87a Application!foo15+0xe1
00c0ed58 0040f393 Application!foo14+0x64
00c0ef7c 0040a4fe Application!foo13+0x8a
00c0efc8 0040b2c1 Application!foo12+0x76a
00c0eff8 0040c104 Application!foo11+0x126
00c0f038 004192e0 Application!foo10+0x11e
00c0f280 004193dd Application!foo9+0x34c
00c0f2b4 0041953d Application!foo8+0xc3
00c0f2e8 00415abd Application!foo7+0xb7
00c0f304 0040c19a Application!foo6+0x36
00c0f81c 0040ca44 Application!foo5+0x6a
00c0f834 0040cb04 Application!foo4+0x7e
00c0f874 00415cc6 Application!foo3+0x83
00c0f884 00403839 Application!foo2+0x26
00c0f898 77c80193 Application!foo+0x56
00c0f8b4 77ce33e1 rpcrt4!Invoke+0x30
00c0fcb4 77ce35c4 rpcrt4!NdrStubCall2+0x299
00c0fcd0 77c7ff7a rpcrt4!NdrServerCall2+0x19
00c0fd04 77c8042d rpcrt4!DispatchToStubInCNoAvrf+0x38
00c0fd58 77c80353 rpcrt4!RPC_INTERFACE::DispatchToStubWorker+0x11f
00c0fd7c 77c7e0d4 rpcrt4!RPC_INTERFACE::DispatchToStub+0xa3
00c0fdbc 77c7e080 rpcrt4!RPC_INTERFACE::DispatchToStubWithObject+0xc0
00c0fdfc 77c812f0 rpcrt4!LRPC_SCALL::DealWithRequestMessage+0x41e
00c0fe20 77c88678 rpcrt4!LRPC_ADDRESS::DealWithLRPCRequest+0x127
00c0ff84 77c88792 rpcrt4!LRPC_ADDRESS::ReceiveLotsaCalls+0x430
00c0ff8c 77c8872d rpcrt4!RecvLotsaCallsWrapper+0xd
00c0ffac 77c7b110 rpcrt4!BaseCachedThreadRoutine+0x9d
00c0ffb8 77e6482f rpcrt4!ThreadStartRoutine+0x1b
00c0ffec 00000000 kernel32!BaseThreadStart+0x34

- Dmitry Vostokov @ DumpAnalysis.org -

Invalid pointer, incorrect stack trace, multiple exceptions, insufficient memory and memory leak: pattern cooperation

Wednesday, June 10th, 2009

Users cannot connect and ultimately the server bluescreens. The dump shows an invalid pointer access while copying a buffer:

Loading Dump File [MEMORY.DMP]
Kernel Summary Dump File: Only kernel address space is available

6: kd> !analyze -v
[...]
PAGE_FAULT_IN_NONPAGED_AREA (50)
Invalid system memory was referenced.  This cannot be protected by try-except,
it must be protected by a Probe.  Typically the address is just plain bad or it
is pointing at freed memory.
Arguments:
Arg1: fe2cc128, memory referenced.
Arg2: 00000001, value 0 = read operation, 1 = write operation.
Arg3: b574414c, If non-zero, the instruction address which referenced the bad memory
 address.
Arg4: 00000000, (reserved)
[...]

TRAP_FRAME:  aefed9d8 -- (.trap 0xffffffffaefed9d8)
ErrCode = 00000002
eax=fe2fb09c ebx=00000006 ecx=00000001 edx=00000006 esi=f5312bcd edi=fe2cc128
eip=b574414c esp=aefeda4c ebp=aefeda58 iopl=0 nv up ei pl nz na po cy
cs=0008 ss=0010 ds=0023 es=0023 fs=0030 gs=0000 efl=00010203
Driver+0×1614c:
b574414c f3a5            rep movs dword ptr es:[edi],dword ptr [esi]
Resetting default scope

STACK_TEXT: 
aefed948 8085ed19 00000050 fe2cc128 00000001 nt!KeBugCheckEx+0x1b
aefed9c0 8088c7c8 00000001 fe2cc128 00000000 nt!MmAccessFault+0xb25
aefed9c0 b574414c 00000001 fe2cc128 00000000 nt!KiTrap0E+0xdc
WARNING: Stack unwind information not available. Following frames may be wrong.
aefeda58 b574472d f5312bcd 00000006 fe2fb09c Driver+0×1614c
[…]
aefedb70 8081df85 fe2fb000 aefedbcc 88d0e010 Driver+0×2758a
aefedc50 808f5437 8c556c88 8a1fea78 8c556c18 nt!IofCallDriver+0×45
aefedc64 808f61bf 90510070 8c556c18 8a1fea78 nt!IopSynchronousServiceTail+0×10b
aefedd00 808eed08 00001250 00000000 00000000 nt!IopXxxControlFile+0×5e5
aefedd34 808897bc 00001250 00000000 00000000 nt!NtDeviceIoControlFile+0×2a
aefedd34 7c82860c 00001250 00000000 00000000 nt!KiFastCallEntry+0xfc
0620f428 00000000 00000000 00000000 00000000 0×7c82860c

6: kd> !pte edi
               VA fe2cc000
PDE at 00000000C0603F88    PTE at 00000000C07F1660
contains 0000000000ABE863  contains 0000000000000000
pfn abe        —DA–KWEV

The warning about frames suggests that stack trace could be incorrect but backward and forward disassembly and preceding frames show that it is correct indeed and the warning results from the lack of symbol files:

6: kd> ub b574414c
*** ERROR: Module load completed but symbols could not be loaded for Driver.SYS
Driver+0×16139:
b5744139 8d341a          lea     esi,[edx+ebx]
b574413c 03fa            add     edi,edx
b574413e 3bf1            cmp     esi,ecx
b5744140 8b7508          mov     esi,dword ptr [ebp+8]
b5744143 771f            ja      Driver+0×16164 (b5744164)
b5744145 8bcb            mov     ecx,ebx
b5744147 8bd1            mov     edx,ecx
b5744149 c1e902          shr     ecx,2

6: kd> u b574414c
Driver+0×1614c:
b574414c f3a5            rep movs dword ptr es:[edi],dword ptr [esi]
b574414e 8bca            mov     ecx,edx
b5744150 83e103          and     ecx,3
b5744153 f3a4            rep movs byte ptr es:[edi],byte ptr [esi]
b5744155 8b4810          mov     ecx,dword ptr [eax+10h]
b5744158 5f              pop     edi
b5744159 03cb            add     ecx,ebx
b574415b 5e              pop     esi

At the same time we see a second fault on another processor but it seems to be a demand zero page access:

6: kd> ~7s;kv
ChildEBP RetAddr  Args to Child             
ae331854 8095dddf badb0d00 0000000f 808a623c nt!KiTrap0E+0xbc (FPO: [0,0] TrapFrame @ ae331854)
ae3318cc 809731b2 de019000 00000001 3b9aca07 nt!RtlCreateSecurityDescriptorRelative+0×13 (FPO: [2,0,4])
ae331934 808d3229 ae3319e8 de019000 ae3319ec nt!SeQuerySecurityDescriptorInfo+0×198 (FPO: [SEH])
ae33194c 808d443c dc0d9330 ae3319e8 de019000 nt!CmpQuerySecurityDescriptorInfo+0×23 (FPO: [5,0,0])
ae3319b0 809383d3 e09e33f0 00000001 ae3319e8 nt!CmpSecurityMethod+0×212 (FPO: [SEH])
ae3319f0 8093866e e09e33f0 ae331a1c e09e33ec nt!ObpGetObjectSecurity+0×99 (FPO: [4,2,0])
ae331a20 808d8ecc e09e33f0 8ae88740 00000001 nt!ObCheckObjectAccess+0×2c (FPO: [5,4,0])
ae331a7c 808da58b dbef6008 009fd5e0 00000000 nt!CmpDoOpen+0×3a0 (FPO: [SEH])
ae331b90 809374b1 f0d9d348 00000000 8ae88740 nt!CmpParseKey+0×547 (FPO: [10,50,0])
ae331c10 80933a76 0000004c ae331c50 00000040 nt!ObpLookupObjectName+0×11f (FPO: [11,17,4])
ae331c64 808bb471 00000000 908fb868 80934601 nt!ObOpenObjectByName+0xea (FPO: [7,5,4])
ae331d50 808897bc 0013f7bc 00020019 0013f720 nt!NtOpenKey+0×1ad (FPO: [SEH])
ae331d50 7c82860c 0013f7bc 00020019 0013f720 nt!KiFastCallEntry+0xfc (FPO: [0,0] TrapFrame @ ae331d64)

7: kd> .trap ae331854
ErrCode = 00000002
eax=00000000 ebx=ae3319e8 ecx=de019000 edx=0000000f esi=de019000 edi=de019000
eip=8095dddf esp=ae3318c8 ebp=ae3318cc iopl=0  nv up ei pl zr na pe nc
cs=0008 ss=0010 ds=0023 es=0023 fs=0030 gs=0000 efl=00010246
nt!RtlCreateSecurityDescriptorRelative+0×13:
8095dddf ab              stos    dword ptr es:[edi]   es:0023:de019000=????????

7: kd> !pte edi
               VA de019000
PDE at 00000000C0603780    PTE at 00000000C06F00C8
contains 00000003D2FF9863  contains 00000000000000C0
pfn 3d2ff9     —DA–KWEV    not valid
                       DemandZero
                       Protect: 6 - ReadWriteExecute

We suspect that users cannot connect because of the lack of kernel pool resources and see the shortage of paged pool along with session pool shortage for one session:

7: kd> !vm 4

*** Virtual Memory Usage ***
Physical Memory:     4193705 (  16774820 Kb)
Page File: \??\C:\pagefile.sys
  Current:  25160704 Kb  Free Space:  18350628 Kb
  Minimum:  25160704 Kb  Maximum:     25160704 Kb
Available Pages:     2556764 (  10227056 Kb)
ResAvail Pages:      3996322 (  15985288 Kb)
Locked IO Pages:         292 (      1168 Kb)
Free System PTEs:      22827 (     91308 Kb)

******* 4 system PTE allocations have failed ******

Free NP PTEs:          32534 (    130136 Kb)
Free Special NP:           0 (         0 Kb)
Modified Pages:           86 (       344 Kb)
Modified PF Pages:        85 (       340 Kb)
NonPagedPool Usage:    32376 (    129504 Kb)
NonPagedPool Max:      65215 (    260860 Kb)
PagedPool 0 Usage:     84341 (    337364 Kb)
PagedPool 1 Usage:      7945 (     31780 Kb)
PagedPool 2 Usage:      8073 (     32292 Kb)
PagedPool 3 Usage:      7979 (     31916 Kb)
PagedPool 4 Usage:      8048 (     32192 Kb)
PagedPool Usage:      116386 (    465544 Kb)
PagedPool Maximum:    134144 (    536576 Kb)

********** 841 pool allocations have failed **********

Shared Commit:         95929 (    383716 Kb)
Special Pool:              0 (         0 Kb)
Shared Process:       126106 (    504424 Kb)
PagedPool Commit:     116435 (    465740 Kb)
Driver Commit:          2605 (     10420 Kb)
Committed pages:     3454252 (  13817008 Kb)
Commit limit:       10419631 (  41678524 Kb)

Total Private:       2961488 (  11845952 Kb)
721c Application.exe  334248 (   1336992 Kb)
4b4c iexplore.exe      36048 (    144192 Kb)
[…]

 Terminal Server Memory Usage By Session:

 Session Paged Pool Maximum is 32768K
 Session View Space Maximum is 20480K

 Session ID 0 @ ba1c5000:
 Paged Pool Usage:        2288K
 Commit Usage:            3344K

[...]

 Session ID 1b @ ba1e3000:
 Paged Pool Usage:       12020K

 *** 622 Pool Allocation Failures ***

 Commit Usage:           13176K

[...]

We also see that the process with PID 721c (Application.exe) consumed 1.3Gb of memory and it belongs to the problem session 0×1b (0n27):

7: kd> !sprocess 1b
Dumping Session 1b

_MM_SESSION_SPACE ba1e3000
_MMSESSION        ba1e3d80
PROCESS 8a4fd020  SessionId: 27  Cid: 1f24    Peb: 7ffde000  ParentCid: 01e4
    DirBase: cfe3f820  ObjectTable: ec731d08  HandleCount: 281.
    Image: csrss.exe

PROCESS 8a4d6b18  SessionId: 27  Cid: 2160    Peb: 7ffde000  ParentCid: 01e4
    DirBase: cfe3f800  ObjectTable: ee932bf0  HandleCount: 436.
    Image: winlogon.exe

[...]

PROCESS 8a467020  SessionId: 27  Cid: 721c    Peb: 7ffdf000  ParentCid: 5728
    DirBase: cfe3f8a0  ObjectTable: f0397350  HandleCount: 380.
    Image: Application.exe

[...]

Unfortunately the dump file is only kernel dump and we cannot inspect user space of that process. We can only say that it was not a handle leak because the number of handles is low (380). We also inspect paged pool stats and find a 3rd-party driver with AAAA tag that used almost 80Mb of paged pool:

7: kd> !poolused 4
   Sorting by  Paged Pool Consumed

  Pool Used:
            NonPaged            Paged
 Tag    Allocs     Used    Allocs     Used
 MmSt        0        0     57216 91753456 Mm section object prototype ptes , Binary: nt!mm
 AAAA        0        0       613 79465472 UNKNOWN pooltag ‘AAAA’, please update pooltag.txt
 CM35        0        0       529 40398848 Internal Configuration manager allocations , Binary: nt!cm
 Ntff        5     1040     35955 29339280 FCB_DATA , Binary: ntfs.sys
 Toke        0        0      3334 12631056 Token objects , Binary: nt!se
 IoNm        0        0     43494  9611392 Io parsing names , Binary: nt!io
 CMAl        0        0      1981  8114176 internal registry memory allocator pool tag , Binary: nt!cm
 Obtb        0        0      2569  6687344 object tables via EX handle.c , Binary: nt!ob
 CM16        0        0      1429  6299648 Internal Configuration manager allocations , Binary: nt!cm
[…]

We can find this driver like explained here. We recommend to contact the vendor of that driver and also monitor instances of Application.exe and get 2-3 subsequent process memory dumps, when it grows, to inspect its virtual memory using differential memory analysis techniques.

- Dmitry Vostokov @ DumpAnalysis.org -

Inconsistent dump, blocked threads, wait chains, incorrect stack trace and process factory: pattern cooperation

Tuesday, June 9th, 2009

The more busy the system is, the more inconsistent are complete memory dumps produced by external physical memory dumpers. On the contrary, quiet systems, with idle existent users and when no one can connect, are more consistent than usual. Here is one example:

Loading Dump File [complete_dump.dmp]
Kernel Complete Dump File: Full address space is available

[...]

WARNING: Process directory table base 00039000 doesn't match CR3 CB759000
[...]

BugCheck 0, {0, 0, 0, 0}

Probably caused by : LiveKdD.SYS ( LiveKdD+12d1 )

In such cases I often dump all processes and threads to look for anomalies and indeed we find plenty of blocked threads in several services. For example, ServiceA.exe has blocked threads waiting for critical sections and LPC requests:

0: kd> !process 0 ff
**** NT ACTIVE PROCESS DUMP ****

[...]

THREAD 86f21118  Cid 07d4.2914  Teb: 7ffa5000 Win32Thread: 00000000 WAIT: (Unknown) UserMode Non-Alertable
     86f21304  Semaphore Limit 0x1
Waiting for reply to LPC MessageId 00f2b546:
Current LPC port d79c6e00
Impersonation token:  db621028 (Level Impersonation)
DeviceMap                 dca24d50
Owning Process            88948d88       Image:         ServiceA.exe
Attached Process          N/A            Image:         N/A
Wait Start TickCount      14788708       Ticks: 3643121 (0:15:48:43.765)
Context Switch Count      52            
UserTime                  00:00:00.000
KernelTime                00:00:00.000
Win32 Start Address ServiceA!EventWorkerThread (0×0042e2f8)
Start Address kernel32!BaseThreadStartThunk (0×77e617ec)
Stack Init b2246000 Current b2245c08 Base b2246000 Limit b2243000 Call 0
Priority 12 BasePriority 10 PriorityDecrement 0
Kernel stack not resident.
ChildEBP RetAddr 
b2245c10 00000000 0×149c3
[…]

THREAD 8736fdb0  Cid 07d4.3cc4  Teb: 7ff9c000 Win32Thread: 00000000 WAIT: (Unknown) UserMode Non-Alertable
     8736ff9c  Semaphore Limit 0x1
Waiting for reply to LPC MessageId 00f2b518:
Current LPC port d79c6e00
Impersonation token:  da1b51a8 (Level Impersonation)
DeviceMap                 d9f9e368
Owning Process            88948d88       Image:         ServiceA.exe
Attached Process          N/A            Image:         N/A
Wait Start TickCount      14788707       Ticks: 3643122 (0:15:48:43.781)
Context Switch Count      44            
UserTime                  00:00:00.000
KernelTime                00:00:00.000
Win32 Start Address ServiceA!EventWorkerThread (0×0042e2f8)
Start Address kernel32!BaseThreadStartThunk (0×77e617ec)
Stack Init ae5d3000 Current ae5d2c08 Base ae5d3000 Limit ae5d0000 Call 0
Priority 12 BasePriority 10 PriorityDecrement 0
Kernel stack not resident.
ChildEBP RetAddr 
WARNING: Frame IP not in any known module. Following frames may be wrong.
ae5d2c10 3fe63bfa 0×73e40763
[…]

THREAD 85e89c80  Cid 07d4.5648  Teb: 7ff6a000 Win32Thread: 00000000 WAIT: (Unknown) UserMode Non-Alertable
     871d6e80  SynchronizationEvent
Impersonation token:  d7a1c028 (Level Impersonation)
DeviceMap                 d66ef860
Owning Process            88948d88       Image:         ServiceA.exe
Attached Process          N/A            Image:         N/A
Wait Start TickCount      14320472       Ticks: 4111357 (0:17:50:39.953)
Context Switch Count      28            
UserTime                  00:00:00.000
KernelTime                00:00:00.000
Win32 Start Address 0x00ebebcc
LPC Server thread working on message Id ebebcc
Start Address kernel32!BaseThreadStartThunk (0x77e617ec)
Stack Init a536b000 Current a536ac60 Base a536b000 Limit a5368000 Call 0
Priority 11 BasePriority 10 PriorityDecrement 0
Kernel stack not resident.
ChildEBP RetAddr 
a536ac78 8083d5b1 nt!KiSwapContext+0x26
a536aca4 8083df9e nt!KiSwapThread+0x2e5
a536acec 8092ae57 nt!KeWaitForSingleObject+0x346
a536ad50 80833bdf nt!NtWaitForSingleObject+0x9a
a536ad50 7c8285ec nt!KiFastCallEntry+0xfc
05b6f7d0 7c827d0b ntdll!KiFastSystemCallRet
05b6f7d4 7c83d236 ntdll!NtWaitForSingleObject+0xc
05b6f810 7c83d281 ntdll!RtlpWaitOnCriticalSection+0×1a3
05b6f830 0040bb5b ntdll!RtlEnterCriticalSection+0xa8

05b6f874 00415889 ServiceA!OnRequest+0×23
[…]
05b6f8b4 77ce33e1 RPCRT4!Invoke+0×30
05b6fcb4 77ce35c4 RPCRT4!NdrStubCall2+0×299
05b6fcd0 77c7ff7a RPCRT4!NdrServerCall2+0×19
05b6fd04 77c8042d RPCRT4!DispatchToStubInCNoAvrf+0×38
05b6fd58 77c80353 RPCRT4!RPC_INTERFACE::DispatchToStubWorker+0×11f
05b6fd7c 77c7e0d4 RPCRT4!RPC_INTERFACE::DispatchToStub+0xa3
05b6fdbc 77c7e080 RPCRT4!RPC_INTERFACE::DispatchToStubWithObject+0xc0
05b6fdfc 77c812f0 RPCRT4!LRPC_SCALL::DealWithRequestMessage+0×41e
05b6fe20 77c88678 RPCRT4!LRPC_ADDRESS::DealWithLRPCRequest+0×127
05b6ff84 77c88792 RPCRT4!LRPC_ADDRESS::ReceiveLotsaCalls+0×430
05b6ff8c 77c8872d RPCRT4!RecvLotsaCallsWrapper+0xd
05b6ffac 77c7b110 RPCRT4!BaseCachedThreadRoutine+0×9d
05b6ffb8 77e64829 RPCRT4!ThreadStartRoutine+0×1b
05b6ffec 00000000 kernel32!BaseThreadStart+0×34

Knowing that ServiceA is crucial for our system functions we examine LPC wait chains. All “Waiting for reply to LPC MessageId <…>” threads depend on another service:

0: kd> !lpc message 00f2b546
Searching message f2b546 in threads ...
Client thread 86f21118 waiting a reply from f2b546                         
    Server thread 860cc9e0 is working on message f2b546
[…]
    Server process  : 87d21698 (svchost.exe)
[…]

We find this thread in svchost.exe hosting ServiceB:

0: kd> !thread 860cc9e0 1f
THREAD 860cc9e0  Cid 1310.4ad8  Teb: 7ff88000 Win32Thread: 00000000 WAIT: (Unknown) UserMode Non-Alertable
     871d47d0  Mutant - owning thread 867322e0
Not impersonating
DeviceMap                 d64008f8
Owning Process            87d21698       Image:         svchost.exe
Attached Process          N/A            Image:         N/A
Wait Start TickCount      14788708       Ticks: 3643121 (0:15:48:43.765)
Context Switch Count      1791            
UserTime                  00:00:00.031
KernelTime                00:00:00.046
Win32 Start Address 0×00f2b546
LPC Server thread working on message Id f2b546
Start Address kernel32!BaseThreadStartThunk (0×77e617ec)
Stack Init b055e000 Current b055dc60 Base b055e000 Limit b055b000 Call 0
Priority 12 BasePriority 10 PriorityDecrement 0
Kernel stack not resident.
ChildEBP RetAddr 
b055dc78 8083d5b1 nt!KiSwapContext+0×26
b055dca4 8083df9e nt!KiSwapThread+0×2e5
b055dcec 8092ae57 nt!KeWaitForSingleObject+0×346
b055dd50 80833bdf nt!NtWaitForSingleObject+0×9a
b055dd50 7c8285ec nt!KiFastCallEntry+0xfc
0400f718 7c827d0b ntdll!KiFastSystemCallRet
0400f71c 7654bf59 ntdll!NtWaitForSingleObject+0xc
0400f73c 7654c694 ServiceB!Lock+0×42
0400f754 76558b17 ServiceB!FindObjectById+0×39
0400f8ac 7654fa50 ServiceB!QueryInformation+0×3e
0400f8c8 77c80193 ServiceB!RpcQueryInformation+0×49
0400f8f8 77ce33e1 RPCRT4!Invoke+0×30
0400fcf8 77ce35c4 RPCRT4!NdrStubCall2+0×299
0400fd14 77c7ff7a RPCRT4!NdrServerCall2+0×19
0400fd48 77c8042d RPCRT4!DispatchToStubInCNoAvrf+0×38
0400fd9c 77c80353 RPCRT4!RPC_INTERFACE::DispatchToStubWorker+0×11f
0400fdc0 77c811dc RPCRT4!RPC_INTERFACE::DispatchToStub+0xa3
0400fdfc 77c812f0 RPCRT4!LRPC_SCALL::DealWithRequestMessage+0×42c
0400fe20 77c88678 RPCRT4!LRPC_ADDRESS::DealWithLRPCRequest+0×127
0400ff84 77c88792 RPCRT4!LRPC_ADDRESS::ReceiveLotsaCalls+0×430
0400ff8c 77c8872d RPCRT4!RecvLotsaCallsWrapper+0xd
0400ffac 77c7b110 RPCRT4!BaseCachedThreadRoutine+0×9d
0400ffb8 77e64829 RPCRT4!ThreadStartRoutine+0×1b
0400ffec 00000000 kernel32!BaseThreadStart+0×34

We notice that about 45 threads are waiting for the mutant 871d47d0 owned by thread 867322e0, like this one with incorrect stack trace because of inconsistent dump file data:

0: kd> !thread 86cad210 1f
THREAD 86cad210  Cid 1310.31a4  Teb: 7ff15000 Win32Thread: 00000000 WAIT: (Unknown) UserMode Non-Alertable
     871d47d0  Mutant - owning thread 867322e0
Not impersonating
DeviceMap                 d64008f8
Owning Process            87d21698       Image:         svchost.exe
Attached Process          N/A            Image:         N/A
Wait Start TickCount      18020896       Ticks: 410933 (0:01:47:00.828)
Context Switch Count      43            
UserTime                  00:00:00.000
KernelTime                00:00:00.000
Win32 Start Address RPCRT4!ThreadStartRoutine (0×77c7b0f5)
Start Address kernel32!BaseThreadStartThunk (0×77e617ec)
Stack Init abb1b000 Current abb1ac60 Base abb1b000 Limit abb18000 Call 0
Priority 11 BasePriority 10 PriorityDecrement 0
Kernel stack not resident.
ChildEBP RetAddr 
WARNING: Frame IP not in any known module. Following frames may be wrong.
abb1ac68 60203020 0×84
[…]

We inspect the thread 867322e0:

0: kd> !thread 867322e0 1f
THREAD 867322e0  Cid 1310.26dc  Teb: 7ff56000 Win32Thread: 00000000 WAIT: (Unknown) UserMode Non-Alertable
     869d4020  ProcessObject
     86732358  NotificationTimer
Not impersonating
DeviceMap                 d64008f8
Owning Process            87d21698       Image:         svchost.exe
Attached Process          N/A            Image:         N/A
Wait Start TickCount      18454684    
Context Switch Count      5172            
UserTime                  00:00:00.000
KernelTime                00:00:00.031
Win32 Start Address 0×0083fd5d
LPC Server thread working on message Id 83fd5d
Start Address kernel32!BaseThreadStartThunk (0×77e617ec)
Stack Init ad350000 Current ad34fc60 Base ad350000 Limit ad34d000 Call 0
Priority 10 BasePriority 10 PriorityDecrement 0
ChildEBP RetAddr 
ad34fc78 8083d5b1 nt!KiSwapContext+0×26
ad34fca4 8083df9e nt!KiSwapThread+0×2e5
ad34fcec 8092ae57 nt!KeWaitForSingleObject+0×346
ad34fd50 80833bdf nt!NtWaitForSingleObject+0×9a
ad34fd50 7c8285ec nt!KiFastCallEntry+0xfc
04f5fbec 7c827d0b ntdll!KiFastSystemCallRet
04f5fbf0 76548721 ntdll!NtWaitForSingleObject+0xc
04f5fc0c 7654c596 ServiceB!WaitForProcess+0×4a

04f5ffb8 77e64829 ServiceB!LpcThread+0×14e
04f5ffec 00000000 kernel32!BaseThreadStart+0×34

We see that it is waiting for the process 869d4020 and we find the latter one with a single thread blocked by DriverA:

0: kd> !process 869d4020 ff
PROCESS 869d4020  SessionId: 18  Cid: 56b8    Peb: 7ffde000  ParentCid: 3608
    DirBase: 758b3000  ObjectTable: dd7083c0  HandleCount:  79.
    Image: rundll32.exe
    VadRoot 869d5228 Vads 107 Clone 0 Private 330. Modified 1706. Locked 0.
    DeviceMap d8d260a8
    Token                             dfa8f028
    ElapsedTime                       1 Day 23:52:06.918
    UserTime                          00:00:00.312
    KernelTime                        00:00:01.484
    QuotaPoolUsage[PagedPool]         34784
    QuotaPoolUsage[NonPagedPool]      5016
    Working Set Sizes (now,min,max)  (810, 50, 345) (3240KB, 200KB, 1380KB)
    PeakWorkingSetSize                2088
    VirtualSize                       37 Mb
    PeakVirtualSize                   47 Mb
    PageFaultCount                    34473
    MemoryPriority                    BACKGROUND
    BasePriority                      10
    CommitCharge                      707
[..]
    CommandLine:  ‘rundll32 C:\WINDOWS\system32\DllA.dll,CheckData /name”PropertyA”‘
[…]

THREAD 86bac020  Cid 56b8.4904  Teb: 7ffdb000 Win32Thread: 00000000 WAIT: (Unknown) KernelMode Non-Alertable
     8774a930  NotificationEvent
Not impersonating
DeviceMap                 d8d260a8
Owning Process            869d4020       Image:         rundll32.exe
Attached Process          N/A            Image:         N/A
Wait Start TickCount      10113815       Ticks: 8318014 (1:12:06:08.968)
Context Switch Count      188            
UserTime                  00:00:00.000
KernelTime                00:00:00.000
Win32 Start Address 0x2b169ecb
Start Address kernel32!BaseThreadStartThunk (0x77e617ec)
Stack Init af12b000 Current af12ac34 Base af12b000 Limit af128000 Call 0
Priority 12 BasePriority 10 PriorityDecrement 0
ChildEBP RetAddr 
af12ac4c 8083d5b1 nt!KiSwapContext+0x26
af12ac78 8083df9e nt!KiSwapThread+0x2e5
af12acc0 b5075a82 nt!KeWaitForSingleObject+0x346
WARNING: Stack unwind information not available. Following frames may be wrong.
af12acf4 b507f989 DriverA+0×3a82
[…]

We also check virtual memory consumption and instantly see hundreds of rundll32.exe processes like in a process factory pattern:

0: kd> !vm
[...]
57f4 rundll32.exe       586 (      2344 Kb)
57d0 rundll32.exe       586 (      2344 Kb)
57b4 rundll32.exe       586 (      2344 Kb)
578c rundll32.exe       586 (      2344 Kb)
5780 rundll32.exe       586 (      2344 Kb)
577c rundll32.exe       586 (      2344 Kb)
5744 rundll32.exe       586 (      2344 Kb)
572c rundll32.exe       586 (      2344 Kb)
5724 rundll32.exe       586 (      2344 Kb)
5708 rundll32.exe       586 (      2344 Kb)
56d4 rundll32.exe       586 (      2344 Kb)
56a4 rundll32.exe       586 (      2344 Kb)
564c rundll32.exe       586 (      2344 Kb)
5630 rundll32.exe       586 (      2344 Kb)
55e0 rundll32.exe       586 (      2344 Kb)
55b8 rundll32.exe       586 (      2344 Kb)
55ac rundll32.exe       586 (      2344 Kb)
558c rundll32.exe       586 (      2344 Kb)
550c rundll32.exe       586 (      2344 Kb)
546c rundll32.exe       586 (      2344 Kb)
5428 rundll32.exe       586 (      2344 Kb)
5418 rundll32.exe       586 (      2344 Kb)
53d0 rundll32.exe       586 (      2344 Kb)
53bc rundll32.exe       586 (      2344 Kb)
53b0 rundll32.exe       586 (      2344 Kb)
53a8 rundll32.exe       586 (      2344 Kb)
530c rundll32.exe       586 (      2344 Kb)
5300 rundll32.exe       586 (      2344 Kb)
52e8 rundll32.exe       586 (      2344 Kb)
52a4 rundll32.exe       586 (      2344 Kb)
5294 rundll32.exe       586 (      2344 Kb)
5228 rundll32.exe       586 (      2344 Kb)
5118 rundll32.exe       586 (      2344 Kb)
50bc rundll32.exe       586 (      2344 Kb)
50a8 rundll32.exe       586 (      2344 Kb)
5080 rundll32.exe       586 (      2344 Kb)
5030 rundll32.exe       586 (      2344 Kb)
4fc8 rundll32.exe       586 (      2344 Kb)
4f88 rundll32.exe       586 (      2344 Kb)
4ef0 rundll32.exe       586 (      2344 Kb)
4e84 rundll32.exe       586 (      2344 Kb)
4e70 rundll32.exe       586 (      2344 Kb)
4e68 rundll32.exe       586 (      2344 Kb)
4e4c rundll32.exe       586 (      2344 Kb)
4e38 rundll32.exe       586 (      2344 Kb)
4df8 rundll32.exe       586 (      2344 Kb)
4df4 rundll32.exe       586 (      2344 Kb)
4d2c rundll32.exe       586 (      2344 Kb)
4d00 rundll32.exe       586 (      2344 Kb)
4cd4 rundll32.exe       586 (      2344 Kb)
4cbc rundll32.exe       586 (      2344 Kb)
4c90 rundll32.exe       586 (      2344 Kb)
4c80 rundll32.exe       586 (      2344 Kb)
4c68 rundll32.exe       586 (      2344 Kb)
4c40 rundll32.exe       586 (      2344 Kb)
[...]

By switching the current process context to 869d4020 (.process /r /p command) and using lmv command we check vendors for DllA and DriverA. It seems that they are different. We also see that DllA module is no longer loaded (not seen in the module list) by the time DriverA is waiting for a notification event. All rundll32.exe processes are waiting for that DriverA kernel module. We therefore decide to contact its vendor.

- Dmitry Vostokov @ DumpAnalysis.org -

Main blocked thread, missing component, execution residue and data contents locality: pattern cooperation

Monday, June 8th, 2009

The application failed to start and exception monitor saved the process memory dump with the following stack trace of the main thread that was blocked by raising a status:

0:000> !analyze -v

[...]

ERROR_CODE: (NTSTATUS) 0xc0150004 - The referenced assembly is not installed on your system.

[...]

STACK_TEXT: 
0012fca4 7c93f14e c0150004 0012fd30 7c90d950 ntdll!RtlRaiseStatus+0×26
0012fd1c 7c90e437 0012fd30 7c900000 00000000 ntdll!_LdrpInitialize+0×241
00000000 00000000 00000000 00000000 00000000 ntdll!KiUserApcDispatcher+0×7

[...]

The error code suggests that some DLL was missing. We see that the process just started indeed and only a few DLL were loaded:

0:000> version
Windows XP Version 2600 (Service Pack 3) MP (2 procs) Free x86 compatible
[...]
System Uptime: 0 days 18:00:08.328
Process Uptime: 0 days 0:00:05.000
  Kernel time: 0 days 0:00:00.000
  User time: 0 days 0:00:00.000
[…]

0:000> !dlls -l

0x00261ec0: C:\Program Files\CompanyA\Application.exe
      Base   0x00400000  EntryPoint  0x00410a13  Size        0x00057000
      Flags  0x00005000  LoadCount   0x0000ffff  TlsIndex    0x00000000
             LDRP_LOAD_IN_PROGRESS
             LDRP_ENTRY_PROCESSED

0x00261f18: C:\WINDOWS\system32\ntdll.dll
      Base   0x7c900000  EntryPoint  0x7c912c28  Size        0x000af000
      Flags  0x00085004  LoadCount   0x0000ffff  TlsIndex    0x00000000
             LDRP_IMAGE_DLL
             LDRP_LOAD_IN_PROGRESS
             LDRP_ENTRY_PROCESSED
             LDRP_PROCESS_ATTACH_CALLED

0x00261fc0: C:\WINDOWS\system32\kernel32.dll
      Base   0x7c800000  EntryPoint  0x7c80b63e  Size        0x000f6000
      Flags  0x00085004  LoadCount   0x0000ffff  TlsIndex    0x00000000
             LDRP_IMAGE_DLL
             LDRP_LOAD_IN_PROGRESS
             LDRP_ENTRY_PROCESSED
             LDRP_PROCESS_ATTACH_CALLED

So we try to find any execution residue left on the raw stack that suggests the name of the missing DLL. We dump all raw stack range and try to interpret every double word as a pointer to UNICODE and ASCII strings. We find these entries:

0:000> !teb
TEB at 7ffde000
    ExceptionList:        0012fd0c
    StackBase:            00130000
    StackLimit:           0012e000

    SubSystemTib:         00000000
    FiberData:            00001e00
    ArbitraryUserPointer: 00000000
    Self:                 7ffde000
    EnvironmentPointer:   00000000
    ClientId:             000001dc . 00000dc8
    RpcHandle:            00000000
    Tls Storage:          00000000
    PEB Address:          7ffdf000
    LastErrorValue:       0
    LastStatusValue:      0
    Count Owned Locks:    0
    HardErrorMode:        0

0:000> dpa 0012e000 00130000
0012e000  00000000
0012e004  00000000
0012e008  00000000
0012e00c  00000000
[…]
0012f28c  0012f2a0 “”
0012f290  7c945afc “SXS: RtlGetAssemblyStorageRoot() unable to resolve stor”
0012f294  00000000
0012f298  7c91eaf4 “”
0012f29c  7c91eaf4 “”
0012f2a0  00000000
0012f2a4  00000001
0012f2a8  35313063
0012f2ac  34303030
0012f2b0  00000001
0012f2b4  0012f34c “SXS: RtlGetAssemblyStorageRoot().Z.|…”
0012f2b8  000074da
0012f2bc  0012f2f8 “”
0012f2c0  7c92fb96 “…….t..M…^..”
0012f2c4  0012f2d8 “”
0012f2c8  7c945afc “SXS: RtlGetAssemblyStorageRoot() unable to resolve stor”
0012f2cc  00000000
[…]

0:000> dpu 0012e000 00130000
0012e000  00000000
0012e004  00000000
0012e008  00000000
0012e00c  00000000
0012e010  00000000
[…]
0012f164  020800ac
0012f168  0012f194 “C:\WINDOWS\WinSxS\x86_Microsoft.VC80.CRT_1fc8b3b9a1e18e”
0012f16c  00000000
0012f170  00000000
0012f174  00860086
0012f178  001505d0 “x86_Microsoft.VC80.CRT_1fc8b3b9a1e18e3b_8.0.50727.762_x”
0012f17c  00000000
0012f180  00000000
0012f184  019101bb
0012f188  ffffffff
[…]

We also find this path by visual inspection of the raw stack data that is located near the exception code c0150004:

0:000> dc 0012e000 00130000
0012e000  00000000 00000000 00000000 00000000  …………….
0012e010  00000000 00000000 00000000 00000000  …………….
[…]
0012f190  0012f164 003a0043 0057005c 004e0049  d…C.:.\.W.I.N.
0012f1a0  004f0044 00530057 0057005c 006e0069  D.O.W.S.\.W.i.n.
0012f1b0  00780053 005c0053 00380078 005f0036  S.x.S.\.x.8.6._.
0012f1c0  0069004d 00720063 0073006f 0066006f  M.i.c.r.o.s.o.f.
0012f1d0  002e0074 00430056 00300038 0043002e  t…V.C.8.0…C.
0012f1e0  00540052 0031005f 00630066 00620038  R.T._.1.f.c.8.b.
0012f1f0  00620033 00610039 00650031 00380031  3.b.9.a.1.e.1.8.
0012f200  00330065 005f0062 002e0038 002e0030  e.3.b._.8…0…
0012f210  00300035 00320037 002e0037 00360037  5.0.7.2.7…7.6.
0012f220  005f0032 002d0078 00770077 0036005f  2._.x.-.w.w._.6.
0012f230  00310062 00380032 00300037 005c0030  b.1.2.8.7.0.0.\.

0012f240  00000000 002a056c 000006e2 001601c8  ….l.*………
0012f250  00000001 00000fa0 7c911066 7c9101bb  ……..f..|…|
0012f260  00000000 00271f10 00000004 7c910ee4  ……’……..|
0012f270  35313063 34303030 00000314 0000000a  c0150004……..
0012f280  000074da 0012f2c0 7c92fb96 0012f2a0  .t………|….
0012f290  7c945afc 00000000 7c91eaf4 7c91eaf4  .Z.|…….|…|
0012f2a0  00000000 00000001 35313063 34303030  ……..c0150004
0012f2b0  00000001 0012f34c 000074da 0012f2f8  ….L….t……
0012f2c0  7c92fb96 0012f2d8 7c945afc 00000000  …|…..Z.|….
0012f2d0  7c91eaf4 7c91eaf4 00000000 00000001  …|…|……..
0012f2e0  0012f2fc 7c92fcbb 00000001 0012f384  …….|……..
0012f2f0  0000005b 00000035 00000000 0012f5a0  […5………..
0012f300  7c92fc78 0012f37c 00000035 7c91eab5  x..||…5……|
0012f310  00000000 c0150004 00000003 00000000  …………….
0012f320  00150020 00000000 0012f57c 7c935996   …….|….Y.|
0012f330  00162f90 7c945afc 0012f598 00000000  ./…Z.|……..
0012f340  0000005b 7c91005b 0012f34c 3a535853  […[..|L…SXS:
0012f350  6c745220 41746547 6d657373 53796c62   RtlGetAssemblyS
0012f360  61726f74 6f526567 2928746f 7c945afc  torageRoot().Z.|
[…]

0:000> du /c 90 012f190+4
0012f194  “C:\WINDOWS\WinSxS\x86_Microsoft. VC80.CRT_1fc8b3b9a1e18e3b_8.0.50727.762_x-ww_6b128700\”

Doing Google search we find that others experienced similar problems and they were resolved by (re-)installing VC++ 2005 redistribution package and the latest service pack.

- Dmitry Vostokov @ DumpAnalysis.org -

Manual dump, virtualized process, stack trace collection, multiple exceptions, optimized code, wild code pointer, incorrect stack trace and hidden exception: pattern cooperation

Thursday, June 4th, 2009

The following WER dialog appeared after I copied an RTF text from an e-mail client to IE while editing another blog post:

 

Although the dialog points to html.iec it is good to show the basic techniques for component identification using WinDbg. I took a manual user dump of that unresponsive process while it was showing that dialog above using Task Manager:

The saved process memory dump shows that IE was virtualized 32-bit process running on x64 Windows:

STACK_TEXT: 
00000000`000dee18 00000000`75bbab46 : wow64cpu!TurboDispatchJumpAddressEnd+0xef
00000000`000deec0 00000000`75bba14c : wow64!Wow64SystemServiceEx+0×27a
00000000`000deef0 00000000`778f52d3 : wow64!Wow64LdrpInitialize+0×4b4
00000000`000df450 00000000`778f5363 : ntdll!CsrClientConnectToServer+0×493
00000000`000df700 00000000`778e85ce : ntdll!CsrClientConnectToServer+0×523
00000000`000df7b0 00000000`00000000 : ntdll!LdrInitializeThunk+0xe

However, even after switching to x86 mode, !analyze -v doesn’t report any signs of exception violation, c0000005, seen from WER dialog above:

0:000> .load wow64exts; .effmach x86
Effective machine: x86 compatible (x86)

0:000:x86> !analyze -v
[...]

EXCEPTION_RECORD:  ffffffffffffffff -- (.exr 0xffffffffffffffff)
ExceptionAddress: 0000000000000000
   ExceptionCode: 80000003 (Break instruction exception)
  ExceptionFlags: 00000000
NumberParameters: 0

ERROR_CODE: (NTSTATUS) 0x80000003 - {EXCEPTION}  Breakpoint  A breakpoint has been reached.

STACK_TEXT: 
0027e428 7679e91a ntdll_77a60000!NtWaitForMultipleObjects+0x15
0027e4c4 76a58f76 kernel32!WaitForMultipleObjectsEx+0x11d
0027e518 75ad6071 user32!RealMsgWaitForMultipleObjectsEx+0x14d
WARNING: Frame IP not in any known module. Following frames may be wrong.
0027e538 75ad61f0 ieui+0x6071
0027e560 75ad6196 ieui+0x61f0
0027e59c 7291ffc6 ieui+0x6196
0027e5a8 7290f579 ieframe!BrowserThreadProc+0x3f
0027e5cc 7290f4c7 ieframe!BrowserNewThreadProc+0x7b
0027f63c 728fd1ba ieframe!SHOpenFolderWindow+0x188
0027f86c 013133c3 ieframe!IEWinMain+0x2d9
0027fcb0 0131325a iexplore!wWinMain+0x27b
0027fd44 7680e4a5 iexplore!_initterm_e+0x1b1
0027fd50 77adcfed kernel32!BaseThreadInitThunk+0xe
0027fd90 77add1ff ntdll_77a60000!__RtlUserThreadStart+0x23
0027fda8 00000000 ntdll_77a60000!_RtlUserThreadStart+0x1b

We list all threads for any exception processing signs and we find one such thread indeed:

0:002:x86> ~*kc 100

[...]

   2  Id: 404.ba4 Suspend: 0 Teb: 7efaa000 Unfrozen

ntdll_77a60000!NtWaitForMultipleObjects
kernel32!WaitForMultipleObjectsEx
kernel32!WaitForMultipleObjects
kernel32!WerpReportFaultInternal
kernel32!WerpReportFault
kernel32!UnhandledExceptionFilter
ntdll_77a60000!__RtlUserThreadStart
ntdll_77a60000!_EH4_CallFilterFunc
ntdll_77a60000!_except_handler4
ntdll_77a60000!ExecuteHandler2
ntdll_77a60000!ExecuteHandler
ntdll_77a60000!KiUserExceptionDispatcher

WARNING: Frame IP not in any known module. Following frames may be wrong.
0×0
html!LwMultDivRU
html!FMarkListCallback
html!JcCalcFromXaExtents
html!EmitNonBreakingSpace
html!FEmitHtmlFnOtag
html!ConvertRtfToForeign
html!FceRtfToForeign
html!RtfToForeign32
mshtmled!CRtfToHtmlConverter::ExternalRtfToInternalHtml
mshtmled!CRtfToHtmlConverter::StringRtfToStringHtml
mshtmled!CRtfToHtmlConverter::StringRtfToStringHtml
mshtmled!CHTMLEditor::ConvertRTFToHTML
mshtmled!CPasteCommand::PasteFromClipboard
mshtmled!CPasteCommand::PrivateExec
mshtmled!CCommand::Exec
mshtmled!CMshtmlEd::Exec
mshtml!CEditRouter::ExecEditCommand
mshtml!CDoc::ExecHelper
mshtml!CFrameSite::Exec
mshtml!CDoc::RouteCTElement
mshtml!CDoc::ExecHelper
mshtml!CDoc::Exec
mshtml!CDoc::OnCommand
mshtml!CDoc::OnWindowMessage
mshtml!CServer::WndProc
user32!InternalCallWinProc
user32!UserCallWinProcCheckWow
user32!SendMessageWorker
user32!SendMessageW
mshtml!CElement::PerformTA
mshtml!CDoc::PerformTA
mshtml!CDoc::PumpMessage
mshtml!CDoc::DoTranslateAccelerator
mshtml!CServer::TranslateAcceleratorW
mshtml!CDoc::TranslateAcceleratorW
ieframe!CProxyActiveObject::TranslateAcceleratorW
ieframe!CDocObjectView::TranslateAcceleratorW
ieframe!CCommonBrowser::v_MayTranslateAccelerator
ieframe!CShellBrowser2::_MayTranslateAccelerator
ieframe!CShellBrowser2::v_MayTranslateAccelerator
ieframe!CTabWindow::_TabWindowThreadProc
kernel32!BaseThreadInitThunk
ntdll_77a60000!__RtlUserThreadStart
ntdll_77a60000!_RtlUserThreadStart

[...]

The same stack fragment with function parameters has a zero for its UnhandledExceptionFilter first parameter, perhaps because of optimized code reusing local variable slots, and we cannot apply .exptr meta-command:

0:002:x86> ~2kv
ChildEBP RetAddr  Args to Child             
0361d938 7679e91a 00000002 0361d988 00000001 ntdll_77a60000!NtWaitForMultipleObjects+0x15
0361d9d4 767949d9 0361d988 0361da24 00000000 kernel32!WaitForMultipleObjectsEx+0x11d
0361d9f0 7684573d 00000002 0361da24 00000000 kernel32!WaitForMultipleObjects+0x18
0361da5c 76845969 0361db2c 00000001 00000001 kernel32!WerpReportFaultInternal+0x16d
0361da70 7681c66f 0361db2c 00000001 35137a6c kernel32!WerpReportFault+0x70
0361dafc 77add03e 00000000 77aaf2d0 00000000 kernel32!UnhandledExceptionFilter+0×1b5
0361db04 77aaf2d0 00000000 0361fe04 77a8da38 ntdll_77a60000!__RtlUserThreadStart+0×6f
0361db18 77b129b3 00000000 00000000 00000000 ntdll_77a60000!_EH4_CallFilterFunc+0×12
0361db40 77a83099 fffffffe 0361fdf4 0361dc7c ntdll_77a60000!_except_handler4+0×8e
0361db64 77a8306b 0361dc2c 0361fdf4 0361dc7c ntdll_77a60000!ExecuteHandler2+0×26
0361dc14 77a82eff 0161dc2c 0361dc7c 0361dc2c ntdll_77a60000!ExecuteHandler+0×24
0361dc18 0161dc2c 0361dc7c 0361dc2c 0361dc7c ntdll_77a60000!KiUserExceptionDispatcher+0xf
WARNING: Frame IP not in any known module. Following frames may be wrong.
0361df5c 72333b0e 07feaee4 0000001f 00000000 0×161dc2c
0361df74 72332ba9 00000000 0361dffc 00000000 html!LwMultDivRU+0×4b6
0361dfb8 7233518a 0361dffc 0000000c 0361e074 html!FMarkListCallback+0×56c
0361dfc8 723368b7 0000001f 0361dffc 0000000f html!JcCalcFromXaExtents+0×91
0361e074 723370c5 0000000f 00000000 000007d0 html!EmitNonBreakingSpace+0×445
0361e1a0 723407ff 00000000 0000000f 00000104 html!FEmitHtmlFnOtag+0×17d
0361e1c4 7231c6a8 00000001 00000000 00000000 html!ConvertRtfToForeign+0×105
0361e64c 7231c745 00ed03bc 00000000 00ed0374 html!FceRtfToForeign+0×266

Also the transition from html.iec module to exception processing code is through an invalid address 0×161dc2c so we might guess that this was an instance of wild code pointer or the case of incorrect stack trace. However using techniques to get exception context from hidden exceptions we get the following stack trace:

0:002:x86> .cxr 0361dc7c
eax=0000004e ebx=09d5755f ecx=07feaee4 edx=07feaf08 esi=07feaee4 edi=00000000
eip=7233d50b esp=0361df48 ebp=0361df5c iopl=0  nv up ei pl nz na pe nc
cs=0023 ss=002b ds=002b es=002b fs=0053 gs=002b efl=00010206
html!FPseudoStyleBis+0×26:
7233d50b 0fb70f          movzx   ecx,word ptr [edi]       ds:002b:00000000=????

0:002:x86> kL 100
ChildEBP RetAddr 
0361df50 7233d597 html!FPseudoStyleBis+0x26
0361df5c 72333b0e html!BisFromLpxszStyle+0x1c
0361df74 72332ba9 html!LwMultDivRU+0x4b6
0361dfb8 7233518a html!FMarkListCallback+0x56c
0361dfc8 723368b7 html!JcCalcFromXaExtents+0x91
0361e074 723370c5 html!EmitNonBreakingSpace+0x445
0361e1a0 723407ff html!FEmitHtmlFnOtag+0x17d
0361e1c4 7231c6a8 html!ConvertRtfToForeign+0x105
0361e64c 7231c745 html!FceRtfToForeign+0x266
0361e674 744e5ad4 html!RtfToForeign32+0x51
0361eabc 744e5c83 mshtmled!CRtfToHtmlConverter::ExternalRtfToInternalHtml+0x163
0361ef10 7449cc15 mshtmled!CRtfToHtmlConverter::StringRtfToStringHtml+0x11a
0361ef2c 7449cd81 mshtmled!CRtfToHtmlConverter::StringRtfToStringHtml+0x38
0361ef40 744cdcea mshtmled!CHTMLEditor::ConvertRTFToHTML+0x12
0361efac 744ce392 mshtmled!CPasteCommand::PasteFromClipboard+0x2c0
0361f01c 7448d218 mshtmled!CPasteCommand::PrivateExec+0x47a
0361f040 7448d1ad mshtmled!CCommand::Exec+0x4b
0361f064 74000d14 mshtmled!CMshtmlEd::Exec+0xf9
0361f094 73ff88a8 mshtml!CEditRouter::ExecEditCommand+0xd6
0361f43c 7415eccf mshtml!CDoc::ExecHelper+0x338d
0361f488 73ff8a2f mshtml!CFrameSite::Exec+0x264
0361f4bc 73ff87af mshtml!CDoc::RouteCTElement+0xf1
0361f854 73ff8586 mshtml!CDoc::ExecHelper+0x325e
0361f874 740a0e7b mshtml!CDoc::Exec+0x1e
0361f8ac 7401a708 mshtml!CDoc::OnCommand+0x9c
0361f9c0 73f297e1 mshtml!CDoc::OnWindowMessage+0x841
0361f9ec 76a58807 mshtml!CServer::WndProc+0x78
0361fa18 76a58962 user32!InternalCallWinProc+0x23
0361fa90 76a5c4b6 user32!UserCallWinProcCheckWow+0x109
0361fad4 76a5c517 user32!SendMessageWorker+0x55b
0361faf8 7400fb9b user32!SendMessageW+0x7f
0361fb24 73f68e5a mshtml!CElement::PerformTA+0x71
0361fb44 73f68db9 mshtml!CDoc::PerformTA+0xd8
0361fbc0 73ff381c mshtml!CDoc::PumpMessage+0x8e0
0361fc74 73ff3684 mshtml!CDoc::DoTranslateAccelerator+0x33f
0361fc90 73ff34cc mshtml!CServer::TranslateAcceleratorW+0x56
0361fcb0 7296f550 mshtml!CDoc::TranslateAcceleratorW+0x83
0361fccc 7296f600 ieframe!CProxyActiveObject::TranslateAcceleratorW+0x30
0361fcf0 7296fca1 ieframe!CDocObjectView::TranslateAcceleratorW+0xb1
0361fd10 7296faf4 ieframe!CCommonBrowser::v_MayTranslateAccelerator+0xda
0361fd3c 7296f7b0 ieframe!CShellBrowser2::_MayTranslateAccelerator+0x68
0361fd4c 7296f7f5 ieframe!CShellBrowser2::v_MayTranslateAccelerator+0x15
0361fdb8 7680e4a5 ieframe!CTabWindow::_TabWindowThreadProc+0x264
0361fdc4 77adcfed kernel32!BaseThreadInitThunk+0xe
0361fe04 77add1ff ntdll_77a60000!__RtlUserThreadStart+0x23
0361fe1c 00000000 ntdll_77a60000!_RtlUserThreadStart+0x1b

Then we can use lmv command to see the component version:

0:002:x86> lmv m html
start             end                 module name
722f0000 7236f000   html     # (pdb symbols)          c:\mss\html.pdb\34A9E6645AEA4E6A83EA51D2849C731C1\html.pdb
    Loaded symbol image file: html.iec
    Image path: C:\Windows\System32\html.iec
    Image name: html.iec
    Timestamp:        Tue Mar 03 03:26:33 2009 (49ACA369)
    CheckSum:         00086241
    ImageSize:        0007F000
    File version:     2017.0.0.18226
    Product version:  6.0.6001.18226
    File flags:       0 (Mask 3F)
    File OS:          40004 NT Win32
    File type:        2.0 Dll
    File date:        00000000.00000000
    Translations:     0409.04b0
    CompanyName:      Microsoft Corporation
    ProductName:      Microsoft® Windows® Operating System
    InternalName:     HTML.IEC
    OriginalFilename: HTML.IEC
    ProductVersion:   6.0.6001.18226
    FileVersion:      2017.0.0.18226 (vistasp1_gdr.090302-1506)
    FileDescription:  Microsoft HTML Converter
    LegalCopyright:   © Microsoft Corporation. All rights reserved.

- Dmitry Vostokov @ DumpAnalysis.org -

Graphical Notation for Memory Dumps (Part 1)

Saturday, May 23rd, 2009

Inspired by Penrose tensor notation encountered in The Road to Reality book and Feynman diagrams I’d like to introduce Visual Dump Objects (VDO) graphical notation to depict and communicate memory dump analysis patterns, their combinations and analysis results. Let’s look at some basic visual objects (shown in handwriting).

1. A thread:

   or   

2. A function:

3. A module:

4. A thread running through functions, modules or both (stack trace). Optional arrowhead can indicate stack trace direction:

  or    or  

Threads running through modules depict collapsed stack traces.

5. A blocked thread:

An example of 3 threads blocked by another thread (an arrowhead can disambiguate the direction of the waiting chain):

6. A spiking thread (colors are encouraged in VDO notation):

   or   

7. Space boundary between user land and kernel land:

 

Here is an example of the thread spiking in kernel space:

or with modules from stack trace:

More notation to come very soon.

- Dmitry Vostokov @ DumpAnalysis.org -

Comparative Memory Dump Analysis: CPU Spikes

Friday, May 22nd, 2009

Comparative analysis is based on saving several consecutive memory dump files to see similarities and differences. Most often this technique is used for memory leaks, for example, process heap leaks. Here we see another example related to CPU spikes.

One application was spiking CPU and its user dump file was saved manually. !runaway WinDbg command showed the following distribution of kernel and user thread times:

0:000> !runaway f
 User Mode Time
  Thread       Time
  14:f48       0 days 1:22:34.343
  15:f4c       0 days 0:00:45.671
  17:f54       0 days 0:00:41.359
  18:f58       0 days 0:00:33.468
  16:f50       0 days 0:00:24.171
  12:f40       0 days 0:00:17.531
  13:f44       0 days 0:00:07.109
   0:f10       0 days 0:00:00.500
[…]
 Kernel Mode Time
  Thread       Time
  15:f4c       0 days 0:00:51.046
  17:f54       0 days 0:00:41.125
  18:f58       0 days 0:00:32.765
  16:f50       0 days 0:00:23.500
  12:f40       0 days 0:00:18.671
  13:f44       0 days 0:00:07.453
  14:f48       0 days 0:00:02.031
   1:f14       0 days 0:00:00.343
[…]
 Elapsed Time
  Thread       Time
   1:f14       0 days 12:01:42.656
   0:f10       0 days 12:01:42.656
  24:f70       0 days 12:01:42.093
  23:f6c       0 days 12:01:42.093
  22:f68       0 days 12:01:42.093
  21:f64       0 days 12:01:42.093
  20:f60       0 days 12:01:42.093
  19:f5c       0 days 12:01:42.093
  18:f58       0 days 12:01:42.093
  17:f54       0 days 12:01:42.093
  16:f50       0 days 12:01:42.093
  15:f4c       0 days 12:01:42.093
  14:f48       0 days 12:01:42.093
  13:f44       0 days 12:01:42.093
[…]

We see that thread #14 spent 1 hour and 22 minutes in user space and only 2 seconds in kernel space. Its thread stack showed DllA and DllB modules:

0:000> ~14s
*** ERROR: Symbol file could not be found.  Defaulted to export symbols for DllA.dll -
eax=00010da2 ebx=09227bdc ecx=09236bcc edx=0000003a esi=006769e4 edi=00000335
eip=6120b231 esp=00fefd04 ebp=00fefd2c iopl=0 nv up ei pl zr na pe nc
cs=001b ss=0023 ds=0023 es=0023 fs=003b gs=0000 efl=00000246
DllA!foo+0×3549:
6120b231 8b45ec          mov     eax,dword ptr [ebp-14h] ss:0023:00fefd18=09251a30

0:000> kv
ChildEBP RetAddr  Args to Child             
WARNING: Stack unwind information not available. Following frames may be wrong.
00fefd2c 6120be31 00000063 8633cdb9 006769e4 DllA!foo+0×3549
00fefd64 6120c6db 00fefe00 00006f07 0353f594 DllA!foo+0×4149
00fefdbc 6120d517 00fefe00 00fefde8 8633cd25 DllA!foo+0×49f3
00fefdf8 6120db4d 0353f594 09251848 00000000 DllA!foo+0×582f
00fefe14 6120f280 006769a0 00000000 0000020c DllA!foo+0×5e65
00fefe84 00933436 0000020c 0353f5ac 00000001 DllA!foo+0×7598

00fefec0 008414f1 0353f580 034db5d0 00000003 DllB!read+0xc6
[…]
00feffb8 77e64829 0078f290 00000000 00000000 msvcr71!_endthreadex+0xa0
00feffec 00000000 7c3494f6 0078f290 00000000 kernel32!BaseThreadStart+0×34

These modules belong to different vendors and the code could loop in DllB or in one of DllA functions. In order to show statistically which module is more suspect than the other, we requested 3 consecutive memory dumps of the same spiking process. They revealed the similar uniform pattern:

0:000> !runaway f
 User Mode Time
  Thread       Time
  16:e5c       0 days 1:37:24.640
   0:e1c       0 days 0:00:00.640
  17:e60       0 days 0:00:00.453
  18:e64       0 days 0:00:00.328
  15:e58       0 days 0:00:00.265
  14:e54       0 days 0:00:00.203
[…]
 Kernel Mode Time
  Thread       Time
  16:e5c       0 days 0:00:00.687
  15:e58       0 days 0:00:00.421
  17:e60       0 days 0:00:00.375
  14:e54       0 days 0:00:00.265
  18:e64       0 days 0:00:00.250
   0:e1c       0 days 0:00:00.046
[…]
 Elapsed Time
  Thread       Time
   1:e20       0 days 4:41:11.562
   0:e1c       0 days 4:41:11.562
  18:e64       0 days 4:41:10.875
  17:e60       0 days 4:41:10.875
  16:e5c       0 days 4:41:10.875
  15:e58       0 days 4:41:10.875
  14:e54       0 days 4:41:10.875
  13:e50       0 days 4:41:10.875
  12:e4c       0 days 4:41:10.875
  11:e48       0 days 4:41:10.875
  10:e44       0 days 4:41:10.875
   9:e40       0 days 4:41:10.875
   8:e3c       0 days 4:41:10.875
   7:e38       0 days 4:41:10.875
   6:e34       0 days 4:41:10.875
   5:e30       0 days 4:41:10.875
   4:e2c       0 days 4:41:10.875
   3:e28       0 days 4:41:10.875
   2:e24       0 days 4:41:10.875
[…]

1st dump file: Process Uptime: 0 days 4:41:12.000

0:000> ~16r
eax=07abbb6c ebx=07ab45d4 ecx=07ab4604 edx=00679344 esi=006769e4 edi=00000035
eip=6120b23e esp=0106fd30 ebp=0106fd58 iopl=0 nv up ei pl zr na pe nc
cs=001b ss=0023 ds=0023 es=0023 fs=003b gs=0000 efl=00000246
DllA!foo+0×3556:
6120b23e 8bcb            mov     ecx,ebx

0:000> ~16kv
ChildEBP RetAddr  Args to Child             
WARNING: Stack unwind information not available. Following frames may be wrong.
0106fd58 6120be31 0000000b 47d6689c 006769e4 DllA!foo+0×3556
0106fd90 6120c6db 0106fe2c 00007329 0342d14c DllA!foo+0×4149
0106fde8 6120d517 0106fe2c 0106fe14 47d66b28 DllA!foo+0×49f3
0106fe24 6120db4d 0342d14c 07abb960 00000000 DllA!foo+0×582f
0106fe40 6120f280 006769a0 00000000 000001d8 DllA!foo+0×5e65
0106feb0 00933436 000001d8 0342d164 00000001 DllA!foo+0×7598

0106feec 008414f1 0342d138 00000000 0340b290 DllB!read+0xc6
[…]
0106ffb8 77e64829 00784be0 00000000 00000000 msvcr71!_endthreadex+0xa0
0106ffec 00000000 7c3494f6 00784be0 00000000 kernel32!BaseThreadStart+0×34

2nd dump file: Process Uptime: 0 days 4:41:42.000

0:000> ~16r
eax=0002550b ebx=00679344 ecx=07ab45d4 edx=00000009 esi=006769e4 edi=00000035
eip=6120b231 esp=0106fd30 ebp=0106fd58 iopl=0 nv up ei pl zr na pe nc
cs=001b ss=0023 ds=0023 es=0023 fs=003b gs=0000 efl=00000246
*** ERROR: Symbol file could not be found.  Defaulted to export symbols for DllA.dll -
DllA!foo+0×3549:
6120b231 8b45ec          mov     eax,dword ptr [ebp-14h] ss:0023:0106fd44=07abbb48

0:000> ~16kv
ChildEBP RetAddr  Args to Child             
WARNING: Stack unwind information not available. Following frames may be wrong.
0106fd58 6120be31 0000000b 47d6689c 006769e4 DllA!foo+0×3549
0106fd90 6120c6db 0106fe2c 00007329 0342d14c DllA!foo+0×4149
0106fde8 6120d517 0106fe2c 0106fe14 47d66b28 DllA!foo+0×49f3
0106fe24 6120db4d 0342d14c 07abb960 00000000 DllA!foo+0×582f
0106fe40 6120f280 006769a0 00000000 000001d8 DllA!foo+0×5e65
0106feb0 00933436 000001d8 0342d164 00000001 DllA!foo+0×7598

0106feec 008414f1 0342d138 00000000 0340b290 DllB!read+0xc6
[…]
0106ffb8 77e64829 00784be0 00000000 00000000 msvcr71!_endthreadex+0xa0
0106ffec 00000000 7c3494f6 00784be0 00000000 kernel32!BaseThreadStart+0×34

3rd dump file: Process Uptime: 0 days 4:41:59.000

0:000> ~16r
eax=07abbb6c ebx=07ab4604 ecx=07ab45d4 edx=07ab4604 esi=006769e4 edi=00000035
eip=6120b239 esp=0106fd30 ebp=0106fd58 iopl=0 nv up ei pl zr na pe nc
cs=001b ss=0023 ds=0023 es=0023 fs=003b gs=0000 efl=00000246
DllA!foo+0×3551:
6120b239 895108          mov     dword ptr [ecx+8],edx ds:0023:07ab45dc=07ab4604

0:000> ~16kv
ChildEBP RetAddr  Args to Child             
WARNING: Stack unwind information not available. Following frames may be wrong.
0106fd58 6120be31 0000000b 47d6689c 006769e4 DllA!foo+0×3551
0106fd90 6120c6db 0106fe2c 00007329 0342d14c DllA!foo+0×4149
0106fde8 6120d517 0106fe2c 0106fe14 47d66b28 DllA!foo+0×49f3
0106fe24 6120db4d 0342d14c 07abb960 00000000 DllA!foo+0×582f
0106fe40 6120f280 006769a0 00000000 000001d8 DllA!foo+0×5e65
0106feb0 00933436 000001d8 0342d164 00000001 DllA!foo+0×7598

0106feec 008414f1 0342d138 00000000 0340b290 DllB!read+0xc6
[…]
0106ffb8 77e64829 00784be0 00000000 00000000 msvcr71!_endthreadex+0xa0
0106ffec 00000000 7c3494f6 00784be0 00000000 kernel32!BaseThreadStart+0×34

We see exactly the same stack trace except the first (top) frame. The offset differs very little: +0×3556, +0×3549 (twice) and +0×3551. We can suspect the small loop here:

0:000> ub
DllA!foo+0x3549:
6120b231 8b45ec          mov     eax,dword ptr [ebp-14h]
6120b234 8d0490          lea     eax,[eax+edx*4]
6120b237 8b10            mov     edx,dword ptr [eax]
6120b239 895108          mov     dword ptr [ecx+8],edx
6120b23c 8908            mov     dword ptr [eax],ecx
6120b23e 8bcb            mov     ecx,ebx
6120b240 85db            test    ebx,ebx
6120b242 75e3            jne     DllA!foo+0x353f (6120b227)

0:000> ub 6120b231
DllA!foo+0x3535:
6120b21d 8b4d08          mov     ecx,dword ptr [ebp+8]
6120b220 8b0c88          mov     ecx,dword ptr [eax+ecx*4]
6120b223 3bcb            cmp     ecx,ebx
6120b225 741d            je      DllA!foo+0x355c (6120b244)
6120b227 8b5908          mov     ebx,dword ptr [ecx+8]
6120b22a 8b410c          mov     eax,dword ptr [ecx+0Ch]
6120b22d 33d2            xor     edx,edx
6120b22f f7f7            div     eax,edi

0:000> u
DllA!foo+0x3549:
6120b231 8b45ec          mov     eax,dword ptr [ebp-14h]
6120b234 8d0490          lea     eax,[eax+edx*4]
6120b237 8b10            mov     edx,dword ptr [eax]
6120b239 895108          mov     dword ptr [ecx+8],edx
6120b23c 8908            mov     dword ptr [eax],ecx
6120b23e 8bcb            mov     ecx,ebx
6120b240 85db            test    ebx,ebx
6120b242 75e3            jne     DllA!foo+0x353f (6120b227)

0:000> u 6120b242
DllA!foo+0x355a:
6120b242 75e3            jne     DllA!foo+0x353f (6120b227)
6120b244 ff4508          inc     dword ptr [ebp+8]
6120b247 8b4508          mov     eax,dword ptr [ebp+8]
6120b24a 3b4608          cmp     eax,dword ptr [esi+8]
6120b24d 72cc            jb      DllA!foo+0x3533 (6120b21b)
6120b24f ff36            push    dword ptr [esi]
6120b251 e88dc70000      call    DllA!foo+0xfcfb (612179e3)
6120b256 59              pop     ecx

All this strongly rules out DllB and points to DllA as the first vendor contact.

- Dmitry Vostokov @ DumpAnalysis.org -

Crash Dump Analysis Patterns (Part 84)

Friday, May 15th, 2009

Sometimes the assembly code looks almost wild (not like generated by your favourite compiler). For example (that also shows .NET runtime native unhandled exception processing):

0:000> kL 100
ChildEBP RetAddr 
0014dbb4 77189254 ntdll!KiFastSystemCallRet
0014dbb8 75fec244 ntdll!ZwWaitForSingleObject+0xc
0014dc28 75fec1b2 kernel32!WaitForSingleObjectEx+0xbe
0014dc3c 72605389 kernel32!WaitForSingleObject+0x12
0014dc6c 726058e7 mscorwks!ClrWaitForSingleObject+0x24
0014e128 72608084 mscorwks!RunWatson+0x1df
0014e86c 7260874a mscorwks!DoFaultReportWorker+0xb59
0014e8a8 72657452 mscorwks!DoFaultReport+0xc3
0014e8cc 7265c0c7 mscorwks!WatsonLastChance+0x3f
0014e924 7265c173 mscorwks!CLRAddVectoredHandlers+0x209
0014e92c 7603f4be mscorwks!InternalUnhandledExceptionFilter+0x22
0014e9e8 771a85b7 kernel32!UnhandledExceptionFilter+0×127
0014e9f0 77139a14 ntdll!__RtlUserThreadStart+0×6f
0014ea04 771340f4 ntdll!_EH4_CallFilterFunc+0×12
0014ea2c 77189b99 ntdll!_except_handler4+0×8e
0014ea50 77189b6b ntdll!ExecuteHandler2+0×26
0014eb00 771899f7 ntdll!ExecuteHandler+0×24
0014eb00 03ca0141 ntdll!KiUserExceptionDispatcher+0xf
WARNING: Frame IP not in any known module. Following frames may be wrong.
0014ee28 634c2f42 0×3ca0141
0014ee34 67715e44 System_ni+0×132f42
0014ee70 72431b4c System_ServiceProcess_ni+0×25e44
0014ee80 724421f9 mscorwks!CallDescrWorker+0×33
0014ef00 72456571 mscorwks!CallDescrWorkerWithHandler+0xa3
0014f03c 724565a4 mscorwks!MethodDesc::CallDescr+0×19c
0014f058 724565c2 mscorwks!MethodDesc::CallTargetWorker+0×1f
0014f070 724afac5 mscorwks!MethodDescCallSite::CallWithValueTypes+0×1a
0014f1d4 724af9e5 mscorwks!ClassLoader::RunMain+0×223
0014f43c 724aff35 mscorwks!Assembly::ExecuteMainMethod+0xa6
0014f90c 724b011f mscorwks!SystemDomain::ExecuteMainMethod+0×456
0014f95c 724b004f mscorwks!ExecuteEXE+0×59
0014f9a4 72f57c24 mscorwks!_CorExeMain+0×15c
0014f9b4 75fe4911 mscoree!_CorExeMain+0×2c
0014f9c0 7716e4b6 kernel32!BaseThreadInitThunk+0xe
0014fa00 7716e489 ntdll!__RtlUserThreadStart+0×23
0014fa18 00000000 ntdll!_RtlUserThreadStart+0×1b

We set exception context:

0:000> kv 100
ChildEBP RetAddr  Args to Child             
[...]
0014e9e8 771a85b7 0014ea18 77139a14 00000000 kernel32!UnhandledExceptionFilter+0×127 (FPO: [SEH])
[…]

0:000> .exptr 0014ea18

----- Exception record at 0014eb18:
ExceptionAddress: 03ca0141
   ExceptionCode: c0000005 (Access violation)
  ExceptionFlags: 00000000
NumberParameters: 2
   Parameter[0]: 00000000
   Parameter[1]: 00000000
Attempt to read from address 00000000

----- Context record at 0014eb34:
eax=00000001 ebx=08394ff8 ecx=00000000 edx=00000001 esi=056a2a94 edi=00000000
eip=03ca0141 esp=0014ee00 ebp=0014ee28 iopl=0         nv up ei pl zr na pe nc
cs=001b  ss=0023  ds=0023  es=0023  fs=003b  gs=0000             efl=00010246
03ca0141 3909            cmp     dword ptr [ecx],ecx  ds:0023:00000000=????????

Then we disassemble the code at crash point and it looks strange including calls through DS data segment:

0:000> .asm no_code_bytes
Assembly options: no_code_bytes

0:000> u 03ca0141
03ca0141 cmp     dword ptr [ecx],ecx
03ca0143 call    dword ptr ds:[36067C0h]
03ca0149 mov     ecx,dword ptr [esi+5Ch]
03ca014c cmp     dword ptr [ecx],ecx
03ca014e call    dword ptr ds:[3606D10h]
03ca0154 mov     dword ptr [ebp-1Ch],0
03ca015b mov     dword ptr [ebp-18h],0FCh
03ca0162 push    3CA0180h

However further disassembly finally reaches RET instruction:

0:000> u
03ca0167 jmp     03ca0169
03ca0169 movzx   edx,byte ptr [ebp-24h]
03ca016d mov     ecx,dword ptr [ebp-28h]
03ca0170 call    System_ServiceProcess_ni+0x25140 (67715140)
03ca0175 pop     eax
03ca0176 jmp     eax
03ca0178 lea     esp,[ebp-0Ch]
03ca017b pop     ebx

0:000> u
03ca017c pop     esi
03ca017d pop     edi
03ca017e pop     ebp
03ca017f ret

03ca0180 mov     dword ptr [ebp-18h],0
03ca0187 jmp     03ca0178
03ca0189 add     byte ptr [eax],al
03ca018b add     byte ptr [eax],al

and backward disassembling shows the matching function prolog code:

0:000> ub 03ca0141
03ca0127 movzx   eax,byte ptr [ebp-24h]
03ca012b test    eax,eax
03ca012d je      03ca0154
03ca012f cmp     dword ptr [esi+60h],0
03ca0133 je      03ca013e
03ca0135 mov     ecx,dword ptr [esi+60h]
03ca0138 call    dword ptr ds:[3C20010h]
03ca013e mov     ecx,dword ptr [esi+58h]

0:000> ub 03ca0127
03ca0114 push    esi
03ca0115 push    ebx
03ca0116 sub     esp,1Ch

03ca0119 xor     eax,eax
03ca011b mov     dword ptr [ebp-18h],eax
03ca011e mov     dword ptr [ebp-28h],ecx
03ca0121 mov     dword ptr [ebp-24h],edx
03ca0124 mov     esi,dword ptr [ebp-28h]

0:000> ub 03ca0114
03ca0102 retf
03ca0103 add     eax,dword ptr [eax+36h]
03ca0106 retf
03ca0107 add     ebx,dword ptr [esi+esi-35h]
03ca010b add     esi,esp
03ca010d cmp     eax,8B550360h
03ca0112 in      al,dx
03ca0113 push    edi

From stack trace I suspected this code as JIT-compiled .NET code of the the main assemebly method. And indeed I found the similar call signatures like mine

03ca0141 cmp     dword ptr [ecx],ecx
03ca0143 call    dword ptr ds:[36067C0h]

in the following MSDN article:

Drill Into .NET Framework Internals to See How the CLR Creates Runtime Objects

Hence the name of this pattern: JIT Code.

- Dmitry Vostokov @ DumpAnalysis.org -

Pattern-Driven Memory Analysis (Part 3)

Thursday, May 14th, 2009

Part 2 briefly discussed debuggers and their commands. Debugger commands can be grouped together into scripts that can be run against memory dump files and their resulted textual output can be redirected to log files.

Therefore we start with the first step in our analysis process diagram introduced in Part 1:

Information Collection: Scripts

Here a script is a sequence of instructions to extract formatted textual information from a memory dump file (or a live system). There are many advantages of scripts and logs. Many companies, for example, banks, avoid sending plain memory dump files because of security considerations in order to prevent exposure of company or private information. After the advent of 64-bit personal computing complete memory dump files became larger and larger and it is now common to get 32Gb memory dumps. Processing multiple long-running commands can easily be done from textual log files. Additionally, scripts can be used to process hundreds of memory dumps in one go to look for similarities and differences. Many tools can be used for such purpose including built-in debugger scripting capabilities, shell scripting and regular expressions for intelligent search.

- Dmitry Vostokov @ DumpAnalysis.org -

Sentinel Pointers

Wednesday, May 13th, 2009

Consider this crash point:

0:000> r
eax=02d0f15c ebx=02a62918 ecx=77e41c30 edx=00000000 esi=ffffffff edi=02a8ed28
eip=76154193 esp=02d0f124 ebp=02d0f130 iopl=0 nv up ei pl zr na pe nc
cs=001b ss=0023 ds=0023 es=0023 fs=003b gs=0000 efl=00010246
Application!GetData+0xb:
76154193 8b9eac000000    mov     ebx,dword ptr [esi+0ACh] ds:0023:000000ab=????????

Seeing 000000ab address we can think that ESI was 0 but it was 0xFFFFFFFF. Adding 0xAC to it produced an effective NULL data pointer 0xAB through integer addition overflow if we consider addition as unsigned. It is easy to see the result if we consider 0xFFFFFFFF as signed -1. Looking at stack trace and function disassembly we see that 0xFFFFFFFF was passed as a parameter:

0:000> kv
ChildEBP RetAddr  Args to Child             
02d0f130 7616328d ffffffff 02d0f15c 02d0f150 Application!GetData+0xb
[…]
02d0ffec 00000000 740420d8 02a74070 00000000 kernel32!BaseThreadStart+0×34

0:000> u Application!GetData
Application!GetData:
76154188 mov     edi,edi
7615418a push    ebp
7615418b mov     ebp,esp
7615418d push    ecx
7615418e push    ebx
7615418f push    esi
76154190 mov     esi,dword ptr [ebp+8]
76154193 mov     ebx,dword ptr [esi+0ACh]

This is an example of a sentinel pointer marking the end of a linked list, for example, although NULL pointers having 0 value are usually used. Also -1 value can be used to assign an invalid pointer value. 

- Dmitry Vostokov @ DumpAnalysis.org -

Stack trace, invalid code pointer and hooked functions: pattern cooperation

Wednesday, May 13th, 2009

When looking at a stack trace of one crashed process we noticed an invalid code pointer. It is not a NULL code pointer but has the same stack trace pattern:

0:000> kL
ChildEBP RetAddr 
WARNING: Frame IP not in any known module. Following frames may be wrong.
0013dfb4 00401791 0×5e388583
0013fdf4 0040189f Application!RequestData+0×3e1

0013fee4 00401d0a Application!main+0×3f
0013ffc0 77e4f23b Application!mainCRTStartup+0×16c
0013fff0 00000000 kernel32!BaseProcessStart+0×23

When we look at raw stack data and examine the backward disassembly of the return address we see that invalid code was called from RequestData function and WinDbg stack trace reconstruction is not suspicious (it is structurally and semantically correct):

0:000> dds esp l10
0013dfb8  00401791 Application!RequestData+0x3e1
0013dfbc  00000140
0013dfc0  0013ee50
0013dfc4  00000fa4
0013dfc8  00000000
0013dfcc  00000000
0013dfd0  00000ece
0013dfd4  0013ffc0
0013dfd8  7ffdc000
0013dfdc  00000140
0013dfe0  0000054c
0013dfe4  50000002
0013dfe8  4b4919ac
0013dfec  00000000
0013dff0  00000000
0013dff4  003f003c

0:000> .asm no_code_bytes
Assembly options: no_code_bytes

0:000> ub 00401791
Application!RequestData+0x3c8:
00401778 Application!RequestData+0x3d0 (00401780)
0040177a lea     ebx,[ebx]
00401780 push    0
00401782 push    eax
00401783 lea     ecx,[esp+esi+0E30h]
0040178a push    ecx
0040178b push    edi
0040178c call    Application!recv (0040e382)

When seeing recv call we might suspect that the crash happened just inside that function because the raw stack data upwards (lower addresses) doesn’t have any execution residue left from nested function calls:

0:000> dds esp-100 esp
0013deb8  00000000
0013debc  00000000
0013dec0  00000000
0013dec4  00000000
0013dec8  00000000
0013decc  00000000
0013ded0  00000000
0013ded4  00000000
0013ded8  00000000
0013dedc  00000000
0013dee0  00000000
0013dee4  00000000
0013dee8  00000000
0013deec  00000000
0013def0  00000000
0013def4  00000000
0013def8  00000000
0013defc  00000000
0013df00  00000000
0013df04  00000000
0013df08  00000000
0013df0c  00000000
0013df10  00000000
0013df14  00000000
0013df18  00000000
0013df1c  00000000
0013df20  00000000
0013df24  00000000
0013df28  00000000
0013df2c  00000000
0013df30  00000000
0013df34  00000000
0013df38  00000000
0013df3c  00000000
0013df40  00000000
0013df44  00000000
0013df48  00000000
0013df4c  00000000
0013df50  00000000
0013df54  00000000
0013df58  00000000
0013df5c  00000000
0013df60  00000000
0013df64  00000000
0013df68  00000000
0013df6c  00000000
0013df70  00000000
0013df74  00000000
0013df78  00000000
0013df7c  00000000
0013df80  00000000
0013df84  00000000
0013df88  00000000
0013df8c  00000000
0013df90  00000000
0013df94  00000000
0013df98  00000000
0013df9c  00000000
0013dfa0  00000000
0013dfa4  00000000
0013dfa8  00000000
0013dfac  00000000
0013dfb0  00000000
0013dfb4  00000000
0013dfb8  00401791 Application!RequestData+0x3e1

So we follow recv call forward disassembly (notice that the first jump is indirect):

0:000> u 0040e382
Application!recv:
0040e382 jmp     dword ptr [Application!_imp__recv (00410180)]

Application!closesocket:
0040e388 jmp     dword ptr [Application!_imp__closesocket (00410170)]
Application!WSAGetLastError:
0040e38e jmp     dword ptr [Application!_imp__WSAGetLastError (00410174)]
Application!send:
0040e394 jmp     dword ptr [Application!_imp__send (00410178)]
Application!connect:
0040e39a jmp     dword ptr [Application!_imp__connect (0041017c)]
Application!htons:
0040e3a0 jmp     dword ptr [Application!_imp__htons (00410198)]
Application!setsockopt:
0040e3a6 jmp     dword ptr [Application!_imp__setsockopt (00410184)]
Application!socket:
0040e3ac jmp     dword ptr [Application!_imp__socket (00410188)]

0:000> dps 00410180 l10
00410180  71ad2f7f ws2_32!recv
00410184  71ad2d47 ws2_32!setsockopt
00410188  71ad410c ws2_32!socket
0041018c  71ad7ca1 ws2_32!gethostbyname
00410190  71ad4f3b ws2_32!WSAStartup
00410194  71ad7b5b ws2_32!gethostname
00410198  71ad28bc ws2_32!htons
0041019c  71ad3da8 ws2_32!WSACleanup
004101a0  00000000
004101a4  00000000
004101a8  00000000
004101ac  00000000
004101b0  00000000
004101b4  45cd184e
004101b8  00000000
004101bc  00000002

0:000> u 71ad2f7f
ws2_32!recv:
71ad2f7f jmp     7fd60000

71ad2f84 sub     esp,10h
71ad2f87 push    ebx
71ad2f88 xor     ebx,ebx
71ad2f8a cmp dword ptr [ws2_32!PrologPointer (71ae4044)],offset ws2_32!Prolog_v2 (71ad6067)
71ad2f94 push    esi
71ad2f95 je      ws2_32!recv+0×18 (71ad6207)
71ad2f9b lea     eax,[ebp-8]

0:000> u 7fd60000
7fd60000 jmp     DllA!recv_patch (612101b6)
7fd60005 mov     edi,edi
7fd60007 push    ebp
7fd60008 mov     ebp,esp
7fd6000a jmp     ws2_32!recv+0×5 (71ad2f84)
7fd6000f add     byte ptr [eax],al
7fd60011 add     byte ptr [eax],al
7fd60013 add     byte ptr [eax],al

Finally we see that sockets library functions were patched by a 3rd-party module DllA and we need to contact its vendor.

- Dmitry Vostokov @ DumpAnalysis.org -

The Structure of MDA Certification

Tuesday, May 12th, 2009

Previously announced memory dump analysis certification will have x86/x64-based tracks for Windows and Unix (including Linux / FreeBSD / Mac OS X). Each track consists of 3 exams, each having its own set of requirements and scope:

  • Fundamentals of Memory Dump Analysis
  • Intermediate Memory Dump Analysis
  • Advanced Memory Dump Analysis

More information will be available later. The initiative is supported by OpenTask.

- Dmitry Vostokov @ DumpAnalysis.org -

Stack trace collection, blocked threads, pass through functions and main thread: pattern cooperation

Friday, May 8th, 2009

A server was hanging and a complete memory dump was generated. Usual checks for resource contention or CPU spikes like !locks and !running WinDbg commands didn’t reveal anything. Therefore we decided to list all threads in the system using !stacks and !process 0 ff commands. The former command gives a birds eye overview of threads and it was found suspicious that many threads in different processes had 0 ticks waiting and had this top level call:

0: kd> !stacks
Proc.Thread .Thread Ticks ThreadState Blocker
[...]

[88d9cd88 explorer.exe]
Proc.Thread .Thread  Ticks   ThreadState Blocker
1ecc.001eb4 88d99db0 0000000 Blocked     3rdPartyAVDriver+0×1394
[…]

[862bad88 userinit.exe]
4394.005570 878f0790 0000000 Blocked     3rdPartyAVDriver+0×1394

Inspection of the latter command detailed output of thread stacks showed that all these suspicious threads were in fact blocked when loading a DLL:

0: kd> .process /r /p 862bad88
Implicit process is now 862bad88
Loading User Symbols

0: kd> .thread 878f0790
Implicit thread is now 878f0790

0: kd> kL 100
ChildEBP RetAddr 
9aeb529c 80833e95 nt!KiSwapContext+0x26
9aeb52c8 80829027 nt!KiSwapThread+0x2e5
9aeb5310 b8dff394 nt!KeDelayExecutionThread+0x2ab
WARNING: Stack unwind information not available. Following frames may be wrong.
9aeb5338 b8dffe35 3rdPartyAVDriver+0×1394
9aeb5354 b8e00208 3rdPartyAVDriver+0×1e35
9aeb5398 b8e0e45a 3rdPartyAVDriver+0×2208
9aeb53d0 8081e087 3rdPartyAVDriver+0×1045a

9aeb53e4 b7d6e41b nt!IofCallDriver+0×45
9aeb53f4 b7d6a3ee DriverA!Dispatch+0xfe
9aeb5404 8081e087 DriverA!KUI_dispatch+0×6d

9aeb5418 b8bdb4e1 nt!IofCallDriver+0×45
9aeb543c b8bdb755 DriverB!PassThrough+0xd3
9aeb545c 8081e087 DriverB!Dispatch+0×153

9aeb5470 b6a7b81d nt!IofCallDriver+0×45
9aeb5498 b6a6c5bb DriverC!LowerPassThrough+0×48
9aeb54bc b6a6cc32 DriverC+0×75bb
9aeb5554 b6a7aa57 DriverC+0×7c32
9aeb5564 b6a7aaa7 DriverC+0×15a57
9aeb558c 8081e087 DriverC!DispatchPassThrough+0×48
9aeb55a0 808fb13b nt!IofCallDriver+0×45
9aeb5688 80939c6a nt!IopParseDevice+0xa35
9aeb5708 80935d9e nt!ObpLookupObjectName+0×5b0
9aeb575c 808ece57 nt!ObOpenObjectByName+0xea
9aeb57d8 808ee0f1 nt!IopCreateFile+0×447
9aeb5834 808f1e31 nt!IoCreateFile+0xa3
9aeb5874 8088b19c nt!NtOpenFile+0×27
9aeb5874 7c94860c nt!KiFastCallEntry+0xfc
0007e1fc 7c947339 ntdll!KiFastSystemCallRet
0007e200 7c952653 ntdll!NtOpenFile+0xc
0007e278 7c9525db ntdll!LdrpCreateDllSection+0×4e
0007e364 7c95425c ntdll!LdrpMapDll+0×297
0007e610 7c95408d ntdll!LdrpLoadDll+0×1e4
0007e88c 7c801bf7 ntdll!LdrLoadDll+0×198
0007e8f4 7c81c70b kernel32!LoadLibraryExW+0×1b2
0007e908 7c93a352 kernel32!LoadLibraryW+0×11
0007ee38 7c95348d ntdll!LdrpCallInitRoutine+0×14
0007ef40 7c954339 ntdll!LdrpRunInitializeRoutines+0×367
0007f1d4 7c95408d ntdll!LdrpLoadDll+0×3cd
0007f450 7c801bf7 ntdll!LdrLoadDll+0×198
0007f4b8 7c81c70b kernel32!LoadLibraryExW+0×1b2
0007f4cc 77e556fb kernel32!LoadLibraryW+0×11
0007f558 77e3dc66 USER32!LoadAppDlls+0×179
0007f694 77e3df10 USER32!ClientThreadSetup+0×1c4
0007f698 7c948556 USER32!__ClientThreadSetup+0×5
0007f698 80831348 ntdll!KiUserCallbackDispatcher+0×2e
9aeb5b4c 8091f9e1 nt!KiCallUserMode+0×4
9aeb5ba4 bf8bc7c3 nt!KeUserModeCallback+0×8f
9aeb5bcc bf8bbc93 win32k!xxxClientThreadSetup+0×21
9aeb5ccc bf8b6c46 win32k!xxxCreateThreadInfo+0×804
9aeb5cdc bf8b7ae4 win32k!UserThreadCallout+0×71
9aeb5cf8 80947a1b win32k!W32pThreadCallout+0×3a
9aeb5d54 8088ae9e nt!PsConvertToGuiThread+0×153
9aeb5d58 00000000 nt!KiBBTUnexpectedRange+0xc

The functions shown in blue are known from past issues to be pass through forwarding IRP to the lower drivers in a device driver stack.

Actually the thread stack above belongs to the main thread blocked while loading application DLLs. The conclusion was to check 3rdPartyAVDriver.sys vendor.

- Dmitry Vostokov @ DumpAnalysis.org -

Bugtation No.91

Monday, May 4th, 2009

On universal memory dumps:

“[…] the first man who noticed the analogy between a” dump “and” an observation “made a notable advance in the history of thought.”

Alfred North Whitehead, Science and the Modern World

- Dmitry Vostokov @ DumpAnalysis.org -

Heap corruption, module variety, execution residue, coincidental symbolic information and critical section corruption: pattern cooperation

Friday, May 1st, 2009

This is a synthesized dump analysis of many similar print spooler crashes in multi-user terminal service environments where old printer drivers are used that were tested only in single-user environments or insufficiently tested in multi-threaded environments. Many such crashes result from dynamic memory corruption of a process heap:

(40dc.4278): Access violation - code c0000005 (!!! second chance !!!)
eax=00000079 ebx=00000001 ecx=0008bff8 edx=00107898 esi=07d7522a edi=00107890
eip=7c8199b2 esp=0155fc14 ebp=0155fc44 iopl=0 nv up ei pl nz na po nc
cs=001b ss=0023 ds=0023 es=0023 fs=003b gs=0000 efl=00010202
ntdll!RtlpLowFragHeapFree+0×30:
7c8199b2 8b4604          mov     eax,dword ptr [esi+4] ds:0023:07d7522e=????????

0:017> kL
ChildEBP RetAddr 
0155fc44 7c819770 ntdll!RtlpLowFragHeapFree+0×30
0155fd1c 77c87a2b ntdll!RtlFreeHeap+0×5c

0155fd30 77c87a02 RPCRT4!FreeWrapper+0×1e
0155fd3c 77c821c2 RPCRT4!operator delete+0xd
0155fd50 77c8047b RPCRT4!LRPC_SCALL::FreeBuffer+0×77
0155fd9c 77c80353 RPCRT4!RPC_INTERFACE::DispatchToStubWorker+0×192
0155fdc0 77c811dc RPCRT4!RPC_INTERFACE::DispatchToStub+0xa3
0155fdfc 77c812f0 RPCRT4!LRPC_SCALL::DealWithRequestMessage+0×42c
0155fe20 77c88678 RPCRT4!LRPC_ADDRESS::DealWithLRPCRequest+0×127
0155ff84 77c88792 RPCRT4!LRPC_ADDRESS::ReceiveLotsaCalls+0×430
0155ff8c 77c8872d RPCRT4!RecvLotsaCallsWrapper+0xd
0155ffac 77c7b110 RPCRT4!BaseCachedThreadRoutine+0×9d
0155ffb8 77e64829 RPCRT4!ThreadStartRoutine+0×1b
0155ffec 00000000 kernel32!BaseThreadStart+0×34

Although any module could corrupt the heap and either Gflags or Application Verifier is recommended to enable full page heap, sometimes we need to point to some print drivers to eliminate or upgrade them in the meantime. When there are many of them we can point to the oldest one:

0:017> lmt
start    end        module name
[...]
010d0000 010d8000   PrintDriver1  2007
01260000 01272000   PrintDriver2  1999
01290000 012da000   PrintDriver3  2009
012f0000 01302000   PrintDriver4  2003
01310000 01320000   PrintDriver5  2004
01320000 01332000   PrintDriver6  2004
01340000 01353000   PrintDriver7  2005
01360000 0139e000   PrintDriver8  2007
013b0000 013c3000   PrintDriver9  2004
013d0000 013e0000   PrintDriver10 2005
013e0000 013f3000   PrintDriver11 2005
01400000 01413000   PrintDriver12 2006
01420000 0146b000   PrintDriver13 2007
01480000 01488000   PrintDriver14 2003
017f0000 0181e000   PrintDriver15 2004
01920000 0192d000   PrintDriver16 2008
01930000 01936000   PrintDriver17 2008
01950000 01959000   PrintDriver18 2008
01960000 01969000   PrintDriver19 2008
01f80000 021e8000   PrintDriver20 2004
032f0000 03514000   PrintDriver21 2003
03cd0000 03cd6000   PrintDriver22 2008
32100000 32148000   PrintDriver23 2008
3ea40000 3ea46000   PrintDriver24 2007
3f000000 3f03d000   PrintDriver25 2009
3f100000 3f133000   PrintDriver26 2009
[…]

The age distribution among 121 modules can be visualized on a CAD diagram:

 

Alternatively we can look at the execution residue on a raw thread stack:

0:017> !teb
TEB at 7ffa9000
    ExceptionList:        0155fd0c
    StackBase:            01560000
    StackLimit:           01550000
    SubSystemTib:         00000000
    FiberData:            00001e00
    ArbitraryUserPointer: 00000000
    Self:                 7ffa9000
    EnvironmentPointer:   00000000
    ClientId:             000040dc . 00004278
    RpcHandle:            00000000
    Tls Storage:          00000000
    PEB Address:          7ffd8000
    LastErrorValue:       0
    LastStatusValue:      8000001a
    Count Owned Locks:    0
    HardErrorMode:        0

0:017> dds 01550000 01560000
01550000  00000000
01550004  00000000
[...]
01554e78  00000000
01554e7c  00000000
01554e80  01554ecc
01554e84  7c82d1bb ntdll!RtlFindActivationContextSectionString+0xe1
01554e88  01554ea4
01554e8c  01554efc
01554e90  00000000
01554e94  020a0018 PrintDriver20!Callback+0×5c88
01554e98  7ffa9c00
01554e9c  00000000
01554ea0  01554ed4
01554ea4  7c82dd6c ntdll!RtlEncodeSystemPointer+0×45b
01554ea8  00020000
01554eac  01554ec8
01554eb0  01554ec8
01554eb4  01554ec8
[…]
01555eb4  0040003e
01555eb8  01556854
01555ebc  00000000
01555ec0  0000003e
01555ec4  0208003e PrintDriver20!GetValue+0xb37fe
01555ec8  00000000
01555ecc  43000000
01555ed0  0000003e
01555ed4  01555ffa
01555ed8  01555fbc
01555edc  01555fa8
01555ee0  001190d8
01555ee4  7c81990d ntdll!RtlpLowFragHeapAlloc+0×210
01555ee8  7c819962 ntdll!RtlpLowFragHeapAlloc+0xc6a
01555eec  0008bff8
01555ef0  00000000
01555ef4  00080000
[…]

The first address 020a0018 seems to be coincidental because its disassembled code is not good:

0:017> ub 020a0018
                 ^ Unable to find valid previous instruction for 'ub 020a0018'

0:017> u 020a0018
PrintDriver20!Callback+0x5c88:
020a0018 048b            add     al,8Bh
020a001a c7              ???

020a001b ebe8            jmp     PrintDriver20!Callback+0×5c75 (020a0005)
020a001d 8d4e28          lea     ecx,[esi+28h]
020a0020 e8d9960100      call    PrintDriver20!DlgProc+0×86ee (020b96fe)
020a0025 8d4e28          lea     ecx,[esi+28h]
020a0028 e87b930100      call    PrintDriver20!DlgProc+0×8398 (020b93a8)
020a002d 8b4618          mov     eax,dword ptr [esi+18h]

However the second address code 0208003e seems sound: cmp is followed by jne:

0:017> ub 0208003e
PrintDriver20!GetValue+0xb37e5:
02080025 8d442414        lea     eax,[esp+14h]
02080029 8b4b2c          mov     ecx,dword ptr [ebx+2Ch]
0208002c 50              push    eax
0208002d e8ce3a0000      call    PrintDriver20!GetValue+0xb72c0 (02083b00)
02080032 8b38            mov     edi,dword ptr [eax]
02080034 8b4c2410        mov     ecx,dword ptr [esp+10h]
02080038 8b41fc          mov     eax,dword ptr [ecx-4]
0208003b 3947fc          cmp     dword ptr [edi-4],eax

0:017> u 0208003e
PrintDriver20!GetValue+0xb37fe:
0208003e 751b            jne     PrintDriver20!GetValue+0xb381b (0208005b)
02080040 8bc8            mov     ecx,eax
02080042 8b742410        mov     esi,dword ptr [esp+10h]
02080046 c1e902          shr     ecx,2
02080049 f3a7            repe cmps dword ptr [esi],dword ptr es:[edi]
0208004b 750e            jne     PrintDriver20!GetValue+0xb381b (0208005b)
0208004d 8bc8            mov     ecx,eax
0208004f 83e103          and     ecx,3

But this is not a function call resulted in saved return address so we can still consider it as a coincidence. However, on the raw stack we also see a large chunk of ASCII data pointing to the same driver in a textual form:

[...]
0155d360  6f742064
0155d364  696e6920
0155d368  6c616974
0155d36c  20657a69
0155d370  61636562
0155d374  20657375
0155d378  75732061
0155d37c  62617469
[...]

0:017> da 0155d360 0155d734
0155d360  "d to initialize because a suitab"
0155d380  "le PrinterDriver20 inf file was ”
[…]

This reinforces our belief in PrinterDriver20. Finally, when looking at critical section list we see corruption signs pointing to the same driver addresses:

0:017> !cs -l -o -s
DebugInfo          = 0x0014bc60
Critical section   = 0x020f7140 (PrintDriver20!DlgProc+0×46130)
LOCKED
LockCount          = 0xFF85EA7F
WaiterWoken        = Yes
OwningThread       = 0×8b0c244c
RecursionCount     = 0×8BFFFBB4
LockSemaphore      = 0×83182444
SpinCount          = 0×088908c4

WARNING: critical section DebugInfo = 0x00000008 doesn't point back
to the DebugInfo found in the active critical sections list = 0x0014bc60.
The critical section was probably reused without calling DeleteCriticalSection.

Cannot read structure field value at 0x0000000a, error 0
ntdll!RtlpStackTraceDataBase is NULL. Probably the stack traces are not enabled.
ntdll!RtlpStackTraceDataBase is NULL. Probably the stack traces are not enabled.

DebugInfo          = 0x0014bc88
Critical section   = 0x020f7110 (PrintDriver20!DlgProc+0×46100)
LOCKED
LockCount          = 0×1E7245FF
WaiterWoken        = No
OwningThread       = 0xccccc304
RecursionCount     = 0xC483FFFD
LockSemaphore      = 0×158638B9
SpinCount          = 0xff96e902

WARNING: critical section DebugInfo = 0x0f712068 doesn't point back
to the DebugInfo found in the active critical sections list = 0x0014bc88.
The critical section was probably reused without calling DeleteCriticalSection.

Cannot read structure field value at 0x0f71206a, error 0
[...]

- Dmitry Vostokov @ DumpAnalysis.org -

Null data pointer, incorrect stack trace, changed environment, hooked functions and coincidental symbolic information: pattern cooperation

Tuesday, April 28th, 2009

One application was constantly crashing in a highly dynamic multi-user environment with various value-added GUI-enhancing hooking and patching 3rd-party products. The dump was analyzed and it shows the data NULL pointer access violation:

0:000> r
eax=05503ff0 ebx=05503f10 ecx=6bb7da25 edx=00000000 esi=02203c10 edi=00000000
eip=6b542ba3 esp=04f6f860 ebp=04f6f87c iopl=0 nv up ei pl zr na pe nc
cs=001b ss=0023 ds=0023 es=0023 fs=003b gs=0000 efl=00010246
gui_plugin!entry+0xe63:
6b542ba3 8b0f            mov     ecx,dword ptr [edi]  ds:0023:00000000=????????

Unfortunately due to the lack of gui_plugin symbols the stack trace is incorrect:

0:000> kv
ChildEBP RetAddr  Args to Child             
WARNING: Stack unwind information not available. Following frames may be wrong.
04f6f87c 00000000 000003b9 00000690 00000d20 gui_plugin!entry+0xe63

We do not attempt stack trace reconstruction here but show one of troubleshooting methods to eliminate possible effects of the changed environment. We look for any code patching modules by examining hooked functions:

0:000> !chkimg -lo 50 -d !kernel32 -v
Searching for module with expression: !kernel32
Will apply relocation fixups to file used for comparison
Will ignore NOP/LOCK errors
Will ignore patched instructions
Image specific ignores will be applied
Comparison image path: c:\mss\kernel32.dll\4791A76Ddb000\kernel32.dll
No range specified

Scanning section:    .text
Size: 835001
Range to scan: 76701000-767ccdb9
    7670a672-7670a676  5 bytes - kernel32!MoveFileW
 [ 8b ff 55 8b ec:e9 89 59 80 09 ]
    7671c5c8-7671c5cc  5 bytes - kernel32!DeleteFileW (+0×11f56)
 [ 8b ff 55 8b ec:e9 33 3a 80 09 ]
    76721070-76721074  5 bytes - kernel32!MoveFileExW (+0×4aa8)
 [ 8b ff 55 8b ec:e9 8b ef 83 09 ]
    7678ff3f-7678ff43  5 bytes - kernel32!ReplaceFileW (+0×6eecf)
 [ 68 b4 03 00 00:e9 bc 00 7a 09 ]
Total bytes compared: 835001(100%)
Number of errors: 20
20 errors : !kernel32 (7670a672-7678ff43)

0:000> u 7670a672
kernel32!MoveFileW:
7670a672 e989598009      jmp     7ff10000
7670a677 6a02            push    2
7670a679 6a00            push    0
7670a67b 6a00            push    0
7670a67d ff750c          push    dword ptr [ebp+0Ch]
7670a680 ff7508          push    dword ptr [ebp+8]
7670a683 e8c4690100      call    kernel32!MoveFileWithProgressW (7672104c)
7670a688 5d              pop     ebp

0:000> u 7ff10000
7ff10000 e99b1e94f5      jmp     3rdPartyHook!MoveFileW (75851ea0)
7ff10005 8bff            mov     edi,edi
7ff10007 55              push    ebp
7ff10008 8bec            mov     ebp,esp
7ff1000a e968a67ff6      jmp     kernel32!MoveFileW+0×5 (7670a677)
7ff1000f 0000            add     byte ptr [eax],al
7ff10011 0000            add     byte ptr [eax],al
7ff10013 0000            add     byte ptr [eax],al

We see lots of references to 3rdPartyHook on the thread raw stack:

0:000> !teb
TEB at 7ffd5000
    ExceptionList:        04f6ff70
    StackBase:            04f70000
    StackLimit:           04f6c000
    SubSystemTib:         00000000
    FiberData:            00001e00
    ArbitraryUserPointer: 00000000
    Self:                 7ffd5000
    EnvironmentPointer:   00000000
    ClientId:             00001790 . 000017d0
    RpcHandle:            00000000
    Tls Storage:          7ffd502c
    PEB Address:          7ffd8000
    LastErrorValue:       0
    LastStatusValue:      0
    Count Owned Locks:    0
    HardErrorMode:        0

0:000> dds 04f6c000 04f70000
[...]
04f6cf28  00440042
04f6cf2c  04f6d2d4
04f6cf30  77179834 ntdll!_except_handler4
04f6cf34  003ce48e
04f6cf38  fffffffe
04f6cf3c  7719d584 ntdll!LdrpResSearchResourceInsideDirectory+0x80f
04f6cf40  7719cf9d ntdll!LdrpResSearchResourceMappedFile+0x521
04f6cf44  6b920002
04f6cf48  00000000
04f6cf4c  000bf000
04f6cf50  6b9d9000
04f6cf54  6b9200f8
04f6cf58  00000000
04f6cf5c  01f6d1d4 <Unloaded_D3D9.DLL>+0x12d1d4
04f6cf60  00000003
04f6cf64  04f6cf88
04f6cf68  04f6d370
04f6cf6c  04f6d340
04f6cf70  771d8fb2 ntdll!_SEH_epilog4_GS+0xa
04f6cf74  7719d018 ntdll!LdrpResSearchResourceMappedFile+0x781
04f6cf78  73d77612
04f6cf7c  6b920002
04f6cf80  000bf000
04f6cf84  00000000
04f6cf88  00000001
04f6cf8c  00000000
04f6cf90  00000000
04f6cf94  00000000
04f6cf98  00000000
04f6cf9c  04f6d44c
04f6cfa0  75851c2f 3rdPartyHook+0×1c2f
04f6cfa4  04f6d1e8
04f6cfa8  00000000
04f6cfac  00000000
04f6cfb0  00000000
04f6cfb4  00000000
[…]

The symbolic references point to valid and sound code:

0:000> ub 75851c2f
3rdPartyHook!GetData+0x22d:
75851c1d c20c00          ret     0Ch
75851c20 8b4dfc          mov     ecx,dword ptr [ebp-4]
75851c23 5f              pop     edi
75851c24 5e              pop     esi
75851c25 33cd            xor     ecx,ebp
75851c27 33c0            xor     eax,eax
75851c29 5b              pop     ebx
75851c2a e8590f0000      call    3rdPartyHook!__security_check_cookie (75852b88)

0:000> u 75851c2f
3rdPartyHook!GetData+0x23f:
75851c2f 8be5            mov     esp,ebp
75851c31 5d              pop     ebp
75851c32 c20c00          ret     0Ch
75851c35 cc              int     3
75851c36 cc              int     3
75851c37 cc              int     3
75851c38 cc              int     3
75851c39 cc              int     3

 

We also see the other 4th-party module that we know as patching from our past experience:

[...]
04f6f850  00000000
04f6f854  00000000
04f6f858  00000000
04f6f85c  00000000
04f6f860  1000c200 4thPartyHook!Shared+0×80
04f6f864  00000000
04f6f868  04f6f87c
04f6f86c  00000001
04f6f870  00000000
04f6f874  00000000
04f6f878  00000000
04f6f87c  0000006c
[…]

The address 1000c200 looks suspicious and coincidental as the set of flags. However, when we disassemble the address, we see the valid and sound code and the module seems to be loaded at 0×10000000 address:

0:000> ub 1000c200
4thPartyHook!Shared+0x80:
1000c1e8 e8234effff      call    10001010
1000c1ed a160020710      mov     eax,dword ptr [10070260]
1000c1f2 83c418          add     esp,18h
1000c1f5 8b750c          mov     esi,dword ptr [ebp+0Ch]
1000c1f8 85f6            test    esi,esi
1000c1fa 7530            jne     1000c22c
1000c1fc 85c0            test    eax,eax
1000c1fe 742c            je      1000c22c

0:000> u 1000c200
4thPartyHook!Shared+0x80:
1000c200 8b08            mov     ecx,dword ptr [eax]
1000c202 8b5004          mov     edx,dword ptr [eax+4]
1000c205 53              push    ebx
1000c206 8bd9            mov     ebx,ecx
1000c208 0bda            or      ebx,edx
1000c20a 5b              pop     ebx
1000c20b 741f            je      1000c22c
1000c20d f6400801        test    byte ptr [eax+8],1

0:000> !dh 10000000
[...]
File Type: DLL
FILE HEADER VALUES
     14C machine (i386)
       5 number of sections
       0 file pointer to symbol table
       0 number of symbols
      E0 size of optional header
    2102 characteristics
            Executable
            32 bit word machine
            DLL
[...]
OPTIONAL HEADER VALUES
     10B magic #
    8.00 linker version
   62000 size of code
   15000 size of initialized data
       0 size of uninitialized data
   3E1B6 address of entry point
    1000 base of code
         ----- new -----
10000000 image base
    1000 section alignment
    1000 file alignment
       2 subsystem (Windows GUI)
    4.00 operating system version
    0.00 image version
    4.00 subsystem version
   7A000 size of image
    1000 size of headers
[...]

0:000> lm
start    end        module name
00400000 00425000   App        
[...]
10000000 1007a000   4thPartyHook
[...]
 

The recommendation I usually give in such cases it to remove or disable 3rd- and 4th-party products that do code patching to see if this resolves the problem. This makes the module list internally clean and if the problem persists then we should look for another causes and restore 3rd- and 4th-party products.

- Dmitry Vostokov @ DumpAnalysis.org -

Looking for abnormal: case study

Monday, April 27th, 2009

I’m RARE rule #5 says:

“Provide appropriate explanations and narrative in the cases where analysis is inconclusive”.

Here is the typical example of such case when a kernel dump was taken with the vague description about server problems. The dump file analysis revealed the following abnormal conditions warranting further troubleshooting steps:

The AppA.exe, the part of the customer product, is about 1Gb in size when its typical size should be no more than 200Mb. Perhaps we have a memory leak here. We can suggest to take a few consecutive memory dumps of the growing memory and analyze it later as described in a heap leak pattern. This can also be a .NET leak too if unmanaged AppA.exe happened to load any managed components through 3rd-party DLLs. It could be also  some unknown loaded component reserved and committed large portion of virtual memory space.

0: kd> !vm
[...]
0eec AppA.exe        241366 (    965464 Kb)
03c0 svchost.exe      10304 (     41216 Kb)
0230 lsass.exe         8764 (     35056 Kb)
0298 svchost.exe       6402 (     25608 Kb)
01f4 winlogon.exe      5787 (     23148 Kb)
[…]

We can confirm the absence of handle leaks:

0: kd> !process 0eec
Searching for Process with Cid == eec
Cid Handle table at fffffa80014d6000 with 794 Entries in use
PROCESS fffffade6e601860
    SessionId: 0  Cid: 0eec    Peb: 7efdf000  ParentCid: 0eb8
    DirBase: b10fa000  ObjectTable: fffffa8000c39170  HandleCount: 865.
    Image: AppA.exe
    VadRoot fffffade68d7e580 Vads 1961 Clone 0 Private 237843. Modified 77. Locked 1.
    DeviceMap fffffa8001221580
    Token                             fffffa8001fdebe0
    ElapsedTime                       6 Days 22:18:09.271
    UserTime                          00:23:00.406
    KernelTime                        00:27:31.281

    QuotaPoolUsage[PagedPool]         106968
    QuotaPoolUsage[NonPagedPool]      19055186
    Working Set Sizes (now,min,max)  (240529, 50, 345) (962116KB, 200KB, 1380KB)
    PeakWorkingSetSize                240671
    VirtualSize                       1244 Mb
    PeakVirtualSize                   1246 Mb
    PageFaultCount                    244053
    MemoryPriority                    BACKGROUND
    BasePriority                      8
    CommitCharge                      241366

Kernel and user times seem high (about 30 minutes) but it correlates with almost 7 day extensive application usage that involves constant database access.

Looking further at running processes we see that the crucial AppB and AppC applications that were supposed to be running to serve user requests are orphaned:

09e8 AppB.exe            0 (         0 Kb)
09c0 AppC.exe            0 (         0 Kb)

Were they closed normally, forcefully terminated after being hang or crashed? These questions should be asked and appropriate measures taken to capture crash dumps in case event logs reveal access violations.

- Dmitry Vostokov @ DumpAnalysis.org -

Blocked thread, historical information, execution residue, hidden exception, dynamic memory corruption, incorrect stack trace and not my version: pattern cooperation

Thursday, April 23rd, 2009

One application was hanging after printing and its user dump was taken. Inspection of threads reveals only one blocked thread:

0:000> ~*kL

.  0  Id: 486c.5704 Suspend: 1 Teb: 7ffa9000 Unfrozen
ChildEBP RetAddr 
01c0fb68 7c827cfb ntdll!KiFastSystemCallRet
01c0fb6c 77e6202c ntdll!NtWaitForMultipleObjects+0xc
01c0fc14 77e62fbe kernel32!WaitForMultipleObjectsEx+0x11a
01c0fc30 00439f16 kernel32!WaitForMultipleObjects+0×18
01c0fd94 77bcaed6 App!~MyPrinter+0xe
01c0fdc0 77bcaf82 msvcrt!_cinit+0xad
01c0fdd0 77baf582 msvcrt!_cexit+0xb
01c0fde0 7c81a352 msvcrt!__CRTDLL_INIT+0×15c
01c0fe00 7c830e70 ntdll!LdrpCallInitRoutine+0×14
01c0feb8 77e668a3 ntdll!LdrShutdownProcess+0×182

We see that the process was being terminated and the last thread was blocked in a destructor of MyPrinter class. However we see some historical information, faultrep.dll unloaded, and this means that some exception happened prior to the process shutdown:

0:000> lm
[...]
Unloaded modules:
69500000 69517000   faultrep.dll
[…]

Then we look at the thread raw stack to see any signs of execution residue and hidden exceptions in particular:

0:000> !teb
TEB at 7ffa9000
    ExceptionList:        01c0fc04
    StackBase:            01c10000
    StackLimit:           01c08000
    SubSystemTib:         00000000
    FiberData:            00001e00
    ArbitraryUserPointer: 00000000
    Self:                 7ffa9000
    EnvironmentPointer:   00000000
    ClientId:             0000486c . 00005704
    RpcHandle:            00000000
    Tls Storage:          00000000
    PEB Address:          7ffd9000
    LastErrorValue:       0
    LastStatusValue:      c0000010
    Count Owned Locks:    0
    HardErrorMode:        0

0:000> dds 01c08000 01c10000
01c08000  00000000
01c08004  00000000
01c08008  00000000
[...]
01c0e704  00001000
01c0e708  01c0ea14
01c0e70c  7c82855e ntdll!KiUserExceptionDispatcher+0xe
01c0e710  01c0a000
01c0e714  01c0e73c
01c0e718  01c0e720
01c0e71c  01c0e73c
01c0e720  c0000005
01c0e724  00000000
01c0e728  00000000
01c0e72c  7c82a756 ntdll!RtlpCoalesceFreeBlocks+0×370
01c0e730  00000002
01c0e734  00000000
[…]

Applying exception context we see heap corruption:

0:000> .cxr 01c0e73c
eax=01998db0 ebx=00370000 ecx=01998db0 edx=4d445443 esi=01998da8 edi=01998010
eip=7c82a756 esp=01c0ea08 ebp=01c0ea14 iopl=0 nv up ei pl zr na pe nc
cs=001b ss=0023 ds=0023 es=0023 fs=003b gs=0000    efl=00010246
ntdll!RtlpCoalesceFreeBlocks+0×370:
7c82a756 3b4a04  cmp    ecx,dword ptr [edx+4] ds:0023:4d445447=????????

However stack trace a bit incorrect at the end (shown in magenta) but this can be explained that it is a residual stack trace that happened in the past with some data overwritten by exception processing:

0:000> kL
  *** Stack trace for last set context - .thread/.cxr resets it
ChildEBP RetAddr 
01c0ea14 7c82a82b ntdll!RtlpCoalesceFreeBlocks+0x370
01c0eafc 77bbcef6 ntdll!RtlFreeHeap+0x38e
01c0eb44 00421fdc msvcrt!free+0xc3
01c0eb6c 00422395 App!SetData+0x3a5
[...]
01c0f81c 7c829f60 App!CreatePrinter+0x6a
01c0f824 7c829f59 ntdll!CheckHeapFillPattern+0×64
01c0f948 7c82770b ntdll!RtlFreeHeap+0×70f
01c0f96c 77c27631 ntdll!NtQueryValueKey+0xc
01c0fab8 00000000 gdi32!ReadDisableMetaFilesRegKey+0xfc

Because the hang happens after printing we detect old printer drivers in the module list (certainly not my version of the environment used for application testing):

0:000> lmt
start    end        module name
[...]
6b900000 6b925000   PrinterDriverA  [...] 2002
6c900000 6c947000   PrinterDriverB  […] 2002
[…]

- Dmitry Vostokov @ DumpAnalysis.org -