Archive for November, 2015

Crash Dump Analysis Patterns (Part 234)

Monday, November 30th, 2015

Sometimes we identify a function parameter that caused abnormal behavior, for example, Invalid Pointer access violation and want to see from which module or function it was probably originated:

0:009> kvL
# ChildEBP RetAddr Args to Child
00 0381f048 74e715e9 00000002 0381f098 00000001 ntdll!NtWaitForMultipleObjects+0x15
01 0381f0e4 76ce19fc 0381f098 0381f10c 00000000 KERNELBASE!WaitForMultipleObjectsEx+0x100
02 0381f12c 76ce41d8 00000002 7efde000 00000000 kernel32!WaitForMultipleObjectsExImplementation+0xe0
03 0381f148 76d08074 00000002 0381f17c 00000000 kernel32!WaitForMultipleObjects+0x18
04 0381f1b4 76d07f33 0381f294 00000001 00000001 kernel32!WerpReportFaultInternal+0x186
05 0381f1c8 76d07828 0381f294 00000001 0381f264 kernel32!WerpReportFault+0x70
06 0381f1d8 76d077a7 0381f294 00000001 9a131a15 kernel32!BasepReportFault+0x20
07 0381f264 772574ff 00000000 772573dc 00000000 kernel32!UnhandledExceptionFilter+0x1af
08 0381f26c 772573dc 00000000 0381ffd4 7720c550 ntdll!__RtlUserThreadStart+0x62
09 0381f280 77257281 00000000 00000000 00000000 ntdll!_EH4_CallFilterFunc+0x12
0a 0381f2a8 7723b499 fffffffe 0381ffc4 0381f3e4 ntdll!_except_handler4+0x8e
0b 0381f2cc 7723b46b 0381f394 0381ffc4 0381f3e4 ntdll!ExecuteHandler2+0x26
0c 0381f2f0 7723b40e 0381f394 0381ffc4 0381f3e4 ntdll!ExecuteHandler+0x24
0d 0381f37c 771f0133 0181f394 0381f3e4 0381f394 ntdll!RtlDispatchException+0x127
0e 0381f37c 00010101 0181f394 0381f3e4 0381f394 ntdll!KiUserExceptionDispatcher+0xf (CONTEXT @ 0381f3e4)
WARNING: Frame IP not in any known module. Following frames may be wrong.
0f 0381f844 00df3d1e 064bf018 0b925970 06501dd8 0×10101
10 0381f860 00ca0a15 064bf018 0b925970 0c86a949 ModuleA!Bar+0×3e
11 0381f888 00ca07d4 064bf018 0b925970 09584580 ModuleB!Foo2+0×75
12 0381f8e4 00ca0ae3 0b925970 00000000 09584008 ModuleB!Foo1+0xb4

Here we backtrack the invalid function call pointer to a function argument parameter:

0:009> .cxr 0381f3e4
eax=06501dd8 ebx=0381f8c4 ecx=0b925970 edx=0b925527 esi=064bf018 edi=0381f898
eip=00010101 esp=0381f848 ebp=0c86a949 iopl=0 nv up ei pl nz na pe nc
cs=0023 ss=002b ds=002b es=002b fs=0053 gs=002b efl=00010206
00010101 1f pop ds

0:009> k 1
*** Stack trace for last set context - .thread/.cxr resets it
# ChildEBP RetAddr
WARNING: Frame IP not in any known module. Following frames may be wrong.
00 0381f844 00df3d1e 0×10101

0:009> ub 00df3d1e
00df3d0a mov ecx,dword ptr [esp+10h]
00df3d0e mov edi,dword ptr [esp+18h]
00df3d12 push edi
00df3d13 push eax
00df3d14 mov edx,dword ptr [ecx]
00df3d16 push ecx
00df3d17 push esi
00df3d18 mov edx,dword ptr [edx+4]
00df3d1b call dword ptr [edx+10h]

The value of EDX comes from EDX+4 which comes from ECX pointer value which was probably one of function parameters:

0:009> dp ecx L1
0b925970 0b925978

0:009> dp poi(ecx)+4 L1
0b92597c 0b925527

0:009> dc 0b925527
0b925527 20202020 80202020 01010101 01010101 ………
0b925537 00010101 01010101 01010000 01000101 …………….

Since the data came from ModuleA function parameter it was passed from the previous call:

0:009> kL 4
# ChildEBP RetAddr
WARNING: Frame IP not in any known module. Following frames may be wrong.
00 0381f844 00df3d1e 0x10101
01 0381f860 00ca0a15 ModuleA!Bar+0×3e
02 0381f888 00ca07d4 ModuleB!Foo2+0×75
03 0381f8e4 00ca0ae3 ModuleB!Foo1+0xb4

0:009> ub 00ca0a15
00ca0a01 add esp,10h
00ca0a04 and edi,0FFFFFFF8h
00ca0a07 lea edx,[esp+20h]
00ca0a0b push edx
00ca0a0c push ebp
00ca0a0d push edi
00ca0a0e push esi
00ca0a0f call dword ptr [ModuleA!_imp__Bar (00cc32c0)]

The passed parameter looks like the second argument of Bar and ModuleB!Foo2, and the first argument of ModuleB!Foo1:

0:009> dps 0381f860
0381f860 064bf018
0381f864 00ca0a15 ModuleB!Foo2+0x75 ; return address when calling ModuleA!Bar
0381f868 064bf018
0381f86c 0b925970
0381f870 0c86a949
0381f874 0381f898
0381f878 0b925970
0381f87c 09584580
0381f880 064bf018
0381f884 0381f8c4
0381f888 0b925970
0381f88c 00ca07d4 ModuleB!Foo1+0xb4 ; return address when calling ModuleB!Foo2
0381f890 064bf018
0381f894 0b925970
0381f898 09584580
0381f89c 064bf018
0381f8a0 00ccd7d0
0381f8a4 0381f954
0381f8a8 00000000
0381f8ac 00000001
0381f8b0 00000000
0381f8b4 0b926d18
0381f8b8 00000000
0381f8bc 00000005
0381f8c0 00000000
0381f8c4 0b925970
0381f8c8 00000000
0381f8cc 00000000
0381f8d0 00000000
0381f8d4 00000000
0381f8d8 00000000
0381f8dc 00000000
0381f8e0 00000000
0381f8e4 00000000
0381f8e8 00ca0ae3 ModuleB!iFoo0+0×13 ; return address when calling ModuleB!Foo1
0381f8ec 0b925970
0381f8f0 00000000

We see Parameter Flow analysis pattern:

0:009> kvL 4
# ChildEBP RetAddr Args to Child
WARNING: Frame IP not in any known module. Following frames may be wrong.
00 0381f844 00df3d1e 064bf018 0b925970 06501dd8 0×10101
01 0381f860 00ca0a15 064bf018 0b925970 0c86a949 ModuleA!Bar+0×3e
02 0381f888 00ca07d4 064bf018 0b925970 09584580 ModuleB!Foo2+0×75
03 0381f8e4 00ca0ae3 0b925970 00000000 09584008 ModuleB!Foo1+0xb4

The example is from x86 Windows where parameters are passed through stack, and, therefore, are seen in the verbose stack trace output (kv). For x64 Windows systems such parameters may be Hidden and what we see in verbose output can be False Function Parameters. However, we can still track in some cases certain values found in Execution Residue and associate them with particular stack trace frames (including Past Stack Traces).

Parameter Flow memory analysis pattern is similar to trace analysis Data Flow pattern.

- Dmitry Vostokov @ + -

Crash Dump Analysis Patterns (Part 233)

Wednesday, November 18th, 2015

With the possibility of process cloning (reflection) starting from Windows 7 it is possible to get memory snapshots (Clone Dump) from a process clone (similar to fork API in Unix). Procdump tool has -r switch for that purpose. We checked this with x64 Windows 7 notepad.exe. We got two memory dumps: one is a clone with this stack trace:

Loading Dump File [C:\DebuggingTV\Procdump\notepad.exe_151117_000755.dbgcfg.dmp]
User Mini Dump File with Full Memory: Only application data is available

Comment: '
*** procdump -ma -r Notepad.exe
*** Manual dump'

0:000> ~*k

. 0 Id: 25ec.147c Suspend: 1 Teb: 000007ff`fffdb000 Unfrozen
# Child-SP RetAddr Call Site
00 00000000`02c8fd38 00000000`7733aae7 ntdll!NtSuspendThread+0xa
01 00000000`02c8fd40 00000000`77165a4d ntdll!RtlpProcessReflectionStartup+0×2e7
02 00000000`02c8fe30 00000000`7729b831 kernel32!BaseThreadInitThunk+0xd
03 00000000`02c8fe60 00000000`00000000 ntdll!RtlUserThreadStart+0×1d

The process memory has all address space of the original process including module list and heap structure:

0:000> lmn
start end module name
00000000`77050000 00000000`7714a000 user32 user32.dll
00000000`77150000 00000000`77270000 kernel32 kernel32.dll
00000000`77270000 00000000`77419000 ntdll ntdll.dll
00000000`ff030000 00000000`ff065000 notepad notepad.exe
000007fe`f57d0000 000007fe`f5841000 winspool winspool.drv
000007fe`fb730000 000007fe`fb786000 uxtheme uxtheme.dll
000007fe`fb910000 000007fe`fbb04000 comctl32 comctl32.dll
000007fe`fbf00000 000007fe`fbf0c000 version version.dll
000007fe`fceb0000 000007fe`fcebf000 CRYPTBASE CRYPTBASE.dll
000007fe`fd310000 000007fe`fd37c000 KERNELBASE KERNELBASE.dll
000007fe`fd3d0000 000007fe`fd499000 usp10 usp10.dll
000007fe`fd4a0000 000007fe`fd511000 shlwapi shlwapi.dll
000007fe`fd520000 000007fe`fd64d000 rpcrt4 rpcrt4.dll
000007fe`fd650000 000007fe`fd66f000 sechost sechost.dll
000007fe`fd680000 000007fe`fd6ae000 imm32 imm32.dll
000007fe`fd6b0000 000007fe`fd78b000 advapi32 advapi32.dll
000007fe`fd810000 000007fe`fd8a7000 comdlg32 comdlg32.dll
000007fe`fdd60000 000007fe`fddff000 msvcrt msvcrt.dll
000007fe`fdfe0000 000007fe`fed69000 shell32 shell32.dll
000007fe`fed70000 000007fe`fedd7000 gdi32 gdi32.dll
000007fe`ff0b0000 000007fe`ff1b9000 msctf msctf.dll
000007fe`ff1c0000 000007fe`ff1ce000 lpk lpk.dll
000007fe`ff1d0000 000007fe`ff2a7000 oleaut32 oleaut32.dll
000007fe`ff2b0000 000007fe`ff349000 clbcatq clbcatq.dll
000007fe`ff350000 000007fe`ff553000 ole32 ole32.dll

0:000> !address -summary

Mapping file section regions...
Mapping module regions...
Mapping PEB regions...
Mapping TEB and stack regions...
Mapping heap regions...
Mapping page heap regions...
Mapping other regions...
Mapping stack trace database regions...
Mapping activation context regions...

--- Usage Summary ---------------- RgnCount ----------- Total Size -------- %ofBusy %ofTotal
Free 48 7ff`faa06000 ( 8.000 TB) 100.00%
Image 129 0`01e79000 ( 30.473 MB) 35.47% 0.00%
21 0`01d10000 ( 29.063 MB) 33.83% 0.00%
Other 9 0`016be000 ( 22.742 MB) 26.47% 0.00%
Heap 26 0`00320000 ( 3.125 MB) 3.64% 0.00%
Stack 3 0`00080000 ( 512.000 kB) 0.58% 0.00%
TEB 1 0`00002000 ( 8.000 kB) 0.01% 0.00%
PEB 1 0`00001000 ( 4.000 kB) 0.00% 0.00%

--- Type Summary (for busy) ------ RgnCount ----------- Total Size -------- %ofBusy %ofTotal
MEM_IMAGE 130 0`01e7a000 ( 30.477 MB) 35.47% 0.00%
MEM_PRIVATE 47 0`01449000 ( 20.285 MB) 23.61% 0.00%
MEM_MAPPED 11 0`00c97000 ( 12.590 MB) 14.65% 0.00%

--- State Summary ---------------- RgnCount ----------- Total Size -------- %ofBusy %ofTotal
MEM_FREE 50 7ff`fc096000 ( 8.000 TB) 100.00%
MEM_COMMIT 176 0`02c10000 ( 44.063 MB) 51.29% 0.00%
MEM_RESERVE 12 0`0134a000 ( 19.289 MB) 22.45% 0.00%

--- Protect Summary (for commit) - RgnCount ----------- Total Size -------- %ofBusy %ofTotal
PAGE_READONLY 83 0`01b46000 ( 27.273 MB) 31.75% 0.00%
PAGE_EXECUTE_READ 25 0`00f6d000 ( 15.426 MB) 17.95% 0.00%
PAGE_WRITECOPY 48 0`00126000 ( 1.148 MB) 1.34% 0.00%
PAGE_READWRITE 18 0`00032000 ( 200.000 kB) 0.23% 0.00%
PAGE_READWRITE|PAGE_GUARD 2 0`00005000 ( 20.000 kB) 0.02% 0.00%

--- Largest Region by Usage ----------- Base Address -------- Region Size ----------
Free 0`ff065000 7fd`f676b000 ( 7.992 TB)
Image 7fe`fe4cb000 0`0089e000 ( 8.617 MB)
0`7f0e0000 0`00f00000 ( 15.000 MB)
Other 0`00610000 0`01590000 ( 21.563 MB)
Heap 0`003b8000 0`000c8000 ( 800.000 kB)
Stack 0`02c10000 0`0006c000 ( 432.000 kB)
TEB 7ff`fffdb000 0`00002000 ( 8.000 kB)
PEB 7ff`fffdf000 0`00001000 ( 4.000 kB)

0:000> !heap -s

LFH Key : 0x000000381021167d
Termination on corruption : ENABLED
Heap Flags Reserv Commit Virt Free List UCR Virt Lock Fast
(k) (k) (k) (k) length blocks cont. heap
0000000000280000 00000002 1024 412 1024 14 4 1 0 0 LFH
0000000000250000 00001002 1088 256 1088 5 2 2 0 0 LFH
0000000001cc0000 00001002 64 8 64 3 1 1 0 0
0000000001e60000 00001002 512 120 512 49 3 1 0 0
0000000001dc0000 00001002 512 8 512 2 1 1 0 0

The other dump saved is a minidump from which we can get thread information for Execution Residue (raw stack data) and reconstruct stack traces in Clone Dump:

Loading Dump File [C:\DebuggingTV\Procdump\notepad.exe_151117_000755.dmp]
Comment: '
*** procdump -ma -r Notepad.exe
*** Manual dump'
User Mini Dump File: Only registers, stack and portions of memory are available

0:000> ~
. 0 Id: 87c.27f4 Suspend: 0 Teb: 000007ff`fffdd000 Unfrozen

0:000> k
# Child-SP RetAddr Call Site
00 00000000`0016fac8 00000000`77069e9e 0x77069e6a
01 00000000`0016fad0 00000000`00000000 0x77069e9e

0:000> r
rax=0000000000000000 rbx=000000000016fb40 rcx=0000000000280000
rdx=0000000000000000 rsi=0000000000000001 rdi=0000000000000000
rip=0000000077069e6a rsp=000000000016fac8 rbp=00000000ff030000
r8=000000000016f8e8 r9=00000000000a0cdc r10=0000000000000000
r11=0000000000000000 r12=0000000000000000 r13=0000000000000000
r14=0000000000000000 r15=0000000000000000
iopl=0 nv up ei pl zr na po nc
cs=0033 ss=002b ds=002b es=002b fs=0053 gs=002b efl=00000246
00000000`77069e6a c3 ret

Now we can see the original stack trace in Clone Dump:

0:000> k =000000000016fac8 ff
# Child-SP RetAddr Call Site
00 00000000`0016fac8 00000000`77069e9e ntdll!NtSuspendThread+0xa
01 00000000`0016fad0 00000000`ff031064 user32!GetMessageW+0x34
02 00000000`0016fb00 00000000`ff03133c notepad!WinMain+0x182
03 00000000`0016fb80 00000000`77165a4d notepad!DisplayNonGenuineDlgWorker+0x2da
04 00000000`0016fc40 00000000`7729b831 kernel32!BaseThreadInitThunk+0xd
05 00000000`0016fc70 00000000`00000000 ntdll!RtlUserThreadStart+0x1d

Since we know TEB address from the minidump we can get stack region boundaries in Clone Dump:

0:000> dt _NT_TIB 000007fffffdd000
+0x000 ExceptionList : (null)
+0x008 StackBase : 0x00000000`00170000 Void
+0x010 StackLimit : 0x00000000`0015b000 Void
+0x018 SubSystemTib : (null)
+0x020 FiberData : 0x00000000`00001e00 Void
+0x020 Version : 0x1e00
+0x028 ArbitraryUserPointer : (null)
+0x030 Self : 0x000007ff`fffdd000 _NT_TIB

Now we can check Execution Residue (for example, for signs of Hidden Exceptions):

0:000> dpS 0x00000000`0015b000 0x00000000`00170000

- Dmitry Vostokov @ + -

Trace Analysis Patterns (Part 119)

Saturday, November 14th, 2015

Multiple traces and logs are usually collected for diagnosing distributed systems. Different tools and tracing settings (circular, sequential, file size limit) may be used, systems may be unsynchronized, and individual system tracing may be started at different times due to manual tracing setup and switching between systems. There may be Blackouts, Circular, and Truncated traces. When we analyze such a trace set (Inter-Correlation) we usually select one trace or log that is used as Calibrating Trace. It is used for measuring all other traces against Basic Facts such as start and end tracing times, and the time of the problem. One such scenario is illustrated in the following diagram:

- Dmitry Vostokov @ + -

Trace Analysis Patterns (Part 118)

Tuesday, November 10th, 2015

We would like to introduce Trace Dimension pattern to address the emerging complexity of logs from distributed environments. By a distributed environment we mean not only a collection of multiple computers (for example, client-server) but also terminal services environments with several different user sessions on one computer (OS) and even multiple user processes (IPC) in some cases. If some task can be performed on one machine or session or inside one process then splitting it across several computers, sessions, or processes usually results in logs with added Distributed Infrastructure Messages (DIMs) such as from proxies, channels and their management:


So, one of the trace simplification strategies is to request the reproduction and its tracing in a simplified environment (such as inside one terminal services session) to eliminate DIMs. In one case, we analyzed a trace for a clipboard paste problem in Windows terminal services environment. After a clipboard copy different data was pasted to different applications. The same behavior was observed for application processes running inside different sessions and processes running inside one session. However, the log was collected for the more complex multiple session scenario with many False Positive Errors which completely disappeared from one session scenario log.

DIM abbreviation played a role in naming this pattern. Additionally, if sessions can be considered a second dimension, then separate VMs can be considered as a third dimension, separate clouds as a 4th dimension, and so on.

- Dmitry Vostokov @ + -