Archive for the ‘Crash Dump Patterns’ Category
Sunday, April 24th, 2011
Module Hint pattern is frequently observed in dynamic memory corruption incidents. It is similar to Execution Residue or String Parameter patterns were we have ASCII or UNICODE fragments providing troubleshooting and debugging hints. Module Hint is therefore a more specialized pattern where we can link module names to raw data. For example, a kernel memory dump saved after the detected pool corruption shows P12345.DLL module name in a pool entry that can provide a link to the corresponding functionally to be reconfigured or removed:
BAD_POOL_HEADER (19)
The pool is already corrupt at the time of the current request.
This may or may not be due to the caller.
The internal pool links must be walked to figure out a possible cause of the problem, and then special pool applied to the suspect tags or the driver verifier to a suspect driver.
Arguments:
Arg1: 00000020, a pool block header size is corrupt.
Arg2: 8b79d078, The pool entry we were looking for within the page.
Arg3: 8b79d158, The next pool entry.
Arg4: 8a1c0004, (reserved)
STACK_TEXT:
b3e0aa4c 808947bb 00000019 00000020 8b79d078 nt!KeBugCheckEx+0x1b
b3e0aab4 b368c00f 8b79d080 00000000 00000000 nt!ExFreePoolWithTag+0×477
b3e0aac4 b366c68e 8b79d080 00000000 00000000 DriverA!MemFree+0xf
[…]
b3e0ac44 8081e0c3 808f77c9 b3e0ac64 808f77c9 nt!IovCallDriver+0×112
b3e0ac50 808f77c9 8a8eef60 8b6862a8 8a8eeef0 nt!IofCallDriver+0×13
b3e0ac64 808f856b 8ce456b0 8a8eeef0 8b6862a8 nt!IopSynchronousServiceTail+0×10b
b3e0ad00 808f109a 000009dc 00000000 00000000 nt!IopXxxControlFile+0×5e5
b3e0ad34 8088b45c 000009dc 00000000 00000000 nt!NtDeviceIoControlFile+0×2a
b3e0ad34 7c82847c 000009dc 00000000 00000000 nt!KiFastCallEntry+0xfc
WARNING: Frame IP not in any known module. Following frames may be wrong.
00f5fb18 00000000 00000000 00000000 00000000 0×7c82847c
2: kd> !pool 8b79d080
Pool page 8b79d080 region is Unknown
8b79d000 size: 30 previous size: 0 (Allocated) FSfm
8b79d030 size: 28 previous size: 30 (Allocated) VadS
8b79d058 size: 20 previous size: 28 (Allocated) ReEv
*8b79d078 size: e0 previous size: 20 (Allocated) *DRIV
Owning component : Unknown (update pooltag.txt)
8b79d158 is not a valid large pool allocation, checking large session pool…
8b79d158 is freed (or corrupt) pool
Bad previous allocation size @8b79d158, last size was 1c
***
*** An error (or corruption) in the pool was detected;
*** Pool Region unknown (0xFFFFFFFF8B79D158)
***
*** Use !poolval 8b79d000 for more details.
***
2: kd> dc 8b79d078
8b79d078 [...] ..DRIV ......AP
8b79d088 [...] P12345.DLL……
8b79d098 […] …..<%n……..
8b79d0a8 […] ….$…:.F…X.
[…]
- Dmitry Vostokov @ DumpAnalysis.org + TraceAnalysis.org -
Posted in Crash Dump Analysis, Crash Dump Patterns, Debugging | 3 Comments »
Friday, April 22nd, 2011
A page to reference all different kinds of .NET related patterns is necessary, so I created this post:
I’ll update it as soon as I add more similar patterns.
- Dmitry Vostokov @ DumpAnalysis.org + TraceAnalysis.org -
Posted in .NET Debugging, CDA Pattern Classification, Crash Dump Analysis, Crash Dump Patterns, Debugging, Training and Seminars | No Comments »
Sunday, April 17th, 2011
I’m pleased to announce that MDAA, Volume 5 is available in PDF format:
www.dumpanalysis.org/Memory+Dump+Analysis+Anthology+Volume+5
It features:
- 25 new crash dump analysis patterns
- 11 new pattern interaction case studies (including software tracing)
- 16 new trace analysis patterns
- 7 structural memory patterns
- 4 modeling case studies for memory dump analysis patterns
- Discussion of 3 common analysis mistakes
- Malware analysis case study
- Computer independent architecture of crash analysis report service
- Expanded coverage of software narratology
- Metaphysical and theological implications of memory dump worldview
- More pictures of memory space and physicalist art
- Classification of memory visualization tools
- Memory visualization case studies
- Close reading of the stories of Sherlock Holmes: Dr. Watson’s observational patterns
- Fully cross-referenced with Volume 1, Volume 2, Volume 3, and Volume 4
Its table of contents is available here:
www.dumpanalysis.org/MDAA/MDA-Anthology-V5-TOC.pdf
Paperback and hardcover versions should be available in a week or two. I also started working on Volume 6 that should be available in November-December.
- Dmitry Vostokov @ DumpAnalysis.org + TraceAnalysis.org -
Posted in Aesthetics of Memory Dumps, Analysis Notation, Announcements, AntiPatterns, Archaeology of Computer Memory, Art, Assembly Language, Best Practices, Books, C and C++, CDF Analysis Tips and Tricks, Categorical Debugging, Citrix, Common Mistakes, Common Questions, Complete Memory Dump Analysis, Computer Forensics, Computer Science, Crash Analysis Report Environment (CARE), Crash Dump Analysis, Crash Dump De-analysis, Crash Dump Patterns, Crash Dumps for Dummies, Cyber Warfare, Debugging, Debugging Bureau, Debugging Industry, Debugging Methodology, Debugging Slang, Debugging Trends, Deep Down C++, Dr. Watson, Dublin School of Security, Education and Research, Escalation Engineering, Fun with Crash Dumps, Fun with Debugging, Fun with Software Traces, General Memory Analysis, Hermeneutics of Memory Dumps and Traces, Images of Computer Memory, Kernel Development, Malware Analysis, Mathematics of Debugging, Memiotics (Memory Semiotics), Memory Analysis Forensics and Intelligence, Memory Diagrams, Memory Dump Analysis Services, Memory Dumps in Myths, Memory Space Art, Memory Systems Language, Memory Visualization, Memory and Glitches, Metaphysics of Memory Worldview, Multithreading, Music for Debugging, New Acronyms, New Debugging School, New Words, Pattern Models, Philosophy, Physicalist Art, Publishing, Reverse Engineering, Science of Memory Dump Analysis, Science of Software Tracing, Security, Software Architecture, Software Behavior Patterns, Software Chorography, Software Chorology, Software Defect Construction, Software Engineering, Software Generalist, Software Maintenance Institute, Software Narratology, Software Technical Support, Software Trace Analysis, Software Trace Reading, Software Trace Visualization, Software Tracing for Dummies, Software Troubleshooting Patterns, Software Victimology, Structural Memory Patterns, Structural Trace Patterns, Systems Thinking, Testing, The Way of Philip Marlowe, Tools, Trace Analysis Patterns, Training and Seminars, Troubleshooting Methodology, Victimware, Vista, Webinars, WinDbg Scripting Extensions, WinDbg Scripts, WinDbg Tips and Tricks, WinDbg for GDB Users, Windows 7, Windows Server 2008, Windows System Administration, Workaround Patterns, x64 Windows | No Comments »
Tuesday, April 12th, 2011
I’m to present pattern-driven software trace analysis with examples from application and desktop delivery environments featuring Memory Dump Analysis Services at the forthcoming E2E Virtualization Conference (PubForum) in Dublin on 13th of May, 2011. Topics include a case study covering simultaneous analysis of software traces and memory dumps.

- Dmitry Vostokov @ DumpAnalysis.org + TraceAnalysis.org -
Posted in Announcements, CDF Analysis Tips and Tricks, Crash Dump Analysis, Crash Dump Patterns, Debugging, Presentations, Software Technical Support, Software Trace Analysis, Software Trace Reading, Tools, Trace Analysis Patterns | 1 Comment »
Tuesday, April 5th, 2011
This is a kernel space counterpart to Custom Exception Handler pattern in user space. In the following stack trace below we see that DriverA code intercepted an access violation exception resulted from dereferencing a NULL pointer and generated a custom bugcheck:
kd> !analyze -v
[...]
EXCEPTION_RECORD: fffff8801c757158 -- (.exr 0xfffff8801c757158)
ExceptionAddress: fffff88003977de1 (DriverA!foo+0x0000000000000381)
ExceptionCode: c0000005 (Access violation)
ExceptionFlags: 00000000
NumberParameters: 2
Parameter[0]: 0000000000000000
Parameter[1]: 0000000000000070
Attempt to read from address 0000000000000070
TRAP_FRAME: fffff8801c757200 -- (.trap 0xfffff8801c757200)
NOTE: The trap frame does not contain all registers.
Some register values may be zeroed or incorrect.
rax=0000000000000000 rbx=0000000000000000 rcx=fffff8a00da3f3c0
rdx=0000000000000000 rsi=0000000000000000 rdi=0000000000000000
rip=fffff88003977de1 rsp=fffff8801c757390 rbp=fffffa8009a853f0
r8=0000000000000000 r9=0000000000000000 r10=006800740020006e
r11=fffff8a00da3f3c6 r12=0000000000000000 r13=0000000000000000
r14=0000000000000000 r15=0000000000000000
iopl=0 nv up ei pl zr na po nc
DriverA!foo+0×381:
fffff880`03977de1 0fb74070 movzx eax,word ptr [rax+70h] ds:0703:0070=????
Resetting default scope
[...]
kd> kL 100
Child-SP RetAddr Call Site
fffff880`1c7560f8 fffff880`039498f7 nt!KeBugCheckEx
fffff880`1c756100 fffff880`039352a0 DriverA!MyBugCheckEx+0×93
fffff880`1c756140 fffff800`016f1d1c DriverA!MyExceptionFilter+0×1d0
fffff880`1c756210 fffff800`016e940d nt!_C_specific_handler+0×8c
fffff880`1c756280 fffff800`016f0a90 nt!RtlpExecuteHandlerForException+0xd
fffff880`1c7562b0 fffff800`016fd9ef nt!RtlDispatchException+0×410
fffff880`1c756990 fffff800`016c2d82 nt!KiDispatchException+0×16f
fffff880`1c757020 fffff800`016c18fa nt!KiExceptionDispatch+0xc2
fffff880`1c757200 fffff880`03977de1 nt!KiPageFault+0×23a
fffff880`1c757390 fffff880`03977754 DriverA!foo+0×381
fffff880`1c757430 fffff880`0396f006 DriverA!bar+0×74
[…]
fffff880`1c7579b0 fffff800`019a6e0a DriverA!QueryInformation+0×30b
fffff880`1c757a70 fffff800`016c2993 nt!NtQueryInformationFile+0×535
fffff880`1c757bb0 00000000`76e5fe6a nt!KiSystemServiceCopyEnd+0×13
00000000`0a08dfe8 00000000`00000000 0×76e5fe6a
kd> !exchain
24 stack frames, scanning for handlers...
Frame 0×05: nt!RtlpExecuteHandlerForException+0xd (fffff800`016e940d)
ehandler nt!RtlpExceptionHandler (fffff800`016e93d0)
Frame 0×07: nt!KiDispatchException+0×16f (fffff800`016fd9ef)
ehandler nt!_GSHandlerCheck_SEH (fffff800`0169aec0)
Frame 0×0b: DriverA!bar+0×74 (fffff880`03977754)
ehandler DriverA!__GSHandlerCheck (fffff880`039a12fc)
[…]
Frame 0×14: DriverA!QueryInformation+0×30b (fffff880`039303ab)
ehandler DriverA!_C_specific_handler (fffff880`039a1864)
Frame 0×15: nt!NtQueryInformationFile+0×535 (fffff800`019a6e0a)
ehandler nt!_C_specific_handler (fffff800`016f1c90)
Frame 0×16: nt!KiSystemServiceCopyEnd+0×13 (fffff800`016c2993)
ehandler nt!KiSystemServiceHandler (fffff800`016c2580)
Frame 0×17: error getting module for 0000000076e5fe6a
- Dmitry Vostokov @ DumpAnalysis.org + TraceAnalysis.org -
Posted in Crash Dump Analysis, Crash Dump Patterns, Debugging, Kernel Development, x64 Windows | No Comments »
Monday, March 28th, 2011
Data Correlation is a general pattern where values found in different parts of a memory dump correlate between each other according to some rules, for example, in some proportion. Here we show a variant for function parameters.
A process user memory dump showed a C++ exception:
0:000> kL
*** Stack trace for last set context - .thread/.cxr resets it
ChildEBP RetAddr
0012e950 78158e89 kernel32!RaiseException+0x53
0012e988 7830770c msvcr80!_CxxThrowException+0×46
0012e99c 783095bc mfc80u!AfxThrowMemoryException+0×19
0012e9b4 02afa8ca mfc80u!operator new+0×27
0012e9c8 02b0992f ModuleA!std::_Allocate<…>+0×1a
0012e9e0 02b09e7c ModuleA!std::vector<double,std::allocator<double> >::vector<double,std::allocator<double> >+0×3f
[…]
We suspected an out-of-memory condition and looked for function parameters:
0:000> kv 5
ChildEBP RetAddr Args to Child
0012e950 78158e89 e06d7363 00000001 00000003 kernel32!RaiseException+0x53
0012e988 7830770c 0012e998 783b0110 783c8d68 msvcr80!_CxxThrowException+0x46
0012e99c 783095bc 0000a7c0 0012ea40 000014f8 mfc80u!AfxThrowMemoryException+0x19
0012e9b4 02afa8ca 0000a7c0 089321b0 089321f0 mfc80u!operator new+0×27 (FPO: [Uses EBP] [1,0,0])
0012e9c8 02b0992f 000014f8 00000000 00000008 ModuleA!std::_Allocate<…>+0×1a (FPO: [2,3,0])
Because of FPO optimization we originally thought that stack arguments would be invalid. However, bearing in mind the function prototype and semantics of operator new and std::vector double element type we immediately see the correlation between 0xa7c0 and 0×14f8 which are proportional to sizeof(double) == 8:
0:000> ? 0000a7c0/000014f8
Evaluate expression: 8 = 00000000`00000008
We therefore conclude without looking at disassembly that memory allocation size was 42944 bytes:
0:000> .formats 0000a7c0
Evaluate expression:
Hex: 00000000`0000a7c0
Decimal: 42944
Octal: 0000000000000000123700
Binary: 00000000 00000000 00000000 00000000 00000000 00000000 10100111 11000000
Chars: ……..
Time: Thu Jan 01 11:55:44 1970
Float: low 6.01774e-041 high 0
Double: 2.12172e-319
- Dmitry Vostokov @ DumpAnalysis.org + TraceAnalysis.org -
Posted in Crash Dump Analysis, Crash Dump Patterns, Debugging | No Comments »
Saturday, March 26th, 2011
The presentation materials from the webinar (25th of March, 2011) are available for download:
http://www.dumpanalysis.com/PDSPSI-materials
Thanks to everyone who registered and attended!
- Dmitry Vostokov @ DumpAnalysis.org + TraceAnalysis.org -
Posted in Announcements, Crash Dump Analysis, Crash Dump Patterns, DebugWare Patterns, Debugging, Debugging Methodology, Escalation Engineering, Mac Crash Corner, Malware Analysis, Malware Patterns, Memory Analysis Forensics and Intelligence, Software Architecture, Software Behavior Patterns, Software Engineering, Software Technical Support, Software Trace Analysis, Software Tracing Implementation Patterns, Software Troubleshooting Patterns, Structural Memory Patterns, Structural Trace Patterns, Tools, Trace Analysis Patterns, Training and Seminars, Troubleshooting Methodology, Unified Debugging Patterns, Webinars, Workaround Patterns | No Comments »
Sunday, March 20th, 2011
Sometimes we see Truncated Stack Trace with missing stack frames. For example, in one incident, after enabling user mode stack trace database for a memory leaking application we got these entries from the growing heap segment (other segments had non-truncated saved stack traces):
0bdc1350: 40010 . 40010 [101] - busy (3fff8) Internal
7702fbd2: ntdll!RtlAllocateHeap+0x0000021d
77005eef: ntdll!RtlpAllocateUserBlock+0x000000a2
77026a65: ntdll!RtlpLowFragHeapAllocFromContext+0x00000785
7702661f: ntdll!RtlAllocateHeap+0x0000017c
0be01360: 40010 . 40010 [101] - busy (3fff8) Internal
7702fbd2: ntdll!RtlAllocateHeap+0x0000021d
77005eef: ntdll!RtlpAllocateUserBlock+0x000000a2
77026a65: ntdll!RtlpLowFragHeapAllocFromContext+0x00000785
7702661f: ntdll!RtlAllocateHeap+0x0000017c
0be41370: 40010 . 40010 [101] - busy (3fff8) Internal
7702fbd2: ntdll!RtlAllocateHeap+0x0000021d
77005eef: ntdll!RtlpAllocateUserBlock+0x000000a2
77026a65: ntdll!RtlpLowFragHeapAllocFromContext+0x00000785
7702661f: ntdll!RtlAllocateHeap+0x0000017c
Truncated traces are different from incorrect stack traces because their surviving part is correct. How can we find the rest of such stack traces? Here we can suggest to look at other heap segments and see allocations of the same size. If a truncated trace comes from a stack trace collection we can compare it with a non-truncated thread stack from another process instance having the same thread position. Other suggestions are welcome too, especially, for the mysterious memory leak shown above.
- Dmitry Vostokov @ DumpAnalysis.org + TraceAnalysis.org -
Posted in Crash Dump Analysis, Crash Dump Patterns, Debugging | 4 Comments »
Friday, March 18th, 2011
The forthcoming Software Behavior Analysis Patterns and Checklists: Corporate Reference Poster Edition for Windows Crash and Hang Memory Dump and Software Trace Analysis (ISBN: 978-1908043030) will feature Analysis Compass based on the reversal of an extended version of pattern-to-command table published in the printed version of Debugged! MZ/PE March, 2009 issue. Here’s a sneak preview screenshot of this feature:
The notation F”Function” means searching for frames having the given Function name in stack traces from Stack Trace Collection. S”String” means searching for String in the output of WinDbg commands, for example, !process 0 ff.
- Dmitry Vostokov @ DumpAnalysis.org + TraceAnalysis.org -
Posted in Books, Crash Dump Analysis, Crash Dump Patterns, Debugging, Debugging Methodology, Software Behavior Patterns | No Comments »
Saturday, March 12th, 2011
Sometimes simple comparison of crash signatures is not enough to find similar support incidents. We then traverse stack trace frames to find Least Common Frame matching similar stack traces in a database. For example, consider this signature:
0:026> r
eax=011349ec ebx=01136738 ecx=79f943e1 edx=00000000 esi=011349ec edi=0888f3b8
eip=00dfbef8 esp=0888f348 ebp=0888f3c8 iopl=0 nv up ei pl zr na pe nc
cs=001b ss=0023 ds=0023 es=0023 fs=003b gs=0000 efl=00000246
00dfbef8 3902 cmp dword ptr [edx],eax ds:0023:00000000=????????
0:026> k 100
ChildEBP RetAddr
WARNING: Frame IP not in any known module. Following frames may be wrong.
0888f3c8 792a842f 0xdfbef8
0888f3e4 792a839b mscorlib_ni+0x1e842f
0888f3fc 79e71b4c mscorlib_ni+0x1e839b
0888f40c 79e821b9 mscorwks!CallDescrWorker+0×33
0888f48c 79e8281f mscorwks!CallDescrWorkerWithHandler+0xa3
0888f4ac 79e82860 mscorwks!DispatchCallBody+0×1e
0888f510 79e828d1 mscorwks!DispatchCallDebuggerWrapper+0×3d
0888f544 79ec50f5 mscorwks!DispatchCallNoEH+0×51
0888f5a0 79e9848f mscorwks!AddTimerCallback_Worker+0×66
0888f5b4 79e9842b mscorwks!Thread::DoADCallBack+0×32a
0888f648 79e98351 mscorwks!Thread::ShouldChangeAbortToUnload+0xe3
0888f684 79e984dd mscorwks!Thread::ShouldChangeAbortToUnload+0×30a
0888f6ac 79ec4a84 mscorwks!Thread::ShouldChangeAbortToUnload+0×33e
0888f6c4 79ec5075 mscorwks!ManagedThreadBase::ThreadPool+0×13
0888f70c 79ec50a4 mscorwks!AddTimerCallbackEx+0×83
0888f720 79ec514a mscorwks!AddTimerCallback+0×10
0888f75c 79ec4e0c mscorwks!ThreadpoolMgr::AsyncTimerCallbackCompletion+0×64
0888f7a8 79ec471e mscorwks!UnManagedPerAppDomainTPCount::DispatchWorkItem+0×9a
0888f7bc 79ec4892 mscorwks!ThreadpoolMgr::ExecuteWorkRequest+0xaf
0888f814 79f75715 mscorwks!ThreadpoolMgr::WorkerThreadStart+0×20b
0888ffb4 7c80b729 mscorwks!Thread::intermediateThreadProc+0×49
0888ffec 00000000 kernel32!BaseThreadStart+0×37
Most likely we won’t find any similar stack trace when searching for 0xdfbef8. The search for mscorlib_ni+0×1e842f brings several results but they are not crashes but hangs with the frame being in the middle of call stack. The same is for mscorlib_ni+0×1e839b. So we finally try searching for CallDescrWorker+0×33 but with a limit to results having the same application module name. And indeed we find the similar software incident with the same stack trace after our least common frame:
0:004> r
eax=00000024 ebx=03e6f738 ecx=738129d8 edx=00495ef0 esi=01a87c4c edi=019c5f1c
eip=00a92037 esp=03e6f6cc ebp=03e6f6e8 iopl=0 nv up ei pl nz na pe nc
cs=0023 ss=002b ds=002b es=002b fs=0053 gs=002b efl=00010206
00a92037 ?? ???
0:004> k 100
*** Stack trace for last set context - .thread/.cxr resets it
ChildEBP RetAddr
WARNING: Frame IP not in any known module. Following frames may be wrong.
03e6f6c8 737d6bb5 0xa92037
03e6f6e8 737a509f mscorlib_ni+0x216bb5
03e6f6f8 737a834c mscorlib_ni+0x1e509f
03e6f70c 74171b6c mscorlib_ni+0x1e834c
03e6f71c 74182209 mscorwks!CallDescrWorker+0×33
03e6f79c 7418286f mscorwks!CallDescrWorkerWithHandler+0xa3
03e6f7bc 741828b0 mscorwks!DispatchCallBody+0×1e
03e6f820 74182921 mscorwks!DispatchCallDebuggerWrapper+0×3d
03e6f854 742ced79 mscorwks!DispatchCallNoEH+0×51
03e6f8b0 7419846f mscorwks!AddTimerCallback_Worker+0×66
03e6f8c4 7419840b mscorwks!Thread::DoADCallBack+0×32a
03e6f958 74198331 mscorwks!Thread::ShouldChangeAbortToUnload+0xe3
03e6f994 741984bd mscorwks!Thread::ShouldChangeAbortToUnload+0×30a
03e6f9bc 742ce708 mscorwks!Thread::ShouldChangeAbortToUnload+0×33e
03e6f9d4 742cecf9 mscorwks!ManagedThreadBase::ThreadPool+0×13
03e6fa1c 742ced28 mscorwks!AddTimerCallbackEx+0×83
03e6fa30 742cedce mscorwks!AddTimerCallback+0×10
03e6fa6c 742cea90 mscorwks!ThreadpoolMgr::AsyncTimerCallbackCompletion+0×64
03e6fab8 742ce3a2 mscorwks!UnManagedPerAppDomainTPCount::DispatchWorkItem+0×9a
03e6facc 742ce516 mscorwks!ThreadpoolMgr::ExecuteWorkRequest+0xaf
03e6fb64 74441ec9 mscorwks!ThreadpoolMgr::WorkerThreadStart+0×20b
03e6fc84 76813677 mscorwks!Thread::intermediateThreadProc+0×49
03e6fc90 77219d72 kernel32!BaseThreadInitThunk+0xe
03e6fcd0 77219d45 ntdll!__RtlUserThreadStart+0×70
03e6fce8 00000000 ntdll!_RtlUserThreadStart+0×1b
- Dmitry Vostokov @ DumpAnalysis.org + TraceAnalysis.org -
Posted in Crash Dump Analysis, Crash Dump Patterns, Debugging | No Comments »
Monday, March 7th, 2011

The first Webinar to start an in-depth discussion of pattern-driven software troubleshooting, debugging and maintenance:
Date: 25th of March 2011
Time: 18:30 (GMT) 14:30 (EST) 11:30 (PST)
Duration: 60 minutes
Space is limited.
Reserve your Webinar seat now at:
https://www3.gotomeeting.com/register/448268158
Topics include:
- A Short History of DumpAnalysis.org
- Memory Dump Analysis Patterns
- Troubleshooting and Debugging Tools (Debugware) Patterns
- Software Trace Analysis Patterns
- From Software Defects to Software Behavior
- Workaround Patterns
- Structural Memory Patterns
- Memory Analysis Domain Pattern Hierarchy
- New Directions
Prerequisites: experience in software troubleshooting and/or debugging.
- Dmitry Vostokov @ DumpAnalysis.org + TraceAnalysis.org -
Posted in .NET Debugging, Analysis Notation, Announcements, AntiPatterns, Best Practices, CDA Pattern Classification, Crash Dump Analysis, Crash Dump Patterns, DebugWare Patterns, Debugging, Debugging Methodology, Debugging Trends, Escalation Engineering, Java Debugging, Linux Crash Corner, Mac Crash Corner, Malware Analysis, Malware Patterns, Memory Analysis Forensics and Intelligence, Memory Dump Analysis Services, Pattern Models, Pattern Prediction, Presentations, Software Behavior Patterns, Software Chorology, Software Engineering, Software Narratology, Software Technical Support, Software Trace Analysis, Software Trace Reading, Software Tracing Implementation Patterns, Software Troubleshooting Patterns, Structural Memory Patterns, Structural Trace Patterns, Systems Thinking, Testing, Trace Analysis Patterns, Training and Seminars, Troubleshooting Methodology, Unified Debugging Patterns, Webinars, Workaround Patterns | No Comments »
Monday, March 7th, 2011
In certain software behavior scenarios such as a memory leak when we see top modules calling OS API functions we might suspect them having defects. However, this might not be the case and these modules were used from Directing Module keeping references or handles preventing top modules from freeing memory or releasing resources.
For example, a memory dump from a process had 2 growing heap segments and one of them had this recurrent stack trace saved in a user mode stack trace database:
38D2CE78: 02ba8 . 02ba8 [07] - busy (2b90), tail fill
Stack trace (38101) at 83e390:
7d6568be: ntdll!RtlAllocateHeapSlowly+0×00000041
7d62b846: ntdll!RtlAllocateHeap+0×00000E9F
337d0572: ModuleA!XHeapAlloc+0×00000115
[…]
338809e2: ModuleA!Execute+0×000002CD
488b3fc1: ModuleB!Execute+0×000000D3
679b8c64: ModuleC!ExecuteByHandle+0×00000074
[…]
67d241cb: ModuleD!Query+0×0000016B
67ba2ed4: ModuleE!Browse+0×000000E4
[…]
667122c6: ModuleF!Check+0×00000126
65e73826: ModuleG!Enum+0×00000406
[…]
Initially we suspected ModuleA but found a different recurrent stack trace corresponding to another growing segment:
40C81688: 000c8 . 00058 [07] - busy (40), tail fill
Stack trace (38136) at 83f6a4:
7d6568be: ntdll!RtlAllocateHeapSlowly+0×00000041
7d62b846: ntdll!RtlAllocateHeap+0×00000E9F
7c3416b3: msvcr71!_heap_alloc+0×000000E0
7c3416db: msvcr71!_nh_malloc+0×00000010
67745875: ModuleX!BufAllocate+0×00000015
6775085e: ModuleY!QueryAttribute+0×0000008E
[…]
677502b5: ModuleY!Query+0×00000015
67ba2f19: ModuleE!Browser+0×00000129
[…]
667122c6: ModuleF!Check+0×00000126
65e73826: ModuleG!Enum+0×00000406
[…]
From the common stack trace fragment (highlighted in blue) we transferred our investigation to ModuleE and indeed the similar software incident (as the latter trace) was found in our troubleshooting database.
- Dmitry Vostokov @ DumpAnalysis.org + TraceAnalysis.org -
Posted in Crash Dump Analysis, Crash Dump Patterns, Debugging, Software Architecture | No Comments »
Wednesday, March 2nd, 2011
A page to reference all different kinds of stack overflow is necessary, so I created this post:
I’ll update it as soon as I add more similar patterns.
- Dmitry Vostokov @ DumpAnalysis.org + TraceAnalysis.org -
Posted in CDA Pattern Classification, Crash Dump Analysis, Crash Dump Patterns, Debugging, Training and Seminars | No Comments »
Tuesday, March 1st, 2011
Stack Overflow pattern variants in user and kernel mode are ISA (Instruction Set Architecture) and processor architecture oriented. Another pattern variant is software stack implementations where push and pop operations check stack ADT preconditions and throw a software exception (overflow or underflow) or call an assertion mechanism to display an error message. For the latter example, we look at a bugcheck for the specific stack implementation on Windows: IRP stack locations array. For a graphical reminder on how driver-to-driver communication is implemented by an IRP stack corresponding to a driver stack please refer to UML diagram no. 3 in the old post about using UML for describing device driver design. The following WinDbg command output is from a kernel memory dump:
0: kd> !analyze -v
[...]
NO_MORE_IRP_STACK_LOCATIONS (35)
A higher level driver has attempted to call a lower level driver through the IoCallDriver() interface, but there are no more stack locations in the packet, hence, the lower level driver would not be able to access its parameters, as there are no parameters for it. This is a disasterous situation, since the higher level driver "thinks" it has filled in the parameters for the lower level driver (something it MUST do before it calls it), but since there is no stack location for the latter driver, the former has written off of the end of the packet. This means that some other memory has probably been trashed at this point.
Arguments:
Arg1: fffffa800500c9e0, Address of the IRP
Arg2: 0000000000000000
Arg3: 0000000000000000
Arg4: 0000000000000000
[…]
0: kd> kL 100
Child-SP RetAddr Call Site
fffff880`01fe2338 fffff800`016d7732 nt!KeBugCheckEx
fffff880`01fe2340 fffff800`01754f27 nt!KiBugCheck3+0x12
fffff880`01fe2370 fffff880`0177e271 nt! ?? ::FNODOBFM::`string’+0×3f31b
fffff880`01fe23a0 fffff880`0177c138 DriverA!CallProvider+0×161
[…]
fffff880`01fe2cb0 fffff800`0197a7c6 nt!ExpWorkerThread+0×111
fffff880`01fe2d40 fffff800`016b5c26 nt!PspSystemThreadStartup+0×5a
fffff880`01fe2d80 00000000`00000000 nt!KxStartSystemThread+0×16
0: kd> !irp fffffa800500c9e0
Irp is active with 1 stacks 0 is current (= 0xfffffa8006c2e960)
No Mdl: No System Buffer: Thread 00000000: Irp stack trace.
cmd flg cl Device File Completion-Context
[ 4, 0] 0 e0 fffffa8004045c50 fffffa8006c2e960 fffff88005a04460-fffffa8005b9c370 Success Error Cancel
\DriverA DriverB!CompleteRoutine
Args: 00000008 00000000 00000000 00000000
0: kd> ub fffff880`0177e271
DriverA!CallProvider+0×13e:
fffff880`0177e24e mov qword ptr [r11-10h],rax
fffff880`0177e252 mov qword ptr [r11-8],r12
fffff880`0177e256 mov byte ptr [r11-45h],0E0h
fffff880`0177e25b mov rcx,qword ptr [rdi+40h]
fffff880`0177e25f call qword ptr [DriverA!_imp_IoGetAttachedDevice (fffff880`017790b0)]
fffff880`0177e265 mov rdx,rbp
fffff880`0177e268 mov rcx,rax
fffff880`0177e26b call qword ptr [DriverA!_imp_IofCallDriver (fffff880`01779068)]
- Dmitry Vostokov @ DumpAnalysis.org + TraceAnalysis.org -
Posted in Crash Dump Analysis, Crash Dump Patterns, Debugging, Kernel Development | No Comments »
Saturday, February 19th, 2011
This part starts the block of patterns called Semantic Structures. These structures are fragments of memory which have meaning helping us in troubleshooting and debugging. The first pattern in this block deals with PID.TID structures of the form DWORD : DWORD or QWORD : QWORD. Such memory fragments are useful for wait chain analysis, for example, by looking at the excution residue left on a raw stack to find a target or an origin of RPC or (A)LPC calls. RPC target example can be found in the post: In Search of Lost CID. Here we look at another example, this time to find the originator of an ALPC call.
A ServiceA was executing some undesired functionality and a breakpoint was set on ModuleA code to trigger it under unreproducible conditions. Then a complete memory dump was saved for offline analysis. There we see an ALPC server thread that triggered the breakpoint but don’t see the message information in the output of WinDbg !thread command that can help us finding an ALPC client thread easily:
THREAD fffffa8005e6b060 Cid 0cc0.1838 Teb: 000007fffff8e000 Win32Thread: 0000000000000000 WAIT: (Executive) KernelMode Non-Alertable
SuspendCount 1
fffff880094ad0a0 SynchronizationEvent
Not impersonating
DeviceMap fffff8a001aba3c0
Owning Process fffffa8004803b30 Image: ServiceA.exe
Attached Process N/A Image: N/A
Wait Start TickCount 1441562 Ticks: 106618 (0:00:27:43.251)
Context Switch Count 414
UserTime 00:00:00.000
KernelTime 00:00:00.031
Win32 Start Address ntdll!TppWorkerThread (0×0000000077c88f00)
Stack Init fffff880094addb0 Current fffff880094acdb0
Base fffff880094ae000 Limit fffff880094a8000 Call 0
Priority 12 BasePriority 10 UnusualBoost 0 ForegroundBoost 0 IoPriority 2 PagePriority 5
Child-SP RetAddr Call Site
fffff880`094acdf0 fffff800`01678992 nt!KiSwapContext+0×7a
fffff880`094acf30 fffff800`0167acff nt!KiCommitThreadWait+0×1d2
fffff880`094acfc0 fffff800`01a150e8 nt!KeWaitForSingleObject+0×19f
fffff880`094ad060 fffff800`01a1546c nt!DbgkpQueueMessage+0×2a8
fffff880`094ad230 fffff800`019b9116 nt!DbgkpSendApiMessage+0×5c
fffff880`094ad270 fffff800`016abb96 nt! ?? ::NNGAKEGL::`string’+0×3463d
fffff880`094ad3b0 fffff800`01670d82 nt!KiDispatchException+0×316
fffff880`094ada40 fffff800`0166ebb4 nt!KiExceptionDispatch+0xc2
fffff880`094adc20 000007fe`f79365d1 nt!KiBreakpointTrap+0xf4 (TrapFrame @ fffff880`094adc20)
00000000`035ee568 000007fe`f80670b5 ModuleA+0×38611
[…]
00000000`035ee5d0 000007fe`ff4bc7f5 ModuleB!Start+0×6e1
00000000`035ee770 000007fe`ff56b62e RPCRT4!Invoke+0×65
00000000`035ee7c0 000007fe`ff4bf1f6 RPCRT4!Ndr64StubWorker+0×61b
00000000`035eed80 000007fe`ffedf223 RPCRT4!NdrStubCall3+0xb5
00000000`035eede0 000007fe`ffedfc0d ole32!CStdStubBuffer_Invoke+0×5b
00000000`035eee10 000007fe`ffedfb83 ole32!SyncStubInvoke+0×5d
00000000`035eee80 000007fe`ffd7fd60 ole32!StubInvoke+0xdb
00000000`035eef30 000007fe`ffedfa22 ole32!CCtxComChnl::ContextInvoke+0×190
00000000`035ef0c0 000007fe`ffedf76b ole32!AppInvoke+0xc2
00000000`035ef130 000007fe`ffeded6d ole32!ComInvokeWithLockAndIPID+0×52b
00000000`035ef2c0 000007fe`ff4b9c24 ole32!ThreadInvoke+0×30d
00000000`035ef360 000007fe`ff4b9d86 RPCRT4!DispatchToStubInCNoAvrf+0×14
00000000`035ef390 000007fe`ff4bc44b RPCRT4!RPC_INTERFACE::DispatchToStubWorker+0×146
00000000`035ef4b0 000007fe`ff4bc38b RPCRT4!RPC_INTERFACE::DispatchToStub+0×9b
00000000`035ef4f0 000007fe`ff4bc322 RPCRT4!RPC_INTERFACE::DispatchToStubWithObject+0×5b
00000000`035ef570 000007fe`ff4ba11d RPCRT4!LRPC_SCALL::DispatchRequest+0×422
00000000`035ef650 000007fe`ff4c7ddf RPCRT4!LRPC_SCALL::HandleRequest+0×20d
00000000`035ef780 000007fe`ff4c7995 RPCRT4!LRPC_ADDRESS::ProcessIO+0×3bf
00000000`035ef8c0 00000000`77c8b43b RPCRT4!LrpcIoComplete+0xa5
00000000`035ef950 00000000`77c8923f ntdll!TppAlpcpExecuteCallback+0×26b
00000000`035ef9e0 00000000`77a6f56d ntdll!TppWorkerThread+0×3f8
00000000`035efce0 00000000`77ca3281 kernel32!BaseThreadInitThunk+0xd
00000000`035efd10 00000000`00000000 ntdll!RtlUserThreadStart+0×1d
We inspect the raw stack starting from the first top Child-SP value for RPCRT4 subtrace and find NNN:NNN data there resembling PID:TID pair:
1: kd> dpp 00000000`035ef360 l100
[...]
00000000`035ef698 00000000`00000000
00000000`035ef6a0 00000000`00000001
00000000`035ef6a8 00000000`00000000
00000000`035ef6b0 00000000`00000000
00000000`035ef6b8 00000000`00000118
00000000`035ef6c0 00000000`0000048c
00000000`035ef6c8 00000000`00495e50 000007fe`ff57d920 RPCRT4!LRPC_ADDRESS::`vftable’
00000000`035ef6d0 00000000`00000000
[…]
We find such CID in the stack trace collection and see a wait for an ALPC message reply:
THREAD fffffa8003d49b60 Cid 0118.048c Teb: 000007fffffaa000 Win32Thread: fffff900c01e4c30 WAIT: (WrLpcReply) UserMode Non-Alertable
fffffa8003d49f20 Semaphore Limit 0×1
Waiting for reply to ALPC Message fffff8a000bdb6c0 : queued at port fffffa80042f8090 : owned by process fffffa8004803b30
Not impersonating
DeviceMap fffff8a000008600
Owning Process fffffa8003cf15d0 Image: ServiceB.exe
Attached Process N/A Image: N/A
Wait Start TickCount 1441554 Ticks: 106626 (0:00:27:43.376)
Context Switch Count 23180 LargeStack
UserTime 00:00:00.468
KernelTime 00:00:03.057
Win32 Start Address ntdll!TppWorkerThread (0×0000000077c88f00)
Stack Init fffff88004ffcdb0 Current fffff88004ffc620
Base fffff88004ffd000 Limit fffff88004ff7000 Call 0
Priority 6 BasePriority 6 UnusualBoost 0 ForegroundBoost 0 IoPriority 2 PagePriority 5
Kernel stack not resident.
Child-SP RetAddr Call Site
fffff880`04ffc660 fffff800`01678992 nt!KiSwapContext+0×7a
fffff880`04ffc7a0 fffff800`0167acff nt!KiCommitThreadWait+0×1d2
fffff880`04ffc830 fffff800`0168fd1f nt!KeWaitForSingleObject+0×19f
fffff880`04ffc8d0 fffff800`01977ac6 nt!AlpcpSignalAndWait+0×8f
fffff880`04ffc980 fffff800`01975a50 nt!AlpcpReceiveSynchronousReply+0×46
fffff880`04ffc9e0 fffff800`01972fcb nt!AlpcpProcessSynchronousRequest+0×33d
fffff880`04ffcb00 fffff800`01670993 nt!NtAlpcSendWaitReceivePort+0×1ab
fffff880`04ffcbb0 00000000`77cc070a nt!KiSystemServiceCopyEnd+0×13 (TrapFrame @ fffff880`04ffcc20)
00000000`018ce308 000007fe`ff4caa76 ntdll!ZwAlpcSendWaitReceivePort+0xa
00000000`018ce310 000007fe`ff4bf802 RPCRT4!LRPC_CCALL::SendReceive+0×156
00000000`018ce3d0 000007fe`ffee0900 RPCRT4!I_RpcSendReceive+0×42
00000000`018ce400 000007fe`ffee05ef ole32!ThreadSendReceive+0×40
00000000`018ce450 000007fe`ffee041b ole32!CRpcChannelBuffer::SwitchAptAndDispatchCall+0xa3
00000000`018ce4f0 000007fe`ffd819c6 ole32!CRpcChannelBuffer::SendReceive2+0×11b
00000000`018ce6b0 000007fe`ffd81928 ole32!CAptRpcChnl::SendReceive+0×52
00000000`018ce780 000007fe`ffedfcf5 ole32!CCtxComChnl::SendReceive+0×68
00000000`018ce830 000007fe`ff56ba3b ole32!NdrExtpProxySendReceive+0×45
00000000`018ce860 000007fe`ffee02d0 RPCRT4!NdrpClientCall3+0×2e2
00000000`018ceb20 000007fe`ffd818a2 ole32!ObjectStublessClient+0×11d
00000000`018ceeb0 00000000`ff5afe64 ole32!ObjectStubless+0×42
[…]
00000000`018cf7a0 00000000`77c8f8eb ServiceB!Worker+0×366
00000000`018cf800 00000000`77c89d9f ntdll!RtlpTpWorkCallback+0×16b
00000000`018cf8e0 00000000`77a6f56d ntdll!TppWorkerThread+0×5ff
00000000`018cfbe0 00000000`77ca3281 kernel32!BaseThreadInitThunk+0xd
00000000`018cfc10 00000000`00000000 ntdll!RtlUserThreadStart+0×1d
Inspection of that message shows that it was directed to our server thread that triggered the breakpoint:
1: kd> !alpc /m fffff8a000bdb6c0
Message @ fffff8a000bdb6c0
MessageID : 0x0600 (1536)
CallbackID : 0x2D910D (2986253)
SequenceNumber : 0x0002CB50 (183120)
Type : LPC_REQUEST
DataLength : 0x0068 (104)
TotalLength : 0x0090 (144)
Canceled : No
Release : No
ReplyWaitReply : No
Continuation : Yes
OwnerPort : fffffa8004823a80 [ALPC_CLIENT_COMMUNICATION_PORT]
WaitingThread : fffffa8003d49b60
QueueType : ALPC_MSGQUEUE_PENDING
QueuePort : fffffa80042f8090 [ALPC_CONNECTION_PORT]
QueuePortOwnerProcess : fffffa8004803b30 (ServiceA.exe)
ServerThread : fffffa8005e6b060
QuotaCharged : No
CancelQueuePort : 0000000000000000
CancelSequencePort : 0000000000000000
CancelSequenceNumber : 0×00000000 (0)
ClientContext : 000000000020f0c0
ServerContext : 0000000000000000
PortContext : 0000000000416990
CancelPortContext : 0000000000000000
SecurityData : 0000000000000000
View : 0000000000000000
- Dmitry Vostokov @ DumpAnalysis.org + TraceAnalysis.org -
Posted in Complete Memory Dump Analysis, Crash Dump Analysis, Crash Dump Patterns, Debugging, x64 Windows | 1 Comment »