Crash Dump Analysis Patterns (Part 2)

Another pattern I would like to discuss is Dynamic Memory Corruption (and its user and kernel variants, called Heap Corruption and Pool Corruption). You might have already guessed it :-) It is ubiquitous. Its manifestations are random, and crashes usually happen far away from the original corruption point. In the user-space portion of the stack traces of exception threads (don’t forget about the Multiple Exceptions pattern) you would see something like this:

ntdll!RtlpCoalesceFreeBlocks+0x10c
ntdll!RtlFreeHeap+0x142
MSVCRT!free+0xda
componentA!xxx

or this

ntdll!RtlpCoalesceFreeBlocks+0x10c
ntdll!RtlpExtendHeap+0x1c1
ntdll!RtlAllocateHeap+0x3b6
componentA!xxx

or a similar variant. You then need to find the exact component that corrupted the application heap, which is usually not the componentA.dll you see on the crashed thread's stack.
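
Why the crashing component is usually not the guilty one can be illustrated with a toy model. The sketch below is not the Windows heap manager: `ToyHeap`, its header layout, the `MAGIC` value and the two "components" are all hypothetical, chosen only to show that an overrun goes unnoticed when it happens and is discovered later, when a different block is freed.

```python
import struct

HEADER = struct.Struct("<II")   # (block size, magic) stored before each payload
MAGIC = 0xFEEDFACE              # arbitrary marker for an intact header (assumption)


class HeapCorruption(Exception):
    """Raised when the toy allocator finds a damaged block header."""


class ToyHeap:
    def __init__(self, size=256):
        self.arena = bytearray(size)
        self.top = 0                      # simple bump allocation, no reuse

    def alloc(self, size):
        if self.top + HEADER.size + size > len(self.arena):
            raise MemoryError("toy arena exhausted")
        HEADER.pack_into(self.arena, self.top, size, MAGIC)
        payload = self.top + HEADER.size
        self.top = payload + size
        return payload                    # a "pointer" is an offset into the arena

    def write(self, ptr, data):
        # Like a raw memcpy: the block size is never consulted.
        if ptr + len(data) > len(self.arena):
            raise IndexError("write outside the toy arena")
        self.arena[ptr:ptr + len(data)] = data

    def free(self, ptr):
        # The header is only examined here, long after any overrun happened.
        size, magic = HEADER.unpack_from(self.arena, ptr - HEADER.size)
        if magic != MAGIC:
            raise HeapCorruption(f"bad header before offset {ptr}")


def demo():
    heap = ToyHeap()
    b = heap.alloc(8)              # block owned by a hypothetical componentB
    a = heap.alloc(8)              # block owned by a hypothetical componentA
    heap.write(b, b"X" * 16)       # componentB overruns its 8 bytes: no error here
    try:
        heap.free(a)               # componentA is the one that "crashes"
    except HeapCorruption as exc:
        return str(exc)
    return "no corruption detected"
```

Calling `demo()` reports the damage from inside `free` on componentA's block, even though componentB performed the bad write. This mirrors the stack traces above, where only componentA is visible.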

For this common, recurrent problem there is a general solution: enable heap checking. It has several variants, each suited to a specific context:

  • parameter value checking for heap functions

  • user space software heap checks before or after certain checkpoints (like “malloc”/“new” and/or “free”/“delete” calls), usually implemented by checking various fill patterns

  • hardware/OS supported heap checks (like using guard and nonaccessible pages to trap buffer overruns)
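
The fill-pattern variant from the list above can be sketched as a toy model. Everything here is an assumption made for illustration: `CheckedArena`, the 4-byte `GUARD` value and the choice of checkpoints are hypothetical and do not describe any particular runtime's debug heap.

```python
GUARD = b"\xFD" * 4     # fill pattern; value and width are arbitrary choices here


class CheckedArena:
    def __init__(self, size=128):
        self.mem = bytearray(size)
        self.top = 0
        self.blocks = {}                  # payload offset -> requested size

    def alloc(self, size):
        start = self.top
        end = start + len(GUARD) + size + len(GUARD)
        if end > len(self.mem):
            raise MemoryError("toy arena exhausted")
        self.mem[start:start + len(GUARD)] = GUARD      # guard before the payload
        self.mem[end - len(GUARD):end] = GUARD          # guard after the payload
        self.top = end
        ptr = start + len(GUARD)
        self.blocks[ptr] = size
        return ptr

    def write(self, ptr, data):
        # Unchecked copy, standing in for a buggy memcpy in application code.
        if ptr + len(data) > len(self.mem):
            raise IndexError("write outside the toy arena")
        self.mem[ptr:ptr + len(data)] = data

    def check(self, ptr):
        """Checkpoint: compare both guards of one block with the fill pattern."""
        size = self.blocks[ptr]
        problems = []
        if bytes(self.mem[ptr - len(GUARD):ptr]) != GUARD:
            problems.append("underrun")
        if bytes(self.mem[ptr + size:ptr + size + len(GUARD)]) != GUARD:
            problems.append("overrun")
        return problems

    def free(self, ptr):
        problems = self.check(ptr)        # free is one natural checkpoint
        del self.blocks[ptr]
        return problems
```

A one-byte overrun is caught, but only when `check` or `free` runs, not when the bad write happens:

```python
arena = CheckedArena()
p = arena.alloc(8)
arena.write(p, b"A" * 9)      # one byte too many, silently accepted
print(arena.free(p))          # the damaged guard is noticed only now
```

The report names the damaged block, which is more than a plain heap gives you, but it still does not identify the code that performed the write.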

The hardware/OS-supported variant is the most widely used in my experience, mainly because most heap corruptions originate from buffer overflows, and it is easier to rely on instant MMU support than on checking fill patterns. Here is an article from the Citrix support web site describing how to enable full page heap. It uses a specific process as an example, the Citrix Independent Management Architecture (IMA) service, but you can substitute the name of any application you are interested in debugging:

How to enable full page heap

and another article:

How to check in a user dump that full page heap was enabled

The following Microsoft article discusses various heap related checks:

How to use Pageheap.exe in Windows XP and Windows 2000
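
The idea behind guard-page checking, as opposed to fill patterns, can be modelled in a few lines. This is a simulation under stated assumptions, not the actual page heap implementation: `PageHeapModel`, `AccessViolation` and the 4096-byte page size are hypothetical stand-ins, and the "MMU" is just a set of page numbers consulted on every write.

```python
PAGE = 4096     # assumed page size for the model


class AccessViolation(Exception):
    """Stands in for the hardware fault raised on touching a no-access page."""


class PageHeapModel:
    def __init__(self):
        self.no_access = set()    # page numbers that must never be touched
        self.next_page = 0

    def alloc(self, size):
        if size <= 0:
            raise ValueError("size must be positive")
        pages = -(-size // PAGE)              # ceiling division
        guard = self.next_page + pages        # page right after the block's pages
        self.no_access.add(guard)
        self.next_page = guard + 1
        # Place the block so that its last byte is the last byte before the guard page.
        return guard * PAGE - size

    def write(self, addr, data):
        # The real check is done by hardware on every access; here we emulate it.
        for i in range(len(data)):
            if (addr + i) // PAGE in self.no_access:
                raise AccessViolation(f"write at address {addr + i:#x}")
```

In this model the very first byte written past the end of a block raises `AccessViolation` inside the offending `write`, so the failure points at the code that overran the buffer rather than at a later `free`. The model only traps overruns past the end of a block; writes before its start land on an accessible page.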

The Windows kernel analog of user-space heap corruption is paged and nonpaged pool corruption. If we consider Windows kernel pools as variants of a heap, then exactly the same techniques apply there. For example, the so-called special pool enabled by Driver Verifier is implemented with nonaccessible pages. Refer to the following Microsoft article for further details:

How to use the special pool feature to isolate pool damage

- Dmitry Vostokov @ DumpAnalysis.org -

