Crash Dump Analysis Patterns (Part 1)
After doing crash dump analysis exclusively for more than 3 years, I decided to organize my knowledge into a set of patterns (a dump analysis pattern language, so to speak) and thereby help establish a common vocabulary.
What is a pattern? It is a general solution you can apply in a specific context to a common recurrent problem.
There are many patterns and pattern languages in software engineering. For example, look at the following almanac that lists 700+ patterns:
and the following link is very useful:
The first pattern I’m going to introduce today is Multiple Exceptions. This pattern captures the known fact that there could be as many exceptions (“crashes”) as there are threads in a process. The following UML diagram depicts the relationship between Process, Thread and Exception entities:
Every process in Windows has at least one execution thread, so there could be at least one exception per thread (such as an invalid memory reference) if things go wrong. There could be a second exception in that thread if the exception handling code experiences another exception, or if the first exception was handled and another one follows, and so on.
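The relationship described above can be sketched as a small data model. This is only an illustration under my own assumptions: the class names ExceptionRecord, Thread and Process and the method threads_with_exceptions are hypothetical, not part of any debugger API, and the exception code of thread 15 is left empty because the dump output does not show it.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class ExceptionRecord:
    address: str                 # symbolic address, e.g. "ntdll!DbgBreakPoint"
    code: Optional[int] = None   # None when the code is not known


@dataclass
class Thread:
    index: int
    # Zero or more exceptions: a handler can fault again, or a handled
    # exception can be followed by another one.
    exceptions: List[ExceptionRecord] = field(default_factory=list)


@dataclass
class Process:
    name: str
    threads: List[Thread] = field(default_factory=list)

    def __post_init__(self) -> None:
        # Every process has at least one execution thread.
        if not self.threads:
            raise ValueError("a process has at least one thread")

    def threads_with_exceptions(self) -> List[Thread]:
        """Return every thread that carries at least one exception."""
        return [t for t in self.threads if t.exceptions]


# The two interesting threads from the example dump discussed in this post
# (all other threads are omitted here for brevity).
ie = Process(
    "Internet Explorer",
    [
        Thread(15, [ExceptionRecord("componentA!xxx")]),
        Thread(16, [ExceptionRecord("ntdll!DbgBreakPoint", 0x80000003)]),
    ],
)

for thread in ie.threads_with_exceptions():
    print(thread.index, [e.address for e in thread.exceptions])
```

The point of the model is that the exception list hangs off each thread, not off the process, so a tool that reports a single exception per dump necessarily hides part of the picture.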
So what is the general solution to the common problem of an application or service crashing (common recurrent problem) when you have a crash dump file from a customer (specific context)? The general solution is to look at all threads and their stacks and not to rely on what tools say.
Here is a concrete example from one of the dumps I got today:
Internet Explorer crashed, and I opened the dump in WinDbg and ran the ‘!analyze -v’ command. This is what I got in my WinDbg output:
ExceptionAddress: 7c822583 (ntdll!DbgBreakPoint)
ExceptionCode: 80000003 (Break instruction exception)
ExceptionFlags: 00000000
NumberParameters: 3
Parameter[0]: 00000000
Parameter[1]: 8fb834b8
Parameter[2]: 00000003
A break instruction, you might think, shows that the dump was taken manually from the running application and that there was no crash: the customer sent the wrong dump or misunderstood the instructions. However, I looked at all threads and noticed the following two stacks (threads 15 and 16):
0:016>~*kL
...
15 Id: 1734.8f4 Suspend: 1 Teb: 7ffab000 Unfrozen
ntdll!KiFastSystemCallRet
ntdll!NtRaiseHardError+0xc
kernel32!UnhandledExceptionFilter+0x54b
kernel32!BaseThreadStart+0x4a
kernel32!_except_handler3+0x61
ntdll!ExecuteHandler2+0x26
ntdll!ExecuteHandler+0x24
ntdll!KiUserExceptionDispatcher+0xe
componentA!xxx
componentB!xxx
mshtml!xxx
kernel32!BaseThreadStart+0x34
# 16 Id: 1734.11a4 Suspend: 1 Teb: 7ffaa000 Unfrozen
ntdll!DbgBreakPoint
ntdll!DbgUiRemoteBreakin+0x36
So we see here that the real crash happened in componentA.dll, and that componentB.dll or mshtml.dll might have influenced it. Why did this happen? The customer might have dumped Internet Explorer manually while it was displaying an exception message box. The following reference says that ZwRaiseHardError (the Zw name of the NtRaiseHardError call seen on the stack of thread 15) displays a message box containing an error message:
Windows NT/2000 Native API Reference
Or perhaps something else happened. Many cases where we see multiple thread exceptions in one process dump happen because crashed threads displayed message boxes, such as the Visual C++ debug message box, which prevented the process from terminating. In the dump under discussion, the WinDbg automatic analysis command recognized only the last breakpoint exception (shown as # 16). In conclusion, we shouldn’t rely on “automatic analysis” too often anyway, and we should probably write our own extension to list possible multiple exceptions (based on some heuristics I will talk about later).
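As a rough illustration of the "look at all threads" advice, here is a minimal sketch that scans saved ~*kL text output for frames that usually accompany exception processing. It is not a WinDbg extension and not the heuristics promised above: the marker list is my own assumption taken only from the two stacks in this post, and the names MARKERS, THREAD_HEADER and find_suspect_threads are hypothetical.

```python
import re
from typing import Dict, List

# Assumed marker frames, taken from the stacks shown in this post.
# A real tool would need a longer and more careful list.
MARKERS = (
    "KiUserExceptionDispatcher",
    "UnhandledExceptionFilter",
    "NtRaiseHardError",
    "DbgBreakPoint",
)

# Thread header line such as "  15  Id: 1734.8f4 Suspend: 1 ..." or
# "# 16  Id: 1734.11a4 ..." ("#" and "." are WinDbg thread markers).
THREAD_HEADER = re.compile(r"^\s*[.#]?\s*(\d+)\s+Id:\s*[0-9a-fA-F]+\.[0-9a-fA-F]+")


def find_suspect_threads(stack_dump: str) -> Dict[int, List[str]]:
    """Map thread number -> frames on that thread that match a marker."""
    suspects: Dict[int, List[str]] = {}
    current = None
    for line in stack_dump.splitlines():
        header = THREAD_HEADER.match(line)
        if header:
            current = int(header.group(1))
            continue
        if current is None:
            continue  # text before the first thread header
        if any(marker in line for marker in MARKERS):
            suspects.setdefault(current, []).append(line.strip())
    return suspects


SAMPLE = """\
0:016>~*kL
...
  15  Id: 1734.8f4 Suspend: 1 Teb: 7ffab000 Unfrozen
ntdll!KiFastSystemCallRet
ntdll!NtRaiseHardError+0xc
kernel32!UnhandledExceptionFilter+0x54b
kernel32!BaseThreadStart+0x4a
kernel32!_except_handler3+0x61
ntdll!ExecuteHandler2+0x26
ntdll!ExecuteHandler+0x24
ntdll!KiUserExceptionDispatcher+0xe
componentA!xxx
componentB!xxx
mshtml!xxx
kernel32!BaseThreadStart+0x34
# 16  Id: 1734.11a4 Suspend: 1 Teb: 7ffaa000 Unfrozen
ntdll!DbgBreakPoint
ntdll!DbgUiRemoteBreakin+0x36
"""

for thread_number, frames in find_suspect_threads(SAMPLE).items():
    print(thread_number, frames)
```

On the stacks from this post, the scan flags both thread 15 (the hard error raised from the unhandled exception filter) and thread 16 (the breakpoint), whereas the automatic analysis reported only the latter. Plain substring matching is deliberately crude; it will produce false positives and should be treated as a pointer to stacks worth reading, not as a verdict.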
- Dmitry Vostokov @ DumpAnalysis.org -
November 15th, 2006 at 7:15 am
Hai,
Really very very thanks for the information that you have posted, it helped me lot.
September 17th, 2008 at 9:42 am
[…] 출처(source) - http://www.dumpanalysis.org/blog/index.php/2006/10/30/crash-dump-analysis-patterns-part-1/ […]
September 24th, 2008 at 6:36 am
Great Information ! Thanks !