Archive for the ‘Crash Dump Analysis’ Category

Crash Dump Analysis Patterns (Part 39)

Friday, November 23rd, 2007

As mentioned in the Early Crash Dump pattern, saving crash dumps on first-chance exceptions helps to diagnose components that might have caused corruption and later crashes, hangs or CPU spikes by ignoring abnormal exceptions like access violations. In such cases we need to know whether an application installs its own Custom Exception Handler (or several of them). If it uses only the default handlers provided by the runtime or the Windows subsystem then most likely a first-chance access violation exception will result in a last-chance exception and a postmortem dump. To check the chain of exception handlers we can use the WinDbg !exchain extension command. For example:

0:000> !exchain
0017f9d8: TestDefaultDebugger!AfxWinMain+3f5 (00420aa9)
0017fa60: TestDefaultDebugger!AfxWinMain+34c (00420a00)
0017fb20: user32!_except_handler4+0 (770780eb)
0017fcc0: user32!_except_handler4+0 (770780eb)
0017fd24: user32!_except_handler4+0 (770780eb)
0017fe40: TestDefaultDebugger!AfxWinMain+16e (00420822)
0017feec: TestDefaultDebugger!AfxWinMain+797 (00420e4b)
0017ff90: TestDefaultDebugger!_except_handler4+0 (00410e00)
0017ffdc: ntdll!_except_handler4+0 (77961c78)

We see that TestDefaultDebugger doesn’t have its own exception handlers except the ones provided by the statically linked MFC and C/C++ runtime libraries. Here is another example. It was reported that a 3rd-party application was hanging and spiking CPU (Spiking Thread pattern), so a user dump was saved using the command-line userdump.exe tool:

0:000> vertarget
Windows Server 2003 Version 3790 (Service Pack 2) MP (4 procs) Free x86 compatible
Product: Server, suite: TerminalServer
kernel32.dll version: 5.2.3790.4062 (srv03_sp2_gdr.070417-0203)
Debug session time: Thu Nov 22 12:45:59.000 2007 (GMT+0)
System Uptime: 0 days 10:43:07.667
Process Uptime: 0 days 4:51:32.000 
Kernel time: 0 days 0:08:04.000 
User time: 0 days 0:23:09.000

0:000> !runaway 3 
User Mode Time 
Thread Time  
0:1c1c      0 days 0:08:04.218  
1:2e04      0 days 0:00:00.015
Kernel Mode Time 
Thread Time  
0:1c1c      0 days 0:23:09.156  
1:2e04      0 days 0:00:00.031

0:000> kL
ChildEBP RetAddr 
0012fb80 7739bf53 ntdll!KiFastSystemCallRet
0012fbb4 05ca73b0 user32!NtUserWaitMessage+0xc
WARNING: Stack unwind information not available. Following frames may be wrong.
0012fd20 05c8be3f 3rdPartyDLL+0x573b0
0012fd50 05c9e9ea 3rdPartyDLL+0x3be3f
0012fd68 7739b6e3 3rdPartyDLL+0x4e9ea
0012fd94 7739b874 user32!InternalCallWinProc+0x28
0012fe0c 7739c8b8 user32!UserCallWinProcCheckWow+0x151
0012fe68 7739c9c6 user32!DispatchClientMessage+0xd9
0012fe90 7c828536 user32!__fnDWORD+0x24
0012febc 7739d1ec ntdll!KiUserCallbackDispatcher+0x2e
0012fef8 7738cee9 user32!NtUserMessageCall+0xc
0012ff18 0050aea9 user32!SendMessageA+0x7f
0012ff70 00452ae4 3rdPartyApp+0x10aea9
0012ffac 00511941 3rdPartyApp+0x52ae4
0012ffc0 77e6f23b 3rdPartyApp+0x111941
0012fff0 00000000 kernel32!BaseProcessStart+0x23

The exception chain showed custom exception handlers:

0:000> !exchain
0012fb8c: 3rdPartyDLL+57acb (05ca7acb)
0012fd28: 3rdPartyDLL+3be57 (05c8be57)
0012fd34: 3rdPartyDLL+3be68 (05c8be68)

0012fdfc: user32!_except_handler3+0 (773aaf18)
  CRT scope  0, func:   user32!UserCallWinProcCheckWow+156 (773ba9ad)
0012fe58: user32!_except_handler3+0 (773aaf18)
0012fea0: ntdll!KiUserCallbackExceptionHandler+0 (7c8284e8)
0012ff3c: 3rdPartyApp+53310 (00453310)
0012ff48: 3rdPartyApp+5334b (0045334b)
0012ff9c: 3rdPartyApp+52d06 (00452d06)
0012ffb4: 3rdPartyApp+38d4 (004038d4)

0012ffe0: kernel32!_except_handler3+0 (77e61a60)
  CRT scope  0, filter: kernel32!BaseProcessStart+29 (77e76a10)
                func:   kernel32!BaseProcessStart+3a (77e81469)

The customer then enabled MS Exception Monitor and selected only the Access violation exception code (c0000005) to avoid False Positive Dumps. During application execution various first-chance exception crash dumps were saved, pointing to numerous access violations including function calls into unloaded modules, for example:

0:000> kL 100
ChildEBP RetAddr 
WARNING: Frame IP not in any known module. Following frames may be wrong.
0012f910 7739b6e3 <Unloaded_Another3rdParty.dll>+0x4ce58
0012f93c 7739b874 user32!InternalCallWinProc+0x28
0012f9b4 7739c8b8 user32!UserCallWinProcCheckWow+0x151
0012fa10 7739c9c6 user32!DispatchClientMessage+0xd9
0012fa38 7c828536 user32!__fnDWORD+0x24
0012fa64 7739d1ec ntdll!KiUserCallbackDispatcher+0x2e
0012faa0 7738cee9 user32!NtUserMessageCall+0xc
0012fac0 0a0f2e01 user32!SendMessageA+0x7f
0012fae4 0a0f2ac7 3rdPartyDLL+0x52e01
0012fb60 7c81a352 3rdPartyDLL+0x52ac7
0012fb80 7c839dee ntdll!LdrpCallInitRoutine+0x14
0012fc94 77e6b1bb ntdll!LdrUnloadDll+0x41a
0012fca8 0050c9c1 kernel32!FreeLibrary+0x41
0012fdf4 004374af 3rdPartyApp+0x10c9c1
0012fe24 0044a076 3rdPartyApp+0x374af
0012fe3c 7739b6e3 3rdPartyApp+0x4a076
0012fe68 7739b874 user32!InternalCallWinProc+0x28
0012fee0 7739ba92 user32!UserCallWinProcCheckWow+0x151
0012ff48 773a16e5 user32!DispatchMessageWorker+0x327
0012ff58 00452aa0 user32!DispatchMessageA+0xf
0012ffac 00511941 3rdPartyApp+0x52aa0
0012ffc0 77e6f23b 3rdPartyApp+0x111941
0012fff0 00000000 kernel32!BaseProcessStart+0x23

- Dmitry Vostokov @ DumpAnalysis.org -

Four causes of crash dumps

Friday, November 23rd, 2007

Obviously the appearance of crash dumps on your computer was caused by something. A bug, a fault, a defect or something else?

Aristotle suggested four types of causation two millennia ago; they are:

Material cause - the presence of some substance, usually a material one (hardware) but it can also be machine code (software). The distinction between hardware and software is often blurred today because of virtualization.

Formal cause - some form or arrangement (an algorithm)

Efficient cause - an agent (a data flow or event that caused an algorithm to be executed)

Final cause - the desire of someone (or something, operating system, for example).

We skip material causes because hardware and software are always involved. Obviously final causality should be among the crash dump causes because dumps were either anticipated or made deliberately. Let’s look at three examples with possible causes:

Buffer Overflow

  • Formal cause - a defect in code which might have arisen from an incomplete or wrong model

  • Efficient cause - data is too big to fit in a buffer

  • Final cause - operating system and runtime library support decided to save a crash dump

Bugcheck (NMI)

  • Formal cause - NMI handler

  • Efficient cause - a button on a hardware panel or KeBugCheckEx

  • Final cause - the “I need a memory dump” desire. Also, crash dump saving functions were written by kernel developers in anticipation of future crash dumps.

Bugcheck (A)

  • Formal cause - a defect in code again, or a particular disposition of threads

  • Efficient cause - Driver Verifier triggered paging out data

  • Final cause - deliberate OS bugcheck (here we can also say that it was anticipated by OS designers)

Concrete causes depend on the organizational level you use: software/hardware systems/components, the modeling act by humans, etc.

- Dmitry Vostokov @ DumpAnalysis.org -

Crash Dump Analysis Patterns (Part 38)

Thursday, November 22nd, 2007

Hooking functions using the trampoline method is common on Windows, and sometimes we need to check Hooked Functions in specific modules and determine which module hooked them for troubleshooting or memory forensic analysis needs. If the original unhooked modules are available (via a symbol server, for example) this can be done using the !chkimg WinDbg extension command:

0:002> !chkimg -lo 50 -d !kernel32 -v
Searching for module with expression: !kernel32
Will apply relocation fixups to file used for comparison
Will ignore NOP/LOCK errors
Will ignore patched instructions
Image specific ignores will be applied
Comparison image path: c:\mss\kernel32.dll\44C60F39102000\kernel32.dll
No range specified

Scanning section:    .text
Size: 564445
Range to scan: 77e41000-77ecacdd
    77e44004-77e44008  5 bytes - kernel32!GetDateFormatA
 [ 8b ff 55 8b ec:e9 f7 bf 08 c0 ]
    77e4412e-77e44132  5 bytes - kernel32!GetTimeFormatA (+0x12a)
 [ 8b ff 55 8b ec:e9 cd be 06 c0 ]
    77e4e857-77e4e85b  5 bytes - kernel32!FileTimeToLocalFileTime (+0xa729)
 [ 8b ff 55 8b ec:e9 a4 17 00 c0 ]
    77e56b5f-77e56b63  5 bytes - kernel32!GetTimeZoneInformation (+0x8308)
 [ 8b ff 55 8b ec:e9 9c 94 00 c0 ]
    77e579a9-77e579ad  5 bytes - kernel32!GetTimeFormatW (+0xe4a)
 [ 8b ff 55 8b ec:e9 52 86 06 c0 ]
    77e57fc8-77e57fcc  5 bytes - kernel32!GetDateFormatW (+0x61f)
 [ 8b ff 55 8b ec:e9 33 80 08 c0 ]
    77e6f32b-77e6f32f  5 bytes - kernel32!GetLocalTime (+0x17363)
 [ 8b ff 55 8b ec:e9 d0 0c 00 c0 ]
    77e6f891-77e6f895  5 bytes - kernel32!LocalFileTimeToFileTime (+0x566)
 [ 8b ff 55 8b ec:e9 6a 07 01 c0 ]
    77e83499-77e8349d  5 bytes - kernel32!SetLocalTime (+0x13c08)
 [ 8b ff 55 8b ec:e9 62 cb 00 c0 ]
    77e88c32-77e88c36  5 bytes - kernel32!SetTimeZoneInformation (+0x5799)
 [ 8b ff 55 8b ec:e9 c9 73 01 c0 ]
Total bytes compared: 564445(100%)
Number of errors: 50
50 errors : !kernel32 (77e44004-77e88c36)

0:002> u 77e44004
kernel32!GetDateFormatA:
77e44004 e9f7bf08c0      jmp     37ed0000
77e44009 81ec18020000    sub     esp,218h
77e4400f a148d1ec77      mov     eax,dword ptr [kernel32!__security_cookie (77ecd148)]
77e44014 53              push    ebx
77e44015 8b5d14          mov     ebx,dword ptr [ebp+14h]
77e44018 56              push    esi
77e44019 8b7518          mov     esi,dword ptr [ebp+18h]
77e4401c 57              push    edi

0:002> u 37ed0000
*** ERROR: Symbol file could not be found.  Defaulted to export symbols for MyDateTimeHooks.dll -
37ed0000 e99b262f2d      jmp     MyDateTimeHooks+0x26a0 (651c26a0)
37ed0005 8bff            mov     edi,edi
37ed0007 55              push    ebp
37ed0008 8bec            mov     ebp,esp
37ed000a e9fa3ff73f      jmp     kernel32!GetDateFormatA+0x5 (77e44009)
37ed000f 0000            add     byte ptr [eax],al
37ed0011 0000            add     byte ptr [eax],al
37ed0013 0000            add     byte ptr [eax],al

- Dmitry Vostokov @ DumpAnalysis.org -

Crash Dump Analysis AntiPatterns (Part 6)

Thursday, November 22nd, 2007

Need the crash dump. Period. This might be the first thought when an engineer gets a stack trace fragment without symbolic information. It is usually based on the following presupposition:

We need an actual dump file to suggest further troubleshooting steps.

This is not actually true, unless it is the first time you see the problem and get a stack trace for it. Consider the following fragment from a bugcheck kernel dump where no symbols were applied because the customer didn’t have them:

b90529f8 8085eced nt!KeBugCheckEx+0x1b
b9052a70 8088c798 nt!MmAccessFault+0xb25
b9052a70 bfabd940 nt!_KiTrap0E+0xdc
WARNING: Stack unwind information not available. Following frames may be wrong.
b9052b14 bfabe452 MyDriver+0x27940

We can convert module+offset information into module!function+offset2 using MAP files, or by using DIA SDK (Debug Interface Access SDK) to query PDB files if we know the module timestamp. This might seem a tedious exercise but we don’t need to do it if we keep raw stack trace signatures in some database when doing crash dump analysis. If we use our own symbol servers we might want to remove references to them, reload symbols, and then redo the previous stack trace commands.

In my case it happened that I had already analyzed similar bugcheck crash dumps months before and saved the stack trace prior to applying symbols. This helped me to point to a solution without requesting the crash dump corresponding to that stack trace.

- Dmitry Vostokov @ DumpAnalysis.org -

Critical thinking when troubleshooting

Thursday, November 22nd, 2007

Faulty thinking happens all the time in technical support environments, partly due to hectic and demanding business realities.

The Simple*ology book pointed me to this website:

http://www.fallacyfiles.org/ 

which taxonomically organizes fallacies:

http://www.fallacyfiles.org/taxonomy.html

For example, False Cause. Technical examples might include false causes inferred from trace analysis, from a customer problem description that includes steps to reproduce the problem, etc. This also applies to debugging, and the importance of thinking skills has been emphasized in the following book:

Debugging by Thinking: A Multidisciplinary Approach

Surface-level basic crash dump analysis is less influenced by false cause fallacies because it doesn’t have an explicitly recorded sequence of events, although some caution should be exercised during detailed analysis of thread waiting times and other historical information.

Warning: when exercising critical thinking recursively we need to stop at the right time to avoid analysis paralysis :-)

- Dmitry Vostokov @ DumpAnalysis.org

Crash Dump Analysis Patterns (Part 37)

Wednesday, November 21st, 2007

Some bugs are fixed using a brute-force approach: putting an exception handler in place to catch access violations and other exceptions. A long time ago I saw one such “incredible fix” where an image processing application was crashing after approximately the Nth heap free runtime call. To ignore the crashes a SEH handler was put in place, but the application started to crash in different places. Therefore the additional fix was to skip free calls when approaching N and resume them afterwards. The application started to crash less frequently.

Here getting an Early Crash Dump when a first-chance exception happens can help in component identification before corruption starts spreading across data. Recall that when an access violation happens in a process thread in user mode, the system generates a first-chance exception which can be caught by an attached debugger. If there is no such debugger, the system tries to find an exception handler; if that handler catches and dismisses the exception, the thread resumes its normal execution path. If no such handlers are found, the system generates the so-called second-chance exception with the same exception context to notify the attached debugger, and if no debugger is attached, a default thread exception handler usually saves a postmortem user dump.

You can get first-chance exception memory dumps with the Debug Diagnostic tool, for example. Here is an example configuration rule for crashes in the Debug Diagnostic tool for the TestDefaultDebugger process (the Unconfigured First Chance Exceptions option is set to Full Userdump):


When we push the big crash button in the TestDefaultDebugger dialog box, two crash dumps are saved, with the first- and second-chance exceptions pointing to the same code:

Loading Dump File [C:\Program Files (x86)\DebugDiag\Logs\Crash rule for all instances of TestDefaultDebugger.exe\TestDefaultDebugger__PID__4316__ Date__11_21_2007__Time_04_28_27PM__2__First chance exception 0XC0000005.dmp]
User Mini Dump File with Full Memory: Only application data is available

Comment: 'Dump created by DbgHost. First chance exception 0XC0000005'
Symbol search path is: srv*c:\mss*http://msdl.microsoft.com/download/symbols
Executable search path is:
Windows Vista Version 6000 MP (2 procs) Free x86 compatible
Product: WinNt, suite: SingleUserTS
Debug session time: Wed Nov 21 16:28:27.000 2007 (GMT+0)
System Uptime: 0 days 23:45:34.711
Process Uptime: 0 days 0:01:09.000

This dump file has an exception of interest stored in it.
The stored exception information can be accessed via .ecxr.
(10dc.590): Access violation - code c0000005 (first/second chance not available)
eax=00000000 ebx=00000001 ecx=0017fe70 edx=00000000 esi=00425ae8 edi=0017fe70
eip=004014f0 esp=0017f898 ebp=0017f8a4 iopl=0 nv up ei ng nz ac pe cy
cs=0023 ss=002b ds=002b es=002b fs=0053 gs=002b efl=00010297
TestDefaultDebugger!CTestDefaultDebuggerDlg::OnBnClickedButton1:
004014f0 c7050000000000000000 mov dword ptr ds:[0],0  ds:002b:00000000=????????

Loading Dump File [C:\Program Files (x86)\DebugDiag\Logs\Crash rule for all instances of TestDefaultDebugger.exe\TestDefaultDebugger__PID__4316__ Date__11_21_2007__Time_04_28_34PM__693__ Second_Chance_Exception_C0000005.dmp]
User Mini Dump File with Full Memory: Only application data is available

Comment: 'Dump created by DbgHost. Second_Chance_Exception_C0000005'
Symbol search path is: srv*c:\mss*http://msdl.microsoft.com/download/symbols
Executable search path is:
Windows Vista Version 6000 MP (2 procs) Free x86 compatible
Product: WinNt, suite: SingleUserTS
Debug session time: Wed Nov 21 16:28:34.000 2007 (GMT+0)
System Uptime: 0 days 23:45:39.313
Process Uptime: 0 days 0:01:16.000

This dump file has an exception of interest stored in it.
The stored exception information can be accessed via .ecxr.
(10dc.590): Access violation - code c0000005 (first/second chance not available)
eax=00000000 ebx=00000001 ecx=0017fe70 edx=00000000 esi=00425ae8 edi=0017fe70
eip=004014f0 esp=0017f898 ebp=0017f8a4 iopl=0 nv up ei ng nz ac pe cy
cs=0023 ss=002b ds=002b es=002b fs=0053 gs=002b efl=00010297
TestDefaultDebugger!CTestDefaultDebuggerDlg::OnBnClickedButton1:
004014f0 c7050000000000000000 mov dword ptr ds:[0],0  ds:002b:00000000=????????

- Dmitry Vostokov @ DumpAnalysis.org -

Crash Dump Analysis on Solaris x86 - AMD64

Tuesday, November 20th, 2007

Found the following book, which is an interesting read to see crash dump analysis from a different operating system architecture perspective but on the same Intel / AMD platform:

http://www.genunix.org/gen/crashdump/book.pdf

- Dmitry Vostokov @ DumpAnalysis.org

Crash Dump Analysis Patterns (Part 31a)

Tuesday, November 20th, 2007

I have already discussed the Passive Thread pattern in user space. In this part I continue with kernel space and passive system threads that don’t run in any user process context. These threads belong to the so-called System process, don’t have any user space stack, and their full stack traces can be seen in the output of the !process command (if not completely paged out):

1: kd> !process 0 ff System

or from the system portion of !stacks 2 command output.

Some system threads from that list belong to core OS functionality and are not passive (function offsets can vary between OS versions and service packs):

nt!KiSwapContext+0x84
nt!KiSwapThread+0x125
nt!KeWaitForSingleObject+0x5f5
nt!MmZeroPageThread+0x180
nt!Phase1Initialization+0xe
nt!PspSystemThreadStartup+0x5b
nt!KiStartSystemThread+0x16

nt!KiSwapContext+0x84
nt!KiSwapThread+0x125
nt!KeWaitForSingleObject+0x5f5
nt!MiModifiedPageWriter+0x59
nt!PspSystemThreadStartup+0x5b
nt!KiStartSystemThread+0x16

nt!KiSwapContext+0x84
nt!KiSwapThread+0x125
nt!KeWaitForMultipleObjects+0x703
nt!MiMappedPageWriter+0xad
nt!PspSystemThreadStartup+0x5b
nt!KiStartSystemThread+0x16

nt!KiSwapContext+0x84
nt!KiSwapThread+0x125
nt!KeWaitForMultipleObjects+0x703
nt!KeBalanceSetManager+0x101
nt!PspSystemThreadStartup+0x5b
nt!KiStartSystemThread+0x16

nt!KiSwapContext+0x84
nt!KiSwapThread+0x125
nt!KeWaitForSingleObject+0x5f5
nt!KeSwapProcessOrStack+0x44
nt!PspSystemThreadStartup+0x5b
nt!KiStartSystemThread+0x16

nt!KiSwapContext+0x84
nt!KiSwapThread+0x125
nt!KeWaitForSingleObject+0x5f5
nt!EtwpLogger+0xdd
nt!PspSystemThreadStartup+0x5b
nt!KiStartSystemThread+0x16

nt!KiSwapContext+0x84
nt!KiSwapThread+0x125
nt!KeWaitForSingleObject+0x5f5
nt!KiExecuteDpc+0x198
nt!PspSystemThreadStartup+0x5b
nt!KiStartSystemThread+0x16

nt!KiSwapContext+0x84
nt!KiSwapThread+0x125
nt!KeWaitForMultipleObjects+0x703
nt!CcQueueLazyWriteScanThread+0x73
nt!PspSystemThreadStartup+0x5b
nt!KiStartSystemThread+0x16

nt!KiSwapContext+0x84
nt!KiSwapThread+0x125
nt!KeWaitForMultipleObjects+0x703
nt!ExpWorkerThreadBalanceManager+0x85
nt!PspSystemThreadStartup+0x5b
nt!KiStartSystemThread+0x16

Other threads belong to various worker queues (they can also be seen in the !exqueue ff command output) and wait for data items to arrive (passive threads):

nt!KiSwapContext+0x84
nt!KiSwapThread+0x125
nt!KeRemoveQueueEx+0x848
nt!ExpWorkerThread+0x104
nt!PspSystemThreadStartup+0x5b
nt!KiStartSystemThread+0x16

or

nt!KiSwapContext+0x26
nt!KiSwapThread+0x2e5
nt!KeRemoveQueue+0x417
nt!ExpWorkerThread+0xc8
nt!PspSystemThreadStartup+0x2e
nt!KiThreadStartup+0x16

Non-Exp system threads having Worker, Logging or Logger substrings in their function names are passive threads and wait for data too, for example:

nt!KiSwapContext+0x84
nt!KiSwapThread+0x125
nt!KeWaitForMultipleObjects+0x703
nt!PfTLoggingWorker+0x81
nt!PspSystemThreadStartup+0x5b
nt!KiStartSystemThread+0x16

nt!KiSwapContext+0x84
nt!KiSwapThread+0x125
nt!KeWaitForSingleObject+0x5f5
nt!EtwpLogger+0xdd
nt!PspSystemThreadStartup+0x5b
nt!KiStartSystemThread+0x16

nt!KiSwapContext+0x84
nt!KiSwapThread+0x125
nt!KeRemoveQueueEx+0x848
nt!KeRemoveQueue+0x21
rdpdr!RxpWorkerThreadDispatcher+0x6f
nt!PspSystemThreadStartup+0x5b
nt!KiStartSystemThread+0x16

nt!KiSwapContext+0x84
nt!KiSwapThread+0x125
nt!KeWaitForSingleObject+0x5f5
HTTP!UlpThreadPoolWorker+0x26c
nt!PspSystemThreadStartup+0x5b
nt!KiStartSystemThread+0x16

nt!KiSwapContext+0x84
nt!KiSwapThread+0x125
nt!KeRemoveQueueEx+0x848
nt!KeRemoveQueue+0x21
srv2!SrvProcWorkerThread+0x74
nt!PspSystemThreadStartup+0x5b
nt!KiStartSystemThread+0x16

nt!KiSwapContext+0x84
nt!KiSwapThread+0x125
nt!KeRemoveQueueEx+0x848
nt!KeRemoveQueue+0x21
srv!WorkerThread+0x90
nt!PspSystemThreadStartup+0x5b
nt!KiStartSystemThread+0x16

Any deviations in a memory dump can raise suspicion, like in the stack below for driver.sys:

nt!KiSwapContext+0x26
nt!KiSwapThread+0x284
nt!KeWaitForSingleObject+0x346
nt!ExpWaitForResource+0xd5
nt!ExAcquireResourceExclusiveLite+0x8d
nt!ExEnterCriticalRegionAndAcquireResourceExclusive+0x19

driver!ProcessItem+0x2f
driver!DelayedWorker+0x27

nt!ExpWorkerThread+0x104
nt!PspSystemThreadStartup+0x5b
nt!KiStartSystemThread+0x16

- Dmitry Vostokov @ DumpAnalysis.org

NotMyLeak

Monday, November 19th, 2007

To troubleshoot and study memory leaks, the following tool called NotMyLeak will be released soon. It injects different kinds of leaks into specified processes and the system:

  • Process heap
  • Runtime library
  • Performance counters
  • Kernel paged pool
  • Kernel nonpaged pool
  • IRP
  • Handles
  • PTE
  • etc…

The idea is to model various real-life leaks, analyze memory dumps and then apply discovered patterns to crash dump analysis of memory dumps coming from real-world systems.

The draft GUI (subject to change):

Note: the tool name prefix NotMy… was inspired by the name of Mark Russinovich’s tool called NotMyFault.

- Dmitry Vostokov @ DumpAnalysis.org

Windows Internals book

Monday, November 19th, 2007

Scheduled to be updated with Windows Vista and Windows Server 2008 details:

Windows® Internals, Fifth Edition

- Dmitry Vostokov @ DumpAnalysis.org

Filtering processes

Monday, November 19th, 2007

When I analyze memory dumps coming from Microsoft or Citrix terminal services environments I frequently need to find the process hosting the terminal service. In Windows 2000 it was a separate process, termsrv.exe, and now it is termsrv.dll, which can be loaded into any of several instances of svchost.exe. The simplest way to narrow down that svchost.exe process, if we have a complete memory dump, is to use the module option of the WinDbg !process command:

!process /m termsrv.dll 0

!process /m wsxica.dll 0

!process /m ctxrdpwsx.dll 0

Note: this option works only with W2K3, XP and later OS versions.

Also, to list all processes with user space stacks having the same image name, we can use:

!process 0 ff msiexec.exe

or  

!process 0 ff svchost.exe

Note: this command works with W2K too, as well as the session option (/s).

- Dmitry Vostokov @ DumpAnalysis.org

Exceptions Ab Initio

Friday, November 16th, 2007

Where do native exceptions come from? How do they propagate from hardware and eventually result in crash dumps? I was asking these questions when I started doing crash dump analysis more than four years ago, and I tried to find answers using the IA-32 Intel® Architecture Software Developer’s Manual, WinDbg and complete memory dumps.

Eventually I wrote some blog posts about my findings. They were buried among many other posts, so I dug them out and put them on a dedicated page:

Interrupts and Exceptions Explained

- Dmitry Vostokov @ DumpAnalysis.org

Memorillion and Quadrimemorillion

Thursday, November 15th, 2007

What are these? These are the names for the number of possible unique complete memory dumps when the address space is 32-bit and 64-bit correspondingly:

256^(2^32) and 256^(2^64)

The first of them can be approximated by 10^(10^10).

This idea came to me after I learnt about the so-called “immense number” proposed by Walter Elsasser. This number is so big that its digits cannot be listed because there are not enough particles in the observable Universe to write them.

Certainly one memorillion is more than one googol (10^100) but it requires only approx. 10^10 particles in the ideal case to list its digits, and is therefore not an immense number. It is however far less than one googolplex (10^(10^100)).

Consider a complete memory dump with bytes written in hexadecimal notation:

0x50414745554d500f000000ce0e00000090...

This number has more than 8 billion digits… And it is one possible number out of a memorillion of them. So one memorillion in hexadecimal notation is just

0xFFFFFFFFFFFFFFFFFFFFF... + 1

where we have 2*2^32 ‘F’ symbols written sequentially. One quadrimemorillion has 2*2^64 ‘F’ symbols.

Also, the question about the number of possible crash dumps can be considered a Microsoft interview style question when you have candidates and you want to assess their ability to think outside the box and handle large numbers.

- Dmitry Vostokov @ DumpAnalysis.org -

Crash Dump Analysis Patterns (Part 36)

Wednesday, November 14th, 2007

The pattern I should have written about as one of the first is called Local Buffer Overflow. It is observed on x86 platforms when a local variable and a function return address and/or the saved frame pointer EBP are overwritten with some data. As a result, the instruction pointer EIP becomes a Wild Pointer and we have a process crash in user mode or a bugcheck in kernel mode. Sometimes this pattern is diagnosed by looking at mismatched EBP and ESP values; in the case of an ASCII or UNICODE buffer overflow the EIP register may contain a 4-char or 2-wchar_t value, and the ESP or EBP registers (or both) might point at some string fragment like in the example below:

0:000> r
eax=000fa101 ebx=0000c026 ecx=01010001 edx=bd43a010 esi=000003e0 edi=00000000
eip=0048004a esp=0012f158 ebp=00510044 iopl=0  nv up ei pl nz na po nc
cs=001b ss=0023 ds=0023 es=0023 fs=0038 gs=0000 efl=00000202
0048004a 0000 add     byte ptr [eax],al  ds:0023:000fa101=??

0:000> kL
ChildEBP RetAddr 
WARNING: Frame IP not in any known module. Following frames may be wrong.
0012f154 00420047 0x48004a
0012f158 00440077 0x420047
0012f15c 00420043 0x440077
0012f160 00510076 0x420043
0012f164 00420049 0x510076
0012f168 00540041 0x420049
0012f16c 00540041 0x540041
...
...
...

Good buffer overflow case studies with complete analysis, including an assembly language tutorial, can be found in the Buffer Overflow Attacks book.


- Dmitry Vostokov @ DumpAnalysis.org -

Crash Dump Analysis Patterns (Part 35)

Monday, November 12th, 2007

In kernel or complete memory dumps coming from hanging or slow workstations and servers, the !irpfind WinDbg command may show the IRP Distribution Anomaly pattern, where certain drivers have an excessive count of active IRPs not observed under normal circumstances. I created two IRP distribution graphs from two problem kernel dumps by preprocessing the command output using Visual Studio keyboard macros to eliminate completed IRPs and then using Excel. In one case there was a big number of I/O request packets from a 3rd-party antivirus filter driver:

\Driver\3rdPartyAvFilter

In the second case there was a huge number of active IRPs targeted at the kernel socket ancillary function driver:

\Driver\AFD

Two other peaks on both graphs are related to NPFS and NTFS (pipes and the file system) and are usually normal. Here is the IRP distribution graph from my Vista workstation captured while I was writing this post:

- Dmitry Vostokov @ DumpAnalysis.org -

Memory Dump Analysis using Excel

Friday, November 9th, 2007

Some WinDbg commands output data in tabular format, so it is possible to save their output to a text file, import it into Excel and do sorting, filtering, graph visualization, etc. Such commands include:

!stacks 1

Lists all threads with a Ticks column, so you can, for example, sort and filter threads that had been waiting no more than 100 ticks.

!irpfind

Here we can create various histograms, for example, an IRP distribution based on the [Driver] column.

I’ll show more examples later, but for now here is the graph depicting thread distribution in PID - TID coordinates on a busy multiprocessor system with 25 user sessions and more than 3,000 threads:

WinDbg scripts offer the possibility to output various tabulated data via .printf:

0:000> .printf "a\tb\tc"
a       b       c

- Dmitry Vostokov @ DumpAnalysis.org -

TestDefaultDebugger.NET

Thursday, November 8th, 2007

Sometimes there are situations when we need to test exception handling to see whether it works and how to get dumps or logs from it. For example, a customer reports infrequent process crashes but no dumps are saved. Then we can try some application that crashes immediately to see whether it results in error messages and/or saved crash dumps. This was the motivation behind the TestDefaultDebugger package. Unfortunately it contains only native applications, and today I needed to test .NET CLR exception handling and see what messages it shows in my environment. So I wrote a simple program in C# that creates an empty Stack object and then calls its Pop method, which triggers a “Stack empty” exception sufficient for my purposes.

The updated package now includes TestDefaultDebugger.NET.exe and can be downloaded from Citrix support web site (requires free registration):

Download TestDefaultDebugger package

- Dmitry Vostokov @ DumpAnalysis.org -

Symbol file warnings in WinDbg 6.8.0004.0

Thursday, November 8th, 2007

I started using the new WinDbg 6.8.0004.0 and found that it prints the following message twice when I open a process dump or a complete memory dump where the current context is from some user mode process:

0:000> !analyze -v
...
...
...
***
***    Your debugger is not using the correct symbols
***
***    In order for this command to work properly, your symbol path
***    must point to .pdb files that have full type information.
***
***    Certain .pdb files (such as the public OS symbols) do not
***    contain the required information.  Contact the group that
***    provided you with these symbols if you need this command to
***    work.
***
***    Type referenced: kernel32!pNlsUserInfo
***

Fortunately kernel32.dll symbols were loaded correctly despite the warning:

0:000> lmv m kernel32
start    end        module name
77e40000 77f42000   kernel32   (pdb symbols)          c:\mssymbols\kernel32.pdb\DF4F569C743446809ACD3DFD1E9FA2AF2\kernel32.pdb
    Loaded symbol image file: kernel32.dll
    Image path: C:\WINDOWS\system32\kernel32.dll
    Image name: kernel32.dll
    Timestamp:        Tue Jul 25 13:31:53 2006 (44C60F39)
    CheckSum:         001059A9
    ImageSize:        00102000
    File version:     5.2.3790.2756
    Product version:  5.2.3790.2756
    File flags:       0 (Mask 3F)
    File OS:          40004 NT Win32
    File type:        2.0 Dll
    File date:        00000000.00000000
    Translations:     0409.04b0
    CompanyName:      Microsoft Corporation
    ProductName:      Microsoft® Windows® Operating System
    InternalName:     kernel32
    OriginalFilename: kernel32
    ProductVersion:   5.2.3790.2756
    FileVersion:      5.2.3790.2756 (srv03_sp1_gdr.060725-0040)
    FileDescription:  Windows NT BASE API Client DLL
    LegalCopyright:   © Microsoft Corporation. All rights reserved.

Also double checking return addresses on the stack trace shows that symbol mapping was correct (from another dump with the same message):

kd> dpu kernel32!pNlsUserInfo l1
77ecb0a8  77ecb760 "ENU"

kd> kv
ChildEBP RetAddr  Args to Child
f552bbec f79e1743 000000e2 cccccccc 858a0470 nt!KeBugCheckEx+0x1b
WARNING: Stack unwind information not available. Following frames may be wrong.
f552bc38 8081d39d 85699390 8596fe78 860515f8 SystemDump+0x743
f552bc4c 808ec789 8596fee8 860515f8 8596fe78 nt!IofCallDriver+0x45
f552bc60 808ed507 85699390 8596fe78 860515f8 nt!IopSynchronousServiceTail+0x10b
f552bd00 808e60be 00000090 00000000 00000000 nt!IopXxxControlFile+0x5db
f552bd34 80882fa8 00000090 00000000 00000000 nt!NtDeviceIoControlFile+0x2a
f552bd34 7c82ed54 00000090 00000000 00000000 nt!KiFastCallEntry+0xf8
0012efc4 7c8213e4 77e416f1 00000090 00000000 ntdll!KiFastSystemCallRet
0012efc8 77e416f1 00000090 00000000 00000000 ntdll!NtDeviceIoControlFile+0xc
0012f02c 00402208 00000090 9c400004 00947eb8 kernel32!DeviceIoControl+0x137
0012f884 00404f8e 0012fe80 00000001 00000000 SystemDump_400000+0x2208

kd> ub 77e416f1
kernel32!DeviceIoControl+0x11d:
77e416db lea     eax,[ebp-28h]
77e416de push    eax
77e416df push    ebx
77e416e0 push    ebx
77e416e1 push    ebx
77e416e2 push    dword ptr [ebp+8]
77e416e5 je      kernel32!DeviceIoControl+0x131 (77e417f3)
77e416eb call    dword ptr [kernel32!_imp__NtDeviceIoControlFile (77e4103c)]
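As a sanity check on the arithmetic (my own sketch, not part of the original session): the `call dword ptr [...]` at 77e416eb is encoded as FF 15 plus a 4-byte absolute address, i.e. 6 bytes, so the return address saved on the stack must be exactly 6 bytes past the call:

```shell
# A 6-byte FF/15 indirect call at 77e416eb means the pushed return
# address is 77e416eb + 6, which should match the RetAddr column in kv.
printf '%x\n' $((0x77E416EB + 6))
# prints 77e416f1
```

This matches the 77e416f1 return address shown for the kernel32!DeviceIoControl frame, confirming that the symbols map the code bytes correctly.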

So everything is all right and the messages above can safely be ignored. I have also received e-mails from other people seeing the same problem, so it seems to be related to this WinDbg release and not to my debugging environment.

- Dmitry Vostokov @ DumpAnalysis.org -

Crash Dumps for Dummies (Part 7)

Thursday, November 8th, 2007

In the previous part I introduced a clear separation between crashes and hangs and outlined memory dump capture methods for each category. However, looking from the user's point of view, we need to tell users the best way to capture a dump based on the behavior they observe and on the failure level: system or component. The latter failure type usually involves user applications and services.

For user applications the best way is to get a memory dump proactively or, to put it another way, manually, and not rely on a postmortem debugger that may not be set up correctly on a problem server in a 100-server farm. If an error message box appears saying that an application has stopped working or has encountered an application error, you can use a process dumper such as userdump.exe.

Suppose we have the following error message when the TestDefaultDebugger application crashes on Vista x64 (the same technique applies to earlier versions of Windows too):

Then we can dump the process while it displays the problem if we know its process ID:
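The original post showed this step as a screenshot. As a rough sketch (the PID 3104 and the dump file name are invented placeholders, not values from the post), the userdump command line can be assembled like this; it is only echoed here, rather than executed, so the example runs anywhere:

```shell
# Placeholders: substitute the real PID shown in Task Manager
# for the crashed or hung process.
PID=3104
DUMPFILE=TestDefaultDebugger.dmp
# On the problem machine, run the printed command from a command prompt.
echo "userdump.exe $PID $DUMPFILE"
```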

In Vista this can be done even more easily by dumping the process directly from Task Manager:

Choose Create Dump File:

and the process dump is saved in the user's temporary files location:

Although the application above is a native Windows application, the same method applies to .NET applications. For example, the forthcoming TestDefaultDebugger.NET application

shows the following dialog:

and we can dump the process manually while it displays the message.

Although both applications will disappear from Task Manager if we choose Close or Quit in their error message boxes, and will therefore be considered crashes under my terminology, at the time they show their stop messages they are application hangs, and this is why we use manual process dumpers.

Other scenarios, including system failures, will be considered in the next part.

- Dmitry Vostokov @ DumpAnalysis.org -

WinDbg has been updated to version 6.8.0004.0

Wednesday, November 7th, 2007

A bit of a late notice: I have just found that a new version of WinDbg was released last month:

http://www.microsoft.com/whdc/devtools/debugging/installx86.mspx

http://www.microsoft.com/whdc/devtools/debugging/install64bit.mspx

Judging by the link below and relnotes.txt, there do not seem to be many enhancements in this release, but at least it is not called Beta:

http://www.microsoft.com/whdc/devtools/debugging/whatsnew.mspx

- Dmitry Vostokov @ DumpAnalysis.org -