Crash Dump Analysis Checklist

Sometimes the root cause of a problem is not obvious from a memory dump. Here is the first version of a crash dump analysis checklist to help experienced engineers avoid missing important information. The checklist doesn’t prescribe any specific steps; it just lists points to double-check when looking at a memory dump. It is not complete at the moment, and any suggestions are welcome.

General:

  • Symbol servers (.symfix)
  • Internal database(s) search
  • Google or Microsoft search for suspected components as this could be a known issue. Sometimes a simple search immediately points to the fix on a vendor’s site
  • The tool used to save a dump (to flag false positive, incomplete or inconsistent dumps)
  • OS/SP version (version)
  • Language
  • Debug time
  • System uptime
  • Computer name (dS srv!srvcomputername or !envvar COMPUTERNAME)
  • List of loaded and unloaded modules (lmv or !dlls)
  • Hardware configuration (!sysinfo)
  • Stack trace depth (.kframes 1000)

Application or service:

  • Default analysis (!analyze -v or !analyze -v -hang for hangs)
  • Critical sections (!cs -s -l -o, !locks) for both crashes and hangs
  • Component timestamps, duplication and paths. DLL Hell? (lmv and !dlls)
  • Do any newer components exist?
  • Process threads (~*kv or !uniqstack) for multiple exceptions and blocking functions
  • Process uptime
  • Your components on the full raw stack of the problem thread
  • Your components on the full raw stack of the main application thread
  • Process size
  • Number of threads
  • Gflags value (!gflag)
  • Time consumed by threads (!runaway)
  • Environment (!peb)
  • Import table (!dh)
  • Hooked functions (!chkimg)
  • Exception handlers (!exchain)
  • Computer name (!envvar COMPUTERNAME)
  • Process heap stats and validation (!heap -s, !heap -s -v)
  • CLR threads? (mscorwks or clr modules on stack traces) Yes: use .NET checklist below
  • Hidden (unhandled and handled) exceptions on thread raw stacks
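
Several of the commands above can be collected into one unattended debugger run. The following is a minimal sketch, assuming the console debugger cdb is on the PATH. It only builds the argument list (using cdb's -z, -logo and -c options) and does not launch anything. The function name, the dump and log paths, and the choice of commands are illustrative assumptions, and .reload is added here although it is not part of the checklist.

```python
from typing import List

# Commands taken from the checklist above (.reload is an addition, see lead-in).
APP_CHECKLIST: List[str] = [
    ".symfix",
    ".reload",
    "!analyze -v",
    "!cs -s -l -o",
    "lmv",
    "~*kv",
    "!runaway",
    "!peb",
    "!exchain",
    "!heap -s",
]


def build_cdb_args(dump_path: str, log_path: str, commands: List[str]) -> List[str]:
    """Return an argument list for cdb that runs the commands and then quits."""
    # -z opens a dump file, -logo writes (and overwrites) a log file,
    # -c runs the given command string at startup; the final "q" ends the session.
    script = "; ".join(commands + ["q"])
    return ["cdb", "-z", dump_path, "-logo", log_path, "-c", script]


if __name__ == "__main__":
    # Hypothetical paths, for illustration only.
    args = build_cdb_args(r"C:\dumps\app.dmp", r"C:\dumps\app.log", APP_CHECKLIST)
    print(args)
```

The resulting list can be passed to subprocess.run, and the log file can then be reviewed against the remaining checklist items that need manual inspection, such as raw stacks.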

System hang:

  • Default analysis (!analyze -v -hang)
  • ERESOURCE contention (!locks)
  • Processes and virtual memory including session space (!vm 4)
  • Important services are present and not hanging
  • Pools (!poolused)
  • Waiting threads (!stacks)
  • Critical system queues (!exqueue f)
  • I/O (!irpfind)
  • The list of all thread stack traces (!process 0 3f)
  • LPC/ALPC chain for suspected threads (!lpc message or !alpc /m after search for “Waiting for reply to LPC” or “Waiting for reply to ALPC” in !process 0 3f output)
  • RPC threads (search for “RPCRT4!OSF” in !process 0 3f output)
  • Mutants (search for “Mutant - owning thread” in !process 0 3f output)
  • Critical sections for suspected processes (!cs -l -o -s)
  • Sessions, session processes (!session, !sprocess)
  • Processes (size, handle table size) (!process 0 0)
  • Running threads (!running)
  • Ready threads (!ready)
  • DPC queues (!dpcs)
  • The list of APCs (!apc)
  • Internal queued spinlocks (!qlocks)
  • Computer name (dS srv!srvcomputername)
  • File cache, VACB (!filecache)
  • File objects for blocked thread IRPs (!irp -> !fileobj)
  • Network (!ndiskd.miniports and !ndiskd.pktpools)
  • Disk (!scsikd.classext -> !scsikd.classext class_device 2)
  • Modules rdbss, mrxdav, mup, mrxsmb in stack traces
  • Functions Ntfs!Ntfs*, nt!Fs* and fltmgr!Flt* in stack traces
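
The text searches listed above can be run in one pass over a saved !process 0 3f log. The sketch below is one possible way to do that, not a prescribed tool. It counts the lines that match each marker from the list. The labels, the function name and the use of case-insensitive regular expressions are assumptions made for illustration.

```python
import re
from collections import Counter
from typing import Dict, Iterable

# Regular expressions for the markers named in the "System hang" list above.
# The dictionary keys are illustrative labels, not debugger terminology.
MARKERS: Dict[str, str] = {
    "LPC wait": r"Waiting for reply to LPC",
    "ALPC wait": r"Waiting for reply to ALPC",
    "RPC thread": r"RPCRT4!OSF",
    "Mutant owner": r"Mutants? - owning thread",
    "Remote file access": r"\b(?:rdbss|mrxdav|mup|mrxsmb)!",
    "File system or filter": r"\b(?:Ntfs!Ntfs|nt!Fs|fltmgr!Flt)",
}


def count_markers(lines: Iterable[str]) -> Counter:
    """Count how many lines match each marker (case-insensitive)."""
    hits: Counter = Counter()
    for line in lines:
        for label, pattern in MARKERS.items():
            if re.search(pattern, line, re.IGNORECASE):
                hits[label] += 1
    return hits
```

An open log file can be passed directly as the lines argument. A non-zero count only says where to look next; the matching threads still need to be examined with !lpc message, !alpc /m or the other commands from the list.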

BSOD:

  • Default analysis (!analyze -v)
  • Pool address (!pool)
  • Component timestamps (lmv)
  • Processes and virtual memory (!vm 4)
  • Current threads on other processors
  • Raw stack
  • Bugcheck description (including ln exception address for corrupt or truncated dumps)
  • Bugcheck callback data (!bugdump for systems prior to Windows XP SP1)
  • Bugcheck secondary callback data (.enumtag)
  • Computer name (dS srv!srvcomputername)
  • Hardware configuration (!sysinfo)

.NET application or service:

  • CLR module and SOS extension versions (lmv and .chain)
  • Managed exceptions (~*e !pe)
  • Nested managed exceptions (!pe -nested)
  • Managed threads (!Threads -special)
  • Managed stack traces (~*e !CLRStack)
  • Managed execution residue (~*e !DumpStackObjects and !DumpRuntimeTypes)
  • Managed heap (!VerifyHeap, !DumpHeap -stat and !eeheap -gc)
  • GC handles (!GCHandles, !GCHandleLeaks)
  • Finalizer queue (!FinalizeQueue)
  • Sync blocks (!syncblk)
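
Whether this .NET checklist applies depends on mscorwks or clr modules appearing on stack traces. The following sketch shows one way to check that in saved ~*kv output. It assumes each thread starts with a header line containing the thread number followed by "Id: <process>.<thread>", and that frames show symbols in module!function form. The function and variable names are hypothetical.

```python
import re
from typing import Dict, Iterable, List, Optional

# Thread header line: optional "." or "#" marker, thread number, then "Id: pid.tid".
THREAD_HEADER = re.compile(r"^[.#\s]*(\d+)\s+Id:\s*([0-9a-fA-F]+)\.([0-9a-fA-F]+)")
# Frames that belong to the CLR runtime modules named in the checklist.
CLR_FRAME = re.compile(r"\b(mscorwks|clr)!", re.IGNORECASE)


def threads_with_clr_frames(lines: Iterable[str]) -> Dict[int, List[str]]:
    """Map debugger thread number to the CLR frames seen on its stack."""
    found: Dict[int, List[str]] = {}
    current: Optional[int] = None
    for line in lines:
        header = THREAD_HEADER.match(line)
        if header:
            # A new thread starts; following frames belong to it.
            current = int(header.group(1))
            continue
        if current is not None and CLR_FRAME.search(line):
            found.setdefault(current, []).append(line.strip())
    return found
```

An empty result suggests the process is not managed, or that the stacks were not captured, in which case the commands in this list are unlikely to help.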

- Dmitry Vostokov @ DumpAnalysis.org + TraceAnalysis.org -

51 Responses to “Crash Dump Analysis Checklist”

  1. Dmitry Vostokov Says:

    Added the check for a pool address: !pool <address>.
    Useful for seeing which pool tag is associated with the data; it gives an idea of the data structure, the struct field currently being accessed, the component it comes from, etc.

  2. Dmitry Vostokov Says:

    Added an import table check (!dh) to see whether the table is corrupt. Useful in some cases where memory optimization or rebasing products are used.

  3. Yury Yushchenko Says:

    Hello, Dmitry!
    Terrific site! Thank you!!!!
    (I’m a developer who supports load testing of the Citrix protocol in the HP (formerly Mercury Interactive) LoadRunner product. I sometimes have headaches with customer support and remote debugging, too.)

  4. Dmitry Vostokov Says:

    Thanks!
    Dmitry

  5. Dmitry Vostokov Says:

    Added the command to list ready-to-run threads: !ready

  6. Dmitry Vostokov Says:

    Added !dpcs and !apc commands

  7. Dmitry Vostokov Says:

    Added !chkimg

  8. Dmitry Vostokov Says:

    Added !exchain

  9. Dmitry Vostokov Says:

    Added !qlocks

  10. idb Says:

    wait_for_client_connects: Process 1872 generated fatal exception c0000005 EXCEPTION_ACCESS_VIOLATION. SQL Server is terminating this process.
    *
    * BEGIN STACK DUMP:
    * 04/16/08 10:18:42 spid 0
    *
    * Exception Address = 77FCC2C2 (RtlAllocateHeap + 1d3)
    * Exception Code = c0000005 E
    * Access Violation occurred writing address 00000005
    *
    * MODULE BASE END SIZE
    * sqlservr 00400000 008d2fff 004d3000
    * ntdll 77f80000 77ffcfff 0007d000
    * KERNEL32 77e50000 77f31fff 000e2000
    * ADVAPI32 796d0000 79731fff 00062000
    * RPCRT4 786f0000 7875efff 0006f000
    * USER32 77de0000 77e44fff 00065000
    * GDI32 77f40000 77f7bfff 0003c000
    * ole32 7cf00000 7cfeefff 000ef000
    * OLEAUT32 77980000 77a1afff 0009b000
    * VERSION 777d0000 777d6fff 00007000
    * LZ32 75940000 75945fff 00006000
    * opends60 41060000 41085fff 00026000
    * ums 41090000 4109cfff 0000d000
    * MSVCRT 78000000 78044fff 00045000
    * sqlsort 04000000 0408efff 0008f000
    * MSVCIRT 780a0000 780b1fff 00012000
    * IMM32 75df0000 75e09fff 0001a000
    * sqlevn70 410a0000 410a6fff 00007000
    * COMNEVNT 410b0000 410fefff 0004f000
    * ODBC32 00eb0000 00ee1fff 00032000
    * COMCTL32 716f0000 71779fff 0008a000
    * SHELL32 78f90000 791d5fff 00246000
    * SHLWAPI 63180000 631cbfff 0004c000
    * comdlg32 76ae0000 76b1dfff 0003e000
    * SQLWOA 41100000 4110bfff 0000c000
    * odbcint 1f850000 1f865fff 00016000
    * NDDEAPI 76930000 76936fff 00007000
    * WINSPOOL 777b0000 777cdfff 0001e000
    * MPR 79b20000 79b2ffff 00010000
    * SQLTrace 41130000 4117dfff 0004e000
    * NETAPI32 7cea0000 7ceeffff 00050000
    * Secur32 797b0000 797befff 0000f000
    * NTDSAPI 77bc0000 77bd0fff 00011000
    * DNSAPI 77950000 77973fff 00024000
    * WSOCK32 74fc0000 74fc9fff 0000a000
    * WS2_32 74fa0000 74fb3fff 00014000
    * WS2HELP 74f90000 74f97fff 00008000
    * WLDAP32 77920000 77949fff 0002a000
    * NETRAP 75140000 75145fff 00006000
    * SAMLIB 750d0000 750defff 0000f000
    * SSNMPN70 41190000 41195fff 00006000
    * SSMSRP70 411b0000 411b7fff 00008000
    * SSMSSO70 411a0000 411aafff 0000b000
    * XOLEHLP 048e0000 048e7fff 00008000
    * MSDTCPRX 048f0000 049aafff 000bb000
    * MTXCLU 049b0000 049bffff 00010000
    * CLUSAPI 049c0000 049cffff 00010000
    * RESUTILS 049d0000 049dcfff 0000d000
    * USERENV 049e0000 04a40fff 00061000
    * rnr20 04a50000 04a5bfff 0000c000
    * iphlpapi 04aa0000 04ab2fff 00013000
    * ICMP 04ac0000 04ac4fff 00005000
    * MPRAPI 04ad0000 04ae6fff 00017000
    * ACTIVEDS 04af0000 04b1efff 0002f000
    * ADSLDPC 04b20000 04b42fff 00023000
    * RTUTILS 04b50000 04b5dfff 0000e000
    * SETUPAPI 04b60000 04c0dfff 000ae000
    * RASAPI32 04c10000 04c42fff 00033000
    * RASMAN 04c50000 04c60fff 00011000
    * TAPI32 04c70000 04c91fff 00022000
    * DHCPCSVC 04ca0000 04cb8fff 00019000
    * winrnr 04d50000 04d57fff 00008000
    * rasadhlp 04d60000 04d64fff 00005000
    * mswsock 05490000 054a1fff 00012000
    * msafd 054f0000 0550dfff 0001e000
    * wshtcpip 05550000 05556fff 00007000
    * SQLRGSTR 059f0000 059f4fff 00005000
    * security 05a90000 05a93fff 00004000
    * msv1_0 05aa0000 05ac0fff 00021000
    * CRYPT32 05ad0000 05b56fff 00087000
    * MSASN1 05b70000 05b7ffff 00010000
    * xpsqlbot 05b90000 05b95fff 00006000
    * sqlboot 05ba0000 05ba7fff 00008000
    * xpsql70 075b0000 075bafff 0000b000
    * xpstar 07600000 07630fff 00031000
    * SQLWID 07640000 07645fff 00006000
    * SQLSVC 07650000 07668fff 00019000
    * odbcbcp 07670000 07675fff 00006000
    * SQLRESLD 07680000 07685fff 00006000
    * W95SCM 07690000 07697fff 00008000
    * SQLSVC 076a0000 076a5fff 00006000
    * imagehlp 0d9e0000 0da02fff 00023000
    * DBGHELP 0db50000 0db7cfff 0002d000
    * sqlimage 10180000 101acfff 0002d000
    *
    * Edi: 00D90000: 0000fe00 00000000 00001002 eeffeeff 00000100 000000c8
    * Esi: 04DFAFD8: 04dfafe8 04dfafe8 00d901b8 00000001 00100001 00070008
    * Eax: 00000001:
    * Ebx: 00000007:
    * Ecx: 00D901B8: 00d901c8 00d901c8 00d901c0 00d901c0 04dfafe0 00000001
    * Edx: 00000061:
    * Eip: 77FCC2C2: ffff2885 8903e8c1 c18b0eb7 0f000000 d6850fc1 3b044889
    * Ebp: 0540FE70: 00000020 00000030 00000000 00d90000 78001532 0540feb0
    * SegCs: 0000001B:
    * EFlags: 00000216:
    * Esp: 0540FCA4: 00000000 00000000 00000000 00000000 01762254 00000020
    * SegSs: 00000023:

  11. Dmitry Vostokov Says:

    * BEGIN STACK DUMP:
    * 04/16/08 10:18:42 spid 0
    *
    * Exception Address = 77FCC2C2 (RtlAllocateHeap + 1d3)

    The exception inside RtlAllocateHeap points to an instance of heap corruption, and you need to enable full page heap to isolate the component:

    http://www.dumpanalysis.org/blog/index.php/2006/10/31/crash-dump-analysis-patterns-part-2/

  12. Dmitry Vostokov Says:

    Added !bugdump and .enumtag

  13. Dmitry Vostokov Says:

    Added .kframes 100

  14. Dmitry Vostokov Says:

    Added a check for component paths

  15. Dmitry Vostokov Says:

    Added a check for duplicated components

  16. Dmitry Vostokov Says:

    Added !locks -v check when we have signs of critical section corruption

  17. Dmitry Vostokov Says:

    Added !cs -s -l -o for process memory dumps

  18. Subbu Says:

    Great work and very useful to all professional developers.

    ~Subbu

  19. Crash Dump Analysis » Blog Archive » Introducing EasyDbg Says:

    […] This is already written application by me (10 years ago by me) that I’m adapting as a high-level interface to WinDbg (can be any GUI debugger actually). The basic idea revolves around floating buttons (listbox and task bar icons, optionally) that dynamically change with every new window or application. The number of buttons can be unlimited, they can be repositioned to any corner of the screen, they can play sounds, show video and pictures. On click they play elaborated macro commands, including keystrokes and mouse movements, written in a special scripting language. For example, we can create buttons for CDA checklist.  […]

  20. Dmitry Vostokov Says:

    Added commands to extract computer name

  21. Crash Dump Analysis » Blog Archive » CMDTREE.TXT for CDA Checklist Says:

    […] Farah who blogged about .cmdtree command I was able create the first version of cmdtree.txt for Crash Dump Analysis Checklist to include common commands that I use. It can be found […]

  22. Dmitry Vostokov Says:

    Added search for “Waiting for reply to LPC” in !process 0 ff output to detect LPC wait chains

  23. Crash Dump Analysis » Blog Archive » The Measure of Memory Dump Analysis Complexity Says:

    […] ago I started with a few commands like !analyze -v, kv and dd and progressed to an elaborate checklist. Here the natural logarithm can be used to approximate the […]

  24. Crash Dump Analysis » Blog Archive » Debugger Log Reading Techniques (Part 1) Says:

    […] 1. First, have a checklist […]

  25. Dmitry Vostokov Says:

    Added search for “Mutant - owning thread”

  26. Dmitry Vostokov Says:

    Added “Waiting for reply to ALPC Message” and !alpc /m

  27. Crash Dump Analysis » Blog Archive » 10 Common Mistakes in Memory Analysis (Part 4) Says:

    […] prevent such mistakes checklists are indispensable. For one example, see Crash Dump Analysis Checklist. You can also order it in […]

  28. Dmitry Vostokov Says:

    Added a check for important services for your environment

  29. Dmitry Vostokov Says:

    Added !sysinfo for checking hardware configuration

  30. Dmitry Vostokov Says:

    Added .symfix and version commands

  31. Dmitry Vostokov Says:

    Added !filecache to check for free VACB

  32. Dmitry Vostokov Says:

    Added lmv and !dlls (the latter is for user and complete memory dumps) as a general check for loaded and unloaded modules and their versions

  33. whunmr Says:

    Hi Dmitry!
    I find the following commands handy during debugging.

    .lastevent
    !gle -all
    !heap -s
    .cxr and .exr
    !address

    and for managed side:
    ~* e !clrstack -all
    !dumpstack
    !pe
    !dumpheap -stat -type Exception
    !dso

  34. DinoS Says:

    “!sysinfo” ?!?!

  35. DinoS Says:

    “!sysinfo” ?!?!

    Oh, right, I got it. Kernel debugging. Never mind!

  36. After party or summary of presentation @ Powered by MVP « Thoughts and knowledge exposed Says:

    […] Crash dump analysis checklist: http://www.dumpanalysis.org/blog/index.php/2007/06/20/crash-dump-analysis-checklist/ […]

  37. Dmitry Vostokov Says:

    Removed !ntsd.locks as deprecated and replaced by !cs

  38. Dmitry Vostokov Says:

    Massive update. Added process heap stats, hidden (unhandled and handled) exceptions on thread raw stacks, multiple exceptions and blocking calls. Also added the initial checklist for .NET

  39. Dmitry Vostokov Says:

    Added !locks back to the user process memory dump analysis checklist as it now works with the latest WinDbg

  40. Dmitry Vostokov Says:

    Added !DumpRuntimeTypes for .NET execution residue

  41. Dmitry Vostokov Says:

    Added !heap -s -v to validate heap

  42. Crash Dump Analysis Checklist | Famel Says:

    […] am just copy and paste from http://www.dumpanalysis.org/blog/index.php/2007/06/20/crash-dump-analysis-checklist/ for my own […]

  43. Installing and Configuring WinDbg (Windows Debug Tools)‏ | Sysadmin Fanatic Says:

    […] - Crash Dump Analysis Checklist […]

  44. Dmitry Vostokov Says:

    Added file objects for blocked thread IRPs: !irp then !fileobj

  45. Dmitry Vostokov Says:

    Added !process 0 3f for W8

  46. Dmitry Vostokov Says:

    Added !GCHandleLeaks

  47. Dmitry Vostokov Says:

    Added network and disk checks

  48. Dmitry Vostokov Says:

    Added rdbss, mrxdav, mup, mrxsmb checks for remote file access

  49. Dmitry Vostokov Says:

    Added Ntfs!Ntfs* and nt!Fs* thread stack trace checks

  50. Dmitry Vostokov Says:

    Added fltmgr!Flt* for stack trace checks

  51. Dmitry Vostokov Says:

    Added “RPCRT4!OSF” for stack trace checks
