Archive for January 2nd, 2011

The Way of Philip Marlowe: The Abductive Reasoning for Troubleshooting and Debugging

Sunday, January 2nd, 2011

Having worked for more than 7 years in a technical support environment, I found that many support incidents were resolved more easily by abductive reasoning than by the induction and deduction practiced by Sherlock Holmes and observed by Dr. Watson. Abduction, as a way to build an incident theory and advance problem resolution, was practiced by an American colleague of Holmes: Philip Marlowe. Because technical support is less detached from customers (“the world”) than software engineering departments are, I see the way of Marlowe as more natural. Of course, from time to time the way of Holmes is also appropriate; it all depends on the support case. I found that abductive reasoning is also appropriate for memory dump and software trace analysis, where “leaps of faith” are necessary because of insufficient information. Such leaps of abduction actually happen all the time when analysts give troubleshooting advice based on patterns.

I plan to write more about this third way of reasoning after I finish reading two of Raymond Chandler’s novels, The Big Sleep & Farewell, My Lovely (Modern Library), and a few other books on inference, causality and explanation that I will mention later.

I’m grateful to Clive Gamble for pointing this way out in his book Archaeology: The Basics.

- Dmitry Vostokov @ DumpAnalysis.org + TraceAnalysis.org -

Abridged dump, embedded comment, spiking thread, incorrect stack trace and top module: pattern cooperation

Sunday, January 2nd, 2011

When loading a process user memory dump, we recognized it as abridged, and the embedded comment pointed to a spiking thread:

Loading Dump File [ApplicationA_101212_165342.dmp]
User Mini Dump File: Only registers, stack and portions of memory are available

Comment: '
*** procdump -c 60 -s 5 -n 3 ApplicationA.exe
*** Process exceeded 60% CPU for 5 seconds. Thread consuming CPU: 540 (0x21c)

This thread is already the current one:

0:005> ~
   0  Id: c1c.c20 Suspend: 0 Teb: 7ffdf000 Unfrozen
   1  Id: c1c.c44 Suspend: 0 Teb: 7ffde000 Unfrozen
   2  Id: c1c.d34 Suspend: 0 Teb: 7ffdc000 Unfrozen
   3  Id: c1c.d38 Suspend: 0 Teb: 7ffda000 Unfrozen
   4  Id: c1c.d3c Suspend: 0 Teb: 7ffd9000 Unfrozen
.  5  Id: c1c.21c Suspend: 0 Teb: 7ffd8000 Unfrozen
   6  Id: c1c.1c10 Suspend: 0 Teb: 7ffdd000 Unfrozen
   7  Id: c1c.1678 Suspend: 0 Teb: 7ffd6000 Unfrozen
   8  Id: c1c.cbc Suspend: 0 Teb: 7ffd5000 Unfrozen
   9  Id: c1c.1754 Suspend: 0 Teb: 7ffaf000 Unfrozen
  10  Id: c1c.c40 Suspend: 0 Teb: 7ffad000 Unfrozen
  11  Id: c1c.1d24 Suspend: 0 Teb: 7ffd7000 Unfrozen
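
The match between the comment and the thread list can be double-checked mechanically. The following is only a minimal Python sketch, not part of WinDbg or procdump: the helper name find_thread and the regular expression are illustrative assumptions, and the listing is copied from the session above. It converts the decimal thread ID from the comment to hex and looks it up in the ~ output.

```python
import re

# Output of the WinDbg "~" command, copied from the session above.
THREAD_LIST = """\
   0  Id: c1c.c20 Suspend: 0 Teb: 7ffdf000 Unfrozen
   1  Id: c1c.c44 Suspend: 0 Teb: 7ffde000 Unfrozen
   2  Id: c1c.d34 Suspend: 0 Teb: 7ffdc000 Unfrozen
   3  Id: c1c.d38 Suspend: 0 Teb: 7ffda000 Unfrozen
   4  Id: c1c.d3c Suspend: 0 Teb: 7ffd9000 Unfrozen
.  5  Id: c1c.21c Suspend: 0 Teb: 7ffd8000 Unfrozen
   6  Id: c1c.1c10 Suspend: 0 Teb: 7ffdd000 Unfrozen
   7  Id: c1c.1678 Suspend: 0 Teb: 7ffd6000 Unfrozen
   8  Id: c1c.cbc Suspend: 0 Teb: 7ffd5000 Unfrozen
   9  Id: c1c.1754 Suspend: 0 Teb: 7ffaf000 Unfrozen
  10  Id: c1c.c40 Suspend: 0 Teb: 7ffad000 Unfrozen
  11  Id: c1c.1d24 Suspend: 0 Teb: 7ffd7000 Unfrozen
"""

# Optional marker ("." for the current thread), debugger thread index,
# then "Id: <process id>.<thread id>" with both IDs in hex.
THREAD_LINE = re.compile(
    r"^\s*([.#]?)\s*(\d+)\s+Id:\s*([0-9a-f]+)\.([0-9a-f]+)", re.IGNORECASE
)


def find_thread(listing, tid_decimal):
    """Return (debugger_index, is_current) for a decimal thread ID, or None."""
    for line in listing.splitlines():
        m = THREAD_LINE.match(line)
        if m and int(m.group(4), 16) == tid_decimal:
            return int(m.group(2)), m.group(1) == "."
    return None


print(hex(540))                       # thread ID from the dump comment, in hex
print(find_thread(THREAD_LIST, 540))  # debugger index and "is current" flag
```

Here thread ID 540 from the comment corresponds to c1c.21c, which is debugger thread 5 and carries the current-thread marker.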

The stack trace looks incorrect:

0:005> kL
ChildEBP RetAddr 
01abc4d8 6efba23d ntdll!KiFastSystemCallRet
WARNING: Stack unwind information not available. Following frames may be wrong.
01abc988 7c820833 ModuleB+0x2a23d
01abcbe4 7c8207f6 kernel32!GetVolumeNameForRoot+0x26
01abcc0c 7c82e6de kernel32!BasepGetVolumeNameForVolumeMountPoint+0x75
01abcc54 6efaf70b kernel32!GetVolumePathNameW+0x18a
01abccdc 6efbd1a6 ModuleB+0x1f70b
01abcce0 00000000 ModuleB+0x2d1a6

However, we see a 3rd party top module and advise keeping an eye on it:

0:005> lmt m ModuleB
start    end        module name
6ef90000 6efff000   ModuleB    Wed Mar 10 20:18:21 2010
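
As a sanity check, the return addresses attributed to ModuleB can be compared with the module range reported by lmt. The Python sketch below is illustrative only: the function name module_offset is hypothetical, and treating the end address as exclusive is an assumption. The addresses are copied from the output above.

```python
# Module range from the "lmt m ModuleB" output above.
MODULE_START = 0x6EF90000
MODULE_END = 0x6EFFF000  # assumed to be exclusive (start + image size)


def module_offset(address, start=MODULE_START, end=MODULE_END):
    """Return the offset of address inside the module, or None if outside it."""
    if start <= address < end:
        return address - start
    return None


# Return addresses from the stack trace; the last one belongs to kernel32.
for ret in (0x6EFBA23D, 0x6EFAF70B, 0x6EFBD1A6, 0x7C820833):
    off = module_offset(ret)
    if off is None:
        print(f"{ret:08x} is outside ModuleB")
    else:
        print(f"{ret:08x} = ModuleB+{off:#x}")
```

The three offsets agree with the ModuleB frames printed by kL, so at least the address-to-module attribution is consistent even though the unwind itself may be wrong.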

Crash Dump Analysis Patterns (Part 125)

Sunday, January 2nd, 2011

Similar to the Blocking Module pattern, we also have the Top Module pattern; the difference is only in stack trace syntax. A top module is any module we choose that is simply on top of a stack trace. Most of the time it is likely to be a non-OS vendor module. Whether the stack trace is well-formed and semantically sound or incorrect is irrelevant:

0:005> kL
ChildEBP RetAddr 
01abc4d8 6efba23d ntdll!KiFastSystemCallRet
WARNING: Stack unwind information not available. Following frames may be wrong.
01abc988 7c820833 ModuleB+0x2a23d
01abcbe4 7c8207f6 kernel32!GetVolumeNameForRoot+0x26
01abcc0c 7c82e6de kernel32!BasepGetVolumeNameForVolumeMountPoint+0x75
01abcc54 6efaf70b kernel32!GetVolumePathNameW+0x18a
01abccdc 6efbd1a6 ModuleB+0x1f70b
01abcce0 00000000 ModuleB+0x2d1a6
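
One possible way to pick such a module from a trace can be sketched in Python. This is only an illustration under stated assumptions: the pattern definition leaves the choice to the analyst, so the heuristic used here (the first frame from the top whose module is not in a small, hand-written set of OS modules) and the names top_module and OS_MODULES are hypothetical.

```python
import re

# Stack trace text as printed by kL (copied from the example above).
STACK_TRACE = """\
ChildEBP RetAddr
01abc4d8 6efba23d ntdll!KiFastSystemCallRet
WARNING: Stack unwind information not available. Following frames may be wrong.
01abc988 7c820833 ModuleB+0x2a23d
01abcbe4 7c8207f6 kernel32!GetVolumeNameForRoot+0x26
01abcc0c 7c82e6de kernel32!BasepGetVolumeNameForVolumeMountPoint+0x75
01abcc54 6efaf70b kernel32!GetVolumePathNameW+0x18a
01abccdc 6efbd1a6 ModuleB+0x1f70b
01abcce0 00000000 ModuleB+0x2d1a6
"""

# Assumption: a tiny illustrative set; a real list of OS modules is much longer.
OS_MODULES = {"ntdll", "kernel32"}

# A frame line: two 8-digit hex columns, then "module!function" or "module+offset".
FRAME = re.compile(r"^[0-9a-f]{8}\s+[0-9a-f]{8}\s+([^!+\s]+)", re.IGNORECASE)


def top_module(trace, os_modules=OS_MODULES):
    """Return the first module from the top of the trace that is not an OS module."""
    for line in trace.splitlines():
        m = FRAME.match(line)  # header and WARNING lines do not match
        if m and m.group(1).lower() not in os_modules:
            return m.group(1)
    return None


print(top_module(STACK_TRACE))
```

The sketch deliberately ignores the unwind warning, in line with the point above that the correctness of the trace is irrelevant for this pattern.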

Here we can also check the validity of ModuleB code by backwards disassembly from the 6efba23d return address (ub command). If we have an abridged dump file (minidump), we first need to specify the image file path in WinDbg.

Why is a top module important? In various troubleshooting scenarios we can check the module timestamp (Not My Version pattern) and other useful information (lmv and !lmi WinDbg commands). If we suspect that the module belongs to hooksware, we can also recommend removing it or its software vendor package for testing purposes.
