Archive for the ‘Crash Dump Analysis’ Category

Crash Dump Analysis Patterns (Part 29)

Monday, October 8th, 2007

Some Windows synchronization objects like executive resources and critical sections have a struct member called ContentionCount. This is the number of times a resource was accessed or, in another words, it is the accumulated number of threads waiting for an object: when a thread tries to acquire an object and put into a wait state the count is incremented. Hence the name of this pattern: High Contention.

Here is an example. In a kernel memory dump we have just one exclusively owned lock and seems no other threads were blocked by it at the time the dump was saved. However the high contention count reveals CPU spike:

3: kd> !locks
**** DUMP OF ALL RESOURCE OBJECTS ****
KD: Scanning for held locks...

Resource @ 0x8abc11f0    Exclusively owned
    Contention Count = 19648535
     Threads: 896395f8-01<*>
KD: Scanning for held locks…

Resource @ 0x896fab88    Shared 1 owning threads
     Threads: 88c78608-01<*>
KD: Scanning for held locks...
15464 total locks, 2 locks currently held

3: kd> !thread 896395f8
THREAD 896395f8  Cid 04c0.0138  Teb: 7ffde000 Win32Thread: bc922d20 RUNNING on processor 1
Not impersonating
DeviceMap                 e3d4c008
Owning Process            8a035020       Image:         MyApp.exe
Wait Start TickCount      36969283       Ticks: 0
Context Switch Count      1926423                 LargeStack
UserTime                  00:00:53.843
KernelTime                00:13:10.703
Win32 Start Address 0×00401478
Start Address 0×77e617f8
Stack Init ba14b000 Current ba14abf8 Base ba14b000 Limit ba146000 Call 0
Priority 11 BasePriority 6 PriorityDecrement 5
ChildEBP RetAddr  Args to Child
ba14ac94 bf8c6505 001544c8 bf995948 000c000a nt!_wcsicmp+0×3a (FPO: [Non-Fpo])
ba14ace0 bf8c6682 00000000 00000000 00000000 win32k!_FindWindowEx+0xfb (FPO: [Non-Fpo])
ba14ad48 8088978c 00000000 00000000 0012f8d4 win32k!NtUserFindWindowEx+0xef (FPO: [Non-Fpo])
ba14ad48 7c8285ec 00000000 00000000 0012f8d4 nt!KiFastCallEntry+0xfc (FPO: [0,0] TrapFrame @ ba14ad64)

3: kd> !process 8a035020
PROCESS 8a035020  SessionId: 9  Cid: 04c0    Peb: 7ffdf000  ParentCid: 10e8
    DirBase: cffaf7a0  ObjectTable: e4ba30a0  HandleCount:  73.
    Image: MyApp.exe
    VadRoot 88bc1bf8 Vads 82 Clone 0 Private 264. Modified 0. Locked 0.
    DeviceMap e3d4c008
    Token                             e5272028
    ElapsedTime                       00:14:19.360
    UserTime                          00:00:53.843
    KernelTime                        00:13:10.703
    QuotaPoolUsage[PagedPool]         40660
    QuotaPoolUsage[NonPagedPool]      3280
    Working Set Sizes (now,min,max)  (1139, 50, 345) (4556KB, 200KB, 1380KB)
    PeakWorkingSetSize                1141
    VirtualSize                       25 Mb
    PeakVirtualSize                   27 Mb
    PageFaultCount                    1186
    MemoryPriority                    BACKGROUND
    BasePriority                      6
    CommitCharge                      315

- Dmitry Vostokov @ DumpAnalysis.org -

Heaps and heap corruption explained

Tuesday, October 2nd, 2007

Excellent free chapter explaining process heap implementation and debugging heap corruption issues from the authors of Advanced Windows Debugging book:

Sample Chapter

- Dmitry Vostokov @ DumpAnalysis.org -

Windows service crash dumps in Vista

Monday, October 1st, 2007

I was playing with Vista Platform SDK samples to create the minimal native Windows service that crashes to test various postmortem debugger configurations, Windows Error Reporting (WER) options and conditions under which crash dumps are available. Initially I put a NULL pointer dereference into the service control handler processing service stop command and although the service crashed under WinDbg I couldn’t get CDB or NTSD configured as a default postmortem debugger to save the crash dump automatically. I tested under x64 Vista and Windows Server 2003 x64 both 32-bit and 64-bit versions of my service.

Here is the source code and stack trace from WinDbg when we attach it to the running service and then try to stop it:

//
// FUNCTION: service_ctrl
//
// PURPOSE: This function is called by the SCM whenever
// ControlService() is called on this service.
//
// PARAMETERS:
// dwCtrlCode - type of control requested
//
// RETURN VALUE:
// none
//
// COMMENTS:
//
VOID WINAPI service_ctrl(DWORD dwCtrlCode)
{
  // Handle the requested control code.
  //
  switch (dwCtrlCode)
  {
  case SERVICE_CONTROL_STOP:
    *(int *)NULL = 0;
    ReportStatusToSCMgr(SERVICE_STOP_PENDING, NO_ERROR, 0);
    ServiceStop();
    return;

  // Update the service status.
  //
  case SERVICE_CONTROL_INTERROGATE:
    break;

  // invalid control code
  //
  default:
    break;

}

ReportStatusToSCMgr(ssStatus.dwCurrentState, NO_ERROR, 0);
}

0:000> r
rax=0000000000000001 rbx=00000000001e36d0 rcx=0000000000000001
rdx=000000000a9ff32c rsi=0000000000000000 rdi=0000000000401aa0
rip=0000000000401ab9 rsp=000000000012fab0 rbp=00000000001e36d0
 r8=0000000000400000  r9=0000000077b3f990 r10=00000000004000d8
r11=00000000004000d8 r12=0000000000000000 r13=000000000a9ff32c
r14=00000000001e36e8 r15=000000000012fc20
iopl=0 nv up ei pl zr na po nc
cs=0033  ss=002b  ds=002b  es=002b  fs=0053  gs=002b  efl=00010246
simple!service_ctrl+0x19:
00000000`00401ab9 c704250000000000000000 mov dword ptr [0],0 ds:00000000`00000000=????????

0:000> k
Child-SP          RetAddr           Call Site
00000000`0012fab0 000007fe`fe276cee simple!service_ctrl+0x19
00000000`0012faf0 000007fe`fe2cea5d ADVAPI32!ScDispatcherLoop+0x54c
00000000`0012fbf0 00000000`004019f5 ADVAPI32!StartServiceCtrlDispatcherA+0x8d
00000000`0012fe70 00000000`00408b8c simple!main+0x155
00000000`0012fec0 00000000`0040895e simple!__tmainCRTStartup+0x21c
00000000`0012ff30 00000000`7792cdcd simple!mainCRTStartup+0xe
00000000`0012ff60 00000000`77a7c6e1 kernel32!BaseThreadInitThunk+0xd
00000000`0012ff90 00000000`00000000 ntdll!RtlUserThreadStart+0x1d

If we put while(1); code instead of NULL pointer dereference the process will be interrupted via breakpoint and then terminated. There is no postmortem dump saved too. Therefore it looks like any fault inside the service main thread is not allowed to execute the potentially blocking operation of unhandled exception filter perhaps to avoid blocking the service control manager (SCM) communicating with service dispatcher code.

On Vista if Windows Error Reporting service is running and WER is configured in Control Panel to allow a user to choose reporting settings we get the following familiar dialog but without Debug option to attach a postmortem debugger and save a crash dump:

If we choose the recommended option we get the following dialog showing the path where a minidump file was temporarily stored:

  

We need to leave this dialog open if we want to open the crash dump or copy it to another location otherwise report files will be removed as soon as we dismiss the dialog box (they may be stored temporarily in another location - check Problem Reports and Solutions\View Problem History in Control Panel). If we open the crash dump using WinDbg we get the same stack trace that we got previously during live debugging:

Loading Dump File [C:\ProgramData\Microsoft\Windows\WER\ReportQueue\Report19527353 \WER7346.tmp.mdmp]
User Mini Dump File: Only registers, stack and portions of memory are available

Symbol search path is: srv*c:\mss*http://msdl.microsoft.com/download/symbols
Executable search path is:
Windows Vista Version 6000 MP (2 procs) Free x64
Product: WinNt, suite: SingleUserTS
Debug session time: Fri Sep 28 16:36:38.000 2007 (GMT+1)
System Uptime: 2 days 1:42:22.810
Process Uptime: 0 days 0:00:10.000
.....
This dump file has an exception of interest stored in it.
The stored exception information can be accessed via .ecxr.
(13b0.d54): Access violation - code c0000005 (first/second chance not available)
*** WARNING: Unable to verify checksum for simple.exe
simple!service_ctrl+0x19:
00000000`00401ab9 c704250000000000000000 mov dword ptr [0],0 ds:00000000`00000000=????????
0:000> k
Child-SP          RetAddr           Call Site
00000000`0012fab0 000007fe`fe276cee simple!service_ctrl+0x19
00000000`0012faf0 000007fe`fe2cea5d advapi32!ScDispatcherLoop+0x54c
00000000`0012fbf0 00000000`004019f5 advapi32!StartServiceCtrlDispatcherA+0x8d
00000000`0012fe70 00000000`00408b8c simple!main+0x155
00000000`0012fec0 00000000`0040895e simple!__tmainCRTStartup+0x21c
00000000`0012ff30 00000000`7792cdcd simple!mainCRTStartup+0xe
00000000`0012ff60 00000000`77a7c6e1 kernel32!BaseThreadInitThunk+0xd
00000000`0012ff90 00000000`00000000 ntdll!RtlUserThreadStart+0x1d

Fault in any other service thread, for example, the one that SCM starts per every SERVICE_TABLE_ENTRY in dispatch table results in a default postmortem debugger saving a crash dump on Windows Server 2003 x64 but not on Vista x64 or Vista x86 (32-bit):

void __cdecl main(int argc, char **argv)
{
  SERVICE_TABLE_ENTRY dispatchTable[] =
  {
  { TEXT(SZSERVICENAME), (LPSERVICE_MAIN_FUNCTION)service_main},
  { NULL, NULL}
  };
  ...
  ...
  ...
  if (!StartServiceCtrlDispatcher(dispatchTable))
    AddToMessageLog(TEXT("StartServiceCtrlDispatcher failed."));
}

void WINAPI service_main(DWORD dwArgc, LPTSTR *lpszArgv)
{
  // register our service control handler:
  //
  sshStatusHandle = RegisterServiceCtrlHandler( TEXT(SZSERVICENAME), service_ctrl);

  if (!sshStatusHandle)
    goto cleanup;

  // SERVICE_STATUS members that don't change in example
  //
  ssStatus.dwServiceType = SERVICE_WIN32_OWN_PROCESS;
  ssStatus.dwServiceSpecificExitCode = 0;

  // report the status to the service control manager.
  //
  if (!ReportStatusToSCMgr(
      SERVICE_START_PENDING, // service state
      NO_ERROR, // exit code
      3000)) // wait hint
    goto cleanup;
  *(int *)NULL = 0;
  …
  …
  …
}

Seems the only way to get a crash mindump for analysis is to copy it from the report data like I explained above:

Loading Dump File [C:\ProgramData\Microsoft\Windows\WER\ReportQueue\Report0fa05f9d \WER5F42.tmp.mdmp]
User Mini Dump File: Only registers, stack and portions of memory are available

Symbol search path is: srv*c:\mss*http://msdl.microsoft.com/download/symbols
Executable search path is:
Windows Vista Version 6000 MP (2 procs) Free x64
Product: WinNt, suite: SingleUserTS
Debug session time: Fri Sep 28 17:50:06.000 2007 (GMT+1)
System Uptime: 0 days 0:30:59.495
Process Uptime: 0 days 0:00:04.000
.....
This dump file has an exception of interest stored in it.
The stored exception information can be accessed via .ecxr.
(d6c.fcc): Access violation - code c0000005 (first/second chance not available)
*** WARNING: Unable to verify checksum for simple.exe
simple!service_main+0x60:
00000000`00401aa0 c704250000000000000000 mov dword ptr [0],0 ds:00000000`00000000=????????
0:001> ~*k

   0  Id: d6c.cf4 Suspend: 0 Teb: 000007ff`fffdd000 Unfrozen
Child-SP          RetAddr           Call Site
00000000`0012f978 00000000`777026da ntdll!NtReadFile+0xa
00000000`0012f980 000007fe`feb265aa kernel32!ReadFile+0x8a
00000000`0012fa10 000007fe`feb262e3 advapi32!ScGetPipeInput+0x4a
00000000`0012faf0 000007fe`feb7ea5d advapi32!ScDispatcherLoop+0x9a
00000000`0012fbf0 00000000`004019f5 advapi32!StartServiceCtrlDispatcherA+0x8d
00000000`0012fe70 00000000`00408bac simple!main+0x155
00000000`0012fec0 00000000`0040897e simple!__tmainCRTStartup+0x21c
00000000`0012ff30 00000000`7770cdcd simple!mainCRTStartup+0xe
00000000`0012ff60 00000000`7792c6e1 kernel32!BaseThreadInitThunk+0xd
00000000`0012ff90 00000000`00000000 ntdll!RtlUserThreadStart+0x1d

#  1  Id: d6c.fcc Suspend: 0 Teb: 000007ff`fffdb000 Unfrozen
Child-SP          RetAddr           Call Site
00000000`008eff00 000007fe`feb24bf5 simple!service_main+0x60
00000000`008eff30 00000000`7770cdcd advapi32!ScSvcctrlThreadW+0x25
00000000`008eff60 00000000`7792c6e1 kernel32!BaseThreadInitThunk+0xd
00000000`008eff90 00000000`00000000 ntdll!RtlUserThreadStart+0x1d

Spawning a custom thread with NULL pointer access violation doesn’t result in a crash dump on my Vista x86 and x64 too. Therefore it appears that there are no automatic postmortem crash dumps saved for native Window services in Vista unless there is some setting that I missed. This might create some problems for traditional 3rd party Technical Support procedures especially if Windows Server 2008 (Longhorn) will have the same behavior.   

- Dmitry Vostokov @ DumpAnalysis.org -

Crash Dump Analysis Patterns (Part 28)

Wednesday, September 26th, 2007

Sometimes we have a problem that some functionality is not available or it is unresponsive when we request it. We can suppose that the process implementing that functionality has crashed or hangs. If we know the relationship between processes we can request several user dumps at once or a complete memory dump to analyze the dependency between processes by looking at their stack traces. This is an example of the system level crash dump analysis pattern that I call Coupled Processes.

Process relationship can be implemented via different interprocess communication mechanisms (IPC), for example, Remote Procedure Call (RPC) via LPC (Local Procedure Call) which can be easily identified in stack traces.

My favorite example here is when some application tries to print and hangs. Printing API is exported from WINSPOOL.DLL and it forwards via RPC most requests to Windows Print Spooler service. Therefore it is logical to take two dumps, one from that application and one from spoolsv.exe. Similar example is from Citrix Presentation Server environments related to printer autocreation when there are dependencies between Citrix Printing Service CpSvc.exe and spoolsv.exe. Therefore if new user connections hang and restarting both printing services resolves the issue then we might need to analyze dumps from both services together to confirm this Procedure Call Chain and find the problem 3rd-party printing component or driver.

Back to my favorite example. In the hang application we have the following thread:

  18  Id: 2130.6320 Suspend: 1 Teb: 7ffa8000 Unfrozen
ChildEBP RetAddr
01eae170 7c821c94 ntdll!KiFastSystemCallRet
01eae174 77c72700 ntdll!NtRequestWaitReplyPort+0xc
01eae1c8 77c713ba rpcrt4!LRPC_CCALL::SendReceive+0x230
01eae1d4 77c72c7f rpcrt4!I_RpcSendReceive+0x24
01eae1e8 77ce219b rpcrt4!NdrSendReceive+0x2b
01eae5d0 7307c9ef rpcrt4!NdrClientCall2+0x22e
01eae5e8 73082d8d winspool!RpcAddPrinter+0x1c
01eaea70 0040d81a winspool!AddPrinterW+0x102
01eaef58 0040ee7c App!AddNewPrinter+0x816
...
...
...

Notice winspool and rpcrt4 modules. The application is calling spooler service using RPC to add a new printer and waiting for a reply back. Looking at spooler service dump shows several threads displaying message boxes and waiting for user input: 

  20  Id: 790.5950 Suspend: 1 Teb: 7ffa2000 Unfrozen
ChildEBP RetAddr  Args to Child
03deea70 7739d02f 77392bf3 00000000 00000000 ntdll!KiFastSystemCallRet
03deeaa8 7738f122 03dd0058 00000000 00000001 user32!NtUserWaitMessage+0xc
03deead0 773a1722 77380000 00123690 00000000 user32!InternalDialogBox+0xd0
03deed90 773a1004 03deeeec 03dae378 03dae160 user32!SoftModalMessageBox+0x94b
03deeee0 773b1a28 03deeeec 00000028 00000000 user32!MessageBoxWorker+0x2ba
03deef38 773b19c4 00000000 03defb9c 03def39c user32!MessageBoxTimeoutW+0x7a
03deef58 773b19a0 00000000 03defb9c 03def39c user32!MessageBoxExW+0x1b
03deef74 021f265b 00000000 03defb9c 03def39c user32!MessageBoxW+0×45
WARNING: Stack unwind information not available. Following frames may be wrong.
03deef88 00000000 03dae160 03deffec 03dae16a PrinterDriver!UninstallerInstall+0×2cb

Dumping the 3rd parameter of MessageBoxW using WinDbg du command shows the message:

“Installation of the software for your printer is now complete. Restart your computer to make the new settings active.”

Another example when one process starts another and is waiting for it to finish:

0 Id: 2a34.24d0 Suspend: 1 Teb: 7ffde000 Unfrozen
ChildEBP RetAddr
0007ec8c 7c822124 ntdll!KiFastSystemCallRet
0007ec90 77e6bad8 ntdll!NtWaitForSingleObject+0xc
0007ed00 77e6ba42 kernel32!WaitForSingleObjectEx+0xac
0007ed14 01002f4c kernel32!WaitForSingleObject+0x12
0007f79c 01003137 userinit!ExecApplication+0x2d3
0007f7dc 0100366b userinit!ExecProcesses+0x1bb
0007fe68 010041fd userinit!StartTheShell+0x132
0007ff1c 010056f1 userinit!WinMain+0x263
0007ffc0 77e523e5 userinit!WinMainCRTStartup+0x186
0007fff0 00000000 kernel32!BaseProcessStart+0x23

- Dmitry Vostokov @ DumpAnalysis.org -

Resolving “Symbol file could not be found”

Monday, September 17th, 2007

On one of my debugging workstations I couldn’t analyze kernel and complete memory dumps from Windows 2003 Server R02. I was always getting this message: 

*** ERROR: Symbol file could not be found.  Defaulted to export symbols for ntkrnlmp.exe -

An attempt to reload and overwrite PDB files using .reload /o /f command didn’t resolve the issue but the following WinDbg command helped:

1: kd> !sym noisy
noisy mode - symbol prompts on

Reloading symbol files showed that default symbol path contained corrupt ntkrnlmp.pdb:  

1: kd> .reload
DBGHELP: C:\Program Files\Debugging Tools for Windows\sym\ntkrnlmp.pdb\A91CA63E49A840F4A50509F90ADE10D52\ntkrnlmp.pdb - E_PDB_CORRUPT
DBGHELP: ntkrnlmp.pdb - file not found
*** ERROR: Symbol file could not be found.  Defaulted to export symbols for ntkrnlmp.exe -
DBGHELP: nt - export symbol

Deleting it and reloading symbols again showed problems with the file downloaded from MS symbol server too: 

1: kd> .reload
SYMSRV:  c:\mss\ntkrnlmp.pdb\A91CA63E49A840F4A50509F90ADE10D52\ntkrnlmp.pd_
         The file or directory is corrupted and unreadable.
DBGHELP: ntkrnlmp.pdb - file not found
*** ERROR: Symbol file could not be found.  Defaulted to export symbols for ntkrnlmp.exe -
DBGHELP: nt - export symbols

Removing the folder and reloading symbols resolved the problem: 

1: kd> .reload
DBGHELP: nt - public symbols
         c:\mss\ntkrnlmp.pdb\A91CA63E49A840F4A50509F90ADE10D52\ntkrnlmp.pdb

Now it was time to switch noisy mode off:

1: kd> !sym quiet
quiet mode - symbol prompts on

- Dmitry Vostokov @ DumpAnalysis.org -

Crash Dump Analysis Patterns (Part 27)

Friday, September 14th, 2007

Sometimes a problem can be identified not from a single Stack Trace pattern but from a Stack Trace Collection

These include Coupled ProcessesProcedure Call Chains and Blocked Threads. All of them will be discussed in subsequent parts and in this part I only discuss various methods to list stack traces.

  • Process dumps including various process minidumps:

~*kv command lists all process threads

!findstack module[!symbol] 2 command filters all stack traces to show ones containing module or module!symbol

!uniqstack command

  • Kernel minidumps:

have only one problem thread. kv command or its variant is suffice.

  • Kernel and complete memory dumps:

!process 0 ff command lists all processes and their threads including user space process thread stacks for complete memory dumps. This command is valid for Windows XP and later. For older systems use WinDbg scripts

!stacks 2 [module[!symbol]] command shows kernel mode stack traces and you can filter the output based on module or module!symbol. Filtering is valid only for dumps from Windows XP and later systems    

~[ProcessorN]s;.reload /user;kv command sequence shows stack trace for the running thread on the specified processor.

!sprocess 0n<SessionId> ff lists all processes and their threads for the specified [terminal services] session. 

!for_each_thread “.thread /r /p @#Thread; kvallows execution of stack trace command variants per each thread. The following script takes the advantage of this command to list complete stack traces from an x64 system.

The processor change command is illustrated in this example:

0: kd> ~2s

2: kd> k
ChildEBP RetAddr
eb42bd58 00000000 nt!KiIdleLoop+0x14

2: kd> ~1s;.reload /user;k
Loading User Symbols
...
ChildEBP RetAddr
be4f8c30 eb091f43 i8042prt!I8xProcessCrashDump+0x53
be4f8c8c 8046bfe2 i8042prt!I8042KeyboardInterruptService+0x15d
be4f8c8c 8049470f nt!KiInterruptDispatch+0x32
be4f8d54 80468389 nt!NtSetEvent+0x71
be4f8d54 77f8290a nt!KiSystemService+0xc9
081cfefc 77f88266 ntdll!ZwSetEvent+0xb
081cff0c 77f881b1 ntdll!RtlpUnWaitCriticalSection+0x1b
081cff14 1b00c7d1 ntdll!RtlLeaveCriticalSection+0x1d
081cff4c 1b0034da msjet40!Database::ReadPages+0x81
081cffb4 7c57b3bc msjet40!System::WorkerThread+0x115
081cffec 00000000 KERNEL32!BaseThreadStart+0x52

Example of !findstack command (process dump):

0:000> !findstack kernel32!RaiseException 2
Thread 000, 1 frame(s) match
* 00 0013b3f8 72e8d3ef kernel32!RaiseException+0x53
  01 0013b418 72e9a26b msxml3!Exception::raiseException+0x5f
  02 0013b424 72e8ff00 msxml3!Exception::_throwError+0x22
  03 0013b46c 72e6abaa msxml3!COMSafeControlRoot::getBaseURL+0x3d
  04 0013b4bc 72e6a888 msxml3!Document::loadXML+0x82
  05 0013b510 64b73a9b msxml3!DOMDocumentWrapper::loadXML+0x5a
  06 0013b538 64b74eb6 iepeers!CPersistUserData::initXMLCache+0xa6
  07 0013b560 77d0516e iepeers!CPersistUserData::load+0xfc
  08 0013b57c 77d14abf oleaut32!DispCallFunc+0x16a
...
...
...
  66 0013fec8 0040243d shdocvw!IEWinMain+0x129
  67 0013ff1c 00402744 iexplore!WinMain+0x316
  68 0013ffc0 77e6f23b iexplore!WinMainCRTStartup+0x182
  69 0013fff0 00000000 kernel32!BaseProcessStart+0x23

Example of !stacks command (kernel dump):

2: kd> !stacks 2 nt!PspExitThread
Proc.Thread  .Thread  Ticks   ThreadState Blocker
                            [8a390818 System]

                            [8a1bbbf8 smss.exe]

                            [8a16cbf8 csrss.exe]

                            [89c14bf0 winlogon.exe]

                            [89dda630 services.exe]

                            [89c23af0 lsass.exe]

                            [8a227470 svchost.exe]

                            [89f03bb8 svchost.exe]

                            [89de3820 svchost.exe]

                            [89d09b60 svchost.exe]

                            [89c03530 ccEvtMgr.exe]

                            [89b8f4f0 ccSetMgr.exe]

                            [89dfe8c0 SPBBCSvc.exe]

                            [89c9db18 svchost.exe]

                            [89dfa268 spoolsv.exe]

                            [89dfa6b8 msdtc.exe]

                            [89df38f0 CpSvc.exe]

                            [89d97d88 DefWatch.exe]

                            [89e04020 IBMSPSVC.EXE]

                            [89b54710 IBMSPREM.EXE]

                            [89d9e4b0 IBMSPREM.EXE]

                            [89c2c4e8 svchost.exe]

                            [89d307c0 SavRoam.exe]

                            [89bfcd88 Rtvscan.exe]

                            [89b53b60 uphclean.exe]

                            [89c24020 AgentSVC.exe]

                            [89d75b60 sAginst.exe]

                            [89cf0d88 CdfSvc.exe]

                            [89d87020 cdmsvc.exe]

                            [89dafd88 ctxxmlss.exe]

                            [89d8dd88 encsvc.exe]

                            [89d06d88 ImaSrv.exe]

                            [89d37b60 mfcom.exe]

                            [89c8bb18 SmaService.exe]

                            [89d2ba80 svchost.exe]

                            [89ce8630 XTE.exe]

                            [89b64b60 XTE.exe]

                            [89b7c680 ctxcpusched.exe]

                            [88d94a88 ctxcpuusync.exe]

                            [89ba5418 unsecapp.exe]

                            [89d846e0 wmiprvse.exe]

                            [89cda9d8 ctxwmisvc.exe]

                            [88d6cb78 logon.scr]

                            [88ba0a70 csrss.exe]

                            [88961968 winlogon.exe]

                            [8865f740 rdpclip.exe]

                            [8858db20 wfshell.exe]

                            [88754020 explorer.exe]

                            [88846d88 BacsTray.exe]

                            [886b6180 ccApp.exe]

                            [884bc020 fppdis3a.exe]

                            [885cb350 ctfmon.exe]

                            [888bb918 cscript.exe]

                            [8880b3c8 cscript.exe]

                            [88ad2950 csrss.exe]
 b68.00215c  88930020 0000000 RUNNING    nt!KeBugCheckEx+0x1b
                                        nt!MiCheckSessionPoolAllocations+0xe3
                                        nt!MiDereferenceSessionFinal+0x183
                                        nt!MmCleanProcessAddressSpace+0x6b
                                        nt!PspExitThread+0x5f1
                                        nt!PspTerminateThreadByPointer+0x4b
                                        nt!PspSystemThreadStartup+0x3c
                                        nt!KiThreadStartup+0x16

                            [88629310 winlogon.exe]

                            [88a4d9b0 csrss.exe]

                            [88d9f8b0 winlogon.exe]

                            [88cd5840 wfshell.exe]

                            [8a252440 OUTLOOK.EXE]

                            [8a194bf8 WINWORD.EXE]

                            [88aabd20 ctfmon.exe]

                            [889ef440 EXCEL.EXE]

                            [88bec838 HogiaGUI2.exe]

                            [88692020 csrss.exe]

                            [884dd508 winlogon.exe]

                            [88be1d88 wfshell.exe]

                            [886a7d88 OUTLOOK.EXE]

                            [889baa70 WINWORD.EXE]

                            [8861e3d0 ctfmon.exe]

                            [887bbb68 EXCEL.EXE]

                            [884e4020 csrss.exe]

                            [8889d218 winlogon.exe]

                            [887c8020 wfshell.exe]

Threads Processed: 1101

What if we have a list of processes from a complete memory dump by using !process 0 0 command and we want to interrogate the specific process? In this case we need to switch to that process and reload user space symbol files (.process /r /p address)

There is also a separate command to reload user space symbol files any time (.reload /user).

After switching we can list threads (!process address), dump or search process virtual memory, etc. For example:

1: kd> !process 0 0
**** NT ACTIVE PROCESS DUMP ****
PROCESS 890a3320  SessionId: 0  Cid: 0008    Peb: 00000000  ParentCid: 0000
    DirBase: 00030000  ObjectTable: 890a3e08  TableSize: 405.
    Image: System

PROCESS 889dfd60  SessionId: 0  Cid: 0144    Peb: 7ffdf000  ParentCid: 0008
    DirBase: 0b9e7000  ObjectTable: 889fdb48  TableSize: 212.
    Image: SMSS.EXE

PROCESS 890af020  SessionId: 0  Cid: 0160    Peb: 7ffdf000  ParentCid: 0144
    DirBase: 0ce36000  ObjectTable: 8898e308  TableSize: 747.
    Image: CSRSS.EXE

PROCESS 8893d020  SessionId: 0  Cid: 0178    Peb: 7ffdf000  ParentCid: 0144
    DirBase: 0d33b000  ObjectTable: 890ab4c8  TableSize: 364.
    Image: WINLOGON.EXE

PROCESS 88936020  SessionId: 0  Cid: 0194    Peb: 7ffdf000  ParentCid: 0178
    DirBase: 0d7d5000  ObjectTable: 88980528  TableSize: 872.
    Image: SERVICES.EXE

PROCESS 8897f020  SessionId: 0  Cid: 01a0    Peb: 7ffdf000  ParentCid: 0178
    DirBase: 0d89d000  ObjectTable: 889367c8  TableSize: 623.
    Image: LSASS.EXE

1: kd> .process /r /p 8893d020
Implicit process is now 8893d020
Loading User Symbols
...

1: kd> !process 8893d020
PROCESS 8893d020  SessionId: 0  Cid: 0178    Peb: 7ffdf000  ParentCid: 0144
    DirBase: 0d33b000  ObjectTable: 890ab4c8  TableSize: 364.
    Image: WINLOGON.EXE
    VadRoot 8893a508 Clone 0 Private 1320. Modified 45178. Locked 0.
    DeviceMap 89072448
    Token                             e392f8d0
    ElapsedTime                        9:54:06.0882
    UserTime                          0:00:00.0071
    KernelTime                        0:00:00.0382
    QuotaPoolUsage[PagedPool]         34828
    QuotaPoolUsage[NonPagedPool]      43440
    Working Set Sizes (now,min,max)  (737, 50, 345) (2948KB, 200KB, 1380KB)
    PeakWorkingSetSize                2764
    VirtualSize                       46 Mb
    PeakVirtualSize                   52 Mb
    PageFaultCount                    117462
    MemoryPriority                    FOREGROUND
    BasePriority                      13
    CommitCharge                      1861

        THREAD 8893dda0  Cid 178.15c  Teb: 7ffde000  Win32Thread: a2034908 WAIT: (WrUserRequest) UserMode Non-Alertable
            8893bee0  SynchronizationEvent
        Not impersonating
        Owning Process 8893d020
        Wait Start TickCount    29932455      Elapsed Ticks: 7
        Context Switch Count    28087                   LargeStack
        UserTime                  0:00:00.0023
        KernelTime                0:00:00.0084
        Start Address winlogon!WinMainCRTStartup (0x0101cbb0)
        Stack Init eb1b0000 Current eb1afcc8 Base eb1b0000 Limit eb1ac000 Call 0
        Priority 15 BasePriority 15 PriorityDecrement 0 DecrementCount 0

        ChildEBP RetAddr  Args to Child
        eb1afce0 8042d893 00000000 a2034908 00000001 nt!KiSwapThread+0x1b1
        eb1afd08 a00019c2 8893bee0 0000000d 00000001 nt!KeWaitForSingleObject+0x1a3
        eb1afd44 a0013993 000020ff 00000000 00000001 win32k!xxxSleepThread+0x18a
        eb1afd54 a001399f 0006fdd8 80468389 00000000 win32k!xxxWaitMessage+0xe
        eb1afd5c 80468389 00000000 00000000 00000000 win32k!NtUserWaitMessage+0xb
        eb1afd5c 77e58b53 00000000 00000000 00000000 nt!KiSystemService+0xc9
        0006fdd0 77e33630 00000000 00000000 0000ffff USER32!NtUserWaitMessage+0xb
        0006fe04 77e44327 000100d2 00000000 00000010 USER32!DialogBox2+0x216
        0006fe28 77e38d37 76b90000 76c75c78 00000000 USER32!InternalDialogBox+0xd1
        0006fe48 77e39eba 76b90000 76c75c78 00000000 USER32!DialogBoxIndirectParamAorW+0x34
        0006fe6c 01011749 76b90000 00000578 00000000 USER32!DialogBoxParamW+0x3d
        0006fea8 01018bd3 000755e8 76b90000 00000578 winlogon!TimeoutDialogBoxParam+0x27
        0006fee0 76b93701 000755e8 76b90000 00000578 winlogon!WlxDialogBoxParam+0×7b
        0006ff08 010164c6 0008d0e0 5ffa0000 000755e8 3rdPartyGINA!WlxDisplaySASNotice+0×43
        0006ff20 01014960 000755e8 00000005 00072c9c winlogon!MainLoop+0×96
        0006ff58 0101cd06 00071fc8 00000000 00072c9c winlogon!WinMain+0×37a
        0006fff4 00000000 7ffdf000 000000c8 00000100 winlogon!WinMainCRTStartup+0×156

        THREAD 88980020  Cid 178.188  Teb: 7ffdc000  Win32Thread: 00000000 WAIT: (DelayExecution) UserMode Alertable
            88980108  NotificationTimer
        Not impersonating
        Owning Process 8893d020
        Wait Start TickCount    29930810      Elapsed Ticks: 1652
        Context Switch Count    15638
        UserTime                  0:00:00.0000
        KernelTime                0:00:00.0000
        Start Address KERNEL32!BaseThreadStartThunk (0x7c57b740)
        Win32 Start Address ntdll!RtlpTimerThread (0x77faa02d)
        Stack Init bf6f7000 Current bf6f6cc4 Base bf6f7000 Limit bf6f4000 Call 0
        Priority 13 BasePriority 13 PriorityDecrement 0 DecrementCount 0

        ChildEBP RetAddr  Args to Child
        bf6f6cdc 8042d340 bf6f6d64 00bfffac 00bfffac nt!KiSwapThread+0x1b1
        bf6f6d04 8052aac9 8046c101 00000001 bf6f6d34 nt!KeDelayExecutionThread+0x182
        bf6f6d54 80468389 00000001 00bfffac 00000000 nt!NtDelayExecution+0x7f
        bf6f6d54 77f82831 00000001 00bfffac 00000000 nt!KiSystemService+0xc9
        00bfff9c 77f842c4 00000001 00bfffac 00000000 ntdll!NtDelayExecution+0xb
        00bfffb4 7c57b3bc 0006fe60 00000000 00000000 ntdll!RtlpTimerThread+0x42
        00bfffec 00000000 77faa02d 0006fe60 00000000 KERNEL32!BaseThreadStart+0x52

1: kd> dds 0006fee0
0006fee0  0006ff08
0006fee4  76b93701 3rdPartyGINA!WlxDisplaySASNotice+0x43
0006fee8  000755e8
0006feec  76b90000 3rdParty
0006fef0  00000578
0006fef4  00000000
0006fef8  76b9370b 3rdParty!WlxDisplaySASNotice+0x4d
0006fefc  0008d0e0
0006ff00  00000008
0006ff04  00000080
0006ff08  0006ff20
0006ff0c  010164c6 winlogon!MainLoop+0x96
0006ff10  0008d0e0
0006ff14  5ffa0000
0006ff18  000755e8
0006ff1c  00000000
0006ff20  0006ff58
0006ff24  01014960 winlogon!WinMain+0x37a
0006ff28  000755e8
0006ff2c  00000005
0006ff30  00072c9c
0006ff34  00000001
0006ff38  000001bc
0006ff3c  00000005
0006ff40  00000001
0006ff44  0000000d
0006ff48  00000000
0006ff4c  00000000
0006ff50  00000000
0006ff54  0000ffe4
0006ff58  0006fff4
0006ff5c  0101cd06 winlogon!WinMainCRTStartup+0x156

- Dmitry Vostokov @ DumpAnalysis.org -

Minidump Analysis (Part 3)

Wednesday, September 12th, 2007

Part 2 dealt with stack traces. Unfortunately stack traces reported by WinDbg, especially involving 3rd-party components, are usually incomplete and sometimes not even correct. They can also point to stable drivers when the system failure happened after slowly accumulated corruption caused by some intermediate driver or a combination of drivers.

Sometimes there are other 3rd-party drivers involved before the system crash that are not visible in the output of !analyze -v command and simply removing them, disabling or upgrading software they are part from make the system stable. To see them we can look at the so called raw stack data. Because kernel mode thread stack size is small (12Kb or 0×3000) we can simply dump memory range between ESP-3000 and ESP+3000. We can use RSP register for x64 dumps but the output will be the same.

Let’s look at our minidump from the previous part. The stack trace is small, incomplete and points to DisplayDriver. This is because we don’t have symbol information for DisplayDriver.dll. Could it be the case that DisplayDriver.dll was used incorrectly by another driver or operating system component? What are other components that might have been used prior to BSOD? Raw stack dump shows additional symbols like DisplayDriver_mini, win32k and dxg:

0: kd> dps esp-3000 esp+3000
b4f4f8b4  ????????
b4f4f8b8  ????????
b4f4f8bc  ????????
b4f4f8c0  ????????
...
...
...
b4f51ffc  ????????
b4f52000  00001000
b4f52004  00006000
b4f52008  b4f5204c
b4f5200c  89025978
b4f52010  89139000
b4f52014  00000000
b4f52018  b4f527ec
b4f5201c  b4f52840
b4f52020  bfbf0ca6 DisplayDriver+0x21bca6
b4f52024  00000000
b4f52028  89025978
...
...
...
b4f52100  e24079e0
b4f52104  bfbf0ca6 DisplayDriver+0x21bca6
b4f52108  00000008
...
...
...
b4f52364  b4f52414
b4f52368  804dc0b2 nt!ExecuteHandler+0x24
b4f5236c  b4f527ec
b4f52370  b4f52d40
b4f52374  b4f524e8
b4f52378  b4f52400
b4f5237c  bf9d2132 dxg!_except_handler3
b4f52380  2a2a2a0a
...
...
...
b4f523e8  b4f52408
b4f523ec  8053738a nt!KeBugCheckEx+0x1b
b4f523f0  0000008e
b4f523f4  c0000005
b4f523f8  bfbf0ca6 DisplayDriver+0x21bca6
b4f523fc  b4f52840
b4f52400  00000000
b4f52404  00000000
b4f52408  b4f527d0
b4f5240c  80521fed nt!KiDispatchException+0x3b1
b4f52410  0000008e
b4f52414  c0000005
b4f52418  bfbf0ca6 DisplayDriver+0x21bca6
b4f5241c  b4f52840
b4f52420  00000000
b4f52424  03a3fb4c
b4f52428  03a3fb4c
b4f5242c  b4f52800
b4f52430  00000000
b4f52434  00000000
b4f52438  00000000
b4f5243c  b9deffc6 DisplayDriver_mini+0x4bfc6
b4f52440  897c621c
b4f52444  00000086
b4f52448  0000003c
b4f5244c  b9f3af5a DisplayDriver_mini+0x196f5a
b4f52450  897c6200
b4f52454  00000086
b4f52458  897c6200
b4f5245c  00000000
b4f52460  00000000
b4f52464  00000000
b4f52468  b9f38b4e DisplayDriver_mini+0x194b4e
b4f5246c  00000000
...
...
...
b4f5250c  00002800
b4f52510  b9f3ac10 DisplayDriver_mini+0x196c10
b4f52514  897c6200
b4f52518  00002504
b4f5251c  00000010
b4f52520  897c6200
b4f52524  b9f2d194 DisplayDriver_mini+0x189194
b4f52528  897c6200
b4f5252c  00002504
b4f52530  00000010
b4f52534  897c6200
b4f52538  898cca80
b4f5253c  00000080
b4f52540  89654008
b4f52544  b9f358e2 DisplayDriver_mini+0x1918e2
b4f52548  897c6200
...
...
...
b4f5256c  00000000
b4f52570  b9deff5c DisplayDriver_mini+0x4bf5c
b4f52574  00000000
...
...
...
b4f5259c  e24079e0
b4f525a0  bfbf0ca6 DisplayDriver+0x21bca6
b4f525a4  00000008
b4f525a8  00010246
b4f525ac  b4f528b4
b4f525b0  00000010
b4f525b4  0000003c
b4f525b8  b9f3af5a DisplayDriver_mini+0x196f5a
b4f525bc  897c6200
b4f525c0  00000086
b4f525c4  89b81008
b4f525c8  897c6200
b4f525cc  00000000
b4f525d0  00007c00
b4f525d4  b9deff5c DisplayDriver_mini+0x4bf5c
b4f525d8  b9deff5c DisplayDriver_mini+0x4bf5c
b4f525dc  8988d7d8
b4f525e0  b9deff66 DisplayDriver_mini+0x4bf66
b4f525e4  b9deff5c DisplayDriver_mini+0x4bf5c
b4f525e8  8961c288
b4f525ec  b9deff66 DisplayDriver_mini+0x4bf66
b4f525f0  8961c288
b4f525f4  00000000
b4f525f8  00000046
b4f525fc  00000000
b4f52600  89903000
b4f52604  b9e625a9 DisplayDriver_mini+0xbe5a9
b4f52608  8961c288
b4f5260c  00000046
b4f52610  00000000
b4f52614  b9deff5c DisplayDriver_mini+0x4bf5c
b4f52618  896ac008
...
...
...
b4f52630  898a8000
b4f52634  b9e9f220 DisplayDriver_mini+0xfb220
b4f52638  89941400
b4f5263c  b9e2ffec DisplayDriver_mini+0x8bfec
b4f52640  00000000
b4f52644  00000000
b4f52648  00000050
b4f5264c  b9e790d3 DisplayDriver_mini+0xd50d3
b4f52650  897c6200
...
...
...
b4f5266c  89bf6200
b4f52670  805502fa nt!ExFreePoolWithTag+0x664
b4f52674  00000000
b4f52678  88f322e0
b4f5267c  88c9d708
b4f52680  00000001
b4f52684  898cf918
b4f52688  ffdff538
b4f5268c  804dc766 nt!KiUnlockDispatcherDatabase+0x1c
b4f52690  b4f52901
b4f52694  b4f526ac
b4f52698  00000001
b4f5269c  804eaf06 nt!IopFreeIrp+0xed
b4f526a0  00000000
b4f526a4  00000000
b4f526a8  88c9d708
b4f526ac  b4f52700
b4f526b0  804f2b9f nt!IopCompleteRequest+0x319
b4f526b4  804f2bb5 nt!IopCompleteRequest+0x32f
b4f526b8  88c9d748
b4f526bc  89025978
b4f526c0  890259ac
b4f526c4  897752e8
b4f526c8  89025978
b4f526cc  b4f52910
b4f526d0  b4f527c8
b4f526d4  00000000
b4f526d8  b9e0d300 DisplayDriver_mini+0x69300
b4f526dc  88c9d708
b4f526e0  00000000
b4f526e4  00000086
b4f526e8  b4f526b8
b4f526ec  b9f3ad28 DisplayDriver_mini+0x196d28
b4f526f0  ffffffff
b4f526f4  804e2ed8 nt!_except_handler3
b4f526f8  804f2bb8 nt!GUID_DOCK_INTERFACE+0x424
b4f526fc  ffffffff
b4f52700  804f2bb5 nt!IopCompleteRequest+0x32f
b4f52704  804f2db5 nt!KiDeliverApc+0xb3
b4f52708  88c9d748
b4f5270c  b4f5274c
b4f52710  b4f52728
b4f52714  890259ac
b4f52718  804dce74 nt!KiDeliverApc+0x1e0
b4f5271c  806ffae4 hal!KeReleaseQueuedSpinLock+0x3c
b4f52720  89025978
b4f52724  b4f527f8
b4f52728  00000000
b4f5272c  89025a60
b4f52730  00000001
b4f52734  b4f52d64
b4f52738  88e775c8
b4f5273c  804f2a72 nt!IopCompleteRequest
b4f52740  00000000
b4f52744  00000000
b4f52748  00000000
b4f5274c  00000000
b4f52750  b4f52768
b4f52754  806ffef2 hal!HalpApcInterrupt+0xc6
b4f52758  00000000
b4f5275c  00000000
b4f52760  b4f52768
b4f52764  00000000
b4f52768  b4f527f8
b4f5276c  806ffae4 hal!KeReleaseQueuedSpinLock+0x3c
b4f52770  badb0d00
b4f52774  00000000
b4f52778  00000000
b4f5277c  806ffae4 hal!KeReleaseQueuedSpinLock+0x3c
b4f52780  00000008
b4f52784  00000246
b4f52788  804e5d2c nt!KeInsertQueueApc+0x6d
b4f5278c  88c9d748
...
...
...
b4f527c0  b4f52c10
b4f527c4  804e2ed8 nt!_except_handler3
b4f527c8  804faca0 nt!KiFindFirstSetLeft+0x120
b4f527cc  ffffffff
b4f527d0  b4f52840
b4f527d4  804de403 nt!CommonDispatchException+0x4d
b4f527d8  b4f527ec
...
...
...
b4f527f4  00000000
b4f527f8  bfbf0ca6 DisplayDriver+0x21bca6
b4f527fc  00000002
...
...
...
b4f52828  b4f52840
b4f5282c  804e0944 nt!KiTrap0E+0xd0
b4f52830  00000000
b4f52834  03a3fb4c
b4f52838  00000000
b4f5283c  804de3b4 nt!Kei386EoiHelper+0x18a
b4f52840  e24079e0
b4f52844  bfbf0ca6 DisplayDriver+0x21bca6
b4f52848  badb0d00
...
...
...
b4f52884  00000000
b4f52888  bfdba6c7 DisplayDriver+0x3e56c7
b4f5288c  b4f52c10
...
...
...
b4f528a4  00000000
b4f528a8  bfbf0ca6 DisplayDriver+0x21bca6
b4f528ac  00000008
...
...
...
b4f528d8  000000f3
b4f528dc  bfb6269f DisplayDriver+0x18d69f
b4f528e0  9745d083
b4f528e4  00000001
b4f528e8  e9a18d4c
b4f528ec  ffffffff
b4f528f0  bfb268e7 DisplayDriver+0x1518e7
b4f528f4  000000ab
...
...
...
b4f52960  0000027a
b4f52964  bfb2696c DisplayDriver+0x15196c
b4f52968  00000000
...
...
...
b4f5298c  e2004308
b4f52990  bfab8ce4 DisplayDriver+0xe3ce4
b4f52994  000000ab
...
...
...
b4f52bd0  00000000
b4f52bd4  bf804779 win32k!GreReleaseFastMutex+0x14
b4f52bd8  b4f52be8
b4f52bdc  bf8a04e3 win32k!dhpdevRetrieveNode+0x32
b4f52be0  89b20128
b4f52be4  b4f52c50
b4f52be8  b4f52c20
b4f52bec  bf907d15 win32k!WatchdogDdBlt+0x38
b4f52bf0  b4f52c50
...
...
...
b4f52c10  b4f52d40
b4f52c14  bf9877ae win32k!_except_handler3
b4f52c18  bf995380 win32k!`string'+0x2b4
b4f52c1c  00000000
b4f52c20  b4f52d50
b4f52c24  bf9cdd78 dxg!DxDdBlt+0x374
b4f52c28  b4f52c50
b4f52c2c  b4f52d64
b4f52c30  038dfaf4
b4f52c34  bf907ca3 win32k!NtGdiDdBlt
b4f52c38  00000001
...
...
...
b4f52c90  000000b0
b4f52c94  bf805b42 win32k!AllocateObject+0xaa
b4f52c98  00000001
b4f52c9c  00000006
b4f52ca0  b4f52cb0
b4f52ca4  32040ddf
b4f52ca8  bf805734 win32k!HANDLELOCK::vLockHandle+0x75
b4f52cac  00000ff4
b4f52cb0  00000000
b4f52cb4  bc40ddf0
b4f52cb8  b4f52cd0
b4f52cbc  00000001
b4f52cc0  804da3ee nt!ExAcquireResourceExclusiveLite+0x67
b4f52cc4  00000008
...
...
...
b4f52ce8  80004005
b4f52cec  804dc605 nt!ExReleaseResourceLite+0x8d
b4f52cf0  00000000
...
...
...
b4f52d08  b4f52d18
b4f52d0c  bf8018bf win32k!GreReleaseSemaphore+0xa
b4f52d10  bf803d1e win32k!GreUnlockDisplay+0x24
b4f52d14  00000000
...
...
...
b4f52d40  ffffffff
b4f52d44  bf9d2132 dxg!_except_handler3
b4f52d48  bf9d2928 dxg!GUID_MiscellaneousCallbacks+0x42c
b4f52d4c  ffffffff
b4f52d50  b4f52d64
b4f52d54  804dd99f nt!KiFastCallEntry+0xfc
b4f52d58  02400002
...
...
...
b4f52ddc  00000023
b4f52de0  804ec781 nt!KiThreadStartup+0x16
b4f52de4  f7849b85 NDIS!ndisWorkerThread
b4f52de8  88c9d4d0
b4f52dec  00000000
b4f52df0  0020027f
b4f52df4  011c0000
b4f52df8  bfdb97b7 DisplayDriver+0x3e47b7
b4f52dfc  00000008
...
...
...
b4f52e70  00000000
b4f52e74  f7800000 InCDPass+0x1000
b4f52e78  00004026
...
...
...
b4f52ff8  00000000
b4f52ffc  00000000
b4f53000  ????????
b4f53004  ????????

Some are coincidental like InCDPass and NDIS. Obviously DisplayDriver, DisplayDriver_mini, dxg and win32k are related due to their functions: Display, DirectX, GDI (Graphics Device Interface). 

Now we can check their module information:

0: kd> lmv m DisplayDriver
start    end        module name
bf9d5000 bff42500   DisplayDriver T (no symbols)
    Loaded symbol image file: DisplayDriver.dll
    Image path: DisplayDriver.dll
    Image name: DisplayDriver.dll
    Timestamp:        Fri Jun 29 09:13:08 2007 (4684BF14)
    CheckSum:         00570500
    ImageSize:        0056D500
    Translations:     0000.04b0 0000.04e0 0409.04b0 0409.04e0

0: kd> lmv m DisplayDriver_mini
start    end        module name
b9da4000 ba421f20   DisplayDriver_mini T (no symbols)
    Loaded symbol image file: DisplayDriver_mini.sys
    Image path: DisplayDriver_mini.sys
    Image name: DisplayDriver_mini.sys
    Timestamp:        Fri Jun 29 09:16:41 2007 (4684BFE9)
    CheckSum:         00680F20
    ImageSize:        0067DF20
    Translations:     0000.04b0 0000.04e0 0409.04b0 0409.04e0

0: kd> lmv m dxg
start    end        module name
bf9c3000 bf9d4580   dxg        (pdb symbols)
    Loaded symbol image file: dxg.sys
    Mapped memory image file: c:\websymbols\dxg.sys\41107B9311580\dxg.sys
    Image path: dxg.sys
    Image name: dxg.sys
    Timestamp:        Wed Aug 04 07:00:51 2004 (41107B93)
    CheckSum:         0001D181
    ImageSize:        00011580
    File version:     5.1.2600.2180
    Product version:  5.1.2600.2180
    File flags:       0 (Mask 3F)
    File OS:          40004 NT Win32
    File type:        3.7 Driver
    File date:        00000000.00000000
    Translations:     0409.04b0
    CompanyName:      Microsoft Corporation
    ProductName:      Microsoft® Windows® Operating System
    InternalName:     dxg.sys
    OriginalFilename: dxg.sys
    ProductVersion:   5.1.2600.2180
    FileVersion:      5.1.2600.2180 (xpsp_sp2_rtm.040803-2158)
    FileDescription:  DirectX Graphics Driver
    LegalCopyright:   © Microsoft Corporation. All rights reserved.

0: kd> lmv m win32k
start    end        module name
bf800000 bf9c2180   win32k   # (pdb symbols)
    Loaded symbol image file: win32k.sys
    Mapped memory image file: c:\websymbols\win32k.sys\45F013F61c2180\win32k.sys
    Image path: win32k.sys
    Image name: win32k.sys
    Timestamp:        Thu Mar 08 13:47:34 2007 (45F013F6)
    CheckSum:         001C4886
    ImageSize:        001C2180
    File version:     5.1.2600.3099
    Product version:  5.1.2600.3099
    File flags:       0 (Mask 3F)
    File OS:          40004 NT Win32
    File type:        3.7 Driver
    File date:        00000000.00000000
    Translations:     0406.04b0
    CompanyName:      Microsoft Corporation
    ProductName:      Microsoft® Windows® Operativsystem
    InternalName:     win32k.sys
    OriginalFilename: win32k.sys
    ProductVersion:   5.1.2600.3099
    FileVersion:      5.1.2600.3099 (xpsp_sp2_gdr.070308-0222)
    FileDescription:  Win32-flerbrugerdriver
    LegalCopyright:   © Microsoft Corporation. Alle rettigheder forbeholdes.

- Dmitry Vostokov @ DumpAnalysis.org -

Crash Dump Analysis Patterns (Part 26)

Tuesday, September 11th, 2007

Sometimes we get a process dump from x64 Windows and when we load it into WinDbg we get the output telling us that an exception or a breakpoint comes from wow64.dll. For example:

Loading Dump File [X:\ppid2088.dmp]
User Mini Dump File with Full Memory: Only application data is available

Comment: 'Userdump generated complete user-mode minidump with Exception Monitor function on SERVER01'
Symbol search path is: srv*c:\mss*http://msdl.microsoft.com/download/symbols
Executable search path is:
Windows Server 2003 Version 3790 (Service Pack 2) MP (4 procs) Free x64
Product: Server, suite: TerminalServer
Debug session time: Tue Sep  4 13:36:14.000 2007 (GMT+2)
System Uptime: 6 days 3:32:26.081
Process Uptime: 0 days 0:01:54.000
WARNING: tsappcmp overlaps ws2_32
WARNING: msvcp60 overlaps oleacc
WARNING: tapi32 overlaps rasapi32
WARNING: rtutils overlaps rasman
WARNING: dnsapi overlaps rasapi32
WARNING: wldap32 overlaps dnsapi
WARNING: ntshrui overlaps userenv
WARNING: wtsapi32 overlaps dnsapi
WARNING: winsta overlaps setupapi
WARNING: activeds overlaps rtutils
WARNING: activeds overlaps rasman
WARNING: adsldpc overlaps activeds
WARNING: drprov overlaps apphelp
WARNING: netui1 overlaps netui0
WARNING: davclnt overlaps apphelp
...
This dump file has an exception of interest stored in it.
The stored exception information can be accessed via .ecxr.
(2088.2fe4): Unknown exception - code 000006d9 (first/second chance not available)
wow64!Wow64NotifyDebugger+0×9:
00000000`6b006369 b001            mov     al,1

Analysis shows that some run-time exception was raised but the stack trace shows only WOW64 CPU simulation code in all process threads:

0:000> !analyze -v
*********************************************************
*                                                       *
*                  Exception Analysis                   *
*                                                       *
*********************************************************

FAULTING_IP:
kernel32!RaiseException+53
00000000`7d4e2366 5e              pop     rsi

EXCEPTION_RECORD:  ffffffffffffffff -- (.exr 0xffffffffffffffff)
ExceptionAddress: 000000007d4e2366 (kernel32!RaiseException+0x0000000000000053)
   ExceptionCode: 000006d9
  ExceptionFlags: 00000001
NumberParameters: 0

DEFAULT_BUCKET_ID:  STACK_CORRUPTION

PROCESS_NAME:  App.exe

ERROR_CODE: (NTSTATUS) 0x6d9 - There are no more endpoints available from the endpoint mapper.

NTGLOBALFLAG:  0

APPLICATION_VERIFIER_FLAGS:  0

LAST_CONTROL_TRANSFER:  from 000000006b0064f2 to 000000006b006369

FOLLOWUP_IP:
wow64!Wow64NotifyDebugger+9
00000000`6b006369 b001            mov     al,1

SYMBOL_STACK_INDEX:  0

SYMBOL_NAME:  wow64!Wow64NotifyDebugger+9

FOLLOWUP_NAME:  MachineOwner

MODULE_NAME: wow64

IMAGE_NAME:  wow64.dll

DEBUG_FLR_IMAGE_TIMESTAMP:  45d6943d

FAULTING_THREAD:  0000000000002fe4

PRIMARY_PROBLEM_CLASS:  STACK_CORRUPTION

BUGCHECK_STR:  APPLICATION_FAULT_STACK_CORRUPTION

STACK_COMMAND:  ~0s; .ecxr ; dt ntdll!LdrpLastDllInitializer BaseDllName ; dt ntdll!LdrpFailureData ; kb

FAILURE_BUCKET_ID:  X64_APPLICATION_FAULT_STACK_CORRUPTION_wow64!Wow64NotifyDebugger+9

BUCKET_ID:  X64_APPLICATION_FAULT_STACK_CORRUPTION_wow64!Wow64NotifyDebugger+9

Followup: MachineOwner
---------

0:000> ~*k

.  0  Id: 2088.2fe4 Suspend: 1 Teb: 00000000`7efdb000 Unfrozen
Child-SP          RetAddr           Call Site
00000000`0016e190 00000000`6b0064f2 wow64!Wow64NotifyDebugger+0x9
00000000`0016e1c0 00000000`6b006866 wow64!Wow64KiRaiseException+0x172
00000000`0016e530 00000000`78b83c7d wow64!Wow64SystemServiceEx+0xd6
00000000`0016edf0 00000000`6b006a5a wow64cpu!ServiceNoTurbo+0x28
00000000`0016ee80 00000000`6b005e0d wow64!RunCpuSimulation+0xa
00000000`0016eeb0 00000000`77ed8030 wow64!Wow64LdrpInitialize+0x2ed
00000000`0016f3f0 00000000`77ed582f ntdll!LdrpInitializeProcess+0x1538
00000000`0016f6f0 00000000`77ef30a5 ntdll!LdrpInitialize+0x18f
00000000`0016f7d0 00000000`7d4d1510 ntdll!KiUserApcDispatcher+0x15
00000000`0016fcc8 00000000`00000000 kernel32!BaseProcessStartThunk
00000000`0016fcd0 00000000`00000000 0x0
00000000`0016fcd8 00000000`00000000 0x0
00000000`0016fce0 00000000`00000000 0x0
00000000`0016fce8 00000000`00000000 0x0
00000000`0016fcf0 00000000`00000000 0x0
00000000`0016fcf8 00000000`00000000 0x0
00000000`0016fd00 00010007`00000000 0x0
00000000`0016fd08 00000000`00000000 0x10007`00000000
00000000`0016fd10 00000000`00000000 0x0
00000000`0016fd18 00000000`00000000 0x0

   1  Id: 2088.280c Suspend: 1 Teb: 00000000`7efd8000 Unfrozen
Child-SP          RetAddr           Call Site
00000000`0200f0d8 00000000`6b006a5a wow64cpu!WaitForMultipleObjects32+0x3a
00000000`0200f180 00000000`6b005e0d wow64!RunCpuSimulation+0xa
00000000`0200f1b0 00000000`77f109f0 wow64!Wow64LdrpInitialize+0x2ed
00000000`0200f6f0 00000000`77ef30a5 ntdll!LdrpInitialize+0x2aa
00000000`0200f7d0 00000000`7d4d1504 ntdll!KiUserApcDispatcher+0x15
00000000`0200fcc8 00000000`00000000 kernel32!BaseThreadStartThunk
00000000`0200fcd0 00000000`00000000 0x0
00000000`0200fcd8 00000000`00000000 0x0
00000000`0200fce0 00000000`00000000 0x0
00000000`0200fce8 00000000`00000000 0x0
00000000`0200fcf0 00000000`00000000 0x0
00000000`0200fcf8 00000000`00000000 0x0
00000000`0200fd00 0001002f`00000000 0x0
00000000`0200fd08 00000000`00000000 0x1002f`00000000
00000000`0200fd10 00000000`00000000 0x0
00000000`0200fd18 00000000`00000000 0x0
00000000`0200fd20 00000000`00000000 0x0
00000000`0200fd28 00000000`00000000 0x0
00000000`0200fd30 00000000`00000000 0x0
00000000`0200fd38 00000000`00000000 0x0

   2  Id: 2088.1160 Suspend: 1 Teb: 00000000`7efd5000 Unfrozen
Child-SP          RetAddr           Call Site
00000000`0272e7c8 00000000`6b29c464 wow64win!ZwUserGetMessage+0xa
00000000`0272e7d0 00000000`6b006866 wow64win!whNtUserGetMessage+0x34
00000000`0272e830 00000000`78b83c7d wow64!Wow64SystemServiceEx+0xd6
00000000`0272f0f0 00000000`6b006a5a wow64cpu!ServiceNoTurbo+0x28
00000000`0272f180 00000000`6b005e0d wow64!RunCpuSimulation+0xa
00000000`0272f1b0 00000000`77f109f0 wow64!Wow64LdrpInitialize+0x2ed
00000000`0272f6f0 00000000`77ef30a5 ntdll!LdrpInitialize+0x2aa
00000000`0272f7d0 00000000`7d4d1504 ntdll!KiUserApcDispatcher+0x15
00000000`0272fcc8 00000000`00000000 kernel32!BaseThreadStartThunk
00000000`0272fcd0 00000000`00000000 0x0
00000000`0272fcd8 00000000`00000000 0x0
00000000`0272fce0 00000000`00000000 0x0
00000000`0272fce8 00000000`00000000 0x0
00000000`0272fcf0 00000000`00000000 0x0
00000000`0272fcf8 00000000`00000000 0x0
00000000`0272fd00 00010003`00000000 0x0
00000000`0272fd08 00000000`00000000 0x10003`00000000
00000000`0272fd10 00000000`00000000 0x0
00000000`0272fd18 00000000`00000000 0x0
00000000`0272fd20 00000000`00000000 0x0

   3  Id: 2088.2d04 Suspend: 1 Teb: 00000000`7efad000 Unfrozen
Child-SP          RetAddr           Call Site
00000000`0289f108 00000000`78b84191 wow64cpu!CpupSyscallStub+0x9
00000000`0289f110 00000000`6b006a5a wow64cpu!Thunk2ArgNSpNSpReloadState+0x21
00000000`0289f180 00000000`6b005e0d wow64!RunCpuSimulation+0xa
00000000`0289f1b0 00000000`77f109f0 wow64!Wow64LdrpInitialize+0x2ed
00000000`0289f6f0 00000000`77ef30a5 ntdll!LdrpInitialize+0x2aa
00000000`0289f7d0 00000000`7d4d1504 ntdll!KiUserApcDispatcher+0x15
00000000`0289fcc8 00000000`00000000 kernel32!BaseThreadStartThunk
00000000`0289fcd0 00000000`00000000 0x0
00000000`0289fcd8 00000000`00000000 0x0
00000000`0289fce0 00000000`00000000 0x0
00000000`0289fce8 00000000`00000000 0x0
00000000`0289fcf0 00000000`00000000 0x0
00000000`0289fcf8 00000000`00000000 0x0
00000000`0289fd00 0001002f`00000000 0x0
00000000`0289fd08 00000000`00000000 0x1002f`00000000
00000000`0289fd10 00000000`00000000 0x0
00000000`0289fd18 00000000`00000000 0x0
00000000`0289fd20 00000000`00000000 0x0
00000000`0289fd28 00000000`00000000 0x0
00000000`0289fd30 00000000`00000000 0x0

   4  Id: 2088.15c4 Suspend: 1 Teb: 00000000`7efa4000 Unfrozen
Child-SP          RetAddr           Call Site
00000000`02def0a8 00000000`6b006a5a wow64cpu!RemoveIoCompletionFault+0x41
00000000`02def180 00000000`6b005e0d wow64!RunCpuSimulation+0xa
00000000`02def1b0 00000000`77f109f0 wow64!Wow64LdrpInitialize+0x2ed
00000000`02def6f0 00000000`77ef30a5 ntdll!LdrpInitialize+0x2aa
00000000`02def7d0 00000000`7d4d1504 ntdll!KiUserApcDispatcher+0x15
00000000`02defcc8 00000000`00000000 kernel32!BaseThreadStartThunk
00000000`02defcd0 00000000`00000000 0x0
00000000`02defcd8 00000000`00000000 0x0
00000000`02defce0 00000000`00000000 0x0
00000000`02defce8 00000000`00000000 0x0
00000000`02defcf0 00000000`00000000 0x0
00000000`02defcf8 00000000`00000000 0x0
00000000`02defd00 0001002f`00000000 0x0
00000000`02defd08 00000000`00000000 0x1002f`00000000
00000000`02defd10 00000000`00000000 0x0
00000000`02defd18 00000000`00000000 0x0
00000000`02defd20 00000000`00000000 0x0
00000000`02defd28 00000000`00000000 0x0
00000000`02defd30 00000000`00000000 0x0
00000000`02defd38 00000000`00000000 0x0

This is a clear indication that the process was in fact 32-bit but the dump is 64-bit. This situation is depicted in one of my earlier posts last year:

Dumps, Debuggers and Virtualization

and we need a debugger plugin to understand virtualized CPU architecture:

Dumps, Debuggers and Virtualization refined

This crash dump pattern can be called Virtualized Process. In our case we need to load wow64exts.dll WinDbg extension and set the target processor mode to x86 by using .effmach command

0:000> .load wow64exts
0:000> .effmach x86
Effective machine: x86 compatible (x86)

Then analysis gives us more meaningful results:

0:000:x86> !analyze -v
*********************************************************
*                                                       *
*                Exception Analysis                     *
*                                                       *
*********************************************************

FAULTING_IP:
kernel32!RaiseException+53
00000000`7d4e2366 5e              pop     esi

EXCEPTION_RECORD:  ffffffffffffffff -- (.exr 0xffffffffffffffff)
ExceptionAddress: 000000007d4e2366 (kernel32!RaiseException+0x0000000000000053)
   ExceptionCode: 000006d9
  ExceptionFlags: 00000001
NumberParameters: 0

BUGCHECK_STR:  6d9

DEFAULT_BUCKET_ID:  APPLICATION_FAULT

PROCESS_NAME:  App.exe

ERROR_CODE: (NTSTATUS) 0x6d9 - There are no more endpoints available from the endpoint mapper.

NTGLOBALFLAG:  0

APPLICATION_VERIFIER_FLAGS:  0

LAST_CONTROL_TRANSFER:  from 000000007da4a631 to 000000007d4e2366

STACK_TEXT:
0012d98c 7da4a631 000006d9 00000001 00000000 kernel32!RaiseException+0x53
0012d9a4 7da4a5f7 000006d9 0012ddc4 0012dda0 rpcrt4!RpcpRaiseException+0x24
0012d9b4 7dac0140 0012da00 00000018 0024a860 rpcrt4!NdrGetBuffer+0x46
0012dda0 5f2a2fba 5f2777b8 5f2771e2 0012ddc4 rpcrt4!NdrClientCall2+0x197
0012ddbc 5f29c6a6 0024a860 00000370 00000000 hnetcfg!FwOpenDynamicFwPort+0x1d
0012de68 7db4291f 0024a860 00000370 00000002 hnetcfg!IcfOpenDynamicFwPort+0x6a
0012df00 71c043db 00000370 0012df4c 00000010 mswsock!WSPBind+0x2e3
WARNING: Frame IP not in any known module. Following frames may be wrong.
0012df24 76ed91c8 00000370 0012df4c 00000010 ws2_32+0x43db
0012df6c 76ed9128 00000370 00000002 00000000 rasapi32+0x491c8
0012df98 76ed997c 00000002 00000002 00000000 rasapi32+0x49128
0012dfc0 76ed8ac2 00000002 02c225f8 02c22688 rasapi32+0x4997c
0012dfd4 76ed89cd 02c225f8 02c22ce8 02c22f38 rasapi32+0x48ac2
0012dff0 76ed82e5 02c226b2 02c22f38 00000000 rasapi32+0x489cd
0012e010 76ed827f 02c225f8 02c22ce8 02c22f38 rasapi32+0x482e5
0012e044 76ed8bf0 02c225f8 00000000 00000000 rasapi32+0x4827f
0012e0c8 76ed844d 02c225f8 00000000 04000000 rasapi32+0x48bf0
0012e170 76ed74b5 04000000 0000232b 02c21f58 rasapi32+0x4844d
0012e200 76ed544f 02c21f58 00000000 02c21f58 rasapi32+0x474b5
0012e22c 76ed944d 00000001 00000000 02c21f58 rasapi32+0x4544f
0012e24c 76ed93a4 00000000 00000000 00000000 rasapi32+0x4944d
0012e298 76ed505f 02c22c9c 00000001 04000000 rasapi32+0x493a4
0012e2bc 7db442bf 02c22c9c 00000001 04000000 rasapi32+0x4505f
0012e2ec 7db4418b 02c22c9c 00000001 04000000 mswsock!SaBlob_Query+0x2d
0012e330 7db4407c 00000000 020243f0 020243d8 mswsock!Rnr_DoDnsLookup+0xf0
0012e5c8 71c06dc0 02c22c38 00000000 0012e680 mswsock!Dns_NSPLookupServiceNext+0x24b
0012e5e0 71c06da0 02024498 02c22c38 00000000 ws2_32+0x6dc0
0012e5fc 71c06d6a 02024560 00000000 0012e680 ws2_32+0x6da0
0012e628 71c06d08 020243f0 00000000 0012e680 ws2_32+0x6d6a
0012e648 71c08282 020243d8 00000000 0012e680 ws2_32+0x6d08
0012ef00 71c07f68 02024c98 00000002 0012ef74 ws2_32+0x8282
0012ef34 71c08433 02024c98 00000001 0012ef74 ws2_32+0x7f68
0012efa0 71c03236 02024c98 00000000 00000000 ws2_32+0x8433
0012f094 71c03340 02024c98 00000000 0012f148 ws2_32+0x3236
0012f0bc 7dab22fb 0012f0d4 00000000 0012f148 ws2_32+0x3340
0012f11c 7dab3a0e 0012f184 0026fff0 0026836c rpcrt4!IP_ADDRESS_RESOLVER::NextAddress+0x13e
0012f238 7dab3c11 0026836c 00272f38 00000402 rpcrt4!TCPOrHTTP_Open+0xdb
0012f270 7da44c85 0026836c 00271d38 00272f38 rpcrt4!TCP_Open+0x55
0012f2b8 7da44b53 00000000 00271d38 00272f38 rpcrt4!OSF_CCONNECTION::TransOpen+0x5e
0012f31c 7da447d7 0026fff0 000dbba0 00000000 rpcrt4!OSF_CCONNECTION::OpenConnectionAndBind+0xbe
0012f360 7da44720 00000000 0012f414 00000000 rpcrt4!OSF_CCALL::BindToServer+0xfa
0012f378 7da3a9df 00000000 00000000 00000000 rpcrt4!OSF_BINDING_HANDLE::InitCCallWithAssociation+0x63
0012f3f4 7da3a8dd 0012f414 0012f480 00270028 rpcrt4!OSF_BINDING_HANDLE::AllocateCCall+0x49d
0012f428 7da37a1c 0012f480 0012f4ac 00000001 rpcrt4!OSF_BINDING_HANDLE::NegotiateTransferSyntax+0x2e
0012f440 7da3642c 0012f480 00000000 0012f460 rpcrt4!I_RpcGetBufferWithObject+0x5b
0012f450 7da37bff 0012f480 0012f86c 0012f84c rpcrt4!I_RpcGetBuffer+0xf
0012f460 7dac0140 0012f4ac 0000006c 0026fff0 rpcrt4!NdrGetBuffer+0x2e
0012f84c 766f41f1 766f24e8 766f423a 0012f86c rpcrt4!NdrClientCall2+0x197
0012f864 766f40b8 0026fff0 0012f904 0012f8e4 ntdsapi!_IDL_DRSBind+0x1c
0012f930 7d8ecaa2 002788bc 00278908 00000000 ntdsapi!DsBindWithSpnExW+0x223
0012f9b0 7d8ed028 00000000 02264f90 00000000 secur32!SecpTranslateName+0x1f3
0012f9d0 00434aa0 02264f90 00000000 00000002 secur32!TranslateNameW+0x2d
0012fab4 00419a7f 022a85e4 0012fc94 afec94eb App+0x34aa0
0012fb0c 0041a61b afec9443 ffffffff 00463fb8 App+0x19a7f
0012fbc0 0045a293 ffffffff 0043682f 004188f3 App+0x1a61b
0012fbc8 0043682f 004188f3 afec93eb ffffffff App+0x5a293
0012fbcc 004188f3 afec93eb ffffffff 00463fb0 App+0x3682f
0043682f 00000000 7459c085 7000c707 c3004645 App+0x188f3

STACK_COMMAND:  kb

FOLLOWUP_IP:
hnetcfg!FwOpenDynamicFwPort+1d
00000000`5f2a2fba 83c40c          add     esp,0Ch

SYMBOL_STACK_INDEX:  4

SYMBOL_NAME:  hnetcfg!FwOpenDynamicFwPort+1d

FOLLOWUP_NAME:  MachineOwner

MODULE_NAME: hnetcfg

IMAGE_NAME:  hnetcfg.dll

DEBUG_FLR_IMAGE_TIMESTAMP:  45d6cc2a

FAULTING_THREAD:  0000000000002fe4

FAILURE_BUCKET_ID:  X64_6d9_hnetcfg!FwOpenDynamicFwPort+1d

BUCKET_ID:  X64_6d9_hnetcfg!FwOpenDynamicFwPort+1d

Followup: MachineOwner
---------

0:000:x86> ~*k

.  0  Id: 2088.2fe4 Suspend: 1 Teb: 00000000`7efdb000 Unfrozen
ChildEBP          RetAddr
0012d98c 7da4a631 kernel32!RaiseException+0x53
0012d9a4 7da4a5f7 rpcrt4!RpcpRaiseException+0x24
0012d9b4 7dac0140 rpcrt4!NdrGetBuffer+0x46
0012dda0 5f2a2fba rpcrt4!NdrClientCall2+0x197
0012ddbc 5f29c6a6 hnetcfg!FwOpenDynamicFwPort+0x1d
0012de68 7db4291f hnetcfg!IcfOpenDynamicFwPort+0x6a
0012df00 71c043db mswsock!WSPBind+0x2e3
WARNING: Frame IP not in any known module. Following frames may be wrong.
0012df24 76ed91c8 ws2_32+0x43db
0012df6c 76ed9128 rasapi32+0x491c8
0012df98 76ed997c rasapi32+0x49128
0012dfc0 76ed8ac2 rasapi32+0x4997c
0012dfd4 76ed89cd rasapi32+0x48ac2
0012dff0 76ed82e5 rasapi32+0x489cd
0012e010 76ed827f rasapi32+0x482e5
0012e044 76ed8bf0 rasapi32+0x4827f
0012e0c8 76ed844d rasapi32+0x48bf0
0012e170 76ed74b5 rasapi32+0x4844d
0012e200 76ed544f rasapi32+0x474b5
0012e22c 76ed944d rasapi32+0x4544f
0012e24c 76ed93a4 rasapi32+0x4944d

   1  Id: 2088.280c Suspend: 1 Teb: 00000000`7efd8000 Unfrozen
ChildEBP          RetAddr
01fcfea4 7d63f501 ntdll_7d600000!NtWaitForMultipleObjects+0x15
01fcff48 7d63f988 ntdll_7d600000!EtwpWaitForMultipleObjectsEx+0xf7
01fcffb8 7d4dfe21 ntdll_7d600000!EtwpEventPump+0x27f
01fcffec 00000000 kernel32!BaseThreadStart+0x34

   2  Id: 2088.1160 Suspend: 1 Teb: 00000000`7efd5000 Unfrozen
ChildEBP          RetAddr
026eff50 0042f13b user32!NtUserGetMessage+0x15
WARNING: Stack unwind information not available. Following frames may be wrong.
026effb8 7d4dfe21 App+0x2f13b
026effec 00000000 kernel32!BaseThreadStart+0x34

   3  Id: 2088.2d04 Suspend: 1 Teb: 00000000`7efad000 Unfrozen
ChildEBP          RetAddr
0285ffa0 7d634d69 ntdll_7d600000!ZwDelayExecution+0x15
0285ffb8 7d4dfe21 ntdll_7d600000!RtlpTimerThread+0x47
0285ffec 00000000 kernel32!BaseThreadStart+0x34

   4  Id: 2088.15c4 Suspend: 1 Teb: 00000000`7efa4000 Unfrozen
ChildEBP          RetAddr
02daff80 7db4b6c6 ntdll_7d600000!NtRemoveIoCompletion+0x15
02daffb8 7d4dfe21 mswsock!SockAsyncThread+0x69
02daffec 00000000 kernel32!BaseThreadStart+0x34

- Dmitry Vostokov @ DumpAnalysis.org -

Crash Dump Analysis Patterns (Part 25)

Monday, September 10th, 2007

The most important pattern that is used for problem identification and resolution is Stack Trace. Consider the following fragment of !analyze -v output from w3wp.exe crash dump:

STACK_TEXT:
WARNING: Frame IP not in any known module. Following frames may be wrong.
1824f90c 5a39f97e 01057b48 01057bd0 5a3215b4 0x0
1824fa50 5a32cf7c 01057b48 00000000 79e651c0 w3core!ISAPI_REQUEST::SendResponseHeaders+0x5d
1824fa78 5a3218ad 01057bd0 79e651c0 79e64d9c w3isapi!SSFSendResponseHeader+0xe0
1824fae8 79e76127 01057bd0 00000003 79e651c0 w3isapi!ServerSupportFunction+0x351
1824fb0c 79e763a3 80000411 00000000 00000000 aspnet_isapi!HttpCompletion::ReportHttpError+0x3a
1824fd50 79e761c3 34df6cf8 79e8e42f 79e8e442 aspnet_isapi!HttpCompletion::ProcessRequestInManagedCode+0x1d1
1824fd5c 79e8e442 34df6cf8 00000000 00000000 aspnet_isapi!HttpCompletion::ProcessCompletion+0x24
1824fd70 791d6211 34df6cf8 18e60110 793ee0d8 aspnet_isapi!CorThreadPoolWorkitemCallback+0x13
1824fd84 791d616a 18e60110 00000000 791d60fa mscorsvr!ThreadpoolMgr::ExecuteWorkRequest+0x19
1824fda4 791fe95c 00000000 8083d5c7 00000000 mscorsvr!ThreadpoolMgr::WorkerThreadStart+0x129
1824ffb8 77e64829 17bb9c18 00000000 00000000 mscorsvr!ThreadpoolMgr::intermediateThreadProc+0x44
1824ffec 00000000 791fe91b 17bb9c18 00000000 kernel32!BaseThreadStart+0x34

Ignoring the first 5 numeric columns gives us the following trace:

0x0
w3core!ISAPI_REQUEST::SendResponseHeaders+0x5d
w3isapi!SSFSendResponseHeader+0xe0
w3isapi!ServerSupportFunction+0x351
aspnet_isapi!HttpCompletion::ReportHttpError+0x3a
aspnet_isapi!HttpCompletion::ProcessRequestInManagedCode+0x1d1
aspnet_isapi!HttpCompletion::ProcessCompletion+0x24
aspnet_isapi!CorThreadPoolWorkitemCallback+0x13
mscorsvr!ThreadpoolMgr::ExecuteWorkRequest+0x19
mscorsvr!ThreadpoolMgr::WorkerThreadStart+0x129
mscorsvr!ThreadpoolMgr::intermediateThreadProc+0x44
kernel32!BaseThreadStart+0x34

or in general we have something like this:

moduleA!functionX+offsetN
moduleB!functionY+offsetM
...
...
...

Sometimes function names are not available or offsets are very big like 0×2380. If this is the case then we probably don’t have symbol files for moduleA and moduleB:

moduleA+offsetN
moduleB+offsetM
...
...
...

Usually there is some kind of a database of previous issues we can use to match moduleA!functionX+offsetN against. If there is no such match we can try functionX+offsetN, moduleA!functionX or just functionX. If there is no such match again we can try the next signature, moduleB!functionY+offsetM, and moduleB!functionY, etc. Usually, the further down the trace the less useful the signature is for problem resolution. For example, mscorsvr!ThreadpoolMgr::WorkerThreadStart+0x129 will probably match many issues because this signature is common for many ASP.NET applications.

If there is no match in internal databases we can try Google. For our example, Google search for SendResponseHeaders+0x5d gives the following search results:

Browsing search results reveals the following discussion:

http://groups.google.com/group/microsoft.public.inetserver.iis/ browse_frm/thread/34bc2be635b26531?tvc=1 

which can be found directly by searching Google groups:

 

Another example from BSOD complete memory dump. Analysis command has the following output (stripped for clarity):

MODE_EXCEPTION_NOT_HANDLED (1e)
This is a very common bugcheck. Usually the exception address pinpoints the driver/function that caused the problem. Always note this address as well as the link date of the driver/image that contains this address.
Arguments:
Arg1: c0000005, The exception code that was not handled
Arg2: bff90ca3, The address that the exception occurred at
Arg3: 00000000, Parameter 0 of the exception
Arg4: 00000000, Parameter 1 of the exception

TRAP_FRAME: bdf80834 -- (.trap ffffffffbdf80834)
ErrCode = 00000000
eax=00000000 ebx=bdf80c34 ecx=89031870 edx=88096928 esi=88096928 edi=8905e7f0
eip=bff90ca3 esp=bdf808a8 ebp=bdf80a44 iopl=0 nv up ei ng nz na po nc
cs=0008 ss=0010 ds=0023 es=0023 fs=0030 gs=0000 efl=00010282
tsmlvsa+0xfca3:
bff90ca3 8b08 mov ecx,dword ptr [eax] ds:0023:00000000=????????
Resetting default scope

STACK_TEXT:
bdf807c4 80467a15 bdf807e0 00000000 bdf80834 nt!KiDispatchException+0x30e
bdf8082c 804679c6 00000000 bdf80860 804d9f69 nt!CommonDispatchException+0x4d
bdf80838 804d9f69 00000000 00000005 e56c6946 nt!KiUnexpectedInterruptTail+0x207
00000000 00000000 00000000 00000000 00000000 nt!ObpAllocateObject+0xe1

Because the crash point tsmlvsa+0xfca3 is not on the stack trace we use .trap command:

1: kd> .trap ffffffffbdf80834
ErrCode = 00000000
eax=00000000 ebx=bdf80c34 ecx=89031870 edx=88096928 esi=88096928 edi=8905e7f0
eip=bff90ca3 esp=bdf808a8 ebp=bdf80a44 iopl=0 nv up ei ng nz na po nc
cs=0008 ss=0010 ds=0023 es=0023 fs=0030 gs=0000 efl=00010282
tsmlvsa+0xfca3:
bff90ca3 8b08 mov ecx,dword ptr [eax] ds:0023:00000000=????????

1: kd> k
*** Stack trace for last set context - .thread/.cxr resets it
ChildEBP RetAddr
WARNING: Stack unwind information not available. Following frames may be wrong.
00000000 bdf80afc tsmlvsa+0xfca3
89080c00 00000040 nt!ObpLookupObjectName+0x504
00000000 00000001 nt!ObOpenObjectByName+0xc5
c0100080 0012b8d8 nt!IopCreateFile+0x407
c0100080 0012b8d8 nt!IoCreateFile+0x36
c0100080 0012b8d8 nt!NtCreateFile+0x2e
c0100080 0012b8d8 nt!KiSystemService+0xc9
c0100080 0012b8d8 ntdll!NtCreateFile+0xb
c0000000 00000000 KERNEL32!CreateFileW+0x343

1: kd> lmv m tsmlvsa
bff81000 bff987c0 tsmlvsa (no symbols)
Loaded symbol image file: tsmlvsa.sys
Image path: tsmlvsa.sys
Image name: tsmlvsa.sys
Timestamp: Thu Mar 18 06:18:51 2004 (40593F4B)
CheckSum: 0002D102
ImageSize: 000177C0
Translations: 0000.04b0 0000.04e0 0409.04b0 0409.04e0

Google search for tsmlvsa+0xfca3 fails but if we search just for tsmlvsa we get the first link towards problem resolution:

http://www-1.ibm.com/support/docview.wss?uid=swg1IC40964

- Dmitry Vostokov @ DumpAnalysis.org -

Minidump Analysis (Part 2)

Thursday, September 6th, 2007

In the first part I outlined WinDbg commands we can use to get various information from system minidumps. In this part I will show how to find a 3rd-party driver that might have been responsible for system crash and verify that WinDbg reports that driver correctly.

For example, one of the log files shows the following !analyze -v results:

command> !analyze -v
*************************************************************
*                                                           *
*                     Bugcheck Analysis                     *
*                                                           *
*************************************************************

BAD_POOL_CALLER (c2)
The current thread is making a bad pool request. Typically this is at a bad IRQL level or double freeing the same allocation, etc.
Arguments:
Arg1: 00000047, Attempt to free a non-allocated nonpaged pool address
Arg2: 85083000, Starting address
Arg3: 00005083, physical page frame
Arg4: 000bfff9, highest physical page frame

Debugging Details:
------------------
BUGCHECK_STR:  0xc2_47

CUSTOMER_CRASH_COUNT:  2

DEFAULT_BUCKET_ID:  DRIVER_FAULT_SERVER_MINIDUMP

PROCESS_NAME:  3rdPartyAntivirus.exe

CURRENT_IRQL:  0

LAST_CONTROL_TRANSFER:  from 8054d2eb to 805435b9

STACK_TEXT:
b325db68 8054d2eb 000000c2 00000047 85083000 nt!KeBugCheckEx+0x19
b325db94 805689c2 85083000 00080000 85083000 nt!MmGetSizeOfBigPoolAllocation+0x1cb
b325dbe4 f77c8098 00000000 00000000 00000004 nt!ExFreePoolWithTag+0x1d0
WARNING: Stack unwind information not available. Following frames may be wrong.
b325dc24 f77c8234 8599d4e0 8599d4e0 88cecf38 3rdPartyAVDrv+0×1098
b325dc3c 804f04f3 871d2cf0 00004808 8743b6a0 3rdPartyAVDrv+0×1234
b325dc4c 80585208 8599d550 872c8028 8599d4e0 nt!IofCallDriver+0×3f
b325dc60 80585fe6 871d2cf0 8599d4f0 872c8028 nt!IopSynchronousServiceTail+0×6f
b325dd00 80586028 00000468 00000614 00000000 nt!IopXxxControlFile+0×607
b325dd34 804dfd24 00000468 00000614 00000000 nt!NtDeviceIoControlFile+0×28
b325dd34 7ffe0304 00000468 00000614 00000000 nt!KiSystemService+0xd0
0087b6c0 00000000 00000000 00000000 00000000 SharedUserData!SystemCallStub+0×4
STACK_COMMAND:  kb

FOLLOWUP_IP:
3rdPartyAVDrv+1098
f77c8098 ??              ???

SYMBOL_STACK_INDEX:  3

SYMBOL_NAME:  3rdPartyAVDrv+1098

FOLLOWUP_NAME:  MachineOwner

MODULE_NAME: 3rdPartyAVDrv

IMAGE_NAME:  3rdPartyAVDrv.sys

DEBUG_FLR_IMAGE_TIMESTAMP:  410752c5

FAILURE_BUCKET_ID:  0xc2_47_3rdPartyAVDrv+1098

BUCKET_ID:  0xc2_47_3rdPartyAVDrv+1098

Followup: MachineOwner
---------

MODULE_NAME and IMAGE_NAME report 3rd-party antivirus driver 3rdPartyAVDrv.sys responsible for BSOD. However the top lines from STACK_TEXT report nt module, Windows NT kernel and system. We can get information about all loaded drivers from the output of lmv command:

command> lmv
start    end        module name
804de000 80744000   nt       # (pdb symbols)          c:\...\ntkrnlmp.pdb\...\ntkrnlmp.pdb
    Loaded symbol image file: ntkrnlmp.exe
    Mapped memory image file: c:\...\ntkrnlmp.exe\...\ntkrnlmp.exe
    Image path: ntkrnlmp.exe
    Image name: ntkrnlmp.exe
    Timestamp:        Tue Mar 25 08:39:34 2003 (3E8015C6)
    CheckSum:         00254553
    ImageSize:        00266000
    File version:     5.2.3790.0
    Product version:  5.2.3790.0
    File flags:       0 (Mask 3F)
    File OS:          40004 NT Win32
    File type:        1.0 App
    File date:        00000000.00000000
    Translations:     0415.04b0
    CompanyName:      Microsoft Corporation
    ProductName:      Microsoft® Windows® Operating System
    InternalName:     ntkrnlmp.exe
    OriginalFilename: ntkrnlmp.exe
    ProductVersion:   5.2.3790.0
    FileVersion:      5.2.3790.0 (srv03_rtm.030324-2048)
    FileDescription:  NT Kernel & System
    LegalCopyright:   © Microsoft Corporation. All rights reserved.
...
...
...
f77c7000 f77cf000   3rdPartyAVDrv 3rdPartyAVDrv.sys
    Loaded symbol image file: 3rdPartyAVDrv.sys
    Symbol file: 3rdPartyAVDrv.sys
    Image path: 3rdPartyAVDrv.sys
    Timestamp:        Wed Jul 28 07:15:21 2004 (410752C5)
    CheckSum:         00011518
    ImageSize:        00008000
    Translations:     0000.04b0 0000.04e0 0409.04b0 0409.04e0
...
...
...

Why WinDbg skips Microsoft module and points to the 3rd-party one? Because, when WinDbg encounters a non-Microsoft driver, it always shows it as the possible cause. Then it looks at triage.ini file located in triage folder under Debugging Tools for Windows installation folder where certain modules and functions are listed with appropriate actions for WinDbg, for example:

nt!KeBugCheck*=ignore
nt!ExAllocatePool=Pool_corruption

Let’s add the following entry at the end of that INI file:

3rdPartyAVDrv!*=ignore

Now, if we load the dump in WinDbg, set the symbols and run the analysis, we get different results:

3: kd> .symfix
No downstream store given, using C:\Program Files\Debugging Tools for Windows\sym
3: kd> !analyze -v
...
...
...
FOLLOWUP_NAME:  MachineOwner

MODULE_NAME: nt

DEBUG_FLR_IMAGE_TIMESTAMP:  3e8015c6

SYMBOL_NAME:  nt!MmGetSizeOfBigPoolAllocation+1cb

IMAGE_NAME:  memory_corruption

FAILURE_BUCKET_ID:  0xc2_47_nt!MmGetSizeOfBigPoolAllocation+1cb

BUCKET_ID:  0xc2_47_nt!MmGetSizeOfBigPoolAllocation+1cb

Followup: MachineOwner
---------

Because nt!MmGetSizeOfBigPoolAllocation is not listed in triage.ini WinDbg reports nt module and memory corruption. The latter cause is probably inferred from either BAD_POOL_CALLER bugcheck name or Mm function prefix.

Let’s add more lines to tiage.ini:

3rdPartyAVDrv!*=ignore
nt!MmGetSizeOfBigPoolAllocation=ignore
nt!ExFreePoolWithTag=Dynamic memory corruption detected when freeing memory 

Now the analysis reports our custom follow up message:

3: kd> !analyze -v
...
...
...
FOLLOWUP_IP:
nt!ExFreePoolWithTag+1d0
805689c2 e9c8f0ffff      jmp     nt!ExFreePoolWithTag+0x1d0 (80567a8f)

SYMBOL_STACK_INDEX:  2

FOLLOWUP_NAME:  Dynamic memory corruption detected when freeing memory

MODULE_NAME: nt

IMAGE_NAME:  ntkrnlmp.exe

DEBUG_FLR_IMAGE_TIMESTAMP:  3e8015c6

SYMBOL_NAME:  nt!ExFreePoolWithTag+1d0

FAILURE_BUCKET_ID:  0xc2_47_nt!ExFreePoolWithTag+1d0

BUCKET_ID:  0xc2_47_nt!ExFreePoolWithTag+1d0

Followup: Dynamic memory corruption detected when freeing memory
---------

Let’s look at STACK_TEXT data:

STACK_TEXT:
b325db68 8054d2eb 000000c2 00000047 85083000 nt!KeBugCheckEx+0x19
b325db94 805689c2 85083000 00080000 85083000 nt!MmGetSizeOfBigPoolAllocation+0x1cb
b325dbe4 f77c8098 00000000 00000000 00000004 nt!ExFreePoolWithTag+0×1d0
WARNING: Stack unwind information not available. Following frames may be wrong.
b325dc24 f77c8234 8599d4e0 8599d4e0 88cecf38 3rdPartyAVDrv+0×1098
b325dc3c 804f04f3 871d2cf0 00004808 8743b6a0 3rdPartyAVDrv+0×1234
b325dc4c 80585208 8599d550 872c8028 8599d4e0 nt!IofCallDriver+0×3f
b325dc60 80585fe6 871d2cf0 8599d4f0 872c8028 nt!IopSynchronousServiceTail+0×6f
b325dd00 80586028 00000468 00000614 00000000 nt!IopXxxControlFile+0×607
b325dd34 804dfd24 00000468 00000614 00000000 nt!NtDeviceIoControlFile+0×28
b325dd34 7ffe0304 00000468 00000614 00000000 nt!KiSystemService+0xd0
0087b6c0 00000000 00000000 00000000 00000000 SharedUserData!SystemCallStub+0×4

We see that WinDbg is not sure that it correctly identified 3rdPartyAVDrv module. We can check this manually by subtracting 0×1098 from f77c8098 (shown in blue above). The latter address is the so called return address saved when nt!ExFreePoolWithTag was called and can located in the second column. We can see that it falls within 3rdPartyAVDrv address range and therefore the stack looks correct:

3: kd> ? f77c8098-0x1098
Evaluate expression: -142839808 = f77c7000

3: kd> lmv m 3rdPartyAVDrv
f77c7000 f77cf000   3rdPartyAVDrv 3rdPartyAVDrv.sys
    Loaded symbol image file: 3rdPartyAVDrv.sys
    Symbol file: 3rdPartyAVDrv.sys
    Image path: 3rdPartyAVDrv.sys
    Browse all global symbols  functions  data
    Timestamp:        Wed Jul 28 07:15:21 2004 (410752C5)
    CheckSum:         00011518
    ImageSize:        00008000
    Translations:     0000.04b0 0000.04e0 0409.04b0 0409.04e0

The driver is dated Jul, 2004 and therefore we can try either to disable it or contract the vendor for any updates. The “Timestamp” refers to the time when the driver was built at the vendor software factory and not the time when it was installed. This is a way to identify the version of the software if no more information except the time is present in the output of lmv command.

Also note the PROCESS_NAME field in the analysis output. It is a process that was active at the time of the system failure. The application that the process belongs to might be from the same vendor as the suspicious driver but usually they are independent, for example, if there is a bug in a display driver it might manifest when any application is running.

Sometimes stack trace (STACK_TEXT) starts with a warning like in this example:

STACK_TEXT:
WARNING: Stack unwind information not available. Following frames may be wrong.
b4f528b0 b4f52904 e24079e0 000000ab 000003cb DisplayDriver+0x21bca6
b4f528b4 e24079e0 000000ab 000003cb 43f0027f 0xb4f52904
b4f52904 00000000 000003cb 0000027a 00000135 0xe24079e0

In such cases we can look at a crash point (FAULTING_IP) and notice the module there:

FAULTING_IP:
DisplayDriver+21bca6
bfbf0ca6 8b3e            mov     edi,dword ptr [esi]

0: kd> lmv m DisplayDriver
start    end        module name
bf9d5000 bff42500   DisplayDriver T (no symbols)
    Loaded symbol image file: DisplayDriver.dll
    Image path: DisplayDriver.dll
    Image name: DisplayDriver.dll
    Timestamp:        Fri Jun 29 09:13:08 2007 (4684BF14)
    CheckSum:         00570500
    ImageSize:        0056D500
    Translations:     0000.04b0 0000.04e0 0409.04b0 0409.04e0

- Dmitry Vostokov @ DumpAnalysis.org -

Crash Dump Analysis Patterns (Part 24)

Thursday, August 30th, 2007

Raw stack dumps can be useful for finding any suspicious modules that might have caused the problem. For example, it is common for some programs to install hooks to monitor GUI changes, intercept window messages to provide value added services on top of the existing applications. These hooks are implemented as DLLs. Another use would be to examine raw stack data for printer drivers that caused problems before. The fact that these modules had been loaded doesn’t mean that they were used. If we find references to their code it would mean that they might have been used.

However when looking at raw stack dump with symbol information we should be aware of Coincidental Symbolic Information pattern. Here is the first example. Loading the crash dump and displaying the problem thread stack shows the following reference:

...
...
...
00b1ed00  0063006f
00b1ed04  006d0075
00b1ed08  006e0065
00b1ed0c  00200074
00b1ed10  006f004c
00b1ed14  00640061
00b1ed18  00720065
00b1ed1c  005b0020
00b1ed20  00500055
00b1ed24  003a0044
00b1ed28  00430050 Application!Array::operator=+0x2f035
00b1ed2c  0035004c
00b1ed30  005d0063
00b1ed34  00630000
00b1ed38  0000005d
...
...
...

Applying symbols gives us more meaningful name:

...
...
...
00b1ed00  0063006f
00b1ed04  006d0075
00b1ed08  006e0065
00b1ed0c  00200074
00b1ed10  006f004c
00b1ed14  00640061
00b1ed18  00720065
00b1ed1c  005b0020
00b1ed20  00500055
00b1ed24  003a0044
00b1ed28  00430050 Application!Print::DocumentLoad+0x5f
00b1ed2c  0035004c
00b1ed30  005d0063
00b1ed34  00630000
...
...
...

However this is the pure coincidence. The data pattern 00NN00NN clearly belongs to a Unicode string:

0:020> du 00b1ed00
00b1ed00  "ocument Loader [UPD:PCL5c]"

It just happens that 00430050 value can be interpreted as an address that falls into Application module address range and its code section:

0:020> lm
start    end        module name
00400000 0044d000   Application

In the second example, the crash dump is from some 3rd-party application called AppSql for which we don’t have PDB files. Also we know that myhook.dll is installed as a system wide hook and had some problems in the past. It is loaded into any address space but is not necessarily used. We want to see if there are traces of it on the problem thread stack. Dumping stack contents shows us the only one reference:

...
...
...
00118cb0  37302f38
00118cb4  00000000
00118cb8  10008e00 myhook!notify_me+0×22c
00118cbc  01400000
00118cc0  00118abc
00118cc4  06a129f0
00118cc8  00118d04
00118ccc  02bc57d0
00118cd0  04ba5d74
00118cd4  00118d30
00118cd8  0000001c
00118cdc  00000010
00118ce0  075922bc
00118ce4  04a732e0
00118ce8  075922bc
00118cec  04a732e0
00118cf0  0066a831 AppSql+0×26a831
00118cf4  04a732d0
00118cf8  02c43190
00118cfc  00000001
00118d00  0000001c
00118d04  00118d14
00118d08  0049e180 AppSql+0×9e180
00118d0c  02c43190
00118d10  0000001c
00118d14  00118d34


0:020> lm
start    end        module name
00400000 00ba8000   AppSql
...
...
...
10000000 100e0000   myhook

The address 10008e00 looks very “round” and it might be the set of bit flags and also, if we disassemble the code at this address backwards, we don’t see the usual call instruction that saved that address on the stack:

0:000> ub 10008e00
myhook!notify_me+0x211
10008de5 81c180000000    add     ecx,80h
10008deb 899578ffffff    mov     dword ptr [ebp-88h],edx
10008df1 89458c          mov     dword ptr [ebp-74h],eax
10008df4 894d98          mov     dword ptr [ebp-68h],ecx
10008df7 6a01            push    1
10008df9 8d45ec          lea     eax,[ebp-14h]
10008dfc 50              push    eax
10008dfd ff75e0          push    dword ptr [ebp-20h]

In contrast, the other two addresses are return addresses saved on the stack:

0:000> ub 0066a831
AppSql+0x26a81e:
0066a81e 8bfb            mov     edi,ebx
0066a820 f3a5            rep movs dword ptr es:[edi],dword ptr [esi]
0066a822 8bca            mov     ecx,edx
0066a824 83e103          and     ecx,3
0066a827 f3a4            rep movs byte ptr es:[edi],byte ptr [esi]
0066a829 8b00            mov     eax,dword ptr [eax]
0066a82b 50              push    eax
0066a82c e8affeffff      call    AppSql+0×26a6e0 (0066a6e0)

0:000> ub 0049e180
AppSql+0x9e16f:
0049e16f cc              int     3
0049e170 55              push    ebp
0049e171 8bec            mov     ebp,esp
0049e173 8b4510          mov     eax,dword ptr [ebp+10h]
0049e176 8b4d0c          mov     ecx,dword ptr [ebp+0Ch]
0049e179 50              push    eax
0049e17a 51              push    ecx
0049e17b e840c61c00      call    AppSql+0×26a7c0 (0066a7c0)

Therefore the appearance of myhook!notify_me+0×22c could be a coincidence unless it was a pointer to a function. However, if it was the function pointer address then it wouldn’t have pointed to the middle of the function call sequence that pushes arguments:

0:000> ub 10008e00
myhook!notify_me+0x211
10008de5 81c180000000    add     ecx,80h
10008deb 899578ffffff    mov     dword ptr [ebp-88h],edx
10008df1 89458c          mov     dword ptr [ebp-74h],eax
10008df4 894d98          mov     dword ptr [ebp-68h],ecx
10008df7 6a01            push    1
10008df9 8d45ec          lea     eax,[ebp-14h]
10008dfc 50              push    eax
10008dfd ff75e0          push    dword ptr [ebp-20h]
0:000> u 10008e00
myhook!notify_me+0×22c
10008e00 e82ff1ffff      call    myhook!copy_data (10007f34)
10008e05 8b8578ffffff    mov     eax,dword ptr [ebp-88h]
10008e0b 3945ac          cmp     dword ptr [ebp-54h],eax
10008e0e 731f            jae     myhook!notify_me+0×25b (10008e2f)
10008e10 8b4598          mov     eax,dword ptr [ebp-68h]
10008e13 0fbf00          movsx   eax,word ptr [eax]
10008e16 8945a8          mov     dword ptr [ebp-58h],eax
10008e19 8b45e0          mov     eax,dword ptr [ebp-20h]

Also, because we have source code and private symbols, we know that if it was a function pointer then it would have been myhook!notify_me address and not notify_me+0×22c address.

All this evidence supports the hypothesis that myhook occurrence on the problem stack is just the coincidence and should be ignored.

- Dmitry Vostokov @ DumpAnalysis.org -

Minidump Analysis (Part 1)

Wednesday, August 29th, 2007

Small Memory Dumps, also referred as minidumps because they are stored in %SystemRoot% \ Minidump folder, contain only bugcheck information, kernel mode stack data and the list of loaded drivers. They can be used to transmit system crash information to a vendor or a 3rd-party for an automated crash dump analysis. Another use is to keep system crash history. In this part I discuss the scripting approach to extract information from all minidumps stored on a particular system. The script processes all minidump files and creates text log files containing the following information:

  1. Crash dump name and type 

  2. OS information, crash time and system uptime 

  3. Processor context (r) and verbose stack trace (kv) prior to applying !analyze -v. This is useful sometimes when WinDbg reconstructs a different stack trace after changing a processor context to the execution context at the time of a trap, exception or fault. 

  4. The output of !analyze -v command

  5. Processor context (r) and verbose stack trace (kv) after !analyze -v command.

  6. Code disassembly for the current execution pointer (EIP or x64 RIP). This includes forward (u) and backward (ub) disassembly, and we also try to disassemble the whole function (uf) which should succeed if we have symbol information

  7. Raw stack dump with symbol information (dps)

  8. The same raw stack data but interpreted as pointers to Unicode zero-terminated strings (dpu). Some pointers on the stack might point to local string buffers located on the same stack. This can be a slow operation and WinDbg might temporarily hang.

  9. The same raw stack data but interpreted as pointers to ASCII zero-terminated strings (dpa). This can be a slow operation and WinDbg might temporarily hang.

  10. Verbose information about loaded drivers (lmv)

  11. CPU, machine ID, machine-specific registers, and verbose SMBIOS information like motherboard and devices (!sysinfo)

Here is WinDbg script listing:

$$
$$ MiniDmp2Txt: Dump information from minidump into log
$$
.logopen /d /u
.echo "command> ||"
||
.echo "command> vertarget"
vertarget
.echo "command> r (before analysis)"
r
.echo "command> kv (before analysis)"
kv 100
.echo "command> !analyze -v"
!analyze -v
.echo "command> r"
r
.echo "command> kv"
kv 100
.echo "command> ub eip"
ub eip
.echo "command> u eip"
u eip
.echo "command> uf eip"
uf eip
.echo "command> dps esp-3000 esp+3000"
dps esp-3000 esp+3000
.echo "command> dpu esp-3000 esp+3000"
dpu esp-3000 esp+3000
.echo "command> dpa esp-3000 esp+3000"
dpa esp-3000 esp+3000
.echo "command> lmv"
lmv
.echo "command> !sysinfo cpuinfo"
!sysinfo cpuinfo
.echo "command> !sysinfo cpuspeed"
!sysinfo cpuspeed
.echo "command> !sysinfo cpumicrocode"
!sysinfo cpumicrocode
.echo "command> !sysinfo gbl"
!sysinfo gbl
.echo "command> !sysinfo machineid"
!sysinfo machineid
.echo "command> !sysinfo registers"
!sysinfo registers
.echo "command> !sysinfo smbios -v"
!sysinfo smbios -v
.logclose
$$
$$ MiniDmp2Txt: End of File
$$

To run WinDbg automatically against each minidump file (.dmp) use the following VB script (customize symbol search path (-y) to point to your own folders):

'
' MiniDumps2Txt.vbs
'
Set fso = CreateObject("Scripting.FileSystemObject")
Set Folder = fso.GetFolder(".")
Set Files = Folder.Files
Set WshShell = CreateObject("WScript.Shell")
For Each File In Files
  If Mid(File.Name,Len(File.Name)-3,4) = ".dmp" Then
    Set oExec = WshShell.Exec("C:\Program Files\Debugging Tools for Windows\WinDbg.exe -y ""srv*c:\ms*http://msdl.microsoft.com/download/symbols"" -z " + File.Name + " -c ""$$><c:\scripts\MiniDmp2Txt.txt;q"" -Q -QS -QY -QSY")
    Do While oExec.Status = 0
      WScript.Sleep 1000
    Loop
  End If
Next
'
' MiniDumps2Txt.vbs: End of File
'

We can also use kd.exe instead of WinDbg but its window will be hidden if we use the same VB script.

Log file interpretation is the subject of the next minidump analysis part.

- Dmitry Vostokov @ DumpAnalysis.org -

Raw Stack Dump of all threads

Tuesday, August 28th, 2007

Sometimes we need to dump the whole thread stack data to find traces of hooks, printer drivers or just string fragments. This is usually done by finding the appropriate TEB and dumping the data between StackLimit and StackBase addresses, for example:

0:000> ~
.  0  Id: 106c.4e4 Suspend: 1 Teb: 7ffde000 Unfrozen
   1  Id: 106c.4e0 Suspend: 1 Teb: 7ffdc000 Unfrozen
   2  Id: 106c.1158 Suspend: 1 Teb: 7ffdb000 Unfrozen
   3  Id: 106c.c3c Suspend: 1 Teb: 7ffd9000 Unfrozen
   4  Id: 106c.1174 Suspend: 1 Teb: 7ffd8000 Unfrozen
   5  Id: 106c.1168 Suspend: 1 Teb: 7ffd4000 Unfrozen
   6  Id: 106c.1568 Suspend: 1 Teb: 7ffaf000 Unfrozen
   7  Id: 106c.1574 Suspend: 1 Teb: 7ffad000 Unfrozen
   8  Id: 106c.964 Suspend: 1 Teb: 7ffac000 Unfrozen
   9  Id: 106c.1164 Suspend: 1 Teb: 7ffab000 Unfrozen
  10  Id: 106c.d84 Suspend: 1 Teb: 7ffaa000 Unfrozen
  11  Id: 106c.bf4 Suspend: 1 Teb: 7ffa9000 Unfrozen
  12  Id: 106c.eac Suspend: 1 Teb: 7ffa8000 Unfrozen
  13  Id: 106c.614 Suspend: 1 Teb: 7ffd5000 Unfrozen
  14  Id: 106c.cd8 Suspend: 1 Teb: 7ffa7000 Unfrozen
  15  Id: 106c.1248 Suspend: 1 Teb: 7ffa6000 Unfrozen
  16  Id: 106c.12d4 Suspend: 1 Teb: 7ffa4000 Unfrozen
  17  Id: 106c.390 Suspend: 1 Teb: 7ffa3000 Unfrozen
  18  Id: 106c.764 Suspend: 1 Teb: 7ffa1000 Unfrozen
  19  Id: 106c.f48 Suspend: 1 Teb: 7ff5f000 Unfrozen
  20  Id: 106c.14a8 Suspend: 1 Teb: 7ff53000 Unfrozen
  21  Id: 106c.464 Suspend: 1 Teb: 7ff4d000 Unfrozen
  22  Id: 106c.1250 Suspend: 1 Teb: 7ffa5000 Unfrozen
  23  Id: 106c.fac Suspend: 1 Teb: 7ff5c000 Unfrozen
  24  Id: 106c.1740 Suspend: 1 Teb: 7ffd7000 Unfrozen
  25  Id: 106c.ae4 Suspend: 1 Teb: 7ffd6000 Unfrozen
  26  Id: 106c.a4c Suspend: 1 Teb: 7ffdd000 Unfrozen
  27  Id: 106c.1710 Suspend: 1 Teb: 7ffda000 Unfrozen
  28  Id: 106c.1430 Suspend: 1 Teb: 7ffa2000 Unfrozen
  29  Id: 106c.1404 Suspend: 1 Teb: 7ff4e000 Unfrozen
  30  Id: 106c.9a8 Suspend: 1 Teb: 7ff4c000 Unfrozen
  31  Id: 106c.434 Suspend: 1 Teb: 7ff4b000 Unfrozen
  32  Id: 106c.c8c Suspend: 1 Teb: 7ff4a000 Unfrozen
  33  Id: 106c.4f0 Suspend: 1 Teb: 7ff49000 Unfrozen
  34  Id: 106c.be8 Suspend: 1 Teb: 7ffae000 Unfrozen
  35  Id: 106c.14e0 Suspend: 1 Teb: 7ff5d000 Unfrozen
  36  Id: 106c.fe0 Suspend: 1 Teb: 7ff5b000 Unfrozen
  37  Id: 106c.1470 Suspend: 1 Teb: 7ff57000 Unfrozen
  38  Id: 106c.16c4 Suspend: 1 Teb: 7ff5e000 Unfrozen

0:000> !teb 7ffad000
TEB at 7ffad000
    ExceptionList:        0181ff0c
    StackBase:            01820000
    StackLimit:           0181c000

    SubSystemTib:         00000000
    FiberData:            00001e00
    ArbitraryUserPointer: 00000000
    Self:                 7ffad000
    EnvironmentPointer:   00000000
    ClientId:             0000106c . 00001574
    RpcHandle:            00000000
    Tls Storage:          00000000
    PEB Address:          7ffdf000
    LastErrorValue:       0
    LastStatusValue:      c000000d
    Count Owned Locks:    0
    HardErrorMode:        0

0:000> dps 0181c000 01820000
0181c000  00000000
0181c004  00000000
0181c008  00000000
0181c00c  00000000
0181c010  00000000
0181c014  00000000
0181c018  00000000
0181c01c  00000000
0181c020  00000000
0181c024  00000000
...
...
...
0181ffb8  0181ffec
0181ffbc  77e6608b kernel32!BaseThreadStart+0x34
0181ffc0  00f31eb0
0181ffc4  00000000
0181ffc8  00000000
0181ffcc  00f31eb0
0181ffd0  8a38f7a8
0181ffd4  0181ffc4
0181ffd8  88a474b8
0181ffdc  ffffffff
0181ffe0  77e6b7d0 kernel32!_except_handler3
0181ffe4  77e66098 kernel32!`string'+0x98
0181ffe8  00000000
0181ffec  00000000
0181fff0  00000000
0181fff4  7923a709
0181fff8  00f31eb0
0181fffc  00000000
01820000  ????????

However, if our process has many threads, like in the example above,  and we want to dump stack data from all of them, we need to automate this process. After several attempts I created the following simple script which can be copy-pasted into WinDbg command window or saved in a text file to be loaded and executed later via WinDbg $$>< command. The script takes the advantage of the following command

~e (Thread-Specific Command)

The ~e command executes one or more commands for a specific thread or for all threads in the target process.

(from WinDbg help)

Here is the script:

~*e r? $t1 = ((ntdll!_NT_TIB *)@$teb)->StackLimit; r? $t2 = ((ntdll!_NT_TIB *)@$teb)->StackBase; !teb; dps @$t1 @$t2

Raw stack data from different stacks is separated by !teb output for clarity, for example:

0:000> .logopen rawdata.log
0:000> ~*e r? $t1 = ((ntdll!_NT_TIB *)@$teb)->StackLimit; r? $t2 = ((ntdll!_NT_TIB *)@$teb)->StackBase; !teb; dps @$t1 @$t2
TEB at 7ffde000
    ExceptionList:        0007fd38
    StackBase:            00080000
    StackLimit:           0007c000
    SubSystemTib:         00000000
    FiberData:            00001e00
    ArbitraryUserPointer: 00000000
    Self:                 7ffde000
    EnvironmentPointer:   00000000
    ClientId:             0000106c . 000004e4
    RpcHandle:            00000000
    Tls Storage:          00000000
    PEB Address:          7ffdf000
    LastErrorValue:       0
    LastStatusValue:      c0000034
    Count Owned Locks:    0
    HardErrorMode:        0
0007c000  00000000
0007c004  00000000
0007c008  00000000
0007c00c  00000000
0007c010  00000000
0007c014  00000000
0007c018  00000000
0007c01c  00000000
0007c020  00000000
0007c024  00000000
...
...
...
...
...
...
...
0977ffb4  00000000
0977ffb8  0977ffec
0977ffbc  77e6608b kernel32!BaseThreadStart+0x34
0977ffc0  025c3728
0977ffc4  00000000
0977ffc8  00000000
0977ffcc  025c3728
0977ffd0  a50c4963
0977ffd4  0977ffc4
0977ffd8  000a5285
0977ffdc  ffffffff
0977ffe0  77e6b7d0 kernel32!_except_handler3
0977ffe4  77e66098 kernel32!`string'+0x98
0977ffe8  00000000
0977ffec  00000000
0977fff0  00000000
0977fff4  77bcb4bc msvcrt!_endthreadex+0x2f
0977fff8  025c3728
0977fffc  00000000
09780000  ????????
TEB at 7ffae000
    ExceptionList:        0071ff64
    StackBase:            00720000
    StackLimit:           0071c000
    SubSystemTib:         00000000
    FiberData:            00001e00
    ArbitraryUserPointer: 00000000
    Self:                 7ffae000
    EnvironmentPointer:   00000000
    ClientId:             0000106c . 00000be8
    RpcHandle:            00000000
    Tls Storage:          00000000
    PEB Address:          7ffdf000
    LastErrorValue:       0
    LastStatusValue:      c000000d
    Count Owned Locks:    0
    HardErrorMode:        0
0071c000  00000000
0071c004  00000000
0071c008  00000000
0071c00c  00000000
0071c010  00000000
0071c014  00000000
0071c018  00000000
0071c01c  00000000
0071c020  00000000
0071c024  00000000
0071c028  00000000
0071c02c  00000000
0071c030  00000000
0071c034  00000000
0071c038  00000000
0071c03c  00000000
0071c040  00000000
0071c044  00000000
0071c048  00000000
0071c04c  00000000
0071c050  00000000
0071c054  00000000
...
...
...
...
...
...
...
0:000> .logclose

Instead of (or in addition to) dps command used in the script we can use dpu or dpa commands to dump all strings that are pointed to by stack data or create an even more complex script that does triple dereference.    

- Dmitry Vostokov @ DumpAnalysis.org -

Book: Windows® Crash Dump Analysis

Sunday, August 26th, 2007

After blogging for more than a year, accumulating initial amount of material and being persuaded by my colleagues at Citrix I finally decided to embrace a new challenge and write a book with a simple working title: Windows® Crash Dump Analysis.

The reader will master crash and hang memory dump analysis for process, kernel and complete memory dumps. The book will provide answers to many questions including those in the following list:

Memory Dump Analysis Interview Questions

and will also include guidelines for implementing SMART (Smart Memory Analysis in Real Time) process in a technical support or software maintenance environment.

As this is my first book I’m going to write the draft version online. More details and the link to Table of Contents will be announced in September/October.

I’ll continue blogging about crash dump analysis at the same time.

- Dmitry Vostokov @ DumpAnalysis.org -

Moving to kernel space (updated references)

Sunday, August 26th, 2007

If you are developing and debugging user space applications (and/or doing crash dump analysis in user space) and you want to understand Windows kernel dumps and device drivers better (and probably start writing your own kernel tools) here is the reading list I found the most effective over the last 4 years:

0. Read and re-read Windows Internals book in parallel while reading all other books. I read all editions by the way. It will show you the big picture and some useful WinDbg commands and techniques but you need to read device driver books to fill the gaps and be confident in kernel space:

Buy from Amazon

1. Start with “The Windows 2000 Device Driver Book: A Guide for Programmers (2nd Edition)”. This short book will show you the basics and you can start writing your drivers and kernel tools immediately.

Buy from Amazon

2. Next read “Windows NT Device Driver Development” book to consolidate your knowledge. This book has been reprinted by OSR:

Buy from Amazon

3. Don’t stop here. Read “Developing Windows NT Device Drivers:
 A Programmer’s Handbook”. This is very good book explaining everything in great detail and good pictures. You will finally understand various buffering methods.

Buy from Amazon

4. Continue with WDM drivers and modern presentation: “Programming the Microsoft Windows Driver Model, Second Edition”. Must read even if your drivers are not WDM.

Buy from Amazon

5. Finally read “Developing Drivers with the Windows Driver Foundation” book as this is the future and it also covers ETW (event tracing for Windows), WinDbg extensions, PREfast and static driver verifier.

Buy from Amazon

Additional reading (not including DDK Help which you will use anyway) can be done in parallel after finishing “Windows NT Device Driver Development” book:

1. OSR NT Insider articles. I have their full printed collection 1996 - 2006

http://www.osronline.com/

2. “Windows NT File System Internals” reprinted by OSR:

Buy from Amazon

3. “Rootkits: Subverting the Windows Kernel” book will show you Windows kernel from hacker perspective. In addition you will find overview of kernel areas not covered in other books.

Buy from Amazon

Of course, you must know C language and its idioms really well. Really know it down to assembly language level! I’ll publish another reading list soon. Stay tuned.

- Dmitry Vostokov @ DumpAnalysis.org -

Memory Dump Analysis Interview Questions

Sunday, August 26th, 2007

The following interview questions might be useful to assess the skill level in crash dump analysis on Windows platforms. These could be useful for debugging interviews as well.

  1. What is FPO?

  2. How many exceptions can be found in a crash dump?

  3. You see the message from WinDbg:
    WARNING: Stack unwind information not available. Following frames may be wrong.
    What would you do?

  4. How would you find spinlock implementation if you have a kernel dump?

  5. What is OMAP?

  6. What is full page heap?

  7. Company name is missing from module information. How would you try to find it?

  8. What is IDT?

  9. How does a postmortem debugger work?

  10. You’ve got a mini dump of your application. How would you disassemble the code?

  11. Memory consumption is growing for an application. How would you discover the leaking component?

  12. What is IRQL?

  13. When do you use TEB?

  14. You’ve got 200 process dumps from a server. You need to find a deadlock. How would you do it?

  15. You’ve got a complete memory dump from a server. You need to find a deadlock. How would you do it?

  16. What is GC heap?

  17. Your customer is reluctant to send a dump due to security policies. What is your next step?

  18. What is a first chance exception?

I’ve created a permanent page and will add more questions there in the future:

Memory Dump Analysis Interview Questions

- Dmitry Vostokov @ DumpAnalysis.org -

Memory Dump Analysis Jobs

Sunday, August 26th, 2007

The Jobs section was created on Crash Dump Analysis Portal to assist companies in finding engineers skilled in crash/core dump analysis. Please read guidelines at:

http://www.dumpanalysis.org/index.php?q=jobs

- Dmitry Vostokov @ DumpAnalysis.org -

Crash Dump Analysis Patterns (Part 23b)

Saturday, August 25th, 2007

In contrast to Double Free pattern in a user mode process heap double free in a kernel mode pool results in immediate bugcheck in order to identify the driver causing the problem (BAD_POOL_CALLER bugcheck with Arg1 == 7):

2: kd> !analyze -v
...
...
...
BAD_POOL_CALLER (c2)
The current thread is making a bad pool request. Typically this is at a bad IRQL level or double freeing the same allocation, etc.
Arguments:
Arg1: 00000007, Attempt to free pool which was already freed
Arg2: 0000121a, (reserved)
Arg3: 02140001, Memory contents of the pool block
Arg4: 89ba74f0, Address of the block of pool being deallocated

If we look at the block being deallocated we would see that it was marked as “Free” block: 

2: kd> !pool 89ba74f0
Pool page 89ba74f0 region is Nonpaged pool
 89ba7000 size:  270 previous size:    0  (Allocated)  Thre (Protected)
 89ba7270 size:    8 previous size:  270  (Free)       ....
 89ba7278 size:   18 previous size:    8  (Allocated)  ReEv
 89ba7290 size:   80 previous size:   18  (Allocated)  Mdl
 89ba7310 size:   80 previous size:   80  (Allocated)  Mdl
 89ba7390 size:   30 previous size:   80  (Allocated)  Vad
 89ba73c0 size:   98 previous size:   30  (Allocated)  File (Protected)
 89ba7458 size:    8 previous size:   98  (Free)       Wait
 89ba7460 size:   28 previous size:    8  (Allocated)  FSfm
 89ba74a0 size:   40 previous size:   18  (Allocated)  Ntfr
 89ba74e0 size:    8 previous size:   40  (Free)       File
*89ba74e8 size:   a0 previous size:    8  (Free )     *ABCD
  Owning component : Unknown (update pooltag.txt)
 89ba7588 size:   38 previous size:   a0  (Allocated)  Sema (Protected)
 89ba75c0 size:   38 previous size:   38  (Allocated)  Sema (Protected)
 89ba75f8 size:   10 previous size:   38  (Free)       Nbtl
 89ba7608 size:   98 previous size:   10  (Allocated)  File (Protected)
 89ba76a0 size:   28 previous size:   98  (Allocated)  Ntfn
 89ba76c8 size:   40 previous size:   28  (Allocated)  Ntfr
 89ba7708 size:   28 previous size:   40  (Allocated)  NtFs
 89ba7730 size:   40 previous size:   28  (Allocated)  Ntfr
 89ba7770 size:   40 previous size:   40  (Allocated)  Ntfr
 89ba7a10 size:  270 previous size:  260  (Allocated)  Thre (Protected)
 89ba7c80 size:   20 previous size:  270  (Allocated)  VadS
 89ba7ca0 size:   40 previous size:   20  (Allocated)  Ntfr
 89ba7ce0 size:   b0 previous size:   40  (Allocated)  Ussy
 89ba7d90 size:  270 previous size:   b0  (Free)       Thre

The pool tag is a 4 byte character sequence used to associate drivers with pool blocks and is useful to identify a driver allocated or freed a block. In our case the pool tag is ABCD and it is associated with the driver which previously freed the block. All known pool tags corresponding to kernel components can be found in pooltag.txt located in triage subfolder where you installed WinDbg. However our ABCD tag is not listed there. We can try to find the driver corresponding to ABCD tag using findstr CMD command:

C:\Windows\System32\drivers>findstr /m /l hABCD *.sys

The results of the search will help us to identify the driver which freed the block first. The driver which double freed the same block can be found from the call stack and it might be the same driver or a different driver:

2: kd> k
ChildEBP RetAddr
f78be910 8089c8f4 nt!KeBugCheckEx+0x1b
f78be978 8089c622 nt!ExFreePoolWithTag+0x477
f78be988 f503968b nt!ExFreePool+0xf
WARNING: Stack unwind information not available. Following frames may be wrong.
f78be990 f5024a6e driver+0×1768b
f78be9a0 f50249e7 driver+0×2a6e
f78be9a4 84b430e0 driver+0×29e7

Because we don’t have symbol files for driver.sys WinDbg warns us that it was unable to identify the correct stack trace and driver.sys might not have called ExFreePool or ExFreePoolWithTag. To verify that driver.sys called ExFreePool indeed we disassemble backwards the return address of ExFreePool: 

2: kd> ub f503968b
driver+0×1767b:
f503967b 90              nop
f503967c 90              nop
f503967d 90              nop
f503967e 90              nop
f503967f 90              nop
f5039680 8b442404        mov     eax,dword ptr [esp+4]
f5039684 50              push    eax
f5039685 ff15202302f5    call    dword ptr [driver+0×320 (f5022320)]

Finally we can get some info from the driver: 

2: kd> lmv m driver
start    end        module name
f5022000 f503e400   driver   (no symbols)
    Loaded symbol image file: driver.sys
    Image path: \SystemRoot\System32\drivers\driver.sys
    Image name: driver.sys
    Timestamp:  Tue Aug 12 11:32:16 2007

If the company name developed the driver is absent we can try techniques outlined in Unknown Component pattern.

If we have symbols it is very easy to identify the code as can be seen from this 64-bit dump:

BAD_POOL_CALLER (c2)
The current thread is making a bad pool request. Typically this is at a bad IRQL level or double freeing the same allocation, etc.
Arguments:
Arg1: 0000000000000007, Attempt to free pool which was already freed
Arg2: 000000000000121a, (reserved)
Arg3: 0000000000000080, Memory contents of the pool block
Arg4: fffffade6d54e270, Address of the block of pool being deallocated

0: kd> kL
fffffade`45517b08 fffff800`011ad905 nt!KeBugCheckEx
fffffade`45517b10 fffffade`5f5991ac nt!ExFreePoolWithTag+0x401
fffffade`45517bd0 fffffade`5f59a0b0 driver64!ProcessDataItem+0×198
fffffade`45517c70 fffffade`5f5885a6 driver64!OnDataArrival+0×2b4
fffffade`45517cd0 fffff800`01299cae driver64!ReaderThread+0×15a
fffffade`45517d70 fffff800`0102bbe6 nt!PspSystemThreadStartup+0×3e
fffffade`45517dd0 00000000`00000000 nt!KiStartSystemThread+0×16

- Dmitry Vostokov @ DumpAnalysis.org -

Crash Dump Analysis Patterns (Part 23a)

Sunday, August 19th, 2007

Double-free bugs lead to Dynamic Memory Corruption. The reason why Double Free deserves its own pattern name is the fact that either debug runtime libraries or even OS itself detect such bugs and save crash dumps immediately.

For some heap implementations double free doesn’t lead to an immediate heap corruption and subsequent crash. For example, if you allocate 3 blocks in a row and then free the middle one twice there will be no crash as the second free call is able to detect that the block was already freed and does nothing. The following program loops forever and never crashes:

#include "stdafx.h"
#include <windows.h>

int _tmain(int argc, _TCHAR* argv[])
{
  while (true)
  {
    puts("Allocate: p1");
    void *p1 = malloc(100);
    puts("Allocate: p2");
    void *p2 = malloc(100);
    puts("Allocate: p3");
    void *p3 = malloc(100);

    puts("Free: p2");
    free(p2);
    puts(”Double-Free: p2″);
    free(p2);

    puts(”Free: p1″);
    free(p1);
    puts(”Free: p3″);
    free(p3);

    Sleep(100);
  }

  return 0;
}

The output of the program: 

...
...
...
Allocate: p1
Allocate: p2
Allocate: p3
Free: p2
Double-Free: p2
Free: p1
Free: p3
Allocate: p1
Allocate: p2
Allocate: p3
Free: p2
Double-Free: p2
Free: p1
Free: p3
Allocate: p1
Allocate: p2
Allocate: p3
Free: p2
Double-Free: p2
...
...
...

However if a free call triggered heap coalescence (adjacent free blocks form the bigger free block) then we have a heap corruption crash on the next double-free call because the coalescence triggered by the previous free call erased free block information:

#include "stdafx.h"
#include <windows.h>

int _tmain(int argc, _TCHAR* argv[])
{
  while (true)
  {
    puts("Allocate: p1");
    void *p1 = malloc(100);
    puts("Allocate: p2");
    void *p2 = malloc(100);
    puts("Allocate: p3");
    void *p3 = malloc(100);

    puts("Free: p3");
    free(p3);
    puts("Free: p1");
    free(p1);
    puts("Free: p2");
    free(p2);
    puts(”Double-Free: p2″);
    free(p2);

    Sleep(100);
  }

  return 0;
}

The output of the program:

Allocate: p1
Allocate: p2
Allocate: p3
Free: p3
Free: p1
Free: p2
Double-Free: p2
Crash!

If we open a crash dump we would see the following stack trace:

0:000> r
eax=00922130 ebx=00920000 ecx=10101010 edx=10101010 esi=00922128 edi=00921fc8
eip=76ee1ad5 esp=0012fd6c ebp=0012fd94 iopl=0 nv up ei pl zr na pe nc
cs=001b ss=0023 ds=0023 es=0023 fs=003b gs=0000 efl=00010246
ntdll!RtlpCoalesceFreeBlocks+0x6ef:
76ee1ad5 8b4904          mov     ecx,dword ptr [ecx+4] ds:0023:10101014=????????

0:000> kL
ChildEBP RetAddr
0012fd94 76ee1d37 ntdll!RtlpCoalesceFreeBlocks+0x6ef
0012fe8c 76ee1c21 ntdll!RtlpFreeHeap+0x1e2
0012fea8 758d7a7e ntdll!RtlFreeHeap+0x14e
0012febc 6cff4c39 kernel32!HeapFree+0x14
0012ff08 0040107b msvcr80!free+0xcd
0012ff5c 004011f1 DoubleFree!wmain+0x7b
0012ffa0 758d3833 DoubleFree!__tmainCRTStartup+0x10f
0012ffac 76eba9bd kernel32!BaseThreadInitThunk+0xe
0012ffec 00000000 ntdll!_RtlUserThreadStart+0x23

This is illustrated on the following picture where free calls result in heap coalescence and the subsequent double-free call corrupts the heap:

The problem here is that heap coalescence can be triggered some time after the double free so we need some solution to diagnose double-free bugs earlier, ideally at the first double-free call. For example, the following program crashes during the normal free operation long after the first double-free happened:

#include "stdafx.h"
#include <windows.h>

int _tmain(int argc, _TCHAR* argv[])
{
  while (true)
  {
    puts("Allocate: p1");
    void *p1 = malloc(100);
    puts("Allocate: p2");
    void *p2 = malloc(100);
    puts("Allocate: p3");
    void *p3 = malloc(100);

    puts("Free: p1");
    free(p1);
    puts("Free: p2");
    free(p2);
    puts(”Double-Free: p2″);
    free(p2);

    puts(”Double-Free: p3″);
    free(p3);

    Sleep(100);
  }

  return 0;
}

The output of the program:

Allocate: p1
Allocate: p2
Allocate: p3
Free: p1
Free: p2
Double-Free: p2
Free: p3
Allocate: p1
Allocate: p2
Allocate: p3
Free: p1
Free: p2
Double-Free: p2
Free: p3
Allocate: p1
Allocate: p2
Allocate: p3
Free: p1
Free: p2
Double-Free: p2
Free: p3
Allocate: p1
Allocate: p2
Allocate: p3
Free: p1
Free: p2
Double-Free: p2
Free: p3
Crash!

If we enable full page heap using gflags.exe from Debugging Tools for Windows the program crashes immediately on the double free call:

Allocate: p1
Allocate: p2
Allocate: p3
Free: p1
Free: p2
Double-Free: p2
Crash!

The crash dump shows the following stack trace:

0:000> kL
ChildEBP RetAddr
0012f810 71aa4ced ntdll!DbgBreakPoint+0x1
0012f834 71aa9fc2 verifier!VerifierStopMessage+0x1fd
0012f890 71aaa4da verifier!AVrfpDphReportCorruptedBlock+0x102
0012f8a4 71ab2c98 verifier!AVrfpDphCheckNormalHeapBlock+0x18a
0012f8b8 71ab2a0e verifier!_EH4_CallFilterFunc+0x12
0012f8e0 76ee1039 verifier!_except_handler4+0x8e
0012f904 76ee100b ntdll!ExecuteHandler2+0x26
0012f9ac 76ee0e97 ntdll!ExecuteHandler+0x24
0012f9ac 71aaa3ad ntdll!KiUserExceptionDispatcher+0xf
0012fcf0 71aaa920 verifier!AVrfpDphCheckNormalHeapBlock+0x5d
0012fd0c 71aa879b verifier!AVrfpDphNormalHeapFree+0x20
0012fd60 76f31c8f verifier!AVrfDebugPageHeapFree+0x1cb
0012fda8 76efd9fa ntdll!RtlDebugFreeHeap+0x2f
0012fe9c 76ee1c21 ntdll!RtlpFreeHeap+0x5f
0012feb8 758d7a7e ntdll!RtlFreeHeap+0x14e
0012fecc 6cff4c39 kernel32!HeapFree+0x14
0012ff18 0040105f msvcr80!free+0xcd
0012ff5c 004011f1 DoubleFree!wmain+0x5f
0012ffa0 758d3833 DoubleFree!__tmainCRTStartup+0x10f
0012ffac 76eba9bd kernel32!BaseThreadInitThunk+0xe

0:000> !gflag
Current NtGlobalFlag contents: 0x02000000
    hpa - Place heap allocations at ends of pages

If we enable heap free checking instead of page heap we get our crash on the first double free call immediately too:

Allocate: p1
Allocate: p2
Allocate: p3
Free: p1
Free: p2
Double-Free: p2
Crash!

The crash dump shows the following stack trace:

0:000> r
eax=feeefeee ebx=001b2040 ecx=001b0000 edx=001b2040 esi=d4476047 edi=001b2038
eip=76ee2086 esp=0012fe68 ebp=0012fe9c iopl=0 nv up ei ng nz na pe nc
cs=001b ss=0023 ds=0023 es=0023 fs=003b gs=0000 efl=00010286
ntdll!RtlpLowFragHeapFree+0x31:
76ee2086 8b4604          mov     eax,dword ptr [esi+4] ds:0023:d447604b=????????

0:000> kL
ChildEBP RetAddr
0012fe9c 76ee18c3 ntdll!RtlpLowFragHeapFree+0x31
0012feb0 758d7a7e ntdll!RtlFreeHeap+0x101
0012fec4 6cff4c39 kernel32!HeapFree+0x14
0012ff10 0040106d msvcr80!free+0xcd
0012ff5c 004011f1 DoubleFree!wmain+0x6d
0012ffa0 758d3833 DoubleFree!__tmainCRTStartup+0x10f
0012ffac 76eba9bd kernel32!BaseThreadInitThunk+0xe
0012ffec 00000000 ntdll!_RtlUserThreadStart+0x23

0:000> !gflag
Current NtGlobalFlag contents: 0x00000020
    hfc - Enable heap free checking

- Dmitry Vostokov @ DumpAnalysis.org -

Crash Dump Analysis Patterns (Part 20b)

Sunday, August 19th, 2007

Sometimes the process size constantly grows but there is no difference in the process heap size. In such cases we need to check whether the process uses Microsoft .NET runtime (CLR). If one of the loaded modules is mscorwks.dll or mscorsvr.dll then it is most likely. Then we should check CLR heap statistics.

In .NET world dynamically allocated objects are garbage collected (GC) and therefore simple allocate-and-forget memory leaks are not possible. To simulate that I created the following C# program:

using System;

namespace CLRHeapLeak
{
    class Leak
    {
        private byte[] m_data;

       

        public Leak()
        {
            m_data = new byte[1024];
        }
    }

    class Program
    {
        static void Main(string[] args)
        {
            Leak leak = new Leak();

            while (true)
            {
                leak = new Leak();
                System.Threading.Thread.Sleep(100);
            }
        }
    }
}

If we run it the process size will never grow. GC thread will collect and free unreferenced Leak classes. This can be seen from inspecting memory dumps taken with userdump.exe after the start, 2, 6 and 12 minutes. The GC heap never grows higher than 1Mb and the number of CLRHeapLeak.Leak and System.Byte[] objects always fluctuates between 100 and 500. For example, on 12th minute we have the following statistics:

0:000> .loadby sos mscorwks

0:000> !eeheap -gc
Number of GC Heaps: 1
generation 0 starts at 0x0147160c
generation 1 starts at 0x0147100c
generation 2 starts at 0x01471000
ephemeral segment allocation context: (0x014dc53c, 0x014dd618)
 segment    begin allocated     size
004aedb8 790d7ae4  790f7064 0x0001f580(128384)
01470000 01471000  014dd618 0x0006c618(443928)
Large object heap starts at 0x02471000
 segment    begin allocated     size
02470000 02471000  02473250 0x00002250(8784)
Total Size   0x8dde8(581096)
------------------------------
GC Heap Size   0×8dde8(581096)

0:000> !dumpheap -stat
total 2901 objects
Statistics:
Count    TotalSize Class Name
    1           12 System.Security.Permissions.SecurityPermission
    1           24 System.OperatingSystem
    1           24 System.Version
    1           24 System.Reflection.Assembly
    1           28 System.SharedStatics
    1           36 System.Int64[]
    1           40 System.AppDomainSetup
    3           60 System.RuntimeType
    5           60 System.Object
    2           72 System.Security.PermissionSet
    1           72 System.ExecutionEngineException
    1           72 System.StackOverflowException
    1           72 System.OutOfMemoryException
    1          100 System.AppDomain
    7          100      Free
    2          144 System.Threading.ThreadAbortException
    4          328 System.Char[]
  418         5016 CLRHeapLeak.Leak
    5         8816 System.Object[]
 2026       128632 System.String
  418       433048 System.Byte[]
Total 2901 objects

However, if we make Leak objects always referenced by introducing the following changes into the program:

using System;

namespace CLRHeapLeak
{
    class Leak
    {
        private byte[] m_data;
        private Leak m_prevLeak;

        public Leak()
        {
            m_data = new byte[1024];
        }

        public Leak(Leak prevLeak)
        {
            m_prevLeak = prevLeak;
            m_data = new byte[1024];
        }
    }

    class Program
    {
        static void Main(string[] args)
        {
            Leak leak = new Leak();

            while (true)
            {
                leak = new Leak(leak);
                System.Threading.Thread.Sleep(100);
            }
        }
    }
}

then if we run the program we would see in Task Manager that it grows over time. Taking consecutive memory dumps after the start, 10 and 16 minutes, shows that Win32 heap segments have always the same size:

0:000> !heap 0 0
Index   Address  Name      Debugging options enabled
  1:   00530000
    Segment at 00530000 to 00630000 (0003d000 bytes committed)
  2:   00010000
    Segment at 00010000 to 00020000 (00003000 bytes committed)
  3:   00520000
    Segment at 00520000 to 00530000 (00003000 bytes committed)
  4:   00b10000
    Segment at 00b10000 to 00b50000 (00001000 bytes committed)
  5:   001a0000
    Segment at 001a0000 to 001b0000 (00003000 bytes committed)
  6:   00170000
    Segment at 00170000 to 00180000 (00008000 bytes committed)
  7:   013b0000
    Segment at 013b0000 to 013c0000 (00003000 bytes committed)

but GC heap and the number of Leak and System.Byte[] objects in it were growing significantly: 

Process Uptime: 0 days 0:00:04.000

0:000> !eeheap -gc
Number of GC Heaps: 1
generation 0 starts at 0x013c1018
generation 1 starts at 0x013c100c
generation 2 starts at 0x013c1000
ephemeral segment allocation context: (0x013cd804, 0x013cdff4)
 segment    begin allocated     size
0055ee08 790d7ae4  790f7064 0x0001f580(128384)
013c0000 013c1000  013cdff4 0x0000cff4(53236)
Large object heap starts at 0x023c1000
 segment    begin allocated     size
023c0000 023c1000  023c3250 0x00002250(8784)
Total Size   0x2e7c4(190404)
------------------------------
GC Heap Size   0×2e7c4(190404)

0:000> !dumpheap -stat
total 2176 objects
Statistics:
Count    TotalSize Class Name
...
...
...
   46          736 CLRHeapLeak.Leak
    5         8816 System.Object[]
   46        47656 System.Byte[]
 2035       129604 System.String
Total 2176 objects

Process Uptime: 0 days 0:09:56.000

0:000> !eeheap -gc
Number of GC Heaps: 1
generation 0 starts at 0x018cddbc
generation 1 starts at 0x01541ec4
generation 2 starts at 0x013c1000
ephemeral segment allocation context: (0x0192d668, 0x0192ddc8)
 segment    begin allocated     size
0055ee08 790d7ae4  790f7064 0x0001f580(128384)
013c0000 013c1000  0192ddc8 0x0056cdc8(5688776)
Large object heap starts at 0x023c1000
 segment    begin allocated     size
023c0000 023c1000  023c3240 0x00002240(8768)
Total Size  0x58e588(5825928)
------------------------------
GC Heap Size  0×58e588(5825928)

0:000> !dumpheap -stat
total 12887 objects
Statistics:
Count    TotalSize Class Name
...
...
...
    5         8816 System.Object[]
 5403        86448 CLRHeapLeak.Leak
 2026       128632 System.String
 5403      5597508 System.Byte[]
Total 12887 objects

Process Uptime: 0 days 0:16:33.000

0:000> !eeheap -gc
Number of GC Heaps: 1
generation 0 starts at 0x01c59cb4
generation 1 starts at 0x0194fd20
generation 2 starts at 0x013c1000
ephemeral segment allocation context: (0x01cd3050, 0x01cd3cc0)
 segment    begin allocated     size
0055ee08 790d7ae4  790f7064 0x0001f580(128384)
013c0000 013c1000  01cd3cc0 0x00912cc0(9514176)
Large object heap starts at 0x023c1000
 segment    begin allocated     size
023c0000 023c1000  023c3240 0x00002240(8768)
Total Size  0x934480(9651328)
------------------------------
GC Heap Size  0×934480(9651328)

0:000> !dumpheap -stat
total 20164 objects
Statistics:
Count    TotalSize Class Name
    5         8816 System.Object[]
 2026       128632 System.String
 9038       144608 CLRHeapLeak.Leak
 9038      9363368 System.Byte[]
Total 20164 objects

This is not the traditional memory leak because we have the reference chain. However, uncontrolled memory growth can be considered as a memory leak too, caused by poor application design, bad input validation or error handling, etc.

There are situations when customers think there is a memory leak but it is not. One of them is unusually big size of a process when running it on a multi-processor server. If dllhost.exe hosting typical .NET assembly DLL occupies less than 100Mb on a local workstation starts consuming more than 300Mb on a 4 processor server than it can be the case that the server version of CLR uses per processor GC heaps:

0:000> .loadby sos mscorsvr

0:000> !EEHeap -gc
generation 0 starts at 0×05c80154
generation 1 starts at 0×05c7720c
generation 2 starts at 0×102d0030

generation 0 starts at 0×179a0444
generation 1 starts at 0×1799b7a4
generation 2 starts at 0×142d0030

generation 0 starts at 0×0999ac88
generation 1 starts at 0×09990cc4
generation 2 starts at 0×182d0030

generation 0 starts at 0×242eccb0
generation 1 starts at 0×242d0030
generation 2 starts at 0×1c2d0030




GC Heap Size  0×109702ec(278332140)

or if this is CLR 1.x the old extension will tell you the same too: 

0:000> !.\clr10\sos.eeheap -gc
Loaded Son of Strike data table version 5 from "C:\WINDOWS\Microsoft.NET\Framework\v1.1.4322\mscorsvr.dll"
Number of GC Heaps: 4
------------------------------
Heap 0 (0x000f9af0)
generation 0 starts at 0x05c80154
generation 1 starts at 0x05c7720c
generation 2 starts at 0x102d0030
...
...
...
Heap Size  0x515ed60(85,323,104)
------------------------------
Heap 1 (0x000fa070)
generation 0 starts at 0x179a0444
generation 1 starts at 0x1799b7a4
generation 2 starts at 0x142d0030
...
...
...
Heap Size  0x37c7bf0(58,489,840)
------------------------------
Heap 2 (0x000fab80)
generation 0 starts at 0x0999ac88
generation 1 starts at 0x09990cc4
generation 2 starts at 0x182d0030
...
...
...
Heap Size  0x485de34(75,882,036)
------------------------------
Heap 3 (0x000fb448)
generation 0 starts at 0x242eccb0
generation 1 starts at 0x242d0030
generation 2 starts at 0x1c2d0030
...
...
...
Heap Size  0x41ea570(69,117,296)
------------------------------
Reserved segments:
------------------------------
GC Heap Size  0x1136ecf4(288,812,276)

The more processors you have the more heaps are contributing to the overall VM size. Although the process occupies almost 400Mb if it doesn’t grow constantly over time beyond that value then it is normal.

- Dmitry Vostokov @ DumpAnalysis.org -