Archive for the ‘Debugging’ Category

Forthcoming Windows Debugging Notebook: Essential User Space WinDbg Commands

Tuesday, April 5th, 2011

Finally, after the numerous delays, the first Windows Debugging Notebook is almost ready for publication by the end of this month with a new cover and a foreword written by Mario Hewardt, the author of Advanced Windows Debugging and Advanced .NET Debugging books:

It is the first notebook because we have decided to split it into 3 different memory space volumes and one additional overview volume for concepts and tools:

- Windows Debugging Notebook: Essential User Space WinDbg Commands (ISBN: 978-1906717001 and 978-0955832857)
- Windows Debugging Notebook: Essential Kernel Space WinDbg Commands (ISBN: 978-1908043146)
- Windows Debugging Notebook: Essential Managed Space WinDbg .NET Commands (ISBN: 978-1908043153)
- Windows Debugging Notebook: Essential Concepts and Tools (ISBN: 978-1908043160)

On a bookshelf you would be able to distinguish between the first 3 volumes by a 3 color spine stripe with an appropriate check on it: red for kernel, blue for user, and green for managed space.

The Table of Contents has been changed too.

- Dmitry Vostokov @ DumpAnalysis.org + TraceAnalysis.org -

Crash Dump Analysis Patterns (Part 39b)

Tuesday, April 5th, 2011

This is a kernel space counterpart to Custom Exception Handler pattern in user space. In the following stack trace below we see that DriverA code intercepted an access violation exception resulted from dereferencing a NULL pointer and generated a custom bugcheck:

kd> !analyze -v

[...]

EXCEPTION_RECORD: fffff8801c757158 -- (.exr 0xfffff8801c757158)
ExceptionAddress: fffff88003977de1 (DriverA!foo+0x0000000000000381)
ExceptionCode: c0000005 (Access violation)
  ExceptionFlags: 00000000
NumberParameters: 2
  Parameter[0]: 0000000000000000
  Parameter[1]: 0000000000000070
Attempt to read from address 0000000000000070

TRAP_FRAME: fffff8801c757200 -- (.trap 0xfffff8801c757200)
NOTE: The trap frame does not contain all registers.
Some register values may be zeroed or incorrect.
rax=0000000000000000 rbx=0000000000000000 rcx=fffff8a00da3f3c0
rdx=0000000000000000 rsi=0000000000000000 rdi=0000000000000000
rip=fffff88003977de1 rsp=fffff8801c757390 rbp=fffffa8009a853f0
 r8=0000000000000000 r9=0000000000000000 r10=006800740020006e
r11=fffff8a00da3f3c6 r12=0000000000000000 r13=0000000000000000
r14=0000000000000000 r15=0000000000000000
iopl=0 nv up ei pl zr na po nc
DriverA!foo+0×381:
fffff880`03977de1 0fb74070 movzx eax,word ptr [rax+70h] ds:0703:0070=????
Resetting default scope

[...]

kd> kL 100
Child-SP          RetAddr           Call Site
fffff880`1c7560f8 fffff880`039498f7 nt!KeBugCheckEx
fffff880`1c756100 fffff880`039352a0 DriverA!MyBugCheckEx+0×93
fffff880`1c756140 fffff800`016f1d1c DriverA!MyExceptionFilter+0×1d0

fffff880`1c756210 fffff800`016e940d nt!_C_specific_handler+0×8c
fffff880`1c756280 fffff800`016f0a90 nt!RtlpExecuteHandlerForException+0xd

fffff880`1c7562b0 fffff800`016fd9ef nt!RtlDispatchException+0×410
fffff880`1c756990 fffff800`016c2d82 nt!KiDispatchException+0×16f
fffff880`1c757020 fffff800`016c18fa nt!KiExceptionDispatch+0xc2
fffff880`1c757200 fffff880`03977de1 nt!KiPageFault+0×23a

fffff880`1c757390 fffff880`03977754 DriverA!foo+0×381
fffff880`1c757430 fffff880`0396f006 DriverA!bar+0×74
[…]
fffff880`1c7579b0 fffff800`019a6e0a DriverA!QueryInformation+0×30b
fffff880`1c757a70 fffff800`016c2993 nt!NtQueryInformationFile+0×535
fffff880`1c757bb0 00000000`76e5fe6a nt!KiSystemServiceCopyEnd+0×13
00000000`0a08dfe8 00000000`00000000 0×76e5fe6a

kd> !exchain
24 stack frames, scanning for handlers...
Frame 0×05: nt!RtlpExecuteHandlerForException+0xd (fffff800`016e940d)
  ehandler nt!RtlpExceptionHandler (fffff800`016e93d0)

Frame 0×07: nt!KiDispatchException+0×16f (fffff800`016fd9ef)
  ehandler nt!_GSHandlerCheck_SEH (fffff800`0169aec0)
Frame 0×0b: DriverA!bar+0×74 (fffff880`03977754)
  ehandler DriverA!__GSHandlerCheck (fffff880`039a12fc)

[…]
Frame 0×14: DriverA!QueryInformation+0×30b (fffff880`039303ab)
  ehandler DriverA!_C_specific_handler (fffff880`039a1864)
Frame 0×15: nt!NtQueryInformationFile+0×535 (fffff800`019a6e0a)
  ehandler nt!_C_specific_handler (fffff800`016f1c90)
Frame 0×16: nt!KiSystemServiceCopyEnd+0×13 (fffff800`016c2993)
  ehandler nt!KiSystemServiceHandler (fffff800`016c2580)
Frame 0×17: error getting module for 0000000076e5fe6a

- Dmitry Vostokov @ DumpAnalysis.org + TraceAnalysis.org -

Crash Dump Analysis Patterns (Part 134a)

Monday, March 28th, 2011

Data Correlation is a general pattern where values found in different parts of a memory dump correlate between each other according to some rules, for example, in some proportion. Here we show a variant for function parameters.

A process user memory dump showed a C++ exception:

0:000> kL
  *** Stack trace for last set context - .thread/.cxr resets it
ChildEBP RetAddr 
0012e950 78158e89 kernel32!RaiseException+0x53
0012e988 7830770c msvcr80!_CxxThrowException+0×46
0012e99c 783095bc mfc80u!AfxThrowMemoryException+0×19
0012e9b4 02afa8ca mfc80u!operator new+0×27

0012e9c8 02b0992f ModuleA!std::_Allocate<…>+0×1a
0012e9e0 02b09e7c ModuleA!std::vector<double,std::allocator<double> >::vector<double,std::allocator<double> >+0×3f
[…]

We suspected an out-of-memory condition and looked for function parameters:

0:000> kv 5
ChildEBP RetAddr  Args to Child             
0012e950 78158e89 e06d7363 00000001 00000003 kernel32!RaiseException+0x53
0012e988 7830770c 0012e998 783b0110 783c8d68 msvcr80!_CxxThrowException+0x46
0012e99c 783095bc 0000a7c0 0012ea40 000014f8 mfc80u!AfxThrowMemoryException+0x19
0012e9b4 02afa8ca 0000a7c0 089321b0 089321f0 mfc80u!operator new+0×27 (FPO: [Uses EBP] [1,0,0])
0012e9c8 02b0992f 000014f8 00000000 00000008 ModuleA!std::_Allocate<…>+0×1a (FPO: [2,3,0])

Because of FPO optimization we originally thought that stack arguments would be invalid. However, bearing in mind the function prototype and semantics of operator new and std::vector double element type we immediately see the correlation between 0xa7c0 and 0×14f8 which are proportional to sizeof(double) == 8:

0:000> ? 0000a7c0/000014f8
Evaluate expression: 8 = 00000000`00000008

We therefore conclude without looking at disassembly that memory allocation size was 42944 bytes:

0:000> .formats 0000a7c0
Evaluate expression:
  Hex:     00000000`0000a7c0
  Decimal: 42944
  Octal:   0000000000000000123700
  Binary:  00000000 00000000 00000000 00000000 00000000 00000000 10100111 11000000
  Chars:   ……..
  Time:    Thu Jan 01 11:55:44 1970
  Float:   low 6.01774e-041 high 0
  Double:  2.12172e-319

- Dmitry Vostokov @ DumpAnalysis.org + TraceAnalysis.org -

Presentation from Webinar: Introduction to Pattern-Driven Software Problem Solving

Saturday, March 26th, 2011

The presentation materials from the webinar (25th of March, 2011) are available for download:

http://www.dumpanalysis.com/PDSPSI-materials

Thanks to everyone who registered and attended!

- Dmitry Vostokov @ DumpAnalysis.org + TraceAnalysis.org -

DBI: Debugging Bureau of Investigations

Thursday, March 24th, 2011

Memory Dump Analysis Services opens Debugging Bureau to complement crash and hang memory dump analysis audit:

http://www.dumpanalysis.com/debugging-bureau

- Dmitry Vostokov @ DumpAnalysis.org + TraceAnalysis.org -

Icons for Memory Dump Analysis Patterns (Part 94)

Wednesday, March 23rd, 2011

Today we introduce an icon for Stack Overflow (software implementation) pattern:

B/W

Color

- Dmitry Vostokov @ DumpAnalysis.org + TraceAnalysis.org -

Advanced Software Debugging Reference

Wednesday, March 23rd, 2011

While working on Pattern-Driven Software Problem Solving Webinars I needed a reference page that lists all Memory Dump Analysis Anthology volumes:

Advanced Software Debugging Reference

- Dmitry Vostokov @ DumpAnalysis.org + TraceAnalysis.org -

Crash Dump Analysis Patterns (Part 133)

Sunday, March 20th, 2011

Sometimes we see Truncated Stack Trace with missing stack frames. For example, in one incident, after enabling user mode stack trace database for a memory leaking application we got these entries from the growing heap segment (other segments had non-truncated saved stack traces):

0bdc1350: 40010 . 40010 [101] - busy (3fff8) Internal

    7702fbd2: ntdll!RtlAllocateHeap+0x0000021d
    77005eef: ntdll!RtlpAllocateUserBlock+0x000000a2
    77026a65: ntdll!RtlpLowFragHeapAllocFromContext+0x00000785
    7702661f: ntdll!RtlAllocateHeap+0x0000017c

0be01360: 40010 . 40010 [101] - busy (3fff8) Internal

    7702fbd2: ntdll!RtlAllocateHeap+0x0000021d
    77005eef: ntdll!RtlpAllocateUserBlock+0x000000a2
    77026a65: ntdll!RtlpLowFragHeapAllocFromContext+0x00000785
    7702661f: ntdll!RtlAllocateHeap+0x0000017c

0be41370: 40010 . 40010 [101] - busy (3fff8) Internal

    7702fbd2: ntdll!RtlAllocateHeap+0x0000021d
    77005eef: ntdll!RtlpAllocateUserBlock+0x000000a2
    77026a65: ntdll!RtlpLowFragHeapAllocFromContext+0x00000785
    7702661f: ntdll!RtlAllocateHeap+0x0000017c

Truncated traces are different from incorrect stack traces because their surviving part is correct. How can we find the rest of such stack traces? Here we can suggest to look at other heap segments and see allocations of the same size. If a truncated trace comes from a stack trace collection we can compare it with a non-truncated thread stack from another process instance having the same thread position. Other suggestions are welcome too, especially, for the mysterious memory leak shown above.

- Dmitry Vostokov @ DumpAnalysis.org + TraceAnalysis.org -

Analysis Compass

Friday, March 18th, 2011

The forthcoming Software Behavior Analysis Patterns and Checklists: Corporate Reference Poster Edition for Windows Crash and Hang Memory Dump and Software Trace Analysis (ISBN: 978-1908043030) will feature Analysis Compass based on the reversal of an extended version of pattern-to-command table published in the printed version of Debugged! MZ/PE March, 2009 issue. Here’s a sneak preview screenshot of this feature:

The notation F”Function” means searching for frames having the given Function name in stack traces from Stack Trace Collection. S”String” means searching for String in the output of WinDbg commands, for example, !process 0 ff.

- Dmitry Vostokov @ DumpAnalysis.org + TraceAnalysis.org -

Bugtation No.140

Thursday, March 17th, 2011

“If you want to be thought a” debugger “, always” talk about bugs.

Logan Pearsall Smith

- Dmitry Vostokov @ DumpAnalysis.org + TraceAnalysis.org -

Bugtation No.139

Monday, March 14th, 2011

“… he who has not been in” support “does not know what” debugging “is.”

Attributed to Leo Tolstoy, by Adam Ulam, The Bolsheviks

- Dmitry Vostokov @ DumpAnalysis.org + TraceAnalysis.org -

Crash Dump Analysis Patterns (Part 132)

Saturday, March 12th, 2011

Sometimes simple comparison of crash signatures is not enough to find similar support incidents. We then traverse stack trace frames to find Least Common Frame matching similar stack traces in a database. For example, consider this signature:

0:026> r
eax=011349ec ebx=01136738 ecx=79f943e1 edx=00000000 esi=011349ec edi=0888f3b8
eip=00dfbef8 esp=0888f348 ebp=0888f3c8 iopl=0  nv up ei pl zr na pe nc
cs=001b  ss=0023  ds=0023  es=0023  fs=003b  gs=0000  efl=00000246
00dfbef8 3902  cmp     dword ptr [edx],eax  ds:0023:00000000=????????

0:026> k 100
ChildEBP RetAddr 
WARNING: Frame IP not in any known module. Following frames may be wrong.
0888f3c8 792a842f 0xdfbef8
0888f3e4 792a839b mscorlib_ni+0x1e842f
0888f3fc 79e71b4c mscorlib_ni+0x1e839b
0888f40c 79e821b9 mscorwks!CallDescrWorker+0×33
0888f48c 79e8281f mscorwks!CallDescrWorkerWithHandler+0xa3
0888f4ac 79e82860 mscorwks!DispatchCallBody+0×1e
0888f510 79e828d1 mscorwks!DispatchCallDebuggerWrapper+0×3d
0888f544 79ec50f5 mscorwks!DispatchCallNoEH+0×51
0888f5a0 79e9848f mscorwks!AddTimerCallback_Worker+0×66
0888f5b4 79e9842b mscorwks!Thread::DoADCallBack+0×32a
0888f648 79e98351 mscorwks!Thread::ShouldChangeAbortToUnload+0xe3
0888f684 79e984dd mscorwks!Thread::ShouldChangeAbortToUnload+0×30a
0888f6ac 79ec4a84 mscorwks!Thread::ShouldChangeAbortToUnload+0×33e
0888f6c4 79ec5075 mscorwks!ManagedThreadBase::ThreadPool+0×13
0888f70c 79ec50a4 mscorwks!AddTimerCallbackEx+0×83
0888f720 79ec514a mscorwks!AddTimerCallback+0×10
0888f75c 79ec4e0c mscorwks!ThreadpoolMgr::AsyncTimerCallbackCompletion+0×64
0888f7a8 79ec471e mscorwks!UnManagedPerAppDomainTPCount::DispatchWorkItem+0×9a
0888f7bc 79ec4892 mscorwks!ThreadpoolMgr::ExecuteWorkRequest+0xaf
0888f814 79f75715 mscorwks!ThreadpoolMgr::WorkerThreadStart+0×20b
0888ffb4 7c80b729 mscorwks!Thread::intermediateThreadProc+0×49
0888ffec 00000000 kernel32!BaseThreadStart+0×37

Most likely we won’t find any similar stack trace when searching for 0xdfbef8. The search for mscorlib_ni+0×1e842f brings several results but they are not crashes but hangs with the frame being in the middle of call stack. The same is for mscorlib_ni+0×1e839b. So we finally try searching for CallDescrWorker+0×33 but with a limit to results having the same application module name. And indeed we find the similar software incident with the same stack trace after our least common frame:

0:004> r
eax=00000024 ebx=03e6f738 ecx=738129d8 edx=00495ef0 esi=01a87c4c edi=019c5f1c
eip=00a92037 esp=03e6f6cc ebp=03e6f6e8 iopl=0 nv up ei pl nz na pe nc
cs=0023  ss=002b  ds=002b  es=002b  fs=0053  gs=002b  efl=00010206
00a92037 ??              ???

0:004> k 100
  *** Stack trace for last set context - .thread/.cxr resets it
ChildEBP RetAddr 
WARNING: Frame IP not in any known module. Following frames may be wrong.
03e6f6c8 737d6bb5 0xa92037
03e6f6e8 737a509f mscorlib_ni+0x216bb5
03e6f6f8 737a834c mscorlib_ni+0x1e509f
03e6f70c 74171b6c mscorlib_ni+0x1e834c
03e6f71c 74182209 mscorwks!CallDescrWorker+0×33
03e6f79c 7418286f mscorwks!CallDescrWorkerWithHandler+0xa3
03e6f7bc 741828b0 mscorwks!DispatchCallBody+0×1e
03e6f820 74182921 mscorwks!DispatchCallDebuggerWrapper+0×3d
03e6f854 742ced79 mscorwks!DispatchCallNoEH+0×51
03e6f8b0 7419846f mscorwks!AddTimerCallback_Worker+0×66
03e6f8c4 7419840b mscorwks!Thread::DoADCallBack+0×32a
03e6f958 74198331 mscorwks!Thread::ShouldChangeAbortToUnload+0xe3
03e6f994 741984bd mscorwks!Thread::ShouldChangeAbortToUnload+0×30a
03e6f9bc 742ce708 mscorwks!Thread::ShouldChangeAbortToUnload+0×33e
03e6f9d4 742cecf9 mscorwks!ManagedThreadBase::ThreadPool+0×13
03e6fa1c 742ced28 mscorwks!AddTimerCallbackEx+0×83
03e6fa30 742cedce mscorwks!AddTimerCallback+0×10
03e6fa6c 742cea90 mscorwks!ThreadpoolMgr::AsyncTimerCallbackCompletion+0×64
03e6fab8 742ce3a2 mscorwks!UnManagedPerAppDomainTPCount::DispatchWorkItem+0×9a
03e6facc 742ce516 mscorwks!ThreadpoolMgr::ExecuteWorkRequest+0xaf
03e6fb64 74441ec9 mscorwks!ThreadpoolMgr::WorkerThreadStart+0×20b
03e6fc84 76813677 mscorwks!Thread::intermediateThreadProc+0×49
03e6fc90 77219d72 kernel32!BaseThreadInitThunk+0xe
03e6fcd0 77219d45 ntdll!__RtlUserThreadStart+0×70
03e6fce8 00000000 ntdll!_RtlUserThreadStart+0×1b

- Dmitry Vostokov @ DumpAnalysis.org + TraceAnalysis.org -

Software Trace Analysis Checklist

Thursday, March 10th, 2011

Because the number of software trace patterns is growing I’m starting another checklist in addition to memory dump analysis checklist. The goal is to help experienced engineers not to miss any important information. The checklist doesn’t prescribe any specific steps, just lists all possible points to double check when looking at a software trace. Of course, it is not complete at the moment and any suggestions are welcome. This post will be modified on the ongoing basis.

General:

• Check overall trace time delta
• Check no trace metafile message density
• Check whether a trace is a multi-part or a circular
• Check for basic facts and the story (software narrative)
• Check for any exceptions, non-false positive errors and periodic errors
• Check for significant events
• Check for discontinuities in the time domain
• Check for message current and acceleration in the frequency domain

- Dmitry Vostokov @ DumpAnalysis.org + TraceAnalysis.org -

10 Common Mistakes in Memory Analysis (Part 10)

Thursday, March 10th, 2011

The last common mistake in this set is when an engineer doesn’t compare stack traces or other debugger output to normal reference stack traces. For example, a 3rd party service was hanging on a W2K8 server and a user memory dump was saved for offline analysis. The following thread was identified as blocked waiting for a response:

STACK_TEXT: 
0041f1bc 75a30816 ntdll_77e50000!ZwWaitForSingleObject+0x15
0041f228 76671184 KERNELBASE!WaitForSingleObjectEx+0x98
0041f240 76671138 kernel32!WaitForSingleObjectExImplementation+0x75
0041f254 75b17be6 kernel32!WaitForSingleObject+0x12
0041f2f8 75b18040 sechost!ScSendResponseReceiveControls+0xea
0041f3ac 75b18662 sechost!ScDispatcherLoop+0xc2
0041f3ec 01271bb4 sechost!StartServiceCtrlDispatcherW+0xb0
0041fa7c 01271dd6 ServiceA!WinMain+0×254
0041fb0c 76673677 ServiceA!__tmainCRTStartup+0×160
0041fb18 77e89d42 kernel32!BaseThreadInitThunk+0xe
0041fb58 77e89d15 ntdll_77e50000!__RtlUserThreadStart+0×70
0041fb70 00000000 ntdll_77e50000!_RtlUserThreadStart+0×1b

Unfortunately, this thread wasn’t recognized as a normal main service thread. Typical Internet search for ScSendResponseReceiveControls function points to a sample analysis log where we can find such thread stacks in the variety of other standard services:

THREAD fffffa8005362060  Cid 0a1c.0b68  Teb: 000007fffffde000 Win32Thread: 0000000000000000 WAIT: (UserRequest) UserMode Non-Alertable
    fffffa8004dc0a60  SynchronizationEvent
Not impersonating
DeviceMap                 fffff8a000008c10
Owning Process            fffffa800540e060       Image:         svchost.exe
Attached Process          N/A            Image:         N/A
Wait Start TickCount      37051          Ticks: 3239 (0:00:00:50.528)
Context Switch Count      13            
UserTime                  00:00:00.000
KernelTime                00:00:00.000
Win32 Start Address svchost!wmainCRTStartup (0x00000000ffa9246c)
Stack Init fffff88005abbdb0 Current fffff88005abb900
Base fffff88005abc000 Limit fffff88005ab6000 Call 0
Priority 9 BasePriority 8 UnusualBoost 0 ForegroundBoost 0 IoPriority 2 PagePriority 5
Child-SP          RetAddr           Call Site
fffff880`05abb940 fffff800`01a93992 nt!KiSwapContext+0x7a
fffff880`05abba80 fffff800`01a95cff nt!KiCommitThreadWait+0x1d2
fffff880`05abbb10 fffff800`01d871d2 nt!KeWaitForSingleObject+0x19f
fffff880`05abbbb0 fffff800`01a8b993 nt!NtWaitForSingleObject+0xb2
fffff880`05abbc20 00000000`7781fefa nt!KiSystemServiceCopyEnd+0x13 (TrapFrame @ fffff880`05abbc20)
00000000`001ffab8 000007fe`fda910ac ntdll!NtWaitForSingleObject+0xa
00000000`001ffac0 000007fe`ff4eaffb KERNELBASE!WaitForSingleObjectEx+0x79
00000000`001ffb60 000007fe`ff4e9d61 sechost!ScSendResponseReceiveControls+0×13b
00000000`001ffc50 000007fe`ff4e9c16 sechost!ScDispatcherLoop+0×121
00000000`001ffd60 00000000`ffa91d3a sechost!StartServiceCtrlDispatcherW+0×14e
00000000`001ffdb0 00000000`ffa9257a svchost!wmain+0×110
00000000`001ffde0 00000000`776cf56d svchost!ScCreateWellKnownSids+0×2fd
00000000`001ffe20 00000000`77803281 kernel32!BaseThreadInitThunk+0xd
00000000`001ffe50 00000000`00000000 ntdll!RtlUserThreadStart+0×1d

Also, studying OS architecture and deliberate practice in memory dump analysis helps in recognition of problem and normal structural and behavioral patterns.

- Dmitry Vostokov @ DumpAnalysis.org + TraceAnalysis.org -

Forthcoming Webinar: Introduction to Pattern-Driven Software Problem Solving

Monday, March 7th, 2011

Introduction to Pattern-Driven Software Problem Solving Logo

The first Webinar to start an in-depth discussion of pattern-driven software troubleshooting, debugging and maintenance:

Date: 25th of March 2011
Time: 18:30 (GMT) 14:30 (EST) 11:30 (PST)
Duration: 60 minutes

Space is limited.
Reserve your Webinar seat now at:
https://www3.gotomeeting.com/register/448268158

Topics include:

  • A Short History of DumpAnalysis.org
  • Memory Dump Analysis Patterns
  • Troubleshooting and Debugging Tools (Debugware) Patterns
  • Software Trace Analysis Patterns
  • From Software Defects to Software Behavior
  • Workaround Patterns
  • Structural Memory Patterns
  • Memory Analysis Domain Pattern Hierarchy
  • New Directions

Prerequisites: experience in software troubleshooting and/or debugging.

- Dmitry Vostokov @ DumpAnalysis.org + TraceAnalysis.org -

Crash Dump Analysis Patterns (Part 131)

Monday, March 7th, 2011

In certain software behavior scenarios such as a memory leak when we see top modules calling OS API functions we might suspect them having defects. However, this might not be the case and these modules were used from Directing Module  keeping references or handles preventing top modules from freeing memory or releasing resources.

For example, a memory dump from a process had 2 growing heap segments and one of them had this recurrent stack trace saved in a user mode stack trace database:

38D2CE78: 02ba8 . 02ba8 [07] - busy (2b90), tail fill
Stack trace (38101) at 83e390:
7d6568be: ntdll!RtlAllocateHeapSlowly+0×00000041
7d62b846: ntdll!RtlAllocateHeap+0×00000E9F

337d0572: ModuleA!XHeapAlloc+0×00000115
[…]
338809e2: ModuleA!Execute+0×000002CD

488b3fc1: ModuleB!Execute+0×000000D3
679b8c64: ModuleC!ExecuteByHandle+0×00000074
[…]
67d241cb: ModuleD!Query+0×0000016B
67ba2ed4: ModuleE!Browse+0×000000E4
[…]
667122c6: ModuleF!Check+0×00000126
65e73826: ModuleG!Enum+0×00000406

[…]

Initially we suspected ModuleA but found a different recurrent stack trace corresponding to another growing segment:

40C81688: 000c8 . 00058 [07] - busy (40), tail fill
Stack trace (38136) at 83f6a4:
7d6568be: ntdll!RtlAllocateHeapSlowly+0×00000041
7d62b846: ntdll!RtlAllocateHeap+0×00000E9F
7c3416b3: msvcr71!_heap_alloc+0×000000E0
7c3416db: msvcr71!_nh_malloc+0×00000010

67745875: ModuleX!BufAllocate+0×00000015
6775085e: ModuleY!QueryAttribute+0×0000008E
[…]
677502b5: ModuleY!Query+0×00000015
67ba2f19: ModuleE!Browser+0×00000129
[…]
667122c6: ModuleF!Check+0×00000126
65e73826: ModuleG!Enum+0×00000406
[…]

From the common stack trace fragment (highlighted in blue) we transferred our investigation to ModuleE and indeed the similar software incident (as the latter trace) was found in our troubleshooting database.

- Dmitry Vostokov @ DumpAnalysis.org + TraceAnalysis.org -

Stack Overflow Patterns

Wednesday, March 2nd, 2011

A page to reference all different kinds of stack overflow is necessary, so I created this post:

I’ll update it as soon as I add more similar patterns.

- Dmitry Vostokov @ DumpAnalysis.org + TraceAnalysis.org -

Crash Dump Analysis Patterns (Part 16c)

Tuesday, March 1st, 2011

Stack Overflow pattern variants in user and kernel mode are ISA (Instruction Set Architecture) and processor architecture oriented. Another pattern variant is software stack implementations where push and pop operations check  stack ADT preconditions and throw a software exception (overflow or underflow) or call an assertion mechanism to display an error message. For the latter example, we look at a bugcheck for the specific stack implementation on Windows: IRP stack locations array. For a graphical reminder on how driver-to-driver communication is implemented by an IRP stack corresponding to a driver stack please refer to UML diagram no. 3 in the old post about using UML for describing device driver design. The following WinDbg command output is from a kernel memory dump:

0: kd> !analyze -v
[...]
NO_MORE_IRP_STACK_LOCATIONS (35)
A higher level driver has attempted to call a lower level driver through the IoCallDriver() interface, but there are no more stack locations in the packet, hence, the lower level driver would not be able to access its parameters, as there are no parameters for it. This is a disasterous situation, since the higher level driver "thinks" it has filled in the parameters for the lower level driver (something it MUST do before it calls it), but since there is no stack location for the latter driver, the former has written off of the end of the packet.  This means that some other memory has probably been trashed at this point.
Arguments:
Arg1: fffffa800500c9e0, Address of the IRP
Arg2: 0000000000000000
Arg3: 0000000000000000
Arg4: 0000000000000000
[…]

0: kd> kL 100
Child-SP          RetAddr           Call Site
fffff880`01fe2338 fffff800`016d7732 nt!KeBugCheckEx
fffff880`01fe2340 fffff800`01754f27 nt!KiBugCheck3+0x12
fffff880`01fe2370 fffff880`0177e271 nt! ?? ::FNODOBFM::`string’+0×3f31b
fffff880`01fe23a0 fffff880`0177c138 DriverA!CallProvider+0×161
[…]
fffff880`01fe2cb0 fffff800`0197a7c6 nt!ExpWorkerThread+0×111
fffff880`01fe2d40 fffff800`016b5c26 nt!PspSystemThreadStartup+0×5a
fffff880`01fe2d80 00000000`00000000 nt!KxStartSystemThread+0×16

0: kd> !irp fffffa800500c9e0
Irp is active with 1 stacks 0 is current (= 0xfffffa8006c2e960)
 No Mdl: No System Buffer: Thread 00000000:  Irp stack trace. 
     cmd  flg cl Device   File     Completion-Context
 [  4, 0]   0 e0 fffffa8004045c50 fffffa8006c2e960 fffff88005a04460-fffffa8005b9c370 Success Error Cancel
        \DriverA DriverB!CompleteRoutine
   Args: 00000008 00000000 00000000 00000000

0: kd> ub fffff880`0177e271
DriverA!CallProvider+0×13e:
fffff880`0177e24e mov     qword ptr [r11-10h],rax
fffff880`0177e252 mov     qword ptr [r11-8],r12
fffff880`0177e256 mov     byte ptr [r11-45h],0E0h
fffff880`0177e25b mov     rcx,qword ptr [rdi+40h]
fffff880`0177e25f call    qword ptr [DriverA!_imp_IoGetAttachedDevice (fffff880`017790b0)]
fffff880`0177e265 mov     rdx,rbp
fffff880`0177e268 mov     rcx,rax
fffff880`0177e26b call    qword ptr [DriverA!_imp_IofCallDriver (fffff880`01779068)]

- Dmitry Vostokov @ DumpAnalysis.org + TraceAnalysis.org -

Bugtation No.138

Thursday, February 24th, 2011

Learning from Philip Marlowe, a detective:

“I like you,” she said suddenly. “You believe in miracles.”

Raymond Chandler, The Big Sleep

Do you believe in miracles from a driver modifying an arbitrary user space? Or in a miracle of suddenly disappearing software incidents?

- Dmitry Vostokov @ DumpAnalysis.org + TraceAnalysis.org -

Icons for Memory Dump Analysis Patterns (Part 93)

Thursday, February 24th, 2011

Today we introduce an icon for Hardware Error pattern:

B/W

Color

- Dmitry Vostokov @ DumpAnalysis.org + TraceAnalysis.org -