Crash Dump Analysis AntiPatterns (Part 13)

Sometimes we follow the course of established troubleshooting or debugging methods without questioning them or not considering possible exceptions. For example, the usual advice to heap corruption signs in process memory dumps is to ask to enable full page heap. However page heap helps to catch buffer overwrites but not underwrites and heap can also be damaged by other means like double free, passing an invalid address or direct corruption of control structures via a dangling pointer. Therefore in some cases when we enable full page heap we get the same stack trace:

Normal:

0:030> kL 100
ChildEBP RetAddr 
175af77c 7d620ec5 ntdll_7d600000!RtlpCoalesceFreeBlocks+0×231
175af864 7c34218a ntdll_7d600000!RtlFreeHeap+0×38e

175af8ac 67d6153d msvcr71!free+0xc3
175af950 67781d69 Component!OnData+0×504
175af96c 7da4d03d Component!DispatchRequest+0×99

175afc28 7da4d177 rpcrt4!DispatchToStubInCNoAvrf+0×38
175afc7c 7da4d812 rpcrt4!RPC_INTERFACE::DispatchToStubWorker+0×11f
175afca0 7da59a1c rpcrt4!RPC_INTERFACE::DispatchToStub+0xa3
175afcfc 7da4da21 rpcrt4!LRPC_SCALL::DealWithRequestMessage+0×421
175afd20 7da3db14 rpcrt4!LRPC_ADDRESS::DealWithLRPCRequest+0×127
175aff84 7da45eac rpcrt4!LRPC_ADDRESS::ReceiveLotsaCalls+0×430
175aff8c 7da45dd0 rpcrt4!RecvLotsaCallsWrapper+0xd
175affac 7da45e94 rpcrt4!BaseCachedThreadRoutine+0×9d
175affb8 7d4dfe37 rpcrt4!ThreadStartRoutine+0×1b
175affec 00000000 kernel32!BaseThreadStart+0×34

With full page heap enabled:

0:030> kL
ChildEBP RetAddr 
6092f568 7d6822fa ntdll!RtlpDphIsNormalHeapBlock+0×1f
6092f598 7d68256c ntdll!RtlpDphNormalHeapFree+0×21
6092f5f0 7d685443 ntdll!RtlpDebugPageHeapFree+0×146
6092f658 7d65714a ntdll!RtlDebugFreeHeap+0×2c
6092f730 7d62c5c0 ntdll!RtlFreeHeapSlowly+0×37
6092f814 7c34218a ntdll!RtlFreeHeap+0×11a

6092f85c 67d6153d msvcr71!free+0xc3
6092f900 67781d69 Component!OnData+0×504
6092f91c 7da4d03d Component!DispatchRequest+0×99

6092fc10 7da7ec5e rpcrt4!DispatchToStubInCNoAvrf+0×38
6092fc28 7da4d177 rpcrt4!DispatchToStubInCAvrf+0×14
6092fc7c 7da4d812 rpcrt4!RPC_INTERFACE::DispatchToStubWorker+0×11f
6092fca0 7da59a1c rpcrt4!RPC_INTERFACE::DispatchToStub+0xa3
6092fcfc 7da4da21 rpcrt4!LRPC_SCALL::DealWithRequestMessage+0×421
6092fd20 7da3db14 rpcrt4!LRPC_ADDRESS::DealWithLRPCRequest+0×127
6092ff84 7da45eac rpcrt4!LRPC_ADDRESS::ReceiveLotsaCalls+0×430
6092ff8c 7da45dd0 rpcrt4!RecvLotsaCallsWrapper+0xd
6092ffac 7da45e94 rpcrt4!BaseCachedThreadRoutine+0×9d
6092ffb8 7d4dfe37 rpcrt4!ThreadStartRoutine+0×1b
6092ffec 00000000 kernel32!BaseThreadStart+0×34

We see that we get the same Component stack trace before calling free (shown in blue above) so we better search in a database of stack trace signatures before sending Habitual Reply to enable page heap. If there is a match in database then the course of future actions can be completely different: an immediate available hotfix or we find a similar problem that is already under investigation. The sooner we provide a relief to our customer the better.

- Dmitry Vostokov @ DumpAnalysis.org -

5 Responses to “Crash Dump Analysis AntiPatterns (Part 13)”

  1. Dmitry Vostokov Says:

    Another instance that I observed recently is looking at 32-bit stacks of a WOW64 process ignoring 64-bit stacks.

  2. Neelabh Mam Says:

    but verifier does expose a “backward” check box under “heap” tree item’s properties. Does that not introduce a guard page at the front as well for underwrites ?

  3. Dmitry Vostokov Says:

    I have to check there as I rarely used app verifier only gflags.exe.

  4. Neelabh Mam Says:

    verifier , as far as I can tell, is effectively a UI wrapper on top of gflags command line. Settings from either of the two eventually go to HKLM\SOFTWARE\Microsoft\Windows NT\CurrentVersion\Image File Execution Options\

    Now as per:
    http://msdn.microsoft.com/en-us/library/windows/hardware/ff549566(v=vs.85).aspx backward implies

    /backwards
    Places the zone of reserved virtual memory at the beginning of an allocation, rather than at the end. As a result, the debugger traps overruns at the beginning of the buffer, instead of those at the end of the buffer. Valid only with the /full parameter.

    much like when we run, say a for loop backwards…

    This must have been an addition to gflags/AV post this blog post was originally published …

  5. Dmitry Vostokov Says:

    Indeed, so I will update this in the second edition of Volume 1 to be published this year with credits. Thank you!

Leave a Reply

You must be logged in to post a comment.