Minidump Analysis (Part 2)

In the first part I outlined WinDbg commands we can use to get various information from system minidumps. In this part I will show how to find a 3rd-party driver that might have been responsible for system crash and verify that WinDbg reports that driver correctly.

For example, one of the log files shows the following !analyze -v results:

command> !analyze -v
*************************************************************
*                                                           *
*                     Bugcheck Analysis                     *
*                                                           *
*************************************************************

BAD_POOL_CALLER (c2)
The current thread is making a bad pool request. Typically this is at a bad IRQL level or double freeing the same allocation, etc.
Arguments:
Arg1: 00000047, Attempt to free a non-allocated nonpaged pool address
Arg2: 85083000, Starting address
Arg3: 00005083, physical page frame
Arg4: 000bfff9, highest physical page frame

Debugging Details:
------------------
BUGCHECK_STR:  0xc2_47

CUSTOMER_CRASH_COUNT:  2

DEFAULT_BUCKET_ID:  DRIVER_FAULT_SERVER_MINIDUMP

PROCESS_NAME:  3rdPartyAntivirus.exe

CURRENT_IRQL:  0

LAST_CONTROL_TRANSFER:  from 8054d2eb to 805435b9

STACK_TEXT:
b325db68 8054d2eb 000000c2 00000047 85083000 nt!KeBugCheckEx+0x19
b325db94 805689c2 85083000 00080000 85083000 nt!MmGetSizeOfBigPoolAllocation+0x1cb
b325dbe4 f77c8098 00000000 00000000 00000004 nt!ExFreePoolWithTag+0x1d0
WARNING: Stack unwind information not available. Following frames may be wrong.
b325dc24 f77c8234 8599d4e0 8599d4e0 88cecf38 3rdPartyAVDrv+0×1098
b325dc3c 804f04f3 871d2cf0 00004808 8743b6a0 3rdPartyAVDrv+0×1234
b325dc4c 80585208 8599d550 872c8028 8599d4e0 nt!IofCallDriver+0×3f
b325dc60 80585fe6 871d2cf0 8599d4f0 872c8028 nt!IopSynchronousServiceTail+0×6f
b325dd00 80586028 00000468 00000614 00000000 nt!IopXxxControlFile+0×607
b325dd34 804dfd24 00000468 00000614 00000000 nt!NtDeviceIoControlFile+0×28
b325dd34 7ffe0304 00000468 00000614 00000000 nt!KiSystemService+0xd0
0087b6c0 00000000 00000000 00000000 00000000 SharedUserData!SystemCallStub+0×4
STACK_COMMAND:  kb

FOLLOWUP_IP:
3rdPartyAVDrv+1098
f77c8098 ??              ???

SYMBOL_STACK_INDEX:  3

SYMBOL_NAME:  3rdPartyAVDrv+1098

FOLLOWUP_NAME:  MachineOwner

MODULE_NAME: 3rdPartyAVDrv

IMAGE_NAME:  3rdPartyAVDrv.sys

DEBUG_FLR_IMAGE_TIMESTAMP:  410752c5

FAILURE_BUCKET_ID:  0xc2_47_3rdPartyAVDrv+1098

BUCKET_ID:  0xc2_47_3rdPartyAVDrv+1098

Followup: MachineOwner
---------

MODULE_NAME and IMAGE_NAME report 3rd-party antivirus driver 3rdPartyAVDrv.sys responsible for BSOD. However the top lines from STACK_TEXT report nt module, Windows NT kernel and system. We can get information about all loaded drivers from the output of lmv command:

command> lmv
start    end        module name
804de000 80744000   nt       # (pdb symbols)          c:\...\ntkrnlmp.pdb\...\ntkrnlmp.pdb
    Loaded symbol image file: ntkrnlmp.exe
    Mapped memory image file: c:\...\ntkrnlmp.exe\...\ntkrnlmp.exe
    Image path: ntkrnlmp.exe
    Image name: ntkrnlmp.exe
    Timestamp:        Tue Mar 25 08:39:34 2003 (3E8015C6)
    CheckSum:         00254553
    ImageSize:        00266000
    File version:     5.2.3790.0
    Product version:  5.2.3790.0
    File flags:       0 (Mask 3F)
    File OS:          40004 NT Win32
    File type:        1.0 App
    File date:        00000000.00000000
    Translations:     0415.04b0
    CompanyName:      Microsoft Corporation
    ProductName:      Microsoft® Windows® Operating System
    InternalName:     ntkrnlmp.exe
    OriginalFilename: ntkrnlmp.exe
    ProductVersion:   5.2.3790.0
    FileVersion:      5.2.3790.0 (srv03_rtm.030324-2048)
    FileDescription:  NT Kernel & System
    LegalCopyright:   © Microsoft Corporation. All rights reserved.
...
...
...
f77c7000 f77cf000   3rdPartyAVDrv 3rdPartyAVDrv.sys
    Loaded symbol image file: 3rdPartyAVDrv.sys
    Symbol file: 3rdPartyAVDrv.sys
    Image path: 3rdPartyAVDrv.sys
    Timestamp:        Wed Jul 28 07:15:21 2004 (410752C5)
    CheckSum:         00011518
    ImageSize:        00008000
    Translations:     0000.04b0 0000.04e0 0409.04b0 0409.04e0
...
...
...

Why WinDbg skips Microsoft module and points to the 3rd-party one? Because, when WinDbg encounters a non-Microsoft driver, it always shows it as the possible cause. Then it looks at triage.ini file located in triage folder under Debugging Tools for Windows installation folder where certain modules and functions are listed with appropriate actions for WinDbg, for example:

nt!KeBugCheck*=ignore
nt!ExAllocatePool=Pool_corruption

Let’s add the following entry at the end of that INI file:

3rdPartyAVDrv!*=ignore

Now, if we load the dump in WinDbg, set the symbols and run the analysis, we get different results:

3: kd> .symfix
No downstream store given, using C:\Program Files\Debugging Tools for Windows\sym
3: kd> !analyze -v
...
...
...
FOLLOWUP_NAME:  MachineOwner

MODULE_NAME: nt

DEBUG_FLR_IMAGE_TIMESTAMP:  3e8015c6

SYMBOL_NAME:  nt!MmGetSizeOfBigPoolAllocation+1cb

IMAGE_NAME:  memory_corruption

FAILURE_BUCKET_ID:  0xc2_47_nt!MmGetSizeOfBigPoolAllocation+1cb

BUCKET_ID:  0xc2_47_nt!MmGetSizeOfBigPoolAllocation+1cb

Followup: MachineOwner
---------

Because nt!MmGetSizeOfBigPoolAllocation is not listed in triage.ini WinDbg reports nt module and memory corruption. The latter cause is probably inferred from either BAD_POOL_CALLER bugcheck name or Mm function prefix.

Let’s add more lines to tiage.ini:

3rdPartyAVDrv!*=ignore
nt!MmGetSizeOfBigPoolAllocation=ignore
nt!ExFreePoolWithTag=Dynamic memory corruption detected when freeing memory 

Now the analysis reports our custom follow up message:

3: kd> !analyze -v
...
...
...
FOLLOWUP_IP:
nt!ExFreePoolWithTag+1d0
805689c2 e9c8f0ffff      jmp     nt!ExFreePoolWithTag+0x1d0 (80567a8f)

SYMBOL_STACK_INDEX:  2

FOLLOWUP_NAME:  Dynamic memory corruption detected when freeing memory

MODULE_NAME: nt

IMAGE_NAME:  ntkrnlmp.exe

DEBUG_FLR_IMAGE_TIMESTAMP:  3e8015c6

SYMBOL_NAME:  nt!ExFreePoolWithTag+1d0

FAILURE_BUCKET_ID:  0xc2_47_nt!ExFreePoolWithTag+1d0

BUCKET_ID:  0xc2_47_nt!ExFreePoolWithTag+1d0

Followup: Dynamic memory corruption detected when freeing memory
---------

Let’s look at STACK_TEXT data:

STACK_TEXT:
b325db68 8054d2eb 000000c2 00000047 85083000 nt!KeBugCheckEx+0x19
b325db94 805689c2 85083000 00080000 85083000 nt!MmGetSizeOfBigPoolAllocation+0x1cb
b325dbe4 f77c8098 00000000 00000000 00000004 nt!ExFreePoolWithTag+0×1d0
WARNING: Stack unwind information not available. Following frames may be wrong.
b325dc24 f77c8234 8599d4e0 8599d4e0 88cecf38 3rdPartyAVDrv+0×1098
b325dc3c 804f04f3 871d2cf0 00004808 8743b6a0 3rdPartyAVDrv+0×1234
b325dc4c 80585208 8599d550 872c8028 8599d4e0 nt!IofCallDriver+0×3f
b325dc60 80585fe6 871d2cf0 8599d4f0 872c8028 nt!IopSynchronousServiceTail+0×6f
b325dd00 80586028 00000468 00000614 00000000 nt!IopXxxControlFile+0×607
b325dd34 804dfd24 00000468 00000614 00000000 nt!NtDeviceIoControlFile+0×28
b325dd34 7ffe0304 00000468 00000614 00000000 nt!KiSystemService+0xd0
0087b6c0 00000000 00000000 00000000 00000000 SharedUserData!SystemCallStub+0×4

We see that WinDbg is not sure that it correctly identified 3rdPartyAVDrv module. We can check this manually by subtracting 0×1098 from f77c8098 (shown in blue above). The latter address is the so called return address saved when nt!ExFreePoolWithTag was called and can located in the second column. We can see that it falls within 3rdPartyAVDrv address range and therefore the stack looks correct:

3: kd> ? f77c8098-0x1098
Evaluate expression: -142839808 = f77c7000

3: kd> lmv m 3rdPartyAVDrv
f77c7000 f77cf000   3rdPartyAVDrv 3rdPartyAVDrv.sys
    Loaded symbol image file: 3rdPartyAVDrv.sys
    Symbol file: 3rdPartyAVDrv.sys
    Image path: 3rdPartyAVDrv.sys
    Browse all global symbols  functions  data
    Timestamp:        Wed Jul 28 07:15:21 2004 (410752C5)
    CheckSum:         00011518
    ImageSize:        00008000
    Translations:     0000.04b0 0000.04e0 0409.04b0 0409.04e0

The driver is dated Jul, 2004 and therefore we can try either to disable it or contract the vendor for any updates. The “Timestamp” refers to the time when the driver was built at the vendor software factory and not the time when it was installed. This is a way to identify the version of the software if no more information except the time is present in the output of lmv command.

Also note the PROCESS_NAME field in the analysis output. It is a process that was active at the time of the system failure. The application that the process belongs to might be from the same vendor as the suspicious driver but usually they are independent, for example, if there is a bug in a display driver it might manifest when any application is running.

Sometimes stack trace (STACK_TEXT) starts with a warning like in this example:

STACK_TEXT:
WARNING: Stack unwind information not available. Following frames may be wrong.
b4f528b0 b4f52904 e24079e0 000000ab 000003cb DisplayDriver+0x21bca6
b4f528b4 e24079e0 000000ab 000003cb 43f0027f 0xb4f52904
b4f52904 00000000 000003cb 0000027a 00000135 0xe24079e0

In such cases we can look at a crash point (FAULTING_IP) and notice the module there:

FAULTING_IP:
DisplayDriver+21bca6
bfbf0ca6 8b3e            mov     edi,dword ptr [esi]

0: kd> lmv m DisplayDriver
start    end        module name
bf9d5000 bff42500   DisplayDriver T (no symbols)
    Loaded symbol image file: DisplayDriver.dll
    Image path: DisplayDriver.dll
    Image name: DisplayDriver.dll
    Timestamp:        Fri Jun 29 09:13:08 2007 (4684BF14)
    CheckSum:         00570500
    ImageSize:        0056D500
    Translations:     0000.04b0 0000.04e0 0409.04b0 0409.04e0

- Dmitry Vostokov @ DumpAnalysis.org -

7 Responses to “Minidump Analysis (Part 2)”

  1. Dmitry Vostokov Says:

    Part 3:

    http://www.dumpanalysis.org/blog/index.php/2007/09/12/minidump-analysis-part-3/

  2. Andrei Belogortseff Says:

    Thank you for very useful posts, I’ve just discovered your blog and have not read everything in detail yet (but intend to!)

    I have one question that I cannot find an answer to anywhere:
    what value of DEBUG_FLR_IMAGE_TIMESTAMP is displayed in the minidump? Is it related to the time stamp of the sys file, or to the time stamp in its PE header, or anything else? Can it be used to distinguish between different revisions of the same driver?

    Thank you in advance!

  3. Dmitry Vostokov Says:

    Seems it is taken from PE header:

    DEBUG_FLR_IMAGE_TIMESTAMP: 45e53f9d

    1: kd> !dh -f nt

    File Type: EXECUTABLE IMAGE
    FILE HEADER VALUES
    14C machine (i386)
    19 number of sections
    45E53F9D time date stamp Wed Feb 28 08:38:53 2007

  4. Dmitry Vostokov Says:

    Also timestamp can be used to differentiate files because it is set by a linker.

  5. Andrei Belogortseff Says:

    Thank you very much, it’s very helpful!

  6. Crash Dump Analysis » Blog Archive » Null data pointer, pass through functions and platformorphic fault: pattern cooperation Says:

    […] !analyze -v pointed to the first non-MS module DriverA (the identification process is explained here) located on the following stack trace (that also shows exception processing in file system kernel […]

  7. Wisdom-Fu: E-mail alert when you find a memory dump « Wag the Real Says:

    […] Minidump Analysis (Part 2) […]

Leave a Reply

You must be logged in to post a comment.