Minidump Analysis (Part 2)
In the first part I outlined WinDbg commands we can use to get various information from system minidumps. In this part I will show how to find a 3rd-party driver that might have been responsible for system crash and verify that WinDbg reports that driver correctly.
For example, one of the log files shows the following !analyze -v results:
command> !analyze -v
*************************************************************
* *
* Bugcheck Analysis *
* *
*************************************************************
BAD_POOL_CALLER (c2)
The current thread is making a bad pool request. Typically this is at a bad IRQL level or double freeing the same allocation, etc.
Arguments:
Arg1: 00000047, Attempt to free a non-allocated nonpaged pool address
Arg2: 85083000, Starting address
Arg3: 00005083, physical page frame
Arg4: 000bfff9, highest physical page frame
Debugging Details:
------------------
BUGCHECK_STR: 0xc2_47
CUSTOMER_CRASH_COUNT: 2
DEFAULT_BUCKET_ID: DRIVER_FAULT_SERVER_MINIDUMP
PROCESS_NAME: 3rdPartyAntivirus.exe
CURRENT_IRQL: 0
LAST_CONTROL_TRANSFER: from 8054d2eb to 805435b9
STACK_TEXT:
b325db68 8054d2eb 000000c2 00000047 85083000 nt!KeBugCheckEx+0x19
b325db94 805689c2 85083000 00080000 85083000 nt!MmGetSizeOfBigPoolAllocation+0x1cb
b325dbe4 f77c8098 00000000 00000000 00000004 nt!ExFreePoolWithTag+0x1d0
WARNING: Stack unwind information not available. Following frames may be wrong.
b325dc24 f77c8234 8599d4e0 8599d4e0 88cecf38 3rdPartyAVDrv+0×1098
b325dc3c 804f04f3 871d2cf0 00004808 8743b6a0 3rdPartyAVDrv+0×1234
b325dc4c 80585208 8599d550 872c8028 8599d4e0 nt!IofCallDriver+0×3f
b325dc60 80585fe6 871d2cf0 8599d4f0 872c8028 nt!IopSynchronousServiceTail+0×6f
b325dd00 80586028 00000468 00000614 00000000 nt!IopXxxControlFile+0×607
b325dd34 804dfd24 00000468 00000614 00000000 nt!NtDeviceIoControlFile+0×28
b325dd34 7ffe0304 00000468 00000614 00000000 nt!KiSystemService+0xd0
0087b6c0 00000000 00000000 00000000 00000000 SharedUserData!SystemCallStub+0×4
STACK_COMMAND: kb
FOLLOWUP_IP:
3rdPartyAVDrv+1098
f77c8098 ?? ???
SYMBOL_STACK_INDEX: 3
SYMBOL_NAME: 3rdPartyAVDrv+1098
FOLLOWUP_NAME: MachineOwner
MODULE_NAME: 3rdPartyAVDrv
IMAGE_NAME: 3rdPartyAVDrv.sys
DEBUG_FLR_IMAGE_TIMESTAMP: 410752c5
FAILURE_BUCKET_ID: 0xc2_47_3rdPartyAVDrv+1098
BUCKET_ID: 0xc2_47_3rdPartyAVDrv+1098
Followup: MachineOwner
---------
MODULE_NAME and IMAGE_NAME report 3rd-party antivirus driver 3rdPartyAVDrv.sys responsible for BSOD. However the top lines from STACK_TEXT report nt module, Windows NT kernel and system. We can get information about all loaded drivers from the output of lmv command:
command> lmv
start end module name
804de000 80744000 nt # (pdb symbols) c:\...\ntkrnlmp.pdb\...\ntkrnlmp.pdb
Loaded symbol image file: ntkrnlmp.exe
Mapped memory image file: c:\...\ntkrnlmp.exe\...\ntkrnlmp.exe
Image path: ntkrnlmp.exe
Image name: ntkrnlmp.exe
Timestamp: Tue Mar 25 08:39:34 2003 (3E8015C6)
CheckSum: 00254553
ImageSize: 00266000
File version: 5.2.3790.0
Product version: 5.2.3790.0
File flags: 0 (Mask 3F)
File OS: 40004 NT Win32
File type: 1.0 App
File date: 00000000.00000000
Translations: 0415.04b0
CompanyName: Microsoft Corporation
ProductName: Microsoft® Windows® Operating System
InternalName: ntkrnlmp.exe
OriginalFilename: ntkrnlmp.exe
ProductVersion: 5.2.3790.0
FileVersion: 5.2.3790.0 (srv03_rtm.030324-2048)
FileDescription: NT Kernel & System
LegalCopyright: © Microsoft Corporation. All rights reserved.
...
...
...
f77c7000 f77cf000 3rdPartyAVDrv 3rdPartyAVDrv.sys
Loaded symbol image file: 3rdPartyAVDrv.sys
Symbol file: 3rdPartyAVDrv.sys
Image path: 3rdPartyAVDrv.sys
Timestamp: Wed Jul 28 07:15:21 2004 (410752C5)
CheckSum: 00011518
ImageSize: 00008000
Translations: 0000.04b0 0000.04e0 0409.04b0 0409.04e0
...
...
...
Why WinDbg skips Microsoft module and points to the 3rd-party one? Because, when WinDbg encounters a non-Microsoft driver, it always shows it as the possible cause. Then it looks at triage.ini file located in triage folder under Debugging Tools for Windows installation folder where certain modules and functions are listed with appropriate actions for WinDbg, for example:
nt!KeBugCheck*=ignore
nt!ExAllocatePool=Pool_corruption
Let’s add the following entry at the end of that INI file:
3rdPartyAVDrv!*=ignore
Now, if we load the dump in WinDbg, set the symbols and run the analysis, we get different results:
3: kd> .symfix
No downstream store given, using C:\Program Files\Debugging Tools for Windows\sym
3: kd> !analyze -v
...
...
...
FOLLOWUP_NAME: MachineOwner
MODULE_NAME: nt
DEBUG_FLR_IMAGE_TIMESTAMP: 3e8015c6
SYMBOL_NAME: nt!MmGetSizeOfBigPoolAllocation+1cb
IMAGE_NAME: memory_corruption
FAILURE_BUCKET_ID: 0xc2_47_nt!MmGetSizeOfBigPoolAllocation+1cb
BUCKET_ID: 0xc2_47_nt!MmGetSizeOfBigPoolAllocation+1cb
Followup: MachineOwner
---------
Because nt!MmGetSizeOfBigPoolAllocation is not listed in triage.ini WinDbg reports nt module and memory corruption. The latter cause is probably inferred from either BAD_POOL_CALLER bugcheck name or Mm function prefix.
Let’s add more lines to tiage.ini:
3rdPartyAVDrv!*=ignore
nt!MmGetSizeOfBigPoolAllocation=ignore
nt!ExFreePoolWithTag=Dynamic memory corruption detected when freeing memory
Now the analysis reports our custom follow up message:
3: kd> !analyze -v
...
...
...
FOLLOWUP_IP:
nt!ExFreePoolWithTag+1d0
805689c2 e9c8f0ffff jmp nt!ExFreePoolWithTag+0x1d0 (80567a8f)
SYMBOL_STACK_INDEX: 2
FOLLOWUP_NAME: Dynamic memory corruption detected when freeing memory
MODULE_NAME: nt
IMAGE_NAME: ntkrnlmp.exe
DEBUG_FLR_IMAGE_TIMESTAMP: 3e8015c6
SYMBOL_NAME: nt!ExFreePoolWithTag+1d0
FAILURE_BUCKET_ID: 0xc2_47_nt!ExFreePoolWithTag+1d0
BUCKET_ID: 0xc2_47_nt!ExFreePoolWithTag+1d0
Followup: Dynamic memory corruption detected when freeing memory
---------
Let’s look at STACK_TEXT data:
STACK_TEXT:
b325db68 8054d2eb 000000c2 00000047 85083000 nt!KeBugCheckEx+0x19
b325db94 805689c2 85083000 00080000 85083000 nt!MmGetSizeOfBigPoolAllocation+0x1cb
b325dbe4 f77c8098 00000000 00000000 00000004 nt!ExFreePoolWithTag+0×1d0
WARNING: Stack unwind information not available. Following frames may be wrong.
b325dc24 f77c8234 8599d4e0 8599d4e0 88cecf38 3rdPartyAVDrv+0×1098
b325dc3c 804f04f3 871d2cf0 00004808 8743b6a0 3rdPartyAVDrv+0×1234
b325dc4c 80585208 8599d550 872c8028 8599d4e0 nt!IofCallDriver+0×3f
b325dc60 80585fe6 871d2cf0 8599d4f0 872c8028 nt!IopSynchronousServiceTail+0×6f
b325dd00 80586028 00000468 00000614 00000000 nt!IopXxxControlFile+0×607
b325dd34 804dfd24 00000468 00000614 00000000 nt!NtDeviceIoControlFile+0×28
b325dd34 7ffe0304 00000468 00000614 00000000 nt!KiSystemService+0xd0
0087b6c0 00000000 00000000 00000000 00000000 SharedUserData!SystemCallStub+0×4
We see that WinDbg is not sure that it correctly identified 3rdPartyAVDrv module. We can check this manually by subtracting 0×1098 from f77c8098 (shown in blue above). The latter address is the so called return address saved when nt!ExFreePoolWithTag was called and can located in the second column. We can see that it falls within 3rdPartyAVDrv address range and therefore the stack looks correct:
3: kd> ? f77c8098-0x1098
Evaluate expression: -142839808 = f77c7000
3: kd> lmv m 3rdPartyAVDrv
f77c7000 f77cf000 3rdPartyAVDrv 3rdPartyAVDrv.sys
Loaded symbol image file: 3rdPartyAVDrv.sys
Symbol file: 3rdPartyAVDrv.sys
Image path: 3rdPartyAVDrv.sys
Browse all global symbols functions data
Timestamp: Wed Jul 28 07:15:21 2004 (410752C5)
CheckSum: 00011518
ImageSize: 00008000
Translations: 0000.04b0 0000.04e0 0409.04b0 0409.04e0
The driver is dated Jul, 2004 and therefore we can try either to disable it or contract the vendor for any updates. The “Timestamp” refers to the time when the driver was built at the vendor software factory and not the time when it was installed. This is a way to identify the version of the software if no more information except the time is present in the output of lmv command.
Also note the PROCESS_NAME field in the analysis output. It is a process that was active at the time of the system failure. The application that the process belongs to might be from the same vendor as the suspicious driver but usually they are independent, for example, if there is a bug in a display driver it might manifest when any application is running.
Sometimes stack trace (STACK_TEXT) starts with a warning like in this example:
STACK_TEXT:
WARNING: Stack unwind information not available. Following frames may be wrong.
b4f528b0 b4f52904 e24079e0 000000ab 000003cb DisplayDriver+0x21bca6
b4f528b4 e24079e0 000000ab 000003cb 43f0027f 0xb4f52904
b4f52904 00000000 000003cb 0000027a 00000135 0xe24079e0
In such cases we can look at a crash point (FAULTING_IP) and notice the module there:
FAULTING_IP:
DisplayDriver+21bca6
bfbf0ca6 8b3e mov edi,dword ptr [esi]
0: kd> lmv m DisplayDriver
start end module name
bf9d5000 bff42500 DisplayDriver T (no symbols)
Loaded symbol image file: DisplayDriver.dll
Image path: DisplayDriver.dll
Image name: DisplayDriver.dll
Timestamp: Fri Jun 29 09:13:08 2007 (4684BF14)
CheckSum: 00570500
ImageSize: 0056D500
Translations: 0000.04b0 0000.04e0 0409.04b0 0409.04e0
- Dmitry Vostokov @ DumpAnalysis.org -
September 12th, 2007 at 3:31 pm
Part 3:
http://www.dumpanalysis.org/blog/index.php/2007/09/12/minidump-analysis-part-3/
December 7th, 2007 at 12:36 am
Thank you for very useful posts, I’ve just discovered your blog and have not read everything in detail yet (but intend to!)
I have one question that I cannot find an answer to anywhere:
what value of DEBUG_FLR_IMAGE_TIMESTAMP is displayed in the minidump? Is it related to the time stamp of the sys file, or to the time stamp in its PE header, or anything else? Can it be used to distinguish between different revisions of the same driver?
Thank you in advance!
December 7th, 2007 at 12:43 pm
Seems it is taken from PE header:
DEBUG_FLR_IMAGE_TIMESTAMP: 45e53f9d
1: kd> !dh -f nt
File Type: EXECUTABLE IMAGE
FILE HEADER VALUES
14C machine (i386)
19 number of sections
45E53F9D time date stamp Wed Feb 28 08:38:53 2007
December 7th, 2007 at 12:49 pm
Also timestamp can be used to differentiate files because it is set by a linker.
December 7th, 2007 at 4:27 pm
Thank you very much, it’s very helpful!
June 20th, 2009 at 12:30 am
[…] !analyze -v pointed to the first non-MS module DriverA (the identification process is explained here) located on the following stack trace (that also shows exception processing in file system kernel […]
March 24th, 2010 at 1:42 am
[…] Minidump Analysis (Part 2) […]