Crash Dump Analysis Patterns (Part 87)

Sometimes we get rare or hardware-related bugchecks and these might have been influenced by hardware virtualization (that is also a software and could have its own defects). Therefore it is beneficial to recognize when a system is running under a VM (Virtualized System pattern):

Memory Dumps from Xen-virtualized Windows

Memory dumps from VMware images (Virtual PC diagnostics in post comments)

Memory Dumps from Hyper-Virtualized Windows

For example, we get the following bugcheck and stack trace for the first processor:

0: kd> !analyze -v
[...]
CLOCK_WATCHDOG_TIMEOUT (101)
An expected clock interrupt was not received on a secondary processor in an
MP system within the allocated interval. This indicates that the specified
processor is hung and not processing interrupts.
Arguments:
Arg1: 00000060, Clock interrupt time out interval in nominal clock ticks.
Arg2: 00000000, 0.
Arg3: 805d8120, The PRCB address of the hung processor.
Arg4: 00000001, 0.

CURRENT_IRQL:  1c

STACK_TEXT: 
8d283694 816df8a5 nt!KeBugCheckEx+0x1e
8d2836c8 816e163d nt!KeUpdateRunTime+0xd5
8d2836c8 84617008 nt!KeUpdateSystemTime+0xed
WARNING: Frame IP not in any known module. Following frames may be wrong.
8d283748 816f46a1 0×84617008
8d283758 816fa6aa nt!HvlpInitiateHypercall+0×21
8d283784 8166aca5 nt!HvlNotifyLongSpinWait+0×2b
8d2837a0 816cce7e nt!KeFlushSingleTb+0xc4
8d283808 81681db4 nt!MmAccessFault+0xc1d
8d283808 816dd033 nt!KiTrap0E+0xdc

8d28389c 8168ed58 nt!memcpy+0×33
8d283954 816712bf nt!MmCopyToCachedPage+0×1193
8d2839ec 81663053 nt!CcMapAndCopy+0×210
8d283a74 8c688218 nt!CcFastCopyWrite+0×283
8d283b98 8c40badc Ntfs!NtfsCopyWriteA+0×23e
8d283bcc 8c40bcab fltmgr!FltpPerformFastIoCall+0×22e
8d283bf8 8c41dc30 fltmgr!FltpPassThroughFastIo+0×7d
8d283c3c 818471cd fltmgr!FltpFastIoWrite+0×146
8d283d38 8167ec7a nt!NtWriteFile+0×34c
8d283d38 77115e74 nt!KiFastCallEntry+0×12a
01cfee80 00000000 0×77115e74

The thread was servicing a page fault. Notice the gap between KeUpdateSystemTime and HvlpInitiateHypercall. This is normal and consistent code if we look closer:

0: kd> .asm no_code_bytes
Assembly options: no_code_bytes

0: kd> uf HvlpInitiateHypercall
nt!HvlpInitiateHypercall:
816f4680 push    edi
816f4681 push    esi
816f4682 push    ebx
816f4683 mov     eax,dword ptr [esp+10h]
816f4687 mov     edx,dword ptr [esp+14h]
816f468b mov     ecx,dword ptr [esp+18h]
816f468f mov     ebx,dword ptr [esp+1Ch]
816f4693 mov     esi,dword ptr [esp+20h]
816f4697 mov     edi,dword ptr [esp+24h]
816f469b call    dword ptr [nt!HvlpHypercallCodeVa (8176bb8c)]
816f46a1 pop     ebx
816f46a2 pop     esi
816f46a3 pop     edi
816f46a4 ret     18h

0: kd> dp 8176bb8c l1
8176bb8c  84617000

0: kd> uf 84617000
84617000 or      eax,80000000h
84617005 vmcall
84617008ret

We have the address of RET instruction (84617008) on the stack trace:

0: kd> kv
ChildEBP RetAddr  Args to Child             
8d283694 816df8a5 00000101 00000060 00000000 nt!KeBugCheckEx+0x1e
8d2836c8 816e163d 84e1521b 000000d1 8d283784 nt!KeUpdateRunTime+0xd5
8d2836c8 8461700884e1521b 000000d1 8d283784 nt!KeUpdateSystemTime+0xed (FPO: [0,2] TrapFrame @ 8d2836d8)
WARNING: Frame IP not in any known module. Following frames may be wrong.
8d283748 816f46a1 84e6c900 22728000 8172e28c 0×84617008
8d283758 816fa6aa 00010008 00000000 22728000 nt!HvlpInitiateHypercall+0×21 (FPO: [6,3,0])
8d283784 8166aca5 22728000 00000000 00000000 nt!HvlNotifyLongSpinWait+0×2b
[…]

The second processor is busy too:

0: kd> !running

System Processors 3 (affinity mask)
  Idle Processors 0

Prcbs  Current   Next   
  0    8172c920  84e6c900            ................
  1    805d8120  85138030  85a50d78  …………….

0: kd> !thread 85138030
THREAD 85138030  Cid 0564.11c8  Teb: 7ff9f000 Win32Thread: 00000000 RUNNING on processor 1
IRP List:
    85ab5d00: (0006,01fc) Flags: 00000884  Mdl: 00000000
    85445ab8: (0006,0094) Flags: 00060000  Mdl: 00000000
Not impersonating
DeviceMap                 98a7d558
Owning Process            84f0d938       Image:         Application.exe
Attached Process          N/A            Image:         N/A
Wait Start TickCount      695643         Ticks: 224 (0:00:00:03.500)
Context Switch Count      20            
UserTime                  00:00:00.000
KernelTime                00:00:00.000
Win32 Start Address 0x705e2679
Stack Init a1d13000 Current a1d10a70 Base a1d13000 Limit a1d10000 Call 0
Priority 9 BasePriority 8 PriorityDecrement 0 IoPriority 2 PagePriority 5
ChildEBP RetAddr  Args to Child             
00000000 00000000 00000000 00000000 00000000 0×0

Because we have (Limit, Current, Base) triple for our thread we check its execution residue on kernel raw stack. We find traces of a hypercall too:

0: kd> dds a1d10000 a1d13000
[...]
a1d1215c  816fa6da nt!HvlEndSystemInterrupt+0x20
a1d12160  40000070
a1d12164  00000000
a1d12168  81608838 hal!HalEndSystemInterrupt+0x7a
a1d1216c  805d8000
a1d12170  a1d12180
a1d12174  81618cc9 hal!HalpIpiHandler+0x189
a1d12178  84f4841b
a1d1217c  000000e1
a1d12180  a1d1222c
a1d12184  84617008
a1d12188  badb0d00
a1d1218c  00000000
a1d12190  81665699 nt!RtlWalkFrameChain+0x58
a1d12194  816659c4 nt!RtlWalkFrameChain+0x377
a1d12198  11c4649e
a1d1219c  00000002
a1d121a0  00000003
a1d121a4  85478444
a1d121a8  0000001f
a1d121ac  00000000
a1d121b0  00000000
a1d121b4  816f46a1 nt!HvlpInitiateHypercall+0×21
a1d121b8  8172c800 nt!KiInitialPCR
a1d121bc  85138030
a1d121c0  85a72ac0
a1d121c4  8171b437 nt!HvlSwitchVirtualAddressSpace+0×28
a1d121c8  00010001
a1d121cc  00000000
a1d121d0  85a72ac0
a1d121d4  85138030
a1d121d8  8172c802 nt!KiInitialPCR+0×2
a1d121dc  00000000
a1d121e0  85138030
a1d121e4  816fa6da nt!HvlEndSystemInterrupt+0×20
a1d121e8  40000070
a1d121ec  00000000
a1d121f0  81608838 hal!HalEndSystemInterrupt+0×7a
a1d121f4  816f46a1 nt!HvlpInitiateHypercall+0×21
a1d121f8  805d8000
a1d121fc  85138030
a1d12200  805dc1e0
a1d12204  8171b437 nt!HvlSwitchVirtualAddressSpace+0×28
a1d12208  00010001
a1d1220c  00000000
a1d12210  00000000
a1d12214  81608468 hal!HalpDispatchSoftwareInterrupt+0×5e
a1d12218  00000000
a1d1221c  00000000
a1d12220  00000206
a1d12224  a1d12240
a1d12228  81608668 hal!HalpCheckForSoftwareInterrupt+0×64
a1d1222c  00000002
a1d12230  00000000
a1d12234  816086a8 hal!KfLowerIrql
a1d12238  00000000
a1d1223c  00000002
a1d12240  a1d12250
a1d12244  8160870c hal!KfLowerIrql+0×64
a1d12248  00000000
a1d1224c  00000000
a1d12250  a1d12294
a1d12254  816e035a nt!KiSwapThread+0×477
a1d12258  85138030
a1d1225c  851380b8
a1d12260  805d8120
a1d12264  0014d1f8
[…]

Looking at raw stack further we can even see that it was processing a page fault too and manually reconstruct its stack trace:

[...]
a1d1074c  85aef510
a1d10750  a1d10768
a1d10754  81678976 nt!IofCallDriver+0×63
a1d10758  84c87d50
a1d1075c  85aef510
a1d10760  00000000
a1d10764  84c87d50
a1d10768  a1d10784
a1d1076c  8166d74e nt!IoPageRead+0×172
a1d10770  85138030
a1d10774  84a1352c
a1d10778  84a134f8
a1d1077c  84a13538
a1d10780  84c87d50
a1d10784  a1d10840
a1d10788  816abf07 nt!MiDispatchFault+0xd14
a1d1078c  00000043
a1d10790  85138030
a1d10794  84a13538
a1d10798  84a1350c
a1d1079c  84a1352c
a1d107a0  8174c800 nt!MmSystemCacheWs
a1d107a4  00000000
a1d107a8  85138030
a1d107ac  a5397bf8
a1d107b0  85b01c48
a1d107b4  00000000
a1d107b8  00000000
a1d107bc  a5397bf8
a1d107c0  84a1358c
a1d107c4  a1d10864
a1d107c8  00000000
a1d107cc  8463a590
a1d107d0  84a134f8
a1d107d4  c0518000
a1d107d8  00000000
a1d107dc  00000000
a1d107e0  00000028
a1d107e4  a1d107f4
a1d107e8  00000000
a1d107ec  00000038
a1d107f0  859f5640
a1d107f4  a4bfa390
a1d107f8  00000000
a1d107fc  00000000
a1d10800  00000000
a1d10804  a1d10938
a1d10808  818652bd nt!MmCreateSection+0×98f
a1d1080c  00000000
a1d10810  846652e8
a1d10814  00000000
a1d10818  00000000
a1d1081c  00000000
a1d10820  00000028
a1d10824  00000000
a1d10828  00000080
a1d1082c  0000000a
a1d10830  85ae1c98
a1d10834  85ae1c20
a1d10838  00000000
a1d1083c  00000000
a1d10840  a1d108b8
a1d10844  816cd325 nt!MmAccessFault+0×10c6
a1d10848  a3000000
a1d1084c  a5397bf8
a1d10850  00000000
a1d10854  8174c800 nt!MmSystemCacheWs
a1d10858  00000000
a1d1085c  00000000
a1d10860  a5397bf8
a1d10864  00000000
[…]

0: kd> k L=a1d10750 a1d10750 a1d10750
ChildEBP RetAddr 
WARNING: Frame IP not in any known module. Following frames may be wrong.
a1d10750 81678976 0xa1d10750
a1d10768 8166d74e nt!IofCallDriver+0x63
a1d10784 816abf07 nt!IoPageRead+0x172
a1d10840 816cd325 nt!MiDispatchFault+0xd14
a1d108b8 816f0957 nt!MmAccessFault+0x10c6
a1d10924 8181c952 nt!MmCheckCachedPageState+0x801
a1d109b0 8c60f850 nt!CcCopyRead+0x435
a1d109dc 8c613c52 Ntfs!NtfsCachedRead+0x13b
a1d10abc 8c612b6f Ntfs!NtfsCommonRead+0x105a
a1d10b2c 81678976 Ntfs!NtfsFsdRead+0x273
a1d10b44 8c40cba7 nt!IofCallDriver+0x63
a1d10b68 8c40d7c7 fltmgr!FltpLegacyProcessingAfterPreCallbacksCompleted+0x251
a1d10ba0 8c40dbe7 fltmgr!FltPerformSynchronousIo+0xb9
a1d10c10 9ca680e5 fltmgr!FltReadFile+0x2ed
[...]

A bit of reverse engineering offtopic here: I think that the absence of a trap means that cache processing NTFS code reuses page fault handling code.

- Dmitry Vostokov @ DumpAnalysis.org -

2 Responses to “Crash Dump Analysis Patterns (Part 87)”

  1. Crash Dump Analysis » Blog Archive » WOW64 process, NULL data pointer, stack overflow, main thread, incorrect stack trace, nested exceptions, hidden exception, manual dump, multiple exceptions and virtualized system: pattern cooperation Says:

    […] at drivers shows that the system was virtualized under a […]

  2. Crash Dump Analysis » Blog Archive » IRP distribution anomaly, inconsistent dump, execution residue, hardware activity, coincidental symbolic information, not my version, virtualized system: pattern cooperation Says:

    […] We find the system virtualized: […]

Leave a Reply