WOW64 process, NULL data pointer, stack overflow, main thread, incorrect stack trace, nested exceptions, hidden exception, manual dump, multiple exceptions and virtualized system: pattern cooperation
Sunday, July 12th, 2009
One 32-bit WOW64 process was crashing when accessing a direct NULL data pointer with the following stack trace:
0:000> r
rax=00000000750e9b40 rbx=000000000017f29c rcx=0000000000041710
rdx=0000000000041240 rsi=0000000000041710 rdi=0000000000041240
rip=00000000750e9b59 rsp=00000000000411f0 rbp=000000000017f2a8
r8=0000000000000001 r9=00000000750ffd40 r10=0000000000000000
r11=000000000017f29c r12=000000007efdb000 r13=000000000007fd20
r14=000000000007ee70 r15=00000000751e3380
iopl=0 nv up ei pl nz na po nc
cs=0033 ss=002b ds=002b es=002b fs=0053 gs=002b efl=00010206
wow64!Wow64PrepareForException+0x19:
00000000`750e9b59 654c8b1c2530000000 mov r11,qword ptr gs:[30h] gs:00000000`00000030=????????????????
0:000> kc
Call Site
wow64!Wow64PrepareForException
ntdll!KiUserExceptionDispatcher
wow64!Wow64PrepareForException
ntdll!KiUserExceptionDispatcher
wow64!Wow64PrepareForException
ntdll!KiUserExceptionDispatcher
wow64!Wow64PrepareForException
ntdll!KiUserExceptionDispatcher
wow64!Wow64PrepareForException
ntdll!KiUserExceptionDispatcher
wow64!Wow64PrepareForException
ntdll!KiUserExceptionDispatcher
wow64!Wow64PrepareForException
ntdll!KiUserExceptionDispatcher
wow64!Wow64PrepareForException
ntdll!KiUserExceptionDispatcher
wow64!Wow64PrepareForException
ntdll!KiUserExceptionDispatcher
wow64!Wow64PrepareForException
ntdll!KiUserExceptionDispatcher
This looks like a stack overflow. A stack overflow usually manifests as a fault on a PUSH instruction or as a data access violation when ESP/RSP < TEB.StackLimit. Here, however, RSP is still inside the stack range:
0:000> !teb
[...]
Wow64 TEB at 000000007efdb000
ExceptionList: 000000007efdd000
StackBase: 000000000007fd20
StackLimit: 0000000000041000
SubSystemTib: 0000000000000000
FiberData: 0000000000001e00
ArbitraryUserPointer: 0000000000000000
Self: 000000007efdb000
EnvironmentPointer: 0000000000000000
ClientId: 0000000000001684 . 000000000000168c
RpcHandle: 0000000000000000
Tls Storage: 0000000000000000
PEB Address: 000000007efdf000
LastErrorValue: 0
LastStatusValue: c0000034
Count Owned Locks: 0
HardErrorMode: 0
0:000> dq rsp-10 rsp+10
00000000`000411e0 00000000`00000000 00000000`00000000
00000000`000411f0 00000000`00000000 00000000`00000000
00000000`00041200 00000000`00000000
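The range check described above can be sketched in a few lines of Python. This is only an illustration: the three constants are copied from the r and !teb output, and the helper name in_stack_range is hypothetical, not a debugger command.

```python
# Values copied from the `r` and `!teb` output above.
RSP = 0x00000000000411F0
STACK_BASE = 0x000000000007FD20
STACK_LIMIT = 0x0000000000041000

def in_stack_range(sp: int, base: int, limit: int) -> bool:
    """Return True when sp lies in [limit, base), i.e. inside the stack range."""
    return limit <= sp < base

# A classic overflow would leave sp below the limit; here it is still inside.
inside = in_stack_range(RSP, STACK_BASE, STACK_LIMIT)
print("RSP inside stack range:", inside)
```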
But RSP is still very close to the stack limit, and the next function call would surely overflow because of the large frame delta:
0:000> kcf
Memory Call Site
wow64!Wow64PrepareForException
50 ntdll!KiUserExceptionDispatcher
5a0 wow64!Wow64PrepareForException
50 ntdll!KiUserExceptionDispatcher
5a0 wow64!Wow64PrepareForException
50 ntdll!KiUserExceptionDispatcher
5a0 wow64!Wow64PrepareForException
50 ntdll!KiUserExceptionDispatcher
5a0 wow64!Wow64PrepareForException
50 ntdll!KiUserExceptionDispatcher
5a0 wow64!Wow64PrepareForException
50 ntdll!KiUserExceptionDispatcher
5a0 wow64!Wow64PrepareForException
50 ntdll!KiUserExceptionDispatcher
5a0 wow64!Wow64PrepareForException
50 ntdll!KiUserExceptionDispatcher
5a0 wow64!Wow64PrepareForException
50 ntdll!KiUserExceptionDispatcher
5a0 wow64!Wow64PrepareForException
50 ntdll!KiUserExceptionDispatcher
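The arithmetic behind this reasoning can be written out as a small Python sketch. The frame deltas are the values from the Memory column of kcf, and the register and TEB values are copied from the earlier output. The cycle count is only a rough upper estimate, because it assumes that the whole used stack consists of the repeating pair of frames, which is not true for the few frames near the stack base.

```python
# Values copied from the debugger output above.
RSP = 0x411F0
STACK_LIMIT = 0x41000
STACK_BASE = 0x7FD20
DISPATCHER_DELTA = 0x50   # kcf Memory column for ntdll!KiUserExceptionDispatcher
PREPARE_DELTA = 0x5A0     # kcf Memory column for wow64!Wow64PrepareForException

# Bytes left between the current stack pointer and the stack limit.
headroom = RSP - STACK_LIMIT

# Would one more large frame still fit above the limit?
next_frame_fits = PREPARE_DELTA <= headroom

# Rough upper estimate of how many dispatcher/prepare pairs are on the stack.
cycle = DISPATCHER_DELTA + PREPARE_DELTA
approx_cycles = (STACK_BASE - RSP) // cycle

print(hex(headroom), next_frame_fits, approx_cycles)
```

With these numbers the remaining headroom is smaller than a single 5a0 frame delta, which is why one more nested exception cannot fit.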
So we do consider this a stack overflow. We specify a larger number of frames to reach the stack base:
0:000> kc 1000
Call Site
wow64!Wow64PrepareForException
ntdll!KiUserExceptionDispatcher
wow64!Wow64PrepareForException
ntdll!KiUserExceptionDispatcher
wow64!Wow64PrepareForException
ntdll!KiUserExceptionDispatcher
[...]
ntdll!KiUserExceptionDispatcher
wow64!Wow64PrepareForException
ntdll!KiUserExceptionDispatcher
wow64!Wow64PrepareForException
ntdll!KiUserExceptionDispatcher
wow64!Wow64PrepareForException
ntdll!KiUserExceptionDispatcher
wow64!Wow64SystemServiceEx
wow64cpu!ServiceNoTurbo
wow64!RunCpuSimulation
wow64!Wow64LdrpInitialize
ntdll!LdrpInitializeProcess
ntdll! ?? ::FNODOBFM::`string'
ntdll!LdrInitializeThunk
We consider this a nested unmanaged exception and try to see where it originated. First we double-check that there are no exceptions in 32-bit code, and indeed the threads look clean:
0:000:x86> ~*kc
. 0 Id: 1684.168c Suspend: 0 Teb: 7efdb000 Unfrozen
user32!NtUserGetProp
user32!GetPropA
WARNING: Stack unwind information not available. Following frames may be wrong.
Application!foo
user32!InternalCallWinProc
user32!UserCallWinProcCheckWow
user32!SendMessageWorker
user32!SendMessageA
Application
1 Id: 1684.1688 Suspend: 1 Teb: 7efd8000 Unfrozen
ntdll_77320000!ZwRemoveIoCompletion
kernel32!GetQueuedCompletionStatus
rpcrt4!COMMON_ProcessCalls
rpcrt4!LOADABLE_TRANSPORT::ProcessIOEvents
rpcrt4!ProcessIOEventsWrapper
rpcrt4!BaseCachedThreadRoutine
rpcrt4!ThreadStartRoutine
kernel32!BaseThreadInitThunk
ntdll_77320000!__RtlUserThreadStart
ntdll_77320000!_RtlUserThreadStart
2 Id: 1684.1678 Suspend: 1 Teb: 7efd5000 Unfrozen
ntdll_77320000!NtDelayExecution
kernel32!SleepEx
kernel32!Sleep
ole32!CROIDTable::WorkerThreadLoop
ole32!CRpcThread::WorkerLoop
ole32!CRpcThreadCache::RpcWorkerThreadEntry
kernel32!BaseThreadInitThunk
ntdll_77320000!__RtlUserThreadStart
ntdll_77320000!_RtlUserThreadStart
3 Id: 1684.164c Suspend: 1 Teb: 7efad000 Unfrozen
ntdll_77320000!NtWaitForMultipleObjects
kernel32!WaitForMultipleObjectsEx
kernel32!WaitForMultipleObjects
msiltcfg!WorkerThread
kernel32!BaseThreadInitThunk
ntdll_77320000!__RtlUserThreadStart
ntdll_77320000!_RtlUserThreadStart
4 Id: 1684.166c Suspend: 1 Teb: 7efaa000 Unfrozen
ntdll_77320000!ZwWaitForSingleObject
kernel32!WaitForSingleObjectEx
kernel32!WaitForSingleObject
winspool!MonitorRPCServerProcess
kernel32!BaseThreadInitThunk
ntdll_77320000!__RtlUserThreadStart
ntdll_77320000!_RtlUserThreadStart
The first thread looks a bit suspicious: we have never seen NtUserGetProp on stack traces, an exception in the main GUI thread is possible, and the stack trace itself looks incorrect because it suddenly sends a Windows message without any kind of message loop:
0:000:x86> k
ChildEBP RetAddr
0017f294 76e45be7 user32!NtUserGetProp+0x15
0017f2a8 025d4ba0 user32!GetPropA+0x3d
WARNING: Stack unwind information not available. Following frames may be wrong.
0017f2e0 76e38807 Application!foo+0x230
0017f30c 76e38962 user32!InternalCallWinProc+0x23
0017f384 76e3c4b6 user32!UserCallWinProcCheckWow+0x109
0017f3c8 76e3eae2 user32!SendMessageWorker+0x55b
0017f3ec 02a755c6 user32!SendMessageA+0x7f
00000000 00000000 Application+0x255c6
The EIP address in NtUserGetProp looks valid and points to the code after the return from a system call:
0:000:x86> u eip
user32!NtUserGetProp+0x15:
76e3b64f c20800 ret 8
76e3b652 90 nop
76e3b653 90 nop
76e3b654 90 nop
76e3b655 90 nop
76e3b656 90 nop
user32!GetPropW:
76e3b657 8bff mov edi,edi
76e3b659 55 push ebp
0:000:x86> ub eip
user32!DefWindowProcW+0x96:
76e3b636 90 nop
76e3b637 90 nop
76e3b638 90 nop
76e3b639 90 nop
user32!NtUserGetProp:
76e3b63a b80e100000 mov eax,100Eh
76e3b63f 8d542404 lea edx,[esp+4]
76e3b643 b900000000 mov ecx,0
76e3b648 64ff15c0000000 call dword ptr fs:[0C0h]
Its 32-bit raw stack looks like that of a normal main GUI thread of a VB6 application:
0:000:x86> !wow64exts.info
[...]
32 bit, StackBase : 0x180000
StackLimit : 0x17c000
Deallocation: 0x80000
64 bit, StackBase : 0x7fd20
StackLimit : 0x41000
Deallocation: 0x40000
[...]
0:000:x86> dds 0x17c000 0x180000
0017c000 00000000
0017c004 00000000
0017c008 00000000
0017c00c 00000000
0017c010 00000000
[...]
0017fdc8 00000000
0017fdcc 0017fe30
0017fdd0 76e38aad user32!DispatchMessageWorker+0x380
0017fdd4 00000000
0017fdd8 726ff5d1 msvbvm60!IID_IVbaHost+0x30ff1
0017fddc 0002088c
0017fde0 00000113
0017fde4 0002088c
0017fde8 00000000
0017fdec 00b319a8
0017fdf0 00000001
0017fdf4 9f7e168b
0017fdf8 76e395c0 user32!PeekMessageA
0017fdfc 02c0209c
0017fe00 00000000
0017fe04 76e395c0 user32!PeekMessageA
0017fe08 02c0216c
0017fe0c 0002088c
0017fe10 00000000
0017fe14 726aa76a msvbvm60!_vbaStrToAnsi+0x3ab
0017fe18 0017fdf4
0017fe1c ffffffff
0017fe20 0017ff70
0017fe24 76e985e7 user32!_except_handler4
0017fe28 e98a626b
0017fe2c fffffffe
0017fe30 0017fe40
0017fe34 76e39100 user32!DispatchMessageA+0xf
[...]
0017ff58 ffffffff
0017ff5c ffffffff
0017ff60 ffffffff
0017ff64 00000000
0017ff68 0017ff10
0017ff6c 00000000
0017ff70 0017ffc4
0017ff74 7278bafd msvbvm60!CreateIExprSrvObj+0x9f2
0017ff78 726b97d0 msvbvm60!BASIC_CLASS_Release+0xadbd
0017ff7c 00000000
0017ff80 0017ff94
0017ff84 00401396
0017ff88 00401514
0017ff8c 768de3f3 kernel32!BaseThreadInitThunk+0xe
0017ff90 7efde000
0017ff94 0017ffd4
0017ff98 7739cfed ntdll_77320000!__RtlUserThreadStart+0x23
0017ff9c 7efde000
0017ffa0 63553bf6
0017ffa4 00000000
0017ffa8 00000000
0017ffac 7efde000
0017ffb0 00000000
0017ffb4 00000000
0017ffb8 00000000
0017ffbc 0017ffa0
0017ffc0 00000000
0017ffc4 0017ffe4
0017ffc8 773d2926 ntdll_77320000!_except_handler4
0017ffcc 14761e1a
0017ffd0 00000000
0017ffd4 0017ffec
0017ffd8 7739d1ff ntdll_77320000!_RtlUserThreadStart+0x1b
0017ffdc 0040138c
0017ffe0 7efde000
0017ffe4 ffffffff
0017ffe8 773bd377 ntdll_77320000!FinalExceptionHandler
0017ffec 00000000
0017fff0 00000000
0017fff4 0040138c
0017fff8 7efde000
0017fffc 00000000
00180000 78746341
There are no signs of exception codes or exception processing residue there, so we come back to our 64-bit layer:
0:000:x86> .effmach amd64
Effective machine: x64 (AMD64)
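The manual scan of the 32-bit raw stack for exception residue can be approximated with a short Python sketch over saved dds output. It is only an illustration: the list of exception codes and dispatch symbols is an assumed, incomplete selection, the function name find_exception_residue is hypothetical, and the sample lines are a few rows taken from the dump above.

```python
import re

# Assumed, non-exhaustive set of values worth flagging on a raw stack.
EXCEPTION_CODES = {
    0xC0000005: "access violation",
    0xC00000FD: "stack overflow",
    0x80000003: "breakpoint",
    0xE06D7363: "C++ exception",
}
# Assumed symbols that indicate exception dispatching rather than handler registration.
DISPATCH_SYMBOLS = ("KiUserExceptionDispatcher", "RtlDispatchException")

# One dds row: address, 32-bit value, optional symbol.
ROW = re.compile(r"^([0-9a-f]{8})\s+([0-9a-f]{8})(?:\s+(\S.*))?$")

def find_exception_residue(dds_output: str):
    """Return (address, description) pairs for rows that look like exception residue."""
    hits = []
    for raw in dds_output.splitlines():
        match = ROW.match(raw.strip())
        if not match:
            continue  # skips elisions such as [...] and blank lines
        address = int(match.group(1), 16)
        value = int(match.group(2), 16)
        symbol = match.group(3) or ""
        if value in EXCEPTION_CODES:
            hits.append((address, EXCEPTION_CODES[value]))
        elif any(name in symbol for name in DISPATCH_SYMBOLS):
            hits.append((address, symbol))
    return hits

sample = """\
0017fe20 0017ff70
0017fe24 76e985e7 user32!_except_handler4
0017fe28 e98a626b
0017ffe8 773bd377 ntdll_77320000!FinalExceptionHandler
"""
print(find_exception_residue(sample))
```

Handler registrations such as _except_handler4 are deliberately not flagged, because they are present on any normal thread stack.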
We dump the 64-bit stack to see the moment when the first exception happened:
0:000> dqs 0x41000 0x7fd20
[...]
00000000`0007df30 00000000`00000000
00000000`0007df38 00000000`00000000
00000000`0007df40 00000000`00000000
00000000`0007df48 00000000`00000000
00000000`0007df50 00000000`00000000
00000000`0007df58 00000000`00000000
00000000`0007df60 00000000`00000000
00000000`0007df68 00000000`00000000
00000000`0007df70 00000000`00000000
00000000`0007df78 00000000`00000000
00000000`0007df80 00000000`750e9b59 wow64!Wow64PrepareForException+0x19
00000000`0007df88 00000000`00000000
00000000`0007df90 00000000`00000000
00000000`0007df98 00000000`0007dfb0
00000000`0007dfa0 00000000`00000001
00000000`0007dfa8 00000000`0007e510
00000000`0007dfb0 00000000`611b422e
00000000`0007dfb8 fffffa60`04eec978
00000000`0007dfc0 fffffa60`04ee6520
00000000`0007dfc8 fffffa60`04ee6520
00000000`0007dfd0 fffffa60`04eec978
00000000`0007dfd8 00000000`00000030
00000000`0007dfe0 00000000`0017f29c
00000000`0007dfe8 00000000`0000c04c
00000000`0007dff0 00000000`00000000
00000000`0007dff8 00000000`771c59e6 ntdll!KiUserExceptionDispatcher+0x1c
00000000`0007e000 01c9f89c`fe787c8d
00000000`0007e008 fffffa60`03732e4e
00000000`0007e010 fffffa80`0497c9a0
00000000`0007e018 00000000`00000000
00000000`0007e020 00000000`00000001
00000000`0007e028 01c9f89c`fe787c00
00000000`0007e030 00001fa0`0010001f
00000000`0007e038 0053002b`002b0033
00000000`0007e040 00010283`002b002b
00000000`0007e048 00000000`00000000
00000000`0007e050 00000000`00000000
00000000`0007e058 00000000`00000000
00000000`0007e060 00000000`00000000
00000000`0007e068 00000000`00000000
00000000`0007e070 00000000`00000000
00000000`0007e078 00000000`00000006
00000000`0007e080 00000000`0000100e
00000000`0007e088 00000000`0000000e
00000000`0007e090 00000000`0017f29c
00000000`0007e098 00000000`0007e5a0
00000000`0007e0a0 00000000`0017f2a8
00000000`0007e0a8 00000000`0000c04c
00000000`0007e0b0 00000000`00000000
00000000`0007e0b8 00000000`00000001
00000000`0007e0c0 00000000`750ffd40 wow64!ServiceTables
00000000`0007e0c8 00000000`00000000
00000000`0007e0d0 00000000`0017f29c
00000000`0007e0d8 00000000`7efdb000
00000000`0007e0e0 00000000`0007fd20
00000000`0007e0e8 00000000`0007ee70
00000000`0007e0f0 00000000`751e3380 wow64cpu!CpupSaveLegacyFloatingPointState+0x60
00000000`0007e0f8 00000000`750ca923 wow64!Wow64SystemServiceEx+0x57
00000000`0007e100 015c0000`4000137f
00000000`0007e108 00000023`02add97f
00000000`0007e110 0000002b`0017f408
00000000`0007e118 0000ffff`00001f80
00000000`0007e120 00000000`00000000
00000000`0007e128 00000000`00000000
All the addresses we would normally try with the .exr and .cxr commands, as in the hidden exception pattern, lie beyond user space. We therefore conclude that such structures, or the pointers to them, somehow became corrupt and that accessing them triggered the same exception processing code over and over again, finally leading to the stack overflow. This was an x64 system and taking a complete memory dump was impractical, so a kernel memory dump was generated at the moment the WER error message appeared (a pair of dumps from a fiber bundle). It shows the final stages of exception processing, which agree with the user dump we analyzed.
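The statement that the candidate addresses lie beyond user space can be checked with a small Python sketch. It relies only on the x64 canonical address split (lower half for user mode, upper half for kernel mode) and deliberately ignores the tighter user-mode limit of any particular Windows version. The function name classify_address is hypothetical, and the sample values are taken from the 64-bit raw stack shown above.

```python
LOWER_HALF_MAX = 0x00007FFFFFFFFFFF   # top of the canonical lower (user) half
UPPER_HALF_MIN = 0xFFFF800000000000   # bottom of the canonical upper (kernel) half

def classify_address(address: int) -> str:
    """Classify a 64-bit virtual address by canonical half."""
    if address <= LOWER_HALF_MAX:
        return "user"
    if address >= UPPER_HALF_MIN:
        return "kernel"
    return "non-canonical"

# Candidate pointers found near the first KiUserExceptionDispatcher frame.
candidates = [0xFFFFFA6004EEC978, 0xFFFFFA6004EE6520, 0x000000000007E510]
for address in candidates:
    print(f"{address:016x} {classify_address(address)}")
```

A user dump cannot resolve the addresses classified as kernel, so .exr and .cxr have nothing to read there.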
Initially we thought that the system experienced a bugcheck after the application crash:
1: kd> !analyze -v
DRIVER_IRQL_NOT_LESS_OR_EQUAL (d1)
An attempt was made to access a pageable (or completely invalid) address at an interrupt request level (IRQL) that is too high. This is usually caused by drivers using improper addresses. If kernel debugger is available get stack backtrace.
Arguments:
Arg1: fffff880058b1010, memory referenced
Arg2: 0000000000000002, IRQL
Arg3: 0000000000000000, value 0 = read operation, 1 = write operation
Arg4: fffffa60053da17a, address which referenced memory
But then we noticed that the bugcheck was forced intentionally using the Sysinternals NotMyFault tool:
1: kd> kc
Call Site
nt!KeBugCheckEx
nt!KiBugCheckDispatch
nt!KiPageFault
myfault
myfault
nt!IopXxxControlFile
nt!NtDeviceIoControlFile
nt!KiSystemServiceCopyEnd
0x0
In the kernel dump we see our application and the next process is WER:
1: kd> !vm
[...]
PROCESS fffffa8001886710
SessionId: 1 Cid: 13e4 Peb: 7efdf000 ParentCid: 0ab0
DirBase: 329c8000 ObjectTable: fffff880059424d0 HandleCount: 190.
Image: Application.exe
PROCESS fffffa800188a040
SessionId: 1 Cid: 0b44 Peb: 7efdf000 ParentCid: 0934
DirBase: 2b144000 ObjectTable: fffff88004c68a30 HandleCount: 166.
Image: WerFault.exe
We see a page fault on a kernel stack trace of the main process thread:
1: kd> !thread fffffa8001cc7bb0 1f
THREAD fffffa8001cc7bb0 Cid 13e4.01dc Teb: 000000007efdb000 Win32Thread: fffff900c210ed50 WAIT: (WrLpcReply) UserMode Non-Alertable
fffffa8001cc7f40 Semaphore Limit 0x1
Waiting for reply to ALPC Message fffff880075c3970 : queued at port fffffa8001dae290 : owned by process fffffa8001d8e040
Not impersonating
DeviceMap fffff88004c30a00
Owning Process fffffa8001886710 Image: Application.exe
Attached Process N/A Image: N/A
Wait Start TickCount 83669865 Ticks: 320 (0:00:00:04.992)
Context Switch Count 1829 LargeStack
UserTime 00:00:00.187
KernelTime 00:00:00.280
Win32 Start Address 0x000000000040138c
Stack Init fffffa6004cc3db0 Current fffffa6004cc2ae0
Base fffffa6004cc4000 Limit fffffa6004cb9000 Call 0
Priority 10 BasePriority 8 PriorityDecrement 2 IoPriority 2 PagePriority 5
Child-SP RetAddr Call Site
fffffa60`04cc2b20 fffff800`0165cf8a nt!KiSwapContext+0x7f
fffffa60`04cc2c60 fffff800`0165e38a nt!KiSwapThread+0x2fa
fffffa60`04cc2cd0 fffff800`0168ecab nt!KeWaitForSingleObject+0x2da
fffffa60`04cc2d60 fffff800`018e1f64 nt!AlpcpSignalAndWait+0x7b
fffffa60`04cc2da0 fffff800`018e7216 nt!AlpcpReceiveSynchronousReply+0x44
fffffa60`04cc2e00 fffff800`018dc27f nt!AlpcpProcessSynchronousRequest+0x24f
fffffa60`04cc2f20 fffff800`01656e33 nt!NtAlpcSendWaitReceivePort+0x19f
fffffa60`04cc2fd0 fffff800`01657340 nt!KiSystemServiceCopyEnd+0x13 (TrapFrame @ fffffa60`04cc3040)
fffffa60`04cc31d8 fffff800`019b6e30 nt!KiServiceLinkage
fffffa60`04cc31e0 fffff800`01898d6d nt!DbgkpSendErrorMessage+0x230
fffffa60`04cc3300 fffff800`016bf487 nt!DbgkForwardException+0x12d
fffffa60`04cc3440 fffff800`016571e9 nt! ?? ::FNODOBFM::`string'+0x29550
fffffa60`04cc3a40 fffff800`01655fe5 nt!KiExceptionDispatch+0xa9
fffffa60`04cc3c20 00000000`75319b59 nt!KiPageFault+0x1e5 (TrapFrame @ fffffa60`04cc3c20)
00000000`000411f0 00000000`00000000 0x75319b59
Analysis of the raw stack confirms the final access violation address:
1: kd> dqs fffffa6004cb9000 fffffa6004cc4000
[...]
fffffa60`04cc3a00 00000000`00000000
fffffa60`04cc3a08 00000000`00000000
fffffa60`04cc3a10 00000000`00000000
fffffa60`04cc3a18 00000000`00000000
fffffa60`04cc3a20 00000000`0007fd20
fffffa60`04cc3a28 00000000`7efdb000
fffffa60`04cc3a30 00000000`00041240
fffffa60`04cc3a38 fffff800`016571e9 nt!KiExceptionDispatch+0xa9
fffffa60`04cc3a40 fffffa60`04cc3b78
fffffa60`04cc3a48 00000000`0017f290
fffffa60`04cc3a50 fffffa60`04cc3c20
fffffa60`04cc3a58 00000000`00041710
fffffa60`04cc3a60 00000000`00000001
fffffa60`04cc3a68 00000000`80000001
fffffa60`04cc3a70 0000002d`0000002d
fffffa60`04cc3a78 0000002d`0000002d
fffffa60`04cc3a80 0000002d`0000002d
fffffa60`04cc3a88 0000002d`0000002d
fffffa60`04cc3a90 00000000`00000000
fffffa60`04cc3a98 00000000`00000000
fffffa60`04cc3aa0 00000000`00000000
fffffa60`04cc3aa8 00000000`00000000
fffffa60`04cc3ab0 00000000`00000000
fffffa60`04cc3ab8 00000000`00000000
fffffa60`04cc3ac0 00000000`00000000
fffffa60`04cc3ac8 00000000`00000000
[...]
1: kd> .exr fffffa60`04cc3b78
ExceptionAddress: 0000000075319b59
ExceptionCode: c0000005 (Access violation)
ExceptionFlags: 00000000
NumberParameters: 2
Parameter[0]: 0000000000000000
Parameter[1]: 0000000000000030
Attempt to read from address 0000000000000030
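The two parameters of the access violation record can be decoded as in the Python sketch below: for an access violation the first parameter is the access type and the second is the faulting address. The sketch also shows how the address 30 follows from gs:[30h] when the GS base is zero, as the gs:00000000`00000030 annotation in the user dump register output indicates. Treating the 64-bit TEB at 7efdb000 as the GS base the instruction should have used is an assumption made for comparison, and the helper names are hypothetical.

```python
# Access type values for access violation records (first parameter).
ACCESS_KINDS = {0: "read", 1: "write", 8: "execute"}

def describe_access_violation(parameters):
    """Decode (access type, address) from an access violation record."""
    kind, address = parameters
    return f"{ACCESS_KINDS.get(kind, 'unknown')} at {address:#x}"

def effective_address(segment_base: int, displacement: int) -> int:
    """Effective address of a segment-relative access such as gs:[30h]."""
    return (segment_base + displacement) & 0xFFFFFFFFFFFFFFFF

# Parameter[0] and Parameter[1] from the .exr output above.
print(describe_access_violation((0x0, 0x30)))

# With a zero GS base, gs:[30h] resolves to the NULL page.
faulting = effective_address(0x0, 0x30)
# Assumption: the GS base should have pointed at the 64-bit TEB reported by !teb.
expected = effective_address(0x7EFDB000, 0x30)
print(hex(faulting), hex(expected))
```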
Looking at the loaded drivers shows that the system was running under a VM:
1: kd> lm
[...]
fffffa60`037bc000 fffffa60`037e0000 3rdpartyVM (deferred)
[...]
Therefore we suggested testing the application outside a VM to rule out the influence of virtualization.
- Dmitry Vostokov @ DumpAnalysis.org -

