Archive for May, 2007

Crash Dump Analysis Patterns (Part 15)

Thursday, May 24th, 2007

Sometimes when we look at the list of loaded modules in a process address space we see an instance of the pattern that I call Module Variety. It means, literally, that there are so many different loaded modules that we start to suspect that their coexistence created the problem. We can also call this pattern Component Variety or DLL Variety, but I prefer Module Variety because WinDbg refers to loaded executables, DLLs, drivers, ActiveX controls, etc. as modules.

Modules can be roughly classified into 4 broad categories:

  • Application modules - components that were developed specifically for this application, one of them is the main application module

  • 3rd-party modules - you can usually identify them by the company name shown in the output of the lmv WinDbg command

  • Common system modules - Windows dlls supplied by OS implementing native OS calls, Windows API and also C/C++ runtime functions, for example, ntdll.dll, kernel32.dll, user32.dll, gdi32.dll, advapi32.dll, msvcrt.dll, etc.

  • Specific system modules - optional Windows dlls supplied by Microsoft that are specific to the application functionality and implementation, like MFC dlls, .NET runtime or tapi32.dll

Although lmv is verbose, for a quick check of component timestamps you can use the lmt WinDbg command. Here is an example of great module variety:

Loading Dump File [application.dmp]
...
...
...
Windows Server 2003 Version 3790 (Service Pack 1)
...
...
...
0:001> lmt
start end module name
00400000 030ba000 app_main Mon Dec 04 21:22:42 2006
04120000 04193000 Dformd Mon Jan 31 02:27:58 2000
041a0000 04382000 sqllib2 Mon May 29 22:50:11 2006
04490000 044d3000 udNet Mon May 29 23:22:43 2006
04e30000 04f10000 abchook Wed Aug 01 20:47:17 2006
05e10000 05e15000 token_manager Fri Mar 12 11:54:17 1999
06030000 06044000 ODBCINT Thu Mar 24 22:59:58 2005
06150000 0618d000 sgl5NET Mon May 29 23:25:22 2006
06190000 0622f000 OPENGL32 Mon Nov 06 21:30:52 2006
06230000 06240000 pwrpc32 Thu Oct 22 16:22:40 1998
06240000 07411000 app_dll_1 Tue Aug 08 12:14:39 2006
07420000 07633000 app_dll_2 Mon Dec 04 22:11:59 2006
07640000 07652000 zlib Fri Aug 30 08:12:24 2002
07660000 07f23000 app_dll_3 Wed Oct 19 11:43:34 2005
0dec0000 0dedc000 app_dll_4 Mon Dec 04 22:11:36 2006
10000000 110be000 des Tue Jul 18 20:42:02 2006
129c0000 12f1b000 xpsp2res Fri Mar 25 00:26:47 2005
1b000000 1b170000 msjet40 Tue Jul 06 19:16:05 2004
1b2c0000 1b2cd000 msjter40 Thu May 09 19:09:53 2002
1b2d0000 1b2ea000 msjint40 Thu May 09 19:09:53 2002
1b570000 1b5c5000 msjetoledb40 Thu Nov 13 23:40:06 2003
1b5d0000 1b665000 mswstr10 Thu May 09 19:09:56 2002
1e000000 1e0f0000 python23 Fri Jan 30 13:03:24 2004
4b070000 4b0c1000 MSCTF Fri Mar 25 02:10:36 2005
4b610000 4b64d000 ODBC32 Fri Mar 25 02:09:33 2005
4b9e0000 4ba59000 OLEDB32 Fri Mar 25 02:09:56 2005
4c310000 4c31d000 OLEDB32R Fri Mar 25 02:09:57 2005
4c3b0000 4c3de000 MSCTFIME Fri Mar 25 02:10:37 2005
5f400000 5f4f2000 mfc42 Wed Oct 27 22:35:22 1999
62130000 6213d000 mfc42loc Wed Mar 26 03:35:58 2003
62460000 6246e000 msadrh15 Fri Mar 25 02:10:29 2005
63050000 63059000 lpk Fri Mar 25 02:09:21 2005
63270000 632c7000 hnetcfg Fri Mar 25 02:09:11 2005
65340000 653d2000 OLEAUT32 Wed Sep 01 00:15:11 1999
68000000 6802f000 rsaenh Fri Mar 25 00:30:55 2005
68a50000 68a70000 glu32 Fri Mar 25 02:09:03 2005
71990000 71998000 wshtcpip Wed Mar 26 03:34:24 2003
719d0000 71a11000 mswsock Fri Mar 25 02:12:06 2005
71a60000 71a6b000 wsock32 Wed Mar 26 03:34:24 2003
71a80000 71a91000 mpr Wed Mar 26 03:34:24 2003
71aa0000 71aa8000 ws2help Fri Mar 25 02:10:19 2005
71ab0000 71ac7000 ws2_32 Fri Mar 25 02:10:18 2005
71ad0000 71ae2000 tsappcmp Fri Mar 25 02:09:56 2005
71af0000 71b48000 netapi32 Fri Aug 11 11:00:07 2006
72ec0000 72ee7000 winspool Fri Mar 25 02:09:48 2005
73290000 73295000 riched32 Wed Mar 26 03:34:14 2003
73ee0000 73ee5000 icmp Wed Mar 26 03:34:09 2003
74920000 7493a000 msdart Fri Mar 25 02:10:48 2005
74b10000 74b80000 riched20 Fri Mar 25 02:09:36 2005
75220000 75281000 usp10 Fri Mar 25 02:09:51 2005
75810000 758d0000 userenv Fri Mar 25 02:09:50 2005
75d00000 75d27000 apphelp Fri Mar 25 02:09:21 2005
76120000 7613d000 imm32 Fri Mar 25 02:09:37 2005
76140000 76188000 comdlg32 Fri Mar 25 02:10:11 2005
76810000 76949000 comsvcs Fri Aug 26 23:19:45 2005
76a60000 76a6b000 psapi Fri Mar 25 02:09:57 2005
76c00000 76c1a000 iphlpapi Fri May 19 04:21:07 2006
76de0000 76e0f000 dnsapi Wed Jul 12 20:02:12 2006
76e20000 76e4e000 wldap32 Fri Mar 25 02:09:59 2005
76e60000 76e73000 secur32 Fri Mar 25 02:10:01 2005
76e80000 76e87000 winrnr Fri Mar 25 02:09:45 2005
76e90000 76e98000 rasadhlp Wed Jul 12 20:02:15 2006
76f20000 77087000 comres Wed Mar 26 03:33:48 2003
77330000 773c7000 comctl32 Mon Aug 28 09:26:02 2006
77470000 775a4000 ole32 Thu Jul 21 04:25:12 2005
77640000 776c3000 clbcatq Thu Jul 21 04:25:13 2005
77b30000 77b38000 version Fri Mar 25 02:09:50 2005
77b40000 77b9a000 msvcrt Fri Mar 25 02:11:59 2005
77ba0000 77be8000 gdi32 Tue Mar 07 03:55:05 2006
77bf0000 77c8f000 rpcrt4 Fri Mar 25 02:09:42 2005
77ca0000 77da3000 comctl32_77ca0000 Mon Aug 28 09:25:59 2006
77db0000 77dc1000 winsta Fri Mar 25 02:09:51 2005
77de0000 77e71000 user32 Fri Mar 25 02:09:49 2005
77e80000 77ed2000 shlwapi Wed Sep 20 01:33:12 2006
77ee0000 77ef1000 regapi Fri Mar 25 02:09:51 2005
77f20000 77fcb000 advapi32 Fri Mar 25 02:09:06 2005
780a0000 780b2000 MSVCIRT Wed Jun 17 19:45:46 1998
780c0000 78121000 MSVCP60 Wed Jun 17 19:52:10 1998
79040000 79085000 fusion Fri Feb 18 20:57:41 2005
79170000 79198000 mscoree Fri Feb 18 20:57:48 2005
791b0000 79417000 mscorwks Fri Feb 18 20:59:56 2005
79510000 79523000 mscorsn Fri Feb 18 20:30:38 2005
79780000 7998c000 mscorlib Fri Feb 18 20:48:36 2005
79990000 79cce000 mscorlib_79990000 Thu Nov 02 04:53:27 2006
7c340000 7c396000 msvcr71 Fri Feb 21 12:42:20 2003
7c800000 7c93e000 kernel32 Tue Jul 25 13:37:16 2006
7c940000 7ca19000 ntdll Fri Mar 25 02:09:53 2005
7ca20000 7d20a000 shell32 Thu Jul 13 13:58:56 2006

Note: you can use the lmtD command to take advantage of WinDbg hypertext commands. In that case you can click on a module name to view its detailed information.

We see that some components are very old, from 1998-1999, and some are from 2006. We also see 3rd-party libraries: OpenGL, Visual Fortran RTL, Python language runtime. Common system modules include two versions of the C/C++ runtime library, 6.0 and 7.1 (msvcr71). Specific system modules include MFC and .NET, MSJET, ODBC and OLE DB support. There is a sign of DLL Hell here too. The OLE Automation DLL in the system32 folder seems to be very old and doesn’t correspond to Windows 2003 SP1, which should have file version 5.2.3790.1830:

0:001> lmv m OLEAUT32
start end module name
65340000 653d2000 OLEAUT32 (deferred)
Image path: C:\WINDOWS\system32\OLEAUT32.DLL
Image name: OLEAUT32.DLL
Timestamp: Wed Sep 01 00:15:11 1999 (37CC61FF)
CheckSum: 0009475A
ImageSize: 00092000
File version: 2.40.4277.1
Product version: 2.40.4277.1
File flags: 2 (Mask 3F) Pre-release
File OS: 40004 NT Win32
File type: 2.0 Dll
File date: 00000000.00000000
Translations: 0409.04e4
CompanyName: Microsoft Corporation
ProductName: Microsoft OLE 2.40 for Windows NT(TM) and Windows 95(TM) Operating Systems
InternalName: OLEAUT32.DLL
ProductVersion: 2.40.4277
FileVersion: 2.40.4277
FileDescription: Microsoft OLE 2.40 for Windows NT(TM) and Windows 95(TM) Operating Systems
LegalCopyright: Copyright © Microsoft Corp. 1993-1998.
LegalTrademarks: Microsoft® is a registered trademark of Microsoft Corporation. Windows NT(TM) and Windows 95(TM) are trademarks of Microsoft Corporation.
Comments: Microsoft OLE 2.40 for Windows NT(TM) and Windows 95(TM) Operating Systems
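With a listing this long, the age spread is easier to see if the lmt output is saved to a text file and sorted by timestamp. The following is only a rough sketch of that idea in Python, not a WinDbg feature: the function names are made up for illustration, and the parser assumes each row has exactly the eight whitespace-separated fields seen in the listing above (start, end, module name, weekday, month, day, time, year) with English month names.

```python
from datetime import datetime

# A few rows copied from the lmt listing above.
LMT_SAMPLE = """\
start end module name
00400000 030ba000 app_main Mon Dec 04 21:22:42 2006
06230000 06240000 pwrpc32 Thu Oct 22 16:22:40 1998
65340000 653d2000 OLEAUT32 Wed Sep 01 00:15:11 1999
7c940000 7ca19000 ntdll Fri Mar 25 02:09:53 2005
"""


def parse_lmt(text):
    """Return (module name, timestamp) pairs from lmt-style rows."""
    modules = []
    for line in text.splitlines():
        fields = line.split()
        # Expected layout: start end name weekday month day time year
        if len(fields) != 8:
            continue  # header row, "..." placeholders, prompts
        try:
            # The weekday field is skipped; month/day/time/year are enough.
            stamp = datetime.strptime(" ".join(fields[4:8]), "%b %d %H:%M:%S %Y")
        except ValueError:
            continue
        modules.append((fields[2], stamp))
    return modules


def modules_older_than(text, year):
    """Names of modules linked before the given year, oldest first."""
    old = [(name, stamp) for name, stamp in parse_lmt(text) if stamp.year < year]
    return [name for name, _ in sorted(old, key=lambda item: item[1])]
```

Calling modules_older_than(LMT_SAMPLE, 2000) on the sample rows picks out pwrpc32 and OLEAUT32, the same 1998-1999 components discussed above.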

- Dmitry Vostokov @ DumpAnalysis.org -

ASLR: Address Space Layout Randomization

Tuesday, May 22nd, 2007

From the book Writing Secure Code for Windows Vista by Michael Howard and David LeBlanc I learnt that Vista has the new ASLR feature:

  • Load address randomization (/dynamicbase linker option)
  • Stack address randomization (/dynamicbase linker option)
  • Heap randomization

The first randomization changes addresses across Vista reboots. The second randomization happens every time you launch an application linked with /dynamicbase option. The third randomization happens every time you launch an application linked with or without /dynamicbase option as we will see below.

The book shows some source code that prints addresses of various modules, functions and local parameters to show the feature. I decided to check that using a more direct route by attaching WinDbg to calc, notepad and the pre-Vista application TestDefaultDebugger. Obviously native Vista applications use ASLR.

Comparison between two calc.exe processes inspected separately before and after reboot shows that the main module and system dlls have different load addresses:

0:000> lm
start end module name
009f0000 00a1e000 calc
74710000 748a4000 comctl32
75b10000 75bba000 msvcrt
…
…
…
76f00000 76fbf000 ADVAPI32
770d0000 771a8000 kernel32
771b0000 7724e000 USER32
77250000 7736e000 ntdll

0:000> lm
start end module name
00470000 0049e000 calc
…
…
…
743e0000 74574000 comctl32
…
75730000 757da000 msvcrt
757e0000 7589f000 ADVAPI32
…
75e20000 75ebe000 USER32
…
76cf0000 76dc8000 kernel32
76dd0000 76eee000 ntdll
…

The main module address has a different 3rd byte across reboots. I don’t believe that 0x00 is allowed because then we would have a 0x00000000 load address. Therefore we have 255 unique load addresses chosen randomly.
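To make the "3rd byte" observation concrete, here is a small Python sketch that pulls that byte out of the two calc load addresses from the listings above. It reads "3rd byte" as bits 16-23 of the 32-bit address, which is my interpretation of the text; the helper name is made up for illustration.

```python
def third_byte(load_address):
    """Bits 16-23 of a 32-bit address: the byte that differs across reboots."""
    return (load_address >> 16) & 0xFF


# calc base addresses taken from the two lm listings above.
CALC_BASES = [0x009F0000, 0x00470000]

for base in CALC_BASES:
    # The low 16 bits are zero in both cases, so only the upper bytes vary.
    print(f"{base:08x}: third byte {third_byte(base):#04x}, low word {base & 0xFFFF:#06x}")
```

Both bases have a zero low word, which is consistent with the text: in these two samples only the third byte (0x9f versus 0x47) changes.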

Stack addresses are different:

0:000> k
ChildEBP RetAddr
000ffc8c 771d199a ntdll!KiFastSystemCallRet
000ffc90 771d19cd USER32!NtUserGetMessage+0xc
000ffcac 009f24e8 USER32!GetMessageW+0x33
000ffd08 00a02588 calc!WinMain+0x278
000ffd98 77113833 calc!_initterm_e+0x1a1
000ffda4 7728a9bd kernel32!BaseThreadInitThunk+0xe
000ffde4 00000000 ntdll!_RtlUserThreadStart+0x23

0:000> k
ChildEBP RetAddr
0007fbe4 75e4199a ntdll!KiFastSystemCallRet
0007fbe8 75e419cd USER32!NtUserGetMessage+0xc
0007fc04 004724e8 USER32!GetMessageW+0x33
0007fc60 00482588 calc!WinMain+0x278
0007fcf0 76d33833 calc!_initterm_e+0x1a1
0007fcfc 76e0a9bd kernel32!BaseThreadInitThunk+0xe
0007fd3c 00000000 ntdll!_RtlUserThreadStart+0x23

Because module base addresses are different, return addresses on call stacks are different too.
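One way to check this is to subtract each module's base from the return addresses that land in it: if only the base moved, the offsets should match between the two runs. The Python sketch below does this with numbers copied from the lm and k listings above; the pairing of each return address with its module is my reading of those listings, and the helper name is illustrative.

```python
def rva(address, module_base):
    """Offset of an address from the base of the module that contains it."""
    if address < module_base:
        raise ValueError("address is below the module base")
    return address - module_base


# (return address, module base) before and after the reboot.
RETURN_ADDRESSES = {
    "calc": ((0x009F24E8, 0x009F0000), (0x004724E8, 0x00470000)),
    "kernel32": ((0x77113833, 0x770D0000), (0x76D33833, 0x76CF0000)),
    "ntdll": ((0x7728A9BD, 0x77250000), (0x76E0A9BD, 0x76DD0000)),
}

for module, (before, after) in RETURN_ADDRESSES.items():
    print(f"{module}: {rva(*before):#x} before reboot, {rva(*after):#x} after")
```

In all three modules the offsets agree, so the difference in the raw return addresses comes entirely from the relocated bases.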

Heap base addresses are different:

0:000> !heap
Index Address
1: 00120000
2: 00010000
3: 00760000
4: 00990000
5: 00700000
6: 00670000
7: 01320000

0:000> !heap
Index Address
1: 001b0000
2: 00010000
3: 00a00000
4: 009c0000
5: 00400000
6: 00900000
7: 01260000

PEB and environment addresses are different:

notepad.exe (PID 1248):

0:000> !peb
PEB at 7ffd4000
...
Environment: 000507e8

notepad.exe (PID 1370):

0:000> !peb
PEB at 7ffd9000
...
Environment: 003a07e8

If we look inside the TEB we see that the pointers to the exception handler list are different and the stack bases are different too:

notepad.exe (PID 1248):

0:000> !teb
TEB at 7ffdf000
ExceptionList: 0023ff34
StackBase: 00240000
StackLimit: 0022f000
SubSystemTib: 00000000
FiberData: 00001e00
ArbitraryUserPointer: 00000000
Self: 7ffdf000
EnvironmentPointer: 00000000
ClientId: 00001248 . 000004e0
RpcHandle: 00000000
Tls Storage: 7ffdf02c
PEB Address: 7ffd4000
LastErrorValue: 0
LastStatusValue: c0000034
Count Owned Locks: 0
HardErrorMode: 0

notepad.exe (PID 1370):

0:000> !teb
TEB at 7ffdf000
ExceptionList: 001ffa00
StackBase: 00200000
StackLimit: 001ef000
SubSystemTib: 00000000
FiberData: 00001e00
ArbitraryUserPointer: 00000000
Self: 7ffdf000
EnvironmentPointer: 00000000
ClientId: 00001370 . 00001454
RpcHandle: 00000000
Tls Storage: 7ffdf02c
PEB Address: 7ffd9000
LastErrorValue: 5
LastStatusValue: c0000008
Count Owned Locks: 0
HardErrorMode: 0

However, if we look at old applications that weren’t linked with the /dynamicbase option we see that the main module and old DLL base addresses are the same:

0:000> lm
start end module name
00400000 00435000 TestDefaultDebugger
20000000 2000d000 LvHook

To summarize different alternatives I created the following table where

“New” column - processes linked with /dynamicbase option, no reboot
“New/Reboot” column - processes linked with /dynamicbase option, reboot
“Old” column - old processes, no reboot
“Old/Reboot” column - old processes, reboot

Randomization   | New/Reboot | New | Old/Reboot | Old
------------------------------------------------------
Module          |      +     |  -  |      -     |  -
------------------------------------------------------
System DLLs     |      +     |  -  |      +     |  -
------------------------------------------------------
Stack           |      +     |  +  |      -     |  -
------------------------------------------------------
Heap            |      +     |  +  |      +     |  +
------------------------------------------------------
PEB             |      +     |  +  |      +     |  +
------------------------------------------------------
Environment     |      +     |  +  |      +     |  +
------------------------------------------------------
ExceptionList   |      +     |  +  |      -     |  -

From PEB and process heap base addresses we can see that environment addresses are always correlated with the heap:

0:000> !heap
Index Address Name Debugging options enabled
1: 005f0000

0:000> !peb
PEB at 7ffd7000
...
...
...
ProcessHeap: 005f0000
ProcessParameters: 005f1540
Environment: 005f07e8
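The correlation can be checked numerically. The Python sketch below takes the three Environment addresses that appear in this post and compares their low 16 bits; it assumes, based only on the listings here, that the process heap starts on a 64 KB boundary, so the low word is the offset into the heap. The helper names are made up for illustration.

```python
def offset_from(address, base):
    """Distance of an address from a known base (for example the process heap)."""
    return address - base


def low_word(address):
    """Low 16 bits: the offset inside a 64 KB-aligned region (an assumption here)."""
    return address & 0xFFFF


# Environment addresses from the three !peb outputs in this post.
ENVIRONMENT_ADDRESSES = [0x000507E8, 0x003A07E8, 0x005F07E8]

# The last output also shows ProcessHeap, so the offset can be computed directly.
print(hex(offset_from(0x005F07E8, 0x005F0000)))
print([hex(low_word(a)) for a in ENVIRONMENT_ADDRESSES])
```

The environment block sits at the same offset in every sample, so randomizing the heap base is what moves it.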

I think the reason why Microsoft didn’t enable ASLR by default is to prevent the Changed Environment pattern from appearing.

- Dmitry Vostokov -

Where did the crash dump come from?

Tuesday, May 22nd, 2007

This is the basic check and very useful if your customer complains that the fix you sent yesterday doesn’t work. Check the computer name from the dump. It could be the case that your fix wasn’t applied to all computers. Here is a short summary for different dump types:

1. Complete/kernel memory dumps: dS srv!srvcomputername

1: kd> dS srv!srvcomputername
e17c9078 "COMPUTER-NAME"

2. User dumps: !peb and the subsequent search inside the environment variables

0:000> !peb
PEB at 7ffde000
...
...
...
Environment: 0x10000
...
0:000> s-su 0x10000 0x20000
...
...
000123b2 "COMPUTERNAME=COMPUTER-NAME"
...
...
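If the environment memory is written out to a file instead of being searched in the debugger, the same lookup can be done offline. The following Python sketch relies on the standard layout of a Windows environment block: UTF-16LE "NAME=value" strings, each terminated by a null character, with an empty string marking the end. The function name and the sample data are made up for illustration; only the COMPUTERNAME entry mirrors the output above.

```python
def parse_environment_block(raw):
    """Split a raw UTF-16LE environment block into a name -> value dict."""
    text = raw.decode("utf-16-le")
    variables = {}
    for entry in text.split("\0"):
        if not entry:
            break  # an empty string marks the end of the block
        name, separator, value = entry.partition("=")
        # Entries that start with "=" (per-drive current directories) have
        # an empty name and are skipped.
        if separator and name:
            variables[name] = value
    return variables


# Hypothetical block; "OS=Windows_NT" is invented for the example.
SAMPLE_BLOCK = "COMPUTERNAME=COMPUTER-NAME\0OS=Windows_NT\0\0".encode("utf-16-le")

print(parse_environment_block(SAMPLE_BLOCK)["COMPUTERNAME"])
```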

The dS command shown above interprets the address as a pointer to a UNICODE_STRING structure, which is widely used inside the Windows kernel space:

1: kd> dt _UNICODE_STRING
+0x000 Length : Uint2B
+0x002 MaximumLength : Uint2B
+0x004 Buffer : Ptr32 Uint2B

DDK definition:

typedef struct _UNICODE_STRING {
    USHORT Length;
    USHORT MaximumLength;
    PWSTR  Buffer;
} UNICODE_STRING, *PUNICODE_STRING;

Let’s dd the name:

1: kd> dd srv!srvcomputername l2
f5e8d1a0 0022001a e17c9078

Such a combination of short integers followed by an address is usually an indication that you have a UNICODE_STRING structure:

1: kd> du e17c9078
e17c9078 "COMPUTER-NAME   "

We can double-check it with dt command:

1: kd> dt _UNICODE_STRING f5e8d1a0
"COMPUTER-NAME"
+0x000 Length : 0x1a
+0x002 MaximumLength : 0x22
+0x004 Buffer : 0xe17c9078 "COMPUTER-NAME"
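The way the two dd values map onto the structure fields can be sketched in a few lines of Python. This assumes the 32-bit little-endian layout shown by dt above (two 16-bit lengths followed by a 32-bit pointer); the function name is illustrative.

```python
import struct


def decode_unicode_string_header(dwords):
    """Split the two DWORDs printed by dd into Length, MaximumLength, Buffer."""
    # Re-create the eight bytes as they lie in memory (little-endian) ...
    raw = struct.pack("<2I", *dwords)
    # ... and read them back using the 32-bit _UNICODE_STRING layout.
    length, maximum_length, buffer_address = struct.unpack("<HHI", raw)
    return length, maximum_length, buffer_address


# Values from: dd srv!srvcomputername l2
length, maximum_length, buffer_address = decode_unicode_string_header(
    (0x0022001A, 0xE17C9078)
)
print(hex(length), hex(maximum_length), hex(buffer_address))
# Length counts bytes, so the number of UTF-16 characters is half of it.
print(length // 2, len("COMPUTER-NAME"))
```

The low half of the first DWORD is Length (0x1a bytes, 13 characters) and the high half is MaximumLength (0x22), which matches the dt output.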

- Dmitry Vostokov -

Custom postmortem debuggers on Vista

Sunday, May 20th, 2007

Motivated by the previous post I decided to try better alternatives because a new Vista installation has neither drwtsn32.exe nor NTSD.

Any application that can attach to a process based on its PID and save its memory state in a dump will do. The first obvious candidate is userdump.exe, which can actually set itself up in the registry properly. Here are the detailed instructions. If you already have the latest version of userdump.exe you can skip the first two steps:

1. Download the latest User Mode Process Dumper from Microsoft. At the time of this writing it has version 8.1

2. Run the downloaded executable file and it will prompt to unzip. By default the current version unzips to c:\kktools\userdump8.1. Do not run setup afterwards because it is not needed for our purposes.

3. Create kktools folder in system32 folder

4. Create the folder where userdump will save your dumps; I use c:\UserDumps in my example

5. Copy dbghelp.dll and userdump.exe from x86 or x64 folder depending on the version of Windows you use to system32\kktools folder you created in step 3.

6. Run the elevated command prompt and enter the following command:

C:\Windows\System32\kktools>userdump -I -d c:\UserDumps
User Mode Process Dumper (Version 8.1.2929.5)
Copyright (c) Microsoft Corp. All rights reserved.
Userdump set up Aedebug registry key.

7. Check the following registry key:

HKEY_LOCAL_MACHINE\SOFTWARE\Microsoft\Windows NT\CurrentVersion\AeDebug
Debugger=C:\Windows\system32\kktools\userdump -E %ld %ld -D c:\UserDumps\
Auto=0

You can set Auto to 1 if you want to see the following dialog every time you have a crash:

8. Test the new settings by using TestDefaultDebugger

9. When you have a crash userdump.exe will show a window on top of your screen while saving the dump file:

Of course, you can set up userdump.exe as the postmortem debugger on other Windows platforms. The problem with userdump.exe is that it overwrites the previous process dump because it uses the module name for the dump file name, for example, TestDefaultDebugger.dmp, so you need to rename or save the dump if you have multiple crashes for the same application.
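The renaming step can be scripted. Below is a minimal Python sketch, not part of userdump itself: it moves a dump to a name that includes a timestamp so that the next crash does not overwrite it. The function names and the naming scheme are my own choices for illustration.

```python
import shutil
from datetime import datetime
from pathlib import Path


def archived_name(dump_name, when):
    """Build a unique file name such as TestDefaultDebugger_20070520_120000.dmp."""
    path = Path(dump_name)
    return f"{path.stem}_{when:%Y%m%d_%H%M%S}{path.suffix}"


def archive_dump(dump_path, when=None):
    """Rename the dump in place and return the new path."""
    source = Path(dump_path)
    when = when or datetime.now()
    target = source.with_name(archived_name(source.name, when))
    shutil.move(str(source), str(target))
    return target


print(archived_name("TestDefaultDebugger.dmp", datetime(2007, 5, 20, 12, 0, 0)))
```

Running archive_dump on the dump file right after each crash keeps every dump under its own name.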

Other programs can be set up instead of userdump.exe. One of them is WinDbg. Here is the article I wrote about WinDbg so I won’t repeat its content here, except the registry key I tested on Vista:

HKEY_LOCAL_MACHINE\SOFTWARE\Microsoft\Windows NT\CurrentVersion\AeDebug
Debugger="C:\Program Files\Debugging Tools for Windows\windbg.exe" -p %ld -e %ld -g -c ".dump /o /ma /u c:\UserDumps\new.dmp; q" -Q -QS -QY -QSY

Finally, you can use the command-line CDB user mode debugger from Debugging Tools for Windows. Here is the registry key:

HKEY_LOCAL_MACHINE\SOFTWARE\Microsoft\Windows NT\CurrentVersion\AeDebug
Debugger="C:\Program Files\Debugging Tools for Windows\cdb.exe" -p %ld -e %ld -g -c ".dump /o /ma /u c:\UserDumps\new.dmp; q"

When you have a crash cdb.exe will be launched and the following console window will appear:

The advantage of using CDB or WinDbg is that you can omit q from the -c command line option and leave your debugger window open for further process inspection.
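The Debugger value in these keys is a format string: when a process crashes, the system fills the first %ld with the process ID and the second with an event handle before launching the command. The Python sketch below only imitates that substitution so the resulting command line can be inspected. The function name and the example PID and handle values are made up, and it relies on Python's % operator treating %ld the same as %d.

```python
# The cdb.exe Debugger value from the registry key above.
CDB_TEMPLATE = (
    r'"C:\Program Files\Debugging Tools for Windows\cdb.exe" '
    r'-p %ld -e %ld -g -c ".dump /o /ma /u c:\UserDumps\new.dmp; q"'
)


def expand_aedebug(template, process_id, event_handle):
    """Imitate how the two %ld placeholders are filled in for a crashed process."""
    # Python ignores the "l" length modifier, so %ld behaves like %d here.
    return template % (process_id, event_handle)


# Hypothetical PID and event handle, for illustration only.
print(expand_aedebug(CDB_TEMPLATE, 8048, 308))
```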

- Dmitry Vostokov -

Resurrecting Dr. Watson on Vista

Saturday, May 19th, 2007

Feeling nostalgic about pre-Vista times I recalled that one month before upgrading my Windows XP to Vista I saved the copy of Dr. Watson (drwtsn32.exe). Of course, during upgrade, drwtsn32.exe was removed from system32 folder. Now I copied it back and set it as the default postmortem debugger from the elevated command prompt:

When I looked at the registry I found the correctly set key values:

HKEY_LOCAL_MACHINE\SOFTWARE\Microsoft\Windows NT\CurrentVersion\AeDebug
Debugger=drwtsn32 -p %ld -e %ld -g
Auto=1

Auto=1 means do not show the error message box, just go ahead and dump the process. Actually with Auto=0 Dr. Watson doesn’t work on my Vista.

Also I configured Dr. Watson to store the log and full user dump in c:\DrWatson folder by running drwtsn32.exe from the same elevated command prompt:

Next I launched TestDefaultDebugger and hit the big crash button. Access violation happened and I saw familiar “Program Error” message box:

The log was created and the user dump was saved in the specified folder. All subsequent crashes were appended to the log and user.dmp was updated. When I opened the dump in WinDbg I got the following output:

Loading Dump File [C:\DrWatson\user.dmp]
User Mini Dump File with Full Memory: Only application data is available
Comment: 'Dr. Watson generated MiniDump'
Symbol search path is: SRV*c:\websymbols*http://msdl.microsoft.com/download/symbols
Executable search path is:
Windows Vista Version 6000 UP Free x86 compatible
Product: WinNt, suite: SingleUserTS
Debug session time: Sat May 19 20:52:23.000 2007 (GMT+1)
System Uptime: 5 days 20:00:04.062
Process Uptime: 0 days 0:00:03.000
This dump file has an exception of interest stored in it.
The stored exception information can be accessed via .ecxr.
(1f70.1e0c): Access violation - code c0000005 (first/second chance not available)
eax=00000000 ebx=00000001 ecx=0012fe70 edx=00000000 esi=00425ae8 edi=0012fe70
eip=004014f0 esp=0012f8a8 ebp=0012f8b4 iopl=0 nv up ei ng nz ac pe cy
cs=001b ss=0023 ds=0023 es=0023 fs=003b gs=0000 efl=00010297
TestDefaultDebugger!CTestDefaultDebuggerDlg::OnBnClickedButton1:
004014f0 c7050000000000000000 mov dword ptr ds:[0],0 ds:0023:00000000=????????

Therefore I believe that if I had saved ntsd.exe before upgrading to Vista I would have been able to set it as a default postmortem debugger too.

- Dmitry Vostokov -

Inside Vista Error Reporting (Part 1)

Saturday, May 19th, 2007

This is a follow up to the post about postmortem debuggers on Windows XP/W2K3. Now we look inside the same mechanism on Vista. After launching TestDefaultDebugger and pushing its crash button we get the following Windows error reporting dialog:

If we attach WinDbg to our TestDefaultDebugger process we would no longer see our default unhandled exception filter waiting for the error reporting process:

Windows XP

0:000> k
ChildEBP RetAddr
0012d318 7c90e9ab ntdll!KiFastSystemCallRet
0012d31c 7c8094e2 ntdll!ZwWaitForMultipleObjects+0xc
0012d3b8 7c80a075 kernel32!WaitForMultipleObjectsEx+0x12c
0012d3d4 6945763c kernel32!WaitForMultipleObjects+0x18
0012dd68 694582b1 faultrep!StartDWException+0x5df

0012eddc 7c8633e9 faultrep!ReportFault+0x533
0012f47c 00411eaa kernel32!UnhandledExceptionFilter+0x587
0012f8a4 00403263 TestDefaultDebugger!CTestDefaultDebuggerDlg::OnBnClickedButton1
0012f8b4 00403470 TestDefaultDebugger!_AfxDispatchCmdMsg+0x43



0012fff0 00000000 kernel32!BaseProcessStart+0x23

Windows Vista

0:001> ~*kL 100
0 Id: 120c.148c Suspend: 1 Teb: 7ffdf000 Unfrozen
ChildEBP RetAddr
0012f8a4 00403263 TestDefaultDebugger!CTestDefaultDebuggerDlg::OnBnClickedButton1
0012f8b4 00403470 TestDefaultDebugger!_AfxDispatchCmdMsg+0x43
0012f8e4 00402a27 TestDefaultDebugger!CCmdTarget::OnCmdMsg+0x118
0012f908 00408e69 TestDefaultDebugger!CDialog::OnCmdMsg+0x1b
0012f958 004098d9 TestDefaultDebugger!CWnd::OnCommand+0x90
0012f9f4 00406258 TestDefaultDebugger!CWnd::OnWndMsg+0x36
0012fa14 0040836d TestDefaultDebugger!CWnd::WindowProc+0x22
0012fa7c 004083f4 TestDefaultDebugger!AfxCallWndProc+0x9a
0012fa9c 77b71a10 TestDefaultDebugger!AfxWndProc+0x34
0012fac8 77b71ae8 USER32!InternalCallWinProc+0x23
0012fb40 77b7286a USER32!UserCallWinProcCheckWow+0x14b
0012fb80 77b72bba USER32!SendMessageWorker+0x4b7
0012fba0 7504e5cc USER32!SendMessageW+0x7c
0012fbc0 7504e583 COMCTL32!Button_NotifyParent+0x3d
0012fbdc 7504e680 COMCTL32!Button_ReleaseCapture+0x112
0012fc34 77b71a10 COMCTL32!Button_WndProc+0xa4b
0012fc60 77b71ae8 USER32!InternalCallWinProc+0x23
0012fcd8 77b72a47 USER32!UserCallWinProcCheckWow+0x14b
0012fd3c 77b72a98 USER32!DispatchMessageWorker+0x322
0012fd4c 77b6120c USER32!DispatchMessageW+0xf
0012fd70 0040568b USER32!IsDialogMessageW+0x586
0012fd80 004065d8 TestDefaultDebugger!CWnd::IsDialogMessageW+0x2e
0012fd88 00402a07 TestDefaultDebugger!CWnd::PreTranslateInput+0x29
0012fd98 00408041 TestDefaultDebugger!CDialog::PreTranslateMessage+0x96
0012fda8 00403ae3 TestDefaultDebugger!CWnd::WalkPreTranslateTree+0x1f
0012fdbc 00403c1e TestDefaultDebugger!AfxInternalPreTranslateMessage+0x3b
0012fdc4 00403b29 TestDefaultDebugger!CWinThread::PreTranslateMessage+0x9
0012fdcc 00403c68 TestDefaultDebugger!AfxPreTranslateMessage+0x15
0012fddc 00407920 TestDefaultDebugger!AfxInternalPumpMessage+0x2b
0012fe00 004030a1 TestDefaultDebugger!CWnd::RunModalLoop+0xca
0012fe4c 0040110d TestDefaultDebugger!CDialog::DoModal+0x12c
0012fef8 004206fb TestDefaultDebugger!CTestDefaultDebuggerApp::InitInstance+0xdd
0012ff08 0040e852 TestDefaultDebugger!AfxWinMain+0x47
0012ffa0 77603833 TestDefaultDebugger!__tmainCRTStartup+0x176
0012ffac 779ea9bd kernel32!BaseThreadInitThunk+0xe
0012ffec 00000000 ntdll!_RtlUserThreadStart+0x23
# 1 Id: 120c.17e4 Suspend: 1 Teb: 7ffde000 Unfrozen
ChildEBP RetAddr
011cff70 77a3f0a9 ntdll!DbgBreakPoint
011cffa0 77603833 ntdll!DbgUiRemoteBreakin+0x3c
011cffac 779ea9bd kernel32!BaseThreadInitThunk+0xe
011cffec 00000000 ntdll!_RtlUserThreadStart+0x23

Let’s look at the faulting thread’s raw stack data:

0:001> ~0 s
eax=00000000 ebx=00000001 ecx=0012fe70 edx=00000000 esi=00425ae8 edi=0012fe70
eip=004014f0 esp=0012f8a8 ebp=0012f8b4 iopl=0 nv up ei ng nz ac pe cy
cs=001b ss=0023 ds=0023 es=0023 fs=003b gs=0000 efl=00010297
TestDefaultDebugger!CTestDefaultDebuggerDlg::OnBnClickedButton1:
004014f0 mov dword ptr ds:[0],0 ds:0023:00000000=????????
0:000> !teb
TEB at 7ffdf000
ExceptionList: 0012f9e8
StackBase: 00130000
StackLimit: 0012d000
SubSystemTib: 00000000
FiberData: 00001e00
ArbitraryUserPointer: 00000000
Self: 7ffdf000
EnvironmentPointer: 00000000
ClientId: 0000120c . 0000148c
RpcHandle: 00000000
Tls Storage: 7ffdf02c
PEB Address: 7ffda000
LastErrorValue: 0
LastStatusValue: c000008a
Count Owned Locks: 0
HardErrorMode: 0
0:000> dds 0012d000 00130000



0012f368 0012f3c0
0012f36c 7760fb01 kernel32!GetApplicationRecoveryCallback+0x33
0012f370 ffffffff
0012f374 0012f380
0012f378 00000001
0012f37c 00000000
0012f380 00000000
0012f384 00000000
0012f388 00000000
0012f38c 00000000
0012f390 00000000
0012f394 00000000
0012f398 00000000
0012f39c 00000000
0012f3a0 00000000
0012f3a4 00000000
0012f3a8 00000000
0012f3ac 00000000
0012f3b0 00000000
0012f3b4 00000000
0012f3b8 00000000
0012f3bc 00000000
0012f3c0 0012f410
0012f3c4 7767aa88 kernel32!WerpReportExceptionInProcessContext+0x82
0012f3c8 ffffffff
0012f3cc 0012f3ec
0012f3d0 00000000
0012f3d4 00000000
0012f3d8 7767aab7 kernel32!WerpReportExceptionInProcessContext+0xa7
0012f3dc 001257b9
0012f3e0 00000001
0012f3e4 00000000
0012f3e8 0012f4c8
0012f3ec 00000000
0012f3f0 00000000
0012f3f4 00000000
0012f3f8 0012f3dc
0012f3fc ffffffff
0012f400 0012f488
0012f404 775d5ac9 kernel32!_except_handler4
0012f408 77670969 kernel32!Internal_NotifyUILanguageChange+0x4a6
0012f40c fffffffe
0012f410 7767aab7 kernel32!WerpReportExceptionInProcessContext+0xa7
0012f414 77655b41 kernel32!UnhandledExceptionFilter+0x1b2
0012f418 77655cbd kernel32!UnhandledExceptionFilter+0x32e
0012f41c 00125731
0012f420 00000000
0012f424 0012f4c8
0012f428 00000000
0012f42c 00000000
0012f430 00000000
0012f434 00000000
0012f438 00000000
0012f43c 00000800
0012f440 00000000
0012f444 00000000
0012f448 00000000
0012f44c 00000000
0012f450 00000000
0012f454 00000005
0012f458 994ac7c4
0012f45c 00000011
0012f460 00000000
0012f464 0012f5c0
0012f468 775d5ac9 kernel32!_except_handler4
0012f46c 00000001
0012f470 00000000
0012f474 77655cbd kernel32!UnhandledExceptionFilter+0x32e
0012f478 00000000
0012f47c 00000000
0012f480 0012f41c
0012f484 00000024
0012f488 0012f4f4
0012f48c 775d5ac9 kernel32!_except_handler4
0012f490 7765ff59 kernel32!PEWriteResource<_IMAGE_NT_HEADERS>+0x50a
0012f494 fffffffe
0012f498 77655cbd kernel32!UnhandledExceptionFilter+0x32e
0012f49c 77a29f8e ntdll!_RtlUserThreadStart+0x6f
0012f4a0 00000000
0012f4a4 779b8dd4 ntdll!_EH4_CallFilterFunc+0x12
0012f4a8 00000000
0012f4ac 0012ffec
0012f4b0 779ff108 ntdll! ?? ::FNODOBFM::`string'+0xb6e
0012f4b4 0012f4dc
0012f4b8 779b40e4 ntdll!_except_handler4+0xcc
0012f4bc 00000000
0012f4c0 00000000
0012f4c4 00000000
0012f4c8 0012f5c0
0012f4cc 0012f5dc
0012f4d0 779ff118 ntdll! ?? ::FNODOBFM::`string'+0xb7e
0012f4d4 00000001
0012f4d8 0112f5c0
0012f4dc 0012f500
0012f4e0 77a11039 ntdll!ExecuteHandler2+0x26
0012f4e4 fffffffe
0012f4e8 0012ffdc
0012f4ec 0012f5dc
0012f4f0 0012f59c
0012f4f4 0012f9e8
0012f4f8 77a1104d ntdll!ExecuteHandler2+0x3a
0012f4fc 0012ffdc
0012f500 0012f5a8
0012f504 77a1100b ntdll!ExecuteHandler+0x24
0012f508 0012f5c0
0012f50c 0012ffdc
0012f510 0012fe70
0012f514 0012f59c
0012f518 779b8bf2 ntdll!_except_handler4
0012f51c 00000000
0012f520 0012f5c0
0012f524 0012f538
0012f528 779b94e3 ntdll!RtlCallVectoredContinueHandlers+0x15
0012f52c 0012f5c0
0012f530 0012f5dc
0012f534 77a754c0 ntdll!RtlpCallbackEntryList
0012f538 0012f5a8
0012f53c 779b94c1 ntdll!RtlDispatchException+0x11f
0012f540 0012f5c0
0012f544 0012f5dc
0012f548 00425ae8 TestDefaultDebugger!CTestDefaultDebuggerApp::`vftable'+0x154
0012f54c 00000000
0012f550 00000502
0012f554 00000000
0012f558 00a460e0
0012f55c 00000000
0012f560 00000000
0012f564 00000070
0012f568 ffffffff
0012f56c ffffffff
0012f570 77b60dba USER32!UserCallDlgProcCheckWow+0x5f
0012f574 77b60e63 USER32!UserCallDlgProcCheckWow+0x16e
0012f578 0000006c
0012f57c 00000000
0012f580 00000000
0012f584 00000000
0012f588 00000000
0012f58c 0000004e
0012f590 00000000
0012f594 0012f634
0012f598 77bb76cc USER32!_except_handler4
0012f59c 0012f634
0012f5a0 00130000
0012f5a4 00000000
0012f5a8 0012f8b4
0012f5ac 77a10060 ntdll!NtRaiseException+0xc
0012f5b0 77a10eb2 ntdll!KiUserExceptionDispatcher+0x2a
0012f5b4 0012f5c0


It shows the presence of kernel32!UnhandledExceptionFilter calls. Let’s open TestDefaultDebugger.exe in WinDbg, put a breakpoint on UnhandledExceptionFilter and trace the execution. We have to change the return value of IsDebugPortPresent to simulate the normal fault handling logic when no active debugger is attached:

0:000> bp kernel32!UnhandledExceptionFilter
0:000> g
(fb0.1190): Access violation - code c0000005 (first chance)
First chance exceptions are reported before any exception handling.
This exception may be expected and handled.
eax=00000000 ebx=00000001 ecx=0012fe70 edx=00000000 esi=00425ae8 edi=0012fe70
eip=004014f0 esp=0012f8a8 ebp=0012f8b4 iopl=0 nv up ei ng nz ac pe cy
cs=001b ss=0023 ds=0023 es=0023 fs=003b gs=0000 efl=00010297
TestDefaultDebugger!CTestDefaultDebuggerDlg::OnBnClickedButton1:
004014f0 mov dword ptr ds:[0],0 ds:0023:00000000=????????
0:000> g
Breakpoint 0 hit
eax=0042ae58 ebx=00000000 ecx=0042ae58 edx=0042ae58 esi=003b07d8 edi=c0000005
eip=77655984 esp=0012f478 ebp=0012f494 iopl=0 nv up ei pl zr na pe nc
cs=001b ss=0023 ds=0023 es=0023 fs=003b gs=0000 efl=00000246
kernel32!UnhandledExceptionFilter:
77655984 push 5Ch
0:000> g $$ skip first chance exception
Breakpoint 0 hit
eax=77655984 ebx=00000000 ecx=0012f404 edx=77a10f34 esi=0012f4c8 edi=00000000
eip=77655984 esp=0012f49c ebp=0012ffec iopl=0 nv up ei pl nz na pe nc
cs=001b ss=0023 ds=0023 es=0023 fs=003b gs=0000 efl=00000206
kernel32!UnhandledExceptionFilter:
77655984 push 5Ch
0:000> p
eax=77655984 ebx=00000000 ecx=0012f404 edx=77a10f34 esi=0012f4c8 edi=00000000
eip=77655986 esp=0012f498 ebp=0012ffec iopl=0 nv up ei pl nz na pe nc
cs=001b ss=0023 ds=0023 es=0023 fs=003b gs=0000 efl=00000206
kernel32!UnhandledExceptionFilter+0x2:
77655986 push offset kernel32!strcat_s+0x128d (77655cf0)



0:000> p
eax=00000000 ebx=0012f4c8 ecx=776558e5 edx=77a10f34 esi=00000000 edi=00000000
eip=77655a33 esp=0012f41c ebp=0012f498 iopl=0 nv up ei pl nz ac po cy
cs=001b ss=0023 ds=0023 es=0023 fs=003b gs=0000 efl=00000213
kernel32!UnhandledExceptionFilter+0xa5:
77655a33 call kernel32!IsDebugPortPresent (7765594c)
0:000> p
eax=00000001 ebx=0012f4c8 ecx=0012f3f4 edx=77a10f34 esi=00000000 edi=00000000
eip=77655a38 esp=0012f41c ebp=0012f498 iopl=0 nv up ei pl nz na po nc
cs=001b ss=0023 ds=0023 es=0023 fs=003b gs=0000 efl=00000202
kernel32!UnhandledExceptionFilter+0xaa:
77655a38 test eax,eax
0:000> r eax=0
0:000> p
eax=00000000 ebx=0012f4c8 ecx=0012f3f4 edx=77a10f34 esi=00000000 edi=00000000
eip=77655a3a esp=0012f41c ebp=0012f498 iopl=0 nv up ei pl zr na pe nc
cs=001b ss=0023 ds=0023 es=0023 fs=003b gs=0000 efl=00000246
kernel32!UnhandledExceptionFilter+0xac:
77655a3a jne kernel32!UnhandledExceptionFilter+0x22 (776559a6) [br=0]

Next, we continue to step over using the p command until we see the WerpReportExceptionInProcessContext function, and then step into it:

0:000> p
eax=c0000022 ebx=0012f4c8 ecx=0012f400 edx=77a10f34 esi=00000000 edi=00000001
eip=77655b3c esp=0012f418 ebp=0012f498 iopl=0 nv up ei pl nz na po nc
cs=001b ss=0023 ds=0023 es=0023 fs=003b gs=0000 efl=00000202
kernel32!UnhandledExceptionFilter+0x1ad:
77655b3c call kernel32!WerpReportExceptionInProcessContext (7767aa06)
0:000> t
eax=c0000022 ebx=0012f4c8 ecx=0012f400 edx=77a10f34 esi=00000000 edi=00000001
eip=7767aa06 esp=0012f414 ebp=0012f498 iopl=0 nv up ei pl nz na po nc
cs=001b ss=0023 ds=0023 es=0023 fs=003b gs=0000 efl=00000202
kernel32!WerpReportExceptionInProcessContext:
7767aa06 push 14h

At this point if we look at the stack trace we would see:

0:000> kL 100
ChildEBP RetAddr
0012f410 77655b41 kernel32!WerpReportExceptionInProcessContext
0012f498 77a29f8e kernel32!UnhandledExceptionFilter+0x1b2
0012f4a0 779b8dd4 ntdll!_RtlUserThreadStart+0x6f
0012f4b4 779b40f0 ntdll!_EH4_CallFilterFunc+0x12
0012f4dc 77a11039 ntdll!_except_handler4+0x8e
0012f500 77a1100b ntdll!ExecuteHandler2+0x26
0012f5a8 77a10e97 ntdll!ExecuteHandler+0x24
0012f5a8 004014f0 ntdll!KiUserExceptionDispatcher+0xf
0012f8a4 00403263 TestDefaultDebugger!CTestDefaultDebuggerDlg::OnBnClickedButton1
0012f8b4 00403470 TestDefaultDebugger!_AfxDispatchCmdMsg+0x43


After that we step over again and find that the code flow returns from all exception handlers until KiUserExceptionDispatcher raises the exception again via a ZwRaiseException call.

So it looks like the default unhandled exception filter in Vista only reports the exception and doesn’t itself launch WerFault.exe, the error reporting process that displays the error box.

If we click the Debug button on the error reporting dialog to launch the postmortem debugger (I have Visual Studio Just-In-Time Debugger configured in the AeDebug\Debugger registry key) and look at its parent process, using Process Explorer for example, we see that it is WerFault.exe, which in turn has svchost.exe as its parent.

Now we quit WinDbg and launch TestDefaultDebugger again, push its big crash button and when the error reporting dialog appears we attach another instance of WinDbg to svchost.exe process hosting Windows Error Reporting Service (wersvc.dll).
We see the following threads:

0:000> ~*k
. 0 Id: f8c.f90 Suspend: 1 Teb: 7ffdf000 Unfrozen
ChildEBP RetAddr
0008f5b4 77a10080 ntdll!KiFastSystemCallRet
0008f5b8 7760853f ntdll!ZwReadFile+0xc
0008f630 7709ffe2 kernel32!ReadFile+0x20e
0008f65c 7709fdfb ADVAPI32!ScGetPipeInput+0x2a
0008f6c4 7709bdd2 ADVAPI32!ScDispatcherLoop+0x6c
0008f93c 004a241d ADVAPI32!StartServiceCtrlDispatcherW+0xce
0008f944 004a2401 svchost!SvcHostMain+0x12
0008f948 004a2183 svchost!wmain+0x5
0008f98c 77603833 svchost!_initterm_e+0x163
0008f998 779ea9bd kernel32!BaseThreadInitThunk+0xe
0008f9d8 00000000 ntdll!_RtlUserThreadStart+0x23
1 Id: f8c.fa4 Suspend: 1 Teb: 7ffdd000 Unfrozen
ChildEBP RetAddr
0086f6d0 77a10690 ntdll!KiFastSystemCallRet
0086f6d4 779cb65b ntdll!ZwWaitForMultipleObjects+0xc
0086f870 77603833 ntdll!TppWaiterpThread+0x294
0086f87c 779ea9bd kernel32!BaseThreadInitThunk+0xe
0086f8bc 00000000 ntdll!_RtlUserThreadStart+0x23
2 Id: f8c.fa8 Suspend: 1 Teb: 7ffdc000 Unfrozen
ChildEBP RetAddr
0031f81c 77a0f2c0 ntdll!KiFastSystemCallRet
0031f820 71cb1545 ntdll!NtAlpcSendWaitReceivePort+0xc
0031fd3c 71cb63c4 wersvc!CWerService::LpcServerThread+0x9c
0031fd44 77603833 wersvc!CWerService::StaticLpcServerThread+0xd
0031fd50 779ea9bd kernel32!BaseThreadInitThunk+0xe
0031fd90 00000000 ntdll!_RtlUserThreadStart+0x23
3 Id: f8c.2cc Suspend: 1 Teb: 7ffde000 Unfrozen
ChildEBP RetAddr
00f8f768 77a106a0 ntdll!KiFastSystemCallRet
00f8f76c 776077d4 ntdll!NtWaitForSingleObject+0xc
00f8f7dc 77607742 kernel32!WaitForSingleObjectEx+0xbe
00f8f7f0 71cb6f4b kernel32!WaitForSingleObject+0x12
00f8f858 71cb6803 wersvc!CWerService::ReportCrashKernelMsg+0x256
00f8fb7c 71cb6770 wersvc!CWerService::DispatchPortRequestWorkItem+0x70a
00f8fb90 779c1fbb wersvc!CWerService::StaticDispatchPortRequestWorkItem+0x17
00f8fbb4 77a1a2b8 ntdll!TppSimplepExecuteCallback+0x10c
00f8fcdc 77603833 ntdll!TppWorkerThread+0x522
00f8fce8 779ea9bd kernel32!BaseThreadInitThunk+0xe
00f8fd28 00000000 ntdll!_RtlUserThreadStart+0x23
4 Id: f8c.1b38 Suspend: 1 Teb: 7ffdb000 Unfrozen
ChildEBP RetAddr
00d3fe08 77a10850 ntdll!KiFastSystemCallRet
00d3fe0c 77a1a1b4 ntdll!NtWaitForWorkViaWorkerFactory+0xc
00d3ff34 77603833 ntdll!TppWorkerThread+0x1f6
00d3ff40 779ea9bd kernel32!BaseThreadInitThunk+0xe
00d3ff80 00000000 ntdll!_RtlUserThreadStart+0x23

First, it looks like some LPC notification mechanism is present here (CWerService::LpcServerThread).
Next, if we look at the CWerService::ReportCrashKernelMsg code we see that it calls CWerService::ReportCrash, which in turn loads faultrep.dll:

0:000> .asm no_code_bytes
Assembly options: no_code_bytes
0:000> uf wersvc!CWerService::ReportCrashKernelMsg



wersvc!CWerService::ReportCrashKernelMsg+0x226:
71cb6f13 lea  eax,[ebp-20h]
71cb6f16 push eax
71cb6f17 push dword ptr [ebp-34h]
71cb6f1a push dword ptr [ebp-2Ch]
71cb6f1d call dword ptr [wersvc!_imp__GetCurrentProcessId (71cb1120)]
71cb6f23 push eax
71cb6f24 mov  ecx,dword ptr [ebp-38h]
71cb6f27 call wersvc!CWerService::ReportCrash (71cb7008)
71cb6f2c mov  dword ptr [ebp-1Ch],eax
71cb6f2f cmp  eax,ebx
71cb6f31 jl   wersvc!CWerService::ReportCrashKernelMsg+0x279 (71cb6a10)



0:000> uf wersvc!CWerService::ReportCrash



wersvc!CWerService::ReportCrash+0x3d:
71cb7045 mov  dword ptr [ebp-4],edi
71cb7048 push offset wersvc!`string' (71cb711c)
71cb704d call dword ptr [wersvc!_imp__LoadLibraryW (71cb1144)]

71cb7053 mov  dword ptr [ebp-2Ch],eax
71cb7056 cmp  eax,edi
71cb7058 je   wersvc!CWerService::ReportCrash+0x52 (71cb9b47)

wersvc!CWerService::ReportCrash+0x88:
71cb705e push offset wersvc!`string' (71cb7100)
71cb7063 push eax
71cb7064 call dword ptr [wersvc!_imp__GetProcAddress (71cb1140)]

71cb706a mov  ebx,eax
71cb706c cmp  ebx,edi
71cb706e je   wersvc!CWerService::ReportCrash+0x9a (71cb9b7d)



0:000> du 71cb711c
71cb711c "faultrep.dll"
0:000> da 71cb7100
71cb7100 "WerpInitiateCrashReporting"
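
We used du for the first string and da for the second because LoadLibraryW takes a UTF-16 string whereas GetProcAddress takes an ANSI one. Here is a minimal Python sketch of how the two commands interpret memory. The helper names read_du and read_da are hypothetical, and the byte buffers are constructed by us for illustration rather than read from the dump.

```python
def read_da(memory):
    """Interpret bytes as a NUL-terminated ASCII string (like the da command)."""
    return memory.split(b"\x00", 1)[0].decode("ascii")


def read_du(memory):
    """Interpret bytes as a UTF-16LE string ending at a 16-bit NUL (like du)."""
    units = []
    for i in range(0, len(memory) - 1, 2):
        unit = memory[i:i + 2]
        if unit == b"\x00\x00":
            break
        units.append(unit)
    return b"".join(units).decode("utf-16-le")


# Illustrative buffers: what the two string literals would look like in memory.
wide_literal = "faultrep.dll".encode("utf-16-le") + b"\x00\x00"
ansi_literal = b"WerpInitiateCrashReporting\x00"

if __name__ == "__main__":
    print(read_du(wide_literal))
    print(read_da(ansi_literal))
    # Reading the wide string as ASCII stops at the first zero byte.
    print(read_da(wide_literal))
```

The last call shows why da on a Unicode string displays only its first character.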

If we attach a new instance of WinDbg to WerFault.exe and inspect its threads we would see:

0:003> ~*k
0 Id: 1bfc.16c4 Suspend: 1 Teb: 7ffdf000 Unfrozen
ChildEBP RetAddr
0015de60 77a10690 ntdll!KiFastSystemCallRet
0015de64 77607e09 ntdll!ZwWaitForMultipleObjects+0xc
0015df00 77b6c4b7 kernel32!WaitForMultipleObjectsEx+0x11d
0015df54 77b68b83 USER32!RealMsgWaitForMultipleObjectsEx+0x13c
0015df70 6d46d90d USER32!MsgWaitForMultipleObjects+0x1f
0015dfc0 6d4acd77 wer!UtilMsgWaitForMultipleObjects+0x8a
0015dff4 6d4a7694 wer!CInitialConsentUI::Show+0x133
0015e040 6d4a9a69 wer!CEventUI::GetInitialDialogSelection+0xc6
0015e104 6d46df18 wer!CEventUI::Start+0x32
0015e39c 6d46b743 wer!CWatson::ReportProblem+0x438
0015e3ac 6d46b708 wer!WatsonReportSend+0x1e
0015e3c8 6d46b682 wer!CDWInstance::WatsonReportStub+0x17
0015e3ec 6d472a7f wer!CDWInstance::SubmitReport+0x21e
0015e410 730b6d0d wer!WerReportSubmit+0x5d
0015f33c 730b73c1 faultrep!CCrashWatson::GenerateCrashReport+0x5c4
0015f5d4 730b4de1 faultrep!CCrashWatson::ReportCrash+0x374
0015fad4 009bd895 faultrep!WerpInitiateCrashReporting+0x304
0015fb0c 009b60cd WerFault!UserCrashMain+0x14e
0015fb30 009b644a WerFault!wmain+0xbf
0015fb74 77603833 WerFault!_initterm_e+0x163
1 Id: 1bfc.894 Suspend: 1 Teb: 7ffde000 Unfrozen
ChildEBP RetAddr
024afbf8 77a10690 ntdll!KiFastSystemCallRet
024afbfc 77607e09 ntdll!ZwWaitForMultipleObjects+0xc
024afc98 77b6c4b7 kernel32!WaitForMultipleObjectsEx+0x11d
024afcec 74fa161a USER32!RealMsgWaitForMultipleObjectsEx+0x13c
024afd0c 74fa2cb6 DUser!CoreSC::Wait+0x59
024afd34 74fa2c55 DUser!CoreSC::WaitMessage+0x54
024afd44 77b615c0 DUser!MphWaitMessageEx+0x22
024afd60 77a10e6e USER32!__ClientWaitMessageExMPH+0x1e
024afd7c 77b6b5bc ntdll!KiUserCallbackDispatcher+0x2e
024afd80 77b61598 USER32!NtUserWaitMessage+0xc
024afdb4 77b61460 USER32!DialogBox2+0x202
024afddc 77b614a2 USER32!InternalDialogBox+0xd0
024afdfc 77b61505 USER32!DialogBoxIndirectParamAorW+0x37
024afe1c 75036c51 USER32!DialogBoxIndirectParamW+0x1b
024afe40 75036beb comctl32!SHFusionDialogBoxIndirectParam+0x2d
024afe74 6d4a65a4 comctl32!CTaskDialog::Show+0x100
024afebc 6d4acb72 wer!IsolationAwareTaskDialogIndirect+0x64
024aff4c 6d4acc39 wer!CInitialConsentUI::InitialDlgThreadRoutine+0x369
024aff54 77603833 wer!CInitialConsentUI::Static_InitialDlgThreadRoutine+0xd
024aff60 779ea9bd kernel32!BaseThreadInitThunk+0xe
2 Id: 1bfc.1a04 Suspend: 1 Teb: 7ffdc000 Unfrozen
ChildEBP RetAddr
012bf998 77a10690 ntdll!KiFastSystemCallRet
012bf99c 77607e09 ntdll!ZwWaitForMultipleObjects+0xc
012bfa38 77b6c4b7 kernel32!WaitForMultipleObjectsEx+0x11d
012bfa8c 74fa161a USER32!RealMsgWaitForMultipleObjectsEx+0x13c
012bfaac 74fa1642 DUser!CoreSC::Wait+0x59
012bfae0 74fac442 DUser!CoreSC::xwProcessNL+0xaa
012bfb00 74fac3a2 DUser!GetMessageExA+0x44
012bfb54 779262b6 DUser!ResourceManager::SharedThreadProc+0xb6
012bfb8c 779263de msvcrt!_endthreadex+0x44
012bfb94 77603833 msvcrt!_endthreadex+0xce
012bfba0 779ea9bd kernel32!BaseThreadInitThunk+0xe
012bfbe0 00000000 ntdll!_RtlUserThreadStart+0x23
# 3 Id: 1bfc.14a4 Suspend: 1 Teb: 7ffdb000 Unfrozen
ChildEBP RetAddr
02a1fc40 77a3f0a9 ntdll!DbgBreakPoint
02a1fc70 77603833 ntdll!DbgUiRemoteBreakin+0x3c
02a1fc7c 779ea9bd kernel32!BaseThreadInitThunk+0xe
02a1fcbc 00000000 ntdll!_RtlUserThreadStart+0x23

Next, we put a breakpoint on CreateProcess, push Debug button on the error reporting dialog and upon the breakpoint hit inspect CreateProcess parameters:

0:003> .asm no_code_bytes
Assembly options: no_code_bytes
0:003> bp kernel32!CreateProcessW
0:003> g
Breakpoint 0 hit
eax=00000000 ebx=00000000 ecx=7ffdf000 edx=0015db30 esi=00000001 edi=00000000
eip=775c1d27 esp=0015dfe0 ebp=0015e408 iopl=0 nv up ei pl nz na po nc
cs=001b ss=0023 ds=0023 es=0023 fs=003b gs=0000 efl=00000202
kernel32!CreateProcessW:
775c1d27 mov edi,edi
0:000> ddu esp+8 l1
0015dfe8 008b0000 ""C:\WINDOWS\system32\vsjitdebugger.exe" -p 8064 -e 312"

ESP points to return address, ESP+4 points to the first CreateProcess parameter and ESP+8 points to the second parameter. The thread stack now involves faultrep.dll:

0:000> k
ChildEBP RetAddr
0020dde0 730bb2b5 kernel32!CreateProcessW
0020e20c 730b6dae faultrep!WerpLaunchAeDebug+0x384
0020f140 730b73c1 faultrep!CCrashWatson::GenerateCrashReport+0x665
0020f3d8 730b4de1 faultrep!CCrashWatson::ReportCrash+0x374
0020f8d8 009bd895 faultrep!WerpInitiateCrashReporting+0x304
0020f910 009b60cd WerFault!UserCrashMain+0x14e
0020f934 009b644a WerFault!wmain+0xbf
0020f978 77603833 WerFault!_initterm_e+0x163
0020f984 779ea9bd kernel32!BaseThreadInitThunk+0xe
0020f9c4 00000000 ntdll!_RtlUserThreadStart+0x23
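
The parameter addressing used at the CreateProcessW breakpoint above can be written down as simple arithmetic. The sketch below assumes a 32-bit function whose arguments are all passed on the stack, with the breakpoint hit on the very first instruction so that ESP still points to the return address. The function name stack_arg_address is made up for this example.

```python
def stack_arg_address(esp_at_entry, index, slot_size=4):
    """Address of the index-th (1-based) stack argument at function entry.

    Slot 0 (esp_at_entry itself) holds the return address, so argument N
    lives at esp + N * slot_size on 32-bit x86.
    """
    if index < 1:
        raise ValueError("argument index is 1-based")
    return esp_at_entry + slot_size * index


if __name__ == "__main__":
    esp = 0x0015DFE0  # esp value from the register dump at the breakpoint
    # The second CreateProcessW parameter is lpCommandLine.
    print(f"{stack_arg_address(esp, 2):08x}")  # prints 0015dfe8
```

The result, 0015dfe8, is the address shown in the first column of the ddu esp+8 l1 output.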

Therefore it looks like the calls to the faultrep.dll module that report faults and launch the postmortem debugger were moved from UnhandledExceptionFilter to WerFault.exe in Vista.

Finally, let’s go back to our UnhandledExceptionFilter. If we disassemble it we would see that it can call kernel32!WerpLaunchAeDebug too:

0:000> .asm no_code_bytes
Assembly options: no_code_bytes
0:000> uf kernel32!UnhandledExceptionFilter



kernel32!UnhandledExceptionFilter+0x2d0:
77655c5f push dword ptr [ebp-28h]
77655c62 push dword ptr [ebp-1Ch]
77655c65 push dword ptr [ebx+4]
77655c68 push dword ptr [ebx]
77655c6a push 0FFFFFFFEh
77655c6c call kernel32!GetCurrentProcess (775e9145)
77655c71 push eax
77655c72 call kernel32!WerpLaunchAeDebug (7767baaf)
77655c77 test eax,eax
77655c79 jge  kernel32!UnhandledExceptionFilter+0x2f3 (77655c82)



kernel32!UnhandledExceptionFilter+0x303:
77655c92 mov  eax,dword ptr [ebx]
77655c94 push dword ptr [eax]
77655c96 push 0FFFFFFFFh
77655c98 call dword ptr [kernel32!_imp__NtTerminateProcess (775c14bc)]

If we look at the WerpLaunchAeDebug code we would see that it calls CreateProcess too and the code is the same as in faultrep.dll. This could mean that faultrep.dll imports that function from kernel32.dll. So some postmortem debugger launching code is still present in the default unhandled exception filter, perhaps for compatibility or in case WER doesn’t work or is disabled.

A high-level description of the differences between Windows XP and Vista application crash support can be found in Mark Russinovich’s recent article:

Inside the Windows Vista Kernel: Part 3 (Enhanced Crash Support)

- Dmitry Vostokov -

Process and Thread Startup in Vista

Saturday, May 19th, 2007

If you looked at process dumps from Vista or did live debugging you might have noticed that there are no longer kernel32 functions BaseProcessStart on the main thread stack and BaseThreadStart on subsequent thread stacks. In Vista we have ntdll!_RtlUserThreadStart which calls kernel32!BaseThreadInitThunk for both main and secondary threads:

0:002> ~*k
0 Id: 13e8.1348 Suspend: 1 Teb: 7ffdf000 Unfrozen
ChildEBP RetAddr
0009f8d8 77b7199a ntdll!KiFastSystemCallRet
0009f8dc 77b719cd USER32!NtUserGetMessage+0xc
0009f8f8 006b24e8 USER32!GetMessageW+0x33
0009f954 006c2588 calc!WinMain+0x278
0009f9e4 77603833 calc!_initterm_e+0x1a1
0009f9f0 779ea9bd kernel32!BaseThreadInitThunk+0xe
0009fa30 00000000 ntdll!_RtlUserThreadStart+0x23

1 Id: 13e8.534 Suspend: 1 Teb: 7ffde000 Unfrozen
ChildEBP RetAddr
0236f9d8 77a106a0 ntdll!KiFastSystemCallRet
0236f9dc 776077d4 ntdll!NtWaitForSingleObject+0xc
0236fa4c 77607742 kernel32!WaitForSingleObjectEx+0xbe
0236fa60 006b4958 kernel32!WaitForSingleObject+0x12
0236fa78 77603833 calc!WatchDogThread+0x21
0236fa84 779ea9bd kernel32!BaseThreadInitThunk+0xe
0236fac4 00000000 ntdll!_RtlUserThreadStart+0x23

# 2 Id: 13e8.1188 Suspend: 1 Teb: 7ffdd000 Unfrozen
ChildEBP RetAddr
0078fec8 77a3f0a9 ntdll!DbgBreakPoint
0078fef8 77603833 ntdll!DbgUiRemoteBreakin+0x3c
0078ff04 779ea9bd kernel32!BaseThreadInitThunk+0xe
0078ff44 00000000 ntdll!_RtlUserThreadStart+0x23

0:000> .asm no_code_bytes
Assembly options: no_code_bytes
0:000> uf ntdll!_RtlUserThreadStart
...
...
...
ntdll!_RtlUserThreadStart:
779ea996 push 14h
779ea998 push offset ntdll! ?? ::FNODOBFM::`string'+0xb6e (779ff108)
779ea99d call ntdll!_SEH_prolog4 (779f47d8)
779ea9a2 and  dword ptr [ebp-4],0
779ea9a6 mov  eax,dword ptr [ntdll!Kernel32ThreadInitThunkFunction (77a752a0)]
779ea9ab push dword ptr [ebp+0Ch]
779ea9ae test eax,eax
779ea9b0 je   ntdll!_RtlUserThreadStart+0x32 (779c6326)
...
...
...
0:000> dds ntdll!Kernel32ThreadInitThunkFunction l1
77a752a0 77603821 kernel32!BaseThreadInitThunk

- Dmitry Vostokov -

Process crash - getting the dump manually

Wednesday, May 16th, 2007

Sometimes customers have process crashes with exception dialogs but no dumps are saved for some reason, for example, because of a Dr. Watson limitation or because NTSD doesn’t save dumps on Windows 2000. One solution is to dump the process manually while it displays an error message. Customers and support engineers can use Microsoft userdump.exe for this purpose. Then, looking at the dump, we would see the exception because it is still being processed by an exception handler that either shows the error dialog or creates a Windows Error Reporting process. Non-interactive services usually call NtRaiseHardError to let csrss.exe display a message. The following stack trace is from an IE dump saved when the WER error dialog box was shown:

0:000> k
ChildEBP RetAddr
0012973c 7c59a072 NTDLL!ZwWaitForSingleObject+0xb
00129764 7c57b3e9 KERNEL32!WaitForSingleObjectEx+0x71
00129774 00401b2f KERNEL32!WaitForSingleObject+0xf
0012a238 7918cd0e IEXPLORE!DwExceptionFilter+0x284
0012a244 03a3f0c3 mscoree!__CxxUnhandledExceptionFilter+0x46
0012a250 7c59bf8d msvcr71!__CxxUnhandledExceptionFilter+0x46
0012a984 715206e0 KERNEL32!UnhandledExceptionFilter+0x140
0012ee74 71520957 BROWSEUI!BrowserProtectedThreadProc+0x64
0012fef0 71762a0a BROWSEUI!SHOpenFolderWindow+0x1ec
0012ff10 00401ecd SHDOCVW!IEWinMain+0x108
0012ff60 00401f7d IEXPLORE!WinMainT+0x2dc
0012ffc0 7c5989a5 IEXPLORE!ModuleEntry+0x97
0012fff0 00000000 KERNEL32!BaseProcessStart+0x3d

If we disassemble DwExceptionFilter we would see CreateProcess call:

0:000> ub IEXPLORE!DwExceptionFilter+0x284
IEXPLORE!DwExceptionFilter+0x263:
00401b0e call dword ptr [IEXPLORE!_imp__CreateProcessA (00401050)]
00401b14 test eax,eax
00401b16 je   IEXPLORE!DwExceptionFilter+0x2f6 (00401ba1)
00401b1c mov  dword ptr [ebp+7Ch],edi
00401b1f mov  edi,dword ptr [IEXPLORE!_imp__WaitForSingleObject (0040104c)]
00401b25 push 4E20h
00401b2a push dword ptr [ebp+68h]
00401b2d call edi

I already described WER processes in the previous post about post-mortem debugging so I won’t cover it here.

If we run !analyze -v command we are lucky because WinDbg will find the exception for us:

...
...
...
CONTEXT: 0012aa94 -- (.cxr 12aa94)
eax=00000000 ebx=00000000 ecx=00000000 edx=7283e058 esi=0271a60c edi=00000000
eip=35c5f973 esp=0012ad60 ebp=0012ad7c iopl=0 nv up ei pl zr na pe nc
cs=001b ss=0023 ds=0023 es=0023 fs=0038 gs=0000 efl=00010246
componentA!InternalFoo+0x21:
35c5f973 8b01 mov eax,dword ptr [ecx] ds:0023:00000000=????????
...
...
...
STACK_TEXT:
0012ad7c 35c6042f 0012ae10 00000000 35c53390 componentA!InternalFoo+0x21
0012c350 779d7d5d 00000000 001ad114 00000000 componentA!InternalBar+0x157
0012c36c 77a2310e 02b23d5c 00000020 00000004 oleaut32!DispCallFunc+0x15d
0012c3fc 35cc8b60 024d2d94 02b23d5c 00000001 oleaut32!CTypeInfo2::Invoke+0x244
...
...
...

If you see several threads with UnhandledExceptionFilter - Multiple Exceptions pattern - you can set the exception context individually based on the first parameter of UnhandledExceptionFilter which is a pointer to _EXCEPTION_POINTERS structure and then use .cxr command:

0:000> ~*kv
...
...
...
. 0 Id: 1568.68c Suspend: 1 Teb: 7ffde000 Unfrozen
ChildEBP RetAddr Args to Child
...
...
...
0012a984 715206e0 0012a9ac 7800bdb5 0012a9b4 KERNEL32!UnhandledExceptionFilter+0x140 (FPO: [Non-Fpo])



0:000> dt _EXCEPTION_POINTERS 0012a9ac
+0x000 ep_xrecord : 0x12aa78
+0x004 ep_context : 0x12aa94
0:000> .cxr 0012aa94
eax=00000000 ebx=00000000 ecx=00000000 edx=7283e058 esi=0271a60c edi=00000000
eip=35c5f973 esp=0012ad60 ebp=0012ad7c iopl=0 nv up ei pl zr na pe nc
cs=001b ss=0023 ds=0023 es=0023 fs=0038 gs=0000 efl=00010246
componentA!InternalFoo+0x21:
35c5f973 8b01 mov eax,dword ptr [ecx] ds:0023:00000000=????????
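
What dt does for us here can be sketched in a few lines of Python. On 32-bit Windows, _EXCEPTION_POINTERS is just two consecutive 4-byte pointers: the exception record pointer at offset 0 and the context pointer at offset 4 (ep_xrecord and ep_context in the symbol names above). The function name parse_exception_pointers is hypothetical, and the 8-byte buffer is built by hand from the values in the dt output instead of being read from the dump.

```python
import struct


def parse_exception_pointers(raw):
    """Split 8 bytes of a 32-bit _EXCEPTION_POINTERS into its two pointers.

    Returns (exception_record_address, context_record_address).
    """
    if len(raw) < 8:
        raise ValueError("need at least 8 bytes")
    record, context = struct.unpack_from("<II", raw)
    return record, context


# Illustrative buffer: the memory at 0012a9ac as implied by the dt output.
raw_bytes = struct.pack("<II", 0x0012AA78, 0x0012AA94)

if __name__ == "__main__":
    record, context = parse_exception_pointers(raw_bytes)
    print(f".exr {record:08x}")   # exception record address
    print(f".cxr {context:08x}")  # context record address to pass to .cxr
```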

Another stack fragment comes from some Windows service and it shows the thread calling NtRaiseHardError:

0:000> ~*k
...
...
...
13 Id: 3624.16cc Suspend: 1 Teb: 7ffad000 Unfrozen
ChildEBP RetAddr
0148ed40 7c821b74 ntdll!KiFastSystemCallRet
0148ed44 77e99af9 ntdll!NtRaiseHardError+0xc
0148f3dc 77e84259 kernel32!UnhandledExceptionFilter+0x54b

0148f40c 7c82eeb2 kernel32!_except_handler3+0x61
0148f430 7c82ee84 ntdll!ExecuteHandler2+0x26
0148f4d8 7c82ecc6 ntdll!ExecuteHandler+0x24
0148f4d8 7c81e215 ntdll!KiUserExceptionDispatcher+0xe
0148f7e0 76133437 ntdll!RtlLengthSecurityDescriptor+0x2a
0148f80c 7613f33d serviceA!GetObjectSize+0x1c3
0148f8d0 77c70f3b serviceA!RpcGetObjectSize+0x1b
0148f8f8 77ce23f7 rpcrt4!Invoke+0x30
0148fcf8 77ce26ed rpcrt4!NdrStubCall2+0x299
0148fd14 77c709be rpcrt4!NdrServerCall2+0x19
0148fd48 77c7093f rpcrt4!DispatchToStubInCNoAvrf+0x38
0148fd9c 77c70865 rpcrt4!RPC_INTERFACE::DispatchToStubWorker+0x117
0148fdc0 77c734b1 rpcrt4!RPC_INTERFACE::DispatchToStub+0xa3


- Dmitry Vostokov -

Interrupts and exceptions explained (Part 3)

Tuesday, May 15th, 2007

In Part 1 I discussed interrupt processing that happens when an x86 processor executes in privileged protected mode (ring 0). The processor pushes an interrupt frame, shown in the following pseudo-code:

push EFLAGS
push CS
push EIP
push ErrorCode
EIP := IDT[VectorNumber].ExtendedOffset<<16 +
   IDT[VectorNumber].Offset

Please note that this is an interrupt frame that is created by CPU and not a trap frame created by a software interrupt handler to save CPU state (_KTRAP_FRAME).

If an x86 processor executes in user mode (ring 3) and an interrupt happens, then a stack switch occurs first: the processor switches to the kernel stack, saves the user mode stack pointer SS:ESP there and then pushes the rest of the interrupt frame. On an x64 processor SS:RSP is always pushed regardless of the current execution mode, kernel or user. Therefore the following x86 pseudo-code shows how the interrupt frame is pushed on the current stack (to be precise, on the kernel space stack if the interrupt happened in user mode):

push SS
push ESP
push EFLAGS
push CS
push EIP
push ErrorCode
EIP := IDT[VectorNumber].ExtendedOffset<<16 +
   IDT[VectorNumber].Offset

Usually CS is 0x1b and SS is 0x23 for x86 Windows flat memory model so we can easily identify this pattern on raw stack data.

Why should we care about an interrupt frame? This is because in complete memory dumps we can see exceptions that happened in user space and were being processed at the time the dump was saved.

Let’s look at an example:

PROCESS 89a94800 SessionId: 1 Cid: 1050 Peb: 7ffd7000 ParentCid: 08a4
DirBase: 390f5000 ObjectTable: e36ee0b8 HandleCount: 168.
Image: processA.exe
VadRoot 8981d0a0 Vads 309 Clone 0 Private 222555. Modified 10838. Locked 0.
DeviceMap e37957e0
Token e395b8f8
ElapsedTime 07:44:38.505
UserTime 00:54:52.906
KernelTime 00:00:58.109
QuotaPoolUsage[PagedPool] 550152
QuotaPoolUsage[NonPagedPool] 14200
Working Set Sizes (now,min,max) (213200, 50, 345) (852800KB, 200KB, 1380KB)
PeakWorkingSetSize 227093
VirtualSize 1032 Mb
PeakVirtualSize 1032 Mb
PageFaultCount 232357
MemoryPriority BACKGROUND
BasePriority 8
CommitCharge 233170
DebugPort 899b6a40

We see that the process has a DebugPort and the presence of it usually shows that some exception happened. Therefore if you dump all processes by entering !process 0 1 command you can search for any unhandled exceptions.
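
If the !process 0 1 output is saved to a text file, the search can be automated. The Python sketch below is only an illustration under several assumptions: the field names PROCESS, Image: and DebugPort appear as in the excerpt above, a missing DebugPort line or a zero value means no debug port, and the function name processes_with_debug_port is made up. The second process in the sample text (processB.exe and its addresses) is invented for the example.

```python
import re

PROCESS_RE = re.compile(r"^PROCESS\s+([0-9a-fA-F]+)")
IMAGE_RE = re.compile(r"Image:\s+(\S+)")
DEBUG_PORT_RE = re.compile(r"DebugPort\s+([0-9a-fA-F]+)")


def processes_with_debug_port(log_text):
    """Return (eprocess, image, debug_port) for processes with a non-zero DebugPort."""
    found = []
    eprocess = image = None
    for raw_line in log_text.splitlines():
        line = raw_line.strip()
        match = PROCESS_RE.match(line)
        if match:
            # A new PROCESS header starts a new record.
            eprocess, image = match.group(1), None
            continue
        if eprocess is None:
            continue
        match = IMAGE_RE.search(line)
        if match:
            image = match.group(1)
        match = DEBUG_PORT_RE.search(line)
        if match and int(match.group(1), 16) != 0:
            found.append((eprocess, image, match.group(1)))
    return found


# First record is abbreviated from the post; the second one is hypothetical.
SAMPLE_LOG = """\
PROCESS 89a94800 SessionId: 1 Cid: 1050 Peb: 7ffd7000 ParentCid: 08a4
Image: processA.exe
CommitCharge 233170
DebugPort 899b6a40
PROCESS 89b00000 SessionId: 0 Cid: 0200 Peb: 7ffd5000 ParentCid: 01f0
Image: processB.exe
CommitCharge 1200
"""

if __name__ == "__main__":
    for eprocess, image, port in processes_with_debug_port(SAMPLE_LOG):
        print(f"{image} (PROCESS {eprocess}) has DebugPort {port}")
```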

Indeed if we switch to this process (you can also use !process 89a94800 ff command for dumps coming from XP and higher systems) we see KiDispatchException on one of the processA’s threads:

0: kd> .process 89a94800
0: kd> .reload
0: kd> !process 89a94800
...
...
...
THREAD 89a93020 Cid 1050.1054 Teb: 7ffdf000 Win32Thread: bc1da760 WAIT: (Unknown) KernelMode Non-Alertable
SuspendCount 1
f44dc3a8 SynchronizationEvent
Not impersonating
DeviceMap e37957e0
Owning Process 89a94800 Image: processA.exe
Wait Start TickCount 4244146 Ticks: 1232980 (0:05:21:05.312)
Context Switch Count 1139234 LargeStack
UserTime 00:54:51.0531
KernelTime 00:00:53.0937
Win32 Start Address processA!WinMainCRTStartup (0x00c534c8)
Start Address kernel32!BaseProcessStartThunk (0x77e617f8)
Stack Init f44dcbd0 Current f44dc2ec Base f44dd000 Limit f44d7000 Call f44dcbd8
Priority 12 BasePriority 8 PriorityDecrement 2
ChildEBP RetAddr
f44dc304 8083d5b1 nt!KiSwapContext+0x26
f44dc330 8083df9e nt!KiSwapThread+0x2e5
f44dc378 809c3cff nt!KeWaitForSingleObject+0x346
f44dc458 809c4f09 nt!DbgkpQueueMessage+0x178
f44dc47c 80977ad9 nt!DbgkpSendApiMessage+0x45
f44dc508 8081a94f nt!DbgkForwardException+0x90
f44dc8c4 808346b4 nt!KiDispatchException+0x1ea
f44dc92c 80834650 nt!CommonDispatchException+0x4a
f44dc9b8 80a801ae nt!Kei386EoiHelper+0x16e
0012f968 0046915d hal!HalpDispatchSoftwareInterrupt+0x5e
0012f998 0047cb72 processA!CalculateClientSizeFromPoint+0x5f
0012f9bc 0047cc1d processA!CalculateFromPoint+0x30
0012fa64 0047de83 processA!DrawUsingMemDC+0x1b9
0012fac0 0099fb43 processA!OnDraw+0x13
0012fb5c 7c17332d processA!OnPaint+0x56
0012fbe8 7c16e0b0 MFC71!CWnd::OnWndMsg+0x340
0012fc08 00c6253a MFC71!CWnd::WindowProc+0x22
0012fc24 0096cf9d processA!WindowProc+0x38
0012fcb8 7c16e1b8 MFC71!AfxCallWndProc+0x91
0012fcd8 7c16e1f6 MFC71!AfxWndProc+0x46
0012fd04 7739b6e3 MFC71!AfxWndProcBase+0x39
0012fd30 7739b874 USER32!InternalCallWinProc+0x28
0012fda8 7739c8b8 USER32!UserCallWinProcCheckWow+0x151
0012fe04 7739c9c6 USER32!DispatchClientMessage+0xd9
0012fe2c 7c828536 USER32!__fnDWORD+0x24
0012fe2c 80832dee ntdll!KiUserCallbackDispatcher+0x2e
f44dcbf0 8092d605 nt!KiCallUserMode+0x4
f44dcc48 bf8a26d3 nt!KeUserModeCallback+0x8f
f44dcccc bf89e985 win32k!SfnDWORD+0xb4
f44dcd0c bf89eb27 win32k!xxxDispatchMessage+0x223
f44dcd58 80833bdf win32k!NtUserDispatchMessage+0x4c
f44dcd58 7c8285ec nt!KiFastCallEntry+0xfc
0012fe2c 7c828536 ntdll!KiFastSystemCallRet
0012fe58 7739c57b ntdll!KiUserCallbackDispatcher+0x2e
0012fea8 773a16e5 USER32!NtUserDispatchMessage+0xc
0012feb8 7c169076 USER32!DispatchMessageA+0xf
0012fec8 7c16913e MFC71!AfxInternalPumpMessage+0x3e
0012fee4 0041cb0b MFC71!CWinThread::Run+0x54
0012ff08 7c172fc5 processA!CMain::Run+0x3b
0012ff18 00c5364d MFC71!AfxWinMain+0x68
0012ffc0 77e6f23b processA!WinMainCRTStartup+0x185
0012fff0 00000000 kernel32!BaseProcessStart+0x23

You might think that the exception happened in the CalculateClientSizeFromPoint function. However, there is no nt!KiTrapXXX call and hal!HalpDispatchSoftwareInterrupt has a user space return address, which looks suspicious. So we need to look at raw stack data and find our interrupt frame. We look for KiDispatchException, then for the KiTrap substring and finally for 0000001b. If 0000001b and 00000023 are separated by 2 double words then we have found our interrupt frame:

0: kd> .thread 89a93020
Implicit thread is now 89a93020
0: kd> dds esp esp+1000
...
...
...
f44dc2f8 f44dc330
f44dc2fc 89a93098
f44dc300 ffdff120
f44dc304 89a93020
f44dc308 8083d5b1 nt!KiSwapThread+0x2e5
f44dc30c 89a93020
f44dc310 89a930c8
f44dc314 00000000
...
...
...
f44dc4e8 f44dcc38
f44dc4ec 8083a8cc nt!_except_handler3
f44dc4f0 80870868 nt!`string'+0xa4
f44dc4f4 ffffffff
f44dc4f8 80998bfd nt!Ki386CheckDivideByZeroTrap+0x273
f44dc4fc 8083484f nt!KiTrap00+0x88
f44dc500 00000001
f44dc504 0000bb40
f44dc508 f44dc8c4
f44dc50c 8081a94f nt!KiDispatchException+0x1ea
f44dc510 f44dc8e0
f44dc514 00000001
f44dc518 00000000
f44dc51c 00469583 processA!LPtoDP+0x19
f44dc520 16b748f0
f44dc524 00469583 processA!LPtoDP+0x19
f44dc528 00000000
f44dc52c 00000000



f44dc8c0 ffffffff
f44dc8c4 f44dc934
f44dc8c8 808346b4 nt!CommonDispatchException+0x4a
f44dc8cc f44dc8e0
f44dc8d0 00000000
f44dc8d4 f44dc934
f44dc8d8 00000001
f44dc8dc 00000001
f44dc8e0 c0000094
f44dc8e4 00000000
f44dc8e8 00000000
f44dc8ec 00469583 processA!LPtoDP+0x19
f44dc8f0 00000000
f44dc8f4 808a3988 nt!KiAbiosPresent+0x4
f44dc8f8 ffffffff
f44dc8fc 0000a6f2
f44dc900 00469585 processA!LPtoDP+0x1b
f44dc904 00000004
f44dc908 00000000
f44dc90c f9000001
f44dc910 f44dc8dc
f44dc914 ffffffff
f44dc918 f44dcc38
f44dc91c 8083a8cc nt!_except_handler3
f44dc920 80870868 nt!`string'+0xa4
f44dc924 ffffffff
f44dc928 80998bfd nt!Ki386CheckDivideByZeroTrap+0x273
f44dc92c 8083484f nt!KiTrap00+0x88
f44dc930 80834650 nt!Kei386EoiHelper+0x16e
f44dc934 0012f968
f44dc938 00469583 processA!LPtoDP+0x19
f44dc93c badb0d00
f44dc940 00000000
f44dc944 ffffffff
f44dc948 00007fff
f44dc94c 00000000
f44dc950 fffff800
f44dc954 ffffffff
f44dc958 00007fff
f44dc95c 00000000
f44dc960 00000000
f44dc964 80a80000 hal!HalpInitIrqlAuditFlag+0x4e
f44dc968 00000023
f44dc96c 00000023
f44dc970 00000000
f44dc974 00000000
f44dc978 00005334
f44dc97c 00000001
f44dc980 f44dcc38
f44dc984 0000003b
f44dc988 16b748f0
f44dc98c 16b748f0
f44dc990 0012f9fc
f44dc994 0012f968
f44dc998 00000000 ; ErrorCode
f44dc99c 00469583 processA!LPtoDP+0x19 ; EIP
f44dc9a0 0000001b ; CS
f44dc9a4 00010246 ; EFLAGS
f44dc9a8 0012f934 ; ESP
f44dc9ac 00000023 ; SS
f44dc9b0 8982e7e0
f44dc9b4 00000000
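
The search rule used above (0000001b followed two double words later by 00000023) is easy to express in code. The following Python sketch is only an illustration of that heuristic. The function name find_interrupt_frames is made up, and the input is the tail of the raw stack fragment above typed in by hand.

```python
USER_CS = 0x1B  # user mode code selector on x86 Windows
USER_SS = 0x23  # user mode stack selector on x86 Windows


def find_interrupt_frames(base_address, dwords):
    """Return addresses of EIP slots where memory looks like EIP, CS, EFLAGS, ESP, SS.

    CS sits one double word above EIP and SS three double words above CS,
    so CS and SS are separated by exactly two values (EFLAGS and ESP).
    """
    hits = []
    for i in range(len(dwords) - 4):
        if dwords[i + 1] == USER_CS and dwords[i + 4] == USER_SS:
            hits.append(base_address + 4 * i)
    return hits


# Raw stack values from f44dc988 to f44dc9b4, copied from the dds output.
BASE = 0xF44DC988
RAW_STACK = [
    0x16B748F0, 0x16B748F0, 0x0012F9FC, 0x0012F968,
    0x00000000, 0x00469583, 0x0000001B, 0x00010246,
    0x0012F934, 0x00000023, 0x8982E7E0, 0x00000000,
]

if __name__ == "__main__":
    for eip_slot in find_interrupt_frames(BASE, RAW_STACK):
        index = (eip_slot - BASE) // 4
        print(f"frame at {eip_slot:08x}: EIP={RAW_STACK[index]:08x} "
              f"ESP={RAW_STACK[index + 3]:08x}")
```

Like the manual search, this is a heuristic: any unrelated data that happens to contain the two selector values at the right distance would be reported too.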

Why did we skip the first KiTrap00? Because KiDispatchException is called after KiTrap00 so we should see it before KiTrap00 on raw stack. To see all these calls we can disassemble return addresses:

0: kd> .asm no_code_bytes
Assembly options: no_code_bytes
0: kd> ub nt!KiTrap00+0x88
nt!KiTrap00+0x74:
8083483b test byte ptr [ebp+6Ch],1
8083483f je   nt!KiTrap00+0x81 (80834848)
80834841 cmp  word ptr [ebp+6Ch],1Bh
80834846 jne  nt!KiTrap00+0x9e (80834865)
80834848 sti
80834849 push ebp
8083484a call nt!Ki386CheckDivideByZeroTrap (8099897d)
8083484f mov  ebx,dword ptr [ebp+68h]

nt!KiTrap00+0x88 is not equal to nt!KiTrap00+0x74 so we have an OMAP code optimization case here and we have to disassemble the raw addresses as seen in the raw stack fragment repeated here:

...
...
...
f44dc8c8 808346b4 nt!CommonDispatchException+0x4a
...
...
...
f44dc924 ffffffff
f44dc928 80998bfd nt!Ki386CheckDivideByZeroTrap+0x273
f44dc92c 8083484f nt!KiTrap00+0x88
f44dc930 80834650 nt!Kei386EoiHelper+0x16e
f44dc934 0012f968


0: kd> u 8083484f
nt!KiTrap00+0x88:
8083484f mov  ebx,dword ptr [ebp+68h]
80834852 jmp  nt!Kei386EoiHelper+0x167 (80834649)
80834857 sti
80834858 mov  ebx,dword ptr [ebp+68h]
8083485b mov  eax,0C0000094h
80834860 jmp  nt!Kei386EoiHelper+0x167 (80834649)
80834865 mov  ebx,dword ptr fs:[124h]
8083486c mov  ebx,dword ptr [ebx+38h]
0: kd> u 80834649
nt!Kei386EoiHelper+0x167:
80834649 xor  ecx,ecx
8083464b call nt!CommonDispatchException (8083466a)
80834650 xor  edx,edx ; nt!Kei386EoiHelper+0x16e
80834652 mov  ecx,1
80834657 call nt!CommonDispatchException (8083466a)
8083465c xor  edx,edx
8083465e mov  ecx,2
80834663 call nt!CommonDispatchException (8083466a)
0: kd> ub 808346b4
nt!CommonDispatchException+0x38:
808346a2 mov  eax,dword ptr [ebp+6Ch]
808346a5 and  eax,1
808346a8 push 1
808346aa push eax
808346ab push ebp
808346ac push 0
808346ae push ecx
808346af call nt!KiDispatchException (80852a53)
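
The value 0C0000094h that the KiTrap00 code loads into EAX, and that we also saw at f44dc8e0 on the raw stack, is the NTSTATUS code STATUS_INTEGER_DIVIDE_BY_ZERO, which fits the divide error trap (vector 0). The Python sketch below splits an NTSTATUS value into its documented bit fields. The function name decode_ntstatus and the small lookup table are ours and cover only a couple of well-known codes.

```python
SEVERITY_NAMES = {0: "success", 1: "informational", 2: "warning", 3: "error"}

# Deliberately tiny table: just two well-known status codes.
KNOWN_STATUS = {
    0xC0000005: "STATUS_ACCESS_VIOLATION",
    0xC0000094: "STATUS_INTEGER_DIVIDE_BY_ZERO",
}


def decode_ntstatus(status):
    """Split a 32-bit NTSTATUS into severity, customer bit, facility and code."""
    status &= 0xFFFFFFFF
    return {
        "severity": SEVERITY_NAMES[(status >> 30) & 0x3],
        "customer": bool((status >> 29) & 0x1),
        "facility": (status >> 16) & 0xFFF,
        "code": status & 0xFFFF,
        "name": KNOWN_STATUS.get(status, "unknown"),
    }


if __name__ == "__main__":
    print(decode_ntstatus(0xC0000094))
```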

So we see that KiTrap00 calls CommonDispatchException which calls KiDispatchException. If we look at our found interrupt frame we see that EIP of the exception was 00469583 and ESP was 0012f934:

...
...
...
f44dc998 00000000 ; ErrorCode
f44dc99c 00469583 processA!LPtoDP+0x19 ; EIP
f44dc9a0 0000001b ; CS
f44dc9a4 00010246 ; EFLAGS
f44dc9a8 0012f934 ; ESP
f44dc9ac 00000023 ; SS


Now we try to reconstruct stack trace by putting the values of ESP and EIP:

0: kd> k L=0012f934 0012f934 00469583 ; EBP ESP EIP format
ChildEBP RetAddr
0012f930 00469a16 processA!LPtoDP+0x19
0012f934 00000000 processA!GetColumnWidth+0x45

Stack trace doesn’t look good, there is neither BaseProcessStart nor BaseThreadStart, perhaps because we specified ESP value twice instead of EBP and ESP. Let’s hope to find EBP value by dumping the memory around ESP:

0: kd> dds 0012f934-10 0012f934+100
0012f924 00000000
0012f928 0012f934 ; the same as ESP
0012f92c 0012f968 ; looks good to us
0012f930 00469572 processA!LPtoDP+0x8
0012f934 00469a16 processA!GetColumnWidth+0x45
0012f938 00005334



0012f964 00005334
0012f968 0012f998
0012f96c 0046915d processA!CalculateClientSizeFromPoint+0x5f
0012f970 00000000
0012f974 0012f9fc
0012f978 16b748f0
0012f97c 0012fa48
0012f980 00000000
0012f984 00000000
0012f988 000003a0
0012f98c 00000237
0012f990 00000014
0012f994 00000000
0012f998 0012f9bc
0012f99c 0047cb72 processA!CalculateFromPoint+0x30
0012f9a0 0012f9fc
0012f9a4 0012f9b4
0012f9a8 0012fa48


So finally we get our stack trace:

0: kd> k L=0012f968 0012f934 00469583 100
ChildEBP RetAddr
0012f930 00469a16 processA!LPtoDP+0x19
0012f968 0046915d processA!GetColumnWidth+0x45
0012f998 0047cb72 processA!CalculateClientSizeFromPoint+0x5f
0012f9bc 0047cc1d processA!CalculateFromPoint+0x30
0012fa64 0047de83 processA!DrawUsingMemDC+0x1b9
0012fac0 0099fb43 processA!OnDraw+0x13
0012fb5c 7c17332d processA!OnPaint+0x56
0012fbe8 7c16e0b0 MFC71!CWnd::OnWndMsg+0x340
0012fc08 00c6253a MFC71!CWnd::WindowProc+0x22
0012fc24 0096cf9d processA!WindowProc+0x38
0012fcb8 7c16e1b8 MFC71!AfxCallWndProc+0x91
0012fcd8 7c16e1f6 MFC71!AfxWndProc+0x46
0012fd04 7739b6e3 MFC71!AfxWndProcBase+0x39
0012fd30 7739b874 USER32!InternalCallWinProc+0x28
0012fda8 7739c8b8 USER32!UserCallWinProcCheckWow+0x151
0012fe04 7739c9c6 USER32!DispatchClientMessage+0xd9
0012fe2c 7c828536 USER32!__fnDWORD+0x24
0012fe2c 80832dee ntdll!KiUserCallbackDispatcher+0x2e
f44dcbf0 8092d605 nt!KiCallUserMode+0x4
f44dcc48 bf8a26d3 nt!KeUserModeCallback+0x8f
f44dcccc bf89e985 win32k!SfnDWORD+0xb4
f44dcd0c bf89eb27 win32k!xxxDispatchMessage+0x223
f44dcd58 80833bdf win32k!NtUserDispatchMessage+0x4c
f44dcd58 7c8285ec nt!KiFastCallEntry+0xfc
0012fe2c 7c828536 ntdll!KiFastSystemCallRet
0012fe58 7739c57b ntdll!KiUserCallbackDispatcher+0x2e
0012fea8 773a16e5 USER32!NtUserDispatchMessage+0xc
0012feb8 7c169076 USER32!DispatchMessageA+0xf
0012fec8 7c16913e MFC71!AfxInternalPumpMessage+0x3e
0012fee4 0041cb0b MFC71!CWinThread::Run+0x54
0012ff08 7c172fc5 processA!CMain::Run+0x3b
0012ff18 00c5364d MFC71!AfxWinMain+0x68
0012ffc0 77e6f23b processA!WinMainCRTStartup+0x185
0012fff0 00000000 kernel32!BaseProcessStart+0x23
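
The reason the correct EBP value matters is that, for frames with a standard prologue, the saved EBP of the caller is stored at [EBP] and the return address at [EBP+4], so every frame leads to the next one. The Python sketch below walks such a chain over a handful of values copied from the dds output above. It is a simplification: the real k command also uses symbol and FPO information, and the function name walk_ebp_chain is hypothetical.

```python
def walk_ebp_chain(memory, ebp, max_frames=100):
    """Follow saved-EBP links; return a list of (frame_ebp, return_address).

    memory maps an address to the 32-bit value stored there. The walk stops
    when a needed address is missing or the chain stops moving up the stack.
    """
    frames = []
    while len(frames) < max_frames and ebp in memory and ebp + 4 in memory:
        saved_ebp = memory[ebp]
        frames.append((ebp, memory[ebp + 4]))
        if saved_ebp <= ebp:
            break
        ebp = saved_ebp
    return frames


# Values taken from the raw user stack dump around ESP (dds 0012f934-10 ...).
USER_STACK = {
    0x0012F968: 0x0012F998,  # saved EBP
    0x0012F96C: 0x0046915D,  # return address into CalculateClientSizeFromPoint
    0x0012F998: 0x0012F9BC,  # saved EBP
    0x0012F99C: 0x0047CB72,  # return address into CalculateFromPoint
}

if __name__ == "__main__":
    for frame_ebp, ret in walk_ebp_chain(USER_STACK, 0x0012F968):
        print(f"{frame_ebp:08x} {ret:08x}")
```

The two pairs it produces correspond to the second and third rows (ChildEBP and RetAddr columns) of the reconstructed stack trace. The walk stops there only because the sample dictionary does not contain the memory at 0012f9bc.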

- Dmitry Vostokov -

Crash Dump Analysis Patterns (Part 14)

Friday, May 11th, 2007

The next pattern is Spiking Thread. If you have a process dump with many threads from a customer, it is sometimes difficult to see which thread was spiking CPU. That is why it is always good to have some screenshots or notes from QSlice or Process Explorer showing the spiking thread ID and process ID. The process ID confirms that the dump was taken from the correct process. Newer process dumpers and tools from Microsoft (userdump.exe, for example) save thread time information, so you can open the dump and see the time spent in kernel and user mode for any thread by entering the !runaway command. However, if that command shows many threads with similar CPU consumption, it will not highlight the particular thread that was spiking at the time the crash dump was saved, so screenshots are still useful in some cases.

What to do if you don’t have the spiking thread ID? Look at all threads and find those that are not waiting. Almost all threads are waiting most of the time, so the chances of dumping a normal process and seeing active threads are very low. If a thread is waiting, the top function on its stack is usually (for XP/W2K3/Vista):

ntdll!KiFastSystemCallRet

and below it you can see blocking calls waiting for a synchronization object, a Sleep API call, I/O completion or an LPC reply:

0:085> ~*kv
...
...
...
64 Id: 1b0.120c Suspend: -1 Teb: 7ff69000 Unfrozen
ChildEBP RetAddr Args to Child
02defe18 7c90e399 ntdll!KiFastSystemCallRet
02defe1c 77e76703 ntdll!NtReplyWaitReceivePortEx+0xc
02deff80 77e76c22 rpcrt4!LRPC_ADDRESS::ReceiveLotsaCalls+0xf4
02deff88 77e76a3b rpcrt4!RecvLotsaCallsWrapper+0xd
02deffa8 77e76c0a rpcrt4!BaseCachedThreadRoutine+0x79
02deffb4 7c80b683 rpcrt4!ThreadStartRoutine+0x1a
02deffec 00000000 kernel32!BaseThreadStart+0x37

65 Id: 1b0.740 Suspend: -1 Teb: 7ff67000 Unfrozen
ChildEBP RetAddr Args to Child
02edff44 7c90d85c ntdll!KiFastSystemCallRet
02edff48 7c8023ed ntdll!NtDelayExecution+0xc
02edffa0 57cde2dd kernel32!SleepEx+0x61

02edffb4 7c80b683 component!foo+0x35
02edffec 00000000 kernel32!BaseThreadStart+0x37

66 Id: 1b0.131c Suspend: -1 Teb: 7ff66000 Unfrozen
ChildEBP RetAddr Args to Child
02f4ff38 7c90e9c0 ntdll!KiFastSystemCallRet
02f4ff3c 7c8025cb ntdll!ZwWaitForSingleObject+0xc
02f4ffa0 72001f65 kernel32!WaitForSingleObjectEx+0xa8

02f4ffb4 7c80b683 component!WorkerThread+0x15
02f4ffec 00000000 kernel32!BaseThreadStart+0x37

67 Id: 1b0.1320 Suspend: -1 Teb: 7ff65000 Unfrozen
ChildEBP RetAddr Args to Child
02f8fe1c 7c90e9ab ntdll!KiFastSystemCallRet
02f8fe20 7c8094e2 ntdll!ZwWaitForMultipleObjects+0xc
02f8febc 7e4195f9 kernel32!WaitForMultipleObjectsEx+0x12c
02f8ff18 7e4196a8 user32!RealMsgWaitForMultipleObjectsEx+0x13e
02f8ff34 720019f6 user32!MsgWaitForMultipleObjects+0x1f

02f8ffa0 72001a29 component!bar+0xd9
02f8ffb4 7c80b683 component!MonitorWorkerThread+0x11
02f8ffec 00000000 kernel32!BaseThreadStart+0x37

68 Id: 1b0.1340 Suspend: -1 Teb: 7ff63000 Unfrozen
ChildEBP RetAddr Args to Child
0301ff1c 7c90e31b ntdll!KiFastSystemCallRet
0301ff20 7c80a746 ntdll!ZwRemoveIoCompletion+0xc
0301ff4c 57d46e65 kernel32!GetQueuedCompletionStatus+0x29

0301ffb4 7c80b683 component!AsyncEventsThread+0x91
0301ffec 00000000 kernel32!BaseThreadStart+0x37



# 85 Id: 1b0.17b4 Suspend: -1 Teb: 7ffd4000 Unfrozen
ChildEBP RetAddr Args to Child
00daffc8 7c9507a8 ntdll!DbgBreakPoint
00dafff4 00000000 ntdll!DbgUiRemoteBreakin+0x2d

Therefore, if you have a different thread like the one below, the chances that it was spiking are high:

58 Id: 1b0.9f4 Suspend: -1 Teb: 7ff75000 Unfrozen
ChildEBP RetAddr Args to Child
0280f64c 500af723 componentB!DoSomething+32
0280f85c 500b5391 componentB!CheckSomething+231
0280f884 500b7a3f componentB!ProcessWorkIteme+9f
0301ffec 00000000 kernel32!BaseThreadStart+0x37

There is no KiFastSystemCallRet on top, and if we look at the currently executing instruction we see that it is indeed doing a copy operation:

0:085> ~58r
eax=00000000 ebx=0280fdd4 ecx=0000005f edx=00000000 esi=03d30444 edi=0280f6dc
eip=500a4024 esp=0280f644 ebp=0280f64c iopl=0 nv up ei pl nz na po nc
cs=001b ss=0023 ds=0023 es=0023 fs=003b gs=0000 efl=00010202
componentB!DoSomething+32:
500a4024 f3a5 rep movs dword ptr es:[edi],dword ptr [esi] es:0023:0280f6dc=00000409 ds:0023:03d30444=00000409
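The manual scan described above (find the threads whose top frame is not a wait) can be roughly automated once the `~*kv` output is saved to a text file. The following is only a sketch under stated assumptions: the function names are hypothetical, the parser expects thread headers and frames laid out like the listings in this post, and real `kv` output may carry extra columns or annotations that need more careful handling.

```python
import re

# Thread header as it appears in the listings above, e.g.
# "  64  Id: 1b0.120c Suspend: -1 Teb: 7ff69000 Unfrozen"
THREAD_HEADER = re.compile(r"^\s*[.#]?\s*(\d+)\s+Id:\s*([0-9a-fA-F]+)\.([0-9a-fA-F]+)")

# Top frames that mean "waiting" or "debugger break-in thread" (assumed list).
WAITING_TOP_FRAMES = ("ntdll!KiFastSystemCallRet",)
DEBUGGER_FRAMES = ("ntdll!DbgBreakPoint",)


def top_frames(dump_text):
    """Yield (thread_number, thread_id_hex, top_symbol) for each thread."""
    current = None
    for line in dump_text.splitlines():
        header = THREAD_HEADER.match(line)
        if header:
            current = (int(header.group(1)), header.group(3))
            continue
        if current is None:
            continue
        # The first token containing '!' on a frame line is taken as module!function.
        symbol = next((tok for tok in line.split() if "!" in tok), None)
        if symbol:
            yield current[0], current[1], symbol.split("+")[0]
            current = None  # only the top frame of each thread matters here


def candidate_spikers(dump_text):
    """Return (thread_number, thread_id_hex) for threads that are not waiting."""
    skip = WAITING_TOP_FRAMES + DEBUGGER_FRAMES
    return [(n, tid) for n, tid, sym in top_frames(dump_text) if sym not in skip]
```

A thread reported by `candidate_spikers` is only a candidate: it still needs the register and disassembly check shown above before concluding that it was spiking.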

In a kernel or a complete memory dump you can see spikes by checking KernelTime and UserTime:

0: kd> !thread 88b66768
THREAD 88b66768 Cid 01fc.1550 Teb: 7ffad000 Win32Thread: bc18f240 RUNNING on processor 1
IRP List:
89716008: (0006,0094) Flags: 00000a00 Mdl: 00000000
Impersonation token: e423a030 (Level Impersonation)
DeviceMap e3712480
Owning Process 8a0a56a0 Image: SomeSvc.exe
Wait Start TickCount 1782229 Ticks: 0
Context Switch Count 877610 LargeStack
UserTime 00:00:01.0078
KernelTime 02:23:21.0718
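To compare many threads it can help to turn the UserTime and KernelTime fields into numbers. The sketch below is an illustration only: the function names and the 60-second threshold are my assumptions, and the fractional part of each field is deliberately ignored because only whole seconds matter for spotting a spike of this size.

```python
import re

# Matches "UserTime 00:00:01.0078" / "KernelTime 02:23:21.0718" as printed above.
TIME_FIELD = re.compile(r"^\s*(UserTime|KernelTime)\s+(\d+):(\d+):(\d+)", re.MULTILINE)


def thread_times(thread_output):
    """Return {'UserTime': seconds, 'KernelTime': seconds} in whole seconds."""
    times = {}
    for name, hours, minutes, seconds in TIME_FIELD.findall(thread_output):
        times[name] = int(hours) * 3600 + int(minutes) * 60 + int(seconds)
    return times


def looks_like_spike(times, threshold_seconds=60):
    """Flag a thread whose user or kernel time exceeds an arbitrary cutoff."""
    # threshold_seconds is an illustrative value, not a rule from the debugger.
    return max(times.values(), default=0) >= threshold_seconds
```

For the thread above, almost all of the accumulated time is kernel time, which points the investigation at drivers and system calls rather than at user-mode code.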

- Dmitry Vostokov @ DumpAnalysis.org -

Crash Dump Analysis Patterns (Part 13a)

Wednesday, May 9th, 2007

Insufficient Memory pattern can be seen in many complete and kernel memory dumps. This condition could cause the system to crash, become slow, hang or refuse to provide the expected functionality, for example, refuse new terminal server connections. There are many types of memory resources and we can classify them initially into the following categories:

  • Committed memory
  • Virtual memory
    • Kernel space
      • Paged pool
      • Non-paged pool
      • Session pool
      • PTE limits
      • Desktop heap
      • GDI limits
    • User space
      • Virtual regions
      • Process heap

We will talk about all of them in separate parts. What I outline in this part is committed memory exhaustion. Committed memory is allocated memory backed by physical memory or by reserved space in the page file. The space has to be reserved in case the OS wants to swap that memory’s data out to disk when it is not currently used and there is no physical memory available for other processes. If that data is needed again, the OS brings it back into physical memory. If there is no space left in the page file, physical memory fills up. If committed memory is exhausted, the system will most likely hang or bugcheck soon, so always check memory statistics when you get a kernel or a complete memory dump. Even access violation bugchecks can result from insufficient memory, when a memory allocation failed but a kernel mode component did not check the return value for NULL. Here is an example:

BugCheck 8E, {c0000005, 809203af, aa647c0c, 0}

0: kd> !analyze -v
...
...
...
TRAP_FRAME: aa647c0c -- (.trap ffffffffaa647c0c)
...
...
...

0: kd> .trap ffffffffaa647c0c
ErrCode = 00000000
eax=00000000 ebx=bc1f3cfc ecx=89589250 edx=000018c1 esi=bc1f3ce0 edi=aa647d14
eip=809203af esp=aa647c80 ebp=aa647c80 iopl=0 nv up ei pl zr na pe nc
cs=0008 ss=0010 ds=0023 es=0023 fs=0030 gs=0000 efl=00010246
nt!SeTokenType+0x8:
809203af 8b8080000000 mov eax,dword ptr [eax+80h] ds:0023:00000080=????????

0: kd> k
ChildEBP RetAddr
aa647c80 bf8173c5 nt!SeTokenType+0x8
aa647cdc bf81713b win32k!GreGetSpoolMessage+0xb0
aa647d4c 80834d3f win32k!NtGdiGetSpoolMessage+0x96
aa647d4c 7c82ed54 nt!KiFastCallEntry+0xfc

If we enter the !vm command to display memory statistics, we see that committed memory is almost completely used up:

0: kd> !vm
*** Virtual Memory Usage ***
 Physical Memory:      999294 (   3997176 Kb)
 Page File: \??\C:\pagefile.sys
   Current:   4193280 Kb  Free Space:    533744 Kb
   Minimum:   4193280 Kb  Maximum:      4193280 Kb
 Available Pages:       18698 (     74792 Kb)
 ResAvail Pages:       865019 (   3460076 Kb)
 Locked IO Pages:         290 (      1160 Kb)
 Free System PTEs:     155265 (    621060 Kb)
 Free NP PTEs:          32766 (    131064 Kb)
 Free Special NP:           0 (         0 Kb)
 Modified Pages:          113 (       452 Kb)
 Modified PF Pages:        61 (       244 Kb)
 NonPagedPool Usage:    12380 (     49520 Kb)
 NonPagedPool Max:      64799 (    259196 Kb)
 PagedPool 0 Usage:     40291 (    161164 Kb)
 PagedPool 1 Usage:      2463 (      9852 Kb)
 PagedPool 2 Usage:      2455 (      9820 Kb)
 PagedPool 3 Usage:      2453 (      9812 Kb)
 PagedPool 4 Usage:      2488 (      9952 Kb)
 PagedPool Usage:       50150 (    200600 Kb)
 PagedPool Maximum:     67584 (    270336 Kb)

 ********** 18 pool allocations have failed **********

 Shared Commit:         87304 (    349216 Kb)
 Special Pool:              0 (         0 Kb)
 Shared Process:        56241 (    224964 Kb)
 PagedPool Commit:      50198 (    200792 Kb)
 Driver Commit:          1892 (      7568 Kb)
 Committed pages:     2006945 (   8027780 Kb)
 Commit limit:        2008205 (   8032820 Kb)

 ********** 1216024 commit requests have failed  **********

There might have been a memory leak or too many terminal sessions with fat applications to fit in physical memory and the page file. In this particular case it was both.
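The commit check can be reduced to simple arithmetic on two lines of the !vm output. The sketch below assumes the output has been saved as text and that a page is 4 Kb, which matches the page and Kb figures printed above; the function name is hypothetical.

```python
import re

# Picks up "Committed pages: 2006945 (...)" and "Commit limit: 2008205 (...)".
PAGE_COUNT = re.compile(r"^\s*(Committed pages|Commit limit):\s+(\d+)", re.MULTILINE)


def commit_headroom(vm_output, page_size_kb=4):
    """Return the remaining commit in pages and Kb plus the used fraction."""
    fields = {name: int(pages) for name, pages in PAGE_COUNT.findall(vm_output)}
    committed = fields["Committed pages"]
    limit = fields["Commit limit"]
    free_pages = limit - committed
    return {
        "free_pages": free_pages,
        "free_kb": free_pages * page_size_kb,
        "used_fraction": committed / limit,
    }
```

With the numbers above, only 1260 pages (5040 Kb) of commit remain out of roughly 8 GB, which is consistent with the failed commit requests reported by !vm.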

- Dmitry Vostokov @ DumpAnalysis.org -

Using SSSL principle to design support tools

Sunday, May 6th, 2007

The Start, Stop and Send Log (SSSL) principle has been applied to the design of many tools created at Citrix for troubleshooting customer issues. My favorite examples are WindowHistory, MessageHistory, and ScreenHistory.

In a typical scenario, a customer has a GUI issue, downloads a tool, runs it, clicks on a “Start” button, tries to reproduce the issue, clicks on a “Stop” button and then sends a log either saved in a file or copied from the tool’s dialog. Simple action, no need for customers to become familiar with colored and complex GUI tools especially when an issue is hot and there is no time to play. Upon the arrival of the log file a skilled engineer can analyze it and provide further recommendations. Of course, SSSL tools must record all information that would be necessary to look at during analysis.
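The Start/Stop/Send workflow described above maps onto a very small core. The class below is a hypothetical illustration of that core, not code from WindowHistory or any other Citrix tool; the class and method names are assumptions.

```python
import time


class SsslRecorder:
    """Hypothetical Start, Stop and Send Log recorder."""

    def __init__(self):
        self.recording = False
        self.entries = []

    def start(self):
        # "Start" button: begin a fresh recording window.
        self.entries.clear()
        self.recording = True

    def record(self, event):
        # Events arriving outside a Start/Stop window are ignored.
        if self.recording:
            self.entries.append((time.time(), event))

    def stop(self):
        # "Stop" button: freeze the log so it can be sent.
        self.recording = False

    def log_text(self):
        # "Send Log": plain text a customer can save to a file or copy.
        return "\n".join(f"{stamp:.3f} {event}" for stamp, event in self.entries)
```

Everything tool-specific lives in whatever calls `record`, which is why the GUI and the Start/Stop/Send plumbing can be shared between tools.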

SSSL principle promotes agile iterative and incremental development of tools. Because it is difficult to know in advance about all information necessary to record, the first version usually implements logging of essential data only and what is missing in other available tools. The GUI is very simple and this shortens the development time. Later, after support incidents and analysis of their logs, it becomes apparent that more information needs to be recorded and then the second version is born, etc.

SSSL principle also promotes design and implementation reuse. Because GUI and certain operations are the same you can reuse common source code.

Finally after some time when more and more SSSL tools appear you can refactor them into a unified SSSL framework where individual tools become pluggable  components. This is what is coming this summer: GUI History Monitor that will combine all separate SSSL “History” tools together.

- Dmitry Vostokov -

Interrupts and exceptions explained (Part 2)

Saturday, May 5th, 2007

As I promised in Part 1, I describe changes in 64-bit Windows. The size of IDTR is 10 bytes, of which 8 bytes hold the 64-bit address of the IDT. The size of an IDT entry is 16 bytes, and it holds the address of an interrupt procedure corresponding to an interrupt vector. However, interrupt procedure names are different in x64 Windows: they do not follow the KiTrapXX pattern.

The following UML class diagram describes the relationship and also shows what registers are saved. In native x64 mode SS and RSP are saved regardless of kernel or user mode.

Let’s dump all architecture-defined interrupt procedure names. This is a good exercise because we will use scripting. !pcr extension reports wrong IDT base so we use dt command:

kd> !pcr 0
KPCR for Processor 0 at fffff80001176000:
    Major 1 Minor 1
 NtTib.ExceptionList: fffff80000124000
     NtTib.StackBase: fffff80000125070
    NtTib.StackLimit: 0000000000000000
  NtTib.SubSystemTib: fffff80001176000
       NtTib.Version: 0000000001176180
   NtTib.UserPointer: fffff800011767f0
       NtTib.SelfTib: 000000007ef95000
             SelfPcr: 0000000000000000
                Prcb: fffff80001176180
                Irql: 0000000000000000
                 IRR: 0000000000000000
                 IDR: 0000000000000000
       InterruptMode: 0000000000000000
                 IDT: 0000000000000000
                 GDT: 0000000000000000
                 TSS: 0000000000000000
       CurrentThread: fffffadfe669f890
          NextThread: 0000000000000000
          IdleThread: fffff8000117a300
           DpcQueue:
kd> dt _KPCR fffff80001176000
nt!_KPCR
   +0x000 NtTib            : _NT_TIB
   +0x000 GdtBase          : 0xfffff800`00124000 _KGDTENTRY64
   +0x008 TssBase          : 0xfffff800`00125070 _KTSS64
   +0x010 PerfGlobalGroupMask : (null)
   +0x018 Self             : 0xfffff800`01176000 _KPCR
   +0x020 CurrentPrcb      : 0xfffff800`01176180 _KPRCB
   +0x028 LockArray        : 0xfffff800`011767f0 _KSPIN_LOCK_QUEUE
   +0x030 Used_Self        : 0x00000000`7ef95000
   +0x038 IdtBase          : 0xfffff800`00124070 _KIDTENTRY64
   +0x040 Unused           : [2] 0
   +0x050 Irql             : 0 ''
   +0x051 SecondLevelCacheAssociativity : 0x10 ''
   +0x052 ObsoleteNumber   : 0 ''
   +0x053 Fill0            : 0 ''
   +0x054 Unused0          : [3] 0
   +0x060 MajorVersion     : 1
   +0x062 MinorVersion     : 1
   +0x064 StallScaleFactor : 0x892
   +0x068 Unused1          : [3] (null)
   +0x080 KernelReserved   : [15] 0
   +0x0bc SecondLevelCacheSize : 0x100000
   +0x0c0 HalReserved      : [16] 0x82c5c880
   +0x100 Unused2          : 0
   +0x108 KdVersionBlock   : 0xfffff800`01174ca0
   +0x110 Unused3          : (null)
   +0x118 PcrAlign1        : [24] 0
   +0x180 Prcb             : _KPRCB

Next we dump the first entry of IDT array and glue together OffsetHigh, OffsetMiddle and OffsetLow fields to form the interrupt procedure address corresponding to the interrupt vector 0, divide by zero exception:

kd> dt _KIDTENTRY64 0xfffff800`00124070
nt!_KIDTENTRY64
   +0x000 OffsetLow        : 0xf240
   +0x002 Selector         : 0x10
   +0x004 IstIndex         : 0y000
   +0x004 Reserved0        : 0y00000 (0)
   +0x004 Type             : 0y01110 (0xe)
   +0x004 Dpl              : 0y00
   +0x004 Present          : 0y1
   +0x006 OffsetMiddle     : 0x103
   +0x008 OffsetHigh       : 0xfffff800
   +0x00c Reserved1        : 0
   +0x000 Alignment        : 0x1038e00`0010f240

kd> u 0xfffff8000103f240
nt!KiDivideErrorFault:
fffff800`0103f240 4883ec08        sub     rsp,8
fffff800`0103f244 55              push    rbp
fffff800`0103f245 4881ec58010000  sub     rsp,158h
fffff800`0103f24c 488dac2480000000 lea     rbp,[rsp+80h]
fffff800`0103f254 c645ab01        mov     byte ptr [rbp-55h],1
fffff800`0103f258 488945b0        mov     qword ptr [rbp-50h],rax
fffff800`0103f25c 48894db8        mov     qword ptr [rbp-48h],rcx
fffff800`0103f260 488955c0        mov     qword ptr [rbp-40h],rdx
kd> ln 0xfffff8000103f240
(fffff800`0103f240)   nt!KiDivideErrorFault   |
(fffff800`0103f300)   nt!KiDebugTrapOrFault
Exact matches:
    nt!KiDivideErrorFault = <no type information>
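The gluing step can be checked outside the debugger. The sketch below recombines the three offset fields and also decodes a raw 16-byte entry using the field layout printed by dt above; the function names are mine, and the raw bytes in the usage note are rebuilt from the Alignment and OffsetHigh values shown in the dump rather than read from memory.

```python
import struct


def idt_handler_address(offset_low, offset_middle, offset_high):
    """Combine the three _KIDTENTRY64 offset fields into one 64-bit address."""
    return (offset_high << 32) | (offset_middle << 16) | offset_low


def decode_idt_entry(raw):
    """Decode a 16-byte IDT entry laid out as in the dt output above."""
    # Offsets 0x0, 0x2, 0x4, 0x6 are 16-bit fields; 0x8 and 0xc are 32-bit fields.
    offset_low, selector, flags, offset_middle, offset_high, _reserved = struct.unpack(
        "<HHHHII", raw
    )
    return {
        "selector": selector,
        "type": (flags >> 8) & 0x1F,   # bits 8-12 of the flags word
        "dpl": (flags >> 13) & 0x3,    # bits 13-14
        "present": (flags >> 15) & 0x1,  # bit 15
        "handler": idt_handler_address(offset_low, offset_middle, offset_high),
    }
```

For example, `decode_idt_entry(struct.pack("<QII", 0x01038E000010F240, 0xFFFFF800, 0))` reproduces the selector, type, Dpl and Present values from the dump and a handler address equal to the one passed to the u and ln commands.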

We see that the name of the procedure is KiDivideErrorFault and not KiTrap00. We can dump the second IDT entry manually by adding a 0x10 offset, but to automate this I wrote the following WinDbg script, which dumps the first 20 vectors and gets their interrupt procedure names (the loop bound 13 is hexadecimal, WinDbg’s default radix, that is, 19 decimal):

r? $t0=(_KIDTENTRY64 *)0xfffff800`00124070; .for (r $t1=0; @$t1 <= 13; r? $t0=(_KIDTENTRY64 *)@$t0+1) { .printf "Interrupt vector %d (0x%x):\n", @$t1, @$t1; ln @@c++(@$t0->OffsetHigh*0x100000000 + @$t0->OffsetMiddle*0x10000 + @$t0->OffsetLow); r $t1=$t1+1 }

Here is the same script but formatted:

r? $t0=(_KIDTENTRY64 *)0xfffff800`00124070;
.for (r $t1=0; @$t1 <= 13; r? $t0=(_KIDTENTRY64 *)@$t0+1)
{
    .printf "Interrupt vector %d (0x%x):\n", @$t1, @$t1;
    ln @@c++(@$t0->OffsetHigh*0x100000000 +
        @$t0->OffsetMiddle*0x10000 + @$t0->OffsetLow);
    r $t1=$t1+1
}
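Two details of the script are easy to misread: the pointer increment advances by one 16-byte entry, and the loop bound is parsed as hexadecimal. The short sketch below restates both in Python; the names are mine, and the base address is the IdtBase value from the dt output above.

```python
IDT_ENTRY_SIZE = 0x10  # 16 bytes per _KIDTENTRY64, as stated in the text


def idt_entry_address(idt_base, vector):
    """Address of the IDT entry for a given interrupt vector."""
    return idt_base + vector * IDT_ENTRY_SIZE


# The script's loop bound "13", read with radix 16.
last_vector = int("13", 16)
vector_count = last_vector + 1  # vectors 0 through 19 inclusive
```

So the entry for vector 14 (0xe) sits 0xe0 bytes past the IDT base, and the loop visits 20 entries in total.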

The output on my system is:

Interrupt vector 0 (0x0):
(fffff800`0103f240) nt!KiDivideErrorFault | (fffff800`0103f300) nt!KiDebugTrapOrFault
Exact matches:
  nt!KiDivideErrorFault = <no type information>
Interrupt vector 1 (0x1):
(fffff800`0103f300) nt!KiDebugTrapOrFault | (fffff800`0103f440) nt!KiNmiInterrupt
Exact matches:
  nt!KiDebugTrapOrFault = <no type information>
Interrupt vector 2 (0x2):
(fffff800`0103f440) nt!KiNmiInterrupt | (fffff800`0103f680) nt!KxNmiInterrupt
Exact matches:
  nt!KiNmiInterrupt = <no type information>
Interrupt vector 3 (0x3):
(fffff800`0103f780) nt!KiBreakpointTrap | (fffff800`0103f840) nt!KiOverflowTrap
Exact matches:
  nt!KiBreakpointTrap = <no type information>
Interrupt vector 4 (0x4):
(fffff800`0103f840) nt!KiOverflowTrap | (fffff800`0103f900) nt!KiBoundFault
Exact matches:
  nt!KiOverflowTrap = <no type information>
Interrupt vector 5 (0x5):
(fffff800`0103f900) nt!KiBoundFault | (fffff800`0103f9c0) nt!KiInvalidOpcodeFault
Exact matches:
  nt!KiBoundFault = <no type information>
Interrupt vector 6 (0x6):
(fffff800`0103f9c0) nt!KiInvalidOpcodeFault | (fffff800`0103fb80) nt!KiNpxNotAvailableFault
Exact matches:
  nt!KiInvalidOpcodeFault = <no type information>
Interrupt vector 7 (0x7):
(fffff800`0103fb80) nt!KiNpxNotAvailableFault | (fffff800`0103fc40) nt!KiDoubleFaultAbort
Exact matches:
  nt!KiNpxNotAvailableFault = <no type information>
Interrupt vector 8 (0x8):
(fffff800`0103fc40) nt!KiDoubleFaultAbort | (fffff800`0103fd00) nt!KiNpxSegmentOverrunAbort
Exact matches:
  nt!KiDoubleFaultAbort = <no type information>
Interrupt vector 9 (0x9):
(fffff800`0103fd00) nt!KiNpxSegmentOverrunAbort | (fffff800`0103fdc0) nt!KiInvalidTssFault
Exact matches:
  nt!KiNpxSegmentOverrunAbort = <no type information>
Interrupt vector 10 (0xa):
(fffff800`0103fdc0) nt!KiInvalidTssFault | (fffff800`0103fe80) nt!KiSegmentNotPresentFault
Exact matches:
  nt!KiInvalidTssFault = <no type information>
Interrupt vector 11 (0xb):
(fffff800`0103fe80) nt!KiSegmentNotPresentFault | (fffff800`0103ff80) nt!KiStackFault
Exact matches:
  nt!KiSegmentNotPresentFault = <no type information>
Interrupt vector 12 (0xc):
(fffff800`0103ff80) nt!KiStackFault | (fffff800`01040080) nt!KiGeneralProtectionFault
Exact matches:
  nt!KiStackFault = <no type information>
Interrupt vector 13 (0xd):
(fffff800`01040080) nt!KiGeneralProtectionFault | (fffff800`01040180) nt!KiPageFault
Exact matches:
  nt!KiGeneralProtectionFault = <no type information>
Interrupt vector 14 (0xe):
(fffff800`01040180) nt!KiPageFault | (fffff800`010404c0) nt!KiFloatingErrorFault
Exact matches:
  nt!KiPageFault = <no type information>
Interrupt vector 15 (0xf):
(fffff800`01179090) nt!KxUnexpectedInterrupt0+0xf0 | (fffff800`0117a0c0) nt!KiNode0
Interrupt vector 16 (0x10):
(fffff800`010404c0) nt!KiFloatingErrorFault | (fffff800`01040600) nt!KiAlignmentFault
Exact matches:
  nt!KiFloatingErrorFault = <no type information>
Interrupt vector 17 (0x11):
(fffff800`01040600) nt!KiAlignmentFault | (fffff800`010406c0) nt!KiMcheckAbort
Exact matches:
  nt!KiAlignmentFault = <no type information>
Interrupt vector 18 (0x12):
(fffff800`010406c0) nt!KiMcheckAbort | (fffff800`01040900) nt!KxMcheckAbort
Exact matches:
  nt!KiMcheckAbort = <no type information>
Interrupt vector 19 (0x13):
(fffff800`01040a00) nt!KiXmmException | (fffff800`01040b80) nt!KiRaiseAssertion
Exact matches:
  nt!KiXmmException = <no type information>

Let’s look at a dump.

BugCheck 1E, {ffffffffc0000005, fffffade5ba2d643, 0, 28}

This is KMODE_EXCEPTION_NOT_HANDLED, and the exception code c0000005 (access violation) points to an invalid memory access. Indeed, the stack WinDbg shows after opening the dump and entering the !analyze -v command is:

RSP
fffffade`4e88fe68 nt!KeBugCheckEx
fffffade`4e88fe70 nt!KiDispatchException+0x128
fffffade`4e8905f0 nt!KiPageFault+0x1e1
fffffade`4e890780 driver!foo+0x9b

and the exception context is:

2: kd> r
Last set context:
rax=fffffade4e8907f4 rbx=fffffade6de0c2e0 rcx=fffffa8014412000
rdx=fffffade71e7e2ac rsi=0000000000000000 rdi=fffffadffff03000
rip=fffffade5ba2d643 rsp=fffffade4e890780 rbp=fffffade71e7ffff
 r8=00000000000005b0 r9=fffffade4e890a88 r10=fffffadffd077898
r11=fffffade71e7e260 r12=0000000000000000 r13=0000000000000000
r14=0000000000000000 r15=0000000000000000
iopl=0 nv up ei pl zr na po nc
cs=0010 ss=0018 ds=0950 es=4e89 fs=fade gs=ffff efl=00010246
driver!foo+0x9b:
fffffade`5ba2d643 8b4e28 mov ecx,dword ptr [rsi+28h] ds:0950:0028=????????

We see that KiPageFault was called, and from the dumped IDT we see that it corresponds to interrupt vector 14 (0xE), which is raised on any reference to memory that is not present in physical memory.

Now I’m going to dump the raw stack around fffffade`4e890780 to see the values the processor saved before calling KiPageFault:

2: kd> dps fffffade`4e890780-50 fffffade`4e890780+50
fffffade`4e890730  fffffade`6de0c2e0
fffffade`4e890738  fffffadf`fff03000
fffffade`4e890740  00000000`00000000
fffffade`4e890748  fffffade`71e7ffff
fffffade`4e890750  00000000`00000000 ; ErrorCode
fffffade`4e890758  fffffade`5ba2d643 driver!foo+0x9b ; RIP
fffffade`4e890760  00000000`00000010 ; CS
fffffade`4e890768  00000000`00010246 ; RFLAGS
fffffade`4e890770  fffffade`4e890780 ; RSP
fffffade`4e890778  00000000`00000018 ; SS

fffffade`4e890780  00000000`00000000 ; RSP before interrupt
fffffade`4e890788  00000000`00000000
fffffade`4e890790  00000000`00000000
fffffade`4e890798  00000000`00000000
fffffade`4e8907a0  00000000`00000000
fffffade`4e8907a8  00000000`00000000
fffffade`4e8907b0  00000000`00000000
fffffade`4e8907b8  00000000`00000000
fffffade`4e8907c0  00000000`00000000
fffffade`4e8907c8  00000000`00000000
fffffade`4e8907d0  00000000`00000000

You see the values are exactly the same as WinDbg shows in the saved context above. Actually if you look at Page-Fault Error Code bits in Intel Architecture Software Developer’s Manual Volume 3A, you would see that for this case, all zeroes, we have:

  • the page was not present in memory
  • the fault was caused by the read access
  • the processor was executing in kernel mode
  • no reserved bits in page directory were set to 1 when 0s were expected
  • it was not caused by instruction fetch
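The five statements above correspond to the five low bits of the error code. A small decoder makes the mapping explicit; this is a sketch with a hypothetical function name, using the bit positions from the Intel manual (bit 0 present, bit 1 write, bit 2 user, bit 3 reserved bit, bit 4 instruction fetch).

```python
# (mask, meaning when the bit is set, meaning when the bit is clear)
PAGE_FAULT_BITS = (
    (0x01, "protection violation", "page not present"),
    (0x02, "write access", "read access"),
    (0x04, "user mode", "supervisor (kernel) mode"),
    (0x08, "reserved bit set in a paging structure", "no reserved-bit violation"),
    (0x10, "instruction fetch", "not an instruction fetch"),
)


def decode_page_fault_error(code):
    """Return one human-readable statement per error code bit."""
    return [
        if_set if code & mask else if_clear
        for mask, if_set, if_clear in PAGE_FAULT_BITS
    ]
```

Calling `decode_page_fault_error(0)` returns the "bit clear" statement for every position, which is the list given above for this dump.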

- Dmitry Vostokov -

Parameterized WinDbg Scripts

Saturday, May 5th, 2007

I was looking for a way to pass arguments to scripts and the new WinDbg 6.7.5 documentation contains the description of $$>a< command:

$$>a< Filename arg1 arg2 arg3 … argn

argn

Specifies an argument that the debugger passes to the script. These parameters can be quoted strings or space-delimited strings. All arguments are optional.

This promises a lot of possibilities for writing cool scripts and using structured programming. I couldn’t quickly find in the help what receives the passed parameter, a pseudo-register or a fixed-name alias. Certainly some experimentation is needed here.

- Dmitry Vostokov -

Crash Dumps for Dummies (Part 5)

Saturday, May 5th, 2007

In this part, I try to explain symbol files. They are usually called PDB files because they have the .PDB extension, although older ones can have the .DBG extension. PDB files are needed to read dump files properly. Without PDB files the dump file data is just a collection of numbers, the contents of memory, without any meaning. PDB files help tools like WinDbg to interpret the data and present it in a human-readable format. Roughly speaking, PDB files contain associations between numbers and their meanings expressed in short text strings.
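To make the idea of "associations between numbers and their meanings" concrete, here is a toy model in Python. It is only an illustration: the addresses and names are invented, and a real PDB file stores far more than a list of function start addresses.

```python
import bisect

# Hypothetical miniature "symbol file": sorted (start address, name) pairs.
SYMBOLS = [
    (0x00401000, "app!main"),
    (0x00401200, "app!ParseArgs"),
    (0x00401800, "app!WorkerThread"),
]
ADDRESSES = [address for address, _name in SYMBOLS]


def symbolize(address):
    """Translate a raw address into module!function+offset, if possible."""
    position = bisect.bisect_right(ADDRESSES, address) - 1
    if position < 0:
        # Below the first known symbol: all we can show is the number itself.
        return f"0x{address:08x}"
    base, name = SYMBOLS[position]
    offset = address - base
    return name if offset == 0 else f"{name}+0x{offset:x}"
```

With this table `symbolize(0x00401215)` gives `app!ParseArgs+0x15`. If the table came from a different build of the same module, the same address could be attributed to the wrong function, which is why matching symbol files matter.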

Because these associations change when a fix or a service pack is installed on a computer, a dump from that computer needs the newer PDB files that correspond to the updated components such as DLLs or drivers.

A long time ago you had to download symbol files manually from Microsoft or get them from CDs. Now Microsoft has a dedicated internet symbol server, and WinDbg can download PDB files automatically. However, you need to specify the Microsoft symbol server location in the File\Symbol File Path… dialog and check Reload. The location is usually:

SRV*c:\websymbols*http://msdl.microsoft.com/download/symbols

If you don’t remember the location when you run WinDbg for the first time or on a new computer, you can enter the .symfix command to set the Microsoft symbol server path automatically and specify the location where symbol files are downloaded. You can check your current symbol search path with the .sympath command, and don’t forget to reload symbols by entering the .reload command:

0:000> .symfix
No downstream store given, using C:\Program Files\Debugging Tools for Windows\sym
0:000> .sympath
Symbol search path is: SRV**http://msdl.microsoft.com/download/symbols
0:000> .symfix c:\websymbols
0:000> .sympath
Symbol search path is: SRV*c:\websymbols*http://msdl.microsoft.com/download/symbols
0:000> .reload
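The symbol path strings above follow the pattern SRV*DownstreamStore*SymbolServer, with several elements separated by semicolons. A small parser shows how the pieces fit; the function name and the dictionary keys are my own, and only the plain SRV form used in this post is handled.

```python
def parse_symbol_path(path):
    """Split a symbol path into directory and symbol server elements."""
    entries = []
    for element in path.split(";"):
        parts = element.split("*")
        if parts[0].upper() == "SRV" and len(parts) == 3:
            entries.append({
                "kind": "server",
                # An empty store (SRV**url) means the debugger's default location.
                "downstream_store": parts[1] or None,
                "server": parts[2],
            })
        else:
            entries.append({"kind": "directory", "path": element})
    return entries
```

Parsing the two paths printed by .sympath above shows the only difference between them: the first has no explicit downstream store, the second uses c:\websymbols.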

- Dmitry Vostokov @ DumpAnalysis.org -

Reading Korean

Saturday, May 5th, 2007

I’m very pleased that Heejune Kim and Taehwa Lee translated and posted the first crash dump analysis pattern article on www.driveronline.org.

Heejune also has a blog insidekernel.net, mainly about WDF.

I’ve just looked at my blog idea notes and found that I have plans to write about 15 more crash dump analysis patterns. I will try to write most of them in forthcoming months and later reorganize and classify them in retrospection. 

As you might have guessed already I have books about Korean language too. First I’d like to mention 2 volumes written by Ross King, Jae-Hoon Yeon and Insun Lee:

Elementary Korean

Continuing Korean 

The Korean writing system, Hankul, is often described as the most scientific writing system in the world and as the world’s best alphabet.

If you are interested in Writing Systems in general (capitalized to avoid confusion with writing systems in C++) I would recommend two books that I browse sometimes:

Writing Systems: A Linguistic Approach

Writing Systems: A Linguistic Introduction

Both books use IPA (International Phonetic Alphabet) to illustrate the mapping of writing to sounds. If you have never used IPA and want to learn about it, about vowels, consonants and the sounds of languages, and about the basics of speech recognition and synthesis, I would greatly recommend this amazing book with CD (I read the first edition but it seems there is a second one):

Vowels and Consonants

- Dmitry Vostokov -

PDBFinder (public version 3.0.1)

Friday, May 4th, 2007

Finally I’ve managed to create a slimmer version 3.0.1 with many resource consumption improvements:

  • Built database occupies 3.7 times less disk space
  • 2 times faster load
  • 5 times less memory consumption

Additional improvements:

  • “FIXFinder” feature: shows folders where you can find newer modules
  • Search can be restricted to folder names containing sub-string (for example, “release” or “full” symbols only)
  • Ready to copy and paste folder names (for WinDbg Symbol File Path dialog) – exe/dll/sys subfolders are stripped off
  • Additional minor interface improvements (larger screen, ellipsis in paths during build, better keyboard focus handling)

The tool has been tested with more than 1,000,000 PDB files.

PDBFinder Deluxe Pro 3.0.1 and its documentation can be downloaded from Citrix support web site (requires free registration).

The motivation behind the tool is explained in my previous post: Cons of Symbol Server.

- Dmitry Vostokov -

Moving to UML 2.x

Friday, May 4th, 2007

As a part of improving my UML diagrams and moving towards UML 2.x I’m evaluating free community editors and found the one that seems to be good: Visual Paradigm for UML 6.0 Community Edition.
Here are some diagram screenshots:

http://www.visual-paradigm.com/product/vpuml/vpumlscreenshots.jsp

In case you didn’t know, you can be certified in UML 2:

http://www.omg.org/uml-certification/index.htm

In 2003 I was one of the beta-testers of the UML Fundamentals exam and passed it. I guess from my certificate number (B100031) that I was the 31st person to pass the exam; 31 is also the highest bit offset in a double word. Now I’m aiming to be certified at the next, Intermediate, level. This exam is a pure UML language exam. It is not about applying UML in your domain. The only book for this certification is the following one, which I found very good for exam preparation:

UML 2 Certification Guide: Fundamental & Intermediate Exams

- Dmitry Vostokov -

Cons of Symbol Server

Thursday, May 3rd, 2007

Symbol servers are great. However I found that in crash dump analysis the absence of automatically loaded symbols sometimes helps to identify a problem or at least gives some directions for further research. It also helps to see which hot fixes or service packs for your product were installed on a problem computer. The scenario I use sometimes when I analyze crash dumps from product A is the following:

  1. Set up WinDbg to point to Microsoft Symbol Server
  2. Load a crash dump and enter various commands based on the issue. Some OS or product A components become visible and their symbols are unresolved.
  3. From unresolved OS symbols I become aware of the latest fixes or privates from MS
  4. From unresolved symbols of the product A and PDBFinder I determine the base product level and this already gives me some directions.
  5. I add the base product A symbols to symbol file path and continue my analysis.
  6. If unresolved symbols of the product A continue to come up I use PDBFinder again to find corresponding symbols and add them to symbol file path. By doing that I’m aware of the product A hot fix and/or service pack level.
  7. Also from the latest version of PDBFinder (3.0.1) I know whether there are any updates to the component in question.

Of course, all this works only if you store all PDB files from all your fixes and service packs in some location(s) with easily identified names, for example, PRODUCTA\VER20\SP31\FIX01. Adding symbols manually helps you stay focused on components and draws attention to the threads where they appear. You might think it is a waste of time, but it takes only a very small percentage of the time, especially if you look at the dump for a couple of hours.

What is PDBFinder? This is a program I developed to find the right symbol files (especially for minidumps). It scans all locations for PDB or DBG files and adds them to a text database. The next time you run PDBFinder it loads that database, and you can find a PDB or DBG file location by specifying a module name and its date. You can also do a fuzzy search by specifying a date interval. If you run it with the -update command line option it builds the database automatically, which is useful for scheduling weekly updates.
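The scan-then-search idea can be sketched in a few lines of Python. This is not how PDBFinder itself is implemented: the function names are hypothetical, the index is kept in memory instead of a text database, and the file modification time stands in for the module date, which is an assumption made only to keep the example self-contained.

```python
import os
from datetime import datetime, timedelta

SYMBOL_EXTENSIONS = {".pdb", ".dbg"}


def build_index(roots):
    """Walk the given folders and list (module, date, folder) for symbol files."""
    index = []
    for root in roots:
        for folder, _subdirs, files in os.walk(root):
            for name in files:
                stem, ext = os.path.splitext(name)
                if ext.lower() in SYMBOL_EXTENSIONS:
                    path = os.path.join(folder, name)
                    # Assumption: file modification time approximates the module date.
                    modified = datetime.fromtimestamp(os.path.getmtime(path))
                    index.append((stem.lower(), modified, folder))
    return index


def find_symbols(index, module, when, tolerance=timedelta(0)):
    """Return folders holding symbols for a module dated within the tolerance."""
    module = module.lower()
    return sorted(
        folder
        for stem, modified, folder in index
        if stem == module and abs(modified - when) <= tolerance
    )
```

Passing a non-zero `tolerance` gives the fuzzy search by date interval; folder names such as PRODUCTA\VER20\SP31\FIX01 in the result then tell you the fix level directly.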

The public version of PDBFinder Deluxe 2.2.1 can be downloaded from Citrix support web site. The new version 3.0.1 is on the way with major improvements and will be announced tomorrow.

- Dmitry Vostokov -

The new version of WinDbg has been released

Tuesday, May 1st, 2007

Version 6.7.5 of Debugging Tools for Windows was released last week.

I haven’t found any significant additions for crash dump analysis so far. What I noticed today is that when I open a user dump and enter the !analyze -v command, a new field appears in the output:

NTGLOBALFLAG:  400

I ran gflags.exe and enabled page heap for notepad.exe. Then I launched notepad.exe, attached WinDbg, entered the command and NTGLOBALFLAG field reflected the change:

NTGLOBALFLAG:  2000000
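The field is a bit mask printed in hexadecimal without a prefix. The sketch below decodes the two values seen above; the table is deliberately partial (only a few flags whose values I am sure of), and the function name is hypothetical.

```python
# Partial table of global flag bits; many other flags exist.
GLOBAL_FLAGS = {
    0x00000010: "FLG_HEAP_ENABLE_TAIL_CHECK",
    0x00000020: "FLG_HEAP_ENABLE_FREE_CHECK",
    0x00000040: "FLG_HEAP_VALIDATE_PARAMETERS",
    0x00000400: "FLG_POOL_ENABLE_TAGGING",
    0x02000000: "FLG_HEAP_PAGE_ALLOCS",  # page heap
}


def decode_nt_global_flag(field_text):
    """Return (known flag names, leftover bits) for an NTGLOBALFLAG value."""
    value = int(field_text, 16)  # the field is printed in hex without "0x"
    names = [name for bit, name in sorted(GLOBAL_FLAGS.items()) if value & bit]
    unknown = value & ~sum(GLOBAL_FLAGS)  # bits not covered by the partial table
    return names, unknown
```

Decoding 2000000 yields FLG_HEAP_PAGE_ALLOCS, which matches enabling page heap in gflags.exe; any bits the partial table does not cover are returned separately instead of being silently dropped.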

So I don’t need to type !gflag command anymore. 

- Dmitry Vostokov -