Archive for the ‘Crash Dump Analysis’ Category

Fault context, wild code and hardware error: pattern cooperation

Tuesday, March 16th, 2010

I recently got a crying request from a reader of my blog to analyze the source of frequent bugchecks on a newly bought computer running Windows 7. I got 8 kernel minidumps with 5 different bugchecks. However, inspection of the default analysis revealed common Fault Context pattern of high resource consumption flight simulator processes in 6 minidumps. Most fault IPs were showing signs of Wild Code pattern and that most probably implicated Hardware Error (Looks like WinDbg suggests that MISALIGNED_IP implicates hardware). Here is the listing of relevant output fragments with attempts to disassemble code around IP (Instruction Pointer) to see if code make any sense (magenta color means the valid that should have been instead of misaligned code highlighted in red):

Windows 7 Kernel Version 7600 MP (4 procs) Free x86 compatible

Debug session time: Fri Jan  8 20:31:15.121 2010 (GMT+0)
System Uptime: 0 days 2:54:44.916

1: kd> !analyze -v

DRIVER_IRQL_NOT_LESS_OR_EQUAL (d1)

PROCESS_NAME:  FlightSimulatorA.exe

CURRENT_IRQL:  2

TRAP_FRAME:  807e6ea4 -- (.trap 0xffffffff807e6ea4)
ErrCode = 00000002
eax=872082a7 ebx=80028d5f ecx=b3348635 edx=87208638 esi=80280001 edi=000082a7
eip=8d613485 esp=807e6f18 ebp=6f248635 iopl=0  nv up ei ng nz na po nc
cs=0008 ss=0010 ds=0023 es=0023 fs=0030 gs=0000 efl=00010282
USBPORT!USBPORT_Xdpc_End+0xa6:
8d613485 897904          mov     dword ptr [ecx+4],edi ds:0023:b3348639=????????
Resetting default scope

STACK_TEXT: 
807e6ea4 8d613485 badb0d00 87208638 82a7b334 nt!KiTrap0E+0x2cf
807e6f24 8d613d18 00000000 86358720 86358002 USBPORT!USBPORT_Xdpc_End+0xa6
807e6f48 82aa33b5 8635872c 86358002 00000000 USBPORT!USBPORT_Xdpc_Worker+0x173
807e6fa4 82aa3218 807c6120 87e7e950 00000000 nt!KiExecuteAllDpcs+0xf9
807e6ff4 82aa29dc 9f7e1ce4 00000000 00000000 nt!KiRetireDpcList+0xd5
807e6ff8 9f7e1ce4 00000000 00000000 00000000 nt!KiDispatchInterrupt+0x2c
WARNING: Frame IP not in any known module. Following frames may be wrong.
82aa29dc 00000000 0000001a 00d6850f bb830000 0x9f7e1ce4

Debug session time: Fri Jan  8 20:42:16.395 2010 (GMT+0)
System Uptime: 0 days 0:10:22.815

2: kd> !analyze -v

DRIVER_IRQL_NOT_LESS_OR_EQUAL (d1)

CURRENT_IRQL:  2

TRAP_FRAME:  8d91cbc4 -- (.trap 0xffffffff8d91cbc4)
ErrCode = 00000002
eax=00000000 ebx=8d901a00 ecx=86570108 edx=86570108 esi=8d905884 edi=86573920
eip=911e5f5d esp=8d91cc38 ebp=8d91cc78 iopl=0 nv up ei pl nz na po nc
cs=0008 ss=0010 ds=0023 es=0023 fs=0030 gs=0000 efl=00010202
HDAudBus!HdaController::NotificationDpc+0×14d:
911e5f5d ff              ???

Resetting default scope

IMAGE_NAME:  hardware

2: kd> u HDAudBus!HdaController::NotificationDpc+14d
HDAudBus!HdaController::NotificationDpc+0×14d:
911e5f5d ff              ???
911e5f5e ff              ???
911e5f5f ff6a00          jmp     fword ptr [edx]

911e5f62 6a00            push    0
911e5f64 6a00            push    0
911e5f66 68ff000000      push    0FFh
911e5f6b 6a03            push    3
911e5f6d 6a04            push    4

2: kd> uf HDAudBus!HdaController::NotificationDpc
[...]
HDAudBus!HdaController::NotificationDpc+0x135:
911e5f45 8b45d8          mov     eax,dword ptr [ebp-28h]
911e5f48 c6405400        mov     byte ptr [eax+54h],0
911e5f4c 8b4dd8          mov     ecx,dword ptr [ebp-28h]
911e5f4f 83c148          add     ecx,48h
911e5f52 8a55e7          mov     dl,byte ptr [ebp-19h]
911e5f55 ff1510a01e91    call    dword ptr [HDAudBus!_imp_KfReleaseSpinLock (911ea010)]

HDAudBus!HdaController::NotificationDpc+0x14b:
911e5f5b e909ffffff      jmp     HDAudBus!HdaController::NotificationDpc+0x59 (911e5e69)

HDAudBus!HdaController::NotificationDpc+0x150:
911e5f60 6a00            push    0
911e5f62 6a00            push    0
911e5f64 6a00            push    0
911e5f66 68ff000000      push    0FFh
911e5f6b 6a03            push    3
911e5f6d 6a04            push    4
911e5f6f 6a08            push    8
911e5f71 6a02            push    2
911e5f73 e818180000      call    HDAudBus!HDABusWmiLogETW (911e7790)
911e5f78 8b4df0          mov     ecx,dword ptr [ebp-10h]
911e5f7b 64890d00000000  mov     dword ptr fs:[0],ecx
911e5f82 59              pop     ecx
911e5f83 5f              pop     edi
911e5f84 5e              pop     esi
911e5f85 5b              pop     ebx
911e5f86 8be5            mov     esp,ebp
911e5f88 5d              pop     ebp
911e5f89 c21000          ret     10h

Debug session time: Fri Jan  8 21:32:04.096 2010 (GMT+0)
System Uptime: 0 days 0:49:10.517

1: kd> !analyze -v

KERNEL_MODE_EXCEPTION_NOT_HANDLED_M (1000008e)

Arg1: c000001d, The exception code that was not handled

EXCEPTION_CODE: (NTSTATUS) 0xc000001d - {EXCEPTION}  Illegal Instruction  An attempt was made to execute an illegal instruction.

TRAP_FRAME:  a99e3644 -- (.trap 0xffffffffa99e3644)
ErrCode = 00000000
eax=000000fe ebx=8556a2b0 ecx=754764cd edx=00000001 esi=858ad008 edi=858ad048
eip=82ada4c2 esp=a99e36b8 ebp=a99e3704 iopl=0 nv up ei ng nz na po nc
cs=0008 ss=0010 ds=0023 es=0023 fs=0030 gs=0000 efl=00010282
nt!IopCompleteRequest+0×3ac:
82ada4c2 02cd            add     cl,ch

PROCESS_NAME:  FlightSimulatorA.exe

CURRENT_IRQL:  1

MISALIGNED_IP:
nt!IopCompleteRequest+3ac
82ada4c2 02cd            add     cl,ch

IMAGE_NAME:  hardware

1: kd> uf nt!IopCompleteRequest+3ac
nt!IopCompleteRequest+0×3a9:
82ada4bf 82680002        sub     byte ptr [eax],2
82ada4c3 cd82            int     82h

82ada4c5 50              push    eax
82ada4c6 ff75e0          push    dword ptr [ebp-20h]
82ada4c9 57              push    edi
82ada4ca e881830100      call    nt!KeInitializeApc (82af2850)
82ada4cf 6a02            push    2
82ada4d1 6a00            push    0
82ada4d3 ff7628          push    dword ptr [esi+28h]
82ada4d6 57              push    edi
82ada4d7 e8d2830100      call    nt!KeInsertQueueApc (82af28ae)
82ada4dc 33ff            xor     edi,edi
82ada4de eb5f            jmp     nt!IopCompleteRequest+0×429 (82ada53f)

1: kd> ub nt!IopCompleteRequest+3ac
                                  ^ Unable to find valid previous instruction for 'ub nt!IopCompleteRequest+3ac'

Debug session time: Sat Jan  9 07:45:24.155 2010 (GMT+0)
System Uptime: 0 days 2:09:39.576

0: kd> !analyze -v

UNEXPECTED_KERNEL_MODE_TRAP (7f)

Arg1: 0000000d, EXCEPTION_GP_FAULT

PROCESS_NAME:  FlightSimulatorA.exe

CURRENT_IRQL:  6

STACK_TEXT: 
a24b3bd8 90f9e956 badb0d00 00000000 ddf1ba50 nt!KiSystemFatalException+0xf
a24b3cc4 90f93f2b 00000001 00000004 00000004 HDAudBus!HDABusWmiLogETW+0x1c6
a24b3d08 82a817ad 864a6280 86541000 a24b3d34 HDAudBus!HdaController::Isr+0x2b
a24b3d08 20c40d61 864a6280 86541000 a24b3d34 nt!KiInterruptDispatch+0x6d
WARNING: Frame IP not in any known module. Following frames may be wrong.
1343f8ea 00000000 00000000 00000000 00000000 0x20c40d61

Debug session time: Sat Jan  9 08:52:03.454 2010 (GMT+0)
System Uptime: 0 days 1:05:54.249

0: kd> !analyze -v

IRQL_NOT_LESS_OR_EQUAL (a)

CURRENT_IRQL:  2

PROCESS_NAME:  FlightSimulatorA.exe

TRAP_FRAME:  8078adf0 -- (.trap 0xffffffff8078adf0)
ErrCode = 00000002
eax=8632e2a6 ebx=00000000 ecx=880fb200 edx=00000118 esi=00000007 edi=8632e27c
eip=82a0c967 esp=8078ae64 ebp=c1e2baa0 iopl=0 nv up ei ng nz na pe nc
cs=0008 ss=0010 ds=0023 es=0023 fs=0030 gs=0000 efl=00010286
hal!HalBuildScatterGatherList+0xf3:
82a0c967 8901            mov     dword ptr [ecx],eax  ds:0023:880fb200=????????
Resetting default scope

STACK_TEXT: 
8078adf0 82a0c967 badb0d00 00000118 82b5f466 nt!KiTrap0E+0x2cf
8078ae78 82a0cc16 880fb218 86379028 8632e260 hal!HalBuildScatterGatherList+0xf3
8078aea8 909b3e70 8651c6b0 86379028 8632e260 hal!HalGetScatterGatherList+0x26
8078aef4 909b3807 86379028 86379970 00000007 USBPORT!USBPORT_Core_iMapTransfer+0x21e
8078af24 909add18 86379028 86379970 86379002 USBPORT!USBPORT_Core_UsbMapDpc_Worker+0x1e3
8078af48 82aa73b5 8637997c 86379002 00000000 USBPORT!USBPORT_Xdpc_Worker+0x173
8078afa4 82aa7218 82b68d20 88139a98 00000000 nt!KiExecuteAllDpcs+0xf9
8078aff4 82aa69dc 9fd8cce4 00000000 00000000 nt!KiRetireDpcList+0xd5
8078aff8 9fd8cce4 00000000 00000000 00000000 nt!KiDispatchInterrupt+0x2c
WARNING: Frame IP not in any known module. Following frames may be wrong.
82aa69dc 00000000 0000001a 00d6850f bb830000 0x9fd8cce4

Debug session time: Sat Jan  9 16:34:48.134 2010 (GMT+0)
System Uptime: 0 days 1:53:05.929

1: kd> !analyze -v

DRIVER_IRQL_NOT_LESS_OR_EQUAL (d1)

CURRENT_IRQL:  2

PROCESS_NAME:  firefox.exe

TRAP_FRAME:  bb92449c -- (.trap 0xffffffffbb92449c)
ErrCode = 00000000
eax=000005b4 ebx=0db19ba0 ecx=80000000 edx=00000001 esi=85fdff29 edi=bb924530
eip=8bc7e2c7 esp=bb924510 ebp=bb924638 iopl=0 nv up ei ng nz na po nc
cs=0008 ss=0010 ds=0023 es=0023 fs=0030 gs=0000 efl=00010282
tcpip!TcpBeginTcbSend+0xa83:
8bc7e2c7 eb06            jmp     tcpip!TcpBeginTcbSend+0xa8b (8bc7e2cf)

Resetting default scope

STACK_TEXT: 
bb92449c 8bc7e2c7 badb0d00 00000001 00000000 nt!KiTrap0E+0x2cf
bb924638 8bc7d2bf 87b39c78 00000000 00000001 tcpip!TcpBeginTcbSend+0xa83
bb92479c 8bc814b5 87b39c78 00000000 00000001 tcpip!TcpTcbSend+0x426
bb9247bc 8bc7f349 87b39c78 87fa6c38 00000000 tcpip!TcpEnqueueTcbSendOlmNotifySendComplete+0x157
bb92481c 8bc81846 87b39c78 bb92491c 00000000 tcpip!TcpEnqueueTcbSend+0x3ca
bb924838 82a95f8a bb9248c8 96d9c9d2 00000000 tcpip!TcpTlConnectionSendCalloutRoutine+0x17
bb9248a0 8bc80a0b 8bc8182f bb9248c8 00000000 nt!KeExpandKernelStackAndCalloutEx+0x132
bb9248d8 908b5d27 87b39c01 bb924900 85572e18 tcpip!TcpTlConnectionSend+0x73
bb92493c 908bb2e3 00d4f1e0 85572e18 85572eac tdx!TdxSendConnection+0x1d7
bb924958 82a424bc 86236b80 85572e18 862389c0 tdx!TdxTdiDispatchInternalDeviceControl+0x115
bb924970 908d65ca 86d0e0c8 00000000 86238990 nt!IofCallDriver+0x63
WARNING: Stack unwind information not available. Following frames may be wrong.
bb9249c8 908d17f8 86238990 85572e18 85572ed0 aswTdi+0x55ca
bb924a28 82a424bc 862388d8 85572e18 8623f0e8 aswTdi+0x7f8
bb924a40 90935310 8623f030 82a424bc 8623f030 nt!IofCallDriver+0x63
bb924a60 90900a0e 2b1c89ba bb924b20 00000001 aswRdr+0x310
bb924ab0 908ed542 00000000 908ed542 87a5c530 afd!AfdFastConnectionSend+0x2a6
bb924c28 82c608f7 87ec6701 00000001 02b5f8cc afd!AfdFastIoDeviceControl+0x53d
bb924cd0 82c634ac 85a89c10 0000024c 00000000 nt!IopXxxControlFile+0x2d0
bb924d04 82a4942a 00000240 0000024c 00000000 nt!NtDeviceIoControlFile+0x2a
bb924d04 774464f4 00000240 0000024c 00000000 nt!KiFastCallEntry+0x12a
02b5f920 00000000 00000000 00000000 00000000 0x774464f4

1: kd> u 8bc7e2cf
tcpip!TcpBeginTcbSend+0xa8b:
8bc7e2cf 83bd18ffffff00  cmp     dword ptr [ebp-0E8h],0

8bc7e2d6 0f84d1000000    je      tcpip!TcpBeginTcbSend+0xb68 (8bc7e3ad)
8bc7e2dc 8d85f8feffff    lea     eax,[ebp-108h]
8bc7e2e2 3bf8            cmp     edi,eax
8bc7e2e4 0f85c3000000    jne     tcpip!TcpBeginTcbSend+0xb68 (8bc7e3ad)
8bc7e2ea 83bd54ffffff00  cmp     dword ptr [ebp-0ACh],0
8bc7e2f1 0f84b6000000    je      tcpip!TcpBeginTcbSend+0xb68 (8bc7e3ad)
8bc7e2f7 f7433c00002000  test    dword ptr [ebx+3Ch],200000h

Debug session time: Sat Jan  9 19:42:50.817 2010 (GMT+0)
System Uptime: 0 days 3:07:23.612

3: kd> !analyze -v

BUGCODE_USB_DRIVER (fe)
USB Driver bugcheck, first parameter is USB bugcheck code.
Arguments:
Arg1: 00000006, USBBUGCODE_BAD_SIGNATURE An Internal data structure (object)
 has been corrupted.
Arg2: 864b20e0, Object address
Arg3: 4f444648, Signature that was expected
Arg4: 00000000

PROCESS_NAME:  System

CURRENT_IRQL:  2

STACK_TEXT: 
8d952b8c 90fa1025 000000fe 00000006 864b20e0 nt!KeBugCheckEx+0x1e
8d952ba8 90fa6672 864b20e0 4f444668 4f444648 USBPORT!USBPORT_AssertSig+0x20
8d952bc8 90fa4553 864b2028 85c57d10 82a8b334 USBPORT!USBPORT_FlushAdapterDBs+0x1b
8d952c00 90fa5178 00000001 856e3ab8 87fb98c0 USBPORT!USBPORT_Core_iCompleteDoneTransfer+0x3cb
8d952c2c 90fa89af 864b2028 864b20f0 864b2a98 USBPORT!USBPORT_Core_iIrpCsqCompleteDoneTransfer+0x33b
8d952c54 90fa2d18 864b2028 864b2a98 864b2002 USBPORT!USBPORT_Core_UsbIocDpc_Worker+0xbc
8d952c78 82ab33b5 864b2aa4 864b2002 00000000 USBPORT!USBPORT_Xdpc_Worker+0x173
8d952cd4 82ab3218 8d936120 8d93b800 00000000 nt!KiExecuteAllDpcs+0xf9
8d952d20 82ab3038 00000000 0000000e 00000000 nt!KiRetireDpcList+0xd5
8d952d24 00000000 0000000e 00000000 00000000 nt!KiIdleLoop+0x38

Debug session time: Sun Jan 10 04:06:19.856 2010 (GMT+0)
System Uptime: 0 days 0:23:05.651

1: kd> !analyze -v

PAGE_FAULT_IN_NONPAGED_AREA (50)

PROCESS_NAME:  FlightSimulatorB.exe

CURRENT_IRQL:  0

TRAP_FRAME:  a127fa30 -- (.trap 0xffffffffa127fa30)
ErrCode = 00000000
eax=a127fec8 ebx=00000000 ecx=00000011 edx=86488ba0 esi=86488b78 edi=00000000
eip=8b83b87d esp=a127faa4 ebp=a127fab8 iopl=0 nv up ei ng nz na po nc
cs=0008 ss=0010 ds=0023 es=0023 fs=0030 gs=0000 efl=00010282
fltmgr!TreeFindNodeOrParent+0×9:
8b83b87d 0885c974498b    or      byte ptr mcupdate_GenuineIntel!_NULL_IMPORT_DESCRIPTOR <PERF> (mcupdate_GenuineIntel+0×764c9) (8b4974c9)[ebp],al ss:0010:2c716f81=??

Resetting default scope

MISALIGNED_IP:
fltmgr!TreeFindNodeOrParent+9
8b83b87d 0885c974498b    or      byte ptr mcupdate_GenuineIntel!_NULL_IMPORT_DESCRIPTOR <PERF> (mcupdate_GenuineIntel+0x764c9) (8b4974c9)[ebp],al

STACK_TEXT: 
a127fa18 82a8d5f8 00000000 8b497414 00000000 nt!MmAccessFault+0x106
a127fa18 8b83b87d 00000000 8b497414 00000000 nt!KiTrap0E+0xdc
a127fab8 8b834340 86488ba4 86e5e458 00000000 fltmgr!TreeFindNodeOrParent+0x9
a127faf8 8b83440a 86488b78 86e5e458 00000000 fltmgr!GetContextFromStreamList+0x50
a127fb14 8b86c6da 86e5e458 86488b78 a127fb40 fltmgr!FltGetStreamContext+0x34
a127fb44 8b866b35 87f30718 a127fb98 a127fba8 fileinfo!FIStreamGet+0x36
a127fbac 8b833aeb 87f30718 a127fbcc a127fbf8 fileinfo!FIPreReadWriteCallback+0xf1
a127fc18 8b83617b a127fc54 85cfd738 a127fcac fltmgr!FltpPerformPreCallbacks+0x34d
a127fc30 8b848c37 0027fc54 8b848ad4 00000000 fltmgr!FltpPassThroughFastIo+0x3d
a127fc74 82c96b32 85cfd738 a127fcb4 00001000 fltmgr!FltpFastIoRead+0x163
a127fd08 82a8a42a 86e484c0 00000000 00000000 nt!NtReadFile+0x2d5
a127fd08 775864f4 86e484c0 00000000 00000000 nt!KiFastCallEntry+0x12a
WARNING: Frame IP not in any known module. Following frames may be wrong.
0202fc8c 00000000 00000000 00000000 00000000 0x775864f4

IMAGE_NAME:  hardware

1: kd> u fltmgr!TreeFindNodeOrParent
fltmgr!TreeFindNodeOrParent:
8b83b874 8bff            mov     edi,edi
8b83b876 55              push    ebp
8b83b877 8bec            mov     ebp,esp
8b83b879 8b4508          mov     eax,dword ptr [ebp+8]
8b83b87c 8b08            mov     ecx,dword ptr [eax]

8b83b87e 85c9            test    ecx,ecx
8b83b880 7449            je      fltmgr!TreeFindNodeOrParent+0×57 (8b83b8cb)
8b83b882 8b5510          mov     edx,dword ptr [ebp+10h]

1: kd> ub 8b834340
fltmgr!GetContextFromStreamList+0x37:
8b834327 8bcb            mov     ecx,ebx
8b834329 ff15a4d0838b    call    dword ptr [fltmgr!_imp_ExfAcquirePushLockShared (8b83d0a4)]
8b83432f 33db            xor     ebx,ebx
8b834331 895dfc          mov     dword ptr [ebp-4],ebx
8b834334 ff7510          push    dword ptr [ebp+10h]
8b834337 ff750c          push    dword ptr [ebp+0Ch]
8b83433a 57              push    edi
8b83433b e896750000      call    fltmgr!TreeLookup (8b83b8d6)

1: kd> uf 8b83b8d6
fltmgr!TreeLookup:
8b83b8d6 8bff            mov     edi,edi
8b83b8d8 55              push    ebp
8b83b8d9 8bec            mov     ebp,esp
8b83b8db 8d4510          lea     eax,[ebp+10h]
8b83b8de 50              push    eax
8b83b8df ff7510          push    dword ptr [ebp+10h]
8b83b8e2 ff750c          push    dword ptr [ebp+0Ch]
8b83b8e5 ff7508          push    dword ptr [ebp+8]
8b83b8e8 e887ffffff      call    fltmgr!TreeFindNodeOrParent (8b83b874)
8b83b8ed 48              dec     eax
8b83b8ee f7d8            neg     eax
8b83b8f0 1bc0            sbb     eax,eax
8b83b8f2 f7d0            not     eax
8b83b8f4 234510          and     eax,dword ptr [ebp+10h]
8b83b8f7 5d              pop     ebp
8b83b8f8 c20c00          ret     0Ch

With fix-privet,
Dr. DebugLove

- Dmitry Vostokov @ DumpAnalysis.org + TraceAnalysis.org -

Crash Dump Analysis Patterns (Part 97)

Tuesday, March 16th, 2010

In the case of multiple different faults like bugchecks and/or different crash points, stack traces and modules we can look at what is common among them. It could be their process context, which can easily be seen from the default analysis:

1: kd> !analyze -v

[...]

PROCESS_NAME:  Application.exe

We give this pattern a name Fault Context. Then we can look whether an application is resource consumption intensive (could implicate hardware faults) like games and simulators or uses its own drivers (implicates latent corruption). In a production environment it can also be removed if it is functionally non-critical and can be avoided or replaced. See also a forthcoming case study.

With fix-privet,
Dr. DebugLove

- Dmitry Vostokov @ DumpAnalysis.org + TraceAnalysis.org -

On The Same Page (Debugging Slang, Part 8)

Monday, March 15th, 2010

On The Same Page - coming to the same conclusion as another engineer when looking at a memory dump or a software trace. Literally means the same page of memory where an exception occurred or a stack trace is reconstructed or the same “page” when browsing a software trace output using a viewer.

Examples: Aha, we are on the same page!

With fix-privet,
Dr. DebugLove

- Dmitry Vostokov @ DumpAnalysis.org + TraceAnalysis.org -

Icons for Memory Dump Analysis Patterns (Part 6)

Monday, March 15th, 2010

Today we introduce an icon for Lateral Damage pattern:

B/W

Color

With fix-privet,
Dr. DebugLove

- Dmitry Vostokov @ DumpAnalysis.org + TraceAnalysis.org -

Prescriptive Value-Added Debugging

Saturday, March 13th, 2010

This is a new methodology I’m working upon. The idea came from reading “About the Author” page in a book I got yesterday in my post:

The Nomadic Developer: Surviving and Thriving in the World of Technology Consulting

Buy from Amazon

I post a review here and on Amazon when finished reading. Just a few words now. This is the first career book I’m reading where I find pages in roman numerals useful. The page xiii itself looks like a good template (or an example) for a business-oriented CV summary. Thinking now about updating my CV book (2nd edition?):

Resume and CV: As a Book

Buy from Amazon

With fix-privet,
Dr. DebugLove

- Dmitry Vostokov @ DumpAnalysis.org + TraceAnalysis.org -

Icons for Memory Dump Analysis Patterns (Part 5)

Friday, March 12th, 2010

Today we introduce an icon for False Positive Dump pattern:

B/W

Color

With fix-privet,
Dr. DebugLove

- Dmitry Vostokov @ DumpAnalysis.org + TraceAnalysis.org -

Icons for Memory Dump Analysis Patterns (Part 4)

Thursday, March 11th, 2010

Today we introduce an icon for Dynamic Memory Corruption (kernel pool) pattern:

B/W

Color

- Dmitry Vostokov @ DumpAnalysis.org + TraceAnalysis.org -

Icons for Memory Dump Analysis Patterns (Part 3)

Wednesday, March 10th, 2010

Today we introduce an icon for Dynamic Memory Corruption (process heap) pattern:

B/W

Color

Another alternative I considered to use is a chain metaphor but decided that it is more appropriate for linked lists.

- Dmitry Vostokov @ DumpAnalysis.org + TraceAnalysis.org -

Icons for Memory Dump Analysis Patterns (Part 2)

Tuesday, March 9th, 2010

Today we introduce an icon for Multiple Exceptions (kernel mode) pattern:

B/W

Color

- Dmitry Vostokov @ DumpAnalysis.org + TraceAnalysis.org -

Icons for Memory Dump Analysis Patterns (Part 1)

Monday, March 8th, 2010

I borrowed a pattern icon idea from the book I’m reading now: Algorithms in a Nutshell

Buy from Amazon

From now on, every memory dump analysis pattern (and later trace analysis patterns) will have platform-independent pictorial representation. Today we introduce an icon for Multiple Exceptions (user mode) pattern:

B/W

 

Color

- Dmitry Vostokov @ DumpAnalysis.org + TraceAnalysis.org -

Decomposing Memory Dumps via DumpFilter

Saturday, February 27th, 2010

This post was motivated during my work on a memory dump differing tool called DumpLogic that can do logical and arithmetic operations between memory snapshots, for example, take a difference between them for further visualization. This tool is forthcoming next week and it resulted in another simple tool called DumpFilter. The latter allows to filter certain unsigned integer (DWORD) values from a memory dump (or any binary file) by replacing them with 0xFFFFFFFF and all other values with 0×00000000. The resultant binary file can be visualized by any data visualization package or transformed to a bitmap file using Dump2Picture to see distribution of filtered values.

As a filtering example I used TestDefaultDebugger64 to generate a process user memory dump. It was converted to a BMP file by Dump2Picture: 

Then I filtered only AV exception code 0xc0000005:

C:\>DumpFilter tdd64.dmp tdd64.bin <dwords.txt

dwords.txt just contained one line 

c0000005

It is possible to filter many values. Just put more lines to dwords.txt file. tdd64.bin was converted to tdd64.bmp by Dump2Picture:

C:\>Dump2Picture tdd64.bin tdd64.bmp

Because the image had only black and while RGBA colors I saved it as a B/W bitmap (click to enlarge, it is a 3236×3236 1.3Mb bitmap):

 

Every EV exception code is a white dot there but it is difficult to see them unless magnified. So I enlarged them manually on the following map:

I put them on the original image too. We can see that exception processing spans many areas:

The tool and the sample dwords.txt file (for c0000005 and 80000003) can be downloaded from this location:

Download DumpFilter

Another example: Night Sky memory space art image is just a fragment after filtering all 1 values from another process memory dump.

- Dmitry Vostokov @ DumpAnalysis.org + TraceAnalysis.org -

Inconsistent dump, stack trace collection, LPC, thread, process, executive resource wait chains, missing threads and waiting thread time: pattern cooperation

Saturday, February 27th, 2010

Here is a synthetic case study to show various wait chain patterns. The complete memory dump from a frozen system is inconsistent, saved by LiveKd. Stack trace collection shows many threads waiting for LPC replies:

THREAD 87432520  Cid 0b10.0dd8  Teb: 7ff83000 Win32Thread: 00000000 WAIT: (Unknown) UserMode Non-Alertable
    8743270c  Semaphore Limit 0x1
Waiting for reply to LPC MessageId 007ade85:
Current LPC port d676e560
Not impersonating
DeviceMap                 d6f05be8
Owning Process            87581c78       Image:         ProcessA.exe
Attached Process          N/A            Image:         N/A
Wait Start TickCount      4093415        Ticks: 659565 (0:02:51:45.703)
Context Switch Count      111255            
UserTime                  00:00:27.781
KernelTime                00:00:02.015
Win32 Start Address MSVCR71!_threadstartex (0×7c3494f6)
Start Address kernel32!BaseThreadStartThunk (0×77e617ec)
Stack Init b6439000 Current b6438c08 Base b6439000 Limit b6436000 Call 0
Priority 8 BasePriority 8 PriorityDecrement 0
Kernel stack not resident.
ChildEBP RetAddr 
b6438c10 00000000 0×0

Checking MessageId by using !lpc command gives us the following LPC server thread that is waiting for a mutant owned by thread 866d63e8 (this is equivalent that thread 85b209d0 is waiting for thread 866d63e8):

THREAD 85b209d0  Cid 1524.2770  Teb: 7ff78000 Win32Thread: 00000000 WAIT: (Unknown) UserMode Non-Alertable
    85e9ba00  Mutant - owning thread 866d63e8
Not impersonating
DeviceMap                 d64008d8
Owning Process            87200880       Image:         ProcessB.exe
Attached Process          N/A            Image:         N/A
Wait Start TickCount      4093415        Ticks: 659565 (0:02:51:45.703)
Context Switch Count      28            
UserTime                  00:00:00.000
KernelTime                00:00:00.000
Win32 Start Address 0×007ade85
LPC Server thread working on message Id 7ade85
Start Address kernel32!BaseThreadStartThunk (0×77e617ec)
Stack Init b42e5000 Current b42e4c60 Base b42e5000 Limit b42e2000 Call 0
Priority 10 BasePriority 10 PriorityDecrement 0
Kernel stack not resident.
ChildEBP RetAddr 
b42e4c68 00000000 0xa000a02

Thread 866d63e8 is waiting for a process 86b33b30:

THREAD 866d63e8  Cid 1524.3984  Teb: 7ff6e000 Win32Thread: 00000000 WAIT: (Unknown) UserMode Non-Alertable
    86b33b30  ProcessObject
    866d6460  NotificationTimer
Not impersonating
DeviceMap                 d64008d8
Owning Process            87200880       Image:         ProcessB.exe
Attached Process          N/A            Image:         N/A
Wait Start TickCount      4755080     
Context Switch Count      4767            
UserTime                  00:00:00.015
KernelTime                00:00:00.000
Win32 Start Address 0×007a5051
LPC Server thread working on message Id 7a5051
Start Address kernel32!BaseThreadStartThunk (0×77e617ec)
Stack Init ab459000 Current ab458c60 Base ab459000 Limit ab456000 Call 0
Priority 10 BasePriority 10 PriorityDecrement 0
ChildEBP RetAddr 
ab458c78 8083d5b1 nt!KiSwapContext+0×26
ab458ca4 8083df9e nt!KiSwapThread+0×2e5
ab458cec 8092ae67 nt!KeWaitForSingleObject+0×346
ab458d50 80833bef nt!NtWaitForSingleObject+0×9a
ab458d50 7c82860c nt!KiFastCallEntry+0xfc (TrapFrame @ ab458d64)
0499fbec 7c827d29 ntdll!KiFastSystemCallRet
0499fbf0 76548721 ntdll!ZwWaitForSingleObject+0xc
0499fc0c 7654aa54 ProcessB!WaitForProcess+0×4a
[…]
0499ffec 00000000 kernel32!BaseThreadStart+0×34

We find the following thread in the process 86b33b30 (there is only one thread left where we expect many of them in ProcessC):

THREAD 85a6fdb0  Cid 5c0c.26f4  Teb: 7ffdf000 Win32Thread: b9b778a0 WAIT: (Unknown) KernelMode Non-Alertable
    86686030  SynchronizationEvent
    85a6fe28  NotificationTimer
Not impersonating
DeviceMap                 d6767248
Owning Process            86b33b30       Image:         ProcessC.EXE
Attached Process          N/A            Image:         N/A
Wait Start TickCount      4755681     
Context Switch Count      77668                 LargeStack
UserTime                  00:00:00.437
KernelTime                00:00:04.421
*** ERROR: Module load completed but symbols could not be loaded for ProcessC.EXE
Win32 Start Address 0×300010cc
Start Address kernel32!BaseProcessStartThunk (0×77e617f8)
Stack Init 9ad92000 Current 9ad91a40 Base 9ad92000 Limit 9ad8d000 Call 0
Priority 10 BasePriority 10 PriorityDecrement 0
ChildEBP RetAddr 
9ad91a58 8083d5b1 nt!KiSwapContext+0×26
9ad91a84 8083df9e nt!KiSwapThread+0×2e5
9ad91acc 8081e05b nt!KeWaitForSingleObject+0×346
9ad91b08 8082e012 nt!ExpWaitForResource+0xd5
9ad91b28 b6a9c1ee nt!ExAcquireResourceExclusiveLite+0×8d
WARNING: Stack unwind information not available. Following frames may be wrong.
9ad91b38 b6ab09d3 DriverA+0×21ee
[…]
9ad91c3c 80840153 DriverA+0×1ed6

9ad91c50 8092b51f nt!IofCallDriver+0×45
9ad91c64 8092b454 nt!IopSynchronousServiceTail+0×10b
9ad91d00 8092b574 nt!IopXxxControlFile+0×60f
9ad91d34 80833bef nt!NtDeviceIoControlFile+0×2a
9ad91d34 7c82860c nt!KiFastCallEntry+0xfc (TrapFrame @ 9ad91d64)
0012d4e0 00000000 ntdll!KiFastSystemCallRet

It is blocked by DriverA waiting for an executive resource. Fortunately !locks command works for this inconsistent dump and we finally reach thread 86ba5638:

0: kd> !locks

Resource @ 0x85610d30    Exclusively owned
    Contention Count = 4
    NumberOfExclusiveWaiters = 1
     Threads: 86ba5638-01<*>
     Threads Waiting On Exclusive Access:
              85a6fdb0

This thread belongs to another instance of ProcessC.exe (different process 8690dd88):

0: kd> !thread 86ba5638 1f
THREAD 86ba5638  Cid 186c.1574  Teb: 7ffdf000 Win32Thread: b9a28a70 WAIT: (Unknown) KernelMode Non-Alertable
    8944e3d4  SynchronizationEvent
Not impersonating
DeviceMap                 d6767248
Owning Process            8690dd88       Image:         ProcessC.EXE
Attached Process          N/A            Image:         N/A
Wait Start TickCount      4074148        Ticks: 678832 (0:02:56:46.750)
Context Switch Count      95896                 LargeStack
UserTime                  00:00:00.593
KernelTime                00:00:05.343
*** ERROR: Module load completed but symbols could not be loaded for ProcessC.EXE
Win32 Start Address 0×300010cc
Start Address kernel32!BaseProcessStartThunk (0×77e617f8)
Stack Init 99ef2000 Current 99ef19c0 Base 99ef2000 Limit 99eec000 Call 0
Priority 10 BasePriority 10 PriorityDecrement 0
ChildEBP RetAddr 
99ef19d8 8083d5b1 nt!KiSwapContext+0×26
99ef1a04 8083df9e nt!KiSwapThread+0×2e5
99ef1a4c 80853f3f nt!KeWaitForSingleObject+0×346
99ef1a64 8081d45f nt!KiAcquireFastMutex+0×13
99ef1a70 b6a9c70d nt!ExAcquireFastMutex+0×20
WARNING: Stack unwind information not available. Following frames may be wrong.
99ef1a7c b6aaf22a DriverA+0×270d
[…]
99ef1c3c 80840153 DriverA+0×1ed6

99ef1c50 8092b51f nt!IofCallDriver+0×45
99ef1c64 8092b454 nt!IopSynchronousServiceTail+0×10b
99ef1d00 8092b574 nt!IopXxxControlFile+0×60f
99ef1d34 80833bef nt!NtDeviceIoControlFile+0×2a
99ef1d34 7c82860c nt!KiFastCallEntry+0xfc (TrapFrame @ 99ef1d64)
0012db08 00000000 ntdll!KiFastSystemCallRet

We see this thread is also blocked by DriverA. We also check thread waiting times. All threads involved in wait chains have their Ticks value less than 86ba5638. This means that thread 86ba5638 was blocked earlier. We contact DriverA vendor for any possible updates.

- Dmitry Vostokov @ DumpAnalysis.org + TraceAnalysis.org -

Crash Dump Analysis Patterns (Part 96)

Sunday, February 21st, 2010

For certain stack traces we should always be aware of Coincidental Frames similar to Coincidental Symbolic Information pattern for raw stack data. Such frames can lead to a wrong analysis conclusion. Consider this stack trace fragment from a kernel memory dump:

0: kd> kL 100
ChildEBP RetAddr
9c5b6550 8082d9a4 nt!KeBugCheckEx+0×1b
9c5b6914 8088befa nt!KiDispatchException+0×3a2
9c5b697c 8088beaent!CommonDispatchException+0×4a
9c5b699c 80a6056dnt!KiExceptionExit+0×186
9c5b69a0 80893ae2hal!KeReleaseQueuedSpinLock+0×2d
9c5b6a08 b20c3de5 nt!MiFreePoolPages+0×7dc
WARNING: Stack unwind information not available. Following frames may be wrong.
9c5b6a48 b20c4107 DeriverA+0×17de5
[…]

The frame with MiFreePoolPages symbol might suggest some sort of a pool corruption. We can even double check return addresses and see the valid common sense assembly language code:

0: kd> ub 8088beae
nt!KiExceptionExit+0×167:
8088be8f 33c9            xor     ecx,ecx
8088be91 e81a000000      call    nt!CommonDispatchException (8088beb0)
8088be96 33d2            xor     edx,edx
8088be98 b901000000      mov     ecx,1
8088be9d e80e000000      call    nt!CommonDispatchException (8088beb0)
8088bea2 33d2            xor     edx,edx
8088bea4 b902000000      mov     ecx,2
8088bea9 e802000000      call    nt!CommonDispatchException (8088beb0)

0: kd> ub 80a6056d
hal!KeReleaseQueuedSpinLock+0×1b:
80a6055b 7511            jne     hal!KeReleaseQueuedSpinLock+0×2e (80a6056e)
80a6055d 50              push    eax
80a6055e f00fb119        lock cmpxchg dword ptr [ecx],ebx
80a60562 58              pop     eax
80a60563 7512            jne     hal!KeReleaseQueuedSpinLock+0×37 (80a60577)
80a60565 5b              pop     ebx
80a60566 8aca            mov     cl,dl
80a60568 e8871e0000      call    hal!KfLowerIrql (80a623f4)

0: kd> ub 80893ae2
nt!MiFreePoolPages+0×7c3:
80893ac9 761c            jbe     nt!MiFreePoolPages+0×7e1 (80893ae7)
80893acb ff75f8          push    dword ptr [ebp-8]
80893ace ff7508          push    dword ptr [ebp+8]
80893ad1 e87ea1fcff      call    nt!MiFreeNonPagedPool (8085dc54)
80893ad6 8a55ff          mov     dl,byte ptr [ebp-1]
80893ad9 6a0f            push    0Fh
80893adb 59              pop     ecx
80893adc ff1524118080    call    dword ptr [nt!_imp_KeReleaseQueuedSpinLock (80801124)]

0: kd> ub b20c3de5
DriverA+0×17dcf:
b20c3dcf 51              push    ecx
b20c3dd0 ff5010          call    dword ptr [eax+10h]
b20c3dd3 eb10            jmp     DriverA+0×17de5 (b20c3de5)
b20c3dd5 8b5508          mov     edx,dword ptr [ebp+8]
b20c3dd8 52              push    edx
b20c3dd9 8d86a0000000    lea     eax,[esi+0A0h]
b20c3ddf 50              push    eax
b20c3de0 e8ebf1ffff      call    DriverA+0×16fd0 (b20c2fd0)

However, if we try to reconstruct the stack trace manually we would naturally skip these 3 frames (shown in magenta):

9c5b6550 8082d9a4 nt!KeBugCheckEx+0x1b
9c5b6914 8088befa nt!KiDispatchException+0x3a2
9c5b697c 8088beae nt!CommonDispatchException+0x4a
9c5b699c 80a6056d nt!KiExceptionExit+0×186
9c5b69a0 80893ae2 hal!KeReleaseQueuedSpinLock+0×2d
9c5b6a08 b20c3de5 nt!MiFreePoolPages+0×7dc

9c5b6a48 b20c4107 DeriverA+0×17de5
[…]

0: kd> !thread
THREAD 8f277020  Cid 081c.7298  Teb: 7ff11000 Win32Thread: 00000000 RUNNING on processor 0
IRP List:
    8e234b60: (0006,0094) Flags: 00000000  Mdl: 00000000
Not impersonating
DeviceMap                 e1002880
Owning Process            8fc78b80       Image:         ProcessA.exe
Attached Process          N/A            Image:         N/A
Wait Start TickCount      49046879       Ticks: 0
Context Switch Count      10            
UserTime                  00:00:00.000
KernelTime                00:00:00.000
Win32 Start Address DllA!ThreadA (0x7654dc90)
Start Address kernel32!BaseThreadStartThunk (0x77e617dc)
Stack Init 9c5b7000 Current 9c5b6c50 Base 9c5b7000 Limit 9c5b4000 Call 0
Priority 10 BasePriority 10 PriorityDecrement 0
ChildEBP RetAddr  Args to Child             
[…]

0: kd> dds 9c5b4000 9c5b7000
9c5b4000  00000000
9c5b4004  00000000
9c5b4008  00000000
[...]
9c5b6290  ffdff13c
9c5b6294  9c5b6550
9c5b6298  80827e01 nt!KeBugCheckEx+0×1b
9c5b629c  00000008
9c5b62a0  00000286
[…]
9c5b654c  00000000
9c5b6550  9c5b6914
9c5b6554  8082d9a4 nt!KiDispatchException+0×3a2
9c5b6558  0000008e
9c5b655c  c0000005
[…]
9c5b6910  ffffffff
9c5b6914  9c5b6984
9c5b6918  8088befa nt!CommonDispatchException+0×4a
9c5b691c  9c5b6930
9c5b6920  00000000
[…]
9c5b6980  8088beae nt!KiExceptionExit+0×186
9c5b6984  9c5b6a08
9c5b6988  b20c3032 DriverA+0×17032
9c5b698c  badb0d00
9c5b6990  00000006
9c5b6994  8dc11cec
9c5b6998  808b6900 nt!KiTimerTableLock+0×3c0
9c5b699c  9c5b69d4
9c5b69a0  80a6056d hal!KeReleaseQueuedSpinLock+0×2d
9c5b69a4  80893ae2 nt!MiFreePoolPages+0×7dc

9c5b69a8  808b0b40 nt!NonPagedPoolDescriptor
9c5b69ac  03151fd0
9c5b69b0  00000000
9c5b69b4  00000000
[…]
9c5b6a04  8f47123b
9c5b6a08  9c5b6a48
9c5b6a0c  b20c3de5 DriverA+0×17de5
9c5b6a10  8e3640a0
9c5b6a14  8f4710d0
[…]
9c5b6a44  00000000
9c5b6a48  9c5b6a80
9c5b6a4c  b20c4107 DriverA+0×18107
9c5b6a50  8f4710d0
9c5b6a54  9c5b6a6c
[…]

If we try to find a pointer to the exception record we get this crash address:

0: kd> .exr 9c5b6930
ExceptionAddress: b20c3032 (DriverA+0×00017032)
   ExceptionCode: c0000005 (Access violation)
  ExceptionFlags: 00000000
NumberParameters: 2
   Parameter[0]: 00000000
   Parameter[1]: 00000157
Attempt to read from address 00000157

If we disassemble it we see an inlined string or memory copy, perhaps wcscpy function:

0: kd> u b20c3032
DriverA+0×17032:
b20c3032 f3a5            rep movs dword ptr es:[edi],dword ptr [esi]
b20c3034 8bcb            mov     ecx,ebx
b20c3036 83e103          and     ecx,3
b20c3039 f3a4            rep movs byte ptr es:[edi],byte ptr [esi]
b20c303b 8b750c          mov     esi,dword ptr [ebp+0Ch]
b20c303e 0fb7ca          movzx   ecx,dx
b20c3041 894e14          mov     dword ptr [esi+14h],ecx
b20c3044 8b700c          mov     esi,dword ptr [eax+0Ch]

So the problem happened in DriverA code, not in MiFreePoolPages or KeReleaseQueuedSpinLock.

- Dmitry Vostokov @ DumpAnalysis.org + TraceAnalysis.org -

Bus Debugging

Friday, February 19th, 2010

This post is not about debugging a computer bus. It is about debugging on a bus. More correctly, it is about debugging software running on a bus, not on a computer bus but on a real bus. A few days ago I was on a bus leaving Dublin bus station to Dublin airport. Looking around inside the bus I noticed one monitor had a characteristic Windows XP-style message box of an access violation. It was just before disembarking the bus so I made a mental effort to memorize the referenced memory address: 0×4000 and the instruction address: x73f18a09. The application name was bb.exe. Google search for 73f10000 module load address points to this one:

ModLoad: 73f10000 73f6c000   C:\WINDOWS\system32\DSOUND.dll

Not really a debugging (there’s no fix from me) so it can be named as a bus analysis exercise.

- Dmitry Vostokov @ DumpAnalysis.org + TraceAnalysis.org -

Crash Dump Analysis Patterns (Part 42f)

Tuesday, February 16th, 2010

Here we show an example of a wait chain involving process objects. This Wait Chain pattern variation is similar to threads waiting for thread objects. When looking at stack trace collection from a complete memory dump file we see several threads in a set of processes are blocked in ALPC wait chain:

THREAD fffffa80110b8700  Cid 12f8.1328  Teb: 000000007ef9a000 Win32Thread: 0000000000000000 WAIT: (WrLpcReply) UserMode Non-Alertable
    fffffa80110b8a90  Semaphore Limit 0x1
Waiting for reply to ALPC Message fffff8801c7096e0 : queued at port fffffa8010c9d9a0 : owned by process fffffa80109c8c10
Not impersonating
DeviceMap                 fffff880097ce5e0
Owning Process            fffffa80110ad510       Image:         ProcessA.exe
Attached Process          N/A            Image:         N/A
Wait Start TickCount      14004580       Ticks: 62149 (0:00:16:09.530)
Context Switch Count      25100            
UserTime                  00:00:00.421
KernelTime                00:00:00.218
Win32 Start Address 0×0000000074ca29e1
Stack Init fffffa6003bc4db0 Current fffffa6003bc4670
Base fffffa6003bc5000 Limit fffffa6003bbf000 Call 0
Priority 8 BasePriority 8 PriorityDecrement 0 IoPriority 2 PagePriority 5
Child-SP          RetAddr           Call Site
fffffa60`03bc46b0 fffff800`01cba0fa nt!KiSwapContext+0×7f
fffffa60`03bc47f0 fffff800`01caedab nt!KiSwapThread+0×13a
fffffa60`03bc4860 fffff800`01ce4e72 nt!KeWaitForSingleObject+0×2cb
fffffa60`03bc48f0 fffff800`01f32f34 nt!AlpcpSignalAndWait+0×92
fffffa60`03bc4980 fffff800`01f2f9c6 nt!AlpcpReceiveSynchronousReply+0×44
fffffa60`03bc49e0 fffff800`01f1f52f nt!AlpcpProcessSynchronousRequest+0×24f
fffffa60`03bc4b00 fffff800`01cb7973 nt!NtAlpcSendWaitReceivePort+0×19f
fffffa60`03bc4bb0 00000000`7713756a nt!KiSystemServiceCopyEnd+0×13 (TrapFrame @ fffffa60`03bc4c20)
00000000`016ee5b8 00000000`74f9993f ntdll!ZwAlpcSendWaitReceivePort+0xa
00000000`016ee5c0 00000000`74f8a996 wow64!whNtAlpcSendWaitReceivePort+0×5f
00000000`016ee610 00000000`75183688 wow64!Wow64SystemServiceEx+0xca
00000000`016eeec0 00000000`74f8ab46 wow64cpu!ServiceNoTurbo+0×28
00000000`016eef50 00000000`74f8a14c wow64!RunCpuSimulation+0xa
00000000`016eef80 00000000`771605a8 wow64!Wow64LdrpInitialize+0×4b4
00000000`016ef4e0 00000000`771168de ntdll! ?? ::FNODOBFM::`string’+0×20aa1
00000000`016ef590 00000000`00000000 ntdll!LdrInitializeThunk+0xe

1: kd> !alpc /m fffff8801c7096e0

Message @ fffff8801c7096e0
  MessageID             : 0x263C (9788)
  CallbackID            : 0x29F2A02 (43985410)
  SequenceNumber        : 0x000009FE (2558)
  Type                  : LPC_REQUEST
  DataLength            : 0x0058 (88)
  TotalLength           : 0x0080 (128)
  Canceled              : No
  Release               : No
  ReplyWaitReply        : No
  Continuation          : Yes
  OwnerPort             : fffffa8015128040 [ALPC_CLIENT_COMMUNICATION_PORT]
  WaitingThread         : fffffa80110b8700
  QueueType             : ALPC_MSGQUEUE_PENDING
  QueuePort             : fffffa8010c9d9a0 [ALPC_CONNECTION_PORT]
  QueuePortOwnerProcess : fffffa80109c8c10 (ProcessB.exe)
  ServerThread          : fffffa8013b87bb0
  QuotaCharged          : No
  CancelQueuePort       : 0000000000000000
  CancelSequencePort    : 0000000000000000
  CancelSequenceNumber  : 0×00000000 (0)
  ClientContext         : 0000000009b49208
  ServerContext         : 0000000000000000
  PortContext           : 000000000280f0d0
  CancelPortContext     : 0000000000000000
  SecurityData          : 0000000000000000
  View                  : 0000000000000000

If we look at process fffffa80109c8c10 and its thread fffffa8013b87bb0 we would see that it is blocked as well on some kind of a lock:

THREAD fffffa8013b87bb0  Cid 0358.2c60  Teb: 000007fffff7e000 Win32Thread: 0000000000000000 WAIT: (UserRequest) UserMode Non-Alertable
    fffffa8010bca370  Semaphore Limit 0x7fffffff
    fffffa8013b87c68  NotificationTimer
Impersonation token:  fffff8801e614060 (Level Impersonation)
DeviceMap                 fffff880097ce5e0
Owning Process            fffffa80109c8c10       Image:         ProcessB.exe
Attached Process          N/A            Image:         N/A
Wait Start TickCount      14004580       Ticks: 62149 (0:00:16:09.530)
Context Switch Count      134            
UserTime                  00:00:00.000
KernelTime                00:00:00.000
Win32 Start Address RPCRT4!ThreadStartRoutine (0x000007feff267780)
Stack Init fffffa6035a1fdb0 Current fffffa6035a1f940
Base fffffa6035a20000 Limit fffffa6035a1a000 Call 0
Priority 11 BasePriority 10 PriorityDecrement 0 IoPriority 2 PagePriority 5
Child-SP          RetAddr           Call Site
fffffa60`35a1f980 fffff800`01cba0fa nt!KiSwapContext+0x7f
fffffa60`35a1fac0 fffff800`01caedab nt!KiSwapThread+0x13a
fffffa60`35a1fb30 fffff800`01f1d608 nt!KeWaitForSingleObject+0x2cb
fffffa60`35a1fbc0 fffff800`01cb7973 nt!NtWaitForSingleObject+0x98
fffffa60`35a1fc20 00000000`77136d5a nt!KiSystemServiceCopyEnd+0x13 (TrapFrame @ fffffa60`35a1fc20)
00000000`0486ec28 00000000`770f559f ntdll!ZwWaitForSingleObject+0xa
00000000`0486ec30 00000000`ff77d4e9 ntdll!RtlAcquireResourceShared+0xd1
00000000`0486ec70 00000000`ff77fb4d ProcessB!CLock::CLock+0×61
[…]
00000000`0486eee0 000007fe`ff261f46 RPCRT4!Invoke+0×65
00000000`0486ef40 000007fe`ff26254d RPCRT4!NdrStubCall2+0×348
00000000`0486f520 000007fe`ff2868d4 RPCRT4!NdrServerCall2+0×1d
00000000`0486f550 000007fe`ff2869f0 RPCRT4!DispatchToStubInCNoAvrf+0×14
00000000`0486f580 000007fe`ff287402 RPCRT4!RPC_INTERFACE::DispatchToStubWorker+0×100
00000000`0486f670 000007fe`ff287080 RPCRT4!LRPC_SCALL::DispatchRequest+0×1c2
00000000`0486f6e0 000007fe`ff2862bb RPCRT4!LRPC_SCALL::HandleRequest+0×200
00000000`0486f800 000007fe`ff285e1a RPCRT4!LRPC_ADDRESS::ProcessIO+0×44a
00000000`0486f920 000007fe`ff267769 RPCRT4!LOADABLE_TRANSPORT::ProcessIOEvents+0×24a
00000000`0486f9d0 000007fe`ff267714 RPCRT4!ProcessIOEventsWrapper+0×9
00000000`0486fa00 000007fe`ff2677a4 RPCRT4!BaseCachedThreadRoutine+0×94
00000000`0486fa40 00000000`76fdbe3d RPCRT4!ThreadStartRoutine+0×24
00000000`0486fa70 00000000`77116a51 kernel32!BaseThreadInitThunk+0xd
00000000`0486faa0 00000000`00000000 ntdll!RtlUserThreadStart+0×1d

There are many such threads and inspection of all threads in the process fffffa80109c8c10 reveals another thread waiting for an ALPC reply:

THREAD fffffa8010c9b060  Cid 0358.02ac  Teb: 000007fffffd3000 Win32Thread: 0000000000000000 WAIT: (WrLpcReply) UserMode Non-Alertable
    fffffa8010c9b3f0  Semaphore Limit 0x1
    Waiting for reply to ALPC Message fffff88011994cf0 : queued at port fffffa8010840360 : owned by process fffffa801083e120
Not impersonating
DeviceMap                 fffff880000073d0
Owning Process            fffffa80109c8c10       Image:         ProcessB.exe
Attached Process          N/A            Image:         N/A
Wait Start TickCount      13986969       Ticks: 79760 (0:00:20:44.263)
Context Switch Count      712            
UserTime                  00:00:00.000
KernelTime                00:00:00.000
Win32 Start Address ntdll!TppWorkerThread (0×0000000077107cb0)
Stack Init fffffa6004bfbdb0 Current fffffa6004bfb670
Base fffffa6004bfc000 Limit fffffa6004bf6000 Call 0
Priority 10 BasePriority 10 PriorityDecrement 0 IoPriority 2 PagePriority 5
Kernel stack not resident.
Child-SP          RetAddr           Call Site
fffffa60`04bfb6b0 fffff800`01cba0fa nt!KiSwapContext+0×7f
fffffa60`04bfb7f0 fffff800`01caedab nt!KiSwapThread+0×13a
fffffa60`04bfb860 fffff800`01ce4e72 nt!KeWaitForSingleObject+0×2cb
fffffa60`04bfb8f0 fffff800`01f32f34 nt!AlpcpSignalAndWait+0×92
fffffa60`04bfb980 fffff800`01f2f9c6 nt!AlpcpReceiveSynchronousReply+0×44
fffffa60`04bfb9e0 fffff800`01f1f52f nt!AlpcpProcessSynchronousRequest+0×24f
fffffa60`04bfbb00 fffff800`01cb7973 nt!NtAlpcSendWaitReceivePort+0×19f
fffffa60`04bfbbb0 00000000`7713756a nt!KiSystemServiceCopyEnd+0×13 (TrapFrame @ fffffa60`04bfbc20)
00000000`00c3f2f8 00000000`771872c9 ntdll!ZwAlpcSendWaitReceivePort+0xa
[…]
00000000`00c3f810 00000000`77107fd0 ntdll!RtlpTpWorkCallback+0xf2
00000000`00c3f8c0 00000000`76fdbe3d ntdll!TppWorkerThread+0×3d6
00000000`00c3fb40 00000000`77116a51 kernel32!BaseThreadInitThunk+0xd
00000000`00c3fb70 00000000`00000000 ntdll!RtlUserThreadStart+0×1d

1: kd> !alpc /m fffff88011994cf0

Message @ fffff88011994cf0
  MessageID             : 0x033C (828)
  CallbackID            : 0x29CEF57 (43839319)
  SequenceNumber        : 0x000000D8 (216)
  Type                  : LPC_REQUEST
  DataLength            : 0x000C (12)
  TotalLength           : 0x0034 (52)
  Canceled              : No
  Release               : No
  ReplyWaitReply        : No
  Continuation          : Yes
  OwnerPort             : fffffa8010c99040 [ALPC_CLIENT_COMMUNICATION_PORT]
  WaitingThread         : fffffa8010c9b060
  QueueType             : ALPC_MSGQUEUE_PENDING
  QueuePort             : fffffa8010840360 [ALPC_CONNECTION_PORT]
  QueuePortOwnerProcess : fffffa801083e120 (ProcessC.exe)
  ServerThread          : fffffa80109837d0
  QuotaCharged          : No
  CancelQueuePort       : 0000000000000000
  CancelSequencePort    : 0000000000000000
  CancelSequenceNumber  : 0×00000000 (0)
  ClientContext         : 0000000000000000
  ServerContext         : 0000000000000000
  PortContext           : 00000000005f3400
  CancelPortContext     : 0000000000000000
  SecurityData          : 0000000000000000
  View                  : 0000000000000000

We see that ProcessC thread fffffa80109837d0 is waiting for a process object fffffa801434cb40:

THREAD fffffa80109837d0  Cid 027c.02b0  Teb: 000007fffffdb000 Win32Thread: 0000000000000000 WAIT: (UserRequest) UserMode Non-Alertable
    fffffa801434cb40  ProcessObject
Not impersonating
DeviceMap                 fffff880000073d0
Owning Process            fffffa801083e120       Image:         ProcessC.exe
Attached Process          N/A            Image:         N/A
Wait Start TickCount      13986969       Ticks: 79760 (0:00:20:44.263)
Context Switch Count      520            
UserTime                  00:00:00.000
KernelTime                00:00:00.062
Win32 Start Address 0×000000004826dcf4
Stack Init fffffa6002547db0 Current fffffa6002547940
Base fffffa6002548000 Limit fffffa6002542000 Call 0
Priority 13 BasePriority 11 PriorityDecrement 0 IoPriority 2 PagePriority 5
Kernel stack not resident.
Child-SP          RetAddr           Call Site
fffffa60`02547980 fffff800`01cba0fa nt!KiSwapContext+0×7f
fffffa60`02547ac0 fffff800`01caedab nt!KiSwapThread+0×13a
fffffa60`02547b30 fffff800`01f1d608 nt!KeWaitForSingleObject+0×2cb
fffffa60`02547bc0 fffff800`01cb7973 nt!NtWaitForSingleObject+0×98
fffffa60`02547c20 00000000`77136d5a nt!KiSystemServiceCopyEnd+0×13 (TrapFrame @ fffffa60`02547c20)
00000000`0024f7c8 00000000`4826ea97 ntdll!ZwWaitForSingleObject+0xa
00000000`0024f7d0 00000000`4826ef44 ProcessC!TerminatePID+0xa3
[…]
00000000`0024fc90 00000000`00000000 ntdll!RtlUserThreadStart+0×29

When we inspect process fffffa801434cb40 we see that it has only one thread with many usual threads missing. The blocked thread stack trace has DriverA module code waiting for an event:

1: kd> !process fffffa801434cb40 ff
PROCESS fffffa801434cb40
    SessionId: 1  Cid: a0c8    Peb: 7fffffdc000  ParentCid: 1c08
    DirBase: 19c6cc000  ObjectTable: fffff8801767ee00  HandleCount: 287.
    Image: ProcessD.exe
    VadRoot fffffa8021be17d0 Vads 71 Clone 0 Private 955. Modified 1245. Locked 0.
    DeviceMap fffff880000073d0
    Token                             fffff880187cb3c0
    ElapsedTime                       00:49:23.432
    UserTime                          00:00:00.686
    KernelTime                        00:00:00.904
    QuotaPoolUsage[PagedPool]         208080
    QuotaPoolUsage[NonPagedPool]      6720
    Working Set Sizes (now,min,max)  (2620, 50, 345) (10480KB, 200KB, 1380KB)
    PeakWorkingSetSize                3136
    VirtualSize                       101 Mb
    PeakVirtualSize                   222 Mb
    PageFaultCount                    13495
    MemoryPriority                    BACKGROUND
    BasePriority                      13
    CommitCharge                      1154

[...]

THREAD fffffa8012249b30  Cid a0c8.31b4  Teb: 0000000000000000 Win32Thread: 0000000000000000 WAIT: (Executive) KernelMode Non-Alertable
   fffffa801180a6a0  SynchronizationEvent
   Not impersonating
DeviceMap                 fffff880000073d0
Owning Process            fffffa801434cb40       Image:         ProcessD.exe
Attached Process          N/A            Image:         N/A
Wait Start TickCount      13986969       Ticks: 79760 (0:00:20:44.263)
Context Switch Count      97            
UserTime                  00:00:00.000
KernelTime                00:00:00.000
Win32 Start Address DllA (0xfffff96000eeada0)
Stack Init fffffa601b841db0 Current fffffa601b841960
Base fffffa601b842000 Limit fffffa601b83c000 Call 0
Priority 13 BasePriority 13 PriorityDecrement 0 IoPriority 2 PagePriority 5
Child-SP          RetAddr           Call Site
fffffa60`1b8419a0 fffff800`01cba0fa nt!KiSwapContext+0x7f
fffffa60`1b841ae0 fffff800`01caedab nt!KiSwapThread+0x13a
fffffa60`1b841b50 fffff960`00eeb281 nt!KeWaitForSingleObject+0x2cb
fffffa60`1b841c20 fffff800`01ec7bc7 DriverA+0×4b281
fffffa60`1b841d50 fffff800`01cf65a6 nt!PspSystemThreadStartup+0×57
fffffa60`1b841d80 00000000`00000000 nt!KiStartSystemThread+0×16

We therefore recommend to contact the vendor of DriverA component.

- Dmitry Vostokov @ DumpAnalysis.org + TraceAnalysis.org -

Project CARE Update (14.02.10)

Sunday, February 14th, 2010

Thanks to everyone who submitted their debugger logs. Now VBScript and WinDbg script files are available for download from the CARE page:

http://www.dumpanalysis.org/care

VBScript file scans all hard drives for .DMP files and launches WinDbg to run a mode-independent WinDbg script. Each instance of WinDbg appends the output to dbgeng.log file that you can submit to CARE (please zip it if exceeds 2Mb).

- Dmitry Vostokov @ DumpAnalysis.org + TraceAnalysis.org -

Memory Systems Language (Part 1)

Friday, February 12th, 2010

Computer memory analysis is based on interconnected structures of symbols and we state that there exists a memory language that extends a hierarchy of modeling and implementation languages (both domain-specific and general-purpose):

- Dmitry Vostokov @ DumpAnalysis.org + TraceAnalysis.org -

Forthcoming Memory Dump Analysis Anthology, Volume 4

Thursday, February 11th, 2010

This is a revised, edited, cross-referenced and thematically organized volume of selected DumpAnalysis.org blog posts about crash dump analysis and debugging written in July 2009 - January 2010 for software engineers developing and maintaining products on Windows platforms, quality assurance engineers testing software on Windows platforms and technical support and escalation engineers dealing with complex software issues. The fourth volume features:

- 13 new crash dump analysis patterns
- 13 new pattern interaction case studies
- 10 new trace analysis patterns
- 6 new Debugware patterns and case study
- Workaround patterns
- Updated checklist
- Fully cross-referenced with Volume 1, Volume 2 and Volume 3
- New appendixes

Product information:

  • Title: Memory Dump Analysis Anthology, Volume 4
  • Author: Dmitry Vostokov
  • Language: English
  • Product Dimensions: 22.86 x 15.24
  • Paperback: 410 pages
  • Publisher: Opentask (30 March 2010)
  • ISBN-13: 978-1-906717-86-5
  • Hardcover: 410 pages
  • Publisher: Opentask (30 April 2010)
  • ISBN-13: 978-1-906717-87-2

Back cover features memory space art image: Internal Process Combustion.

- Dmitry Vostokov @ DumpAnalysis.org + TraceAnalysis.org -

Complete Stack Traces from x64 System

Tuesday, February 9th, 2010

Previously I wrote on how to get a 32-bit stack trace from a 32-bit process thread on an x64 system. There are situations when we are interested in all such stack traces, for example, from a complete memory dump. I wrote a script that extracts both 64-bit and WOW64 32-bit stack traces:

.load wow64exts
!for_each_thread "!thread @#Thread 1f;.thread /w @#Thread; .reload; kb 256; .effmach AMD64"

Example output fragment for a thread fffffa801f3a3bb0 from a very long debugger log file:

[...]

Setting context for owner process...
.process /p /r fffffa8013177c10

                                    

THREAD fffffa801f3a3bb0  Cid 4b4c.5fec  Teb: 000000007efaa000 Win32Thread: fffff900c1efad50 WAIT: (UserRequest) UserMode Non-Alertable
    fffffa8021ce4590  NotificationEvent
    fffffa801f3a3c68  NotificationTimer
Not impersonating
DeviceMap                 fffff8801b551720
Owning Process            fffffa8013177c10       Image:         application.exe
Attached Process          N/A            Image:         N/A
Wait Start TickCount      14066428       Ticks: 301 (0:00:00:04.695)
Context Switch Count      248                 LargeStack
UserTime                  00:00:00.000
KernelTime                00:00:00.000
Win32 Start Address mscorwks!Thread::intermediateThreadProc (0x00000000733853b3)
Stack Init fffffa60190e5db0 Current fffffa60190e5940
Base fffffa60190e6000 Limit fffffa60190df000 Call 0
Priority 11 BasePriority 10 PriorityDecrement 0 IoPriority 2 PagePriority 5
Child-SP          RetAddr           Call Site
fffffa60`190e5980 fffff800`01cba0fa nt!KiSwapContext+0x7f
fffffa60`190e5ac0 fffff800`01caedab nt!KiSwapThread+0x13a
fffffa60`190e5b30 fffff800`01f1d608 nt!KeWaitForSingleObject+0x2cb
fffffa60`190e5bc0 fffff800`01cb7973 nt!NtWaitForSingleObject+0x98
fffffa60`190e5c20 00000000`75183d09 nt!KiSystemServiceCopyEnd+0x13 (TrapFrame @ fffffa60`190e5c20)
00000000`069ef118 00000000`75183b06 wow64cpu!CpupSyscallStub+0x9
00000000`069ef120 00000000`74f8ab46 wow64cpu!Thunk0ArgReloadState+0x1a
00000000`069ef190 00000000`74f8a14c wow64!RunCpuSimulation+0xa
00000000`069ef1c0 00000000`771605a8 wow64!Wow64LdrpInitialize+0x4b4
00000000`069ef720 00000000`771168de ntdll! ?? ::FNODOBFM::`string'+0x20aa1
00000000`069ef7d0 00000000`00000000 ntdll!LdrInitializeThunk+0xe

.process /p /r 0
Implicit thread is now fffffa80`1f3a3bb0
WARNING: WOW context retrieval requires
switching to the thread's process context.
Use .process /p fffffa80`1f6b2990 to switch back.
Implicit process is now fffffa80`13177c10
x86 context set
Loading Kernel Symbols
Loading User Symbols
Loading unloaded module list
Loading Wow64 Symbols
ChildEBP RetAddr
06aefc68 76921270 ntdll_772b0000!ZwWaitForSingleObject+0x15
06aefcd8 7328c639 kernel32!WaitForSingleObjectEx+0xbe
06aefd1c 7328c56f mscorwks!PEImage::LoadImage+0x1af
06aefd6c 7328c58e mscorwks!CLREvent::WaitEx+0x117
06aefd80 733770fb mscorwks!CLREvent::Wait+0x17
06aefe00 73377589 mscorwks!ThreadpoolMgr::SafeWait+0x73
06aefe64 733853f9 mscorwks!ThreadpoolMgr::WorkerThreadStart+0x11c
06aeff88 7699eccb mscorwks!Thread::intermediateThreadProc+0x49
06aeff94 7732d24d kernel32!BaseThreadInitThunk+0xe
06aeffd4 7732d45f ntdll_772b0000!__RtlUserThreadStart+0x23
06aeffec 00000000 ntdll_772b0000!_RtlUserThreadStart+0x1b
Effective machine: x64 (AMD64)

[...]

- Dmitry Vostokov @ DumpAnalysis.org + TraceAnalysis.org -

10 Common Mistakes in Memory Analysis (Part 7)

Monday, February 8th, 2010

Another common mistake I observe is relying on what debuggers report without double-checking. Present day debuggers, like WinDbg or GDB, are symbol-driven, they do not possess much of that semantic knowledge that a human debugger has. Also, it is better to report more than less: what is irrelevant can be skipped over by a skilled memory analyst but what looks suspicious to the problem at hand shall be double-checked to see if it is not coincidental. One example we consider here is Coincidental Symbolic Information

An application is frequently crashing. The process memory dump file shows only one thread left inside without any exception handling frames. In order to hypothesize about the probable cause the thread raw stack data is analyzed. It shows a few C++ STL calls with a custom smart pointer class and memory allocator like this:

app!std::vector<SmartPtr<ClassA>, std::allocator<SmartPtr<ClassA> > >::operator[]+

WinDbg !analyze-v command output points to this code:

FOLLOWUP_IP:
app!std::bad_alloc::~bad_alloc <PERF> (app+0x0)+0
00400000 4d  dec     ebp

Raw stack data contains a few symbolic references to bad_alloc destructor too:

[...]
0012f9c0  00000100
0012f9c4  00400100 app!std::bad_alloc::~bad_alloc <PERF> (app+0x100)
0012f9c8  00000000
0012f9cc  0012f9b4
0012f9d0  00484488 app!_NULL_IMPORT_DESCRIPTOR+0x1984
0012f9d4  0012fa8c
0012f9d8  7c828290 ntdll!_except_handler3
0012f9dc  0012fa3c
0012f9e0  7c82b04a ntdll!RtlImageNtHeaderEx+0xee
0012f9e4  00482f08 app!_NULL_IMPORT_DESCRIPTOR+0x404
0012f9e8  00151ed0
0012f9ec  00484c1e app!_NULL_IMPORT_DESCRIPTOR+0x211a
0012f9f0  00000100
0012f9f4  00400100 app!std::bad_alloc::~bad_alloc <PERF> (app+0x100)
[...]

By linking all these three pieces together an engineer hypothesized that the cause of failure is memory allocation. However, careful analysis reveals all of them as a coincidental symbolic information and renders hypothesis much less plausible:

1.  The address of app!std::bad_alloc::~bad_alloc is 00400000 which coincides with the start of the main application module:

0:000> lm a 00400000
start    end        module name
00400000 004c4000   app    (no symbols)

As a consequence, its assembly language code makes no sense:

0:000> u 00400000
app:
00400000 4d              dec     ebp
00400001 5a              pop     edx
00400002 90              nop
00400003 0003            add     byte ptr [ebx],al
00400005 0000            add     byte ptr [eax],al
00400007 000400          add     byte ptr [eax+eax],al
0040000a 0000            add     byte ptr [eax],al
0040000c ff              ???

2. All std::vector references are in fact fragments of a UNICODE string that can be dumped using du command:

[...]
0012ef14  00430056 app!std::vector<SmartPtr<ClassA>, std::allocator<SmartPtr<ClassA> > >::operator[]+0x16
0012ef18  00300038
0012ef1c  0043002e app!std::vector<SmartPtr<ClassA>, std::allocator<SmartPtr<ClassA> > >::size+0x1
[...]

0:000> du 0012ef14 l6
0012ef14  "VC80.C"

3. Raw stack data references to bad_alloc destructor are still module addresses in disguise, 00400100 or app+0×100, with nonsense assembly code:

0:000> u 00400100
app+0x100:
00400100 50              push    eax
00400101 45              inc     ebp
00400102 0000            add     byte ptr [eax],al
00400104 4c              dec     esp
00400105 010500571aac    add     dword ptr ds:[0AC1A5700h],eax
0040010b 4a              dec     edx
0040010c 0000            add     byte ptr [eax],al
0040010e 0000            add     byte ptr [eax],al

- Dmitry Vostokov @ DumpAnalysis.org + TraceAnalysis.org -