Archive for the ‘Debugging’ Category

Software Diagnostics Services

Friday, July 13th, 2012

For some time I was struggling with finding a good name for memory dump and software trace analysis activities. The name Memoretics I use for the science of memory dump analysis (that also incorporates software traces) seems not so good to describe the whole practical activity that should be transparent to everyone in IT. Fortunately, I timely understood that all these activities constitute the essence of software diagnostics that previously lacked any solid foundation. Thus, Software Diagnostics Institute was reborn from the previous Crash Dump Analysis Portal. This institute does pure and applied research and scientific activities and in recent years was funded mainly from OpenTask publisher and recently from Memory Dump Analysis Services. The latter company also recognized that the broadening of its commercial activities requires a new name. So, Software Diagnostics Services was reborn:

The First Comprehensive Software Diagnostics Service

- Dmitry Vostokov @ DumpAnalysis.org + TraceAnalysis.org -

Architecture of Process Memory Dump Capture Done Right

Monday, July 2nd, 2012

Sometimes I get requests to review application memory dump capture design. Of course, such requests usually come only when such designs don’t work or there are problems with loading saved crash dumps. The common blueprint of such architectures is a top level exception handler that use some API do capture and save process memory state. However, such designs forget why separate processed were introduced in the first place: to guard process memory space of different unrelated tasks (for related tasks there are threads). The data of the module (and its thread state) that does process memory capture may also be corrupt. The right design would be to show a message box with an information on how to use external process memory dumper such as Task Manager. If we need an automation then the right thing is to rely on WER features. Let separate processes do their work in separate spaces.

- Dmitry Vostokov @ DumpAnalysis.org + TraceAnalysis.org -

Crash Dump Analysis Patterns (Part 178)

Wednesday, June 27th, 2012

One of the frequent problems is an access violation at an address that belongs to Unloaded Module. Here’s an example that recently happened on our machine during an auto-update of the popular software package so we immediately attached a debugger after seeing WER dialog box:

0:000> ~*k

.  0  Id: bc8.bcc Suspend: 1 Teb: 7efdd000 Unfrozen
ChildEBP RetAddr
0035f1c4 771a0bdd ntdll!ZwWaitForMultipleObjects+0x15
0035f260 75771a2c KERNELBASE!WaitForMultipleObjectsEx+0x100
0035f2a8 75774208 kernel32!WaitForMultipleObjectsExImplementation+0xe0
0035f2c4 757980a4 kernel32!WaitForMultipleObjects+0x18
0035f330 75797f63 kernel32!WerpReportFaultInternal+0x186
0035f344 75797858 kernel32!WerpReportFault+0x70
0035f354 757977d7 kernel32!BasepReportFault+0x20
0035f3e0 77ec74df kernel32!UnhandledExceptionFilter+0x1af
0035f3e8 77ec73bc ntdll!__RtlUserThreadStart+0x62
0035f3fc 77ec7261 ntdll!_EH4_CallFilterFunc+0x12
0035f424 77eab459 ntdll!_except_handler4+0x8e
0035f448 77eab42b ntdll!ExecuteHandler2+0x26
0035f46c 77eab3ce ntdll!ExecuteHandler+0x24
0035f4f8 77e60133 ntdll!RtlDispatchException+0x127
0035f4f8 73eb2200 ntdll!KiUserExceptionDispatcher+0xf
WARNING: Frame IP not in any known module. Following frames may be wrong.
0035f844 76e462fa <Unloaded_fpb.tmp>+0×12200
0035f870 76e46d3a USER32!InternalCallWinProc+0×23
0035f8e8 76e4965e USER32!UserCallWinProcCheckWow+0×109
0035f92c 76e496c5 USER32!SendMessageWorker+0×581
0035f950 7269c05c USER32!SendMessageW+0×7f
0035f9ec 7270be62 comctl32!CCSendNotify+0xc19
0035fa28 75f6f52a comctl32!SendNotify+0×36
0035fa4c 75f61d66 SHELL32!SetAppStartingCursor+0×6d
0035fa64 75f61ee2 SHELL32!CShellExecute::ExecuteNormal+0×16
0035fa78 75f61e70 SHELL32!ShellExecuteNormal+0×33
0035fa90 75f53cd0 SHELL32!ShellExecuteExW+0×62
0035fae4 003e2211 SHELL32!ShellExecuteW+0×77
0035fbc4 77e838be InstallFlashPlayer+0×2211
0035fcb4 77e83492 ntdll!RtlpFreeHeap+0xbb1
0035fcd4 757714dd ntdll!RtlFreeHeap+0×142
0035fce8 003f0324 kernel32!HeapFree+0×14
0035fd80 003f0241 InstallFlashPlayer+0×10324
0035fe10 7577339a InstallFlashPlayer+0×10241
0035fe1c 77e89ef2 kernel32!BaseThreadInitThunk+0xe
0035fe5c 77e89ec5 ntdll!__RtlUserThreadStart+0×70
0035fe74 00000000 ntdll!_RtlUserThreadStart+0×1b

1  Id: bc8.6b0 Suspend: 2 Teb: 7efda000 Unfrozen
ChildEBP RetAddr
03e1f9e0 77ea2f51 ntdll!ZwWaitForMultipleObjects+0x15
03e1fb74 7577339a ntdll!TppWaiterpThread+0x33d
03e1fb80 77e89ef2 kernel32!BaseThreadInitThunk+0xe
03e1fbc0 77e89ec5 ntdll!__RtlUserThreadStart+0x70
03e1fbd8 00000000 ntdll!_RtlUserThreadStart+0x1b

2  Id: bc8.8dc Suspend: 2 Teb: 7efd7000 Unfrozen
ChildEBP RetAddr
03f5fd50 77ea3352 ntdll!NtWaitForWorkViaWorkerFactory+0x12
03f5feb0 7577339a ntdll!TppWorkerThread+0x216
03f5febc 77e89ef2 kernel32!BaseThreadInitThunk+0xe
03f5fefc 77e89ec5 ntdll!__RtlUserThreadStart+0x70
03f5ff14 00000000 ntdll!_RtlUserThreadStart+0x1b

3  Id: bc8.944 Suspend: 2 Teb: 7efaf000 Unfrozen
ChildEBP RetAddr
0416f8b4 77ea3352 ntdll!NtWaitForWorkViaWorkerFactory+0x12
0416fa14 7577339a ntdll!TppWorkerThread+0x216
0416fa20 77e89ef2 kernel32!BaseThreadInitThunk+0xe
0416fa60 77e89ec5 ntdll!__RtlUserThreadStart+0x70
0416fa78 00000000 ntdll!_RtlUserThreadStart+0x1b

Exception thread shows fpb.tmp module as unloaded:

0:000> lmv m fpb.tmp
start    end        module name

Unloaded modules:
00cb0000 00d5a000   fpb.tmp

Timestamp: Fri Jun 01 02:56:00 2012 (4FC82130)
Checksum:  000B0CD5
ImageSize:  000AA000
73ea0000 73f15000   fpb.tmp
Timestamp: Fri Jun 01 02:49:25 2012 (4FC81FA5)
Checksum:  0007A7CE
ImageSize:  00075000

We change the exception thread context to get registers at the time of the exception:

0:000> kv
ChildEBP RetAddr  Args to Child
0035f1c4 771a0bdd 00000002 0035f214 00000001 ntdll!ZwWaitForMultipleObjects+0x15
0035f260 75771a2c 0035f214 0035f288 00000000 KERNELBASE!WaitForMultipleObjectsEx+0x100
0035f2a8 75774208 00000002 7efde000 00000000 kernel32!WaitForMultipleObjectsExImplementation+0xe0
0035f2c4 757980a4 00000002 0035f2f8 00000000 kernel32!WaitForMultipleObjects+0x18
0035f330 75797f63 0035f410 00000001 00000001 kernel32!WerpReportFaultInternal+0x186
0035f344 75797858 0035f410 00000001 0035f3e0 kernel32!WerpReportFault+0x70
0035f354 757977d7 0035f410 00000001 658587c7 kernel32!BasepReportFault+0x20
0035f3e0 77ec74df 00000000 77ec73bc 00000000 kernel32!UnhandledExceptionFilter+0x1af
0035f3e8 77ec73bc 00000000 0035fe5c 77e7c530 ntdll!__RtlUserThreadStart+0x62
0035f3fc 77ec7261 00000000 00000000 00000000 ntdll!_EH4_CallFilterFunc+0x12
0035f424 77eab459 fffffffe 0035fe4c 0035f560 ntdll!_except_handler4+0x8e
0035f448 77eab42b 0035f510 0035fe4c 0035f560 ntdll!ExecuteHandler2+0x26
0035f46c 77eab3ce 0035f510 0035fe4c 0035f560 ntdll!ExecuteHandler+0x24
0035f4f8 77e60133 0135f510 0035f560 0035f510 ntdll!RtlDispatchException+0x127
0035f4f8 73eb2200 0135f510 0035f560 0035f510 ntdll!KiUserExceptionDispatcher+0xf (CONTEXT @ 0035f560)
WARNING: Frame IP not in any known module. Following frames may be wrong.
0035f844 76e462fa 000201ce 0000004e 00000000 <Unloaded_fpb.tmp>+0×12200
0035f870 76e46d3a 73eb2200 000201ce 0000004e USER32!InternalCallWinProc+0×23
0035f8e8 76e4965e 00000000 73eb2200 000201ce USER32!UserCallWinProcCheckWow+0×109
0035f92c 76e496c5 013907f0 00000000 73eb2200 USER32!SendMessageWorker+0×581
0035f950 7269c05c 000201ce 0000004e 00000000 USER32!SendMessageW+0×7f
0035f9ec 7270be62 0035fa00 fffffff7 00000000 comctl32!CCSendNotify+0xc19
0035fa28 75f6f52a 000201ce 00000000 fffffff7 comctl32!SendNotify+0×36
0035fa4c 75f61d66 000201ce 00000001 00001500 SHELL32!SetAppStartingCursor+0×6d
0035fa64 75f61ee2 0035faa4 00001500 0035faa4 SHELL32!CShellExecute::ExecuteNormal+0×16
0035fa78 75f61e70 0035faa4 00001500 00000200 SHELL32!ShellExecuteNormal+0×33
0035fa90 75f53cd0 0035faa4 003fb654 003fa554 SHELL32!ShellExecuteExW+0×62
0035fae4 003e2211 000201ce 003fa554 0035fb14 SHELL32!ShellExecuteW+0×77
0035fbc4 77e838be 00da0138 77e8389a 77c467ad InstallFlashPlayer+0×2211
0035fcb4 77e83492 00000000 00da2320 00da2320 ntdll!RtlpFreeHeap+0xbb1
0035fcd4 757714dd 00da0000 00000000 00da2320 ntdll!RtlFreeHeap+0×142
0035fce8 003f0324 00da0000 00000000 003f0343 kernel32!HeapFree+0×14
0035fd80 003f0241 003e0000 00000000 010d3135 InstallFlashPlayer+0×10324
0035fe10 7577339a 7efde000 0035fe5c 77e89ef2 InstallFlashPlayer+0×10241
0035fe1c 77e89ef2 7efde000 77c46545 00000000 kernel32!BaseThreadInitThunk+0xe
0035fe5c 77e89ec5 003f02ac 7efde000 ffffffff ntdll!__RtlUserThreadStart+0×70
0035fe74 00000000 003f02ac 7efde000 00000000 ntdll!_RtlUserThreadStart+0×1b

0:000> .cxr 0035f560
eax=73eb2200 ebx=00000000 ecx=01080d68 edx=00000000 esi=73eb2200 edi=00000000
eip=73eb2200 esp=0035f848 ebp=0035f870 iopl=0 nv up ei pl nz na pe nc
cs=0023  ss=002b  ds=002b  es=002b  fs=0053  gs=002b efl=00210206
<Unloaded_fpb.tmp>+0×12200:
73eb2200 ??              ???

Then we double check that a window procedure was indeed called from that module range:

0:000> kv
*** Stack trace for last set context - .thread/.cxr resets it
ChildEBP RetAddr  Args to Child
WARNING: Frame IP not in any known module. Following frames may be wrong.
0035f844 76e462fa 000201ce 0000004e 00000000 <Unloaded_fpb.tmp>+0×12200
0035f870 76e46d3a 73eb2200 000201ce 0000004e USER32!InternalCallWinProc+0×23
0035f8e8 76e4965e 00000000 73eb2200 000201ce USER32!UserCallWinProcCheckWow+0×109
0035f92c 76e496c5 013907f0 00000000 73eb2200 USER32!SendMessageWorker+0×581
0035f950 7269c05c 000201ce 0000004e 00000000 USER32!SendMessageW+0×7f
0035f9ec 7270be62 0035fa00 fffffff7 00000000 comctl32!CCSendNotify+0xc19
0035fa28 75f6f52a 000201ce 00000000 fffffff7 comctl32!SendNotify+0×36
0035fa4c 75f61d66 000201ce 00000001 00001500 SHELL32!SetAppStartingCursor+0×6d
0035fa64 75f61ee2 0035faa4 00001500 0035faa4 SHELL32!CShellExecute::ExecuteNormal+0×16
0035fa78 75f61e70 0035faa4 00001500 00000200 SHELL32!ShellExecuteNormal+0×33
0035fa90 75f53cd0 0035faa4 003fb654 003fa554 SHELL32!ShellExecuteExW+0×62
0035fae4 003e2211 000201ce 003fa554 0035fb14 SHELL32!ShellExecuteW+0×77
0035fbc4 77e838be 00da0138 77e8389a 77c467ad InstallFlashPlayer+0×2211
0035fcb4 77e83492 00000000 00da2320 00da2320 ntdll!RtlpFreeHeap+0xbb1
00da15a0 00000000 00da1780 02971450 003e1000 ntdll!RtlFreeHeap+0×142

0:000> ub 76e462fa
USER32!InternalCallWinProc+0×6:
76e462dd 68cdabbadc      push    0DCBAABCDh
76e462e2 56              push    esi
76e462e3 ff7518          push    dword ptr [ebp+18h]
76e462e6 ff7514          push    dword ptr [ebp+14h]
76e462e9 ff7510          push    dword ptr [ebp+10h]
76e462ec ff750c          push    dword ptr [ebp+0Ch]
76e462ef 64800dca0f000001 or      byte ptr fs:[0FCAh],1
76e462f7 ff5508          call    dword ptr [ebp+8]

We now get a memory value pointed to by EBP+8 address:

0:000> r
Last set context:
eax=73eb2200 ebx=00000000 ecx=01080d68 edx=00000000 esi=73eb2200 edi=00000000
eip=73eb2200 esp=0035f848 ebp=0035f870 iopl=0 nv up ei pl nz na pe nc
cs=0023  ss=002b  ds=002b  es=002b  fs=0053  gs=002b  efl=00210206
<Unloaded_fpb.tmp>+0×12200:
73eb2200 ??              ???

0:000> dp 0035f870+8 l1
0035f878  73eb2200

0:000> dd 73eb2200
73eb2200  ???????? ???????? ???????? ????????
73eb2210  ???????? ???????? ???????? ????????
73eb2220  ???????? ???????? ???????? ????????
73eb2230  ???????? ???????? ???????? ????????
73eb2240  ???????? ???????? ???????? ????????
73eb2250  ???????? ???????? ???????? ????????
73eb2260  ???????? ???????? ???????? ????????
73eb2270  ???????? ???????? ???????? ????????

The value is indeed belongs to unloaded fpb.tmp module address range:

0:000> ln 73eb2200
(73eb2200)   <Unloaded_fpb.tmp>+0×12200

- Dmitry Vostokov @ DumpAnalysis.org + TraceAnalysis.org -

Individual and Enterprise Software Diagnostics Certifications

Monday, June 18th, 2012

Memory Dump Analysis Services will be administering certifications developed by Software Diagnostics Institute for memory dump and software trace analysis:

Software Diagnostics Maturity Enterprise Certification
Memory Dump Analysis Certification is available this September

Debugging TV Frames episode 0×10 contains some background information.

- Dmitry Vostokov @ DumpAnalysis.org + TraceAnalysis.org -

Software Diagnostics Institute

Tuesday, June 12th, 2012

DumpAnalysis.org portal has been reorganized to Software Diagnostics Institute to reflect the nature of its research activities. More updates later on.

- Dmitry Vostokov @ DumpAnalysis.org + TraceAnalysis.org -

Crash Dump Analysis Patterns (Part 177)

Saturday, June 9th, 2012

Stack Trace Change is an important pattern for differential memory dump analysis, for example, when memory dumps were generated before and after a problem such as a CPU spike or hang. In the example below we have a normal expected thread stack trace from a memory dump saved before an application was reported unresponsive and another different thread stack trace after:

3  Id: 24b8.24e4 Suspend: 0 Teb: 7efa1000 Unfrozen
ChildEBP RetAddr
037dfadc 75210bdd ntdll!ZwWaitForMultipleObjects+0x15
037dfb78 75791a2c KERNELBASE!WaitForMultipleObjectsEx+0x100
037dfbc0 7511086a kernel32!WaitForMultipleObjectsExImplementation+0xe0
037dfc14 00d17c1d user32!RealMsgWaitForMultipleObjectsEx+0x14d
037dfc3c 00ce161d ApplicationA!MsgWaitForMultipleObjects+0x2d
037dfc60 00cdc757 ApplicationA!WaitForSignal+0x1d
037dfc80 00cdaaf6 ApplicationA!WorkLoop+0x57
037dfca4 7579339a ApplicationA!ThreadStart+0x26
037dfcb0 77699ef2 kernel32!BaseThreadInitThunk+0xe
037dfcf0 77699ec5 ntdll!__RtlUserThreadStart+0x70
037dfd08 00000000 ntdll!_RtlUserThreadStart+0x1b

3  Id: 24b8.24e4 Suspend: 0 Teb: 7efa1000 Unfrozen
ChildEBP RetAddr
037df38c 752131bb ntdll!ZwDelayExecution+0x15
037df3f4 75213a8b KERNELBASE!SleepEx+0x65
037df404 00d1670b KERNELBASE!Sleep+0xf
037df40c 00d350ef ApplicationA!Sleep+0xb
037df430 6a868aab ApplicationA!PutData+0xbf
037df444 6a8662ec ModuleA!OutputData+0x1b
037df464 00d351de ModuleA!ProcessData+0x16c
037df4a4 00ca8cb4 ApplicationA!SendData+0xbe
[...]

- Dmitry Vostokov @ DumpAnalysis.org + TraceAnalysis.org -

Crash Dump Analysis Patterns (Part 60, Mac OS X)

Tuesday, June 5th, 2012

This is a Mac OS X / GDB counterpart to Execution Residue pattern previously described for Windows platforms:

(gdb) bt
#0 0x00007fff8616e82a in __kill ()
#1 0x00007fff8fab9a9c in abort ()
#2 0x000000010269dc29 in bar_5 ()
#3 0x000000010269dc39 in bar_4 ()
#4 0x000000010269dc49 in bar_3 ()
#5 0x000000010269dc59 in bar_2 ()
#6 0x000000010269dc69 in bar_1 ()
#7 0x000000010269dc79 in bar ()
#8 0x000000010269dca0 in main (argc=1, argv=0x7fff6229cb00)

(gdb) x $rsp
0x7fff6229ca38: 0x8fab9a9c

(gdb) x/1000a 0x7fff6229c000
0×7fff6229c000: 0×7fff8947b000 0×7fff8947b570
0×7fff6229c010: 0×4f3ee10c 0×7fff90cb0000
0×7fff6229c020: 0×7fff90cb04d0 0×4e938b16
[…]
0×7fff6229c5f0: 0×7fff622d8d80 0×10269d640
0×7fff6229c600: 0×7fff6229cad0 0×7fff622a460b
0×7fff6229c610: 0×100000000 0×269d000
0×7fff6229c620: 0×7fff6229c630 0×10269db59 <foo_8+9>
0×7fff6229c630: 0×7fff6229c640 0×10269db69 <foo_7+9>
0×7fff6229c640: 0×7fff6229c650 0×10269db79 <foo_6+9>
0×7fff6229c650: 0×7fff6229c660 0×10269db89 <foo_5+9>
0×7fff6229c660: 0×7fff6229c670 0×10269db99 <foo_4+9>
0×7fff6229c670: 0×7fff6229c680 0×10269dba9 <foo_3+9>
0×7fff6229c680: 0×7fff6229c690 0×10269dbb9 <foo_2+9>
0×7fff6229c690: 0×7fff6229c6a0 0×10269dbc9 <foo_1+9>
0×7fff6229c6a0: 0×7fff6229cac0 0×10269dbee <foo+30>

0×7fff6229c6b0: 0×0 0×0
0×7fff6229c6c0: 0×0 0×0
0×7fff6229c6d0: 0×0 0×0
0×7fff6229c6e0: 0×0 0×0
[…]
0×7fff6229c8d0: 0×7fff6229c960 0×7fff622b49cd
0×7fff6229c8e0: 0×10269f05c 0×0
0×7fff6229c8f0: 0×7fff622c465c 0×7fff8a31e5c0 <_Z21dyldGlobalLockReleasev>
0×7fff6229c900: 0×7fff8fab99eb <abort> 0×10269f05c
0×7fff6229c910: 0×101000000000000 0×7fff622d2110
0×7fff6229c920: 0×7fff622d8d80 0×10269f078
0×7fff6229c930: 0×7fff622daac8 0×18
0×7fff6229c940: 0×0 0×0
0×7fff6229c950: 0×10269e030 0×0
0×7fff6229c960: 0×7fff6229c980 0×7fff622a1922
0×7fff6229c970: 0×0 0×0
0×7fff6229c980: 0×7fff6229ca50 0×7fff8a31e716 <dyld_stub_binder_+13>
0×7fff6229c990: 0×1 0×7fff6229cb00
0×7fff6229c9a0: 0×7fff6229cb10 0xe223ea612ddc10b7
0×7fff6229c9b0: 0×8 0×0
0×7fff6229c9c0: 0xe223ea612ddc10b7 0×0
0×7fff6229c9d0: 0×0 0×0
0×7fff6229c9e0: 0×585f5f00474e414c 0×20435058005f4350
0×7fff6229c9f0: 0×0 0×0
0×7fff6229ca00: 0×0 0×0
0×7fff6229ca10: 0×0 0×0
0×7fff6229ca20: 0×0 0×0
0×7fff6229ca30: 0×7fff6229ca60 0×7fff8fab9a9c <abort+177>
0×7fff6229ca40: 0×0 0×0
0×7fff6229ca50: 0×7fffffffffdf 0×0
0×7fff6229ca60: 0×7fff6229ca70 0×10269dc29 <bar_5+9>
0×7fff6229ca70: 0×7fff6229ca80 0×10269dc39 <bar_4+9>
0×7fff6229ca80: 0×7fff6229ca90 0×10269dc49 <bar_3+9>
0×7fff6229ca90: 0×7fff6229caa0 0×10269dc59 <bar_2+9>
0×7fff6229caa0: 0×7fff6229cab0 0×10269dc69 <bar_1+9>
0×7fff6229cab0: 0×7fff6229cac0 0×10269dc79 <bar+9>
0×7fff6229cac0: 0×7fff6229cae0 0×10269dca0 <main+32>
0×7fff6229cad0: 0×7fff6229cb00 0×1
0×7fff6229cae0: 0×7fff6229caf0 0×10269db34 <start+52>

0×7fff6229caf0: 0×0 0×1
0×7fff6229cb00: 0×7fff6229cc48 0×0
0×7fff6229cb10: 0×7fff6229ccae 0×7fff6229ccca
[…]

Here’s the source code of the modeling application:

#define def_call(name,x,y) void name##_##x() { name##_##y(); }

#define def_final(name,x) void name##_##x() { }

#define def_final_abort(name,x) void name##_##x() { abort(); }

#define def_init(name,y) void name() { name##_##y(); }

#define def_init_alloc(name,y,size) void name() { int arr[size]; name##_##y(); *arr=0; }

def_final(foo,9)

def_call(foo,8,9)

def_call(foo,7,8)

def_call(foo,6,7)

def_call(foo,5,6)

def_call(foo,4,5)

def_call(foo,3,4)

def_call(foo,2,3)

def_call(foo,1,2)

def_init_alloc(foo,1,256)

def_final_abort(bar,5)

def_call(bar,4,5)

def_call(bar,3,4)

def_call(bar,2,3)

def_call(bar,1,2)

def_init(bar,1)

int main(int argc, const char * argv[])

{

    foo();

    bar();

} 

- Dmitry Vostokov @ DumpAnalysis.org + TraceAnalysis.org -

Forthcoming Training: Accelerated Mac OS X Core Dump Analysis

Trace Analysis Patterns (Part 50)

Sunday, June 3rd, 2012

In some cases it is useful to consider Message Context: a set of surrounding messages having some relation to the chosen message:

- Dmitry Vostokov @ DumpAnalysis.org + TraceAnalysis.org -

Trace Analysis Patterns (Part 49)

Sunday, June 3rd, 2012

Sometimes we need to pay attention to Error Distribution, for example, the distribution of the same error across a software log space or different error messages in different parts of the same software log or trace (providing effective partition):

- Dmitry Vostokov @ DumpAnalysis.org + TraceAnalysis.org -

Trace Analysis Patterns (Part 48)

Saturday, June 2nd, 2012

If Break-in Activity is usually unrelated to a thread or an adjoint thread which has a discontinuity then Resume Activity pattern highlights messages from that thread:

The difference can be seen on the following graphical representation of two traces we analyzed where in a working trace a break-in preceded resume activity whereas in a non-working trace both patterns were absent:

- Dmitry Vostokov @ DumpAnalysis.org + TraceAnalysis.org -

Trace Analysis Patterns (Part 47)

Saturday, June 2nd, 2012

We resume our software trace analysis pattern catalog. The next pattern is called Break-in Activity. This is a message or a set of messages that surface just before the end of discontinuity of a adjoint thread and possibly triggered it:

- Dmitry Vostokov @ DumpAnalysis.org + TraceAnalysis.org -

Crash Dump Analysis Patterns (Part 23a, Mac OS X)

Tuesday, May 29th, 2012

This is a Mac OS X / GDB counterpart to Double Free (process heap) pattern previously described for Windows platforms:

(gdb) bt
#0 0x00007fff8479582a in __kill ()
#1 0x00007fff8e0e0a9c in abort ()
#2 0x00007fff8e13f84c in free ()
#3 0x00000001035a8ef4 in main (argc=1, argv=0x7fff631a7b20)

(gdb) x/2i 0x00000001035a8ef4-8
0x1035a8eec : mov -0×20(%rbp),%edi
0×1035a8eef : callq 0×1035a8f06

(gdb) frame 3
#3 0x00000001035a8ef4 in main (argc=1, argv=0x7fff631a7b20)
at .../DoubleFree/main.c:23
23 free(p2);
Current language: auto; currently minimal

(gdb) x/g $rbp-0x20
0x7fff631a7ae0: 0x00007fe6a8801400

(gdb) x/2w 0x00007fe6a8801400
0x7fe6a8801400: 0x00000000 0xb0000000

Here’s the source code of the modeling application:

int main(int argc, const char * argv[])

{

    char *p1 = (char *) malloc (1024);

    printf(“p1 = %p\n”, p1);

 

    char *p2 = (char *) malloc (1024);

    printf(“p2 = %p\n”, p2);

 

    free(p2);

    free(p1);

    free(p2);

 

    return 0;

} 

- Dmitry Vostokov @ DumpAnalysis.org + TraceAnalysis.org -

Forthcoming Training: Accelerated Mac OS X Core Dump Analysis

Fiction for Debugging: The Problem and The Solution

Monday, May 28th, 2012

After writing about music for debugging and founding software narratology I decided to start writing about fiction. The first masterpiece is The Sound and The Fury by William Faulkner. I confess that I’m in love with Folio Society books and when I saw this color version (an original idea by Faulkner now fulfilled by modern printing technology) I immediately recognized its importance for software trace analysis:

http://www.foliosociety.com/book/SAF/sound-and-the-fury

I’m pretty sure Faulkner would have been delighted to see trace analysis patterns and how they may help in writing fiction.

- Dmitry Vostokov @ DumpAnalysis.org + TraceAnalysis.org -

Crash Dump Analysis Patterns (Part 2, Mac OS X)

Sunday, May 27th, 2012

This is a Mac OS X / GDB counterpart to Dynamic Memory Corruption (process heap) pattern previously described for Windows platforms:

(gdb) bt
#0 0x00007fff8479582a in __kill ()
#1 0x00007fff8e0e0a9c in abort ()
#2 0x00007fff8e1024ac in szone_error ()
#3 0x00007fff8e1024e8 in free_list_checksum_botch ()
#4 0x00007fff8e102a7b in small_free_list_remove_ptr ()
#5 0x00007fff8e106bf7 in szone_free_definite_size ()
#6 0x00007fff8e13f789 in free ()
#7 0x000000010afafe23 in main (argc=1, argv=0x7fff6abaeb08)

Here’s the source code of the modeling application:

int main(int argc, const char * argv[])

{

    char *p1 = (char *) malloc (1024);

    printf(“p1 = %p\n”, p1);

 

    char *p2 = (char *) malloc (1024);

    printf(“p2 = %p\n”, p2);

 

    char *p3 = (char *) malloc (1024);

    printf(“p3 = %p\n”, p3);

 

    char *p4 = (char *) malloc (1024);

    printf(“p4 = %p\n”, p4);

 

    char *p5 = (char *) malloc (1024);

    printf(“p5 = %p\n”, p5);

 

    char *p6 = (char *) malloc (1024);

    printf(“p6 = %p\n”, p6);

 

    char *p7 = (char *) malloc (1024);

    printf(“p7 = %p\n”, p7);

 

    free(p6);

    free(p4);

    free(p2);

 

    printf(“Hello Crash!\n”);        

    strcpy(p2, “Hello Crash!”);

    strcpy(p4, “Hello Crash!”);

    strcpy(p6, “Hello Crash!”);

 

    p2 = (char *) malloc (512);

    printf(“p2 = %p\n”, p2);

 

    p4 = (char *) malloc (1024);

    printf(“p4 = %p\n”, p4);

 

    p6 = (char *) malloc (512);

    printf(“p6 = %p\n”, p6);

 

    free (p7);

    free (p6);

    free (p5);

    free (p4);

    free (p3);

    free (p2);

    free (p1);

 

    return 0;

}

- Dmitry Vostokov @ DumpAnalysis.org + TraceAnalysis.org -

Forthcoming Training: Accelerated Mac OS X Core Dump Analysis

Crash Dump Analysis Patterns (Part 20d)

Saturday, May 19th, 2012

This is a specialization of Insufficient Memory (kernel pool) pattern called Memory Leak (I/O completion packets). The currently unique diagnostics this pattern provides in comparison with other kernel pool tags is that the pool allocation entries show the leaking process:

0: kd> !poolused 3
Sorting by  NonPaged Pool Consumed

Pool Used:
NonPaged                    Paged
Tag    Allocs    Frees     Diff     Used   Allocs    Frees     Diff     Used
Icp   1294074    42875  1251199 96642976        0        0        0        0 I/O completion packets queue on a completion ports
[…]

0: kd> !poolfind Icp

Scanning large pool allocation table for Tag: Icp  (fffffa8013e00000 : fffffa8014100000)

*fffffa800e188260 size:   50 previous size:   40  (Allocated) Icp  Process: fffffa800899dc40
*fffffa800e1882e0 size:   50 previous size:   30  (Allocated) Icp  Process: fffffa800899dc40
*fffffa800e188330 size:   50 previous size:   50  (Allocated) Icp  Process: fffffa800899dc40
*fffffa800e188380 size:   50 previous size:   50  (Allocated) Icp  Process: fffffa800899dc40
*fffffa800e1883d0 size:   50 previous size:   50  (Allocated) Icp  Process: fffffa800899dc40
*fffffa800e188420 size:   50 previous size:   50  (Allocated) Icp  Process: fffffa800899dc40
*fffffa800e188470 size:   50 previous size:   50  (Allocated) Icp  Process: fffffa800899dc40
*fffffa800e1884c0 size:   50 previous size:   50  (Allocated) Icp  Process: fffffa800899dc40

0: kd> !process  fffffa800899dc40 1
PROCESS fffffa800899dc40
SessionId: 0  Cid: 43a4    Peb: 7efdf000  ParentCid: 0412
DirBase: 09d6b000  ObjectTable: fffff8a0046c8c10  HandleCount: 1068.
Image: ServiceA.exe
[…]

- Dmitry Vostokov @ DumpAnalysis.org + TraceAnalysis.org -

Crash Dump Analysis Patterns (Part 172)

Saturday, May 19th, 2012

Recently I observed a few occurrences of a rare No Current Thread pattern in a large set of process memory dumps:

0:???> k
WARNING: The debugger does not have a current process or thread
WARNING: Many commands will not work
^ Illegal thread error in ‘k’

0:???> ~
WARNING: The debugger does not have a current process or thread
WARNING: Many commands will not work
0  Id: 95f4.6780 Suspend: 1 Teb: 7efdd000 Unfrozen

Setting a current thread helps:

0:???> ~0s
WARNING: The debugger does not have a current process or thread
WARNING: Many commands will not work
eax=037d0010 ebx=0002bda0 ecx=03b1a010 edx=00000007 esi=037d0010 edi=03b069fc
eip=0397939f esp=0018fd98 ebp=0018fdd8 iopl=0  nv up ei pl nz na po nc
cs=0023  ss=002b  ds=002b  es=002b  fs=0053  gs=002b  efl=00200202
DllA+0×939f:
0397939f 8b10 mov edx,dword ptr [eax] ds:002b:037d0010=03b1a010

0:000> k
ChildEBP RetAddr
WARNING: Stack unwind information not available. Following frames may be wrong.
0018fdd8 03975257 DllA+0x939f
0018fdf8 03975577 DllA+0x5257
0018fe58 772bb9a0 DllA+0x5577
0018fe78 772d9b96 ntdll!LdrpCallInitRoutine+0x14
0018ff1c 772d9a38 ntdll!LdrShutdownProcess+0x1aa
0018ff30 752279f4 ntdll!RtlExitUserProcess+0x74
0018ff44 0040625d kernel32!ExitProcessStub+0x12
0018ff5c 012528e5 Application+0x625d
0018ff88 7522339a Application!foo+0xdc88f1
0018ff94 772bbf42 kernel32!BaseThreadInitThunk+0xe
0018ffd4 772bbf15 ntdll!__RtlUserThreadStart+0x70
0018ffec 00000000 ntdll!_RtlUserThreadStart+0x1b

However, EIP of the new current thread doesn’t point to any access violation and the dereferenced address is valid:

0:000> !address 037d0010
Usage:                  <unclassified>
Allocation Base:        037d0000
Base Address:           037d0000
End Address:            038dd000
Region Size:            0010d000
Type:                   00020000 MEM_PRIVATE
State:                  00001000 MEM_COMMIT
Protect:                00000004 PAGE_READWRITE

Also, if we inspect the raw stack data we won’t find any hidden exceptions there. So we conclude that the missing thread was exceptional. Indeed, there is a saved exception context in the process memory dump:

0:000> .exr -1
ExceptionAddress: 08a9ae18 (<Unloaded_DllB.dll>+0x001cae18)
ExceptionCode: c0000005 (Access violation)
ExceptionFlags: 00000001
NumberParameters: 1
Parameter[0]: 00000008

- Dmitry Vostokov @ DumpAnalysis.org + TraceAnalysis.org -

Memorandum (Debugging Slang, Part 31)

Thursday, May 10th, 2012

Memorandum - when memory ran dump.

Examples: We got a few memorandums from that market leader.

- Dmitry Vostokov @ DumpAnalysis.org + TraceAnalysis.org -

Crash Dump Analysis Patterns (Part 14, Mac OS X)

Wednesday, May 9th, 2012

This is a Mac OS X / GDB counterpart to Spiking Thread pattern previously described for Windows platforms:

(gdb) info threads
4 0×00007fff85b542df in sqrt$fenv_access_off ()
3 0×00007fff8616ee42 in __semwait_signal ()
2 0×00007fff8616ee42 in __semwait_signal ()
* 1 0×00007fff8616ee42 in __semwait_signal ()

We notice a non-waiting thread and switch to it:

(gdb) thread 4
[Switching to thread 4 (core thread 3)]
0x00007fff85b542df in sqrt$fenv_access_off ()

(gdb) bt
#0  0x00007fff85b542df in sqrt$fenv_access_off ()
#1  0×000000010cc85dc9 in thread_three (arg=0×7fff6c884ac0)
#2  0×00007fff8fac68bf in _pthread_start ()
#3  0×00007fff8fac9b75 in thread_start ()

If we disassemble the return address for thread_three function to come back from sqrt call we see an infinite loop:

(gdb) disass 0x000000010cc85dc9
Dump of assembler code for function thread_three:
0x000000010cc85db0 <thread_three+0>: push   %rbp
0×000000010cc85db1 <thread_three+1>: mov    %rsp,%rbp
0×000000010cc85db4 <thread_three+4>: sub    $0×10,%rsp
0×000000010cc85db8 <thread_three+8>: mov    %rdi,-0×10(%rbp)
0×000000010cc85dbc <thread_three+12>: mov    -0×10(%rbp),%ax
0×000000010cc85dc0 <thread_three+16>: movsd  (%rax),%xmm0
0×000000010cc85dc4 <thread_three+20>: callq  0×10cc85eac <dyld_stub_sqrt>
0×000000010cc85dc9 <thread_three+25>: mov    -0×10(%rbp),%rax
0×000000010cc85dcd <thread_three+29>: movsd  %xmm0,(%rax)
0×000000010cc85dd1 <thread_three+33>: jmpq   0×10cc85dbc <thread_three+12>
End of assembler dump.

Here’s the source code of the modeling application:

void * thread_one (void *arg)

{

    while (1)

    {

       sleep (1);

    }

 

    return 0;

}

 

void * thread_two (void *arg)

{

    while (1)

    {

        sleep (2);

    }

 

    return 0;

}

 

void * thread_three (void *arg)

{

    while (1)

    {

        *(double*)arg=sqrt(*(double *)arg);

    }

 

    return 0;

}

 

int main(int argc, const char * argv[])

{

    pthread_t threadID_one, threadID_two, threadID_three;

 

    double result = 0xffffffff;

 

    pthread_create (&threadID_one, NULL, thread_one, NULL);

    pthread_create (&threadID_two, NULL, thread_two, NULL);

    pthread_create (&threadID_three, NULL, thread_three,

       &result);

 

    pthread_join(threadID_three, NULL);

 

    return 0;

}

- Dmitry Vostokov @ DumpAnalysis.org + TraceAnalysis.org -

Forthcoming Training: Accelerated Mac OS X Core Dump Analysis

Crash Dump Analysis Patterns (Part 6a, Mac OS X)

Thursday, May 3rd, 2012

This is a Mac OS X / GDB counterpart to NULL Pointer (code) pattern previously described for Windows platforms:

(gdb) bt
#0 0×0000000000000000 in ?? ()
#1 0×000000010e8cce73 in bar (ps=0×7fff6e4cbac0)
#2 0×000000010e8cce95 in foo (ps=0×7fff6e4cbac0)
#3 0×000000010e8cced5 in main (argc=1, argv=0×7fff6e4cbb08)

(gdb) disass 0×000000010e8cce73-3 0×000000010e8cce73
Dump of assembler code from 0×10e8cce70 to 0×10e8cce73:
0×000000010e8cce70 : callq *0×8(%rdi)
End of assembler dump.

(gdb) info r rdi
rdi 0x7fff6e4cbac0 140735043910336

(gdb) x/2 0x7fff6e4cbac0
0x7fff6e4cbac0: 0x0000000a 0×00000000

(gdb) p/x *($rdi+8)
$7 = 0×0

(gdb) bt
#0 0x0000000000000000 in ?? ()
#1 0x000000010e8cce73 in bar (ps=0×7fff6e4cbac0)
#2 0×000000010e8cce95 in foo (ps=0×7fff6e4cbac0)
#3 0×000000010e8cced5 in main (argc=1, argv=0×7fff6e4cbb08)

(gdb) ptype MYSTRUCT
type = struct _MyStruct_tag {
int data;
PFUNC pfunc;
}

(gdb) print {MYSTRUCT}0×7fff6e4cbac0
$2 = {data = 10, pfunc = 0}

Here’s the source code of the modeling application:

typedef void (*PFUNC)(void);

 

typedef struct _MyStruct_tag

{

    int   data;

    PFUNC pfunc;

} MYSTRUCT;

 

void bar(MYSTRUCT *ps)

{

    ps->pfunc();

}

 

void foo(MYSTRUCT *ps)

{

    bar(ps);

}

 

int main(int argc, const char * argv[])

{

    MYSTRUCT pstruct = {10, NULL};

 

    foo(&pstruct);

 

    return 0;

} 

- Dmitry Vostokov @ DumpAnalysis.org + TraceAnalysis.org -

Forthcoming Training: Accelerated Mac OS X Core Dump Analysis

GDB for WinDbg Users (Part 8)

Monday, April 30th, 2012

As we started providing memory dump analysis pattern examples for Mac OS X we resume our table of command correspondence between WinDbg and GDB providing some corrections on the way. For example, in the previous version of table we omitted a correspondence to ub WinDbg command. Now we provide such an equivalent:

(gdb) bt
[...]
#1 0×000000010e8cce73 in bar (ps=0×7fff6e4cbac0)
[…]

(gdb) disas 0×000000010e8cce73-10 0×000000010e8cce73
Dump of assembler code from 0×10e8cce69 to 0×10e8cce73:
0×000000010e8cce69 : mov %edi,-0×8(%rbp)
0×000000010e8cce6c : mov -0×8(%rbp),%rdi
0×000000010e8cce70 : callq *0×8(%rdi)
End of assembler dump.

Please note that the beginning of assembly will be dependent on how good we guessed the offset:

(gdb) disas 0x000000010e8cce73-0×10 0×000000010e8cce73
Dump of assembler code from 0×10e8cce63 to 0×10e8cce73:
0×000000010e8cce63 : in $0×48,%eax
0×000000010e8cce65 : sub $0×10,%esp
0×000000010e8cce68 : mov %rdi,-0×8(%rbp)
0×000000010e8cce6c : mov -0×8(%rbp),%rdi
0×000000010e8cce70 : callq *0×8(%rdi)
End of assembler dump.

(gdb) disas 0x000000010e8cce73-0×13 0×000000010e8cce73
Dump of assembler code from 0×10e8cce60 to 0×10e8cce73:
0×000000010e8cce60 : push %rbp
0×000000010e8cce61 : mov %rsp,%rbp
0×000000010e8cce64 : sub $0×10,%rsp

0×000000010e8cce68 : mov %rdi,-0×8(%rbp)
0×000000010e8cce6c : mov -0×8(%rbp),%rdi
0×000000010e8cce70 : callq *0×8(%rdi)
End of assembler dump.

However, we can ignore that because our goal is to check whether a CPU instruction before a return address is a call.

Additional commands we add are x/<N>bc for db (WinDbg), thread <N> for ~<N>s (WinDbg, process dumps), maintenance info sections for for !address (WinDbg), add-symbol-file for .reload (WinDbg), info r for r (WinDbg).

Action                      | GDB                 | WinDbg
----------------------------------------------------------------
Start the process           | run                 | g
Exit                        | (q)uit              | q
Disassemble (forward)       | (disas)semble       | uf, u
Disassemble N instructions  | x/<N>i              | -
Disassemble (backward)      | disas <a-o> <a>     | ub
Stack trace                 | backtrace (bt)      | k
Full stack trace            | bt full             | kv
Stack trace with parameters | bt full             | kP
Partial trace (innermost)   | bt <N>              | k <N>
Partial trace (outermost)   | bt -<N>             | -
Stack trace for all threads | thread apply all bt | ~*k
Breakpoint                  | break               | bp
Frame numbers               | any bt command      | kn
Select frame                | frame               | .frame
Display parameters          | info args           | dv /t /i /V
Display locals              | info locals         | dv /t /i /V
Dump byte char array        | x/<N>bc             | db
Switch to thread            | thread <N>          | ~<N>s
Sections/regions            | maint info sections | !address
Load symbol file            | add-symbol-file     | .reload
CPU registers               | i(nfo) r            | r

Now an advertisement command:

(gdb) info training
(gdb) Accelerated Mac OS X Core Dump Analysis training

- Dmitry Vostokov @ DumpAnalysis.org + TraceAnalysis.org -