Archive for the ‘Crash Dump Analysis’ Category

Forthcoming 2nd edition of Memory Dump Analysis Anthology, Volume 1

Sunday, April 15th, 2012

After 4 years in print this bestselling title needs an update to address minor changes, include extra examples and reference additional research published in Volumes 2, 3, 4, 5 and 6.

  • Title: Memory Dump Analysis Anthology, Volume 1
  • Author: Dmitry Vostokov
  • Publisher: OpenTask (Summer 2012)
  • Language: English
  • Product Dimensions: 22.86 x 15.24
  • Paperback: 800 pages
  • ISBN-13: 978-1-908043-35-1
  • Hardcover: 800 pages
  • ISBN-13: 978-1-908043-36-8

The cover for both paperback and hardcover titles will also have a matte finish. We used A Memory Window artwork for the back cover.

- Dmitry Vostokov @ DumpAnalysis.org + TraceAnalysis.org -

Crash Dump Analysis Patterns (Part 13h)

Monday, April 9th, 2012

Allocated dynamic memory such as process heap can remain reserved after deallocation and its virtual memory region might become unavailable for usage. One example of this I encountered recently while debugging a .NET service. During a peak usage it reported various out-of-memory events but its managed heap was healthy and didn’t consume much. However, its process heap statistics showed a large reserved heap segment missing in a similar memory dump from a development environment. Remaining allocated entries in that heap segment contained a specific module hint that allowed us to suggest removing a 3rd-party product from a production environment.

In order to provide the proof of that possible scenario of reserved heap regions we created a special modeling application:

int _tmain(int argc, _TCHAR* argv[])
{
  static char *pAlloc[1000000];
  for (int i = 0; i < 1000000; i++)
  {
    pAlloc[i] = (char *)malloc (1000);
  }
  getc(stdin);
  for (int i = 0; i < 1000000; i++)
  {
     free(pAlloc[i]);
  }
  getc(stdin);
  return 0;
}

Here’s the debugging log:

0:001> .symfix c:\mss

0:001> .reload
Reloading current modules
.....

After allocation:

0:001> !heap -s
LFH Key                   : 0x156356e0
Termination on corruption : ENABLED
Heap     Flags   Reserv  Commit  Virt   Free  List   UCR  Virt  Lock  Fast
(k)     (k)    (k)     (k) length      blocks cont. heap
—————————————————————————–
00520000 00000002    1024    112   1024      8     1     1    0      0   LFH
007e0000 00001002 1019328 1012444 1019328    131    68    67    0      0   LFH
—————————————————————————–

0:001> g
(1588.14b0): Break instruction exception - code 80000003 (first chance)
eax=7efda000 ebx=00000000 ecx=00000000 edx=770ff85a esi=00000000 edi=00000000
eip=7707000c esp=00f0f7e4 ebp=00f0f810 iopl=0         nv up ei pl zr na pe nc
cs=0023  ss=002b  ds=002b  es=002b  fs=0053  gs=002b             efl=00000246
ntdll!DbgBreakPoint:
7707000c cc              int     3

After deallocation:

0:001> !heap -s
LFH Key                   : 0x156356e0
Termination on corruption : ENABLED
Heap     Flags   Reserv  Commit  Virt   Free  List   UCR  Virt  Lock  Fast
(k)     (k)    (k)     (k) length      blocks cont. heap
—————————————————————————–
00520000 00000002    1024    112   1024      8     1     1    0      0   LFH
007e0000 00001002 1019328  73040 1019328  71365   419   165    0      0   LFH
External fragmentation  97 % (419 free blocks)
Virtual address fragmentation  92 % (165 uncommited ranges)
—————————————————————————–

0:001> !address -summary
--- Usage Summary ---------------- RgnCount ----------- Total Size -------- %ofBusy %ofTotal
Free                                     26          3fbe7000 (1019.902 Mb)           49.80%
<unclassified>                          752          3f8ec000 (1016.922 Mb)  98.92%   49.66%
Image                                    41            76b000 (   7.418 Mb)   0.72%    0.36%
Stack                                     6            200000 (   2.000 Mb)   0.19%    0.10%
MemoryMappedFile                          8            1af000 (   1.684 Mb)   0.16%    0.08%
TEB                                       2              2000 (   8.000 kb)   0.00%    0.00%
PEB                                       1              1000 (   4.000 kb)   0.00%    0.00%

--- Type Summary (for busy) ------ RgnCount ----------- Total Size -------- %ofBusy %ofTotal
MEM_PRIVATE                             734          3f8a2000 (1016.633 Mb)  98.89%   49.64%
MEM_IMAGE                                68            9b8000 (   9.719 Mb)   0.95%    0.47%
MEM_MAPPED                                8            1af000 (   1.684 Mb)   0.16%    0.08%

--- State Summary ---------------- RgnCount ----------- Total Size -------- %ofBusy %ofTotal
MEM_FREE                                 26          3fbe7000 (1019.902 Mb)           49.80%
MEM_RESERVE                             374          3f6e8000 (1014.906 Mb)  98.72%   49.56%
MEM_COMMIT                              436            d21000 (  13.129 Mb)   1.28%    0.64%

--- Protect Summary (for commit) - RgnCount ----------- Total Size -------- %ofBusy %ofTotal
PAGE_READWRITE                          383            725000 (   7.145 Mb)   0.69%    0.35%
PAGE_EXECUTE_READ                        10            414000 (   4.078 Mb)   0.40%    0.20%
PAGE_READONLY                            29            1cd000 (   1.801 Mb)   0.18%    0.09%
PAGE_WRITECOPY                           10             12000 (  72.000 kb)   0.01%    0.00%
PAGE_READWRITE|PAGE_GUARD                 4              9000 (  36.000 kb)   0.00%    0.00%

--- Largest Region by Usage ----------- Base Address -------- Region Size ----------
Free                                        3f0c0000          33050000 ( 816.313 Mb)
<unclassified>                              158a1000            fcf000 (  15.809 Mb)
Image                                        1083000            3d1000 (   3.816 Mb)
Stack                                         200000             fd000 (1012.000 kb)
MemoryMappedFile                            7efe5000             fb000 (1004.000 kb)
TEB                                         7efda000              1000 (   4.000 kb)
PEB                                         7efde000              1000 (   4.000 kb)

We see that free memory available for allocation is only 816 Mb.

- Dmitry Vostokov @ DumpAnalysis.org + TraceAnalysis.org -

Forthcoming Introduction to Pattern-Driven Software Diagnostics

Monday, April 9th, 2012

Memory Dump Analysis Services organizes a free Webinar on Unified Software Diagnostics (USD) and the new scalable cost-effective software support model called Pattern-Driven Software Support devised to address various shortcomings in existing tiered software support organizations. Examples cover Windows, Mac OS  and Linux.

 Introduction to Pattern-Driven Software Diagnostics Logo

Date: 22nd of June, 2012
Time: 17:00 (BST) 12:00 (EST) 09:00 (PST)
Duration: 60 minutes

Space is limited.
Reserve your Webinar seat now at:
https://www3.gotomeeting.com/register/172771078

- Dmitry Vostokov @ DumpAnalysis.org + TraceAnalysis.org -

Crash Dump Analysis Patterns (Part 1, Mac OS X)

Wednesday, April 4th, 2012

The first Windows pattern called Multiple Exceptions in user mode now has Mac OS X equivalent. In the example below there are 3 threads and two of them experienced NULL Pointer (data) access violation exception:

(gdb) thread apply all bt full

Thread 3 (core thread 2):
#0  0x00000001062ffe4e in thread_two (arg=0x0)
at main.c:24
p = (int *) 0×0
#1  0×00007fff8abf58bf in _pthread_start ()
No symbol table info available.
#2  0×00007fff8abf8b75 in thread_start ()
No symbol table info available.

Thread 2 (core thread 1):
#0  0x00000001062ffe1e in thread_one (arg=0x0)
at main.c:16
p = (int *) 0×0
#1  0×00007fff8abf58bf in _pthread_start ()
No symbol table info available.
#2  0×00007fff8abf8b75 in thread_start ()
No symbol table info available.

Thread 1 (core thread 0):
#0  0x00007fff854e0e42 in __semwait_signal ()
No symbol table info available.
#1  0x00007fff8ababdea in nanosleep ()
No symbol table info available.
#2  0x00007fff8ababc2c in sleep ()
No symbol table info available.
#3  0x00000001062ffec3 in main (argc=1, argv=0x7fff65efeab8)
at main.c:36
threadID_one = (pthread_t) 0×1063b4000
threadID_two = (pthread_t) 0×106581000

(gdb) thread 2
[Switching to thread 2 (core thread 1)]
0x00000001062ffe1e in thread_one (arg=0x0)
at main.c:16
16    *p = 1;

(gdb) p/x p
$1 = 0×0

(gdb) thread 3
[Switching to thread 3 (core thread 2)]
0x00000001062ffe4e in thread_two (arg=0x0)
at main.c:24
24    *p = 2;

(gdb) p/x p
$2 = 0×0

- Dmitry Vostokov @ DumpAnalysis.org + TraceAnalysis.org -

Crash Dumps, Acquisitions and Layoffs (Part 1)

Sunday, April 1st, 2012

This is a story I would like to present for a possible discussion. A leading debugging services company A was contacted by a company B to analyze dozens of crash dumps for a modest fee*. No NDA was signed and the work was done promptly. Unfortunately there was no response to an invoice and after some time a representative from the company A contacted the company B and got an immediate reply that they were bought by a company C. The new invoice was requested and promptly sent by the company A to the company C. More time passed beyond any reasonable time frame and a representative from the company A contacted the company C again and didn’t get an immediate reply as before. After some time out of the blue came a group reply from all high execs saying that the modest fee cannot be paid because it had to be done by the company B that they bought. And by the way all guys from the company B dealing with the company A were no longer with the company C. The company A pointed out that by an implicit agreement an act of nonpayment from the company B due to unforeseen circumstances automatically made all crash dumps submitted to the company A a property of the company A.

Many would say please contact a lawyer but the modest fee doesn’t worth such a contact. When I devise a happy ending I write a second part.

* an average one day exec salary or less

- Dmitry Vostokov @ DumpAnalysis.org + TraceAnalysis.org -

Crash Dump Analysis Patterns (Part 25, Mac OS X)

Sunday, March 25th, 2012

This is a Mac OS X / GDB counterpart to Stack Trace pattern previously described for Windows platforms. Here we show a stack trace when symbols are not available and also how to apply symbols:

(gdb) bt
#0  0×000000010d3b0e90 in ?? ()
#1  0×000000010d3b0ea9 in ?? ()
#2  0×000000010d3b0ec4 in ?? ()
#3  0×000000010d3b0e74 in ?? ()

(gdb) maintenance info sections
Exec file:
[...]
Core file:
`/cores/core.262', file type mach-o-le.
0×000000010d3b0000->0×000000010d3b1000 at 0×00001000: LC_SEGMENT. ALLOC LOAD CODE HAS_CONTENTS
0×000000010d3b1000->0×000000010d3b2000 at 0×00002000: LC_SEGMENT. ALLOC LOAD CODE HAS_CONTENTS
0×000000010d3b2000->0×000000010d3b3000 at 0×00003000: LC_SEGMENT. ALLOC LOAD CODE HAS_CONTENTS
0×000000010d3b3000->0×000000010d3b4000 at 0×00004000: LC_SEGMENT. ALLOC LOAD CODE HAS_CONTENTS
0×000000010d3b4000->0×000000010d3b5000 at 0×00005000: LC_SEGMENT. ALLOC LOAD CODE HAS_CONTENTS
0×000000010d3b5000->0×000000010d3b6000 at 0×00006000: LC_SEGMENT. ALLOC LOAD CODE HAS_CONTENTS
0×000000010d3b6000->0×000000010d3cb000 at 0×00007000: LC_SEGMENT. ALLOC LOAD CODE HAS_CONTENTS
0×000000010d3cb000->0×000000010d3cc000 at 0×0001c000: LC_SEGMENT. ALLOC LOAD CODE HAS_CONTENTS
0×000000010d3cc000->0×000000010d3cd000 at 0×0001d000: LC_SEGMENT. ALLOC LOAD CODE HAS_CONTENTS
0×000000010d3cd000->0×000000010d3e2000 at 0×0001e000: LC_SEGMENT. ALLOC LOAD CODE HAS_CONTENTS
0×000000010d3e2000->0×000000010d3e3000 at 0×00033000: LC_SEGMENT. ALLOC LOAD CODE HAS_CONTENTS
0×000000010d3e3000->0×000000010d3e4000 at 0×00034000: LC_SEGMENT. ALLOC LOAD CODE HAS_CONTENTS
0×000000010d400000->0×000000010d500000 at 0×00035000: LC_SEGMENT. ALLOC LOAD CODE HAS_CONTENTS
[…]

(gdb) add-symbol-file ~/Documents/Work/Test.sym 0×000000010d3b0000
add symbol table from file “/Users/DumpAnalysis/Documents/Work/Test.sym” at
LC_SEGMENT.__TEXT = 0×10d3b0000
(y or n) y
Reading symbols from /Users/DumpAnalysis/Documents/Work/Test.sym…done.

(gdb) bt
#0  0x000000010d3b0e90 in bar () at main.c:15
#1  0x000000010d3b0ea9 in foo () at main.c:20
#2  0x000000010d3b0ec4 in main (argc=1,
argv=0x7fff6cfafbf8) at main.c:25

- Dmitry Vostokov @ DumpAnalysis.org + TraceAnalysis.org -

Forthcoming Training: Accelerated Mac OS X Core Dump Analysis

GDB Annoyances: Incomplete Stack Trace

Sunday, March 25th, 2012

Users of WinDbg debugger accustomed to full thread stack traces will wonder whether a thread starts from main:

(gdb) where
#0  0x000000010d3b0e90 in bar () at main.c:15
#1  0x000000010d3b0ea9 in foo () at main.c:20
#2  0x000000010d3b0ec4 in main (argc=1,
argv=0x7fff6cfafbf8) at main.c:25

Of course, not and by default a stack trace is shown starting from main function. You can change this behavior by using the following command:

(gdb) set backtrace past-main

Now we see an additional frame:

(gdb) where
#0  0x000000010d3b0e90 in bar () at main.c:15
#1  0x000000010d3b0ea9 in foo () at main.c:20
#2  0x000000010d3b0ec4 in main (argc=1,
argv=0x7fff6cfafbf8) at main.c:25
#3  0×000000010d3b0e74 in start ()

- Dmitry Vostokov @ DumpAnalysis.org + TraceAnalysis.org -

Forthcoming Training: Accelerated Mac OS X Core Dump Analysis

Crash Dump Analysis Patterns (Part 6b, Mac OS X)

Sunday, March 25th, 2012

This is a Mac OS X / GDB counterpart to NULL Pointer (data) pattern previously described for Windows platforms:

(gdb) bt
#0  0×000000010d3b0e90 in bar () at main.c:15
#1  0×000000010d3b0ea9 in foo () at main.c:20
#2  0×000000010d3b0ec4 in main (argc=1,
argv=0×7fff6cfafbf8) at main.c:25

(gdb) disassemble
Dump of assembler code for function bar:
0x000000010d3b0e80 <bar+0>: push   %rbp
0×000000010d3b0e81 <bar+1>: mov    %rsp,%rbp
0×000000010d3b0e84 <bar+4>: movq   $0×0,-0×8(%rbp)
0×000000010d3b0e8c <bar+12>: mov    -0×8(%rbp),%rax
0×000000010d3b0e90 <bar+16>: movl   $0×1,(%rax)
0×000000010d3b0e96 <bar+22>: pop    %bp
0×000000010d3b0e97 <bar+23>: retq
End of assembler dump.

(gdb) p/x $rax
$1 = 0×0

- Dmitry Vostokov @ DumpAnalysis.org + TraceAnalysis.org -

Forthcoming Training: Accelerated Mac OS X Core Dump Analysis

WinDbg shortcuts: !heap -x -v

Friday, March 23rd, 2012

The following command is useful for searching a process virtual space for any value references:

!heap -x -v <value> ”will search the entire virtual memory space of the current process for pointers to this” value (from WinDbg help).

Example:

0:000> !heap -x -v 6e412d82
Search VM for address range 6e412d82 - 6e412d82 : 778042bc (6e412d82),

0:000> dp 778042bc l1
778042bc  6e412d82

0:000> !heap -x -v c0000005
Search VM for address range c0000005 - c0000005 : 014df8d0 (c0000005), 014dfe8c (c0000005), 0155d908 (c0000005), 0155dd10 (c0000005), 0155ddc8 (c0000005), 0155dfa8 (c0000005), 0155dff0 (c0000005), 0155ea20 (c0000005), 6d000f9c (c0000005), 70d44054 (c0000005), 725c30d4 (c0000005), 7270d20c (c0000005), 7282ef74 (c0000005), 7449a878 (c0000005), 74511958 (c0000005), 74562ec4 (c0000005), 74563280 (c0000005), 74564fc8 (c0000005), 7456562c (c0000005), 74565748 (c0000005), 745664a8 (c0000005), 74566a30 (c0000005), 74566ad8 (c0000005), 747f6730 (c0000005), 747f682c (c0000005), 74861ef0 (c0000005), 7488743c (c0000005), 748aea68 (c0000005), 748b2830 (c0000005), 748c5118 (c0000005), 74935068 (c0000005), 749412a8 (c0000005), 7495caf0 (c0000005), 74a3a780 (c0000005), 74aa462c (c0000005), 74b19b68 (c0000005), 74b61060 (c0000005), 74b8fb44 (c0000005), 74b9d1c8 (c0000005), 74be1ad8 (c0000005), 74be72c8 (c0000005), 74c14b60 (c0000005), 74c83b84 (c0000005), 74c83b88 (c0000005), 74c83b9c (c0000005), 74c83ba0 (c0000005), 74c83ba4 (c0000005), 74c83ba8 (c0000005), 74c83bac (c0000005), 74c83bb0 (c0000005), 74c83bb4 (c0000005), 74c83bb8 (c0000005), 74c83bbc (c0000005), 74c83bc0 (c0000005), 74c83bc8 (c0000005), 74c83bcc (c0000005), 74c83bd0 (c0000005), 74c83bd4 (c0000005), 74c83bd8 (c0000005), 74c83bdc (c0000005), 74c83be0 (c0000005), 74c83be4 (c0000005), 74c83be8 (c0000005), 74c83bec (c0000005), 74c83bf0 (c0000005), 74c83bf4 (c0000005), 74c83bf8 (c0000005), 74c83bfc (c0000005), 74c83c00 (c0000005), 74c83c04 (c0000005), 74c83c08 (c0000005), 74c83c0c (c0000005), 74c83c10 (c0000005), 74c83c14 (c0000005), 74c83c18 (c0000005), 74c83c1c (c0000005), 74c83c20 (c0000005), 74c83c24 (c0000005), 74c83c28 (c0000005), 74c83c2c (c0000005), 74c83c34 (c0000005), 74c83c38 (c0000005), 74c83c3c (c0000005), 74c8c7ac (c0000005), 75019298 (c0000005), 750ff7b0 (c0000005), 751c1adc (c0000005), 751c2514 (c0000005), 7522c530 (c0000005), 752c311c (c0000005), 752d4734 (c0000005), 752d4ae8 (c0000005), 752d534c (c0000005), 752d7038 (c0000005), 752d7e9c (c0000005), 752eda04 (c0000005), 752edab0 (c0000005), 756d6624 (c0000005), 7571adc0 (c0000005), 7571addc (c0000005), 75723780 (c0000005), 757af774 (c0000005), 759c0f10 (c0000005), 76702360 (c0000005), 76703a30 (c0000005), 76d437ac (c0000005), 76d527ec (c0000005), 76dd0fa4 (c0000005), 77581f2c (c0000005), 777a33c0 (c0000005), 777c8b14 (c0000005),

- Dmitry Vostokov @ DumpAnalysis.org + TraceAnalysis.org -

Crash Dump Analysis Patterns (Part 169)

Thursday, March 22nd, 2012

This is another “blockage” pattern called Blocked DPC. Here we have blocked per-processor Deferred Procedure Call queues because of threads running on processors with IRQL > DISPATCH_LEVEL. For example, on the processor 11 (0×0b):

11: kd> !dpcs
CPU Type      KDPC       Function
3: Normal  : 0x8accacec 0xf710567a DriverA

5: Normal  : 0x89f449e4 0xf595b83a DriverB

7: Normal  : 0x8a63664c 0xf59e3f04 USBPORT!USBPORT_IsrDpc

11: Normal  : 0x8acb2cec 0xf710567a DriverA
11: Normal  : 0x8b5e955c 0xf73484e6 ACPI!ACPIInterruptServiceRoutineDPC

11: kd> !thread
THREAD 89806428  Cid 0934.0944  Teb: 7ffdb000 Win32Thread: bc17dda0 RUNNING on processor b
Not impersonating
DeviceMap                 e1002258
Owning Process            89972290       Image:         ApplicationA.exe
Attached Process          N/A            Image:         N/A
Wait Start TickCount      2863772        Ticks: 368905 (0:01:36:04.140)
Context Switch Count      145085                 LargeStack
UserTime                  00:00:00.015
KernelTime                01:36:04.203
Win32 Start Address MSVCR90!_threadstartex (0×7854345e)
Start Address kernel32!BaseThreadStartThunk (0×77e617ec)
Stack Init f3f63000 Current f3f62c4c Base f3f63000 Limit f3f5f000 Call 0
Priority 10 BasePriority 10 PriorityDecrement 0
ChildEBP RetAddr  Args to Child
f777d3b0 f3f62d28 00000010 00000000 00000000 hal!KeAcquireInStackQueuedSpinLockRaiseToSynch+0×36
WARNING: Frame IP not in any known module. Following frames may be wrong.
f777d3b4 00000000 00000000 00000000 00000000 0xf3f62d28

- Dmitry Vostokov @ DumpAnalysis.org + TraceAnalysis.org -

Meta-Memory Dump Patterns

Thursday, March 22nd, 2012

A page to reference all different kinds of patterns related to memory dumps as a whole and their properties is necessary, so I created this post:

I’ll update it as soon as I add more similar patterns.

- Dmitry Vostokov @ DumpAnalysis.org + TraceAnalysis.org -

Crash Dump Analysis Patterns (Part 168)

Tuesday, March 20th, 2012

This is a binary opposition counterpart to Early Crash Dump pattern called Late Crash Dump. It is usually saved after patterns representing a problem such as an exception thread stack trace are already gone. Most often we see one thread with process termination functions (Special Stack Trace pattern):

0:000> ~*k
ChildEBP RetAddr
0037fcf0 770bd55c ntdll!ZwTerminateProcess+0x12
0037fd0c 750f79f4 ntdll!RtlExitUserProcess+0x85
0037fdf8 750f339a kernel32!ExitProcessStub+0x12
0037fe04 770a9ef2 kernel32!BaseThreadInitThunk+0xe
0037fe44 770a9ec5 ntdll!__RtlUserThreadStart+0x70
0037fe5c 00000000 ntdll!_RtlUserThreadStart+0x1b

0:000> ~*k
ChildEBP RetAddr
0032faf0 77a9d55c ntdll!ZwTerminateProcess+0x12
0032fb0c 775579f4 ntdll!RtlExitUserProcess+0x85
0032fb20 74ac1720 kernel32!ExitProcessStub+0x12
0032fb28 74ac1a03 msvcr80!__crtExitProcess+0x14
0032fb64 74ac1a4b msvcr80!_cinit+0x101
0032fb74 01339bb3 msvcr80!exit+0xd
0032fbf8 7755339a App!__tmainCRTStartup+0x155
0032fc04 77a89ef2 kernel32!BaseThreadInitThunk+0xe
0032fc44 77a89ec5 ntdll!__RtlUserThreadStart+0x70
0032fc5c 00000000 ntdll!_RtlUserThreadStart+0x1b

However, sometimes, it is possible to see some execution residue left on a raw stack such as hidden exceptions, module hints, error codes and handled exceptions that might shed light on possible causes.

Another variant is when a memory dump is saved after a problem message box is dismissed or potentially disastrous exceptions such as access violations are handled and ignored to the fault in exception handling mechanism or severe corruption resuted in unresponsive process (hang).

- Dmitry Vostokov @ DumpAnalysis.org + TraceAnalysis.org -

What is a Software Narrative?

Tuesday, March 20th, 2012

The previous definition of software narratology was restricted to software traces and logs (the top left quadrant on a software narrative square, also the part of Memoretics which studies memory snapshots). Now, with the broadening of the domain of software narratology to the whole world of software narrative stories including actor interactions with software in construction requirements use cases and post-construction incidents we give another definition:

Software narrative is a representation of software events and changes of state. Software Narratology is a discipline that studies such software narratives (software narrative science).

- Dmitry Vostokov @ DumpAnalysis.org + TraceAnalysis.org -

Accelerated Mac OS X Core Dump Analysis Training

Saturday, March 3rd, 2012

Accelerated Mac OS X Core Dump Analysis Logo

Memory Dump Analysis Services organizes a new training course:

Description: Learn how to analyze app crashes and freezes, navigate through process core memory dump space and diagnose corruption, memory leaks, CPU spikes, blocked threads, deadlocks, wait chains, and much more. We use a unique and innovative pattern-driven analysis approach to speed up the learning curve. The training consists of practical step-by-step exercises using Xcode and GDB environments highlighting various patterns diagnosed in 64-bit process core memory dumps. The training also includes an overview of relevant similarities and differences between Windows and Mac OS X user space memory dump analysis useful for engineers with Wintel background.

If you are registered you are allowed to optionally submit your app core dumps before the training. This will allow us in addition to the carefully constructed problems tailor additional examples to the needs of the attendees.

The training consists of 2 two-hour sessions. When you finish the training you additionally get:

  1. A full transcript in PDF format (retail price $200)
  2. 6 volumes of Memory Dump Analysis Anthology in PDF format (retail price $120)
  3. A personalized attendance certificate with unique CID (PDF format)
  4. Mac OS X Debugging: Practical Foundations in PDF format (retail price $15)
  5. Free Dump Analysis World Network membership including updates to full PDF transcript Q&A section

Prerequisites: Basic Mac OS X troubleshooting and debugging

Audience: Software technical support and escalation engineers, system administrators, software developers and quality assurance engineers.

Session 1: October 19, 2012 4:00 PM - 6:00 PM BST
Session 2: October 22, 2012 4:00 PM - 6:00 PM BST

Price: 210 USD

Space is limited.
Reserve your remote training seat now at:
https://student.gototraining.com/r/3803636572165653760

If you are mainly interested in Windows memory dump analysis there is another course available:

Accelerated Windows Memory Dump Analysis

- Dmitry Vostokov @ DumpAnalysis.org + TraceAnalysis.org -

Watching a Movie (Debugging Slang, Part 29)

Monday, February 20th, 2012

Watching a Movie - Watching the prodigious output of some debugging commands and scripts in real time.

Examples: Watching the output of !process 0 ff  WinDbg command. Watching the output of user stack trace database and breaking in when it becomes uniform.

- Dmitry Vostokov @ DumpAnalysis.org + TraceAnalysis.org -

The Design of Memory Dump Analysis: 7 Steps of Highly Successful Analysts

Monday, February 20th, 2012

I was recently asked by a group of trainees to outline a simple approach to proceed after opening a memory dump. So I came up with these 7 steps:

1. !analyze -v [-hang]
2. Exception (Bugcheck): stack trace analysis with d* and lmv
3. !locks
4. !runaway f (!running)
5. Dump all (processes and) thread stack traces [with 32-bit] ~*kv (!process 0 ff)
6. Search for signs/patterns of abnormal behavior (exceptions, wait chains, message boxes [, from your custom checklist])
7. Narrow analysis down to a specific thread and dump raw stack data if needed [repeat*]

(commands/options in brackets denote kernel/complete dump variation)
[notes in square brackets denote additional options, such as x64 specifics, your product details, etc.]

What are your steps? I would be interested to hear about alternative analysis steps, techniques, etc.

- Dmitry Vostokov @ DumpAnalysis.org + TraceAnalysis.org -

Postmortem effects of -g

Sunday, February 19th, 2012

One of attendees of accelerated memory dump analysis training pointed me to the possible effects of -g option for AeDebug custom postmortem debugger command line for CDB, NTSD or WinDbg. So I tested that with x64 TestWER tool (should be the same with x86 version) and indeed there are differences.

With -g option with have this stack trace:

AeDebug\Debugger = "C:\Program Files\Debugging Tools for Windows (x64)\windbg.exe" -p %ld -e %ld -g

0:000> kL
Child-SP          RetAddr           Call Site
00000000`0012f210 00000001`40004148 TestWER64!CTestDefaultDebuggerDlg::OnBnClickedButton1+0x7e
00000000`0012f250 00000001`40004388 TestWER64!_AfxDispatchCmdMsg+0xc4
00000000`0012f280 00000001`40003552 TestWER64!CCmdTarget::OnCmdMsg+0x180
00000000`0012f2e0 00000001`4000cc44 TestWER64!CDialog::OnCmdMsg+0x32
00000000`0012f320 00000001`4000d877 TestWER64!CWnd::OnCommand+0xcc
00000000`0012f3b0 00000001`40008c2c TestWER64!CWnd::OnWndMsg+0x5f
00000000`0012f4f0 00000001`4000c272 TestWER64!CWnd::WindowProc+0x38
00000000`0012f530 00000001`4000c32d TestWER64!AfxCallWndProc+0xfe
00000000`0012f5d0 00000000`77519bd1 TestWER64!AfxWndProc+0x59
00000000`0012f610 00000000`77516aa8 USER32!UserCallWinProcCheckWow+0x1ad
00000000`0012f6d0 00000000`77516bad USER32!SendMessageWorker+0x682
00000000`0012f760 000007fe`fccb0bbf USER32!SendMessageW+0x5c
00000000`0012f7b0 000007fe`fccb47df COMCTL32!Button_ReleaseCapture+0x157
00000000`0012f7f0 00000000`77519bd1 COMCTL32!Button_WndProc+0xcbf
00000000`0012f8b0 00000000`775198da USER32!UserCallWinProcCheckWow+0x1ad
00000000`0012f970 00000000`775167c2 USER32!DispatchMessageWorker+0x3b5
00000000`0012f9f0 00000001`400079cc USER32!IsDialogMessageW+0x153
00000000`0012fa80 00000001`40009148 TestWER64!CWnd::IsDialogMessageW+0x38
00000000`0012fab0 00000001`40003513 TestWER64!CWnd::PreTranslateInput+0x28
00000000`0012fae0 00000001`4000b696 TestWER64!CDialog::PreTranslateMessage+0xc3
00000000`0012fb10 00000001`40004c1f TestWER64!CWnd::WalkPreTranslateTree+0x3a
00000000`0012fb40 00000001`40004c7f TestWER64!AfxInternalPreTranslateMessage+0x67
00000000`0012fb70 00000001`40004e26 TestWER64!AfxPreTranslateMessage+0x23
00000000`0012fba0 00000001`40004e6b TestWER64!AfxInternalPumpMessage+0x3a
00000000`0012fbd0 00000001`4000aba6 TestWER64!AfxPumpMessage+0x1b
00000000`0012fc00 00000001`40003e4a TestWER64!CWnd::RunModalLoop+0xea
00000000`0012fc60 00000001`40024da4 TestWER64!CDialog::DoModal+0x1c6
00000000`0012fd10 00000001`40024625 TestWER64!CTestDefaultDebuggerApp::InitInstance+0xc4
00000000`0012fe70 00000001`400153c2 TestWER64!AfxWinMain+0x75
00000000`0012feb0 00000000`77ad652d TestWER64!__tmainCRTStartup+0x186
00000000`0012ff60 00000000`77c0c521 kernel32!BaseThreadInitThunk+0xd
00000000`0012ff90 00000000`00000000 ntdll!RtlUserThreadStart+0x1d

0:000> r
rax=0000000000000000 rbx=0000000000000001 rcx=000000000012fd50
rdx=00000000000003e8 rsi=000000000012fd50 rdi=000000014002daa0
rip=00000001400247ae rsp=000000000012f210 rbp=0000000000000111
r8=0000000000000000  r9=0000000140024730 r10=0000000140024730
r11=000000000012f310 r12=0000000000000000 r13=00000000000003e8
r14=0000000000000110 r15=0000000000000001
iopl=0         nv up ei pl zr na po nc
cs=0033  ss=002b  ds=002b  es=002b  fs=0053  gs=002b             efl=00010244
TestWER64!CTestDefaultDebuggerDlg::OnBnClickedButton1+0x7e:
00000001`400247ae c704250000000000000000 mov dword ptr [0],0 ds:00000000`00000000=????????

Without -g option we also see exception processing (highlighted in red):

AeDebugger\Debugger = "C:\Program Files\Debugging Tools for Windows (x64)\windbg.exe" -p %ld -e %ld

0:000> kL
Child-SP RetAddr Call Site
00000000`0012e368 000007fe`fe301420 ntdll!ZwWaitForMultipleObjects+0xa
00000000`0012e370 00000000`77ae2cf3 KERNELBASE!WaitForMultipleObjectsEx+0xe8
00000000`0012e470 00000000`77b590f5 kernel32!WaitForMultipleObjectsExImplementation+0xb3
00000000`0012e500 00000000`77b59277 kernel32!WerpReportFaultInternal+0×215
00000000`0012e5a0 00000000`77b592cf kernel32!WerpReportFault+0×77
00000000`0012e5d0 00000000`77b594ec kernel32!BasepReportFault+0×1f
00000000`0012e600 00000000`77c743b8 kernel32!UnhandledExceptionFilter+0×1fc
00000000`0012e6e0 00000000`77bf85a8 ntdll! ?? ::FNODOBFM::`string’+0×2365
00000000`0012e710 00000000`77c09d0d ntdll!_C_specific_handler+0×8c
00000000`0012e780 00000000`77bf91af ntdll!RtlpExecuteHandlerForException+0xd
00000000`0012e7b0 00000000`77c31278 ntdll!RtlDispatchException+0×45a
00000000`0012ee90 00000001`400247ae ntdll!KiUserExceptionDispatcher+0×2e

00000000`0012f450 00000001`40004148 TestWER64!CTestDefaultDebuggerDlg::OnBnClickedButton1+0×7e
00000000`0012f490 00000001`40004388 TestWER64!_AfxDispatchCmdMsg+0xc4
00000000`0012f4c0 00000001`40003552 TestWER64!CCmdTarget::OnCmdMsg+0×180
00000000`0012f520 00000001`4000cc44 TestWER64!CDialog::OnCmdMsg+0×32
00000000`0012f560 00000001`4000d877 TestWER64!CWnd::OnCommand+0xcc
00000000`0012f5f0 00000001`40008c2c TestWER64!CWnd::OnWndMsg+0×5f
00000000`0012f730 00000001`4000c272 TestWER64!CWnd::WindowProc+0×38
00000000`0012f770 00000001`4000c32d TestWER64!AfxCallWndProc+0xfe
00000000`0012f810 00000000`77519bd1 TestWER64!AfxWndProc+0×59
00000000`0012f850 00000000`77516aa8 USER32!UserCallWinProcCheckWow+0×1ad
00000000`0012f910 00000000`77516bad USER32!SendMessageWorker+0×682
00000000`0012f9a0 00000000`7751eda7 USER32!SendMessageW+0×5c
00000000`0012f9f0 00000001`400079cc USER32!IsDialogMessageW+0×85f
00000000`0012fa80 00000001`40009148 TestWER64!CWnd::IsDialogMessageW+0×38
00000000`0012fab0 00000001`40003513 TestWER64!CWnd::PreTranslateInput+0×28
00000000`0012fae0 00000001`4000b696 TestWER64!CDialog::PreTranslateMessage+0xc3
00000000`0012fb10 00000001`40004c1f TestWER64!CWnd::WalkPreTranslateTree+0×3a
00000000`0012fb40 00000001`40004c7f TestWER64!AfxInternalPreTranslateMessage+0×67
00000000`0012fb70 00000001`40004e26 TestWER64!AfxPreTranslateMessage+0×23
00000000`0012fba0 00000001`40004e6b TestWER64!AfxInternalPumpMessage+0×3a
00000000`0012fbd0 00000001`4000aba6 TestWER64!AfxPumpMessage+0×1b
00000000`0012fc00 00000001`40003e4a TestWER64!CWnd::RunModalLoop+0xea
00000000`0012fc60 00000001`40024da4 TestWER64!CDialog::DoModal+0×1c6
00000000`0012fd10 00000001`40024625 TestWER64!CTestDefaultDebuggerApp::InitInstance+0xc4
00000000`0012fe70 00000001`400153c2 TestWER64!AfxWinMain+0×75
00000000`0012feb0 00000000`77ad652d TestWER64!__tmainCRTStartup+0×186
00000000`0012ff60 00000000`77c0c521 kernel32!BaseThreadInitThunk+0xd
00000000`0012ff90 00000000`00000000 ntdll!RtlUserThreadStart+0×1d

I now prefer omitting -g option to get stack traces equivalent to manual crash dumps saved by userdump.exe on pre-Vista platforms and Task Manager on later platforms.

- Dmitry Vostokov @ DumpAnalysis.org + TraceAnalysis.org -

Software Anti-Narrative

Friday, February 17th, 2012

In narratology anti-narrative denotes a narrative having sequences of events impossible in reality. In software traces such sequences usually depict abnormal software behaviour. Here are some parallels with corresponding trace analysis patterns:

Fiction                     | Software Trace
================================================
Repeated unrepeatable       | Periodic Error (?)
Denarration (erasure)       | No Activity / Incomplete History
Chronological contradiction | Impossible Trace

Question mark means that possibly another pattern is needed there.

- Dmitry Vostokov @ DumpAnalysis.org + TraceAnalysis.org -

Crash Dump Analysis Patterns (Part 167)

Sunday, February 12th, 2012

Regular Data pattern generalizes ASCII and UNICODE-type (00xx00yy) data found in memory to domain-specific data formats such as bitmaps and vector data. An example of the latter could be the sequence of …0xxx0yyy… (xxx are triplets of hex digits). A typical usage of this pattern is analysis of corrupt dynamic memory blocks (process heap, kernel pool) where continuity of regular data across block boundary points to a possible buffer overwrite.

- Dmitry Vostokov @ DumpAnalysis.org + TraceAnalysis.org -

Crash Dump Analysis Patterns (Part 166)

Sunday, February 12th, 2012

Runtime software exceptions (such as C++ exceptions) can be translated by custom exception handlers into other exceptions by changing exception data. This is different from nested exceptions where another exception is thrown. One example of such possible translation I recently encountered when looking at a raw stack data (!teb -> dps) having signs of hidden exceptions (multiple RaiseException calls) and also CLR execution residue (valid return addresses of clr module). In addition of final invalid handle exception and one hidden access violation there were many exception codes c0000027. Google search pointed to the article about skipped C++ destructors written by S. Senthil Kumar that prompted me to introduce the pattern Translated Exception.

- Dmitry Vostokov @ DumpAnalysis.org + TraceAnalysis.org -