Critical section high contention and wait chains, blocked threads, and periodic error: memory dump and trace analysis pattern cooperation

October 9th, 2009

This is the first case study here that shows an interplay of memory dump analysis (DA) and software trace analysis (TA) patterns, what I call DATA analysis patterns (or DA+TA).  

It was reported that one process was blocking vital server functionality. After the process was restarted the problem went away. A complete memory dump was saved on the next occurrence and it revealed critical section wait chains in that process but no critical section deadlocks:

0: kd> .process /r /p 87f76020
Implicit process is now 87f76020
Loading User Symbols
[...]

0: kd> !cs -l -o -s
-----------------------------------------
DebugInfo          = 0x0016c6d8
Critical section   = 0x0032be30 (+0x32BE30)
LOCKED
LockCount          = 0x34
WaiterWoken        = No
OwningThread       = 0x00001c64
RecursionCount     = 0x1
LockSemaphore      = 0x624
SpinCount          = 0x00000000
OwningThread       = .thread 86396db0
ntdll!RtlpStackTraceDataBase is NULL. Probably the stack traces are not enabled.
[…]

The thread 86396db0 (TID 1c64) that blocked more than 50 threads (0x34) was itself blocked, sleeping for more than 6 seconds:

0: kd> .thread 86396db0
Implicit thread is now 86396db0

0: kd> kL 100
  *** Stack trace for last set context - .thread/.cxr resets it
ChildEBP RetAddr 
ae7f8c98 8083d5b1 nt!KiSwapContext+0x26
ae7f8cc4 8083cf69 nt!KiSwapThread+0x2e5
ae7f8d0c 8092b03f nt!KeDelayExecutionThread+0x2ab
ae7f8d54 80833bef nt!NtDelayExecution+0x84
ae7f8d54 7c82860c nt!KiFastCallEntry+0xfc
1020e8ac 7c826f69 ntdll!KiFastSystemCallRet
1020e8b0 77e41ed5 ntdll!NtDelayExecution+0xc
1020e918 77e424fd kernel32!SleepEx+0x68
1020e928 67739357 kernel32!Sleep+0xf
1020e944 6773c3a2 ComponentA!DB_Driver_Command+0xa7
[…]
1020ec64 67485393 ComponentB!DatabaseSearch+0x34
[…]
1020ffb8 77e6482f msvcrt!_endthreadex+0xa3
1020ffec 00000000 kernel32!BaseThreadStart+0x34

0: kd> kv
  *** Stack trace for last set context - .thread/.cxr resets it
ChildEBP RetAddr  Args to Child             
[...]
1020e918 77e424fd 00001b00 00000000 1020e944 kernel32!SleepEx+0x68 (FPO: [SEH])
1020e928 67739357 00001b00 00000000 0032ac6c kernel32!Sleep+0xf (FPO: [1,0,0])
[…]

0: kd> ? 1b00 / 0n1000
Evaluate expression: 6 = 00000006

The critical section it owns shows a high contention count too:

0: kd> dt -r1 _RTL_CRITICAL_SECTION   0x0032be30
ProcessA!_RTL_CRITICAL_SECTION
   +0x000 DebugInfo        : 0x0016c6d8 _RTL_CRITICAL_SECTION_DEBUG
      +0x000 Type             : 0
      +0x002 CreatorBackTraceIndex : 0
      +0x004 CriticalSection  : 0x0032be30 _RTL_CRITICAL_SECTION
      +0x008 ProcessLocksList : _LIST_ENTRY [ 0x16c708 - 0x16c6b8 ]
      +0x010 EntryCount       : 0
      +0x014 ContentionCount  : 0xac352
      +0x018 Spare            : [2] 0x43005c
   +0x004 LockCount        : -210
   +0x008 RecursionCount   : 1
   +0x00c OwningThread     : 0x00001c64
   +0x010 LockSemaphore    : 0x00000624
   +0x014 SpinCount        : 0
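
The raw LockCount of -210 printed by dt and the LockCount of 0x34 printed by !cs describe the same state. Here is a minimal sketch of the decoding, assuming the bit layout documented for Windows Server 2003 SP1 and later (bit 0 clear means locked, bit 1 clear means a waiter has been woken, the remaining bits are the ones-complement of the number of waiting threads). The function name decode_lock_count is made up for illustration:

```python
def decode_lock_count(lock_count: int) -> dict:
    """Decode a raw _RTL_CRITICAL_SECTION.LockCount value.

    Assumes the Windows Server 2003 SP1 and later layout; older
    systems store a plain counter and need different handling.
    """
    return {
        # Bit 0 is clear when the critical section is locked.
        "locked": (lock_count & 1) == 0,
        # Bit 1 is clear when a waiting thread has been woken.
        "waiter_woken": (lock_count & 2) == 0,
        # Remaining bits hold the ones-complement of the waiter count.
        "waiters": ~(lock_count >> 2),
    }


state = decode_lock_count(-210)
print(state)
```

Under that assumption the decoded waiter count is 52 (0x34) with no waiter woken, which agrees with the !cs output earlier in the post.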

Fortunately, that process had ETW tracing capability, and its software trace, recorded before the complete memory dump was saved, showed the following recurrent periodic error from different threads. It confirms our observation about a possible problem with a database and explains the thread delays we see (> 6 seconds for Sleep):

#    PID  TID  Time         Message
[...]
1972 2780 5992 10:05:11.005 Error: [DB Driver] Not enough space on temp disk
1973 2780 5992 10:05:11.005 Execute DB command sleeps on error (retry 26)
[...]
4513 2780 3292 10:06:02.942 Error: [DB Driver] Not enough space on temp disk
4514 2780 3292 10:06:02.942 Execute DB command sleeps on error (retry 11)
4515 2780 3292 10:06:09.598 Error: [DB Driver] Not enough space on temp disk
4516 2780 3292 10:06:09.598 Execute DB command sleeps on error (retry 12)
[…]
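
The period of the error can be read off the timestamps by hand, but a few lines of script do the same for a longer trace. The following is only a sketch: it uses the six trace lines shown above (the elided ones are left out), assumes the five-column layout of that excerpt, and the helper name retry_gaps is invented for illustration:

```python
from collections import defaultdict
from datetime import datetime

# Only the lines visible in the excerpt; elided lines are not reconstructed.
TRACE = """\
1972 2780 5992 10:05:11.005 Error: [DB Driver] Not enough space on temp disk
1973 2780 5992 10:05:11.005 Execute DB command sleeps on error (retry 26)
4513 2780 3292 10:06:02.942 Error: [DB Driver] Not enough space on temp disk
4514 2780 3292 10:06:02.942 Execute DB command sleeps on error (retry 11)
4515 2780 3292 10:06:09.598 Error: [DB Driver] Not enough space on temp disk
4516 2780 3292 10:06:09.598 Execute DB command sleeps on error (retry 12)
"""


def retry_gaps(trace: str) -> dict:
    """Seconds between consecutive 'sleeps on error' messages, per TID."""
    times = defaultdict(list)
    for line in trace.splitlines():
        # Columns: message number, PID, TID, time, message text.
        _num, _pid, tid, stamp, message = line.split(None, 4)
        if "sleeps on error" in message:
            times[int(tid)].append(datetime.strptime(stamp, "%H:%M:%S.%f"))
    return {
        tid: [round((b - a).total_seconds(), 3) for a, b in zip(ts, ts[1:])]
        for tid, ts in times.items()
    }


gaps = retry_gaps(TRACE)
print(gaps)
```

For TID 3292 the two visible retries are about 6.7 seconds apart, comparable to the Sleep argument seen in the dump.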

- Dmitry Vostokov @ DumpAnalysis.org -

Windows Installer logs

October 7th, 2009

Software installation may fail: this is a fact (like Evolution). Therefore this is the domain of troubleshooting and debugging proper. Such problems are typically analyzed by reading Windows Installer MSI logs, which are examples of software traces. The following book is on my desk now:

The Definitive Guide to Windows Installer


- Dmitry Vostokov @ DumpAnalysis.org -

Dictionary of Debugging: Breakpoint

October 6th, 2009

Breakpoint

A code or a processor state modification to plan for a synchronous diversion to another execution path when some condition is met. Usually implemented by a special processor instruction inserted at the specified address or a special processor register that holds the specified condition to be met. If that condition is met or the special instruction is executed the processor interrupts a computational process (a debuggee) and transfers the execution to another computational process (a debugger) that can inspect the debuggee state.
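
The "special instruction inserted at the specified address" variant can be sketched with a toy machine. This is an illustration under stated assumptions, not how any particular debugger is implemented: the machine, the run and set_breakpoint helpers and the one-shot behaviour are all hypothetical, and only the opcode values (0x90 NOP, 0xF4 HLT, 0xCC INT 3) are borrowed from x86:

```python
NOP, HALT, BREAK = 0x90, 0xF4, 0xCC  # x86 opcode values reused by the toy machine


def set_breakpoint(memory: bytearray, saved: dict, address: int) -> None:
    """Patch the trap byte in, remembering the byte it replaces."""
    saved[address] = memory[address]
    memory[address] = BREAK


def run(memory: bytearray, saved: dict, on_break) -> int:
    """Execute from address 0; return the number of instructions retired."""
    ip = retired = 0
    while True:
        opcode = memory[ip]
        if opcode == BREAK:
            on_break(ip)                 # divert to the "debugger"
            memory[ip] = saved.pop(ip)   # restore the original byte (one-shot)
            continue                     # re-execute the restored instruction
        retired += 1
        if opcode == HALT:
            return retired
        ip += 1


program = bytearray([NOP, NOP, NOP, HALT])
saved, hits = {}, []
set_breakpoint(program, saved, 2)
count = run(program, saved, hits.append)
print(hits, count)
```

The debuggee still retires all four of its instructions; the only difference from the normal path is the diversion to the callback at address 2.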

No breakpoints (normal execution path, the yellow line represents a function call):

2 breakpoints (BP#1 is inside the function and BP#2 is at the entry of another function):

Synonyms:

Antonyms:

Also: virtual memory, software breakpoint, hardware breakpoint, processor breakpoint, data breakpoint, code breakpoint, exception, debugger event.

- Dmitry Vostokov @ DumpAnalysis.org -

Dictionary of Debugging: Virtual Memory

October 5th, 2009

Virtual Memory

A computational process view of its memory. Memory content is combined from the process host memory (may not be physical) and from a storage memory. Virtual memory usually has linearly ordered addresses in the range [0, N] where some regions may be inaccessible:

Synonyms: virtual space

Antonyms:

Also: memory space, memory dump, memory region, user dump, kernel dump, complete dump, kernel space, physical memory, user space, generalized kernel space.

- Dmitry Vostokov @ DumpAnalysis.org -

Sample Chapter from A.NET.Debugging Book

October 2nd, 2009

While doing a Google search today I found the site for Mario Hewardt’s forthcoming book Advanced .NET Debugging:

www.advanceddotnetdebugging.com

with a 74-page sample chapter. Looking forward to reading this book.

- Dmitry Vostokov @ DumpAnalysis.org -

Pictures from Memory Space (Part 4)

September 30th, 2009

More images mined today:

Equalizer

Labyrinth of Code

- Dmitry Vostokov @ DumpAnalysis.org -

Memory Dump and Minidumps

September 30th, 2009

Welcome to Physicalist Art that has its foundation in Physicalism. The first physicalist composition was on display today and I took a picture of it (weather condition was not good):

 

Material: blue agate

It was originally called “Blue in a gate: memory dump and minidumps”. I plan to reinstall it with more elaborate surroundings.

- Dmitry Vostokov @ DumpAnalysis.org -

The Tsar of Memory Dump Analysis

September 30th, 2009

- Dmitry Vostokov @ DumpAnalysis.org -

What color is your instruction?

September 30th, 2009

Opcodism art is not limited to assembly language code and binary installations. It also provides beautiful color illustrations of processor opcodes and instructions. In this post I provide illustrations of NOP, PAUSE and INT 3 instructions generated by Dump2Picture from memory dump images of crashed 1MbNop and 1MbPause processes.

0:000> lmp
start             end                 module name
00000000`77030000 00000000`7715d000   kernel32     
00000000`77230000 00000000`773b6000   ntdll
00000001`40000000 00000001`40144000   1MbNop
000007fe`fd1c0000 000007fe`fd1f5000   apphelp
000007fe`fdaf0000 000007fe`fdc33000   rpcrt4
000007fe`ff400000 000007fe`ff508000   advapi32

8 bit image of 1Mb NOP field fenced by INT 3 wall:

16 bit image of 1Mb NOP field fenced by INT 3 wall:

24 bit image of 1Mb NOP field fenced by INT 3 wall:

32 bit image of 1Mb NOP field fenced by INT 3 wall:

0:000> lmp
start             end                 module name
00000000`77030000 00000000`7715d000   kernel32
00000000`77230000 00000000`773b6000   ntdll
00000001`40000000 00000001`40284000   1MbPause

8 bit image of 1Mb PAUSE field fenced by INT 3 wall:

The same as above but PAUSE / INT 3 transition magnified:

16 bit image of 1Mb PAUSE field fenced by INT 3 wall:

24 bit image of 1Mb PAUSE field fenced by INT 3 wall:

The same as above but PAUSE / INT 3 transition magnified:

32 bit image of 1Mb PAUSE field fenced by INT 3 wall:

- Dmitry Vostokov @ DumpAnalysis.org -

Memory Dumping an Idea

September 29th, 2009

I always carry my blogging notebook with me. A few weeks ago I was photographed while trying to reach it and write down one of the ideas that usually spring to my mind during nature and family walks:

I plan to update The Perfect Gift for a Blogger in Q1, 2010, taking into account my year-long experience with it and various accumulated suggestions. It will also have a short Twitter section.

- Dmitry Vostokov @ DumpAnalysis.org -

Dictionary of Debugging: Physical Memory

September 29th, 2009

Physical Memory

The linear ordering and numbering of physical memory unit implementations, one-to-one and onto the range [0, M] of addresses:

Synonyms: physical space

Antonyms:

Also: memory space, memory dump, memory region, user dump, kernel dump, complete dump, kernel space, virtual memory, user space, generalized kernel space.

- Dmitry Vostokov @ DumpAnalysis.org -

Can Software Tweet?

September 28th, 2009

Every PID has its twitter account. Processes emit short trace messages (STM) and others subscribe to them. This is the technical support of the future, the concept of SoftWeet (*):

www.SoftWeet.com

(*) to weet

to know; to wit (archaic)

- Dmitry Vostokov @ DumpAnalysis.org -

Opcodism: The Art of Opcodes

September 28th, 2009

Fascinated by Kazimir Malevich’s Black Square I created a new art genre with the following two artistic installations:

A Pause before Crash

This is 1Mb of PAUSE instructions without the point of return:

_text SEGMENT

main PROC

DW 100000h DUP (90f3h)

main ENDP

_text ENDS

END

When launched it crashes:

0:000> kL
Child-SP          RetAddr           Call Site
00000000`0012ff58 00000000`7704be3d 1MbPause+0x201011
00000000`0012ff60 00000000`77256a51 kernel32!BaseThreadInitThunk+0xd
00000000`0012ff90 00000000`00000000 ntdll!RtlUserThreadStart+0x1d

0:000> ub rip
1MbPause+0x201002:
00000001`40201002 f390            pause
00000001`40201004 f390            pause
00000001`40201006 f390            pause
00000001`40201008 f390            pause
00000001`4020100a f390            pause
00000001`4020100c f390            pause
00000001`4020100e f390            pause
00000001`40201010 cc              int     3
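
The f390 byte pairs in the disassembly and the size of the field both follow from the DW directive in the source. Here is a small check of that arithmetic; it assumes nothing beyond the listing above and the little-endian byte order of x64:

```python
import struct

PAUSE_WORD = 0x90F3   # operand of "DW 100000h DUP (90f3h)"
COUNT = 0x100000      # number of repetitions in the listing

# x64 stores a 16-bit word low byte first, so 90f3h lands in memory as F3 90.
encoded = struct.pack("<H", PAUSE_WORD)
field_bytes = COUNT * len(encoded)

print(encoded.hex())     # byte order as it appears in memory
print(hex(field_bytes))  # size of the PAUSE field in bytes
```

So the field holds 0x100000 two-byte PAUSE instructions, 0x200000 bytes in total, which is consistent with the crash offset lying just past 0x200000 in the stack trace.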

You can download the source code, PDB and 64-bit EXE from here:

1MbPause.zip

Do Nothing and Crash

This is 1Mb of NOP instructions without the point of return:

_text SEGMENT

main PROC

DB 100000h DUP (90h)

main ENDP

_text ENDS

END

When launched it crashes too:

0:000> kL
Child-SP          RetAddr           Call Site
00000000`0012ff58 00000000`7704be3d 1MbNop+0x101011
00000000`0012ff60 00000000`77256a51 kernel32!BaseThreadInitThunk+0xd
00000000`0012ff90 00000000`00000000 ntdll!RtlUserThreadStart+0x1d

0:000> ub rip
1MbNop+0x101009:
00000001`40101009 90              nop
00000001`4010100a 90              nop
00000001`4010100b 90              nop
00000001`4010100c 90              nop
00000001`4010100d 90              nop
00000001`4010100e 90              nop
00000001`4010100f 90              nop
00000001`40101010 cc              int     3

You can download the source code, PDB and 64-bit EXE from here:

1MbNop.zip

The earliest opcodism binary, which I now call Nothingness and Crash: The Smallest Program, was created on October 25th, 2006.

- Dmitry Vostokov @ DumpAnalysis.org -

Forthcoming Memory Dump Analysis Anthology, Volume 3

September 26th, 2009

This is a revised, edited, cross-referenced and thematically organized volume of selected DumpAnalysis.org blog posts about crash dump analysis and debugging written in October 2008 - June 2009 for software engineers developing and maintaining products on Windows platforms, quality assurance engineers testing software on Windows platforms and technical support and escalation engineers dealing with complex software issues. The third volume features:

- 15 new crash dump analysis patterns
- 29 new pattern interaction case studies
- Trace analysis patterns
- Updated checklist
- Fully cross-referenced with Volume 1 and Volume 2
- New appendixes

Product information:

  • Title: Memory Dump Analysis Anthology, Volume 3
  • Author: Dmitry Vostokov
  • Language: English
  • Product Dimensions: 22.86 x 15.24 cm
  • Paperback: 404 pages
  • Publisher: Opentask (20 December 2009)
  • ISBN-13: 978-1-906717-43-8
  • Hardcover: 404 pages
  • Publisher: Opentask (30 January 2010)
  • ISBN-13: 978-1-906717-44-5

The back cover features a 3D computer memory visualization image.

- Dmitry Vostokov @ DumpAnalysis.org -

Laptop Reviews

September 26th, 2009

DumpAnalysis.org accepts hardware such as laptops for reviewing in relation to their suitability for extreme debugging, virtualization, trace analysis, computer forensics, memory dump analysis, visualization and auralization. If you work for an H/W company like HP, Apple, Dell, Acer, Sony or any other respectable manufacturer please don’t hesitate to forward this post to your management: it could be your company brand or laptop model that the debugging and software technical support community chooses at the next upgrade or for T&D / R&D! H/W reviews will be posted on the main portal page which currently has an audience of more than 200,000 unique visitors per year from more than 30,000 network locations (*).

If your company is interested please don’t hesitate to use this contact form:

http://www.dumpanalysis.org/contact

(*) From Google Analytics report.

- Dmitry Vostokov @ DumpAnalysis.org -

Dictionary of Debugging: Kernel Space

September 25th, 2009

Kernel Space

The linear range of memory addresses, a sub-interval of a memory space, comprising the code and data of an operating system computational process or its kernel part. For example, for a memory space [0, M] the kernel space can have the range of [N, M] addresses, where 0 < N < M, as illustrated in the following diagram, valid for most contemporary operating systems:

The memory contents might not be available for specific memory regions of a kernel space.
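
As a concrete instance of the [N, M] range, here is a minimal sketch that classifies addresses. It assumes the default split of 32-bit Windows (N = 0x80000000, M = 0xFFFFFFFF); other systems and boot options use a different N, and the function name classify and the two sample addresses are only illustrative:

```python
KERNEL_BASE = 0x80000000  # assumed N: default 32-bit Windows split
MEMORY_TOP = 0xFFFFFFFF   # M for a 32-bit memory space


def classify(address: int) -> str:
    """Return 'kernel' for addresses in [N, M], 'user' for those below N."""
    if not 0 <= address <= MEMORY_TOP:
        raise ValueError(f"address {address:#x} is outside [0, M]")
    return "kernel" if address >= KERNEL_BASE else "user"


for sample in (0x7C82860C, 0x8083D5B1):
    print(f"{sample:08x} -> {classify(sample)}")
```

Whether the contents at a given kernel address are actually available is a separate question, as noted above; the range check only says which space the address belongs to.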

Synonyms:

Antonyms:

Also: memory space, memory dump, memory region, user dump, kernel dump, complete dump, physical memory, virtual memory, user space, generalized kernel space.

- Dmitry Vostokov @ DumpAnalysis.org -

New Open Jobs in Citrix EMEA, Ireland

September 24th, 2009

The portal jobs page has been updated:

http://www.dumpanalysis.org/jobs

- Dmitry Vostokov @ DumpAnalysis.org -

DebugWare Patterns (Part 9)

September 24th, 2009

Real troubleshooting is usually done by combining several units of work chosen from a manual. The Checklist pattern summarizes this recurrent practice. A Checklist Coordinator orchestrates troubleshooting unit of work (TUW) components from a TUW Repository according to checklists from a Checklist Repository (in the simple case it can be just one checklist). This is illustrated in the following UML component diagram:
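
In code, the orchestration described above might look like the following sketch. The class name mirrors the pattern, but the repository shapes (plain dictionaries), the TUW names and their results are hypothetical and chosen only to make the example runnable:

```python
from typing import Callable, Dict, List, Tuple

# Hypothetical repository shapes: TUWs are named callables,
# checklists are ordered lists of TUW names.
TuwRepository = Dict[str, Callable[[], str]]
ChecklistRepository = Dict[str, List[str]]


class ChecklistCoordinator:
    """Runs the TUWs named by a checklist, in checklist order."""

    def __init__(self, tuws: TuwRepository, checklists: ChecklistRepository):
        self.tuws = tuws
        self.checklists = checklists

    def run(self, checklist_name: str) -> List[Tuple[str, str]]:
        return [(name, self.tuws[name]())
                for name in self.checklists[checklist_name]]


tuws = {
    "check_disk_space": lambda: "temp disk full",
    "collect_trace": lambda: "trace saved",
    "save_memory_dump": lambda: "dump saved",
}
checklists = {"db_hang": ["check_disk_space", "save_memory_dump"]}

results = ChecklistCoordinator(tuws, checklists).run("db_hang")
print(results)
```

Keeping the checklists as data, separate from the TUW components, is what lets the same units of work be recombined for different troubleshooting scenarios.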

- Dmitry Vostokov @ DumpAnalysis.org -

Dictionary of Debugging: User Space

September 24th, 2009

User Space

The linear range of memory addresses, a sub-interval of a memory space, that computational process instructions can potentially read values from. For example, for a memory space [0, M] the user space can have the range of [0, N] addresses, where N < M, as illustrated in the following diagram, valid for most contemporary operating systems:

The memory contents might not be available for specific memory regions of a user space.

Synonyms:

Antonyms:

Also: memory space, memory dump, memory region, user dump, kernel dump, complete dump, physical memory, virtual memory, kernel space, generalized user space.

- Dmitry Vostokov @ DumpAnalysis.org -

WDPF books gain even more value after being used

September 23rd, 2009

I noticed previously that the WDPF book gains value after being used but didn’t anticipate the scale of the price value leak and spike. Today I noticed that used books gain even more value and now cost more than gold, platinum and iridium (note that the first seller’s price is one cent cheaper, really a super book deal):

- Dmitry Vostokov @ DumpAnalysis.org