Archive for the ‘Trace Analysis Patterns’ Category

Trace Analysis Patterns (Part 18)

Monday, March 8th, 2010

Sometimes we have a sequence of Activity Regions with increasing values of Statement Current, like depicted here:

The boundaries of regions may be blurry and arbitrarily drawn. Nevertheless, the current is visibly increasing or decreasing, hence the name of this pattern: Trace Acceleration, by analogy with physical acceleration, second-order derivative. We can also metaphorically use here the notion of a partial derivative for trace statement current and acceleration for Threads of Activity and Adjoint Threads of Activity but whether it is useful remains to be seen.

- Dmitry Vostokov @ DumpAnalysis.org + TraceAnalysis.org -

Trace Analysis Patterns (Part 17)

Thursday, March 4th, 2010

This is an extension of Thread of Activity pattern based on the concept of multibraiding and it is called Adjoint Thread of Activity correspondingly. I’m going to illustrate it soon when I publish a synthetic case study involving several software trace analysis patterns.

- Dmitry Vostokov @ DumpAnalysis.org + TraceAnalysis.org -

Trace Analysis Patterns (Part 16)

Saturday, February 13th, 2010

Another useful pattern is called Time Delta. This is a time interval between significant events. For example,

#     Module PID  TID  Time         File    Function Message
1                      10:06:18.994                  (Start)
[...]
6060  dllA   1604 7108 10:06:21.746 fileA.c DllMain  DLL_PROCESS_ATTACH
[…]
24480 dllA   1604 7108 10:06:32.262 fileA.c Exec     Path: C:\Program Files\CompanyA\appB.exe
[…]
24550 dllB   1604 9588 10:06:32.362 fileB.c PostMsg  Event Q
[…]
28230                  10:07:05.170                  (End)

Such deltas are useful in examining delays. In the trace fragment above we are interested in dllA activity from its load until it launches appB.exe. We see that the time delta was only 10 seconds. The message #24550 was the last message from the process ID 1604 and after that we didn’t “hear” from that PID for more than 30 seconds until the tracing was stopped.

- Dmitry Vostokov @ DumpAnalysis.org + TraceAnalysis.org -

Trace Analysis Patterns (Part 15)

Saturday, February 13th, 2010

When looking at software traces and doing either a search for or just scrolling certain messages have our attention immediately. We call them Significant Events and hence the name of this pattern, Significant Event. It could be a recorded exception or an error, a basic fact, a trace message from vocabulary index, or just any trace statement that marks the start of some activity we want to explore in depth, for example, a certain DLL is attached to the process, a coupled process is started or a function is called. The start of a trace and the end of it are trivial significant events and are used in deciding whether the trace is circular, in determining the trace recording interval or its average statement current.

- Dmitry Vostokov @ DumpAnalysis.org + TraceAnalysis.org -

Memory Systems Language (Part 1)

Friday, February 12th, 2010

Computer memory analysis is based on interconnected structures of symbols and we state that there exists a memory language that extends a hierarchy of modeling and implementation languages (both domain-specific and general-purpose):

- Dmitry Vostokov @ DumpAnalysis.org + TraceAnalysis.org -

Forthcoming Memory Dump Analysis Anthology, Volume 4

Thursday, February 11th, 2010

This is a revised, edited, cross-referenced and thematically organized volume of selected DumpAnalysis.org blog posts about crash dump analysis and debugging written in July 2009 - January 2010 for software engineers developing and maintaining products on Windows platforms, quality assurance engineers testing software on Windows platforms and technical support and escalation engineers dealing with complex software issues. The fourth volume features:

- 13 new crash dump analysis patterns
- 13 new pattern interaction case studies
- 10 new trace analysis patterns
- 6 new Debugware patterns and case study
- Workaround patterns
- Updated checklist
- Fully cross-referenced with Volume 1, Volume 2 and Volume 3
- New appendixes

Product information:

  • Title: Memory Dump Analysis Anthology, Volume 4
  • Author: Dmitry Vostokov
  • Language: English
  • Product Dimensions: 22.86 x 15.24
  • Paperback: 410 pages
  • Publisher: Opentask (30 March 2010)
  • ISBN-13: 978-1-906717-86-5
  • Hardcover: 410 pages
  • Publisher: Opentask (30 April 2010)
  • ISBN-13: 978-1-906717-87-2

Back cover features memory space art image: Internal Process Combustion.

- Dmitry Vostokov @ DumpAnalysis.org + TraceAnalysis.org -

Extending Multithreading to Multibraiding (Adjoint Threading)

Sunday, January 17th, 2010

Having considered computational threads as braided strings and after discerning several software trace analysis patterns (just the beginning) we can see formatted and tabulated software trace output in a new light and employ the “fabric of traces” and braid metaphors for an Adjoint Thread concept. This new concept was motivated by reading about Extended Phenotype (*) and extensive analysis of Citrix ETW-based CDF traces using CDFAnalyzer. The term Adjoint was borrowed from mathematics because the concept we discuss below resembles this metaphorical formula: (Thread A, B) = [A, Thread B]. Let me first illustrate adjoint threading using simplified trace tables. Consider this generalized software trace example (date and time column is omitted for visual clarity):

#

Source Dir

PID

TID

File Name

Function

Message

1

\src\subsystemA

2792

5676

file1.cpp

fooA

Message text…

2

\src\subsystemA

2792

5676

file1.cpp

fooA

Message text…

3

\src\subsystemA

2792

5676

file1.cpp

fooA

Message text…

4

\src\lib

2792

5680

file2.cpp

barA

Message text…

5

\src\subsystemA

2792

5680

file1.cpp

fooA

Message text…

6

\src\subsystemA

2792

5676

file1.cpp

fooA

Message text…

7

\src\lib

2792

5680

file2.cpp

fooA

Message text…

8

\src\lib

2792

5680

file2.cpp

fooA

Message text…

9

\src\subsystemB

2792

3912

file3.cpp

barB

Message text…

10

\src\subsystemB

2792

3912

file3.cpp

barB

Message text…

11

\src\subsystemB

2792

3912

file3.cpp

barB

Message text…

12

\src\subsystemB

2792

3912

file3.cpp

barB

Message text…

13

\src\subsystemB

2792

3912

file3.cpp

barB

Message text…

14

\src\subsystemB

2792

3912

file3.cpp

barB

Message text…

15

\src\subsystemB

2792

2992

file4.cpp

fooB

Message text…

16

\src\subsystemB

2792

3008

file4.cpp

fooB

Message text…

We see several threads in a process PID 2792. In CDFAnalyzer we can filter trace messages that belong to any column and if we filter by TID we get a view of any Thread of Activity. However, each thread can “run” through any source directory, file name or function. If a function belongs to a library multiple threads would access it. This source location (can be considered as a subsystem), file or function view of activity is called an Adjoint Thread. For example, if we filter only subsystemA column in the trace above we get this table:

#

Source Dir

PID

TID

File Name

Function

Message

1

\src\subsystemA

2792

5676

file1.cpp

fooA

Message …

2

\src\subsystemA

2792

5676

file1.cpp

fooA

Message …

3

\src\subsystemA

2792

5676

file1.cpp

fooA

Message …

5

\src\subsystemA

2792

5680

file1.cpp

fooA

Message …

6

\src\subsystemA

2792

5676

file1.cpp

fooA

Message …

7005

\src\subsystemA

2792

5664

file1.cpp

fooA

Message …

10198

\src\subsystemA

2792

5664

file1.cpp

fooA

Message …

10364

\src\subsystemA

2792

5664

file1.cpp

fooA

Message …

10417

\src\subsystemA

2792

5664

file1.cpp

fooA

Message …

10420

\src\subsystemA

2792

5676

file1.cpp

fooA

Message …

10422

\src\subsystemA

2792

5680

file1.cpp

fooA

Message …

10587

\src\subsystemA

2792

5664

file1.cpp

fooA

Message …

10767

\src\subsystemA

2792

5680

file1.cpp

fooA

Message …

11126

\src\subsystemA

2792

5668

file1.cpp

fooA

Message …

11131

\src\subsystemA

2792

5680

file1.cpp

fooA

Message …

11398

\src\subsystemA

2792

5676

file1.cpp

fooA

Message …

11501

\src\subsystemA

2792

5668

file1.cpp

fooA

Message …

11507

\src\subsystemA

2792

5668

file1.cpp

fooA

Message …

11509

\src\subsystemA

2792

5664

file1.cpp

fooA

Message …

11513

\src\subsystemA

2792

5680

file1.cpp

fooA

Message …

11524

\src\subsystemA

2792

5668

file1.cpp

fooA

Message …

We can graphically view subsystemA as a braid string that “permeates the fabric of threads”:

We can get many different braids by changing filters, hence multibraiding. Here is another example of a driver source file view initially permeating 2 process contexts and 4 threads:

#

Source Dir

PID

TID

File Name

Function

Message

41

\src\sys\driver

3636

3848

entry.c

DriverEntry

IOCTL …

80

\src\sys\driver

3636

3896

entry.c

DriverEntry

IOCTL …

99

\src\sys\driver

3636

3896

entry.c

DriverEntry

IOCTL …

102

\src\sys\driver

3636

3896

entry.c

DriverEntry

IOCTL …

179

\src\sys\driver

3636

3896

entry.c

DriverEntry

IOCTL …

180

\src\sys\driver

3636

3896

entry.c

DriverEntry

IOCTL …

311

\src\sys\driver

3636

3896

entry.c

DriverEntry

IOCTL …

447

\src\sys\driver

3636

3896

entry.c

DriverEntry

IOCTL …

448

\src\sys\driver

3636

3896

entry.c

DriverEntry

IOCTL …

457

\src\sys\driver

2792

5108

entry.c

DriverEntry

IOCTL …

608

\src\sys\driver

3636

3896

entry.c

DriverEntry

IOCTL …

614

\src\sys\driver

3636

3896

entry.c

DriverEntry

IOCTL …

655

\src\sys\driver

3636

3896

entry.c

DriverEntry

IOCTL …

675

\src\sys\driver

3636

3896

entry.c

DriverEntry

IOCTL …

678

\src\sys\driver

3636

3896

entry.c

DriverEntry

IOCTL …

680

\src\sys\driver

3636

3896

entry.c

DriverEntry

IOCTL …

681

\src\sys\driver

3636

3896

entry.c

DriverEntry

IOCTL …

1145

\src\sys\driver

3636

4960

entry.c

DriverEntry

IOCTL …

1153

\src\sys\driver

3636

4960

entry.c

DriverEntry

IOCTL …

1154

\src\sys\driver

3636

4960

entry.c

DriverEntry

IOCTL …

(*) A bit of digression. Looks like biology keeps giving insights into software, there is even a software phenotype metaphor albeit a bit restricted to code, I just thought that we need also an Extended Software Phenotype.

- Dmitry Vostokov @ DumpAnalysis.org + TraceAnalysis.org -

The Year of Debugging in Retrospection

Thursday, January 14th, 2010

The Year of Debugging, 0×7D9, was a remarkable year for DumpAnalysis.org. Here is the list of achievements to report:

- Software Trace Analysis as a new discipline with its own set of patterns

- Unification of Memory Dump Analysis with Software Trace Analysis (DA+TA)

- New computer memory dump-based art movements: Opcodism and Physicalist Art

- Discovery of 3D computer memory visualization techniques

- Establishing Software Maintenance Institute

- Broadening software fault injection as Software Defect Construction discipline

- Establishing a new profession of a Software Defect Researcher

- Starting ambitious Dictionary of Debugging

- Publishing Windows Debugging: Practical Foundations book

- Publishing the first x86-free Windows debugging book: x64 Windows Debugging: Practical Foundations

- Establishing the new debugging magazine: Debugged! MZ/PE

- Publishing Memory Dump Analysis Anthology, Volume 3

- Cooperation with OpenTask to promote First Fault Software Problem Solving book

- Establishing Debugging Expert(s) Magazine Online

- Creating the first development process for debugging and software troubleshooting tools: RADII

- Publishing the first pattern-driven memory dump analysis troubleshooting methodology as a foundation for software debugging

- Proposal for an International Memory Analysts and Debuggers Day

- Almost completed Windows Debugging Notebook to be published soon

- The founder of DumpAnalysis.org (Dr. DebugLove) becomes a member of Citrix Systems Tweetrix Support Team

Now DumpAnalysis.org focuses on The Year of Dump Analysis, 0×7DA, as a foundation for the forthcoming debugging decade and reveals future plans this weekend.

I’m sure that many other organizations and individuals have no less remarkable accomplishments to report for 2009. I promise to track down and write about some of them in the forthcoming book: 

The Science of Dr. Watson: An Illustrated History of Debugging (ISBN: 978-1906717070)

- Dmitry Vostokov @ DumpAnalysis.org + TraceAnalysis.org -

Trace Analysis Patterns (Part 14)

Tuesday, January 12th, 2010

Inter-Correlation pattern is analogous to the previously described Intra-Correlation pattern but involves several traces from possibly different trace agents recorded (most commonly) at the same time or during an overlapping time interval:

 

Let’s look at a typical example of an application subclassing windows to add additional look and feel element to its GUI or thjat hooks into window messaging. Suppose this application also records important trace points like window parameters before and after subclassing using ETW technology (Event Tracing for Windows). When we run the application in terminal services envirinment all windows (including other processes) are shown with an incorrect dimension. We therefore request the application trace and in addition WindowHistory trace to see how corrdinates of all windows are changed over time. We easily find some Basic Facts in both traces such as window class name or time but it looks like window handle is different. In another set of traces recorded for comparison we have same window handle values, class name is absent from the ETW trace but a process and thread ID for the same window handle are different. We, therefore, don’t see a correlation between these traces and suspect that both traces in 2 sets were recorded in different terminal sessions, for example:

ETW trace:

#      PID   TID   Time          Message
[…] 
46750  5890  6960  10:17:18.825  Subclassing, handle=0×100B8, class=MyWindowClass, […]
[…]

WindowHistory trace:

Handle: 0001006E Class: “MyWindowClass” Title: “”
   Captured at: 10:17:19:637
   Process ID: 19e0
   Thread ID: 16e4

   Parent: 0
   Screen position (l,t,r,b): (-2,896,1282,1026)
   Client rectangle (l,t,r,b): (0,0,1276,122)
   Visible: true
   Window placement command: SW_SHOWNORMAL
   Foreground: false
   HungApp: false
   Minimized: false
   Maximized: false
[…]

- Dmitry Vostokov @ TraceAnalysis.org -

Mystique Back Covers Revealed

Thursday, January 7th, 2010

Some practical engineers asked me how do Debugged! MZ/PE magazine back covers look like from a birds eye view:

 

One engineer even commented that they look better and better (counterclockwise) :-) 

- Dmitry Vostokov @ DumpAnalysis.org -

Trace Analysis Patterns (Part 13)

Thursday, December 31st, 2009

What will you do confronted with a one million trace messages recorded between 10:44:15 and 10:46:55 with an average trace statement current of 7,000 msg/s from dozens of modules and having a one sentence problem description? One solution is to try to search for a specific vocabulary relevant to the problem description, for example, if a problem is an intermittent re-authentication then we might try to search for a word “password” or a similar one drawn from a troubleshooting domain vocabulary. So it is useful to have a Vocabulary Index to search for. Hence, the same name of this pattern. In our trace example, the search for “password” jumps straight to a small activity region of authorization modules starting from the message number #180,010 and the last “password” occurrence is in the message #180,490 that narrows initial analysis region to just 500 messages. Note the similarity here between a book and its index and a trace as a software narrative and its vocabulary index.

- Dmitry Vostokov @ TraceAnalysis.org -

Memory Dump Analysis Anthology, Volume 3

Sunday, December 20th, 2009

“Memory dumps are facts.”

I’m very excited to announce that Volume 3 is available in paperback, hardcover and digital editions:

Memory Dump Analysis Anthology, Volume 3

Table of Contents

In two weeks paperback edition should also appear on Amazon and other bookstores. Amazon hardcover edition is planned to be available in January 2010.

The amount of information was so voluminous that I had to split the originally planned volume into two. Volume 4 should appear by the middle of February together with Color Supplement for Volumes 1-4. 

- Dmitry Vostokov @ DumpAnalysis.org -

Debugged! MZ/PE September issue is out

Wednesday, December 16th, 2009

Finally, after the long delay, the issue is available in print on Amazon and through other sellers:

Debugged! MZ/PE: Software Tracing

Buy from Amazon

- Dmitry Vostokov @ DumpAnalysis.org -

Trace Analysis Patterns (Part 12)

Tuesday, November 17th, 2009

When looking at lengthy traces with thousands and millions of messages (trace statements) we can see regions of activity where statement current (Jm, msg/s) is much higher than in surrounding temporal regions. Hence the name of this pattern, Activity Region. Here is an illustration for a typical ETW/CDF trace where a middle region of activity (Jm2) signifies a system performing some response function like a user session initialization and application launch:

- Dmitry Vostokov @ TraceAnalysis.org -

New Article from Debugged! MZ/PE

Tuesday, November 10th, 2009

The new article has been added to the free online version of Debugged! magazine (www.debuggingexpert.com or www.debuggingexperts.com):

Memory Dump and Trace Analysis: A Unified Pattern Approach

- Dmitry Vostokov @ DumpAnalysis.org -

Trace Analysis Patterns (Part 11)

Saturday, November 7th, 2009

Birds eye view of software traces makes it easier to see their coarse blocked structure:

where further finer structure is discernible and even nested blocks:

Some blocks of output can be seen when scrolling trace viewer output but if a viewer support zooming it is possible to get an overview and jump directly into a Characteristic Message Block, for example, debug messages of repeated attempts to query a database. If a viewer supports message coloring it also helps. Sometimes this technique is useful to ignore bulk messages and start the analysis around block boundaries.

- Dmitry Vostokov @ TraceAnalysis.org -

Software Trace: Birds Eye View

Friday, November 6th, 2009

Here is a fragment of a condensed view of a CDF (ETW-based) trace imported into MS Word:

- Dmitry Vostokov @ TraceAnalysis.org -

Statement current, coupled processes, wait chain, spiking thread, hidden exception, message box and not my version: memory dump and trace analysis pattern cooperation

Monday, October 12th, 2009

It was reported that one important system functionality is not available from time to time but is usually restored to normal operation when one service (ServiceA) is restarted. That service was coupled with ServiceB and their memory dumps were saved and delivered for analysis. Unfortunately, nothing raising a suspicion was found inside. To tackle the problem it was advised to get an ETW trace from the system including modules from ServiceA together with process memory dumps when the problem happens again. The trace revealed the following message with exceptionally high statement current of 72,118 msg/s (and also superdense - no other types of trace statements were found inside):

#      PID    TID    Message
[...]
823296 11300  2484   ServiceB notification failed, error code = 6
[…]

Where the error 6 is invalid handle error:

0:000> !error 6
Error code: (Win32) 0x6 (6) - The handle is invalid.

The thread 2484 (9B4) corresponds to thread #22 in ServiceA and it is blocked waiting for an LPC reply:

  22  Id: 2c24.9b4 Suspend: 1 Teb: 7ffa4000 Unfrozen
ChildEBP RetAddr 
020cfa18 7c827899 ntdll!KiFastSystemCallRet
020cfa1c 77c80a6e ntdll!ZwRequestWaitReplyPort+0xc
020cfa68 77c7fcf0 rpcrt4!LRPC_CCALL::SendReceive+0×230

020cfa74 77c80673 rpcrt4!I_RpcSendReceive+0×24
020cfa88 77ce315a rpcrt4!NdrSendReceive+0×2b
020cfe70 73077ca5 rpcrt4!NdrClientCall2+0×22e
020cfe88 73077c2a ServiceA!RpcNextNotification+0×1c
020cffb8 77e6482f ServiceA!EventWatcherThread+0×107
020cffec 00000000 kernel32!BaseThreadStart+0×34

Suspicious of a loop we confirm that the thread was spiking:

0:000> !runaway f
 User Mode Time
  Thread       Time
  22:9b4       0 days 0:41:27.453
  19:4768      0 days 0:00:00.109
[…] 
Kernel Mode Time
  Thread       Time
  22:9b4       0 days 0:24:27.984
  23:407c      0 days 0:00:00.437
[…]
 Elapsed Time
  Thread       Time
[…]
  22:9b4       0 days 5:26:21.499
[…]

Looking at the raw stack data (using !teb and dds WinDbg commands) we see a hidden processed exception:

020cf6c4  020cf4c0
020cf6c8  020cf6d8
020cf6cc  020cf718
020cf6d0  7c828290 ntdll!_except_handler3
020cf6d4  7c82a120 ntdll!CheckHeapFillPattern+0x54
020cf6d8  020cf6e8
020cf6dc  00140000
020cf6e0  7c82a144 ntdll!RtlpAllocateFromHeapLookaside+0x13
020cf6e4  00140868
020cf6e8  020cf910
020cf6ec  7c82a0d8 ntdll!RtlAllocateHeap+0x1dd
020cf6f0  7c82a11c ntdll!RtlAllocateHeap+0xee7
020cf6f4  73074548
020cf6f8  00000000
020cf6fc  00000000
020cf700  00000000
020cf704  00000000
020cf708  00218ef0
020cf70c  020cf728
020cf710  7c82a791 ntdll!RtlpCoalesceFreeBlocks+0x383
020cf714  020d0000
020cf718  00218ef0
020cf71c  020cf9fc
020cf720  7c82865c ntdll!RtlRaiseException+0×3d
020cf724  020ce000
020cf728  020cf72c
020cf72c  00010007
020cf730  020cf810
020cf734  7c829f5d ntdll!RtlFreeHeap+0×20e
020cf738  001407d8
020cf73c  7c829f79 ntdll!RtlFreeHeap+0×70f
020cf740  00000000

After some time another pair of coupled processes was collected where ServiceA(2) was hanging on an LPC request again but this time ServiceB(2) had one thread blocked by a GUI property sheet processing code (a variant of Message Box pattern):

0:015> kL 100
ChildEBP RetAddr 
017fb9f0 7c827d29 ntdll!KiFastSystemCallRet
017fb9f4 77e61d1e ntdll!ZwWaitForSingleObject+0xc
017fba64 77e61c8d kernel32!WaitForSingleObjectEx+0xac
017fba78 6dfcdac3 kernel32!WaitForSingleObject+0x12
[...]
017fbdac 730801c5 compstui!CommonPropertySheetUIW+0×17
017fbdf4 73080f5d ServiceB!CommonPropertySheetUI+0×43

WARNING: Stack unwind information not available. Following frames may be wrong.
017fc27c 5c3ae4e6 ComponentA!DllGetClassObject+0xbf4e
[…]
017ff8f8 77ce33e1 rpcrt4!Invoke+0×30
017ffcf8 77ce35c4 rpcrt4!NdrStubCall2+0×299
017ffd14 77c7ff7a rpcrt4!NdrServerCall2+0×19
017ffd48 77c8042d rpcrt4!DispatchToStubInCNoAvrf+0×38
017ffd9c 77c80353 rpcrt4!RPC_INTERFACE::DispatchToStubWorker+0×11f
017ffdc0 77c811dc rpcrt4!RPC_INTERFACE::DispatchToStub+0xa3
017ffdfc 77c812f0 rpcrt4!LRPC_SCALL::DealWithRequestMessage+0×42c
017ffe20 77c88678 rpcrt4!LRPC_ADDRESS::DealWithLRPCRequest+0×127
017fff84 77c88792 rpcrt4!LRPC_ADDRESS::ReceiveLotsaCalls+0×430
017fff8c 77c8872d rpcrt4!RecvLotsaCallsWrapper+0xd
017fffac 77c7b110 rpcrt4!BaseCachedThreadRoutine+0×9d
017fffb8 77e6482f rpcrt4!ThreadStartRoutine+0×1b
017fffec 00000000 kernel32!BaseThreadStart+0×34

ComponentA was also found loaded in ServiceB(1) user dump and in the ServiceB memory dump from the initial coupled pair where nothing was found before. The timestamp of that component was old enough (lmv command) to warrant more attention to it and contact its ISV.

- Dmitry Vostokov @ DumpAnalysis.org -

Critical section high contention and wait chains, blocked threads, and periodic error: memory dump and trace analysis pattern cooperation

Friday, October 9th, 2009

This is the first case study here that shows an interplay of memory dump analysis (DA) and software trace analysis (TA) patterns, what I call DATA analysis patterns (or DA+TA).  

It was reported that one process was blocking vital server functionality. After the process restart the problem was gone away. A complete memory dump was saved on the next occurrence and it revealed critical section wait chains in that process but no critical section deadlocks:

0: kd> .process /r /p 87f76020
Implicit process is now 87f76020
Loading User Symbols
[...]

0: kd> !cs -l -o -s
-----------------------------------------
DebugInfo          = 0x0016c6d8
Critical section   = 0×0032be30 (+0×32BE30)
LOCKED
LockCount          = 0×34
WaiterWoken        = No
OwningThread       = 0×00001c64
RecursionCount     = 0×1
LockSemaphore      = 0×624
SpinCount          = 0×00000000
OwningThread       = .thread 86396db0
ntdll!RtlpStackTraceDataBase is NULL. Probably the stack traces are not enabled.
[…]

The thread 86396db0 (TID 1c64) that blocked more than 50 threads (0×34) was blocked itself sleeping for more than 6 seconds:

0: kd> .thread 86396db0
Implicit thread is now 86396db0

0: kd> kL 100
  *** Stack trace for last set context - .thread/.cxr resets it
ChildEBP RetAddr 
ae7f8c98 8083d5b1 nt!KiSwapContext+0x26
ae7f8cc4 8083cf69 nt!KiSwapThread+0x2e5
ae7f8d0c 8092b03f nt!KeDelayExecutionThread+0x2ab
ae7f8d54 80833bef nt!NtDelayExecution+0x84
ae7f8d54 7c82860c nt!KiFastCallEntry+0xfc
1020e8ac 7c826f69 ntdll!KiFastSystemCallRet
1020e8b0 77e41ed5 ntdll!NtDelayExecution+0xc
1020e918 77e424fd kernel32!SleepEx+0x68
1020e928 67739357 kernel32!Sleep+0xf
1020e944 6773c3a2 ComponentA!DB_Driver_Command+0xa7
[…]
1020ec64 67485393 ComponentB!DatabaseSearch+0×34
[…]
1020ffb8 77e6482f msvcrt!_endthreadex+0xa3
1020ffec 00000000 kernel32!BaseThreadStart+0×34

0: kd> kv
  *** Stack trace for last set context - .thread/.cxr resets it
ChildEBP RetAddr  Args to Child             
[...]
1020e918 77e424fd 00001b00 00000000 1020e944 kernel32!SleepEx+0x68 (FPO: [SEH])
1020e928 67739357 00001b00 00000000 0032ac6c kernel32!Sleep+0xf (FPO: [1,0,0])
[…]

0: kd> ? 1b00 / 0n1000
Evaluate expression: 6 = 00000006

Critical section it owns shows high contention count too:

0: kd> dt -r1 _RTL_CRITICAL_SECTION   0x0032be30
ProcessA!_RTL_CRITICAL_SECTION
   +0x000 DebugInfo        : 0x0016c6d8 _RTL_CRITICAL_SECTION_DEBUG
      +0x000 Type             : 0
      +0x002 CreatorBackTraceIndex : 0
      +0x004 CriticalSection  : 0x0032be30 _RTL_CRITICAL_SECTION
      +0x008 ProcessLocksList : _LIST_ENTRY [ 0x16c708 - 0x16c6b8 ]
      +0x010 EntryCount       : 0
      +0×014 ContentionCount  : 0xac352
      +0×018 Spare            : [2] 0×43005c
   +0×004 LockCount        : -210
   +0×008 RecursionCount   : 1
   +0×00c OwningThread     : 0×00001c64
   +0×010 LockSemaphore    : 0×00000624
   +0×014 SpinCount        : 0

Fortunately, that process had ETW tracing capability and its software trace recorded before the complete memory dump was saved the following recurrent periodic errorfrom different threads that confirms our observation about the possible problem with a database and explains thread delays we see (> 6 seconds for Sleep):

#    PID  TID  Time         Message
[...]
1972 2780 5992 10:05:11.005 Error: [DB Driver] Not enough space on temp disk
1973 2780 5992 10:05:11.005 Execute DB command sleeps on error (retry 26)
[...]
4513 2780 3292 10:06:02.942 Error: [DB Driver] Not enough space on temp disk
4514 2780 3292 10:06:02.942 Execute DB command sleeps on error (retry 11)
4515 2780 3292 10:06:09.598 Error: [DB Driver] Not enough space on temp disk
4516 2780 3292 10:06:09.598 Execute DB command sleeps on error (retry 12)
[…]

- Dmitry Vostokov @ DumpAnalysis.org -

Forthcoming Memory Dump Analysis Anthology, Volume 3

Saturday, September 26th, 2009

This is a revised, edited, cross-referenced and thematically organized volume of selected DumpAnalysis.org blog posts about crash dump analysis and debugging written in October 2008 - June 2009 for software engineers developing and maintaining products on Windows platforms, quality assurance engineers testing software on Windows platforms and technical support and escalation engineers dealing with complex software issues. The third volume features:

- 15 new crash dump analysis patterns
- 29 new pattern interaction case studies
- Trace analysis patterns
- Updated checklist
- Fully cross-referenced with Volume 1 and Volume 2
- New appendixes

Product information:

  • Title: Memory Dump Analysis Anthology, Volume 3
  • Author: Dmitry Vostokov
  • Language: English
  • Product Dimensions: 22.86 x 15.24
  • Paperback: 404 pages
  • Publisher: Opentask (20 December 2009)
  • ISBN-13: 978-1-906717-43-8
  • Hardcover: 404 pages
  • Publisher: Opentask (30 January 2010)
  • ISBN-13: 978-1-906717-44-5

Back cover features 3D computer memory visualization image.

- Dmitry Vostokov @ DumpAnalysis.org -