Trace Analysis Patterns (Part 1)
CARE: Crash Analysis Report Environment
DATA (Dump Analysis + Trace Analysis) Facebook group
Please join the community of memory (dump) and trace analysis engineers. This group promotes scientific methods and memory dump-based worldview.
Twitter @ DumpAnalysis You can now follow portal and blog news at DumpAnalysis on Twitter
LinkedIn Group Dr. Watson Enthusiasts All about Dr. Watson errors and more. Get news, excerpts and progress reports about the forthcoming book The Science of Dr. Watson: An Illustrated History of Debugging (ISBN 978-1906717070)
2010 (0x7DA) - The Year of Dump Analysis 2011 (0x7DB) - 2020 (0x7E4) The Debugging Decade
After coming back to engineering I decided to expand the domain of my research and start the new series of posts called Trace Analysis Patterns. In addition to Citrix CDF / Microsoft ETW traces I plan to cover other variants based on my extensive software engineering background in the past where I used tracing in software products ranging from soft multi-platform real-time systems to static code analysis tools. Connection with memory dump analysis will be covered too because sometimes the combination of static and dynamic data leads to interesting observations and helps to troubleshoot and resolve customer problems especially when not all data can be collected dynamically.
In fact, stack traces and their collections are specializations of the more general traces. Another example is historical information in memory dump files especially when it is somehow timestamped.
In this part I start with the obvious and to some extent the trivial pattern called Periodic Error. This is an error or a status value that is observed periodically many times:
No PID TID Date Time Statement
[...]
664957 1788 22504 4/23/2009 17:59:14.600 MyClass::Initialize: Cannot open connection “Client ID: 310″, status=5
[…]
668834 1788 19868 4/23/2009 19:11:52.979 MyClass::Initialize: Cannot open connection “Client ID: 612″, status=5
[…]
or
No PID TID Date Time Statement
[...]
202314 1788 19128 4/21/2009 16:03:46.861 HandleDataLevel: Error 12005 Getting Mask
[…]
347653 1788 17812 4/22/2009 13:26:00.735 HandleDataLevel: Error 12005 Getting Mask
[…]
Here single trace entries can be isolated from the trace and studied in detail.
Be aware though that some modules might report periodic errors that are false positive, in the sense, that they are expected as a part of implementation details, for example, when a function returns an error to indicate that bigger buffer is required or to estimate its size for a subsequent call. It merits its own pattern name and I come to it next time with more examples.
I also created a page where I’ll will be adding all tracing patterns:
- Dmitry Vostokov @ TraceAnalysis.org -
_1125.png)
Coming Soon:
Debugging Notebook: Essential Concepts, WinDbg Commands and Tools
Crash Dump Analysis for System Administrators and Support Engineers
New Magazines:
Debugged! MZ/PE: MagaZine for/from Practicing Engineers
New Books:
Memory Dump Analysis Anthology, Volume 3
First Fault Software Problem Solving: A Guide for Engineers, Managers and Users
x64 Windows Debugging: Practical Foundations
Also available:
Windows Debugging: Practical Foundations
DLL List Landscape: The Art from Computer Memory Space
Dumps, Bugs and Debugging Forensics: The Adventures of Dr. Debugalov
WinDbg: A Reference Poster and Learning Cards
Memory Dump Analysis Anthology, Volume 2
Memory Dump Analysis Anthology, Volume 1
New Children's Book:
July 22nd, 2009 at 1:55 pm
[…] we have several disjoint Periodic Errors and possible false positives. We wonder where should we start or assign relative priorities for […]
August 4th, 2009 at 3:57 pm
[…] Sometimes there are reported delays in application startup, session initialization, long response times and simply the absence of response. All these problems can be reflected in software traces showing sudden gaps in threads of activity. I call this pattern Discontinuity per analogy with continuous and discontinuous functions in mathematics. Here is an example I came upon recently and it will be covered fully in the forthcoming pattern cooperation case study spanning both memory dump and trace analysis. One process was reported to have a long period of CPU spiking calculation and a CDF trace was recorded. When we open it we would see this periodic error: […]
October 9th, 2009 at 2:32 pm
[…] and its software trace recorded before the complete memory dump was saved the following recurrent periodic error from different threads that confirms our observation about the possible problem with a database and […]