Trace Analysis Patterns (Part 1)

After coming back to engineering, I decided to expand the domain of my research and start a new series of posts called Trace Analysis Patterns. In addition to Citrix CDF / Microsoft ETW traces, I plan to cover other variants, drawing on my software engineering background, where I used tracing in products ranging from soft real-time multi-platform systems to static code analysis tools. The connection with memory dump analysis will be covered too, because combining static and dynamic data sometimes leads to interesting observations and helps to troubleshoot and resolve customer problems, especially when not all data can be collected dynamically.

In fact, stack traces and their collections are specializations of more general traces. Another example is historical information in memory dump files, especially when it is timestamped in some way.

In this part I start with an obvious and, to some extent, trivial pattern called Periodic Error. This is an error or a status value that recurs many times throughout a trace:

No     PID  TID   Date      Time         Statement
[...]
664957 1788 22504 4/23/2009 17:59:14.600 MyClass::Initialize: Cannot open connection "Client ID: 310", status=5
[...]
668834 1788 19868 4/23/2009 19:11:52.979 MyClass::Initialize: Cannot open connection "Client ID: 612", status=5
[...]

or 

No     PID  TID   Date      Time         Statement
[...]
202314 1788 19128 4/21/2009 16:03:46.861 HandleDataLevel: Error 12005 Getting Mask
[...]
347653 1788 17812 4/22/2009 13:26:00.735 HandleDataLevel: Error 12005 Getting Mask
[...]

Here single trace entries can be isolated from the trace and studied in detail. 
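
As a rough illustration, recurring entries like the ones above could be pulled out of a text export of a trace with a short script. This is only a sketch in Python under several assumptions: the trace has been saved as plain text in the No / PID / TID / Date / Time / Statement layout shown in the excerpts, quoted fragments such as "Client ID: 310" are the only variable parts of a statement, and the keyword filter in ERROR_HINT_RE is a guess at what marks an error line. The function names are hypothetical, and the sample combines lines from both excerpts purely for demonstration.

```python
import re
from collections import defaultdict
from datetime import datetime

# Assumed column layout: No PID TID Date Time Statement.
LINE_RE = re.compile(
    r"^(?P<no>\d+)\s+(?P<pid>\d+)\s+(?P<tid>\d+)\s+"
    r"(?P<date>\d{1,2}/\d{1,2}/\d{4})\s+(?P<time>\d{2}:\d{2}:\d{2}\.\d{3})\s+"
    r"(?P<statement>.*\S)\s*$"
)
# Quoted fragments are treated as variable data and masked before grouping.
QUOTED_RE = re.compile(r'"[^"]*"')
# Assumption: these words are what distinguishes an error statement.
ERROR_HINT_RE = re.compile(r"error|cannot|fail", re.IGNORECASE)


def parse_line(line):
    """Return (no, tid, timestamp, statement), or None for non-matching lines."""
    m = LINE_RE.match(line)
    if m is None:
        return None
    when = datetime.strptime(m["date"] + " " + m["time"], "%m/%d/%Y %H:%M:%S.%f")
    return int(m["no"]), int(m["tid"]), when, m["statement"]


def find_periodic_errors(lines, min_count=2):
    """Group error-like statements by masked text; keep those seen min_count+ times."""
    groups = defaultdict(list)
    for line in lines:
        parsed = parse_line(line)
        if parsed is None:
            continue  # skips "[...]" markers, headers, and blank lines
        no, tid, when, statement = parsed
        if not ERROR_HINT_RE.search(statement):
            continue
        key = QUOTED_RE.sub('"<*>"', statement)
        groups[key].append((no, tid, when))
    return {key: hits for key, hits in groups.items() if len(hits) >= min_count}


# Lines taken from the two excerpts above, combined only for demonstration.
SAMPLE = [
    'No     PID  TID   Date      Time         Statement',
    '[...]',
    '664957 1788 22504 4/23/2009 17:59:14.600 MyClass::Initialize: Cannot open connection "Client ID: 310", status=5',
    '668834 1788 19868 4/23/2009 19:11:52.979 MyClass::Initialize: Cannot open connection "Client ID: 612", status=5',
    '202314 1788 19128 4/21/2009 16:03:46.861 HandleDataLevel: Error 12005 Getting Mask',
    '347653 1788 17812 4/22/2009 13:26:00.735 HandleDataLevel: Error 12005 Getting Mask',
]

if __name__ == "__main__":
    for statement, hits in find_periodic_errors(SAMPLE).items():
        first, last = hits[0][2], hits[-1][2]
        print(f"{len(hits)}x {statement} (first {first}, last {last})")
```

Masking only quoted text keeps status values and error numbers in the grouping key, so status=5 and Error 12005 remain distinct from other codes. The entry numbers and thread IDs kept for each hit make it easy to go back to the individual entries in the original trace.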

Be aware, though, that some modules might report periodic errors that are false positives, in the sense that they are expected as part of implementation details, for example, when a function returns an error to indicate that a bigger buffer is required or to estimate its size for a subsequent call. This merits its own pattern name, and I will come back to it next time with more examples.
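
To see how such an expected error ends up looking periodic in a trace, here is a small simulation in Python. Everything in it is hypothetical: get_name stands in for any API that reports the required buffer size through an error status, NEED_BIGGER_BUFFER is a made-up status code, and the trace list stands in for the real trace log.

```python
OK = 0
NEED_BIGGER_BUFFER = 1  # made-up status code for this sketch

trace = []  # stands in for the trace log


def get_name(buffer_size):
    """Hypothetical API: returns (status, value), or (status, required size)."""
    name = "example-name"
    if buffer_size < len(name):
        # The callee traces the failure even though the caller expects it.
        trace.append(f"get_name: buffer too small, status={NEED_BIGGER_BUFFER}")
        return NEED_BIGGER_BUFFER, len(name)
    return OK, name


def read_name():
    """Two-call idiom: probe for the size first, then call again with enough room."""
    status, result = get_name(0)  # size probe, expected to report an error
    if status == NEED_BIGGER_BUFFER:
        status, result = get_name(result)  # result held the required size
    return result


if __name__ == "__main__":
    for _ in range(3):
        read_name()
    print("\n".join(trace))
```

Every call to read_name succeeds, yet each one leaves an identical error entry in the trace. Seen only through the trace, this is indistinguishable from a genuine Periodic Error unless the analyst knows the calling convention.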

I also created a page where I will be adding all tracing patterns:

Trace Analysis Patterns   

- Dmitry Vostokov @ TraceAnalysis.org -
