Archive for the ‘Log Analysis’ Category
Saturday, March 12th, 2016
Data Selector is a variant of Inter-Correlation trace analysis pattern where we use data found in one trace to select Message Set or Adjoint Thread of Activity in another trace. This analysis activity is depicted in the following picture where we have a client log and corresponding server log. In the server log we have log entries for many client sessions. To select messages corresponding to our client session we use some data attribute in the client trace, for example, the user name, and Linked Messages analysis pattern to find one of the messages in the server log that contains the same user name. Then we find out which user session it belongs to and form its Adjoint Thread:

This pattern is different from Identification Messages where we don’t even know the object that emitted trace messages. In Data Selector case we know in principle what kind of messages we are looking for. We just need to select among many alternatives.
- Dmitry Vostokov @ DumpAnalysis.org + TraceAnalysis.org -
Posted in CDF Analysis Tips and Tricks, Log Analysis, Software Trace Analysis, Trace Analysis Patterns | No Comments »
Saturday, January 30th, 2016
Often, we need to identify the source of messages based on problem object or subsystem description (what question) before we proceed answering where question (where in the trace we can find messages related to the problem). Even when we know where are messages there can be many sources to select from (if we don’t know the where question we can use Indirect Message analysis pattern). To answer what question we propose Identification Messages analysis pattern. Basic Fact problem description may include properties and behavioural description of the problem object or subsystem. Based on that we can map them to the log messages that such an object can produce:

These messages may not be Error Messages or some other type of messages reflecting abnormal behavior. These messages are only used to identify the software object, module or subsystem.
For example, in one case there were problems with the custom status bar. However, the window handle for it or its parent wasn’t specified in the problem report. In the log file we had a lot of messages describing GUI behavior of many windows. To find out the status bar we thought that it should have small height but long width. Indeed we found one such child window. In addition, for this window the log file contained many messages related to frequent window text changes, possibly reflecting the status bar updates. Having identified the window handle, we proceeded to the analysis of another log with thousands of window messages. Because of the known window handle we were able to select only messages pertaining to our problem status bar.
- Dmitry Vostokov @ DumpAnalysis.org + TraceAnalysis.org -
Posted in CDF Analysis Tips and Tricks, Log Analysis, Software Trace Analysis, Software Trace Reading, Trace Analysis Patterns | No Comments »
Saturday, January 16th, 2016
When we have very large traces and Basic Facts containing some data values such as a user name, device name, or registry key value we may use Data Interval analysis pattern to select the trace fragment for the initial log analysis. The first and the last trace messages containing selected data for the closed Data Interval. Depending on the trace size and other considerations we can also choose open Data Intervals. It is illustrated in the following diagram where we use Analysis interval notation borrowed mathematics:

Interval boundary messages may also be used as Trace Mask for another trace.
- Dmitry Vostokov @ DumpAnalysis.org + TraceAnalysis.org -
Posted in CDF Analysis Tips and Tricks, Log Analysis, Software Trace Analysis, Trace Analysis Patterns | No Comments »
Saturday, November 14th, 2015
Multiple traces and logs are usually collected for diagnosing distributed systems. Different tools and tracing settings (circular, sequential, file size limit) may be used, systems may be unsynchronized, and individual system tracing may be started at different times due to manual tracing setup and switching between systems. There may be Blackouts, Circular, and Truncated traces. When we analyze such a trace set (Inter-Correlation) we usually select one trace or log that is used as Calibrating Trace. It is used for measuring all other traces against Basic Facts such as start and end tracing times, and the time of the problem. One such scenario is illustrated in the following diagram:

- Dmitry Vostokov @ DumpAnalysis.org + TraceAnalysis.org -
Posted in CDF Analysis Tips and Tricks, Log Analysis, Software Trace Analysis, Trace Analysis Patterns | No Comments »
Tuesday, November 10th, 2015
We would like to introduce Trace Dimension pattern to address the emerging complexity of logs from distributed environments. By a distributed environment we mean not only a collection of multiple computers (for example, client-server) but also terminal services environments with several different user sessions on one computer (OS) and even multiple user processes (IPC) in some cases. If some task can be performed on one machine or session or inside one process then splitting it across several computers, sessions, or processes usually results in logs with added Distributed Infrastructure Messages (DIMs) such as from proxies, channels and their management:

So, one of the trace simplification strategies is to request the reproduction and its tracing in a simplified environment (such as inside one terminal services session) to eliminate DIMs. In one case, we analyzed a trace for a clipboard paste problem in Windows terminal services environment. After a clipboard copy different data was pasted to different applications. The same behavior was observed for application processes running inside different sessions and processes running inside one session. However, the log was collected for the more complex multiple session scenario with many False Positive Errors which completely disappeared from one session scenario log.
DIM abbreviation played a role in naming this pattern. Additionally, if sessions can be considered a second dimension, then separate VMs can be considered as a third dimension, separate clouds as a 4th dimension, and so on.
- Dmitry Vostokov @ DumpAnalysis.org + TraceAnalysis.org -
Posted in CDF Analysis Tips and Tricks, Log Analysis, Software Trace Analysis, Trace Analysis Patterns | No Comments »
Thursday, October 22nd, 2015
Typical software trace may contain several Error Messages with different error codes and different exception names with Exception Stack Traces. Searching for individual codes or exceptions in problem databases may show many matches. Searching for all of them may show nothing. Therefore, we can construct the set of all subsets from the set of codes and exceptions (a power set) and perform analytic reasoning (and a search) based on certain subsets based on the problem description, Trace Viewpoints such as Use Case Trails, Motifs, Focus of Tracing, Foreground Components, (Adjoint) Threads of Activity, and simply some Activity Regions and Message Sets.
The following picture illustrates Error Powerset analysis pattern with a trace that has 4 error messages where 2 messages have the same error code.

- Dmitry Vostokov @ DumpAnalysis.org + TraceAnalysis.org -
Posted in Log Analysis, Software Trace Analysis, Trace Analysis Patterns | No Comments »
Wednesday, September 30th, 2015
Sometimes we have Periodic Message Blocks of a few adjacent messages, for example, when flags are translated into separate messages per bit. Then we may have a pattern of Sequence Repeat Anomaly when one of several message blocks have missing or added messages compared to the more numerous number of expected identical message blocks. Then Missing Message Message Context may be explored further. The following diagram illustrates the pattern:

The name of the pattern comes from the notion of repeated DNA sequences.
- Dmitry Vostokov @ DumpAnalysis.org + TraceAnalysis.org -
Posted in CDF Analysis Tips and Tricks, Log Analysis, Software Trace Analysis, Software Trace Analysis and Genetics, Software Trace Analysis and Genomics, Software Trace Reading, Trace Analysis Patterns | No Comments »
Saturday, September 12th, 2015
Recently we analyzed a few logs which ended with a specialized Activity Region from a subsystem that sets operational parameters. The problem description stated that the system became unresponsive after changing parameters in a certain sequence. Usually, for that system, when we stop logging (even after setting parameters) we end up with messages from some Background Components since some time passes between the end of setting parameters activity and the time the operator sends stop logging request:

However, in the problem case we see message flow stops right in the middle of parameter setting activity:

So we advised to check for any crashes or hangs, and, indeed, it was found that the system was actually experiencing system crashes, and we got memory dumps for analysis where we found Top Module from a 3rd-party vendor related to parameter setting activity.
Please also note an analogy here between normal thread stack traces from threads that are waiting most of the time and Spiking Thread stack trace caught up in the middle of some function.
We call this pattern Ruptured Trace after a ruptured computation.
Note, that if it is possible to restart the system and resume the same tracing we may get an instance of Blackout analysis pattern.
- Dmitry Vostokov @ DumpAnalysis.org + TraceAnalysis.org -
Posted in Core Dump Analysis, Crash Dump Analysis, Crash Dump Patterns, Log Analysis, Software Trace Analysis, Software Trace Analysis Tips and Tricks, Trace Analysis Patterns | No Comments »
Thursday, July 9th, 2015
Sometimes a few Error Messages or Periodic Errors with low Statement Density for specific Activity Regions or Adjoint Threads of Activity (for specific component, file or function) may constitute Activity Disruption. If the particular functionality was no longer available at the logging time then its unavailability may not be explained by such disruptions, and such messages may be considered False Positive Errors in relation to the reported problem:

But, if we have Periodic Message Blocks containing only Periodic Errors, Activity Region or Adjoint Thread Discontinuity, or simply No Activity, then we may have the complete cease of activity that may correlate with the unavailable functionality:

- Dmitry Vostokov @ DumpAnalysis.org + TraceAnalysis.org -
Posted in Log Analysis, Software Trace Analysis, Trace Analysis Patterns | No Comments »
Tuesday, June 30th, 2015
General traces and logs may have Message Space regions “surrounded” by the so-called Interspace. Such Interspace regions may link individual Message Space regions like in this diagram generalizing WinDbg !process 0 3f command output:

- Dmitry Vostokov @ DumpAnalysis.org + TraceAnalysis.org -
Posted in Debugging, Log Analysis, Software Trace Analysis, Software Trace Reading, Software Trace Visualization, Trace Analysis Patterns | No Comments »
Tuesday, March 3rd, 2015
When we do tracing and logging much of computational activity is not visible. For live tracing and debugging this can be alleviated by adding Watch Threads. These are selected memory locations that may or may not be formatted according to specific data structures and are inspected at each main trace message occurrence or after specific intervals or events:

This analysis pattern is different from State Dump which is about intrinsic tracing where the developer of logging statements already incorporated variable watch in source code. Watch Threads are completely independent from original tracing and may be added independently. Counter Value is the simplest example of Watch Thread if done externally because the former usually doesn’t require source code and often means some OS or module variable independent of product internals. Watch Thread is also similar to Data Flow pattern where specific data we are interested in is a part of every trace message.
- Dmitry Vostokov @ DumpAnalysis.org + TraceAnalysis.org -
Posted in Debugging, Log Analysis, Performance Monitoring, Software Narratology, Software Trace Analysis, Trace Analysis Patterns | No Comments »
Sunday, February 22nd, 2015
Sometimes we have Basic Facts in a problem description but can’t find messages corresponding to them in a trace or log file but we are sure the tracing (logging) was done correctly. This may be because we have Sparse Trace, or we are not familiar well with product or system tracing messages (such as with Implementation Discourse).

In such a case we for search for Indirect Message of a possible cause:

Having found such a message we may hypothesize that Missing Message should have located nearby (this is based on semantics of both messages), and we then explore corresponding Message Context:

The same analysis strategy is possible for missing causal messages. Here we search for effect or side effect messages:

Having found them we proceed with further analysis:

- Dmitry Vostokov @ DumpAnalysis.org + TraceAnalysis.org -
Posted in CDF Analysis Tips and Tricks, Log Analysis, Software Trace Analysis, Software Trace Analysis Tips and Tricks, Trace Analysis Patterns | No Comments »
Saturday, January 31st, 2015
Sometimes we need memory reference information not available in software traces and logs, for example, to see pointer dereferences, to follow pointers and linked structures. In such cases memory dumps saved during logging sessions may help. In case of process memory dumps we can even have several Step Dumps. Complete and kernel memory dumps may be forced after saving a log file. We call such pattern Adjoint Space:

Then we can analyze logs and memory dumps together, for example, to follow pointer data further in memory space:

There is also a reverse situation when we use logs to see past data changes before memory snapshot time (Paratext memory analysis pattern):

- Dmitry Vostokov @ DumpAnalysis.org + TraceAnalysis.org -
Posted in CDF Analysis Tips and Tricks, Core Dump Analysis, Crash Dump Analysis, Crash Dump Patterns, Debugging, Log Analysis, Software Trace Analysis, Software Trace Analysis Tips and Tricks, Software Trace Reading, Trace Analysis Patterns | No Comments »
Saturday, January 24th, 2015
Sometimes specific parts of simultaneous Use Case Trails, blocks of Significant Events or Message Sets in general may overlap. This may point to possible synchronization problems such as race conditions (prognostics) or be visible root causes of them if such problems are reported (diagnostics). We call this pattern Activity Overlap:

For example, a first request may start a new session and we expect the second request to be processed by the same already established session:

However, users report the second session started upon the second request. If we filter execution log by session id and do Intra-Correlational analysis we find out that session initialization prologues are overlapped. The new session started because the first session initialization was not completed:

- Dmitry Vostokov @ DumpAnalysis.org + TraceAnalysis.org -
Posted in CDF Analysis Tips and Tricks, Debugging, Log Analysis, Software Prognostics, Software Trace Analysis, Trace Analysis Patterns | No Comments »
Wednesday, January 7th, 2015
Some Discontinuities may be Periodic as Silent Messages. If such discontinuities belong to the same Thread of Activity and their Time Deltas are constant we may see Timeout pattern. When timeouts are followed by Error Message we can identify them by back tracing. Timeouts are different from Blackouts where the latter are usually Singleton Events and have large time deltas.
Here is a generalized graphical case study. An error message was identified based on incident Basic Facts:

We filtered the trace for error message TID and found 3 timeouts 30 minutes each:


- Dmitry Vostokov @ DumpAnalysis.org + TraceAnalysis.org -
Posted in Debugging, Log Analysis, Software Trace Analysis, Trace Analysis Patterns | No Comments »
Saturday, November 29th, 2014
Some traces and logs may have Periodic Message Blocks with very similar message structure and content (mostly Message Invariants). The only significant difference between them is some unique data. We call such pattern Visitor Trace by analogy with Visitor design pattern where tracing code “visits” each object data or data part to log its content or status.

- Dmitry Vostokov @ DumpAnalysis.org + TraceAnalysis.org -
Posted in Log Analysis, Software Engineering, Software Trace Analysis, Software Trace Reading, Trace Analysis Patterns, Tracing and Design Patterns | No Comments »
Wednesday, November 19th, 2014
There are events that by design or system configuration should be seen in a log only once or not seen at all if code responsible for them was executed before tracing session. For example, the launch of certain services during system initialization shouldn’t be seen again when we trace system activity long after that. It can also be just messages from singleton objects in the application log. The appearance of extra Singleton Events may point to design violations or some abnormal system events such as process restart. The latter may Intra-Correlate with the start of the fault handling process such as WerFault.exe in Windows Process Monitor logs (Guest Component).

- Dmitry Vostokov @ DumpAnalysis.org + TraceAnalysis.org -
Posted in Log Analysis, Software Engineering, Software Trace Analysis, Software Trace Analysis Tips and Tricks, Software Trace Reading, Trace Analysis Patterns, Tracing and Design Patterns | No Comments »
Sunday, November 9th, 2014
Use cases are implemented in various components such as subsystems, processes, modules, and source code files. Most of the time with good tracing implementation we are able to see Use Case Trails: log messages corresponding to use case scenarios. For simple systems one log may fully correspond to just one use case, but for complex systems, especially distributed client-server ones, there may be several use case instances present simultaneously in one log. One way to disentangle them in the absence of UCID (Use Case ID) or some other grouping tag is to use Event Sequence Phase, the subject of the next pattern.

Master Traces may also correspond to use cases but they should ideally correspond to only one use case instance.
- Dmitry Vostokov @ DumpAnalysis.org + TraceAnalysis.org -
Posted in CDF Analysis Tips and Tricks, Debugging, Log Analysis, Software Architecture, Software Trace Analysis, Software Trace Deconstruction, Software Trace Reading, Trace Analysis Patterns | No Comments »
Thursday, June 26th, 2014
Usually when we analyse traces and find an Anchor or Error Message we do backtracking using a combination of Data Flow and Message Sets and selecting appropriate log messages to form Back Trace leading to a possible root cause message:

This pattern is different from Error Thread pattern which just backtracks messages having the same TID (or in general ATID). It is also different from Exception Stack Trace pattern which is just a serialized stack trace from memory snapshot.
- Dmitry Vostokov @ DumpAnalysis.org + TraceAnalysis.org -
Posted in CDF Analysis Tips and Tricks, Debugging, Log Analysis, Software Trace Analysis, Trace Analysis Patterns | No Comments »
Tuesday, June 24th, 2014
The previous patterns such as Basic Facts and Vocabulary Index address the mapping of a problem description to software execution artefacts such traces and logs. Indirect Facts analysis pattern addresses the problem of an incomplete problem description. However, we need another pattern for completeness that addresses the mapping from a log to troubleshooting and debugging recommendations. We call it Hidden Facts which are uncovered by trace analysis. Of course, there can be many such hidden facts and usually they are uncovered after narrowing down analysis to particular Threads of Activity, Adjoint Threads, Message Context, Message Set, or Data Flow. The need for that pattern had arisen during the pattern-oriented analysis of the trace case study from Malcolm McCaffery and can be illustrated on this diagram:

- Dmitry Vostokov @ DumpAnalysis.org + TraceAnalysis.org -
Posted in Debugging Environment, Log Analysis, Software Technical Support, Software Trace Analysis, Software Trace Reading, Trace Analysis Patterns, Troubleshooting Methodology | No Comments »