Archive for the ‘Data Analysis’ Category

Trace Analysis Patterns (Part 222)

Saturday, February 11th, 2023

Trace Windows is an obvious analysis pattern that was always implicit; we add it now because of the proliferation of stream processing. However, it captures not only horizontal windows but also vertical ones, similar to subspaces if we consider messages as vectors. Both types of windows can be combined. This is illustrated in the following diagram:
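The two window types can be sketched in a few lines of Python. This is only an illustration under assumptions that are not part of the pattern description: a trace is modelled as a list of dictionaries, a horizontal window is taken to be a contiguous range of messages, a vertical window is taken to be a subset of message attributes, and the function names and sample attributes (time, tid, text) are hypothetical.

```python
from typing import Dict, List, Sequence

# Assumption: one trace message is a mapping from attribute name to value,
# i.e. a "vector" with named components.
Message = Dict[str, object]


def horizontal_window(trace: List[Message], start: int, stop: int) -> List[Message]:
    """Select a contiguous run of messages by position (stop is exclusive)."""
    return trace[start:stop]


def vertical_window(trace: List[Message], attributes: Sequence[str]) -> List[Message]:
    """Keep only the chosen attributes of every message."""
    return [{name: msg[name] for name in attributes if name in msg} for msg in trace]


def combined_window(
    trace: List[Message], start: int, stop: int, attributes: Sequence[str]
) -> List[Message]:
    """Apply a horizontal window first, then a vertical one."""
    return vertical_window(horizontal_window(trace, start, stop), attributes)


# Hypothetical sample trace.
trace = [
    {"time": 1, "tid": 10, "text": "open"},
    {"time": 2, "tid": 11, "text": "read"},
    {"time": 3, "tid": 10, "text": "write"},
    {"time": 4, "tid": 11, "text": "close"},
]

window = combined_window(trace, 1, 3, ["time", "text"])
print(window)
```

The order of the two selections does not matter here because they are independent: one picks messages, the other picks attributes.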

- Dmitry Vostokov @ DumpAnalysis.org + TraceAnalysis.org -

Trace Analysis Patterns (Part 210)

Wednesday, August 11th, 2021

When we have different traces and logs, not necessarily with the same Trace Schema, and select only the messages that satisfy some condition, for example, having the same ATID (see Adjoint Thread of Activity) or FID (see Feature of Activity) value, we get a new trace that we call Trace Join. Combining ATID from one trace with Message Set from another is also possible, as illustrated in this allegorical picture, where joining is done by an author value of “Plato” or a title containing “Plato” (all case-insensitive):

This is very similar to relational data joins. A join of a trace with itself is possible too. A Dia|gram picture (similar to the previous patterns) is left as an exercise.
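As a rough sketch of the “Plato” example, the following Python code joins two traces with different schemas by a case-insensitive condition. The function name trace_join, the added source attribute, and the sample data are hypothetical; messages are modelled as dictionaries, and the result simply keeps each trace's own message order, one trace after another, because no shared ordering attribute is assumed.

```python
from typing import Callable, Dict, Iterable, List

# Assumption: one trace message is a mapping from attribute name to value.
Message = Dict[str, object]


def trace_join(
    traces: Iterable[List[Message]], condition: Callable[[Message], bool]
) -> List[Message]:
    """Collect messages from all traces that satisfy the condition.

    Each selected message is tagged with the index of its source trace
    (a hypothetical "source" attribute added for illustration).
    """
    joined: List[Message] = []
    for source, trace in enumerate(traces):
        for msg in trace:
            if condition(msg):
                joined.append({"source": source, **msg})
    return joined


def mentions_plato(msg: Message) -> bool:
    """Author equals "Plato" or title contains "Plato", ignoring case."""
    author = str(msg.get("author", "")).lower()
    title = str(msg.get("title", "")).lower()
    return author == "plato" or "plato" in title


# Two hypothetical traces with different schemas.
books = [
    {"author": "Plato", "work": "Republic"},
    {"author": "Aristotle", "work": "Politics"},
]
papers = [
    {"title": "Reading PLATO today", "year": 2020},
    {"title": "Notes on Kant", "year": 2019},
]

joined = trace_join([books, papers], mentions_plato)
for msg in joined:
    print(msg)
```

Passing the same trace twice, as in trace_join([books, books], mentions_plato), gives a join of a trace with itself.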

We initially wanted to call this analysis pattern Filtered Mask but later realized that Trace Mask may not be possible if there is no global ordering information, such as time. In such a case, Serial Trace is still possible.

Trace Analysis Patterns (Part 207)

Monday, April 26th, 2021

Trace Schema can be represented as Schema Trace or, to avoid naming confusion, Definition Trace. The resulting trace loses ordering (similar to an unordered Message Set) but allows the application of trace and log analysis patterns, especially if some order is fixed, for example, alphabetical for names or the original presentation column arrangement. The Trace Schema of the schema definition itself can be represented as another Definition Trace, as illustrated in the following diagram:
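One possible reading of this in code is sketched below. It assumes, purely for illustration, that a schema is a list of attribute definitions with name and type fields, that each definition becomes one message of the Definition Trace, and that the schema of a trace can be inferred from the Python types of its values. The function names and the sample schema are hypothetical.

```python
from typing import Dict, List

# Assumption: one trace message is a mapping from attribute name to value.
Message = Dict[str, object]

# Hypothetical schema, listed in its original column order.
schema = [
    {"name": "Time", "type": "timestamp"},
    {"name": "PID", "type": "int"},
    {"name": "TID", "type": "int"},
    {"name": "Message", "type": "str"},
]


def definition_trace(schema: List[Message], order: str = "column") -> List[Message]:
    """Turn attribute definitions into numbered messages under a fixed order."""
    if order == "alphabetical":
        rows = sorted(schema, key=lambda attr: str(attr["name"]))
    elif order == "column":
        rows = list(schema)
    else:
        raise ValueError(f"unknown order: {order}")
    return [{"#": number, **attr} for number, attr in enumerate(rows, start=1)]


def infer_schema(trace: List[Message]) -> List[Message]:
    """Describe the attributes of a trace, in first-seen order."""
    seen: Dict[str, str] = {}
    for msg in trace:
        for name, value in msg.items():
            seen.setdefault(name, type(value).__name__)
    return [{"name": name, "type": type_name} for name, type_name in seen.items()]


by_column = definition_trace(schema)
by_name = definition_trace(schema, order="alphabetical")

# The schema of the Definition Trace, represented as another Definition Trace.
meta = definition_trace(infer_schema(by_column))
for msg in meta:
    print(msg)
```

Once the schema is a list of messages, the same selection and comparison code that works on ordinary traces can be applied to it.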

Trace Analysis Patterns (Part 206)

Sunday, April 11th, 2021

Most trace and log analysis pattern illustrations that use the Dia|gram language take one of these two general forms:

Although the first form represents typical ETW trace attributes, the analysis pattern descriptions are usually independent of attribute name semantics. It, therefore, makes sense to generalize such forms into the following Trace Schema forms, with ATIDs for Adjoint Threads of Activity for the first form, and with FIDs for Features of Activity for the second form:

Such Trace Schemas are useful for various trace and log joins other than Trace Mask.

Trace Analysis Patterns (Part 205)

Sunday, April 4th, 2021

When looking at trace and log messages, we are usually interested in some features (for example, but not only, when doing feature engineering), which can be labelled via Feature IDs (FID). Messages that have the same FID value constitute a Feature of Activity, similar to a Thread of Activity (or an Adjoint Thread of Activity).

Such Features of Activity can span several (A)TIDs, in contrast to Fibers of Activity, which are confined to the same (A)TID and may have different FID values. Therefore, inside an (A)TID there can be several Features of Activity with different FID values.
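The grouping described above can be sketched as follows. The code assumes, for illustration only, that messages are dictionaries with tid and fid attributes and that unlabelled messages carry a fid of None; the function names and sample messages are hypothetical.

```python
from collections import defaultdict
from typing import Dict, List, Set

# Assumption: one trace message is a mapping from attribute name to value.
Message = Dict[str, object]


def features_of_activity(trace: List[Message]) -> Dict[object, List[Message]]:
    """Group messages by FID value; unlabelled messages are skipped."""
    features: Dict[object, List[Message]] = defaultdict(list)
    for msg in trace:
        fid = msg.get("fid")
        if fid is not None:
            features[fid].append(msg)
    return dict(features)


def tids_spanned(feature: List[Message]) -> Set[object]:
    """TIDs touched by one Feature of Activity."""
    return {msg["tid"] for msg in feature}


def fids_inside(trace: List[Message], tid: object) -> Set[object]:
    """FID values that occur inside one TID."""
    return {
        msg["fid"]
        for msg in trace
        if msg["tid"] == tid and msg.get("fid") is not None
    }


# Hypothetical sample trace.
trace = [
    {"#": 1, "tid": 1, "fid": "login", "text": "user name entered"},
    {"#": 2, "tid": 2, "fid": "login", "text": "credentials checked"},
    {"#": 3, "tid": 1, "fid": "render", "text": "window painted"},
    {"#": 4, "tid": 2, "fid": None, "text": "heartbeat"},
]

features = features_of_activity(trace)
print(sorted(tids_spanned(features["login"])))  # the feature spans two TIDs
print(sorted(fids_inside(trace, 1)))            # one TID holds two features
```

In the sample data, the login feature spans both TIDs, while TID 1 contains messages from two different features.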

This analysis pattern serves as a base for other data science analysis patterns we add next.
