Archive for the ‘Science of Software Tracing’ Category

Malware Analysis Report System (MARS)

Friday, October 22nd, 2010

I detour for MARS expedition. You may also call it Memory Analysis Report System as malware analysis is always exploration of memory (in general). Why is this sudden change of course? After reading Gilles Deleuze I want to broaden the concept of “malware” and give it new orientation and direction of thinking. Beside that I also want new challenges after many years of research in pattern-driven memory dump and software trace analysis of abnormal software behaviour.

You may have also noticed small restructuring (rebranding) of this blog and DumpAnalysis.org headers.

See you there :-)

- Dmitry Vostokov @ DumpAnalysis.org + TraceAnalysis.org -

Malware Analysis Patterns

Wednesday, October 20th, 2010

As a practical example of applying behavioral and structural pattern analysis of computer memory and traces OpenTask plans to publish the following title next year:

  • Title: Malware Patterns: Structure and Behavior of Computer Adware, Crimeware, Rootkits, Scareware, Spyware, Trojans, Viruses, Victimware and Worms
  • Author: Dmitry Vostokov
  • Paperback: 1200 pages
  • Publisher: OpenTask (October 2011)
  • ISBN-13: 978-1-908043-01-6

The inclusion of victimware is necessary because of the effects of defective malware.

- Dmitry Vostokov @ DumpAnalysis.org + TraceAnalysis.org -

Notation for Memory and Trace Analysis (Part 1)

Thursday, October 7th, 2010

It is time now to introduce a syntactical notation for memory (dump) and software trace analysis pattern languages (in addition to graphical notation proposed earlier). It should be simple and concise: allow easy grammar with plain syntax and obvious reading semantics. We propose to use capitalized letters for major pattern categories, for example, W for wait chains and D for deadlocks. Then use subscripts (or small letters) for pattern subcategories, for example, Wcs and Dlpc. Several categories and subcategories can be combined by using slash (/), for example, Wcs/Dcs/lpc. Slash notation is better viewed using subscripts:

Wcs/Dcs/lpc

Next part will introduce more categories and propose notational adornments for pattern succession, space differentiation and the inclusion of details in notational sentences.

- Dmitry Vostokov @ DumpAnalysis.org + TraceAnalysis.org -

Structural Memory Patterns (Part 1)

Friday, September 24th, 2010

Now it’s time to divide memory analysis patterns discerned so far as mostly abnormal software behavior memory dump and software trace patterns into behavioral and structural catalogues. The goal is to account for normal system-independent structural entities and relationships visible in memory like modules, threads, processes and so on.

The first pattern (and also a super-pattern) we discuss in this part is called Memory Snapshot. It is further subdivided into Structured Memory Snapshot and BLOB Memory Snapshot. Structured sub-pattern includes:

- Contiguous memory dump files with artificially generated headers (for example, physical or process virtual space memory dump)

- Software trace messages with imposed internal structure

BLOB sub-pattern variety includes address range snapshots without any externally imposed structure, for example, saved by .writemem WinDbg command or ReadProcessMemory API and contiguous buffer and raw memory dumps saved by various memory acquisition tools.

Behavioral patterns that relate to Memory Snapshot pattern are:

I strive initially to publish at least one such pattern every day to fill the gap of normal patterns in memory analysis and later add more multi-platform details and examples from other platforms like Linux, Mac OS X, embedded and selected important historical architectures.

- Dmitry Vostokov @ DumpAnalysis.org + TraceAnalysis.org -

Debugged! MZ/PE March print issue is out

Sunday, September 19th, 2010

Finally, after the delay, the issue is available in print on Amazon and through other sellers:

Debugged! MZ/PE: Multithreading

- Dmitry Vostokov @ DumpAnalysis.org + TraceAnalysis.org -

Presenting a Software Story

Sunday, July 4th, 2010

It’s time to introduce a conceptual software narratological framework for viewing software traces (using rich ETW / CDF tracing as our main focus). Here we consider a software story (fabula) as a full trace when every component was selected for tracing and emits debug messages during code execution paths. However, during viewing we can filter on and off certain modules, threads, processes, messages, etc. (adjoint threading) and see a different sub-story or plot (sujet). Every software plot (please do not confuse with PLOT acronym) can be presented differently (using appropriate discourse). Some presentational examples include temporal rearrangement, collapse of repetitive regions, source code hypertext (lexia) and allegorical devices such as message tool-tip comments. Here is a diagram that depicts story (fable, fabula) - plot (sujet) - presentation (discourse):

- Dmitry Vostokov @ DumpAnalysis.org + TraceAnalysis.org -

The Extended Software Trace

Sunday, June 13th, 2010

By analogy with paratext let’s introduce a software narratological concept of the extended software trace that consists of a software trace plus additional supporting information that makes troubleshooting and debugging easier. Such “paratextual” information can consists of pictures, videos, accounts of scenarios and past problem histories, customer interviews and even software trace delivery medium and format (if preformatted).

- Dmitry Vostokov @ DumpAnalysis.org + TraceAnalysis.org -

Bugtation No.120

Saturday, May 22nd, 2010

… science is in reality a classification and analysis of the contents of the memory;

Karl Pearson, The Grammar of Science

- Dmitry Vostokov @ DumpAnalysis.org + TraceAnalysis.org -

Two Readings of a Software Trace

Friday, May 21st, 2010

When we have a software trace we read it in two directions. The first one is to deconstruct it into a linear ordered source code based on PLOT fragments. The second direction is to construct an interpretation that serve as an explanation for reported software behaviour. During the interpretive reading we remove irrelevant information, compress relevant activity regions and construct the new fictional software trace based on discovered patterns and our problem description.

- Dmitry Vostokov @ DumpAnalysis.org + TraceAnalysis.org -

Software Behavior Patterns (Part 1)

Monday, May 10th, 2010

My drive to generalization led me to place an adornment on the portal to highlight the fact that memory and software trace analysis patterns are under an umbrella of general software behaviour patterns:

http://www.dumpanalysis.org/Software-Behavior-Patterns-Headline

In the forthcoming post series I plan to write about similarities between these two branches and also provide pattern examples from non-Windows platforms. All this material will provide the foundation for the forthcoming book Software Behavior: A Guide to Systematic Analysis (ISBN: 978-1906717162).

- Dmitry Vostokov @ DumpAnalysis.org + TraceAnalysis.org -

Basic Software PLOTs (Part 0)

Thursday, May 6th, 2010

Befind every trace and its messages is source code:

Borrowing the acronym PLOT (Program Lines of Trace) we now try to discern basic source code patterns that give rise to simple message patterns in software traces. There are only a few distinct PLOTs and the ability to mentally map trace statements to source code is crucial to software trace reading and comprehension. More about that in subsequent parts. More complex message patterns (for example, specific message blocks or correlated messages) arise from supportable and maintainable realizations of architectural, design and implementation patterns and will be covered in another post series.

I was thinking about acronym SLOT (Source Lines of Trace) but decided to use PLOT because it metaphorically bijects into literary theory and narrative plots.

Forthcoming CDF and ETW Software Trace Analysis: Practical Foundations

- Dmitry Vostokov @ DumpAnalysis.org + TraceAnalysis.org

Archaeological Foundations for Memory Analysis

Thursday, April 22nd, 2010

I’ve decided to adapt archaeological classificatory framework (using my favourite method of inquiry: metaphorical bijectionism) to lay out foundations for yet another attempt to classify DA+TA patterns):

Attribute  ↔ Pattern
Artefact   ↔ Component Artefact1
Assemblage ↔ Component Assemblage
Culture    ↔ Memory System Culture
2

1 Can be either a component-generated artefact or a component like a module or symbol file
2 Typical examples of memory system cultures are Windows, UNIX or even “Multiplatform”

I propose a word Memoarchaeological for such a framework and Memoarchaeology for a branch of Memoretics that studies saved computer memory artifacts from past computations (as opposed to live memory).

Note: In one of the forthcoming issues of Debugged! MZ/PE magazine there will be presented yet another classificatory scheme.

- Dmitry Vostokov @ DumpAnalysis.org + TraceAnalysis.org -

Software Behavior Patterns (Part 0)

Thursday, April 22nd, 2010

Forthcoming CARE and STARE online systems additionally aim to provide software behaviour pattern identification via debugger log and trace analysis and suggest possible software troubleshooting patterns. The purpose of these post series is to provide high level overview of possible patterns of software behavior and how they can be recognised and analyzed. This work started in October, 2006 with the identification of computer memory patterns and later continued with software trace patterns. Bringing all of them under a unified linked framework seems quite natural to me.

- Dmitry Vostokov @ DumpAnalysis.org + TraceAnalysis.org -

Modern Memory Dump and Software Trace Analysis: Volumes 1-3

Sunday, April 18th, 2010

OpenTask to offer first 3 volumes of Memory Dump Analysis Anthology in one set:

The set is available exclusively from OpenTask e-Commerce web site starting from June. Individual volumes are also available from Amazon, Barnes & Noble and other bookstores worldwide.

Product information:

  • Title: Modern Memory Dump and Software Trace Analysis: Volumes 1-3
  • Author: Dmitry Vostokov
  • Language: English
  • Product Dimensions: 22.86 x 15.24
  • Paperback: 1600 pages
  • Publisher: Opentask (31 May 2010)
  • ISBN-13: 978-1-906717-99-5

Information about individual volumes:

- Dmitry Vostokov @ DumpAnalysis.org + TraceAnalysis.org -

Trace Analysis Patterns (Part 17)

Thursday, March 4th, 2010

This is an extension of Thread of Activity pattern based on the concept of multibraiding and it is called Adjoint Thread of Activity correspondingly. I’m going to illustrate it soon when I publish a synthetic case study involving several software trace analysis patterns.

- Dmitry Vostokov @ DumpAnalysis.org + TraceAnalysis.org -

Forthcoming Memory Dump Analysis Anthology, Volume 4

Thursday, February 11th, 2010

This is a revised, edited, cross-referenced and thematically organized volume of selected DumpAnalysis.org blog posts about crash dump analysis and debugging written in July 2009 - January 2010 for software engineers developing and maintaining products on Windows platforms, quality assurance engineers testing software on Windows platforms and technical support and escalation engineers dealing with complex software issues. The fourth volume features:

- 13 new crash dump analysis patterns
- 13 new pattern interaction case studies
- 10 new trace analysis patterns
- 6 new Debugware patterns and case study
- Workaround patterns
- Updated checklist
- Fully cross-referenced with Volume 1, Volume 2 and Volume 3
- New appendixes

Product information:

  • Title: Memory Dump Analysis Anthology, Volume 4
  • Author: Dmitry Vostokov
  • Language: English
  • Product Dimensions: 22.86 x 15.24
  • Paperback: 410 pages
  • Publisher: Opentask (30 March 2010)
  • ISBN-13: 978-1-906717-86-5
  • Hardcover: 410 pages
  • Publisher: Opentask (30 April 2010)
  • ISBN-13: 978-1-906717-87-2

Back cover features memory space art image: Internal Process Combustion.

- Dmitry Vostokov @ DumpAnalysis.org + TraceAnalysis.org -

Extending Multithreading to Multibraiding (Adjoint Threading)

Sunday, January 17th, 2010

Having considered computational threads as braided strings and after discerning several software trace analysis patterns (just the beginning) we can see formatted and tabulated software trace output in a new light and employ the “fabric of traces” and braid metaphors for an Adjoint Thread concept. This new concept was motivated by reading about Extended Phenotype (*) and extensive analysis of Citrix ETW-based CDF traces using CDFAnalyzer. The term Adjoint was borrowed from mathematics because the concept we discuss below resembles this metaphorical formula: (Thread A, B) = [A, Thread B]. Let me first illustrate adjoint threading using simplified trace tables. Consider this generalized software trace example (date and time column is omitted for visual clarity):

#

Source Dir

PID

TID

File Name

Function

Message

1

\src\subsystemA

2792

5676

file1.cpp

fooA

Message text…

2

\src\subsystemA

2792

5676

file1.cpp

fooA

Message text…

3

\src\subsystemA

2792

5676

file1.cpp

fooA

Message text…

4

\src\lib

2792

5680

file2.cpp

barA

Message text…

5

\src\subsystemA

2792

5680

file1.cpp

fooA

Message text…

6

\src\subsystemA

2792

5676

file1.cpp

fooA

Message text…

7

\src\lib

2792

5680

file2.cpp

fooA

Message text…

8

\src\lib

2792

5680

file2.cpp

fooA

Message text…

9

\src\subsystemB

2792

3912

file3.cpp

barB

Message text…

10

\src\subsystemB

2792

3912

file3.cpp

barB

Message text…

11

\src\subsystemB

2792

3912

file3.cpp

barB

Message text…

12

\src\subsystemB

2792

3912

file3.cpp

barB

Message text…

13

\src\subsystemB

2792

3912

file3.cpp

barB

Message text…

14

\src\subsystemB

2792

3912

file3.cpp

barB

Message text…

15

\src\subsystemB

2792

2992

file4.cpp

fooB

Message text…

16

\src\subsystemB

2792

3008

file4.cpp

fooB

Message text…

We see several threads in a process PID 2792. In CDFAnalyzer we can filter trace messages that belong to any column and if we filter by TID we get a view of any Thread of Activity. However, each thread can “run” through any source directory, file name or function. If a function belongs to a library multiple threads would access it. This source location (can be considered as a subsystem), file or function view of activity is called an Adjoint Thread. For example, if we filter only subsystemA column in the trace above we get this table:

#

Source Dir

PID

TID

File Name

Function

Message

1

\src\subsystemA

2792

5676

file1.cpp

fooA

Message …

2

\src\subsystemA

2792

5676

file1.cpp

fooA

Message …

3

\src\subsystemA

2792

5676

file1.cpp

fooA

Message …

5

\src\subsystemA

2792

5680

file1.cpp

fooA

Message …

6

\src\subsystemA

2792

5676

file1.cpp

fooA

Message …

7005

\src\subsystemA

2792

5664

file1.cpp

fooA

Message …

10198

\src\subsystemA

2792

5664

file1.cpp

fooA

Message …

10364

\src\subsystemA

2792

5664

file1.cpp

fooA

Message …

10417

\src\subsystemA

2792

5664

file1.cpp

fooA

Message …

10420

\src\subsystemA

2792

5676

file1.cpp

fooA

Message …

10422

\src\subsystemA

2792

5680

file1.cpp

fooA

Message …

10587

\src\subsystemA

2792

5664

file1.cpp

fooA

Message …

10767

\src\subsystemA

2792

5680

file1.cpp

fooA

Message …

11126

\src\subsystemA

2792

5668

file1.cpp

fooA

Message …

11131

\src\subsystemA

2792

5680

file1.cpp

fooA

Message …

11398

\src\subsystemA

2792

5676

file1.cpp

fooA

Message …

11501

\src\subsystemA

2792

5668

file1.cpp

fooA

Message …

11507

\src\subsystemA

2792

5668

file1.cpp

fooA

Message …

11509

\src\subsystemA

2792

5664

file1.cpp

fooA

Message …

11513

\src\subsystemA

2792

5680

file1.cpp

fooA

Message …

11524

\src\subsystemA

2792

5668

file1.cpp

fooA

Message …

We can graphically view subsystemA as a braid string that “permeates the fabric of threads”:

We can get many different braids by changing filters, hence multibraiding. Here is another example of a driver source file view initially permeating 2 process contexts and 4 threads:

#

Source Dir

PID

TID

File Name

Function

Message

41

\src\sys\driver

3636

3848

entry.c

DriverEntry

IOCTL …

80

\src\sys\driver

3636

3896

entry.c

DriverEntry

IOCTL …

99

\src\sys\driver

3636

3896

entry.c

DriverEntry

IOCTL …

102

\src\sys\driver

3636

3896

entry.c

DriverEntry

IOCTL …

179

\src\sys\driver

3636

3896

entry.c

DriverEntry

IOCTL …

180

\src\sys\driver

3636

3896

entry.c

DriverEntry

IOCTL …

311

\src\sys\driver

3636

3896

entry.c

DriverEntry

IOCTL …

447

\src\sys\driver

3636

3896

entry.c

DriverEntry

IOCTL …

448

\src\sys\driver

3636

3896

entry.c

DriverEntry

IOCTL …

457

\src\sys\driver

2792

5108

entry.c

DriverEntry

IOCTL …

608

\src\sys\driver

3636

3896

entry.c

DriverEntry

IOCTL …

614

\src\sys\driver

3636

3896

entry.c

DriverEntry

IOCTL …

655

\src\sys\driver

3636

3896

entry.c

DriverEntry

IOCTL …

675

\src\sys\driver

3636

3896

entry.c

DriverEntry

IOCTL …

678

\src\sys\driver

3636

3896

entry.c

DriverEntry

IOCTL …

680

\src\sys\driver

3636

3896

entry.c

DriverEntry

IOCTL …

681

\src\sys\driver

3636

3896

entry.c

DriverEntry

IOCTL …

1145

\src\sys\driver

3636

4960

entry.c

DriverEntry

IOCTL …

1153

\src\sys\driver

3636

4960

entry.c

DriverEntry

IOCTL …

1154

\src\sys\driver

3636

4960

entry.c

DriverEntry

IOCTL …

(*) A bit of digression. Looks like biology keeps giving insights into software, there is even a software phenotype metaphor albeit a bit restricted to code, I just thought that we need also an Extended Software Phenotype.

- Dmitry Vostokov @ DumpAnalysis.org + TraceAnalysis.org -

MemD Category (Categories for the Working Software Defect Researcher, Part 1)

Friday, January 8th, 2010

I started applying category theory (as an alternative to traditional set-theoretic approach of memory bits) to memory dump analysis, debugging and software trace analysis in parallel to my studies of that branch of mathematics and reading the book Memory Evolutive Systems. In addition to complex systems modelled in the latter book I apply evolutive systems approach to computer memory. Here is a picture illustrating MemD category of memory dumps (snapshots) as category objects and category arrows as different ways in arriving at the same memory picture:

 

This category definitely applies to software traces as well if we consider every individual trace message or statement as a minidump. We currently consider software trace category MemT as a subcategory of MemD.

Configuration category of a computer memory dump represents its memory internals at an instant t (ideal memory dumps) or at a time interval T: components and links, pointers, wait chains, causal relations, data flows, … .

Pointers and their links are also objects and arrows to form a category, called MemP(tr). The following picture illustrates it with the last pointer shown as a dereference fixpoint:

The perception field of a pointer is a category of all links to its memory location:

However, the operating field of a pointer is its link to a memory location it is pointing to.

- Dmitry Vostokov @ DumpAnalysis.org -

Memory Dump Analysis Anthology, Volume 3

Sunday, December 20th, 2009

“Memory dumps are facts.”

I’m very excited to announce that Volume 3 is available in paperback, hardcover and digital editions:

Memory Dump Analysis Anthology, Volume 3

Table of Contents

In two weeks paperback edition should also appear on Amazon and other bookstores. Amazon hardcover edition is planned to be available in January 2010.

The amount of information was so voluminous that I had to split the originally planned volume into two. Volume 4 should appear by the middle of February together with Color Supplement for Volumes 1-4. 

- Dmitry Vostokov @ DumpAnalysis.org -

The Pyramid of Memory Analysis Institutions

Thursday, December 17th, 2009

Previously announced Software Maintenance Institute was finally registered in Ireland (Reg. No. 400906) and its certificate was received yesterday.

Here is the current component structure of various institutions (depicted in UML):

Interface Tags:

IIP Interface of Iterative Publishing
IRD Interface of Research and Development
IDR Interface of Defect Research
IIR Interface of Information Repository
IME Interface of Memetic Engineering

- Dmitry Vostokov @ DumpAnalysis.org -