Crash Dump Analysis Patterns (Part 236)

Saturday, February 13th, 2016

When we have a performance issue, we may request a set of consecutive memory dumps saved at some interval. In such memory dumps, we may see the same thread(s) having similar stack trace(s). In this simple diagnostic scenario, we may diagnose several patterns based on the stack traces: Active Threads, which can be Spiking Threads with Spike Intervals, or stable, unchanging Wait Chains. Here we may easily identify Top active and Blocking modules based on Module Wait Chain.
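The simple scenario can be sketched in a few lines of Python. This is only an illustration under stated assumptions: each dump is assumed to be already reduced to a mapping from thread ID to a list of frames (top frame first), the helper name recurring_threads is hypothetical, and "similar" stack traces are simplified to identical ones.

```python
from typing import Dict, List

# Assumed representation: thread ID -> stack frames, top frame first.
Snapshot = Dict[int, List[str]]


def recurring_threads(snapshots: List[Snapshot]) -> Dict[int, List[str]]:
    """Return threads whose stack trace is identical in every snapshot.

    Simplification: the text speaks of similar traces; exact equality
    is used here to keep the sketch short.
    """
    if not snapshots:
        return {}
    first = snapshots[0]
    return {
        tid: stack
        for tid, stack in first.items()
        if all(snap.get(tid) == stack for snap in snapshots[1:])
    }
```

A thread returned by this helper is a candidate for a stable Wait Chain or a Spiking Thread; telling the two apart still requires looking at the frames and at CPU consumption.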

A more complex case arises when we have different Active Threads and/or Wait Chains with different thread IDs at different times. However, if their Top Module is the same, we may have found the component that is the root cause of the performance issue. This is especially likely for Active Threads: since threads usually spend most of their time waiting, it is statistically probable that a thread caught running in a snapshot was active for a considerable time delta around the snapshot time. Such a hypothesis may also be confirmed by Inter-Correlation analysis with software traces and logs, where we can see Thread of Activity Discontinuities and Time Deltas.

We call this analysis pattern Diachronic Module because the module component appears in different thread stack traces diachronically (at different times). The typical simplified scenario is illustrated in this diagram:

This analysis pattern differs from the synchronous case, where the module component appears in different thread stack traces at the same time, which was named Ubiquitous Component.
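The distinction between the two patterns can be sketched in Python. This is a rough illustration, not a tool: snapshots are assumed to be mappings from thread ID to frames written as module!function (top frame first), all function and module names are hypothetical, and both checks look only at the top module of each thread, which is a simplification.

```python
from typing import Dict, List, Set

# Assumed representation: thread ID -> stack frames, top frame first.
Snapshot = Dict[int, List[str]]


def top_module(stack: List[str]) -> str:
    """Module name of the top frame, assuming 'module!function' frames."""
    return stack[0].split("!", 1)[0]


def top_module_threads(snapshots: List[Snapshot]) -> Dict[str, List[Set[int]]]:
    """For each top module, the set of thread IDs per snapshot index."""
    result: Dict[str, List[Set[int]]] = {}
    for index, snapshot in enumerate(snapshots):
        for tid, stack in snapshot.items():
            if not stack:
                continue  # skip threads without a recorded stack
            module = top_module(stack)
            per_snapshot = result.setdefault(module, [set() for _ in snapshots])
            per_snapshot[index].add(tid)
    return result


def diachronic_modules(snapshots: List[Snapshot]) -> List[str]:
    """Modules on top in two or more snapshots under differing thread IDs."""
    found = []
    for module, per_snapshot in top_module_threads(snapshots).items():
        present = [tids for tids in per_snapshot if tids]
        if len(present) >= 2 and len({frozenset(t) for t in present}) >= 2:
            found.append(module)
    return sorted(found)


def ubiquitous_modules(snapshots: List[Snapshot]) -> List[str]:
    """Modules on top of two or more threads within a single snapshot."""
    return sorted(
        module
        for module, per_snapshot in top_module_threads(snapshots).items()
        if any(len(tids) >= 2 for tids in per_snapshot)
    )
```

A module reported by diachronic_modules is only a hypothesis about the root cause; as noted above, it is best confirmed against software traces and logs.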

- Dmitry Vostokov -