Crash Dump Analysis Patterns (Part 228)

Predicate Stack Trace Collections allow us to get a subset of stack traces, for example, only the stack traces where a specific module is used (the !stacks 2 module WinDbg command). From a diagnostic analysis perspective, the order in which threads from the subset appear is also important, especially when the output is sorted by thread creation time or when the order is simply that of a global thread linked list. We call this analysis pattern Thread Poset by analogy with the mathematical concept of a poset (partially ordered set).
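As a rough illustration of the idea, a predicate collection can be thought of as an order-preserving filter over the full thread list. The sketch below is only a model, not a debugger API: ThreadEntry, predicate_subset, and the thread entries are all hypothetical names and data invented for the example.

```python
from typing import NamedTuple


class ThreadEntry(NamedTuple):
    """Hypothetical, simplified record for one thread of a stack trace collection."""
    thread_id: str
    modules: tuple  # modules seen on the thread's stack trace


def predicate_subset(threads, predicate):
    """Keep only matching threads; their relative order is left unchanged."""
    return [t for t in threads if predicate(t)]


# Made-up collection, assumed to be already in the order the debugger listed it.
all_threads = [
    ThreadEntry("T1", ("nt", "ModuleA")),
    ThreadEntry("T2", ("nt", "ModuleB")),
    ThreadEntry("T3", ("nt", "ModuleA")),
    ThreadEntry("T4", ("nt",)),
]

# Predicate: the thread's stack trace uses ModuleA.
subset = predicate_subset(all_threads, lambda t: "ModuleA" in t.modules)
subset_ids = [t.thread_id for t in subset]
print(subset_ids)
```

Because the filter never reorders entries, any order present in the full collection (for example, creation order) carries over to the subset, which is what makes the subset comparable between runs.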

Such an analysis pattern is mostly useful when we compare stack traces for differences or when we don’t have symbols for some problem version and want to map its threads to an earlier normal run for which symbol files are available. Any discrepancies may point in the direction of further diagnostic analysis. For example, we got this fragment of a Stack Trace Collection:

4.000188 fffffa800d3d3b50 ffd0780f Blocked ModuleA+0x1ac1
4.00018c fffffa800d3f9950 ffd07b53 Blocked ModuleA+0xd802
4.000190 fffffa800d4161b0 fffffda6 Blocked ModuleA+0x9ce4
4.000194 fffffa800d418b50 fffffda6 Blocked ModuleA+0x9ce4
4.000198 fffffa800d418660 fffffda6 Blocked ModuleA+0x9ce4
4.0001ac fffffa800d41eb50 ffd078d2 Blocked ModuleA+0xa7cf
4.0001b0 fffffa800d41e660 ffd0780f Blocked ModuleA+0x9ce4
4.0001c0 fffffa800d48f300 ffd0e5c0 Blocked ModuleA+0x7ee5

We didn’t have symbols and, therefore, didn’t know whether there was anything wrong with those threads. Fortunately, we had a Thread Poset from an earlier 32-bit version with available symbol files:

4.0000ec 85d8dc58 000068c Blocked ModuleA!FuncA+0x9b
4.0000f0 85d9fc78 001375a Blocked ModuleA!FuncB+0x67
4.0000fc 85db8a58 000068c Blocked ModuleA!WorkerThread+0xa2
4.000104 85cdbd48 000ff44 Blocked ModuleA!WorkerThread+0xa2
4.000108 85da2788 000ff47 Blocked ModuleA!WorkerThread+0xa2

4.000110 857862e0 0013758 Blocked ModuleA!FuncC+0xe4
4.000114 85dda250 000ff44 Blocked ModuleA!FuncD+0xf2

If we map the worker threads to the middle section of the x64 version, we see just one more worker thread, but the overall order is the same:

4.000188 fffffa800d3d3b50 ffd0780f Blocked ModuleA+0x1ac1
4.00018c fffffa800d3f9950 ffd07b53 Blocked ModuleA+0xd802
4.000190 fffffa800d4161b0 fffffda6 Blocked ModuleA+0x9ce4
4.000194 fffffa800d418b50 fffffda6 Blocked ModuleA+0x9ce4
4.000198 fffffa800d418660 fffffda6 Blocked ModuleA+0x9ce4

4.0001ac fffffa800d41eb50 ffd078d2 Blocked ModuleA+0xa7cf
4.0001b0 fffffa800d41e660 ffd0780f Blocked ModuleA+0x9ce4
4.0001c0 fffffa800d48f300 ffd0e5c0 Blocked ModuleA+0x7ee5
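This kind of mapping can be approximated mechanically. The sketch below is one possible approach, not part of the original analysis: it takes the last field of each listing line as the top frame, replaces every frame with the index of its first appearance (so that symbolic and raw-offset frames become comparable), and diffs the two index sequences with Python's difflib. The helper names top_frames, order_pattern, and order_differences are hypothetical, and the approach assumes that equal top frames mean equal thread roles.

```python
from difflib import SequenceMatcher

X86_LISTING = """\
4.0000ec 85d8dc58 000068c Blocked ModuleA!FuncA+0x9b
4.0000f0 85d9fc78 001375a Blocked ModuleA!FuncB+0x67
4.0000fc 85db8a58 000068c Blocked ModuleA!WorkerThread+0xa2
4.000104 85cdbd48 000ff44 Blocked ModuleA!WorkerThread+0xa2
4.000108 85da2788 000ff47 Blocked ModuleA!WorkerThread+0xa2

4.000110 857862e0 0013758 Blocked ModuleA!FuncC+0xe4
4.000114 85dda250 000ff44 Blocked ModuleA!FuncD+0xf2
"""

X64_LISTING = """\
4.000188 fffffa800d3d3b50 ffd0780f Blocked ModuleA+0x1ac1
4.00018c fffffa800d3f9950 ffd07b53 Blocked ModuleA+0xd802
4.000190 fffffa800d4161b0 fffffda6 Blocked ModuleA+0x9ce4
4.000194 fffffa800d418b50 fffffda6 Blocked ModuleA+0x9ce4
4.000198 fffffa800d418660 fffffda6 Blocked ModuleA+0x9ce4
4.0001ac fffffa800d41eb50 ffd078d2 Blocked ModuleA+0xa7cf
4.0001b0 fffffa800d41e660 ffd0780f Blocked ModuleA+0x9ce4
4.0001c0 fffffa800d48f300 ffd0e5c0 Blocked ModuleA+0x7ee5
"""


def top_frames(listing):
    """Return the last whitespace-separated field of every non-blank line."""
    return [line.split()[-1] for line in listing.splitlines() if line.split()]


def order_pattern(frames):
    """Replace each frame with the index of its first appearance in the list."""
    first_seen = {}
    return [first_seen.setdefault(frame, len(first_seen)) for frame in frames]


def order_differences(frames_a, frames_b):
    """List (tag, frames only in a, frames only in b) where the patterns differ."""
    matcher = SequenceMatcher(None, order_pattern(frames_a),
                              order_pattern(frames_b), autojunk=False)
    return [(tag, frames_a[i1:i2], frames_b[j1:j2])
            for tag, i1, i2, j1, j2 in matcher.get_opcodes() if tag != "equal"]


x86_frames = top_frames(X86_LISTING)
x64_frames = top_frames(X64_LISTING)
differences = order_differences(x86_frames, x64_frames)
print(order_pattern(x86_frames))  # [0, 1, 2, 2, 2, 3, 4]
print(order_pattern(x64_frames))  # [0, 1, 2, 2, 2, 3, 2, 4]
print(differences)                # [('insert', [], ['ModuleA+0x9ce4'])]
```

For these two listings the only reported difference is one inserted ModuleA+0x9ce4 entry on the x64 side, which corresponds to the extra worker thread. Such a comparison looks only at top frames, so it cannot tell whether the threads agree deeper in their stack traces.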

So we may think of the x64 Thread Poset as normal if the x86 Thread Poset is normal too. Of course, this is only an initial assumption, and we should continue looking for other patterns of abnormal behavior. If necessary, we may need to inspect stack traces deeper, because individual threads from two Thread Posets may differ in their stack trace depth, subtraces, and usage of other components. Despite the same order, some threads may actually be abnormal.

- Dmitry Vostokov @ DumpAnalysis.org + TraceAnalysis.org -
