Crash Dump Analysis Patterns (Part 13j)

Once we have seen a sequence of process memory dumps with the largest one almost 4GB. They were all saved from the process with growing memory consumption from 200MB initially. Initially we suspected process heap Memory Leak. However, heap statistics (!heap -s) was normal. There were not even large block allocations. The dumps were also supplied with UMDH logs but their difference only showed Memory Fluctuation and not increase. Stack Trace Collection revealed one Spiking Thread was logging a heap allocation into user mode stack trace database. We could also see that it was Distributed Spike. Inspection of address space showed large number of sequential regions of the same size with Stack Trace Database entries inside. So we concluded that it was stack trace logging Instrumentation Side Effect and advised to limit stack backtrace size in gflags.exe.

To make sure we understood that problem correctly we decided to model it. We didn’t come to the same results probably due to different logging implementation but memory dumps clearly show the possibility of Insufficient Memory pattern variant. Here’s the source code:

void foo20 (int size)
{
    free(malloc(size));
}
#define FOO(x,y) void foo##x (int size) { foo##y(size); }
FOO(19,20)
FOO(18,19)
FOO(17,18)
FOO(16,17)
FOO(15,16)
FOO(14,15)
FOO(13,14)
FOO(12,13)
FOO(11,12)
FOO(10,11)
FOO(9,10)
FOO(8,9)
FOO(7,8)
FOO(6,7)
FOO(5,6)
FOO(4,5)
FOO(3,4)
FOO(2,3)
FOO(1,2)
typedef void (*PFN) (int);
#define ARRSZ 20
PFN pfnArr[ARRSZ]  = {foo1, foo2, foo3, foo4, foo5, foo6, foo7,
    foo8, foo9, foo10, foo11, foo12, foo13, foo14,
    foo15, foo16, foo17, foo18, foo19, foo20};
int _tmain(int argc, _TCHAR* argv[])
{
    int i;
    for (i = 1; i < 1000000000; ++i) 
    {
        pfnArr[i%ARRSZ](i);
    }
    Sleep(-1); 
    return 0;
}

It allocates and then freezes heap entries of different size from 1 byte to 1,000,000,000 bytes all with different 20 possible stack traces. We choose different stack traces to increase the number of different {size, stack backtrace} pairs as several allocation of similar size having the same stack trace may be recorded only once in the database. We emulate different stack traces by calling different entries in pfnArr. Each call then leads to foo20 but the resulting stack trace depth is different. We also enabled “Create user mode stack trace database” checkbox in gflags.exe for our application called AllocFree.exe.

Then we see the expansion of Stack Trace Database regions (addresses are different because memory dumps were taken from different application runs):

0:000> !address
[...]
+ 0`00240000 0`00312000 0`000d2000 MEM_PRIVATE MEM_COMMIT PAGE_READWRITE Other [Stack Trace Database]
0`00312000 0`01a37000 0`01725000 MEM_PRIVATE MEM_RESERVE Other [Stack Trace Database]
0`01a37000 0`01a40000 0`00009000 MEM_PRIVATE MEM_COMMIT PAGE_READWRITE Other [Stack Trace Database]

0:000> !address
[...]
+ 0`001b0000 0`0188c000 0`016dc000 MEM_PRIVATE MEM_COMMIT PAGE_READWRITE Other [Stack Trace Database]
0`0188c000 0`0188d000 0`00001000 MEM_PRIVATE MEM_RESERVE Other [Stack Trace Database]
0`0188d000 0`019b0000 0`00123000 MEM_PRIVATE MEM_COMMIT PAGE_READWRITE Other [Stack Trace Database]

Heap stays the same:

0:000> !heap -s
NtGlobalFlag enables following debugging aids for new heaps:
stack back traces
LFH Key                   : 0x000000f841c4f9c0
Termination on corruption : ENABLED
           Heap     Flags   Reserv  Commit  Virt   Free  List   UCR  Virt  Lock
  Fast
                             (k)     (k)    (k)     (k) length      blocks cont.
 heap
-------------------------------------------------------------------------------
------
0000000001a40000 08000002    4096   1444   4096   1164     4     3    0      0
   LFH
External fragmentation  80 % (4 free blocks)
0000000000010000 08008000      64      4     64      1     1     1    0      0
0000000000020000 08008000      64     64     64     61     1     1    0      0
-------------------------------------------------------------------------------
------
0:000> !heap -s
NtGlobalFlag enables following debugging aids for new heaps:
stack back traces
LFH Key                   : 0x000000473a639107
Termination on corruption : ENABLED
           Heap     Flags   Reserv  Commit  Virt   Free  List   UCR  Virt  Lock
  Fast
                             (k)     (k)    (k)     (k) length      blocks cont.
 heap
-------------------------------------------------------------------------------
------
00000000019c0000 08000002    4096   1444   4096   1164     4     3    0      0
   LFH
External fragmentation  80 % (4 free blocks)
0000000000010000 08008000      64      4     64      1     1     1    0      0
0000000000020000 08008000      64     64     64     61     1     1    0      0
-------------------------------------------------------------------------------
------

But we see the thread consuming much CPU and caught while logging stack backtrace:

0:000> kc
Call Site
ntdll!RtlpStdLogCapturedStackTrace
ntdll!RtlStdLogStackTrace
ntdll!RtlLogStackBackTraceEx

ntdll!RtlpAllocateHeap
ntdll!RtlAllocateHeap
AllocFree!_heap_alloc
AllocFree!malloc

AllocFree!foo20
AllocFree!foo19
AllocFree!foo18
AllocFree!foo17
AllocFree!foo16
AllocFree!foo15
AllocFree!foo14
AllocFree!foo13
AllocFree!foo12
AllocFree!foo11
AllocFree!foo10
AllocFree!foo9
AllocFree!foo8
AllocFree!foo7
AllocFree!foo6
AllocFree!foo5
AllocFree!foo4
AllocFree!foo3
AllocFree!foo2
AllocFree!foo1
AllocFree!wmain
AllocFree!__tmainCRTStartup
kernel32!BaseThreadInitThunk
ntdll!RtlUserThreadStart

0:000> !runaway f
 User Mode Time
  Thread       Time
   0:53b8      0 days 3:22:02.354
 Kernel Mode Time
  Thread       Time
   0:53b8      0 days 0:20:39.022
 Elapsed Time
  Thread       Time
   0:53b8      0 days 10:11:23.596

If we dump some portion of the region we see recorded stack backtraces:

0:000> dps 0`0188c000-200 L200/8
00000000`0188be00 00000000`77891142 ntdll!RtlpAllocateHeap+0x33bd
00000000`0188be08 00000000`778834d8 ntdll!RtlAllocateHeap+0x16c
00000000`0188be10 00000001`3fcc13cb AllocFree!malloc+0x5b
00000000`0188be18 00000001`3fcc1015 AllocFree!foo20+0x15
00000000`0188be20 00000001`3fcc1041 AllocFree!foo19+0x11
00000000`0188be28 00000001`3fcc1061 AllocFree!foo18+0x11
00000000`0188be30 00000001`3fcc12e3 AllocFree!wmain+0x53
00000000`0188be38 00000001`3fcc156c AllocFree!__tmainCRTStartup+0x144
00000000`0188be40 00000000`777259ed kernel32!BaseThreadInitThunk+0xd
00000000`0188be48 00000000`7785c541 ntdll!RtlUserThreadStart+0x1d
00000000`0188be50 00000000`0188b1d0
00000000`0188be58 0009457d`00024fff
00000000`0188be60 00000000`77891142 ntdll!RtlpAllocateHeap+0x33bd
00000000`0188be68 00000000`778834d8 ntdll!RtlAllocateHeap+0x16c
00000000`0188be70 00000001`3fcc13cb AllocFree!malloc+0x5b
00000000`0188be78 00000001`3fcc1015 AllocFree!foo20+0x15
00000000`0188be80 00000001`3fcc1041 AllocFree!foo19+0x11
00000000`0188be88 00000001`3fcc12e3 AllocFree!wmain+0x53
00000000`0188be90 00000001`3fcc156c AllocFree!__tmainCRTStartup+0x144
00000000`0188be98 00000000`777259ed kernel32!BaseThreadInitThunk+0xd
00000000`0188bea0 00000000`7785c541 ntdll!RtlUserThreadStart+0x1d
00000000`0188bea8 00000000`00000000
00000000`0188beb0 00000000`0188b230
00000000`0188beb8 0008457e`00023fff
00000000`0188bec0 00000000`77891142 ntdll!RtlpAllocateHeap+0x33bd
00000000`0188bec8 00000000`778834d8 ntdll!RtlAllocateHeap+0x16c
00000000`0188bed0 00000001`3fcc13cb AllocFree!malloc+0x5b
00000000`0188bed8 00000001`3fcc1015 AllocFree!foo20+0x15
00000000`0188bee0 00000001`3fcc12e3 AllocFree!wmain+0x53
00000000`0188bee8 00000001`3fcc156c AllocFree!__tmainCRTStartup+0x144
00000000`0188bef0 00000000`777259ed kernel32!BaseThreadInitThunk+0xd
00000000`0188bef8 00000000`7785c541 ntdll!RtlUserThreadStart+0x1d
00000000`0188bf00 00000000`0188b280
00000000`0188bf08 001b457f`0002dfff
00000000`0188bf10 00000000`77891142 ntdll!RtlpAllocateHeap+0x33bd
00000000`0188bf18 00000000`778834d8 ntdll!RtlAllocateHeap+0x16c
00000000`0188bf20 00000001`3fcc13cb AllocFree!malloc+0x5b
00000000`0188bf28 00000001`3fcc1015 AllocFree!foo20+0x15
00000000`0188bf30 00000001`3fcc1041 AllocFree!foo19+0x11
00000000`0188bf38 00000001`3fcc1061 AllocFree!foo18+0x11
00000000`0188bf40 00000001`3fcc1081 AllocFree!foo17+0x11
00000000`0188bf48 00000001`3fcc10a1 AllocFree!foo16+0x11
00000000`0188bf50 00000001`3fcc10c1 AllocFree!foo15+0x11
00000000`0188bf58 00000001`3fcc10e1 AllocFree!foo14+0x11
00000000`0188bf60 00000001`3fcc1101 AllocFree!foo13+0x11
00000000`0188bf68 00000001`3fcc1121 AllocFree!foo12+0x11
00000000`0188bf70 00000001`3fcc1141 AllocFree!foo11+0x11
00000000`0188bf78 00000001`3fcc1161 AllocFree!foo10+0x11
00000000`0188bf80 00000001`3fcc1181 AllocFree!foo9+0x11
00000000`0188bf88 00000001`3fcc11a1 AllocFree!foo8+0x11
00000000`0188bf90 00000001`3fcc11c1 AllocFree!foo7+0x11
00000000`0188bf98 00000001`3fcc11e1 AllocFree!foo6+0x11
00000000`0188bfa0 00000001`3fcc1201 AllocFree!foo5+0x11
00000000`0188bfa8 00000001`3fcc1221 AllocFree!foo4+0x11
00000000`0188bfb0 00000001`3fcc1241 AllocFree!foo3+0x11
00000000`0188bfb8 00000001`3fcc1261 AllocFree!foo2+0x11
00000000`0188bfc0 00000001`3fcc1281 AllocFree!foo1+0x11
00000000`0188bfc8 00000001`3fcc12e3 AllocFree!wmain+0x53
00000000`0188bfd0 00000001`3fcc156c AllocFree!__tmainCRTStartup+0x144
00000000`0188bfd8 00000000`777259ed kernel32!BaseThreadInitThunk+0xd
00000000`0188bfe0 00000000`7785c541 ntdll!RtlUserThreadStart+0x1d
00000000`0188bfe8 00000000`00000000
00000000`0188bff0 00000000`00000000
00000000`0188bff8 00000000`00000000

- Dmitry Vostokov @ DumpAnalysis.org + TraceAnalysis.org -

Leave a Reply