Comparative Memory Dump Analysis: CPU Spikes
Comparative analysis is based on saving several consecutive memory dump files to see similarities and differences. Most often this technique is used for memory leaks, for example, process heap leaks. Here we see another example related to CPU spikes.
One application was spiking CPU and its user dump file was saved manually. !runaway WinDbg command showed the following distribution of kernel and user thread times:
0:000> !runaway f
User Mode Time
Thread Time
14:f48 0 days 1:22:34.343
15:f4c 0 days 0:00:45.671
17:f54 0 days 0:00:41.359
18:f58 0 days 0:00:33.468
16:f50 0 days 0:00:24.171
12:f40 0 days 0:00:17.531
13:f44 0 days 0:00:07.109
0:f10 0 days 0:00:00.500
[…]
Kernel Mode Time
Thread Time
15:f4c 0 days 0:00:51.046
17:f54 0 days 0:00:41.125
18:f58 0 days 0:00:32.765
16:f50 0 days 0:00:23.500
12:f40 0 days 0:00:18.671
13:f44 0 days 0:00:07.453
14:f48 0 days 0:00:02.031
1:f14 0 days 0:00:00.343
[…]
Elapsed Time
Thread Time
1:f14 0 days 12:01:42.656
0:f10 0 days 12:01:42.656
24:f70 0 days 12:01:42.093
23:f6c 0 days 12:01:42.093
22:f68 0 days 12:01:42.093
21:f64 0 days 12:01:42.093
20:f60 0 days 12:01:42.093
19:f5c 0 days 12:01:42.093
18:f58 0 days 12:01:42.093
17:f54 0 days 12:01:42.093
16:f50 0 days 12:01:42.093
15:f4c 0 days 12:01:42.093
14:f48 0 days 12:01:42.093
13:f44 0 days 12:01:42.093
[…]
We see that thread #14 spent 1 hour and 22 minutes in user space and only 2 seconds in kernel space. Its thread stack showed DllA and DllB modules:
0:000> ~14s
*** ERROR: Symbol file could not be found. Defaulted to export symbols for DllA.dll -
eax=00010da2 ebx=09227bdc ecx=09236bcc edx=0000003a esi=006769e4 edi=00000335
eip=6120b231 esp=00fefd04 ebp=00fefd2c iopl=0 nv up ei pl zr na pe nc
cs=001b ss=0023 ds=0023 es=0023 fs=003b gs=0000 efl=00000246
DllA!foo+0×3549:
6120b231 8b45ec mov eax,dword ptr [ebp-14h] ss:0023:00fefd18=09251a30
0:000> kv
ChildEBP RetAddr Args to Child
WARNING: Stack unwind information not available. Following frames may be wrong.
00fefd2c 6120be31 00000063 8633cdb9 006769e4 DllA!foo+0×3549
00fefd64 6120c6db 00fefe00 00006f07 0353f594 DllA!foo+0×4149
00fefdbc 6120d517 00fefe00 00fefde8 8633cd25 DllA!foo+0×49f3
00fefdf8 6120db4d 0353f594 09251848 00000000 DllA!foo+0×582f
00fefe14 6120f280 006769a0 00000000 0000020c DllA!foo+0×5e65
00fefe84 00933436 0000020c 0353f5ac 00000001 DllA!foo+0×7598
00fefec0 008414f1 0353f580 034db5d0 00000003 DllB!read+0xc6
[…]
00feffb8 77e64829 0078f290 00000000 00000000 msvcr71!_endthreadex+0xa0
00feffec 00000000 7c3494f6 0078f290 00000000 kernel32!BaseThreadStart+0×34
These modules belong to different vendors and the code could loop in DllB or in one of DllA functions. In order to show statistically which module is more suspect than the other, we requested 3 consecutive memory dumps of the same spiking process. They revealed the similar uniform pattern:
0:000> !runaway f
User Mode Time
Thread Time
16:e5c 0 days 1:37:24.640
0:e1c 0 days 0:00:00.640
17:e60 0 days 0:00:00.453
18:e64 0 days 0:00:00.328
15:e58 0 days 0:00:00.265
14:e54 0 days 0:00:00.203
[…]
Kernel Mode Time
Thread Time
16:e5c 0 days 0:00:00.687
15:e58 0 days 0:00:00.421
17:e60 0 days 0:00:00.375
14:e54 0 days 0:00:00.265
18:e64 0 days 0:00:00.250
0:e1c 0 days 0:00:00.046
[…]
Elapsed Time
Thread Time
1:e20 0 days 4:41:11.562
0:e1c 0 days 4:41:11.562
18:e64 0 days 4:41:10.875
17:e60 0 days 4:41:10.875
16:e5c 0 days 4:41:10.875
15:e58 0 days 4:41:10.875
14:e54 0 days 4:41:10.875
13:e50 0 days 4:41:10.875
12:e4c 0 days 4:41:10.875
11:e48 0 days 4:41:10.875
10:e44 0 days 4:41:10.875
9:e40 0 days 4:41:10.875
8:e3c 0 days 4:41:10.875
7:e38 0 days 4:41:10.875
6:e34 0 days 4:41:10.875
5:e30 0 days 4:41:10.875
4:e2c 0 days 4:41:10.875
3:e28 0 days 4:41:10.875
2:e24 0 days 4:41:10.875
[…]
1st dump file: Process Uptime: 0 days 4:41:12.000
0:000> ~16r
eax=07abbb6c ebx=07ab45d4 ecx=07ab4604 edx=00679344 esi=006769e4 edi=00000035
eip=6120b23e esp=0106fd30 ebp=0106fd58 iopl=0 nv up ei pl zr na pe nc
cs=001b ss=0023 ds=0023 es=0023 fs=003b gs=0000 efl=00000246
DllA!foo+0×3556:
6120b23e 8bcb mov ecx,ebx
0:000> ~16kv
ChildEBP RetAddr Args to Child
WARNING: Stack unwind information not available. Following frames may be wrong.
0106fd58 6120be31 0000000b 47d6689c 006769e4 DllA!foo+0×3556
0106fd90 6120c6db 0106fe2c 00007329 0342d14c DllA!foo+0×4149
0106fde8 6120d517 0106fe2c 0106fe14 47d66b28 DllA!foo+0×49f3
0106fe24 6120db4d 0342d14c 07abb960 00000000 DllA!foo+0×582f
0106fe40 6120f280 006769a0 00000000 000001d8 DllA!foo+0×5e65
0106feb0 00933436 000001d8 0342d164 00000001 DllA!foo+0×7598
0106feec 008414f1 0342d138 00000000 0340b290 DllB!read+0xc6
[…]
0106ffb8 77e64829 00784be0 00000000 00000000 msvcr71!_endthreadex+0xa0
0106ffec 00000000 7c3494f6 00784be0 00000000 kernel32!BaseThreadStart+0×34
2nd dump file: Process Uptime: 0 days 4:41:42.000
0:000> ~16r
eax=0002550b ebx=00679344 ecx=07ab45d4 edx=00000009 esi=006769e4 edi=00000035
eip=6120b231 esp=0106fd30 ebp=0106fd58 iopl=0 nv up ei pl zr na pe nc
cs=001b ss=0023 ds=0023 es=0023 fs=003b gs=0000 efl=00000246
*** ERROR: Symbol file could not be found. Defaulted to export symbols for DllA.dll -
DllA!foo+0×3549:
6120b231 8b45ec mov eax,dword ptr [ebp-14h] ss:0023:0106fd44=07abbb48
0:000> ~16kv
ChildEBP RetAddr Args to Child
WARNING: Stack unwind information not available. Following frames may be wrong.
0106fd58 6120be31 0000000b 47d6689c 006769e4 DllA!foo+0×3549
0106fd90 6120c6db 0106fe2c 00007329 0342d14c DllA!foo+0×4149
0106fde8 6120d517 0106fe2c 0106fe14 47d66b28 DllA!foo+0×49f3
0106fe24 6120db4d 0342d14c 07abb960 00000000 DllA!foo+0×582f
0106fe40 6120f280 006769a0 00000000 000001d8 DllA!foo+0×5e65
0106feb0 00933436 000001d8 0342d164 00000001 DllA!foo+0×7598
0106feec 008414f1 0342d138 00000000 0340b290 DllB!read+0xc6
[…]
0106ffb8 77e64829 00784be0 00000000 00000000 msvcr71!_endthreadex+0xa0
0106ffec 00000000 7c3494f6 00784be0 00000000 kernel32!BaseThreadStart+0×34
3rd dump file: Process Uptime: 0 days 4:41:59.000
0:000> ~16r
eax=07abbb6c ebx=07ab4604 ecx=07ab45d4 edx=07ab4604 esi=006769e4 edi=00000035
eip=6120b239 esp=0106fd30 ebp=0106fd58 iopl=0 nv up ei pl zr na pe nc
cs=001b ss=0023 ds=0023 es=0023 fs=003b gs=0000 efl=00000246
DllA!foo+0×3551:
6120b239 895108 mov dword ptr [ecx+8],edx ds:0023:07ab45dc=07ab4604
0:000> ~16kv
ChildEBP RetAddr Args to Child
WARNING: Stack unwind information not available. Following frames may be wrong.
0106fd58 6120be31 0000000b 47d6689c 006769e4 DllA!foo+0×3551
0106fd90 6120c6db 0106fe2c 00007329 0342d14c DllA!foo+0×4149
0106fde8 6120d517 0106fe2c 0106fe14 47d66b28 DllA!foo+0×49f3
0106fe24 6120db4d 0342d14c 07abb960 00000000 DllA!foo+0×582f
0106fe40 6120f280 006769a0 00000000 000001d8 DllA!foo+0×5e65
0106feb0 00933436 000001d8 0342d164 00000001 DllA!foo+0×7598
0106feec 008414f1 0342d138 00000000 0340b290 DllB!read+0xc6
[…]
0106ffb8 77e64829 00784be0 00000000 00000000 msvcr71!_endthreadex+0xa0
0106ffec 00000000 7c3494f6 00784be0 00000000 kernel32!BaseThreadStart+0×34
We see exactly the same stack trace except the first (top) frame. The offset differs very little: +0×3556, +0×3549 (twice) and +0×3551. We can suspect the small loop here:
0:000> ub
DllA!foo+0x3549:
6120b231 8b45ec mov eax,dword ptr [ebp-14h]
6120b234 8d0490 lea eax,[eax+edx*4]
6120b237 8b10 mov edx,dword ptr [eax]
6120b239 895108 mov dword ptr [ecx+8],edx
6120b23c 8908 mov dword ptr [eax],ecx
6120b23e 8bcb mov ecx,ebx
6120b240 85db test ebx,ebx
6120b242 75e3 jne DllA!foo+0x353f (6120b227)
0:000> ub 6120b231
DllA!foo+0x3535:
6120b21d 8b4d08 mov ecx,dword ptr [ebp+8]
6120b220 8b0c88 mov ecx,dword ptr [eax+ecx*4]
6120b223 3bcb cmp ecx,ebx
6120b225 741d je DllA!foo+0x355c (6120b244)
6120b227 8b5908 mov ebx,dword ptr [ecx+8]
6120b22a 8b410c mov eax,dword ptr [ecx+0Ch]
6120b22d 33d2 xor edx,edx
6120b22f f7f7 div eax,edi
0:000> u
DllA!foo+0x3549:
6120b231 8b45ec mov eax,dword ptr [ebp-14h]
6120b234 8d0490 lea eax,[eax+edx*4]
6120b237 8b10 mov edx,dword ptr [eax]
6120b239 895108 mov dword ptr [ecx+8],edx
6120b23c 8908 mov dword ptr [eax],ecx
6120b23e 8bcb mov ecx,ebx
6120b240 85db test ebx,ebx
6120b242 75e3 jne DllA!foo+0x353f (6120b227)
0:000> u 6120b242
DllA!foo+0x355a:
6120b242 75e3 jne DllA!foo+0x353f (6120b227)
6120b244 ff4508 inc dword ptr [ebp+8]
6120b247 8b4508 mov eax,dword ptr [ebp+8]
6120b24a 3b4608 cmp eax,dword ptr [esi+8]
6120b24d 72cc jb DllA!foo+0x3533 (6120b21b)
6120b24f ff36 push dword ptr [esi]
6120b251 e88dc70000 call DllA!foo+0xfcfb (612179e3)
6120b256 59 pop ecx
All this strongly rules out DllB and points to DllA as the first vendor contact.
- Dmitry Vostokov @ DumpAnalysis.org -