Comparative Memory Dump Analysis: CPU Spikes

Comparative analysis is based on saving several consecutive memory dump files to see similarities and differences. Most often this technique is used for memory leaks, for example, process heap leaks. Here we see another example related to CPU spikes.

One application was spiking CPU and its user dump file was saved manually. !runaway WinDbg command showed the following distribution of kernel and user thread times:

0:000> !runaway f
 User Mode Time
  Thread       Time
  14:f48       0 days 1:22:34.343
  15:f4c       0 days 0:00:45.671
  17:f54       0 days 0:00:41.359
  18:f58       0 days 0:00:33.468
  16:f50       0 days 0:00:24.171
  12:f40       0 days 0:00:17.531
  13:f44       0 days 0:00:07.109
   0:f10       0 days 0:00:00.500
[…]
 Kernel Mode Time
  Thread       Time
  15:f4c       0 days 0:00:51.046
  17:f54       0 days 0:00:41.125
  18:f58       0 days 0:00:32.765
  16:f50       0 days 0:00:23.500
  12:f40       0 days 0:00:18.671
  13:f44       0 days 0:00:07.453
  14:f48       0 days 0:00:02.031
   1:f14       0 days 0:00:00.343
[…]
 Elapsed Time
  Thread       Time
   1:f14       0 days 12:01:42.656
   0:f10       0 days 12:01:42.656
  24:f70       0 days 12:01:42.093
  23:f6c       0 days 12:01:42.093
  22:f68       0 days 12:01:42.093
  21:f64       0 days 12:01:42.093
  20:f60       0 days 12:01:42.093
  19:f5c       0 days 12:01:42.093
  18:f58       0 days 12:01:42.093
  17:f54       0 days 12:01:42.093
  16:f50       0 days 12:01:42.093
  15:f4c       0 days 12:01:42.093
  14:f48       0 days 12:01:42.093
  13:f44       0 days 12:01:42.093
[…]

We see that thread #14 spent 1 hour and 22 minutes in user space and only 2 seconds in kernel space. Its thread stack showed DllA and DllB modules:

0:000> ~14s
*** ERROR: Symbol file could not be found.  Defaulted to export symbols for DllA.dll -
eax=00010da2 ebx=09227bdc ecx=09236bcc edx=0000003a esi=006769e4 edi=00000335
eip=6120b231 esp=00fefd04 ebp=00fefd2c iopl=0 nv up ei pl zr na pe nc
cs=001b ss=0023 ds=0023 es=0023 fs=003b gs=0000 efl=00000246
DllA!foo+0×3549:
6120b231 8b45ec          mov     eax,dword ptr [ebp-14h] ss:0023:00fefd18=09251a30

0:000> kv
ChildEBP RetAddr  Args to Child             
WARNING: Stack unwind information not available. Following frames may be wrong.
00fefd2c 6120be31 00000063 8633cdb9 006769e4 DllA!foo+0×3549
00fefd64 6120c6db 00fefe00 00006f07 0353f594 DllA!foo+0×4149
00fefdbc 6120d517 00fefe00 00fefde8 8633cd25 DllA!foo+0×49f3
00fefdf8 6120db4d 0353f594 09251848 00000000 DllA!foo+0×582f
00fefe14 6120f280 006769a0 00000000 0000020c DllA!foo+0×5e65
00fefe84 00933436 0000020c 0353f5ac 00000001 DllA!foo+0×7598

00fefec0 008414f1 0353f580 034db5d0 00000003 DllB!read+0xc6
[…]
00feffb8 77e64829 0078f290 00000000 00000000 msvcr71!_endthreadex+0xa0
00feffec 00000000 7c3494f6 0078f290 00000000 kernel32!BaseThreadStart+0×34

These modules belong to different vendors and the code could loop in DllB or in one of DllA functions. In order to show statistically which module is more suspect than the other, we requested 3 consecutive memory dumps of the same spiking process. They revealed the similar uniform pattern:

0:000> !runaway f
 User Mode Time
  Thread       Time
  16:e5c       0 days 1:37:24.640
   0:e1c       0 days 0:00:00.640
  17:e60       0 days 0:00:00.453
  18:e64       0 days 0:00:00.328
  15:e58       0 days 0:00:00.265
  14:e54       0 days 0:00:00.203
[…]
 Kernel Mode Time
  Thread       Time
  16:e5c       0 days 0:00:00.687
  15:e58       0 days 0:00:00.421
  17:e60       0 days 0:00:00.375
  14:e54       0 days 0:00:00.265
  18:e64       0 days 0:00:00.250
   0:e1c       0 days 0:00:00.046
[…]
 Elapsed Time
  Thread       Time
   1:e20       0 days 4:41:11.562
   0:e1c       0 days 4:41:11.562
  18:e64       0 days 4:41:10.875
  17:e60       0 days 4:41:10.875
  16:e5c       0 days 4:41:10.875
  15:e58       0 days 4:41:10.875
  14:e54       0 days 4:41:10.875
  13:e50       0 days 4:41:10.875
  12:e4c       0 days 4:41:10.875
  11:e48       0 days 4:41:10.875
  10:e44       0 days 4:41:10.875
   9:e40       0 days 4:41:10.875
   8:e3c       0 days 4:41:10.875
   7:e38       0 days 4:41:10.875
   6:e34       0 days 4:41:10.875
   5:e30       0 days 4:41:10.875
   4:e2c       0 days 4:41:10.875
   3:e28       0 days 4:41:10.875
   2:e24       0 days 4:41:10.875
[…]

1st dump file: Process Uptime: 0 days 4:41:12.000

0:000> ~16r
eax=07abbb6c ebx=07ab45d4 ecx=07ab4604 edx=00679344 esi=006769e4 edi=00000035
eip=6120b23e esp=0106fd30 ebp=0106fd58 iopl=0 nv up ei pl zr na pe nc
cs=001b ss=0023 ds=0023 es=0023 fs=003b gs=0000 efl=00000246
DllA!foo+0×3556:
6120b23e 8bcb            mov     ecx,ebx

0:000> ~16kv
ChildEBP RetAddr  Args to Child             
WARNING: Stack unwind information not available. Following frames may be wrong.
0106fd58 6120be31 0000000b 47d6689c 006769e4 DllA!foo+0×3556
0106fd90 6120c6db 0106fe2c 00007329 0342d14c DllA!foo+0×4149
0106fde8 6120d517 0106fe2c 0106fe14 47d66b28 DllA!foo+0×49f3
0106fe24 6120db4d 0342d14c 07abb960 00000000 DllA!foo+0×582f
0106fe40 6120f280 006769a0 00000000 000001d8 DllA!foo+0×5e65
0106feb0 00933436 000001d8 0342d164 00000001 DllA!foo+0×7598

0106feec 008414f1 0342d138 00000000 0340b290 DllB!read+0xc6
[…]
0106ffb8 77e64829 00784be0 00000000 00000000 msvcr71!_endthreadex+0xa0
0106ffec 00000000 7c3494f6 00784be0 00000000 kernel32!BaseThreadStart+0×34

2nd dump file: Process Uptime: 0 days 4:41:42.000

0:000> ~16r
eax=0002550b ebx=00679344 ecx=07ab45d4 edx=00000009 esi=006769e4 edi=00000035
eip=6120b231 esp=0106fd30 ebp=0106fd58 iopl=0 nv up ei pl zr na pe nc
cs=001b ss=0023 ds=0023 es=0023 fs=003b gs=0000 efl=00000246
*** ERROR: Symbol file could not be found.  Defaulted to export symbols for DllA.dll -
DllA!foo+0×3549:
6120b231 8b45ec          mov     eax,dword ptr [ebp-14h] ss:0023:0106fd44=07abbb48

0:000> ~16kv
ChildEBP RetAddr  Args to Child             
WARNING: Stack unwind information not available. Following frames may be wrong.
0106fd58 6120be31 0000000b 47d6689c 006769e4 DllA!foo+0×3549
0106fd90 6120c6db 0106fe2c 00007329 0342d14c DllA!foo+0×4149
0106fde8 6120d517 0106fe2c 0106fe14 47d66b28 DllA!foo+0×49f3
0106fe24 6120db4d 0342d14c 07abb960 00000000 DllA!foo+0×582f
0106fe40 6120f280 006769a0 00000000 000001d8 DllA!foo+0×5e65
0106feb0 00933436 000001d8 0342d164 00000001 DllA!foo+0×7598

0106feec 008414f1 0342d138 00000000 0340b290 DllB!read+0xc6
[…]
0106ffb8 77e64829 00784be0 00000000 00000000 msvcr71!_endthreadex+0xa0
0106ffec 00000000 7c3494f6 00784be0 00000000 kernel32!BaseThreadStart+0×34

3rd dump file: Process Uptime: 0 days 4:41:59.000

0:000> ~16r
eax=07abbb6c ebx=07ab4604 ecx=07ab45d4 edx=07ab4604 esi=006769e4 edi=00000035
eip=6120b239 esp=0106fd30 ebp=0106fd58 iopl=0 nv up ei pl zr na pe nc
cs=001b ss=0023 ds=0023 es=0023 fs=003b gs=0000 efl=00000246
DllA!foo+0×3551:
6120b239 895108          mov     dword ptr [ecx+8],edx ds:0023:07ab45dc=07ab4604

0:000> ~16kv
ChildEBP RetAddr  Args to Child             
WARNING: Stack unwind information not available. Following frames may be wrong.
0106fd58 6120be31 0000000b 47d6689c 006769e4 DllA!foo+0×3551
0106fd90 6120c6db 0106fe2c 00007329 0342d14c DllA!foo+0×4149
0106fde8 6120d517 0106fe2c 0106fe14 47d66b28 DllA!foo+0×49f3
0106fe24 6120db4d 0342d14c 07abb960 00000000 DllA!foo+0×582f
0106fe40 6120f280 006769a0 00000000 000001d8 DllA!foo+0×5e65
0106feb0 00933436 000001d8 0342d164 00000001 DllA!foo+0×7598

0106feec 008414f1 0342d138 00000000 0340b290 DllB!read+0xc6
[…]
0106ffb8 77e64829 00784be0 00000000 00000000 msvcr71!_endthreadex+0xa0
0106ffec 00000000 7c3494f6 00784be0 00000000 kernel32!BaseThreadStart+0×34

We see exactly the same stack trace except the first (top) frame. The offset differs very little: +0×3556, +0×3549 (twice) and +0×3551. We can suspect the small loop here:

0:000> ub
DllA!foo+0x3549:
6120b231 8b45ec          mov     eax,dword ptr [ebp-14h]
6120b234 8d0490          lea     eax,[eax+edx*4]
6120b237 8b10            mov     edx,dword ptr [eax]
6120b239 895108          mov     dword ptr [ecx+8],edx
6120b23c 8908            mov     dword ptr [eax],ecx
6120b23e 8bcb            mov     ecx,ebx
6120b240 85db            test    ebx,ebx
6120b242 75e3            jne     DllA!foo+0x353f (6120b227)

0:000> ub 6120b231
DllA!foo+0x3535:
6120b21d 8b4d08          mov     ecx,dword ptr [ebp+8]
6120b220 8b0c88          mov     ecx,dword ptr [eax+ecx*4]
6120b223 3bcb            cmp     ecx,ebx
6120b225 741d            je      DllA!foo+0x355c (6120b244)
6120b227 8b5908          mov     ebx,dword ptr [ecx+8]
6120b22a 8b410c          mov     eax,dword ptr [ecx+0Ch]
6120b22d 33d2            xor     edx,edx
6120b22f f7f7            div     eax,edi

0:000> u
DllA!foo+0x3549:
6120b231 8b45ec          mov     eax,dword ptr [ebp-14h]
6120b234 8d0490          lea     eax,[eax+edx*4]
6120b237 8b10            mov     edx,dword ptr [eax]
6120b239 895108          mov     dword ptr [ecx+8],edx
6120b23c 8908            mov     dword ptr [eax],ecx
6120b23e 8bcb            mov     ecx,ebx
6120b240 85db            test    ebx,ebx
6120b242 75e3            jne     DllA!foo+0x353f (6120b227)

0:000> u 6120b242
DllA!foo+0x355a:
6120b242 75e3            jne     DllA!foo+0x353f (6120b227)
6120b244 ff4508          inc     dword ptr [ebp+8]
6120b247 8b4508          mov     eax,dword ptr [ebp+8]
6120b24a 3b4608          cmp     eax,dword ptr [esi+8]
6120b24d 72cc            jb      DllA!foo+0x3533 (6120b21b)
6120b24f ff36            push    dword ptr [esi]
6120b251 e88dc70000      call    DllA!foo+0xfcfb (612179e3)
6120b256 59              pop     ecx

All this strongly rules out DllB and points to DllA as the first vendor contact.

- Dmitry Vostokov @ DumpAnalysis.org -

Leave a Reply

You must be logged in to post a comment.