Looking for the abnormal: a case study
I’m RARE rule #5 says:
“Provide appropriate explanations and narrative in the cases where analysis is inconclusive”.
Here is a typical example of such a case: a kernel dump was taken with only a vague description of server problems. Analysis of the dump file revealed the following abnormal conditions, which warrant further troubleshooting steps:
AppA.exe, part of the customer product, is about 1 GB in size when its typical size should be no more than 200 MB. Perhaps we have a memory leak here. We can suggest taking a few consecutive memory dumps as the memory grows and analyzing them later as described in the heap leak pattern. This could also be a .NET leak if the unmanaged AppA.exe happened to load managed components through 3rd-party DLLs. It could also be that some unknown loaded component reserved and committed a large portion of the virtual memory space.
0: kd> !vm
[...]
0eec AppA.exe 241366 ( 965464 Kb)
03c0 svchost.exe 10304 ( 41216 Kb)
0230 lsass.exe 8764 ( 35056 Kb)
0298 svchost.exe 6402 ( 25608 Kb)
01f4 winlogon.exe 5787 ( 23148 Kb)
[...]
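The comparison against the expected size can be sketched in a few lines of Python. This is only an illustration: the helper names (parse_vm, over_expected), the 4 KB page size, and the reading of "200 MB" as 200 * 1024 KB are our assumptions, and the input is simply the excerpt of the !vm output shown above pasted as text.

```python
import re

PAGE_KB = 4  # assumption: 4 KB pages (241366 pages * 4 = 965464 Kb)

# Per-process lines copied from the !vm output above
VM_LINES = """\
0eec AppA.exe 241366 ( 965464 Kb)
03c0 svchost.exe 10304 ( 41216 Kb)
0230 lsass.exe 8764 ( 35056 Kb)
0298 svchost.exe 6402 ( 25608 Kb)
01f4 winlogon.exe 5787 ( 23148 Kb)
"""

# PID (hex), image name, committed pages, committed size in Kb
LINE_RE = re.compile(r"^\s*([0-9a-fA-F]+)\s+(\S+)\s+(\d+)\s+\(\s*(\d+)\s*Kb\)")


def parse_vm(text):
    """Return (pid, image, pages, kb) tuples for every matching line."""
    rows = []
    for line in text.splitlines():
        m = LINE_RE.match(line)
        if m:
            pid, image, pages, kb = m.groups()
            rows.append((pid, image, int(pages), int(kb)))
    return rows


def over_expected(rows, expected_kb):
    """Keep processes whose size exceeds a known expected maximum.

    Processes without an entry in expected_kb are never flagged.
    """
    return [
        (pid, image, kb)
        for pid, image, _pages, kb in rows
        if image in expected_kb and kb > expected_kb[image]
    ]


# Hypothetical baseline: AppA.exe should stay under 200 MB
EXPECTED_KB = {"AppA.exe": 200 * 1024}

for pid, image, kb in over_expected(parse_vm(VM_LINES), EXPECTED_KB):
    print(f"{image} (Cid {pid}): {kb} Kb exceeds {EXPECTED_KB[image]} Kb")
```

Such a check only tells us that the size is abnormal relative to a baseline we supply; it says nothing about the cause, which still needs the consecutive dumps mentioned above.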
We can confirm the absence of handle leaks, since the process holds only 865 handles:
0: kd> !process 0eec
Searching for Process with Cid == eec
Cid Handle table at fffffa80014d6000 with 794 Entries in use
PROCESS fffffade6e601860
SessionId: 0 Cid: 0eec Peb: 7efdf000 ParentCid: 0eb8
DirBase: b10fa000 ObjectTable: fffffa8000c39170 HandleCount: 865.
Image: AppA.exe
VadRoot fffffade68d7e580 Vads 1961 Clone 0 Private 237843. Modified 77. Locked 1.
DeviceMap fffffa8001221580
Token fffffa8001fdebe0
ElapsedTime 6 Days 22:18:09.271
UserTime 00:23:00.406
KernelTime 00:27:31.281
QuotaPoolUsage[PagedPool] 106968
QuotaPoolUsage[NonPagedPool] 19055186
Working Set Sizes (now,min,max) (240529, 50, 345) (962116KB, 200KB, 1380KB)
PeakWorkingSetSize 240671
VirtualSize 1244 Mb
PeakVirtualSize 1246 Mb
PageFaultCount 244053
MemoryPriority BACKGROUND
BasePriority 8
CommitCharge 241366
Kernel and user times seem high (about 28 and 23 minutes respectively), but this is consistent with almost 7 days of extensive application usage involving constant database access.
Looking further at the process list, we see that the crucial AppB and AppC applications, which were supposed to be running to serve user requests, are orphaned:
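To put the CPU times in proportion to the process lifetime, we can do the arithmetic explicitly. The following is a minimal sketch; parse_time is a hypothetical helper written for the time format printed by !process above, not part of any debugger API.

```python
from datetime import timedelta


def parse_time(text):
    """Parse 'HH:MM:SS.mmm', optionally prefixed by 'N Days ', into a timedelta."""
    days = 0
    if "Days" in text:
        day_part, text = text.split("Days")
        days = int(day_part)
    hours, minutes, seconds = text.strip().split(":")
    return timedelta(
        days=days, hours=int(hours), minutes=int(minutes), seconds=float(seconds)
    )


# Values copied from the !process output above
elapsed = parse_time("6 Days 22:18:09.271")
user = parse_time("00:23:00.406")
kernel = parse_time("00:27:31.281")

# Dividing two timedeltas yields a plain float
cpu_fraction = (user + kernel) / elapsed

print(f"CPU time: {user + kernel}, elapsed: {elapsed}")
print(f"Share of elapsed time spent on CPU: {cpu_fraction:.2%}")
```

The combined CPU time comes to roughly half a percent of the elapsed time, which supports reading these figures as normal long-running activity and not as a spiking thread.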
09e8 AppB.exe 0 ( 0 Kb)
09c0 AppC.exe 0 ( 0 Kb)
Were they closed normally, forcefully terminated after hanging, or did they crash? These questions should be asked, and appropriate measures should be taken to capture crash dumps if the event logs reveal access violations.
- Dmitry Vostokov @ DumpAnalysis.org -