10 Common Mistakes in Memory Analysis (Part 4)
One common mistake I observe is habitually sticking to certain WinDbg commands to recognize patterns. One example is the !locks command, used to find wait chains and deadlock conditions among threads. Recently, a service process was reported to be hung, and the !locks command showed no blocked threads:
0:000> !locks
CritSec +18caf94 at 018CAF94
LockCount -2
RecursionCount 1
OwningThread 58e8
EntryCount 0
ContentionCount 0
*** Locked
CritSec +18cc7c4 at 018CC7C4
LockCount -2
RecursionCount 1
OwningThread 58e8
EntryCount 0
ContentionCount 0
*** Locked
The number of threads waiting for the lock is 0 (this calculation is explained in the MSDN article):
0:000> ? ((-1) - (-2)) >> 2
Evaluate expression: 0 = 00000000
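As a cross-check of the arithmetic above, here is a minimal Python sketch that decodes a LockCount value the way the MSDN article describes it for Windows Server 2003 SP1 and later: the lowest bit is the lock status, the next bit says whether a waiting thread has been woken, and the remaining bits hold the ones-complement of the number of waiting threads. The function name decode_lock_count is hypothetical and only serves this illustration; it is not part of WinDbg.

```python
def decode_lock_count(lock_count: int) -> dict:
    """Decode a critical section LockCount value as printed by !locks.

    Assumes the Windows Server 2003 SP1 and later layout and a 32-bit field.
    """
    bits = lock_count & 0xFFFFFFFF  # view the signed value as 32 raw bits
    return {
        # Lowest bit: 0 means locked, 1 means not locked.
        "locked": (bits & 1) == 0,
        # Next bit: 0 means a thread has been woken for this lock.
        "waiter_woken": (bits & 2) == 0,
        # Same arithmetic as the debugger expression ((-1) - LockCount) >> 2.
        "waiting_threads": ((-1 - lock_count) & 0xFFFFFFFF) >> 2,
    }


if __name__ == "__main__":
    # LockCount value taken from the !locks output above.
    print(decode_lock_count(-2))
```

For LockCount -2 the sketch reports a locked critical section with no woken waiter and zero waiting threads, which matches the debugger expression.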
In the past, for memory dumps of that hung service, the !locks command always showed LockCount values corresponding to several waiting threads. Therefore, an engineer assumed that the dump was taken at some random time, not while the service was hanging, and asked for a new, correct dump. The mistake here is that the engineer didn’t look at the stack trace of the owning thread (58e8), which shows the characteristic pattern of a blocked thread waiting for a reply from an LRPC call:
0:000> ~~[58e8]kc 100
ntdll!KiFastSystemCallRet
ntdll!NtRequestWaitReplyPort
RPCRT4!LRPC_CCALL::SendReceive
RPCRT4!I_RpcSendReceive
RPCRT4!NdrSendReceive
RPCRT4!NdrClientCall2
ServiceA!foo
[…]
ServiceA!bar
RPCRT4!NdrStubCall2
RPCRT4!NdrServerCall2
RPCRT4!DispatchToStubInCNoAvrf
RPCRT4!RPC_INTERFACE::DispatchToStubWorker
RPCRT4!RPC_INTERFACE::DispatchToStub
RPCRT4!RPC_INTERFACE::DispatchToStubWithObject
RPCRT4!LRPC_SCALL::DealWithRequestMessage
RPCRT4!LRPC_ADDRESS::DealWithLRPCRequest
RPCRT4!LRPC_ADDRESS::ReceiveLotsaCalls
RPCRT4!RecvLotsaCallsWrapper
RPCRT4!BaseCachedThreadRoutine
RPCRT4!ThreadStartRoutine
kernel32!BaseThreadStart
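The step the engineer skipped can be turned into a small mechanical check. The following Python sketch scans a stack trace, copied out of the debugger as plain text, for frames that indicate a thread waiting for an LPC/LRPC reply. The helper find_blocking_frames and the BLOCKING_MARKERS table are hypothetical, and the table lists only the two frames seen in this dump; a real checklist would cover many more blocking patterns.

```python
# Hypothetical marker table: frames taken from the stack trace in this case study.
BLOCKING_MARKERS = {
    "ntdll!NtRequestWaitReplyPort": "waiting for an LPC/LRPC reply",
    "RPCRT4!LRPC_CCALL::SendReceive": "waiting for an LPC/LRPC reply",
}


def find_blocking_frames(stack_text: str) -> list:
    """Return (frame, reason) pairs for known blocking frames, in stack order."""
    hits = []
    for line in stack_text.splitlines():
        frame = line.strip()
        if frame in BLOCKING_MARKERS:
            hits.append((frame, BLOCKING_MARKERS[frame]))
    return hits


if __name__ == "__main__":
    # Top frames of thread 58e8 as shown by the kc command above.
    sample = """ntdll!KiFastSystemCallRet
ntdll!NtRequestWaitReplyPort
RPCRT4!LRPC_CCALL::SendReceive
RPCRT4!I_RpcSendReceive
ServiceA!foo"""
    for frame, reason in find_blocking_frames(sample):
        print(frame, "->", reason)
```

An empty result does not prove that a thread is running freely; it only means that none of the listed markers matched.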
We don’t see other blocked threads and wait chains because the dump was saved as soon as the freezing condition was detected: the service didn’t allow a user connection to proceed. If more users had tried to connect, we would have seen the critical section wait chains that are absent from this dump.
To prevent such mistakes, checklists are indispensable. For one example, see the Crash Dump Analysis Checklist. You can also order it in print:
WinDbg: A Reference Poster and Learning Cards
- Dmitry Vostokov @ DumpAnalysis.org -
July 4th, 2009 at 9:00 pm
[…] small case study continues where Not using checklists common mistake case study left, after identifying the blocked LPC thread in ServiceA process. We […]