Archive for July 3rd, 2009

Naming Infinity

Friday, July 3rd, 2009

I read this book from cover to cover while flying from Dublin to St. Petersburg and back. It was such a wonderful reading experience that I couldn’t put the book down during those flights. I recall that I visited the Department of Mathematics a few times when I studied Chemistry at Moscow State University, although at that time I knew next to nothing about Russian mathematicians. The book touched me so deeply that I bought Florensky’s main work, The Pillar and Ground of the Truth, a history of Russian philosophy, and several books explaining the Orthodox Church. This is the best book on the history of mathematics I have ever read; my feelings are perhaps comparable to those I experienced when I finished reading Mathematics: The Loss of Certainty by Morris Kline, but that was more than 20 years ago.

Naming Infinity: A True Story of Religious Mysticism and Mathematical Creativity

- Dmitry Vostokov @ LiterateScientist.com -

10 Common Mistakes in Memory Analysis (Part 4)

Friday, July 3rd, 2009

One common mistake I observe is habitually sticking to certain WinDbg commands to recognize patterns. One example is the !locks command, which is used to find wait chains and deadlock conditions among threads. Recently a service process was reported to be hanging, and the !locks command showed no blocked threads:

0:000> !locks
CritSec +18caf94 at 018CAF94
LockCount          -2
RecursionCount     1
OwningThread       58e8
EntryCount         0
ContentionCount    0
*** Locked

CritSec +18cc7c4 at 018CC7C4
LockCount          -2
RecursionCount     1
OwningThread       58e8
EntryCount         0
ContentionCount    0
*** Locked

The number of threads waiting for each of these critical sections is 0 (this calculation is explained in the MSDN article):

0:000> ? ((-1) - (-2)) >> 2
Evaluate expression: 0 = 00000000
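
The same arithmetic can be sketched outside the debugger. The Python snippet below is only an illustration, and it assumes the LockCount bit layout documented for Windows Server 2003 SP1 and later: the lowest bit is 0 when the critical section is locked, the next bit is 0 when a thread has been woken for the lock, and the remaining bits hold the ones-complement of the number of waiting threads. The names LockState and decode_lock_count are hypothetical helpers, not part of WinDbg or any library.

```python
from typing import NamedTuple


class LockState(NamedTuple):
    """Decoded view of a critical section LockCount value (hypothetical helper)."""
    locked: bool
    thread_woken: bool
    waiters: int


def decode_lock_count(lock_count: int) -> LockState:
    # Assumed layout (Windows Server 2003 SP1 and later):
    #   bit 0 == 0  -> the critical section is locked
    #   bit 1 == 0  -> a thread has been woken for this lock
    #   other bits  -> ones-complement of the number of waiting threads
    locked = (lock_count & 1) == 0
    thread_woken = (lock_count & 2) == 0
    # Same expression as the WinDbg command: ((-1) - LockCount) >> 2
    waiters = ((-1) - lock_count) >> 2
    return LockState(locked, thread_woken, waiters)


# LockCount value taken from the !locks output above
print(decode_lock_count(-2))
```

For the value -2 shown above this reports a locked critical section with no waiting threads, which matches the result of the WinDbg expression.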

In the past, for memory dumps of that hanging service, the !locks command always showed LockCount values corresponding to several waiting threads. Therefore, an engineer assumed that the dump had been taken at some random time, not while the service was hanging, and asked for a new, correct dump. The mistake here is that the engineer didn’t look at the stack trace of the owning thread, which shows the characteristic pattern of a blocked thread waiting for a reply from an LRPC call:

0:000> ~~[58e8]kc 100

ntdll!KiFastSystemCallRet
ntdll!NtRequestWaitReplyPort
RPCRT4!LRPC_CCALL::SendReceive
RPCRT4!I_RpcSendReceive
RPCRT4!NdrSendReceive
RPCRT4!NdrClientCall2

ServiceA!foo
[…]
ServiceA!bar
RPCRT4!NdrStubCall2
RPCRT4!NdrServerCall2
RPCRT4!DispatchToStubInCNoAvrf
RPCRT4!RPC_INTERFACE::DispatchToStubWorker
RPCRT4!RPC_INTERFACE::DispatchToStub
RPCRT4!RPC_INTERFACE::DispatchToStubWithObject
RPCRT4!LRPC_SCALL::DealWithRequestMessage
RPCRT4!LRPC_ADDRESS::DealWithLRPCRequest
RPCRT4!LRPC_ADDRESS::ReceiveLotsaCalls
RPCRT4!RecvLotsaCallsWrapper
RPCRT4!BaseCachedThreadRoutine
RPCRT4!ThreadStartRoutine
kernel32!BaseThreadStart

We don’t see other blocked threads and wait chains because the dump was saved as soon as the freezing condition was detected: the service didn’t allow a user connection to proceed. Had more users tried to connect, we would have seen the critical section wait chains that are absent in this dump.
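
As a rough illustration of checking the owner thread’s stack rather than relying on !locks alone, the following Python sketch scans the top frames of a stack trace (copied as text from the kc output) for blocking signatures. The signature table, its labels, and the function name classify_stack are assumptions made for this example; the table is not exhaustive, and the sketch is no substitute for reading the trace.

```python
# Illustrative, non-exhaustive table: frames that must all appear near the
# top of the stack, paired with a hypothetical human-readable label.
WAIT_SIGNATURES = [
    (("ntdll!NtRequestWaitReplyPort", "RPCRT4!LRPC_CCALL::SendReceive"),
     "blocked waiting for an LRPC reply"),
    (("ntdll!RtlEnterCriticalSection",),
     "blocked waiting for a critical section"),
]


def classify_stack(frames, depth=4):
    """Return a label if the top `depth` frames match a known wait signature."""
    # Ignore blank lines and surrounding whitespace from copied debugger output
    top = [f.strip() for f in frames if f.strip()][:depth]
    for required, label in WAIT_SIGNATURES:
        if all(frame in top for frame in required):
            return label
    return None


# Top of the stack trace of thread 58e8 shown above
owner_stack = """
ntdll!KiFastSystemCallRet
ntdll!NtRequestWaitReplyPort
RPCRT4!LRPC_CCALL::SendReceive
RPCRT4!I_RpcSendReceive
RPCRT4!NdrSendReceive
RPCRT4!NdrClientCall2
""".splitlines()

print(classify_stack(owner_stack))
```

Restricting the match to the top few frames is a deliberate choice in this sketch: the same RPC functions can also appear deeper in stacks of threads that are not currently blocked on them.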

To prevent such mistakes, checklists are indispensable. For one example, see Crash Dump Analysis Checklist. You can also order it in print:

WinDbg: A Reference Poster and Learning Cards

- Dmitry Vostokov @ DumpAnalysis.org -

Overcoming Resistance

Friday, July 3rd, 2009

A picture taken during my recent visit to Peterhof (one of the 7 wonders of Russia):

- Dmitry Vostokov @ DumpAnalysis.org -