Bugtation No.46
Friday, October 10th, 2008
“Good” troubleshooters “see analogies between” applications “or” services, “the very best ones see analogies between analogies.”
- Dmitry Vostokov @ DumpAnalysis.org -
“If the” modules “in my” process “don’t work with or without” yours, “I cannot blame” you.
Francisco Alves, “If the brakes in my car don’t work with or without petrol in the fuel tank, I cannot blame the fuel”
- Dmitry Vostokov @ DumpAnalysis.org -
It was reported that one important Windows service stops responding from time to time. The customer was proactive in gathering memory dumps, and we got several early crash dumps. Most of them were false positives showing normal error handling via a thrown exception:
0:042> kL
ChildEBP RetAddr
0f7bec6c 77c31e37 kernel32!RaiseException+0x53
0f7bec84 77c32042 rpcrt4!RpcpRaiseException+0x24
0f7bec94 77cb30e4 rpcrt4!NdrGetBuffer+0x46
0f7bf080 09a554a6 rpcrt4!NdrClientCall2+0x197
[…]
However, one such dump also had a clearly blocked thread that was itself blocking 10 other threads:
0:042> !locks
CritSec MyService!MainCriticalSection+0 at 0041b9a0
WaiterWoken No
LockCount 0
RecursionCount 1
OwningThread ad0
EntryCount 0
ContentionCount 0
*** Locked
CritSec +339fb8 at 00339fb8
WaiterWoken No
LockCount 10
RecursionCount 1
OwningThread ad0
EntryCount 0
ContentionCount 31
*** Locked
0:042> ~~[ad0]kL
ChildEBP RetAddr
008dc1e0 7c94734b ntdll!KiFastSystemCallRet
008dc1e4 77d96c61 ntdll!NtOpenKey+0xc
008dc244 77d8e15f advapi32!LocalBaseRegOpenKey+0xd0
008dc278 6064fe47 advapi32!RegOpenKeyExA+0x11c
WARNING: Stack unwind information not available. Following frames may be wrong.
008dc8b8 6064fa00 NotMyDLL!getvar+0x4e7
[…]
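The !locks output above can also be summarized mechanically when there are many critical sections to go through. Below is a minimal Python sketch, not part of the original analysis, that parses the text shown in this post and totals LockCount per owning thread. It assumes, following the reading used in this post, that LockCount is shown in decimal and corresponds to the number of blocked threads; the function names parse_locks and waiters_by_owner are hypothetical.

```python
import re
from collections import defaultdict

# The !locks output copied from this post.
LOCKS_OUTPUT = """\
CritSec MyService!MainCriticalSection+0 at 0041b9a0
WaiterWoken No
LockCount 0
RecursionCount 1
OwningThread ad0
EntryCount 0
ContentionCount 0
*** Locked
CritSec +339fb8 at 00339fb8
WaiterWoken No
LockCount 10
RecursionCount 1
OwningThread ad0
EntryCount 0
ContentionCount 31
*** Locked
"""

def parse_locks(text):
    """Turn !locks text into a list of dicts, one per critical section."""
    sections = []
    current = None
    for raw in text.splitlines():
        line = raw.strip()
        header = re.match(r"CritSec\s+(\S+)\s+at\s+([0-9a-fA-F]+)$", line)
        if header:
            current = {"name": header.group(1),
                       "address": header.group(2),
                       "locked": False}
            sections.append(current)
        elif current is None or not line:
            continue
        elif line.startswith("*** Locked"):
            current["locked"] = True
        else:
            parts = line.split()
            if len(parts) == 2:          # "FieldName value" pairs
                current[parts[0]] = parts[1]
    return sections

def waiters_by_owner(sections):
    """Sum LockCount (assumed decimal) per owning thread for locked sections."""
    totals = defaultdict(int)
    for section in sections:
        if section["locked"] and "OwningThread" in section:
            totals[section["OwningThread"]] += int(section.get("LockCount", "0"))
    return dict(totals)

if __name__ == "__main__":
    for owner, count in waiters_by_owner(parse_locks(LOCKS_OUTPUT)).items():
        print(f"thread {owner} owns locked sections with LockCount total {count}")
```

The thread with the highest total is a natural first candidate for a ~~[id]kL stack trace, as was done for thread ad0 here.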
Checking the NotMyDLL module time stamp, we identified the Not My Version pattern because we expected a much later version:
0:042> lmt m NotMyDLL
start end module name
60600000 60686000 NotMyDLL Mon Oct 30 10:14:07 1999
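When many modules have to be checked, the same comparison can be scripted. The following Python sketch is only an illustration: it parses an lmt line in the format shown above and compares the time stamp with a cut-off date. The cut-off of January 1, 2005 is a hypothetical value (this post only says that a much later version was expected), and the helper names are assumptions.

```python
from datetime import datetime

# Month names as they appear in lmt output (mapped by hand to avoid locale issues).
MONTHS = {name: number for number, name in enumerate(
    "Jan Feb Mar Apr May Jun Jul Aug Sep Oct Nov Dec".split(), start=1)}

def parse_lmt_line(line):
    """Parse 'start end module Www Mmm dd hh:mm:ss yyyy' into (module, datetime)."""
    fields = line.split()
    module = fields[2]
    _weekday, month, day, clock, year = fields[3:8]
    hour, minute, second = (int(part) for part in clock.split(":"))
    stamp = datetime(int(year), MONTHS[month], int(day), hour, minute, second)
    return module, stamp

def is_older_than_expected(line, expected_not_before):
    """True when the module time stamp predates the expected release date."""
    _module, stamp = parse_lmt_line(line)
    return stamp < expected_not_before

if __name__ == "__main__":
    lmt_line = "60600000 60686000 NotMyDLL Mon Oct 30 10:14:07 1999"
    cutoff = datetime(2005, 1, 1)  # hypothetical expected release date
    print(is_older_than_expected(lmt_line, cutoff))
```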
We know this component has often had problems in the past. Although being stuck in registry access could be a coincidence, registry corruption, or a system-wide problem, we immediately advised upgrading the component to the latest stable version. We also got a manual dump of the service taken when the customer tried to restart it, and it showed signs of the Lost Opportunity pattern:
0:000> kv
ChildEBP RetAddr Args to Child
1744fd44 7c947d0b 7c821d1e 00001b58 00000000 ntdll!KiFastSystemCallRet
1744fd48 7c821d1e 00001b58 00000000 00000000 ntdll!NtWaitForSingleObject+0xc
1744fdb8 7c821c8d 00001b58 ffffffff 00000000 kernel32!WaitForSingleObjectEx+0xac
1744fdcc 67e223dd 00001b58 ffffffff 1744fdf4 kernel32!WaitForSingleObject+0x12
1744fde0 7c93a352 67e20000 00000000 00000001 MyDLL!DllInitialize+0xed
1744fe00 7c950e70 67e222f0 67e20000 00000000 ntdll!LdrpCallInitRoutine+0x14
1744feb8 7c8268a3 00000000 00000000 00000000 ntdll!LdrShutdownProcess+0x182
1744ffa4 7c826905 c0000005 77e8f3b0 ffffffff kernel32!_ExitProcess+0x43
1744ffb8 7c8392c1 c0000005 00000000 00000000 kernel32!ExitProcess+0x14
1744ffec 00000000 77c4b0f5 0b644720 00000000 kernel32!BaseThreadStart+0x5f
0:000> !teb
TEB at 7ff4b000
ExceptionList: 1744fda8
StackBase: 17450000
StackLimit: 17449000
SubSystemTib: 00000000
FiberData: 00001e00
ArbitraryUserPointer: 00000000
Self: 7ff4b000
EnvironmentPointer: 00000000
ClientId: 00001e90 . 00001168
RpcHandle: 00000000
Tls Storage: 00000000
PEB Address: 7ffdd000
LastErrorValue: 0
LastStatusValue: 103
Count Owned Locks: 0
HardErrorMode: 0
0:000> dds 17449000 17450000
[...]
1744f4b0 7c94775b ntdll!NtRaiseHardError+0xc
1744f4b4 7c842610 kernel32!UnhandledExceptionFilter+0x51a
1744f4b8 d0000144
1744f4bc 00000000
[…]
0:000> !error d0000144
Error code: (NTSTATUS) 0xd0000144 (3489661252) - {Application Error} The exception %s (0x%08lx) occurred in the application at location 0x%08lx.
Therefore, we additionally advised dumping the process manually using userdump.exe when an error message box appears on the console session. We hope that getting the right dump files at the right time via the right method will prove or disprove our hypothesis about the NotMyDLL component.
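The status value d0000144 found on the raw stack can be taken apart by hand using the standard NTSTATUS bit layout (severity in bits 31-30, customer flag in bit 29, a reserved bit 28, facility in bits 27-16, code in bits 15-0). The Python sketch below does just that. The function name decode_ntstatus is hypothetical, and reading the value as c0000144 with bit 28 additionally set is my interpretation, not something the debugger output states.

```python
def decode_ntstatus(value):
    """Split a 32-bit NTSTATUS value into its bit fields."""
    value &= 0xFFFFFFFF
    return {
        "severity": (value >> 30) & 0x3,   # 0 success, 1 informational, 2 warning, 3 error
        "customer": (value >> 29) & 0x1,   # 0 means a Microsoft-defined code
        "reserved": (value >> 28) & 0x1,   # bit 28
        "facility": (value >> 16) & 0xFFF,
        "code": value & 0xFFFF,
    }

if __name__ == "__main__":
    status = 0xD0000144
    fields = decode_ntstatus(status)
    print(fields)
    # Clearing bit 28 gives the value without the extra flag (assumed interpretation).
    print(f"{status & ~0x10000000:08x}")
```

For d0000144 the severity field is 3 (error), the facility is 0 and the code is 0x144, so the value with bit 28 cleared is c0000144.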
- Dmitry Vostokov @ DumpAnalysis.org -
“I’m gonna do better than learn to” troubleshoot, “I’m gonna learn to” debug.
- Dmitry Vostokov @ DumpAnalysis.org -
“Some of the greatest advances in” debugging “have been due to the invention of symbols, which it afterwards became necessary to explain;”
Aldous Leonard Huxley, Jesting Pilate
For an explanation of symbols, please read:
Crash Dumps for Dummies: Part 5 - Symbol files explained
- Dmitry Vostokov @ DumpAnalysis.org -
Reminiscence on a memory dump as an integer:
“The trouble with” memory dumps “is that we have examined only the very small ones. Maybe all the exciting stuff happens at really big” memory dumps, “ones we can’t even begin to think about in any very definite way. So maybe all the action is really inaccessible and we’re just fiddling around. Our brains have evolved to get us out of the rain, find where the berries are, and keep us from getting killed. Our brains did not evolve to help us grasp really large” memory dumps “or to look at things in a hundred thousand” memory locations.
Ronald Lewis Graham, quoted in “Computers, Pattern, Chaos and Beauty” by Clifford A. Pickover
- Dmitry Vostokov @ DumpAnalysis.org -
Sometimes application developers with WinDbg live debugging and user dump experience need a quick guide to get started with kernel and complete memory dumps. Familiar stack trace browsing commands no longer work, and here is a preliminary discussion/tutorial on the forum:
http://www.dumpanalysis.org/forum/viewtopic.php?f=10&t=270
If you want to dig deeper, please see the Moving to kernel space (updated references) post for a reading list.
- Dmitry Vostokov @ DumpAnalysis.org -
Most of the time Data Alignment manifests itself on Intel platforms as a performance issue or as GP faults for some instructions that require natural boundary alignment for their qword operands. Generally there are no exceptions if we move a dword value from or to an odd memory address when the whole operand fits into one page. However, we need to take the possibility of page boundary spans into account when checking memory addresses for their validity. Consider this exception:
0: kd> .trap 0xffffffffa38df520
ErrCode = 00000002
eax=b6d9220f ebx=b6ab4ffb ecx=00000304 edx=eaf2fdea esi=b6d9214c edi=b6ab8189
eip=bfa10e6e esp=a38df594 ebp=a38df5ac iopl=0 nv up ei ng nz ac po cy
cs=0008 ss=0010 ds=0023 es=0023 fs=0030 gs=0000 efl=00010293
driver+0x2ae6e:
bfa10e6e 895304 mov dword ptr [ebx+4],edx ds:0023:b6ab4fff=????????
The address seems to be valid:
0: kd> !pte b6ab4fff
VA b6ab4fff
PDE at C0300B68 PTE at C02DAAD0
contains 7F0DD863 contains 426B0863
pfn 7f0dd ---DA--KWEV pfn 426b0 ---DA--KWEV
But careful examination of the instruction reveals that it writes a 32-bit value, so we need to inspect the following bytes too because they are on another page:
0: kd> !pte b6ab4fff+1
VA b6ab5000
PDE at C0300B68 PTE at C02DAAD4
contains 7F0DD863 contains 00000080
pfn 7f0dd ---DA--KWEV not valid
DemandZero
Protect: 4 - ReadWrite
Although the page is demand zero and the fault should have been satisfied by creating a new page filled with zeroes, my point here is that the page could have been completely invalid or paged out in the case of IRQL >= 2.
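The page span check is simple address arithmetic, so it can be sketched in a few lines of Python. This is only an illustration under the assumption of 4 KB pages (large pages are ignored), and the function names pages_touched and crosses_page_boundary are hypothetical. It lists every page base that an access of a given size covers, so each one can be passed to !pte.

```python
PAGE_SIZE = 0x1000  # 4 KB x86 pages (assumption; large pages are ignored)

def pages_touched(address, size, page_size=PAGE_SIZE):
    """Return the base address of every page covered by [address, address + size)."""
    if size <= 0:
        raise ValueError("size must be positive")
    first = address & ~(page_size - 1)
    last = (address + size - 1) & ~(page_size - 1)
    return list(range(first, last + page_size, page_size))

def crosses_page_boundary(address, size, page_size=PAGE_SIZE):
    """True when the access spans more than one page."""
    return len(pages_touched(address, size, page_size)) > 1

if __name__ == "__main__":
    # mov dword ptr [ebx+4],edx with ebx=b6ab4ffb writes 4 bytes starting at b6ab4fff
    target = 0xB6AB4FFB + 4
    for page in pages_touched(target, 4):
        print(f"!pte {page:08x}")
```

For the faulting instruction above, the write starts at the last byte of page b6ab4000 and its remaining three bytes fall into page b6ab5000, which is the page that !pte reports as not valid.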
- Dmitry Vostokov @ DumpAnalysis.org -
Another variation of the previous bugtation No.40:
“Read” code “at whim!”
Randall Jarrell, A Sad Heart at the Supermarket: Essays & Fables
- Dmitry Vostokov @ DumpAnalysis.org -
Debug “at whim!” Debug “at whim!”
Randall Jarrell, A Sad Heart at the Supermarket: Essays & Fables
- Dmitry Vostokov @ DumpAnalysis.org -
Crash dumps “have another hypnotic effect. Because they are not immediately understood, they, like certain jokes, are suspected of holding in some sort of magic embrace the secret of” troubleshooting, “or at least some of its more” difficult “parts.”
Scott Milross Buchanan, Poetry and Mathematics
- Dmitry Vostokov @ DumpAnalysis.org -
“Everything is memory dump.”
I’m very excited to announce that Volume 2 is available in paperback, hardcover and digital editions:
Memory Dump Analysis Anthology, Volume 2
In one or two weeks the paperback edition should also appear on Amazon and in other bookstores. The Amazon hardcover edition is planned to be available by the end of October.
I’m often asked when Volume 3 will be available, and I currently plan to release it in October - November, 2009. In the meantime I’m planning to concentrate on other publishing projects.
- Dmitry Vostokov @ DumpAnalysis.org -
Out of 61,500,000 Google hits for “Everything is” X, I couldn’t find X == memory dump, so I presume this quotation can also be traced to me:
“Everything is memory dump.”
- Dmitry Vostokov @ DumpAnalysis.org -
Out of 85,800 Google hits for “In the beginning there was the” X, I couldn’t find X == crash, so I presume this quotation can be traced to me:
“In the beginning there was the crash.”
- Dmitry Vostokov @ DumpAnalysis.org -
The book is nearly finished and here is the final TOC:
Memory Dump Analysis Anthology, Volume 2: Table of Contents
- Dmitry Vostokov @ DumpAnalysis.org -
Exception “is what we see at a glance.”
- Dmitry Vostokov @ DumpAnalysis.org -
Today Citrix officially joined the club of public symbol server companies! Please refer to the following article for details:
How to Use the Citrix Symbol Server to Obtain Debug Symbols
- Dmitry Vostokov @ DumpAnalysis.org -
Crash dump analysis “does not consist merely in” peeking “the memory and enlightening the understanding. Its main business should be to direct the” Customer.
Joseph Joubert, Pensées
- Dmitry Vostokov @ DumpAnalysis.org -
Here you can find the draft TOC for the forthcoming book “DebugWare: The Art and Craft of Writing Troubleshooting and Debugging Tools”:
- Dmitry Vostokov @ DumpAnalysis.org -
“An excellent precept for” programmers: “have a clear idea of all the” functions “and expressions you need, and you will find them.”
Ximénès Doudan, Pensées et fragments suivis des révolutions du goût
- Dmitry Vostokov @ DumpAnalysis.org -