Archive for the ‘Debugging’ Category

Bugtation No.46

Friday, October 10th, 2008

“Good” troubleshooters “see analogies between” applications “or” services, “the very best ones see analogies between analogies.”

Stefan Banach

- Dmitry Vostokov @ DumpAnalysis.org -

Bugtation No.45

Thursday, October 9th, 2008

“If the” modules “in my” process “don’t work with or without” yours, “I cannot blame” you.

Francisco Alves, “If the brakes in my car don’t work with or without petrol in the fuel tank, I cannot blame the fuel”

- Dmitry Vostokov @ DumpAnalysis.org -

Early crash dump, blocked thread, not my version and lost opportunity: pattern cooperation

Thursday, October 9th, 2008

It was reported that an important Windows service stopped responding from time to time. The customer was proactive in gathering memory dumps, and we got several early crash dumps. Most of them were false positives showing normal error handling via a thrown exception:

0:042> kL
ChildEBP RetAddr 
0f7bec6c 77c31e37 kernel32!RaiseException+0x53
0f7bec84 77c32042 rpcrt4!RpcpRaiseException+0x24
0f7bec94 77cb30e4 rpcrt4!NdrGetBuffer+0x46
0f7bf080 09a554a6 rpcrt4!NdrClientCall2+0x197
[…]

However, one such dump also had a clearly blocked thread that was itself blocking 10 other threads:

0:042> !locks

CritSec MyService!MainCriticalSection+0 at 0041b9a0
WaiterWoken        No
LockCount          0
RecursionCount     1
OwningThread       ad0
EntryCount         0
ContentionCount    0
*** Locked

CritSec +339fb8 at 00339fb8
WaiterWoken        No
LockCount          10
RecursionCount     1
OwningThread       ad0
EntryCount         0
ContentionCount    31
*** Locked

0:042> ~~[ad0]kL
ChildEBP RetAddr 
008dc1e0 7c94734b ntdll!KiFastSystemCallRet
008dc1e4 77d96c61 ntdll!NtOpenKey+0xc
008dc244 77d8e15f advapi32!LocalBaseRegOpenKey+0xd0
008dc278 6064fe47 advapi32!RegOpenKeyExA+0x11c
WARNING: Stack unwind information not available. Following frames may be wrong.
008dc8b8 6064fa00 NotMyDLL!getvar+0x4e7
[…]
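
The reading of the !locks output above (one owner thread, ten waiters) can be automated when there are many critical sections to check. The following is a minimal sketch, not part of the original analysis: the function names `parse_locks` and `waiters_by_owner` are hypothetical, the parser only understands the field layout shown in this post, and it assumes the displayed LockCount values are decimal waiter counts, as the post reads them.

```python
import re
from collections import defaultdict

# Transcript fragment copied from the !locks output in this post.
LOCKS_OUTPUT = """
CritSec MyService!MainCriticalSection+0 at 0041b9a0
WaiterWoken        No
LockCount          0
RecursionCount     1
OwningThread       ad0
EntryCount         0
ContentionCount    0
*** Locked

CritSec +339fb8 at 00339fb8
WaiterWoken        No
LockCount          10
RecursionCount     1
OwningThread       ad0
EntryCount         0
ContentionCount    31
*** Locked
"""

def parse_locks(text):
    """Turn !locks text into a list of dicts, one per critical section."""
    locks = []
    current = None
    for raw in text.splitlines():
        line = raw.strip()
        header = re.match(r"CritSec\s+(\S+)\s+at\s+([0-9a-fA-F]+)", line)
        if header:
            current = {"name": header.group(1), "address": header.group(2)}
            locks.append(current)
            continue
        if current is None:
            continue
        counter = re.match(
            r"(LockCount|RecursionCount|EntryCount|ContentionCount)\s+(\d+)", line)
        if counter:
            current[counter.group(1)] = int(counter.group(2))
            continue
        owner = re.match(r"OwningThread\s+([0-9a-fA-F]+)", line)
        if owner:
            current["OwningThread"] = owner.group(1)
    return locks

def waiters_by_owner(locks):
    """Sum LockCount per owning thread (assumed to approximate blocked threads)."""
    totals = defaultdict(int)
    for lock in locks:
        totals[lock["OwningThread"]] += lock.get("LockCount", 0)
    return dict(totals)

print(waiters_by_owner(parse_locks(LOCKS_OUTPUT)))
```

Grouping by owner makes the thread to inspect first obvious: here both sections belong to thread ad0, which is why its stack trace is the next thing examined.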

Checking the NotMyDLL module time stamp, we identified the Not My Version pattern because we expected a much later version:

0:042> lmt m NotMyDLL
start    end        module name
60600000 60686000   NotMyDLL  Mon Oct 30 10:14:07 1999
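
The time stamp comparison above is easy to script when many modules have to be checked. This is only a sketch under stated assumptions: `parse_lmt_line` and `is_older_than` are hypothetical helper names, the parser handles just the single-line `lmt` layout shown here, and the cutoff date is invented for illustration because the post says only that a much later version was expected.

```python
from datetime import datetime

MONTHS = {"Jan": 1, "Feb": 2, "Mar": 3, "Apr": 4, "May": 5, "Jun": 6,
          "Jul": 7, "Aug": 8, "Sep": 9, "Oct": 10, "Nov": 11, "Dec": 12}

def parse_lmt_line(line):
    """Parse one module line of `lmt` output into (start, end, name, timestamp)."""
    parts = line.split()
    start, end, name = int(parts[0], 16), int(parts[1], 16), parts[2]
    # Remaining tokens: weekday, month, day, hh:mm:ss, year. The weekday is ignored.
    _, month, day, clock, year = parts[3:8]
    hour, minute, second = (int(x) for x in clock.split(":"))
    stamp = datetime(int(year), MONTHS[month], int(day), hour, minute, second)
    return start, end, name, stamp

def is_older_than(line, expected_not_before):
    """True when the module was built before the date we expected."""
    return parse_lmt_line(line)[3] < expected_not_before

line = "60600000 60686000   NotMyDLL  Mon Oct 30 10:14:07 1999"
cutoff = datetime(2008, 1, 1)  # hypothetical cutoff, not taken from the case
print(is_older_than(line, cutoff))
```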

We know this component often had problems in the past. Although being stuck in registry access could be a coincidence, a sign of registry corruption, or a system-wide problem, we immediately advised upgrading the component to the latest stable version. We also got a manual dump of the service, taken when the customer tried to restart it, and it showed signs of the Lost Opportunity pattern:

0:000> kv
ChildEBP RetAddr Args to Child
1744fd44 7c947d0b 7c821d1e 00001b58 00000000 ntdll!KiFastSystemCallRet
1744fd48 7c821d1e 00001b58 00000000 00000000 ntdll!NtWaitForSingleObject+0xc
1744fdb8 7c821c8d 00001b58 ffffffff 00000000 kernel32!WaitForSingleObjectEx+0xac
1744fdcc 67e223dd 00001b58 ffffffff 1744fdf4 kernel32!WaitForSingleObject+0x12
1744fde0 7c93a352 67e20000 00000000 00000001 MyDLL!DllInitialize+0xed
1744fe00 7c950e70 67e222f0 67e20000 00000000 ntdll!LdrpCallInitRoutine+0x14
1744feb8 7c8268a3 00000000 00000000 00000000 ntdll!LdrShutdownProcess+0x182
1744ffa4 7c826905 c0000005 77e8f3b0 ffffffff kernel32!_ExitProcess+0x43
1744ffb8 7c8392c1 c0000005 00000000 00000000 kernel32!ExitProcess+0x14
1744ffec 00000000 77c4b0f5 0b644720 00000000 kernel32!BaseThreadStart+0x5f

0:000> !teb
TEB at 7ff4b000
    ExceptionList:        1744fda8
    StackBase:            17450000
    StackLimit:           17449000
    SubSystemTib:         00000000
    FiberData:            00001e00
    ArbitraryUserPointer: 00000000
    Self:                 7ff4b000
    EnvironmentPointer:   00000000
    ClientId:             00001e90 . 00001168
    RpcHandle:            00000000
    Tls Storage:          00000000
    PEB Address:          7ffdd000
    LastErrorValue:       0
    LastStatusValue:      103
    Count Owned Locks:    0
    HardErrorMode:        0

0:000> dds 17449000 17450000
[...]
1744f4b0  7c94775b ntdll!NtRaiseHardError+0xc
1744f4b4  7c842610 kernel32!UnhandledExceptionFilter+0x51a
1744f4b8  d0000144
1744f4bc  00000000
[…]

0:000> !error d0000144
Error code: (NTSTATUS) 0xd0000144 (3489661252) - {Application Error} The exception %s (0x%08lx) occurred in the application at location 0x%08lx.
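
The value d0000144 found on the raw stack can also be taken apart by hand. The sketch below decodes the documented NTSTATUS bit fields (severity, customer bit, facility, code). It additionally assumes, from the public header definitions as I recall them, that bit 28 here is the HARDERROR_OVERRIDE_ERRORMODE flag ORed onto STATUS_UNHANDLED_EXCEPTION (0xC0000144); treat that reading as an assumption rather than something the dump itself states. The function name `decode_ntstatus` is hypothetical.

```python
# Assumed constants (values as recalled from the Windows headers).
HARDERROR_OVERRIDE_ERRORMODE = 0x10000000
STATUS_UNHANDLED_EXCEPTION = 0xC0000144

SEVERITY = {0: "success", 1: "informational", 2: "warning", 3: "error"}

def decode_ntstatus(value):
    """Split a 32-bit NTSTATUS-style value into its bit fields."""
    return {
        "severity": SEVERITY[(value >> 30) & 0x3],
        "customer": bool(value & 0x20000000),
        "override_errormode": bool(value & HARDERROR_OVERRIDE_ERRORMODE),
        "facility": (value >> 16) & 0xFFF,
        "code": value & 0xFFFF,
        # Value with the assumed hard error flag cleared.
        "base_status": value & ~HARDERROR_OVERRIDE_ERRORMODE & 0xFFFFFFFF,
    }

fields = decode_ntstatus(0xD0000144)
print(fields["severity"], hex(fields["base_status"]))
print(fields["base_status"] == STATUS_UNHANDLED_EXCEPTION)
```

Read this way, the leftover value is consistent with the unhandled exception filter having raised a hard error message box, which matches the UnhandledExceptionFilter and NtRaiseHardError entries found on the stack.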

Therefore, we additionally advised dumping the process manually using userdump.exe when an error message box appears on the console session. We hope that getting the right dump files at the right time via the right method will prove or disprove our hypothesis about the NotMyDLL component.

- Dmitry Vostokov @ DumpAnalysis.org -

Bugtation No.44

Thursday, October 9th, 2008

“I’m gonna do better than learn to” troubleshoot, “I’m gonna learn to” debug.

Alexander Murray Palmer Haley, Roots

- Dmitry Vostokov @ DumpAnalysis.org -

Bugtation No.43

Wednesday, October 8th, 2008

“Some of the greatest advances in” debugging “have been due to the invention of symbols, which it afterwards became necessary to explain;”

Aldous Leonard Huxley, Jesting Pilate

For explanation of symbols please read:

Crash Dumps for Dummies: Part 5 - Symbol files explained  

- Dmitry Vostokov @ DumpAnalysis.org -

Bugtation No.42

Tuesday, October 7th, 2008

Reminiscence on a memory dump as an integer:

“The trouble with” memory dumps “is that we have examined only the very small ones. Maybe all the exciting stuff happens at really big” memory dumps, “ones we can’t even begin to think about in any very definite way. So maybe all the action is really inaccessible and we’re just fiddling around. Our brains have evolved to get us out of the rain, find where the berries are, and keep us from getting killed. Our brains did not evolve to help us grasp really large” memory dumps “or to look at things in a hundred thousand” memory locations.

Ronald Lewis Graham, quoted in “Computers, Pattern, Chaos and Beauty” by Clifford A. Pickover

- Dmitry Vostokov @ DumpAnalysis.org -

From user to kernel dumps

Tuesday, October 7th, 2008

Sometimes application developers with WinDbg live debugging and user dump experience need a quick guide to get started with kernel and complete memory dumps. Familiar stack trace browsing commands no longer work, and here is a preliminary discussion/tutorial on the forum:

http://www.dumpanalysis.org/forum/viewtopic.php?f=10&t=270

If you want to dig deeper, please see the Moving to kernel space (updated references) post for a reading list.

- Dmitry Vostokov @ DumpAnalysis.org -

Crash Dump Analysis Patterns (Part 76)

Monday, October 6th, 2008

Most of the time, Data Alignment manifests itself on Intel platforms as a performance issue or as GP faults from some instructions that require a natural boundary for their qword operands. Generally, there are no exceptions if we move a dword value from or to an odd memory address when the whole operand fits into one page. However, we need to take the possibility of page boundary spans into account when checking memory addresses for validity. Consider this exception:

0: kd> .trap 0xffffffffa38df520
ErrCode = 00000002
eax=b6d9220f ebx=b6ab4ffb ecx=00000304 edx=eaf2fdea esi=b6d9214c edi=b6ab8189
eip=bfa10e6e esp=a38df594 ebp=a38df5ac iopl=0 nv up ei ng nz ac po cy
cs=0008  ss=0010  ds=0023  es=0023  fs=0030  gs=0000  efl=00010293
driver+0x2ae6e:
bfa10e6e 895304  mov    dword ptr [ebx+4],edx ds:0023:b6ab4fff=????????

The address seems to be valid:

0: kd> !pte b6ab4fff
               VA b6ab4fff
PDE at   C0300B68        PTE at C02DAAD0
contains 7F0DD863      contains 426B0863
pfn 7f0dd ---DA--KWEV    pfn 426b0 ---DA--KWEV

But careful examination of the instruction reveals that it writes a 32-bit value starting at b6ab4fff, the last byte of its page. The remaining three bytes fall on the next page, so we need to inspect that page too:

0: kd> !pte b6ab4fff+1
               VA b6ab5000
PDE at   C0300B68        PTE at C02DAAD4
contains 7F0DD863      contains 00000080
pfn 7f0dd ---DA--KWEV                           not valid
                       DemandZero
                       Protect: 4 - ReadWrite

Although the page is demand zero and the fault should have been satisfied by creating a new page filled with zeroes, my point here is that the page could have been completely invalid, or paged out in the case of IRQL >= 2.
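
The page boundary check can be expressed as a small calculation. The sketch below assumes 4 KB pages and the classic non-PAE x86 layout in which the PTE for a virtual address lives at 0xC0000000 + (VA >> 12) * 4; that layout is an assumption on my part, although it agrees with the PTE addresses printed by !pte above. The helper names `pages_touched` and `pte_address` are hypothetical.

```python
PAGE_SIZE = 0x1000  # 4 KB pages (assumed; no large pages involved)

def pages_touched(address, size, page_size=PAGE_SIZE):
    """Return the base address of every page overlapped by [address, address + size)."""
    first = address & ~(page_size - 1)
    last = (address + size - 1) & ~(page_size - 1)
    return list(range(first, last + 1, page_size))

def pte_address(va):
    """PTE location for a VA under the assumed non-PAE x86 self-map layout."""
    return 0xC0000000 + (va >> 12) * 4

ebx = 0xB6AB4FFB
target = ebx + 4  # operand of: mov dword ptr [ebx+4],edx
for page in pages_touched(target, 4):
    print(f"page {page:08x}  PTE at {pte_address(page):08X}")
# prints:
# page b6ab4000  PTE at C02DAAD0
# page b6ab5000  PTE at C02DAAD4
```

Any access for which `pages_touched` returns more than one page needs a validity check on each of them, not only on the page holding the first byte.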

- Dmitry Vostokov @ DumpAnalysis.org -

Bugtation No.41

Monday, October 6th, 2008

Another variation of the previous bugtation No.40:

“Read” code “at whim!”

Randall Jarrell, A Sad Heart at the Supermarket: Essays & Fables

- Dmitry Vostokov @ DumpAnalysis.org -

Bugtation No.40

Monday, October 6th, 2008

Debug “at whim!” Debug “at whim!”

Randall Jarrell, A Sad Heart at the Supermarket: Essays & Fables

- Dmitry Vostokov @ DumpAnalysis.org -

Bugtation No.39

Monday, October 6th, 2008

Crash dumps “have another hypnotic effect. Because they are not immediately understood, they, like certain jokes, are suspected of holding in some sort of magic embrace the secret of” troubleshooting, “or at least some of its more” difficult “parts.”

Scott Milross Buchanan, Poetry and Mathematics

- Dmitry Vostokov @ DumpAnalysis.org -

Memory Dump Analysis Anthology, Volume 2

Friday, October 3rd, 2008

“Everything is memory dump.”

I’m very excited to announce that Volume 2 is available in paperback, hardcover and digital editions:

Memory Dump Analysis Anthology, Volume 2

In one or two weeks the paperback edition should also appear on Amazon and in other bookstores. The Amazon hardcover edition is planned to be available by the end of October.

I’m often asked when Volume 3 will be available, and I currently plan to release it in October - November, 2009. In the meantime I’m planning to concentrate on other publishing projects.

- Dmitry Vostokov @ DumpAnalysis.org -

Bugtation No.38

Thursday, October 2nd, 2008

Out of 61,500,000 Google hits for “Everything is” X I couldn’t find X == memory dump so I presume this quotation is also traced to me :-)

“Everything is memory dump.”

Dmitry Vostokov

- Dmitry Vostokov @ DumpAnalysis.org -

Bugtation No.37

Thursday, October 2nd, 2008

Out of 85,800 Google hits for “In the beginning there was the” X I couldn’t find X == crash so I presume this quotation is traced to me :-)

“In the beginning there was the crash.”

Dmitry Vostokov

- Dmitry Vostokov @ DumpAnalysis.org -

MDAA Volume 2: Table of Contents

Wednesday, October 1st, 2008

The book is nearly finished and here is the final TOC:

Memory Dump Analysis Anthology, Volume 2: Table of Contents

- Dmitry Vostokov @ DumpAnalysis.org -

Bugtation No.36

Wednesday, October 1st, 2008

Exception “is what we see at a glance.”

Blaise Pascal, Pensées

- Dmitry Vostokov @ DumpAnalysis.org -

Citrix joins Symbol Server Club!

Tuesday, September 30th, 2008

Today Citrix officially joined the club of public symbol server companies! Please refer to the following article for details:

How to Use the Citrix Symbol Server to Obtain Debug Symbols

- Dmitry Vostokov @ DumpAnalysis.org -

Bugtation No.35

Sunday, September 28th, 2008

Crash dump analysis “does not consist merely in” peeking “the memory and enlightening the understanding. Its main business should be to direct the” Customer.

Joseph Joubert, Pensées

- Dmitry Vostokov @ DumpAnalysis.org -

DebugWare Book: Table of Contents

Friday, September 26th, 2008

Here you can find the draft TOC for the forthcoming book “DebugWare: The Art and Craft of Writing Troubleshooting and Debugging Tools”:

Table of Contents

- Dmitry Vostokov @ DumpAnalysis.org -

Bugtation No.34

Thursday, September 25th, 2008

“An excellent precept for” programmers: “have a clear idea of all the” functions “and expressions you need, and you will find them.”

Ximénès Doudan, Pensées et fragments suivis des révolutions du goût

- Dmitry Vostokov @ DumpAnalysis.org -