Software Diagnostics Library

Archive for July 10th, 2008

Crash Dump Analysis Patterns (Part 70)

Thursday, July 10th, 2008

Sometimes compilers optimize code by replacing function calls with their bodies. This procedure is called function inlining and functions themselves are called inline. On one platform we can see the real function call on the stack trace but on another platform or product version we only see the same problem instruction. Fortunately the rest of stack trace should be the same. Therefore when comparing Stack Traces we shouldn’t pay attention only to the top function call. This pattern is called Inline Function Optimization.

It is frequently seen when threads crash while copying or moving memory. Consider this stack trace:

0: kd> kL ChildEBP RetAddr f22efaf4 f279ec3d driver!QueueValue+0x26b f22efb30 8081dcdf driver!BufferAppendData+0x35f f22efc7c 808f47b7 nt!IofCallDriver+0x45 f22efc90 808f24ee nt!IopSynchronousServiceTail+0x10b f22efd38 80888c7c nt!NtWriteFile+0x65a f22efd38 7c82ed54 nt!KiFastCallEntry+0xfc

When looking at rep movs instruction we might suspect that QueueValue was copying memory:

0: kd> r eax=00000640 ebx=89b23000 ecx=00000190 edx=89b3c828 esi=02124220 edi=e2108f58 eip=f279c797 esp=f22efadc ebp=f22efaf4 iopl=0 nv up ei pl nz na pe nc cs=0008 ss=0010 ds=0023 es=0023 fs=0030 gs=0000 efl=00010206 driver!QueueValue+0x26b: f279c797 f3a5 rep movs dword ptr es:[edi],dword ptr [esi] es:0023:e2108f58=dfefbecf ds:0023:02124220=????????

On x64 bit platform the same driver had the similar stack trace but with memcpy at its top:

fffffadf`8955f4a8 fffffadf`8d1bef46 driver!memcpy+0x1c0 fffffadf`8955f4b0 fffffadf`8d1c15c9 driver!QueueValue+0x2fe fffffadf`8955f550 fffff800`01273ed9 driver!BufferAppendData+0x481 [...]

We also see how QueueValue+0×2fe and QueueValue+0×26b are close. In fact the source code for the driver calls RtlCopyMemory function only once and it is defined as memcpy in wdm.h. The latter function is also exported from nt:

0: kd> x nt! [...] 80881780 nt!memcpy = <no type information> [...]

but usually can be found in any driver that links it from C runtime library, for example, on my x64 Windows:

1: kd> x nt!memcpy fffff800`01c464e0 nt!memcpy = <no type information>

1: kd> x srv!memcpy fffff980`0eafdf20 srv!memcpy = <no type information>

1: kd> x win32k!memcpy fffff960`000c1b40 win32k!memcpy = <no type information>

Therefore we see that when compiling for x86 platform Visual C++ compiler decided to inline memcpy code but AMD compiler on x64 platform didn’t inline it. The overall stack trace without offsets is very similar and we can suppose that the problem was identical.

- Dmitry Vostokov @ DumpAnalysis.org -

Posted in Crash Dump Analysis, Crash Dump Patterns, Debugging | 2 Comments »

Windows Internals Beta Exam

Thursday, July 10th, 2008

The new Microsoft beta exam is open for registration from 17th of July! Details can be found here:

Windows Internals Beta Exam 71-660

- Dmitry Vostokov @ DumpAnalysis.org -

Posted in Announcements, Software Technical Support | No Comments »

Crash Dump Analysis Patterns (Part 19b)

Thursday, July 10th, 2008

Previously I wrote about how to calculate thread waiting time in kernel and complete memory dumps (Waiting Thread Time pattern). Now I show how to do it for user dumps. Unfortunately this information is not available in user space and therefore this can be done only approximately by calculating average or maximum waiting time.

The key is to take advantage of !runaway WinDbg command. Modern versions of postmortem debuggers and process dumpers usually save thread time information additionally. For example, /ma switch for .dump command does it. The later versions of userdump.exe also do it. When !runaway is invoked with 0y111 (7) flag it shows each thread user and kernel times and also the time elapsed since each thread was created. Therefore for threads that didn’t do much CPU intensive computation their average waiting time will almost correspond to their elapsed time. Unfortunately we cannot say when that computation took place. It could be the case that the thread in question was waiting 98.9% of the time, then did some computation for 0.01% and was waiting again 1% of the time or it could be the case that it was doing computation 0.01% of the time initially and then waiting the rest of the time (99.9%) until the dump was saved.

Consider this example of a sleeping thread:

0:000> ~32kL 100 ChildEBP RetAddr 0a18f03c 7c826f4b ntdll!KiFastSystemCallRet 0a18f040 77e41ed1 ntdll!NtDelayExecution+0xc 0a18f0a8 77e424ed kernel32!SleepEx+0x68 0a18f0b8 1b0ac343 kernel32!Sleep+0xf 0a18f100 1b00ba49 msjet40!System::AllocatePages+0x15e 0a18f11c 1b00ba08 msjet40!PageDesc::WireCurrentPage+0x2f 0a18f13c 1b00b8dd msjet40!PageDesc::ReadPage+0x119 0a18f164 1b01b567 msjet40!Database::ReadPage+0x7a 0a18f190 1b00e309 msjet40!TableMover::GetNextRow+0xc9 0a18f1a8 1b015de9 msjet40!TableMover::Move+0xc4 0a18f1d8 1b015d9c msjet40!ErrIsamMove+0x6c 0a18f1f0 1b038aa4 msjet40!ErrDispMove+0x43 0a18f43c 1b015d9c msjet40!ErrJPMoveRange+0x350 0a18f454 1b0200b3 msjet40!ErrDispMove+0x43 0a18f6a0 1b021e4d msjet40!ErrMaterializeRows+0x1fd 0a18f6f0 1b021c0d msjet40!ErrJPSetColumnSort+0x191 0a18f718 1b0210a4 msjet40!ErrJPOpenSort+0x105 0a18f750 1b020de5 msjet40!ErrJPOpenRvt+0x171 0a18f9f0 1b039b82 msjet40!ErrExecuteQuery+0x548 0a18fa3c 1b05548b msjet40!ErrExecuteTempQuery+0x13d 0a18fa6c 4c23b01e msjet40!JetExecuteTempQuery+0xc9 0a18fa94 4c23a8d1 odbcjt32!DoJetCloseTable+0x64 0a18fd10 4c23a6b6 odbcjt32!SQLInternalExecute+0x217 0a18fd20 4c23a694 odbcjt32!SQLExecuteCover+0x1f 0a18fd28 488b3fc1 odbcjt32!SQLExecute+0x9 0a18fd44 0c4898b5 odbc32!SQLExecute+0xd3 [...]

The process uptime can be seen from vertarget WinDbg command:

0:000> vertarget [...] Process Uptime: 0 days 1:17:22.000 [...]

Then we dump thread times (only information for our 32nd thread is shown here):

0:000> !runaway 0y111 User Mode Time Thread Time 32:cfc 0 days 0:00:00.109 [...] Kernel Mode Time Thread Time 32:cfc 0 days 0:00:00.062 [...] Elapsed Time Thread Time [...] 32:cfc 0 days 1:17:20.703 [...]

We see that 1 hour, 17 minutes and almost 21 seconds passed since the thread was created. By subtracting this time from process uptime we see that it was created in the first 2 seconds and it was consuming less than one second of CPU time. Therefore, most of the time this thread was waiting. Unfortunately we cannot say when it was waiting most of the time, in the beginning before it started sleeping or after. Fortunately we might continue guessing by looking at Sleep argument:

0:032> kv ChildEBP RetAddr Args to Child 0a18f03c 7c826f4b 77e41ed1 00000000 0a18f080 ntdll!KiFastSystemCallRet 0a18f040 77e41ed1 00000000 0a18f080 00000000 ntdll!NtDelayExecution+0xc 0a18f0a8 77e424ed 00000032 00000000 04390000 kernel32!SleepEx+0x68 0a18f0b8 1b0ac343 00000032 042a34b4 0a18f1c8 kernel32!Sleep+0xf […]

We see that it had been sleeping for at most 32 milliseconds and perhaps this is a retry / sleep loop. We might guess that the thread was recently spinning in that loop and therefore waiting all the time before that. Alternatively we might guess that the retry portion is very small and fast and 32 milliseconds are spread over all elapsed time and the thread was in this loop the significant proportion of time. What we can surely say that the last waiting time was no more than 32 milliseconds. in the case of waiting for an event, for example, I don’t see any reliable way of calculating this time. If anyone knows please post a comment.

- Dmitry Vostokov @ DumpAnalysis.org -

Posted in Crash Dump Analysis, Crash Dump Patterns, WinDbg Tips and Tricks | 3 Comments »