Archive for March 24th, 2010

Adjoint threads, discontinuity and time delta: software trace pattern cooperation

Wednesday, March 24th, 2010

Here is one of the first case studies in pattern-driven software trace analysis. A user starts printing but nothing comes out. However, if the older printer driver is installed everything works as expected. We suspect that print spooler crashes if the newer printer driver is used. Based on known module name in ETW trace we find PID for print spooler process (19984) and immediately see discontinuity in the trace with the large time delta between the last PID message and the last trace statement (almost 4 minutes): 

No   Source        PID   TID   Time         Message
712  \src\print\ui 19984 16200 12:22:31.571 PropertySheet returns 1
[… no messages for PID 19984 …]
5103 \src\mgmt   1292  7604  12:26:11.659 WaitAction

If we select the adjoint thread of source \src\print\driver (in other words, filter only its messages) we would see discontinuity with the similar time delta. We know that printer driver runs in print spooler context. However, PID had changed and that means print spooler was restarted (perhaps after a crash):

No   Source            PID   TID   Time         Message
557  \src\print\driver 19984 16200 12:22:28.069 DisableDevice returns
[… discontinuity for \print\driver …]
1462 \src\print\driver 10828 17584 12:26:03.854 DllMain

- Dmitry Vostokov @ DumpAnalysis.org + TraceAnalysis.org -

Icons for Memory Dump Analysis Patterns (Part 11)

Wednesday, March 24th, 2010

Today we introduce an icon for Inconsistent Dump pattern:

B/W

Color

- Dmitry Vostokov @ DumpAnalysis.org + TraceAnalysis.org -

Icons for Memory Dump Analysis Patterns (Part 10)

Wednesday, March 24th, 2010

Today we introduce an icon for NULL Pointer (data) pattern:

B/W

Color

- Dmitry Vostokov @ DumpAnalysis.org + TraceAnalysis.org -

Reading Notebook: 22-March-10

Wednesday, March 24th, 2010

Comments in italics are mine and express my own views, thoughts and opinions

Windows Internals by M. Russinovich, D. Solomon and A. Ionescu:

Processor mode doesn’t affect thread scheduling (p. 414)

Preemption can be before a quantum ends and in that case the preempted thread is pushed at the front of a ready queue (pp. 414 - 415)

Clock interval extension of quanta for interrupted threads (pp. 416 - 417)

Context Switching (p. 418) - just noticed (never paid attention to before) that WinDbg shows empty context for the preempted thread:

x86 W2K3:

0: kd> kL
ChildEBP RetAddr
ba3a2a44 80833ed1 nt!KiSwapContext+0x26
ba3a2a70 80829c14 nt!KiSwapThread+0x2e5
ba3a2ab8 b9c5674d nt!KeWaitForSingleObject+0x346
[...]

0: kd> r
Last set context:
eax=00000000 ebx=00000000 ecx=00000000 edx=00000000 esi=00000000 edi=00000000
eip=8088f77e esp=ba3a2a38 ebp=ba3a2a70 iopl=0         nv up di pl nz na po nc
cs=0008  ss=0010  ds=0000  es=0000  fs=0000  gs=0000             efl=00000000
nt!KiSwapContext+0×26:
8088f77e 8b2c24          mov     ebp,dword ptr [esp]  ss:0010:ba3a2a38=ba3a2a70

0: kd> uf nt!KiSwapContext
nt!KiSwapContext:
8088f758 sub     esp,10h
8088f75b mov     dword ptr [esp+0Ch],ebx
8088f75f mov     dword ptr [esp+8],esi
8088f763 mov     dword ptr [esp+4],edi
8088f767 mov     dword ptr [esp],ebp
8088f76a mov     ebx,dword ptr fs:[1Ch]
8088f771 mov     edi,ecx
8088f773 mov     esi,edx
8088f775 movzx   ecx,byte ptr [edi+4Eh]
8088f779 call    nt!SwapContext (8088f880)
8088f77e mov     ebp,dword ptr [esp]
8088f781 mov     edi,dword ptr [esp+4]
8088f785 mov     esi,dword ptr [esp+8]
8088f789 mov     ebx,dword ptr [esp+0Ch]
8088f78d add     esp,10h
8088f790 ret

x64 W2K8:

1: kd> kL
*** Stack trace for last set context - .thread/.cxr resets it
Child-SP          RetAddr           Call Site
fffffa60`02ddc7c0 fffff800`0187a6fa nt!KiSwapContext+0x7f
fffffa60`02ddc900 fffff800`0186f35b nt!KiSwapThread+0x13a
fffffa60`02ddc970 fffff800`01ad9e57 nt!KeWaitForSingleObject+0x2cb
fffffa60`02ddca00 fffff800`01ad9219 nt!AlpcpReceiveMessagePort+0x287
fffffa60`02ddca60 fffff800`01ada58a nt!AlpcpReceiveMessage+0x245
fffffa60`02ddcb00 fffff800`01877ef3 nt!NtAlpcSendWaitReceivePort+0x1da
fffffa60`02ddcbb0 00000000`7747756a nt!KiSystemServiceCopyEnd+0x13
00000000`0020f5a8 00000000`00000000 ntdll!ZwAlpcSendWaitReceivePort+0xa

1: kd> r
Last set context:
rax=0000000000000000 rbx=0000000000000000 rcx=0000000000000000
rdx=0000000000000000 rsi=0000000000000000 rdi=0000000000000000

rip=fffff8000187ac7f rsp=fffffa6002ddc7c0 rbp=fffffa80047ca290
 r8=0000000000000000  r9=0000000000000000 r10=0000000000000000
r11=0000000000000000 r12=0000000000000000 r13=0000000000000000
r14=0000000000000000 r15=0000000000000000
iopl=0         nv up di pl nz na pe nc
cs=0000  ss=0000  ds=0000  es=0000  fs=0000  gs=0000             efl=00000000

nt!KiSwapContext+0×7f:
fffff800`0187ac7f 488d8c2400010000 lea     rcx,[rsp+100h]

1: kd> uf nt!KiSwapContext
nt!KiSwapContext:
fffff800`0187ac00 sub     rsp,138h
fffff800`0187ac07 lea     rax,[rsp+100h]
fffff800`0187ac0f movaps  xmmword ptr [rsp+30h],xmm6
fffff800`0187ac14 movaps  xmmword ptr [rsp+40h],xmm7
fffff800`0187ac19 movaps  xmmword ptr [rsp+50h],xmm8
fffff800`0187ac1f movaps  xmmword ptr [rsp+60h],xmm9
fffff800`0187ac25 movaps  xmmword ptr [rsp+70h],xmm10
fffff800`0187ac2b movdqa  xmmword ptr [rax-80h],xmm11
fffff800`0187ac31 movdqa  xmmword ptr [rax-70h],xmm12
fffff800`0187ac37 movdqa  xmmword ptr [rax-60h],xmm13
fffff800`0187ac3d movdqa  xmmword ptr [rax-50h],xmm14
fffff800`0187ac43 movdqa  xmmword ptr [rax-40h],xmm15
fffff800`0187ac49 mov     qword ptr [rax],rbx
fffff800`0187ac4c mov     qword ptr [rax+8],rdi
fffff800`0187ac50 mov     qword ptr [rax+10h],rsi
fffff800`0187ac54 mov     qword ptr [rax+18h],r12
fffff800`0187ac58 mov     qword ptr [rax+20h],r13
fffff800`0187ac5c mov     qword ptr [rax+28h],r14
fffff800`0187ac60 mov     qword ptr [rax+30h],r15
fffff800`0187ac64 mov     rbx,qword ptr gs:[20h]
fffff800`0187ac6d mov     rdi,rcx
fffff800`0187ac70 mov     rsi,rdx
fffff800`0187ac73 movzx   ecx,byte ptr [rdi+156h]
fffff800`0187ac7a call    nt!SwapContext (fffff800`0187af50)
fffff800`0187ac7f lea     rcx,[rsp+100h]
fffff800`0187ac87 movdqa  xmm6,xmmword ptr [rsp+30h]
fffff800`0187ac8d movdqa  xmm7,xmmword ptr [rsp+40h]
fffff800`0187ac93 movdqa  xmm8,xmmword ptr [rsp+50h]
fffff800`0187ac9a movdqa  xmm9,xmmword ptr [rsp+60h]
fffff800`0187aca1 movdqa  xmm10,xmmword ptr [rsp+70h]
fffff800`0187aca8 movdqa  xmm11,xmmword ptr [rcx-80h]
fffff800`0187acae movdqa  xmm12,xmmword ptr [rcx-70h]
fffff800`0187acb4 movdqa  xmm13,xmmword ptr [rcx-60h]
fffff800`0187acba movdqa  xmm14,xmmword ptr [rcx-50h]
fffff800`0187acc0 movdqa  xmm15,xmmword ptr [rcx-40h]
fffff800`0187acc6 mov     rbx,qword ptr [rcx]
fffff800`0187acc9 mov     rdi,qword ptr [rcx+8]
fffff800`0187accd mov     rsi,qword ptr [rcx+10h]
fffff800`0187acd1 mov     r12,qword ptr [rcx+18h]
fffff800`0187acd5 mov     r13,qword ptr [rcx+20h]
fffff800`0187acd9 mov     r14,qword ptr [rcx+28h]
fffff800`0187acdd mov     r15,qword ptr [rcx+30h]
fffff800`0187ace1 add     rsp,138h
fffff800`0187ace8 ret

We also see that if there is an attempt to switch from a DPC we get a bugcheck

1: kd> uf nt!SwapContext
nt!SwapContext:
fffff800`0187af50 sub     rsp,38h
fffff800`0187af54 mov     qword ptr [rsp+30h],rbp
fffff800`0187af59 mov     byte ptr [rsp+28h],cl
fffff800`0187af5d cmp     byte ptr [rsi+95h],0
fffff800`0187af64 jne     nt!SwapContext+0x1cb (fffff800`0187b11b)

[...]

nt!SwapContext+0x1b2:
fffff800`0187b102 xor     r9,r9
fffff800`0187b105 mov     qword ptr [rsp+20h],r9
fffff800`0187b10a mov     r8,rsi
fffff800`0187b10d mov     rdx,rdi
fffff800`0187b110 mov     ecx,0B8h
fffff800`0187b115 call    nt!KeBugCheckEx (fffff800`01878450)
fffff800`0187b11a ret

It happens infrequently: http://www.dumpanalysis.org/blog/index.php/2008/03/12/bug-check-frequencies/

Idle process and threads can have NULL fields (pp. 418 - 419) - on x64 W2K8:

1: kd> !process poi(PsIdleProcess)
PROCESS fffff800019970c0
SessionId: none  Cid: 0000    Peb: 00000000  ParentCid: 0000
DirBase: 00124000  ObjectTable: fffff88000000080  HandleCount: 551.
Image: Idle
VadRoot fffffa8003b97c70 Vads 1 Clone 0 Private 1. Modified 0. Locked 0.
DeviceMap 0000000000000000
Token                             fffff88000003330
ElapsedTime                       00:00:00.000
UserTime                          00:00:00.000
KernelTime                        00:00:00.000
QuotaPoolUsage[PagedPool]         0
QuotaPoolUsage[NonPagedPool]      0
Working Set Sizes (now,min,max)  (6, 50, 450) (24KB, 200KB, 1800KB)
PeakWorkingSetSize                6
VirtualSize                       0 Mb
PeakVirtualSize                   0 Mb
PageFaultCount                    1
MemoryPriority                    BACKGROUND
BasePriority                      0
CommitCharge                      0

        THREAD fffff80001996b80  Cid 0000.0000  Teb: 0000000000000000 Win32Thread: 0000000000000000 RUNNING on processor 0
Not impersonating
DeviceMap                 fffff88000007310
Owning Process            fffff800019970c0       Image:         Idle
Attached Process          fffffa8003bf1040       Image:         System
Wait Start TickCount      16846          Ticks: 1721 (0:00:00:26.847)
Context Switch Count      229608
UserTime                  00:00:00.000
KernelTime                00:04:13.532
Win32 Start Address nt!KiIdleLoop (0xfffff8000187c880)
Stack Init fffff80002bdadb0 Current fffff80002bdad40
Base fffff80002bdb000 Limit fffff80002bd5000 Call 0
Priority 16 BasePriority 0 PriorityDecrement 0 IoPriority 0 PagePriority 0
Child-SP          RetAddr           Call Site
fffff800`02bdad80 fffff800`01a49860 nt!KiIdleLoop+0x11b
fffff800`02bdadb0 00000000`00000000 nt!zzz_AsmCodeRange_End+0x4

        THREAD fffffa60005f5d40  Cid 0000.0000  Teb: 0000000000000000 Win32Thread: 0000000000000000 RUNNING on processor 1
Not impersonating
DeviceMap                 fffff88000007310
Owning Process            fffff800019970c0       Image:         Idle
Attached Process          fffffa8003bf1040       Image:         System
Wait Start TickCount      0              Ticks: 18567 (0:00:04:49.647)
Context Switch Count      241262
UserTime                  00:00:00.000
KernelTime                00:04:23.501
Win32 Start Address nt!KiIdleLoop (0xfffff8000187c880)
Stack Init fffffa600191bdb0 Current fffffa600191bd40
Base fffffa600191c000 Limit fffffa6001916000 Call 0
Priority 16 BasePriority 0 PriorityDecrement 0 IoPriority 0 PagePriority 0
Child-SP          RetAddr           Call Site
fffffa60`0191bcd8 fffffa60`00f07685 intelppm!C1Halt+0x2
fffffa60`0191bce0 fffff800`0187cb83 intelppm!C1Idle+0x9
fffffa60`0191bd10 fffff800`0187c8a1 nt!PoIdle+0x183
fffffa60`0191bd80 fffff800`01a49860 nt!KiIdleLoop+0x21
fffffa60`0191bdb0 00000000`fffffa60 nt!zzz_AsmCodeRange_End+0x4
fffffa60`005efd00 00000000`00000000 0xfffffa60

MMCSS (MultiMedia Class Schedular Service) and priority boosts in Vista (p. 420)

Priority boosts never go beyond level 15 (p. 421) - looks like addition of velocities in relativity, where v1 > c/2, v2 > c/2 but v1+v2 < c (where c is the speed of light) :-)

Priority boosts for low prioroty _ERESOURCE owners (pp. 422 - 423)