Software Diagnostics Library

Bugchecks: KMODE_EXCEPTION_NOT_HANDLED

April 25th, 2007

This bugcheck (0x1E) is essentially the same as KERNEL_MODE_EXCEPTION_NOT_HANDLED (0x8E), although the parameters are different:

KMODE_EXCEPTION_NOT_HANDLED (1e)
This is a very common bugcheck. Usually the exception address pinpoints the driver/function that caused the problem. Always note this address as well as the link date of the driver/image that contains this address.
Arguments:
Arg1: c0000005, The exception code that was not handled
Arg2: 8046ce72, The address that the exception occurred at
Arg3: 00000000, Parameter 0 of the exception
Arg4: 00000000, Parameter 1 of the exception

KERNEL_MODE_EXCEPTION_NOT_HANDLED (8e)
This is a very common bugcheck. Usually the exception address pinpoints the driver/function that caused the problem. Always note this address as well as the link date of the driver/image that contains this address. Some common problems are exception code 0x80000003. This means a hard coded breakpoint or assertion was hit, but this system was booted /NODEBUG. This is not supposed to happen as developers should never have hardcoded breakpoints in retail code, but ... If this happens, make sure a debugger gets connected, and the system is booted /DEBUG. This will let us see why this breakpoint is happening.
Arguments:
Arg1: c0000005, The exception code that was not handled
Arg2: 808cbb8d, The address that the exception occurred at
Arg3: f5a84638, Trap Frame
Arg4: 00000000

Bugcheck 0x1E is raised from the same routine, KiDispatchException, on x64 W2K3 and on x86 W2K, whereas 0x8E is raised on x86 W2K3 and Vista platforms. I haven’t checked this on x64 Vista. Here is the modified UML diagram showing both bugchecks:
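The difference in parameters between the two bugchecks can be summarized in a small lookup. This is only a sketch based on the two WinDbg descriptions quoted above; describe_args is a hypothetical helper name, not a debugger API, and the text for Arg4 of 0x8E is a placeholder because the output above gives no description for it.

```cpp
#include <array>
#include <cstdint>
#include <string>

// Meaning of Arg1..Arg4 for the two bugchecks, as listed in the WinDbg
// descriptions above. describe_args is a hypothetical helper.
std::array<std::string, 4> describe_args(std::uint32_t bugcheck_code) {
    switch (bugcheck_code) {
    case 0x1E:  // KMODE_EXCEPTION_NOT_HANDLED
        return {"exception code", "exception address",
                "parameter 0 of the exception", "parameter 1 of the exception"};
    case 0x8E:  // KERNEL_MODE_EXCEPTION_NOT_HANDLED
        return {"exception code", "exception address",
                "trap frame", "(no description shown)"};
    default:    // any other bugcheck is outside the scope of this sketch
        return {"unknown", "unknown", "unknown", "unknown"};
    }
}
```

In both cases Arg1 and Arg2 carry the same information, so the first steps of the analysis (note the exception code and the faulting address) do not change.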

- Dmitry Vostokov -

Posted in Bugchecks Depicted, Crash Dump Analysis, Debugging, Vista

Crash Dump Analysis Poster v1.1 (HTML version)

April 22nd, 2007

Here is an HTML version of the Crash Dump Analysis Poster with hyperlinks. Command links launch WinDbg Help for the corresponding topic. If you click on !heap, for example, the WinDbg Help window for that command will open. For this to work, you need to save the source of the HTML file below to your disk and open it locally.

http://www.dumpanalysis.org/CDAPoster.html

Your WinDbg Help file must be in the default installation path, i.e.

C:\Program Files\Debugging Tools for Windows\debugger.chm

If you installed WinDbg to a different folder, you can simply create the default folder and copy debugger.chm there.

I keep this HTML file open locally on a second monitor and find it very easy to jump to the appropriate command help when I need a parameter description.

This HTML poster was created and edited in Notepad.

I’m working on the second version and will announce it as soon as it is ready.

- Dmitry Vostokov -

Posted in Announcements, Crash Dump Analysis, Tools, WinDbg Tips and Tricks

Bugchecks: KERNEL_MODE_EXCEPTION_NOT_HANDLED

April 22nd, 2007

Here is the next depicted bugcheck: 0x8E. It is very common in kernel crash dumps and it means that:

If an access violation exception happened, the read or write address was in user space
Frame-based exception handling was allowed, the kernel debugger (if any) didn’t handle the exception (first chance), then no exception handlers were willing to process the exception, and finally the kernel debugger (if any) didn’t handle the exception (second chance)
Frame-based exception handling wasn’t allowed and the kernel debugger (if any) didn’t handle the exception

The second option is depicted in the following UML sequence diagram:

Note: if you have an access violation and the read or write address is in kernel space, you get a different bugcheck, as explained in Invalid Pointer Pattern.
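The decision flow listed above can be written down as a small model. This is a minimal sketch under stated assumptions: DispatchInputs and dispatch_outcome are hypothetical names, the fields are illustrative rather than real kernel structures, and the case where frame-based handling is not allowed is modeled with a single debugger opportunity.

```cpp
#include <string>

// Hypothetical inputs describing who is willing to handle the exception.
struct DispatchInputs {
    bool frame_handling_allowed;   // frame-based (SEH) handling is permitted
    bool debugger_handles_first;   // kernel debugger handles the first chance
    bool frame_handler_handles;    // some frame-based handler processes it
    bool debugger_handles_second;  // kernel debugger handles the second chance
};

// Returns what happens to a kernel-mode exception in this simplified model.
std::string dispatch_outcome(const DispatchInputs& in) {
    if (in.frame_handling_allowed) {
        if (in.debugger_handles_first) return "handled by debugger (first chance)";
        if (in.frame_handler_handles) return "handled by exception handler";
        if (in.debugger_handles_second) return "handled by debugger (second chance)";
        return "bugcheck 0x8E";
    }
    // Assumption: without frame-based handling the debugger gets one
    // opportunity, represented here by the second-chance flag.
    if (in.debugger_handles_second) return "handled by debugger";
    return "bugcheck 0x8E";
}
```

With no kernel debugger attached, both debugger flags are false, so the outcome depends only on whether a frame-based handler accepts the exception.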

I assume that you know about structured and frame-based exception handling (SEH). If you don’t know how it is implemented, please read Matt Pietrek’s article: A Crash Course on the Depths of Win32 Structured Exception Handling

References used:

“Windows NT/2000 Native API Reference” book by Gary Nebbett
Local kernel debugging on Windows XP to check that the flow on the diagram above is correct

- Dmitry Vostokov -

Posted in Bugchecks Depicted, Crash Dump Analysis, Debugging

BSOD Shortcut Icon

April 21st, 2007

I created the simplest possible blue screen icon and put it on my site. Now you can see it when you open the site in your browser:

- Dmitry Vostokov -

Posted in Announcements

More WinDbg Scripts

April 21st, 2007

A good collection of WinDbg scripts was posted this month on Roberto Farah’s blog. Most of his scripts use DML (Debugger Markup Language). I also looked over my previous posts and put my old crash dump analysis scripts into one collection.

I found one problem. While writing a script yesterday to call some external programs, I found that I couldn’t easily pass parameters to the script. Something like this would be great:

$$><myscript.txt param1 param2

and then I could reference the parameters inside the script, perhaps through some pseudo-registers. The current approach is to set up pseudo-registers before running the script, and I don’t like it. I’ll post a better solution if I find one.

- Dmitry Vostokov -

Posted in Crash Dump Analysis, WinDbg Scripts

Crash Dump Analysis Patterns (Part 12)

April 20th, 2007

Another pattern that appears often in crash dumps: No Component Symbols. In this case we can guess what a component does by looking at its name, the overall thread stack where it is called, and its import table. Here is an example. We have a component.sys driver visible on some thread stack in a kernel dump, but we don’t know what that component can potentially do. Because we don’t have symbols, we cannot see its imported functions:

kd> x component!*
kd>

We use the lmv command to find the module’s load address and the !dh command to dump its image headers:

kd> lmv m component
start             end                 module name
fffffadf`e0eb5000 fffffadf`e0ebc000   component   (no symbols)
    Loaded symbol image file: component.sys
    Image path: \??\C:\Component\x64\component.sys
    Image name: component.sys
    Timestamp:        Sat Jul 01 19:06:16 2006 (44A6B998)
    CheckSum:         000074EF
    ImageSize:        00007000
    Translations:     0000.04b0 0000.04e0 0409.04b0 0409.04e0

kd> !dh fffffadf`e0eb5000

File Type: EXECUTABLE IMAGE
FILE HEADER VALUES
    8664 machine (X64)
       6 number of sections
44A6B998 time date stamp Sat Jul 01 19:06:16 2006

       0 file pointer to symbol table
       0 number of symbols
      F0 size of optional header
      22 characteristics
            Executable
            App can handle >2gb addresses

OPTIONAL HEADER VALUES
     20B magic #
    8.00 linker version
     C00 size of code
     A00 size of initialized data
       0 size of uninitialized data
    5100 address of entry point
    1000 base of code
         ----- new -----
0000000000010000 image base
    1000 section alignment
     200 file alignment
       1 subsystem (Native)
    5.02 operating system version
    5.02 image version
    5.02 subsystem version
    7000 size of image
     400 size of headers
    74EF checksum
0000000000040000 size of stack reserve
0000000000001000 size of stack commit
0000000000100000 size of heap reserve
0000000000001000 size of heap commit
       0 [       0] address [size] of Export Directory
    51B0 [      28] address [size] of Import Directory
    6000 [     3B8] address [size] of Resource Directory
    4000 [      6C] address [size] of Exception Directory
       0 [       0] address [size] of Security Directory
       0 [       0] address [size] of Base Relocation Directory
    2090 [      1C] address [size] of Debug Directory
       0 [       0] address [size] of Description Directory
       0 [       0] address [size] of Special Directory
       0 [       0] address [size] of Thread Storage Directory
       0 [       0] address [size] of Load Configuration Directory
       0 [       0] address [size] of Bound Import Directory
    2000 [      88] address [size] of Import Address Table Directory
       0 [       0] address [size] of Delay Import Directory
       0 [       0] address [size] of COR20 Header Directory
       0 [       0] address [size] of Reserved Directory
...
...
...

Then we display the contents of the Import Address Table Directory using the dps command:

kd> dps fffffadf`e0eb5000+2000 fffffadf`e0eb5000+2000+88
fffffadf`e0eb7000  fffff800`01044370 nt!IoCompleteRequest
fffffadf`e0eb7008  fffff800`01019700 nt!IoDeleteDevice
fffffadf`e0eb7010  fffff800`012551a0 nt!IoDeleteSymbolicLink
fffffadf`e0eb7018  fffff800`01056a90 nt!MiResolveTransitionFault+0x7c2
fffffadf`e0eb7020  fffff800`0103a380 nt!ObDereferenceObject
fffffadf`e0eb7028  fffff800`0103ace0 nt!KeWaitForSingleObject
fffffadf`e0eb7030  fffff800`0103c570 nt!KeSetTimer
fffffadf`e0eb7038  fffff800`0102d070 nt!IoBuildPartialMdl+0x3
fffffadf`e0eb7040  fffff800`012d4480 nt!PsTerminateSystemThread
fffffadf`e0eb7048  fffff800`01041690 nt!KeBugCheckEx
fffffadf`e0eb7050  fffff800`010381b0 nt!KeInitializeTimer
fffffadf`e0eb7058  fffff800`0103ceb0 nt!ZwClose
fffffadf`e0eb7060  fffff800`012b39f0 nt!ObReferenceObjectByHandle
fffffadf`e0eb7068  fffff800`012b7380 nt!PsCreateSystemThread
fffffadf`e0eb7070  fffff800`01251f90 nt!FsRtlpIsDfsEnabled+0x114
fffffadf`e0eb7078  fffff800`01275160 nt!IoCreateDevice
fffffadf`e0eb7080  00000000`00000000
fffffadf`e0eb7088  00000000`00000000

We see that this driver can, under certain circumstances, bugcheck the system using KeBugCheckEx, that it creates system thread(s) (PsCreateSystemThread), and that it uses timer(s) (KeInitializeTimer, KeSetTimer).

If you see name+offset in the import table (I think this is an effect of OMAP code optimization), you can get the function name by using the ln command (list nearest symbols):

kd> ln fffff800`01056a90
(fffff800`01056760)   nt!MiResolveTransitionFault+0x7c2   |  (fffff800`01056a92)   nt!RtlInitUnicodeString
kd> ln fffff800`01251f90
(fffff800`01251e90)   nt!FsRtlpIsDfsEnabled+0x114   |  (fffff800`01251f92)   nt!IoCreateSymbolicLink

This technique is useful if you have a bugcheck that happens when a driver calls certain functions, or when a driver must call certain functions in pairs, as with bugcheck 0x20:

kd> !analyze -show 0x20
KERNEL_APC_PENDING_DURING_EXIT (20)
The key data item is the thread's APC disable count. If this is non-zero, then this is the source of the problem.
The APC disable count is decremented each time a driver calls KeEnterCriticalRegion, KeInitializeMutex, or FsRtlEnterFileSystem.
The APC disable count is incremented each time a driver calls KeLeaveCriticalRegion, KeReleaseMutex, or FsRtlExitFileSystem.
Since these calls should always be in pairs, this value should be zero when a thread exits. A negative value indicates that a driver has disabled APC calls without re-enabling them. A positive value indicates that the reverse is true.
If you ever see this error, be very suspicious of all drivers installed on the machine -- especially unusual or non-standard drivers. Third party file system redirectors are especially suspicious since they do not generally receive the heavy duty testing that NTFS, FAT, RDR, etc receive.
This current IRQL should also be 0. If it is not, that a driver's cancelation routine can cause this bugcheck by returning at an elevated IRQL.
Always attempt to note what you were doing/closing at the time of the crash, and note all of the installed drivers at the time of the crash. This symptom is usually a severe bug in a third party driver.

Then you can at least see whether the suspicious driver could have used those functions and, if it imports one of them, whether it also imports the corresponding counterpart function.
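The counterpart check can be sketched in a few lines once the import names have been copied out of the dps output. The pairs below are the ones named in the !analyze -show 0x20 text; find_unpaired_imports is a hypothetical helper, and the set of imports is assumed to have been typed in by hand from the debugger output.

```cpp
#include <set>
#include <string>
#include <utility>
#include <vector>

// Reports functions that are imported without their counterpart.
// The pairs come from the bugcheck 0x20 description quoted above.
std::vector<std::string> find_unpaired_imports(const std::set<std::string>& imports) {
    static const std::vector<std::pair<std::string, std::string>> pairs = {
        {"KeEnterCriticalRegion", "KeLeaveCriticalRegion"},
        {"KeInitializeMutex", "KeReleaseMutex"},
        {"FsRtlEnterFileSystem", "FsRtlExitFileSystem"},
    };
    std::vector<std::string> unpaired;
    for (const auto& p : pairs) {
        const bool has_first = imports.count(p.first) != 0;
        const bool has_second = imports.count(p.second) != 0;
        // Exactly one half of the pair is imported: remember which one.
        if (has_first != has_second)
            unpaired.push_back(has_first ? p.first : p.second);
    }
    return unpaired;
}
```

An empty result only means the import table looks balanced. It says nothing about whether the calls are balanced at run time, so it is a hint for where to look next rather than a verdict.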

- Dmitry Vostokov @ DumpAnalysis.org -

Posted in Crash Dump Analysis, Crash Dump Patterns, WinDbg Tips and Tricks

Crash Dump Analysis Patterns (Part 5b)

April 20th, 2007

This is a follow-up to the Optimized Code pattern written previously. Now I discuss a feature that often bewilders beginners: OMAP code optimization. It is used to make the code that needs to be present in memory smaller. So instead of a flat address range for a compiled function, you have pieces of it scattered here and there. This leads to an ambiguity when you try to disassemble OMAP code at its address, because WinDbg doesn’t know whether it should treat the address range as a function offset (starting from the beginning of the function source code) or just as a memory layout offset (starting from the address of that function). Let me illustrate this with the IoCreateDevice function code.

Let’s first evaluate an arbitrary address at an offset from the first address of the function (memory layout offset):

kd> ? nt!IoCreateDevice
Evaluate expression: -8796073668256 = fffff800`01275160
kd> ? nt!IoCreateDevice+0x144
Evaluate expression: -8796073667932 = fffff800`012752a4
kd> ? fffff800`012752a4-fffff800`01275160
Evaluate expression: 324 = 00000000`00000144

If we try to disassemble code at the same address the expression will also be evaluated as the memory layout offset:

kd> u nt!IoCreateDevice+0x144
nt!IoCreateDevice+0x1a3:
fffff800`012752a4 83c810           or      eax,10h
fffff800`012752a7 898424b0000000   mov     dword ptr [rsp+0B0h],eax
fffff800`012752ae 85ed             test    ebp,ebp
fffff800`012752b0 8bdd             mov     ebx,ebp
fffff800`012752b2 0f858123feff     jne     nt!IoCreateDevice+0x1b3
fffff800`012752b8 035c2454         add     ebx,dword ptr [rsp+54h]
fffff800`012752bc 488b1585dcf2ff   mov     rdx,qword ptr [nt!IoDeviceObjectType]
fffff800`012752c3 488d8c2488000000 lea     rcx,[rsp+88h]

You see the difference: we give a +0x144 offset but the code is shown from +0x1a3! This is because OMAP optimization moved the code from the function offset +0x1a3 to memory locations starting from +0x144. The following picture illustrates this:

If you see this when disassembling a function name+offset address from a thread stack trace, you can use the raw address instead:

kd> k
Child-SP          RetAddr           Call Site
fffffadf`e3a18d30 fffff800`012b331e component!function+0x72
fffffadf`e3a18d70 fffff800`01044196 nt!PspSystemThreadStartup+0x3e
fffffadf`e3a18dd0 00000000`00000000 nt!KxStartSystemThread+0x16
kd> u fffff800`012b331e
nt!PspSystemThreadStartup+0x3e:
fffff800`012b331e 90                 nop
fffff800`012b331f f683fc03000040     test    byte ptr [rbx+3FCh],40h
fffff800`012b3326 0f8515d30600       jne     nt!PspSystemThreadStartup+0x4c
fffff800`012b332c 65488b042588010000 mov     rax,qword ptr gs:[188h]
fffff800`012b3335 483bd8             cmp     rbx,rax
fffff800`012b3338 0f85a6d30600       jne     nt!PspSystemThreadStartup+0x10c
fffff800`012b333e 838bfc03000001     or      dword ptr [rbx+3FCh],1
fffff800`012b3345 33c9               xor     ecx,ecx

You also see OMAP in action when you try to disassemble the function body using the uf command:

kd> uf nt!IoCreateDevice
nt!IoCreateDevice+0x34d:
fffff800`0123907d 834f3008         or      dword ptr [rdi+30h],8
fffff800`01239081 e955c30300       jmp     nt!IoCreateDevice+0x351
...
...
...
nt!IoCreateDevice+0x14c:
fffff800`0126f320 6641be0002       mov     r14w,200h
fffff800`0126f325 e92f5f0000       jmp     nt!IoCreateDevice+0x158

nt!IoCreateDevice+0x3cc:
fffff800`01270bd0 488d4750         lea     rax,[rdi+50h]
fffff800`01270bd4 48894008         mov     qword ptr [rax+8],rax
fffff800`01270bd8 488900           mov     qword ptr [rax],rax
fffff800`01270bdb e95b480000       jmp     nt!IoCreateDevice+0x3d7

nt!IoCreateDevice+0xa4:
fffff800`01273eb9 41b801000000     mov     r8d,1
fffff800`01273ebf 488d154a010700   lea     rdx,[nt!`string']
fffff800`01273ec6 488d8c24d8000000 lea     rcx,[rsp+0D8h]
fffff800`01273ece 440fc10522f0f2ff xadd    dword ptr [nt!IopUniqueDeviceObjectNumber],r8d
fffff800`01273ed6 41ffc0           inc     r8d
fffff800`01273ed9 e8d236deff       call    nt!swprintf
fffff800`01273ede 4584ed           test    r13b,r13b
fffff800`01273ee1 0f85c1a70800     jne     nt!IoCreateDevice+0xce
...
...
...

- Dmitry Vostokov @ DumpAnalysis.org -

Posted in Crash Dump Analysis, Crash Dump Patterns

Finding a needle in a hay

April 19th, 2007

I found a good WinDbg command to list unique threads in a process. Some processes have so many threads that it is difficult to find anomalies in the output of the ~*kv command, especially when most threads are similar, for example waiting for an LPC reply. In this case we can use the !uniqstack command to list only threads with unique call stacks, followed by the numbers of the threads with duplicate stacks.

0:046> !uniqstack
Processing 51 threads, please wait

.  0  Id: 1d50.1dc0 Suspend: 1 Teb: 7fffe000 Unfrozen
      Priority: 0  Priority class: 32
ChildEBP RetAddr
0012fbcc 7c821b84 ntdll!KiFastSystemCallRet
0012fbd0 77e4189f ntdll!NtReadFile+0xc
0012fc38 77f795ab kernel32!ReadFile+0x16c
0012fc64 77f7943c ADVAPI32!ScGetPipeInput+0x2a
0012fcd8 77f796c1 ADVAPI32!ScDispatcherLoop+0x51
0012ff3c 004018fb ADVAPI32!StartServiceCtrlDispatcherW+0xe3
...
...
...
. 26  Id: 1d50.44ec Suspend: 1 Teb: 7ffaf000 Unfrozen
      Priority: 1  Priority class: 32
ChildEBP RetAddr
0752fea0 7c822124 ntdll!KiFastSystemCallRet
0752fea4 77e6bad8 ntdll!NtWaitForSingleObject+0xc
0752ff14 77e6ba42 kernel32!WaitForSingleObjectEx+0xac
0752ff28 1b00999e kernel32!WaitForSingleObject+0x12
0752ff34 1b009966 msjet40!Semaphore::Wait+0xe
0752ff5c 1b00358c msjet40!Queue::GetMessageW+0xc9
0752ffb8 77e6608b msjet40!System::WorkerThread+0x41
0752ffec 00000000 kernel32!BaseThreadStart+0x34
...
...
...
Total threads: 51
Duplicate callstacks: 31 (windbg thread #s follow):
3, 4, 5, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 19, 21, 22, 23, 27, 28, 29, 33, 39, 40, 41, 42, 43, 44, 47, 49, 50

0:046> ~49kL
ChildEBP RetAddr
0c58fe18 7c821c54 ntdll!KiFastSystemCallRet
0c58fe1c 77c7538c ntdll!ZwReplyWaitReceivePortEx+0xc
0c58ff84 77c5778f RPCRT4!LRPC_ADDRESS::ReceiveLotsaCalls+0x198
0c58ff8c 77c5f7dd RPCRT4!RecvLotsaCallsWrapper+0xd
0c58ffac 77c5de88 RPCRT4!BaseCachedThreadRoutine+0x9d
0c58ffb8 77e6608b RPCRT4!ThreadStartRoutine+0x1b
0c58ffec 00000000 kernel32!BaseThreadStart+0x34

0:046> ~47kL
ChildEBP RetAddr
0b65fe18 7c821c54 ntdll!KiFastSystemCallRet
0b65fe1c 77c7538c ntdll!ZwReplyWaitReceivePortEx+0xc
0b65ff84 77c5778f RPCRT4!LRPC_ADDRESS::ReceiveLotsaCalls+0x198
0b65ff8c 77c5f7dd RPCRT4!RecvLotsaCallsWrapper+0xd
0b65ffac 77c5de88 RPCRT4!BaseCachedThreadRoutine+0x9d
0b65ffb8 77e6608b RPCRT4!ThreadStartRoutine+0x1b
0b65ffec 00000000 kernel32!BaseThreadStart+0x34
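The idea behind the duplicate list can be illustrated by grouping threads whose symbolic frames are identical. This is a simplified take on what the command reports, not its actual implementation; Stack and group_by_stack are hypothetical names, and frames are compared as symbol+offset strings only (the ChildEBP and RetAddr columns differ between threads 47 and 49 above, so they are left out).

```cpp
#include <map>
#include <string>
#include <vector>

// One call stack: symbol+offset strings, innermost frame first.
using Stack = std::vector<std::string>;

// Groups thread numbers by identical call stacks.
// Threads sharing a group are the "duplicate callstacks".
std::map<Stack, std::vector<int>> group_by_stack(const std::map<int, Stack>& threads) {
    std::map<Stack, std::vector<int>> groups;
    for (const auto& thread : threads)
        groups[thread.second].push_back(thread.first);  // key: frames, value: thread ids
    return groups;
}
```

Any group holding a single thread is a candidate anomaly worth reading first, which is the point of using the command on a process with dozens of similar threads.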

- Dmitry Vostokov -

Posted in Crash Dump Analysis, Debugging, WinDbg Tips and Tricks

Reading Chinese

April 18th, 2007

I’m very pleased that Da-Chang Guan has started translating Crash Dump Analysis Patterns into Chinese. I’m personally thinking about translating them into Russian, my native language, too.

Please visit his blog http://windriver.polar.tw/blog/ where he also mentions the Windows Academic Program, which I didn’t know about before. It is definitely good to learn about Windows internals by studying Windows source code. Now there is an alternative to studying free Linux source code in university operating system courses.

If you are interested in learning Chinese, I’d also like to recommend some books that will help you learn to read traditional Chinese. I’m learning it now, and I would recommend the Chinese Reader series (I own and am studying the first two at the moment):

My personal opinion is that if you ignore writing and speaking, learning to read Chinese becomes a less arduous task. If you like linguistics and languages as I do, I would also recommend the following two popular books that I read some time ago:

and the more linguistic one I read too:

Mandarin Chinese: An Introduction

I have another book I haven’t read yet but it is on my reading list:

Chinese Language: Fact and Fantasy

I became interested in English grammar, and eventually in other languages and linguistics, four years ago while working for Programming Research Ltd., a global leader in software quality and coding standards, where I extended and wrote new C++ semantics components for their C++ static analysis product.

- Dmitry Vostokov -

Posted in Crash Dump Analysis

Crash Dump Analysis Poster v1.0

April 15th, 2007

In December, when I announced the Crash Dump Analysis Card, I talked about my plans to make a poster. Here is what I came up with for the first version: an A4-format poster designed with the following goals in mind:

Easy to stick nearby or keep handy
Foldable into two halves (user / kernel dumps)
Usable as a background image on a PC desktop
Displayable on a second monitor
Facilitate mastering commands and their options
Encourage looking in WinDbg Help

You can download a large JPEG file (1701×1208) for free (a PDF file will be available later):

Download Crash Dump Analysis Poster

In a couple of months I’m going to release a new version, after using and playing with the current one and collecting feedback. Some commands are missing from the first version of this poster, such as the !list extension command and various scripting and meta-commands, and I will add them in the next version. The current choice of commands is based on my previous Crash Dump Analysis Card and my personal day-to-day crash dump analysis work.

Originally I wanted to call it something like WinDbg Cheat Sheet or WinDbg Poster, but then I realized that I had to omit various live debugging commands and options, and there are already several similar cheat sheets for live debugging.

- Dmitry Vostokov -

Posted in Announcements, Crash Dump Analysis, Tools, WinDbg Tips and Tricks

Race conditions on a uniprocessor machine

April 14th, 2007

It is a known fact that hidden race conditions in code manifest themselves more frequently on a multiprocessor machine than on a uniprocessor machine. I was trying to create an example to illustrate this and wrote the following code, which was motivated by similar kernel-level code and a discussion on the Russian Software Development Network forum:

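#include <cassert>
#include <process.h>  // _beginthread
#include <tchar.h>    // _tmain, _TCHAR

volatile bool b;

// Keeps setting the shared flag to true.
void thread_true(void *)
{
    while (true) { b = true; }
}

// Keeps setting the shared flag to false.
void thread_false(void *)
{
    while (true) { b = false; }
}

int _tmain(int argc, _TCHAR* argv[])
{
    _beginthread(thread_true, 0, NULL);
    _beginthread(thread_false, 0, NULL);

    while (true)
    {
        // b is read twice, so it can change between the two comparisons
        assert (b == false || b == true);
    }

    return 0;
}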

The program has three threads. Two of them try to set the same boolean variable b to different values, and the main thread checks that its value is either true or false. The assertion should fail in the following scenario: the first thread (thread_true) sets b to true, so the first comparison in the assertion fails and we expect the second comparison to succeed, but the main thread is preempted by the second thread (thread_false), which sets the value to false, and therefore the second comparison fails too. We get an assertion dialog in the debug build showing that the boolean variable b is neither true nor false!

I compiled and ran that program, and it didn’t fail for hours on my uniprocessor laptop. On a multiprocessor machine it failed within a couple of minutes. If we look at the assembly language code for the assertion, we see that it is very short, so statistically speaking the chances that our main thread is preempted in the middle of the assertion are very low. This is because on a uniprocessor machine threads do not run in parallel; each runs until its quantum expires. So we should make the assertion code longer, so that it exceeds the quantum. To simulate this I added a call to the SwitchToThread API. When the assertion code yields execution to another thread, then perhaps that thread will be thread_false, and as soon as the main thread runs again we get the assertion failure:

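#include <cassert>
#include <windows.h>  // SwitchToThread
#include <process.h>  // _beginthread
#include <tchar.h>    // _tmain, _TCHAR

volatile bool b;

// Yields the processor in the middle of the assertion expression.
bool SlowOp()
{
    SwitchToThread();
    return false;
}

void thread_true(void *)
{
    while (true) { b = true; }
}

void thread_false(void *)
{
    while (true) { b = false; }
}

int _tmain(int argc, _TCHAR* argv[])
{
    _beginthread(thread_true, 0, NULL);
    _beginthread(thread_false, 0, NULL);

    while (true)
    {
        assert (b == false || SlowOp() || b == true);
    }

    return 0;
}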

I compiled and ran the program again and I couldn’t see any failure for a long time. It looks like thread_false always runs before the main thread, so when the main thread runs b is false and, due to the short-circuit evaluation rule of the || operator, we don’t get a chance to execute SlowOp(). Then I added a fourth thread called thread_true_2, so that twice as many threads set b to true as set it to false (2 to 1), giving us more chances to have b set to true before executing the assertion:

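#include <cassert>
#include <windows.h>  // SwitchToThread
#include <process.h>  // _beginthread
#include <tchar.h>    // _tmain, _TCHAR

volatile bool b;

// Yields the processor in the middle of the assertion expression.
bool SlowOp()
{
    SwitchToThread();
    return false;
}

void thread_true(void *)
{
    while (true) { b = true; }
}

// Second writer of true, to shift the odds towards b == true.
void thread_true_2(void *)
{
    while (true) { b = true; }
}

void thread_false(void *)
{
    while (true) { b = false; }
}

int _tmain(int argc, _TCHAR* argv[])
{
    _beginthread(thread_true, 0, NULL);
    _beginthread(thread_false, 0, NULL);
    _beginthread(thread_true_2, 0, NULL);

    while (true)
    {
        assert (b == false || SlowOp() || b == true);
    }

    return 0;
}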

Now when I ran the new program I got the assertion failure in a couple of minutes! It is hard to make race conditions manifest themselves on a uniprocessor machine.
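The failing interleaving can also be reproduced deterministically, without waiting for the scheduler. The sketch below is not the program above: check_with_preemption is a hypothetical helper that evaluates the same two comparisons in separate steps and calls a caller-supplied function between them, which stands in for the moment the main thread is preempted.

```cpp
#include <functional>

// Evaluates (b == false || b == true) as two separate reads of b and runs
// `between` after the first comparison has failed, simulating a preemption
// at exactly that point.
bool check_with_preemption(volatile bool& b, const std::function<void()>& between) {
    if (b == false) return true;  // first comparison succeeded, short-circuit
    between();                    // another thread gets to run here
    return b == true;             // second comparison reads b again
}
```

If b starts as true and the injected function sets it to false, the check returns false, which is the "neither true nor false" result seen in the assertion dialog. With an injected function that does nothing, the check always passes.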

- Dmitry Vostokov -

Posted in Debugging, Multithreading

Yet another look at Zw* and Nt* functions

April 10th, 2007

While reading the new book “Professional Rootkits” by Ric Vieler I encountered the following macro definition, which gets a function’s index in the system service table:

#define HOOK_INDEX(function2hook) *(PULONG)((PUCHAR)function2hook+1)

I couldn’t understand the code until I looked at the disassembly of typical ntdll!Zw and nt!Zw functions (x86 W2K3):

lkd> u ntdll!ZwCreateProcess
ntdll!NtCreateProcess:
7c821298 b831000000      mov     eax,31h
7c82129d ba0003fe7f      mov     edx,offset SharedUserData!SystemCallStub (7ffe0300)
7c8212a2 ff12            call    dword ptr [edx]
7c8212a4 c22000          ret     20h
7c8212a7 90              nop
ntdll!ZwCreateProcessEx:
7c8212a8 b832000000      mov     eax,32h
7c8212ad ba0003fe7f      mov     edx,offset SharedUserData!SystemCallStub (7ffe0300)
7c8212b2 ff12            call    dword ptr [edx]

lkd> u nt!ZwCreateProcess
nt!ZwCreateProcess:
8083c2a3 b831000000      mov     eax,31h
8083c2a8 8d542404        lea     edx,[esp+4]
8083c2ac 9c              pushfd
8083c2ad 6a08            push    8
8083c2af e8c688ffff      call    nt!KiSystemService (80834b7a)
8083c2b4 c22000          ret     20h
nt!ZwCreateProcessEx:
8083c2b7 b832000000      mov     eax,32h
8083c2bc 8d542404        lea     edx,[esp+4]

You can see that the user space ntdll!Nt and ntdll!Zw variants are the same. This is not the case in kernel space:

lkd> u nt!NtCreateProcess
nt!NtCreateProcess:
808f80ea 8bff            mov     edi,edi
808f80ec 55              push    ebp
808f80ed 8bec            mov     ebp,esp
808f80ef 33c0            xor     eax,eax
808f80f1 f6451c01        test    byte ptr [ebp+1Ch],1
808f80f5 0f8549d10600    jne     nt!NtCreateProcess+0xd (80965244)
808f80fb f6452001        test    byte ptr [ebp+20h],1
808f80ff 0f8545d10600    jne     nt!NtCreateProcess+0x14 (8096524a)

nt!Zw functions are dispatched through the service table. nt!Nt functions are the actual code.
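The x86 disassembly explains the macro: each Zw stub starts with the one-byte opcode b8 (mov eax, imm32) followed by the 32-bit service index, so reading a ULONG at offset 1 yields that index. Here is a minimal sketch of the same idea applied to a copied byte buffer rather than a live function pointer; service_index is a hypothetical helper, and the opcode check is an addition that the macro itself does not perform.

```cpp
#include <cstddef>
#include <cstdint>

// Returns the service index encoded in an x86 Zw* stub, or -1 if the bytes do
// not start with "mov eax, imm32" (opcode 0xB8). Works on a byte buffer only.
std::int64_t service_index(const unsigned char* stub, std::size_t size) {
    if (stub == nullptr || size < 5 || stub[0] != 0xB8) return -1;
    // The immediate operand is stored little-endian right after the opcode.
    return static_cast<std::int64_t>(stub[1]) |
           (static_cast<std::int64_t>(stub[2]) << 8) |
           (static_cast<std::int64_t>(stub[3]) << 16) |
           (static_cast<std::int64_t>(stub[4]) << 24);
}
```

For the bytes b8 31 00 00 00 of nt!ZwCreateProcess this gives 0x31, while the first bytes of nt!NtCreateProcess (8b ff 55 8b ec) are rejected because that function is ordinary code, not a dispatch stub.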

For completeness let’s look at AMD x64 W2K3. User space x64 call:

0:001> u ntdll!ZwCreateProcess
ntdll!NtCreateProcess:
00000000`78ef1ab0 4c8bd1          mov     r10,rcx
00000000`78ef1ab3 b882000000      mov     eax,82h
00000000`78ef1ab8 0f05            syscall
00000000`78ef1aba c3              ret
00000000`78ef1abb 666690          xchg    ax,ax
00000000`78ef1abe 6690            xchg    ax,ax
ntdll!NtCreateProfile:
00000000`78ef1ac0 4c8bd1          mov     r10,rcx
00000000`78ef1ac3 b883000000      mov     eax,83h

User space x86 call in x64 W2K3:

0:001> u ntdll!ZwCreateProcess
ntdll!ZwCreateProcess:
7d61d428 b882000000      mov     eax,82h
7d61d42d 33c9            xor     ecx,ecx
7d61d42f 8d542404        lea     edx,[esp+4]
7d61d433 64ff15c0000000  call    dword ptr fs:[0C0h]
7d61d43a c22000          ret     20h
7d61d43d 8d4900          lea     ecx,[ecx]
ntdll!ZwCreateProfile:
7d61d440 b883000000      mov     eax,83h
7d61d445 33c9            xor     ecx,ecx

Kernel space call in x64 W2K3:

kd> u nt!ZwCreateProcess nt!ZwCreateProcess+20
nt!ZwCreateProcess:
fffff800`0103dd70 488bc4          mov     rax,rsp
fffff800`0103dd73 fa              cli
fffff800`0103dd74 4883ec10        sub     rsp,10h
fffff800`0103dd78 50              push    rax
fffff800`0103dd79 9c              pushfq
fffff800`0103dd7a 6a10            push    10h
fffff800`0103dd7c 488d057d380000  lea     rax,[nt!KiServiceLinkage (fffff800`01041600)]
fffff800`0103dd83 50              push    rax
fffff800`0103dd84 b882000000      mov     eax,82h
fffff800`0103dd89 e972310000      jmp     nt!KiServiceInternal (fffff800`01040f00)
fffff800`0103dd8e 6690            xchg    ax,ax

kd> u nt!NtCreateProcess
nt!NtCreateProcess:
fffff800`01245832 53               push    rbx
fffff800`01245833 4883ec50         sub     rsp,50h
fffff800`01245837 4c8b9c2488000000 mov     r11,qword ptr [rsp+88h]
fffff800`0124583f b801000000       mov     eax,1
fffff800`01245844 488bd9           mov     rbx,rcx
fffff800`01245847 488b8c2490000000 mov     rcx,qword ptr [rsp+90h]
fffff800`0124584f 41f6c301         test    r11b,1
fffff800`01245853 41ba00000000     mov     r10d,0

Here it is the same as in x86 kernel space: Zw functions are dispatched but Nt functions are the actual code. If you want to remember which function variant is dispatched and which is the actual code, I propose the mnemonic “Z-dispatch”.

- Dmitry Vostokov -

Posted in Assembly Language, Kernel Development

Programmer Universalis

April 9th, 2007

Just a short observation: it’s very good to be able to understand and even write everything from GUI down to machine language instructions or up. Certainly understanding how software works at every level is very helpful in memory dump analysis. Seeing thread stacks in memory dumps helps in understanding software. The more you know the better you are at dump analysis and debugging. Debugging is not about stepping through the code. This is a very narrow view of a specialist programmer. Programmer Universalis can do debugging at every possible level and therefore can write any possible software layer.

- Dmitry Vostokov @ DumpAnalysis.org -

Posted in Crash Dump Analysis, Debugging

Analyzing Dr. Watson logs

April 8th, 2007

The main problem with Dr. Watson logs is the lack of symbol information, but this can be alleviated by using WinDbg if you have the same binary that crashed and produced the log entry. I’m going to illustrate this using the TestDefaultDebugger tool. Its main purpose is to crash. I use this tool here just to show you how to reconstruct a stack trace.

If you run it and Dr. Watson is your default postmortem debugger, you will get this event recorded in your Dr. Watson log:

*** ERROR: Module load completed but symbols could not be loaded for C:\Work\TestDefaultDebugger.exe
function: TestDefaultDebugger
        004014e6 cc               int     3
        004014e7 cc               int     3
        004014e8 cc               int     3
        004014e9 cc               int     3
        004014ea cc               int     3
        004014eb cc               int     3
        004014ec cc               int     3
        004014ed cc               int     3
        004014ee cc               int     3
        004014ef cc               int     3
FAULT ->004014f0 c7050000000000000000 mov dword ptr ds:[0],0 ds:0023:00000000=????????
        004014fa c3               ret
        004014fb cc               int     3
        004014fc cc               int     3
        004014fd cc               int     3
        004014fe cc               int     3
        004014ff cc               int     3
        00401500 0fb7542404       movzx   edx,word ptr [esp+4]
        00401505 89542404         mov     dword ptr [esp+4],edx
        00401509 e98e1c0000       jmp     TestDefaultDebugger+0x319c (0040319c)
        0040150e cc               int     3

*----> Stack Back Trace <----*
*** ERROR: Symbol file could not be found. Defaulted to export symbols for C:\WINDOWS\system32\ntdll.dll -
ChildEBP RetAddr Args to Child
WARNING: Stack unwind information not available. Following frames may be wrong.
TestDefaultDebugger+0x14f0
TestDefaultDebugger+0x3470
TestDefaultDebugger+0x2a27
TestDefaultDebugger+0x8e69
TestDefaultDebugger+0x98d9
TestDefaultDebugger+0x6258
TestDefaultDebugger+0x836d

You see that when the log entry was saved there were no symbols available, and this is the most common case. If you have such a log and no corresponding user dump (perhaps it was overwritten), you can still reconstruct the stack trace. To do this, run WinDbg, set the path to your application symbol files and load your application as a crash dump:

Microsoft (R) Windows Debugger Version 6.6.0007.5
Copyright (c) Microsoft Corporation. All rights reserved.

Loading Dump File [C:\Work\TestDefaultDebugger.exe]
Symbol search path is: SRV*c:\websymbols*http://msdl.microsoft.com/download/symbols;c:\work
Executable search path is:
ModLoad: 00400000 00435000 C:\Work\TestDefaultDebugger.exe
eax=00000000 ebx=00000000 ecx=00000000 edx=00000000 esi=00000000 edi=00000000
eip=0040e8bb esp=00000000 ebp=00000000 iopl=0 nv up di pl nz na po nc
cs=0000 ss=0000 ds=0000 es=0000 fs=0000 gs=0000 efl=00000000
TestDefaultDebugger!wWinMainCRTStartup:
0040e8bb e876440000      call    TestDefaultDebugger!__security_init_cookie (00412d36)

Now use the ln command to find the nearest symbol for each frame:

0:000> ln TestDefaultDebugger+0x14f0
c:\testdefaultdebugger\testdefaultdebuggerdlg.cpp(155)
(004014f0) TestDefaultDebugger!CTestDefaultDebuggerDlg::OnBnClickedButton1 | (00401500) TestDefaultDebugger!CDialog::Create
Exact matches:
TestDefaultDebugger!CTestDefaultDebuggerDlg::OnBnClickedButton1 (void)
0:000> ln TestDefaultDebugger+0x3470
f:\rtm\vctools\vc7libs\ship\atlmfc\src\mfc\cmdtarg.cpp(381)+0x18
(00403358) TestDefaultDebugger!CCmdTarget::OnCmdMsg+0x118 | (00403472) TestDefaultDebugger!CCmdTarget::IsInvokeAllowed
0:000> ln TestDefaultDebugger+0x2a27
f:\rtm\vctools\vc7libs\ship\atlmfc\src\mfc\dlgcore.cpp(85)+0x17
(00402a0c) TestDefaultDebugger!CDialog::OnCmdMsg+0x1b | (00402a91) TestDefaultDebugger!CDialog::`scalar deleting destructor'
0:000> ln TestDefaultDebugger+0x8e69
f:\rtm\vctools\vc7libs\ship\atlmfc\src\mfc\wincore.cpp(2299)+0xd
(00408dd9) TestDefaultDebugger!CWnd::OnCommand+0x90 | (00408e70) TestDefaultDebugger!CWnd::OnNotify
0:000> ln TestDefaultDebugger+0x98d9
f:\rtm\vctools\vc7libs\ship\atlmfc\src\mfc\wincore.cpp(1755)+0xe
(004098a3) TestDefaultDebugger!CWnd::OnWndMsg+0x36 | (00409ecf) TestDefaultDebugger!CWnd::ReflectChildNotify
0:000> ln TestDefaultDebugger+0x6258
f:\rtm\vctools\vc7libs\ship\atlmfc\src\mfc\wincore.cpp(1741)+0x17
(00406236) TestDefaultDebugger!CWnd::WindowProc+0x22 | (0040627a) TestDefaultDebugger!CTestCmdUI::CTestCmdUI
0:000> ln TestDefaultDebugger+0x836d
f:\rtm\vctools\vc7libs\ship\atlmfc\src\mfc\wincore.cpp(243)
(004082d3) TestDefaultDebugger!AfxCallWndProc+0x9a | (004083c0) TestDefaultDebugger!AfxWndProc

So we reconstructed the stack trace:

TestDefaultDebugger!CTestDefaultDebuggerDlg::OnBnClickedButton1
TestDefaultDebugger!CCmdTarget::OnCmdMsg+0x118
TestDefaultDebugger!CDialog::OnCmdMsg+0x1b
TestDefaultDebugger!CWnd::OnCommand+0x90
TestDefaultDebugger!CWnd::OnWndMsg+0x36
TestDefaultDebugger!CWnd::WindowProc+0x22
TestDefaultDebugger!AfxCallWndProc+0x9a
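What ln does for each module+offset pair can be sketched in a few lines of Python. This is only an illustration, not how WinDbg is implemented: the symbol table is copied by hand from the ln output above instead of being read from symbol files, the module base 00400000 is taken from the ModLoad line, and the name nearest_symbol is made up for this example.

```python
import bisect

MODULE = "TestDefaultDebugger"
MODULE_BASE = 0x00400000  # from the ModLoad line

# Symbol start addresses copied by hand from the ln output above.
# A real debugger reads these from symbol files; this table is an
# illustration only and is far from complete.
SYMBOLS = sorted([
    (0x004014F0, "CTestDefaultDebuggerDlg::OnBnClickedButton1"),
    (0x00402A0C, "CDialog::OnCmdMsg"),
    (0x00403358, "CCmdTarget::OnCmdMsg"),
    (0x00406236, "CWnd::WindowProc"),
    (0x004082D3, "AfxCallWndProc"),
    (0x00408DD9, "CWnd::OnCommand"),
    (0x004098A3, "CWnd::OnWndMsg"),
])
STARTS = [start for start, _ in SYMBOLS]


def nearest_symbol(offset):
    """Map a module-relative offset to 'module!symbol+displacement'."""
    address = MODULE_BASE + offset
    # Index of the last symbol that starts at or below the address.
    index = bisect.bisect_right(STARTS, address) - 1
    if index < 0:
        raise LookupError("no symbol at or below %08x" % address)
    start, name = SYMBOLS[index]
    displacement = address - start
    suffix = "+0x%x" % displacement if displacement else ""
    return "%s!%s%s" % (MODULE, name, suffix)


if __name__ == "__main__":
    # Offsets used in the ln commands above, top frame first.
    for offset in (0x14F0, 0x3470, 0x2A27, 0x8E69, 0x98D9, 0x6258, 0x836D):
        print(nearest_symbol(offset))
```

Because the table holds only the seven functions seen here, the sketch gives meaningful answers only for addresses inside those functions.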

To check it, we disassemble the top function and see that it corresponds to the crash point from the Dr. Watson log:

0:000> u TestDefaultDebugger!CTestDefaultDebuggerDlg::OnBnClickedButton1
TestDefaultDebugger!CTestDefaultDebuggerDlg::OnBnClickedButton1 [c:\testdefaultdebugger\testdefaultdebuggerdlg.cpp @ 155]:
004014f0 c7050000000000000000 mov dword ptr ds:[0],0
004014fa c3 ret
004014fb cc int 3
004014fc cc int 3
004014fd cc int 3
004014fe cc int 3
004014ff cc int 3

Although I haven’t tried it yet, I believe you can also apply this technique to old Windows 98 or Windows Me Dr. Watson logs.

- Dmitry Vostokov -

Posted in Crash Dump Analysis, WinDbg Tips and Tricks | No Comments »

ScreenHistory 1.0

April 8th, 2007

After working on many customer issues where I needed good screenshots, I decided to write a screen and window capture tool to make troubleshooting and reading other logs and traces easier. Here is the ScreenHistory tool. Its History-like GUI will be familiar if you have seen the WindowHistory, MessageHistory and ProcessHistory tools.

The tool captures the whole screen (currently the primary monitor only) at a specified interval (the default is 1 second) or the contents of the current foreground window (regardless of which monitor it is on) and saves the screenshot as a JPEG, GIF (default) or PNG file. Additionally, an HTML file is generated with links to the screenshots. Forthcoming versions of WindowHistory and MessageHistory will reference these screenshots. A Windows Mobile version will be released soon too.

Instead of forming a mental picture of the screen when you look at messages, or relating them to arbitrary screenshots sent by your customers, you can check real-time screenshots while reading message traces, for example, a MessageHistory trace:

13:12:24:944 S WM_ACTIVATEAPP (0x1c) wParam: 0x0 lParam: 0x12ec Deactivated / TID of activated window: 0x12ec … [Screen]
13:12:47:268 S WM_ACTIVATEAPP (0x1c) wParam: 0x1 lParam: 0x0 Activated / TID of deactivated window: 0x0 … [Screen]

or a WindowHistory trace:

Handle: 000300E4 Class: "MyClass" Title: "My Application"
Captured at: 13:11:47:983
Process ID: 6c4
Thread ID: 1054
Parent: 0
Screen position (l,t,r,b): (264,161,1032,691)
Visible: true
Window placement command: SW_SHOWNORMAL
Foreground: false
Foreground changed at 13:12:20:626 to true [Screen]
Foreground changed at 13:12:24:959 to false [Screen]
Foreground changed at 13:12:47:284 to true [Screen]
Foreground changed at 13:12:51:852 to false [Screen]

The following ScreenHistory screenshot was saved by the tool itself:

If you save an HTML file and load it in IE you would see formatted screen log (screenshot was saved by ScreenHistory):

- Dmitry Vostokov @ DumpAnalysis.org -

Posted in Announcements, Citrix, Debugging, Software Technical Support, Tools | No Comments »

Upgrading Dr. Watson

April 7th, 2007

I’ve been using NTSD as the default debugger on my laptop for a while and decided to switch back to Dr. Watson to get a couple of logs. Unfortunately Dr. Watson itself crashed in dbghelp.dll. Loading the drwtsn32.exe dump reveals that it depends on both dbghelp.dll and dbgeng.dll. I tried to replace these DLLs with newer versions from the latest Debugging Tools for Windows and found that any change in the system32 folder is immediately reverted to the original file versions. Instead of battling against Windows I decided to create a completely separate Dr. Watson folder and copy drwtsn32.exe there together with the latest dbghelp.dll and dbgeng.dll from Debugging Tools for Windows. Then I altered the “Debugger” value under the following key to include the full path to drwtsn32.exe:

HKEY_LOCAL_MACHINE\SOFTWARE\Microsoft\Windows NT\CurrentVersion\AeDebug
Debugger=c:\drwatson\drwtsn32 -p %ld -e %ld -g

This solved the problem. Dr. Watson now uses the latest debugging engine to save dumps and logs.
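If the same change has to be repeated on several machines, the registry value can be kept in a .reg file. The following Python sketch only generates the text of such a file and does not touch the registry. The helper names are made up for this example, and the c:\drwatson path is the folder used above, so it is an assumption for any other machine.

```python
AEDEBUG_KEY = (
    r"HKEY_LOCAL_MACHINE\SOFTWARE\Microsoft\Windows NT"
    r"\CurrentVersion\AeDebug"
)


def reg_escape(value):
    """Escape backslashes and double quotes for a .reg string value."""
    return value.replace("\\", "\\\\").replace('"', '\\"')


def aedebug_reg_file(debugger_command):
    """Return .reg file text that sets the AeDebug 'Debugger' value."""
    lines = [
        "Windows Registry Editor Version 5.00",
        "",
        "[%s]" % AEDEBUG_KEY,
        '"Debugger"="%s"' % reg_escape(debugger_command),
        "",
    ]
    return "\r\n".join(lines)


if __name__ == "__main__":
    # The command line is the one from the post; c:\drwatson is the
    # author's folder, so adjust the path for a different machine.
    print(aedebug_reg_file(r"c:\drwatson\drwtsn32 -p %ld -e %ld -g"))
```

Review the generated text before importing it, because a wrong Debugger value disables postmortem debugging for the whole machine.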

- Dmitry Vostokov -

Posted in Crash Dump Analysis, Tools | No Comments »

Post-debugging complications

April 3rd, 2007

Real story: suddenly an application under development started to leak memory very rapidly and in huge amounts, 100 MB per second. The application used a DLL that was known for memory leaks, but those leaks had been much smaller before. After spending the whole day debugging this problem, a developer renamed the application just to keep its current version and launched it again. The same executable file under a different name consumed much less memory, as it had before the problem appeared. After renaming it back, the application started to consume huge amounts of memory again. Scratching his head, the developer recalled that he had enabled full page heap (which places each allocation at the end of a full page) 3 weeks ago…

Page heap is enabled per executable image name, which is why the renamed copy ran without it. The moral of this story is to always revert changes made for debugging purposes as soon as the debugging session is finished, or to use a fresh and separate debugging environment every time. The latter is much easier nowadays if you use VMware or Virtual PC.

- Dmitry Vostokov -

Posted in Debugging | No Comments »

Crash Dump Analysis Patterns (Part 11)

April 3rd, 2007

One of the mistakes beginners make is trusting the stack trace displayed by the WinDbg !analyze or kv commands. WinDbg is only a tool; sometimes the information necessary to get the correct stack trace is missing, and therefore some critical thought is required to distinguish between correct and incorrect stack traces. I call this pattern Incorrect Stack Trace. Incorrect stack traces usually:

Have a WinDbg warning: “Following frames may be wrong”
Don’t have the correct bottom frame such as kernel32!BaseThreadStart (in user mode)
Have function calls that don’t make any sense
Have strange-looking disassembled function code or code that doesn’t make any sense from a compiler perspective
Have ChildEBP and RetAddr addresses that don’t make any sense

Consider the following stack trace:

0:011> k
ChildEBP RetAddr
WARNING: Frame IP not in any known module. Following frames may be wrong.
0184e434 7c830b10 0x184e5bf
0184e51c 7c81f832 ntdll!RtlGetFullPathName_Ustr+0x15b
0184e5f8 7c83b1dd ntdll!RtlpLowFragHeapAlloc+0xc6a
00099d30 00000000 ntdll!RtlpLowFragHeapFree+0xa7

Here we have almost all the attributes of a wrong stack trace. At first glance it looks like some heap corruption happened (runtime heap alloc and free functions are present), but on second thought you will see that the low fragmentation heap Free function shouldn’t call the low fragmentation heap Alloc function, and the latter shouldn’t query a full path name. That doesn’t make any sense.

What should we do here? Look at the raw stack and try to build the correct stack trace ourselves. In our case this is very easy. We need to traverse stack frames starting from BaseThreadStart+0x34 until we no longer find a function call or we reach the top of the stack. When functions are called (no optimization, most compilers), the saved EBP values are linked together as explained on slide 13 here:

Practical Foundations of Debugging (6.1)

0:011> !teb
TEB at 7ffd8000
    ExceptionList:        0184ebdc
    StackBase:            01850000
    StackLimit:           01841000
    SubSystemTib:         00000000
    FiberData:            00001e00
    ArbitraryUserPointer: 00000000
    Self:                 7ffd8000
    EnvironmentPointer:   00000000
    ClientId:             0000061c . 00001b60
    RpcHandle:            00000000
    Tls Storage:          00000000
    PEB Address:          7ffdf000
    LastErrorValue:       0
    LastStatusValue:      c0000034
    Count Owned Locks:    0
    HardErrorMode:        0

0:011> dds 01841000 01850000
01841000 00000000
…
…
…
0184eef0 0184ef0c
0184eef4 7615dff2 localspl!SplDriverEvent+0x21
0184eef8 00bc3e08
0184eefc 00000003
0184ef00 00000001
0184ef04 00000000
0184ef08 0184efb0
0184ef0c 0184ef30
0184ef10 7615f9d0 localspl!PrinterDriverEvent+0x46
0184ef14 00bc3e08
0184ef18 00000003
0184ef1c 00000000
0184ef20 0184efb0
0184ef24 00b852a8
0184ef28 00c3ec58
0184ef2c 00bafcc0
0184ef30 0184f3f8
0184ef34 7614a9b4 localspl!SplAddPrinter+0x5f3
0184ef38 00c3ec58
0184ef3c 00000003
0184ef40 00000000
0184ef44 0184efb0
0184ef48 00c117f8
…
…
…
0184ff28 00000000
0184ff2c 00000000
0184ff30 0184ff84
0184ff34 77c75286 RPCRT4!LRPC_ADDRESS::ReceiveLotsaCalls+0x3a
0184ff38 0184ff4c
0184ff3c 77c75296 RPCRT4!LRPC_ADDRESS::ReceiveLotsaCalls+0x4a
0184ff40 7c82f2fc ntdll!RtlLeaveCriticalSection
0184ff44 000de378
0184ff48 00097df0
0184ff4c 4d2fa200
0184ff50 ffffffff
0184ff54 ca5b1700
0184ff58 ffffffff
0184ff5c 8082d821
0184ff60 0184fe38
0184ff64 00097df0
0184ff68 000000aa
0184ff6c 80020000
0184ff70 0184ff54
0184ff74 80020000
0184ff78 000b0c78
0184ff7c 00a50180
0184ff80 0184fe38
0184ff84 0184ff8c
0184ff88 77c5778f RPCRT4!RecvLotsaCallsWrapper+0xd
0184ff8c 0184ffac
0184ff90 77c5f7dd RPCRT4!BaseCachedThreadRoutine+0x9d
0184ff94 0009c410
0184ff98 00000000
0184ff9c 00000000
0184ffa0 00097df0
0184ffa4 00097df0
0184ffa8 00015f90
0184ffac 0184ffb8
0184ffb0 77c5de88 RPCRT4!ThreadStartRoutine+0x1b
0184ffb4 00088258
0184ffb8 0184ffec
0184ffbc 77e6608b kernel32!BaseThreadStart+0x34
0184ffc0 00097df0
0184ffc4 00000000
0184ffc8 00000000
0184ffcc 00097df0
0184ffd0 8ad84818
0184ffd4 0184ffc4
0184ffd8 8980a700
0184ffdc ffffffff
0184ffe0 77e6b7d0 kernel32!_except_handler3
0184ffe4 77e66098 kernel32!`string'+0x98
0184ffe8 00000000
0184ffec 00000000
0184fff0 00000000
0184fff4 77c5de6d RPCRT4!ThreadStartRoutine
0184fff8 00097df0
0184fffc 00000000
01850000 00000008

Next we need to use the k command with a custom base pointer. In our case the last stack address found that links EBP pointers is 0184eef0:

0:011> k L=0184eef0
ChildEBP RetAddr
WARNING: Frame IP not in any known module. Following frames may be wrong.
0184eef0 7615dff2 0x184e5bf
0184ef0c 7615f9d0 localspl!SplDriverEvent+0x21
0184ef30 7614a9b4 localspl!PrinterDriverEvent+0x46
0184f3f8 761482de localspl!SplAddPrinter+0x5f3
0184f424 74067c8f localspl!LocalAddPrinterEx+0x2e
0184f874 74067b76 SPOOLSS!AddPrinterExW+0x151
0184f890 01007e29 SPOOLSS!AddPrinterW+0x17
0184f8ac 01006ec3 spoolsv!YAddPrinter+0x75
0184f8d0 77c70f3b spoolsv!RpcAddPrinter+0x37
0184f8f8 77ce23f7 RPCRT4!Invoke+0x30
0184fcf8 77ce26ed RPCRT4!NdrStubCall2+0x299
0184fd14 77c709be RPCRT4!NdrServerCall2+0x19
0184fd48 77c7093f RPCRT4!DispatchToStubInCNoAvrf+0x38
0184fd9c 77c70865 RPCRT4!RPC_INTERFACE::DispatchToStubWorker+0x117
0184fdc0 77c734b1 RPCRT4!RPC_INTERFACE::DispatchToStub+0xa3
0184fdfc 77c71bb3 RPCRT4!LRPC_SCALL::DealWithRequestMessage+0x42c
0184fe20 77c75458 RPCRT4!LRPC_ADDRESS::DealWithLRPCRequest+0x127
0184ff84 77c5778f RPCRT4!LRPC_ADDRESS::ReceiveLotsaCalls+0x430
0184ff8c 77c5f7dd RPCRT4!RecvLotsaCallsWrapper+0xd

The stack trace makes more sense now, but we don’t see BaseThreadStart+0x34. By default WinDbg displays only a certain number of function calls (stack frames), so we need to specify the stack frame count, for example, 100:

0:011> k L=0184eef0 100
ChildEBP RetAddr
WARNING: Frame IP not in any known module. Following frames may be wrong.
0184eef0 7615dff2 0x184e5bf
0184ef0c 7615f9d0 localspl!SplDriverEvent+0x21
0184ef30 7614a9b4 localspl!PrinterDriverEvent+0x46
0184f3f8 761482de localspl!SplAddPrinter+0x5f3
0184f424 74067c8f localspl!LocalAddPrinterEx+0x2e
0184f874 74067b76 SPOOLSS!AddPrinterExW+0x151
0184f890 01007e29 SPOOLSS!AddPrinterW+0x17
0184f8ac 01006ec3 spoolsv!YAddPrinter+0x75
0184f8d0 77c70f3b spoolsv!RpcAddPrinter+0x37
0184f8f8 77ce23f7 RPCRT4!Invoke+0x30
0184fcf8 77ce26ed RPCRT4!NdrStubCall2+0x299
0184fd14 77c709be RPCRT4!NdrServerCall2+0x19
0184fd48 77c7093f RPCRT4!DispatchToStubInCNoAvrf+0x38
0184fd9c 77c70865 RPCRT4!RPC_INTERFACE::DispatchToStubWorker+0x117
0184fdc0 77c734b1 RPCRT4!RPC_INTERFACE::DispatchToStub+0xa3
0184fdfc 77c71bb3 RPCRT4!LRPC_SCALL::DealWithRequestMessage+0x42c
0184fe20 77c75458 RPCRT4!LRPC_ADDRESS::DealWithLRPCRequest+0x127
0184ff84 77c5778f RPCRT4!LRPC_ADDRESS::ReceiveLotsaCalls+0x430
0184ff8c 77c5f7dd RPCRT4!RecvLotsaCallsWrapper+0xd
0184ffac 77c5de88 RPCRT4!BaseCachedThreadRoutine+0x9d
0184ffb8 77e6608b RPCRT4!ThreadStartRoutine+0x1b
0184ffec 00000000 kernel32!BaseThreadStart+0x34

Now the stack trace looks much better.
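The linked EBP values can also be followed mechanically. Here is a minimal Python sketch of such a walk, assuming the conventional unoptimized x86 frame layout where [ebp] holds the caller's EBP and [ebp+4] holds the return address. The stack slots are copied by hand from the dds output above, and the function name walk_ebp_chain is made up for this example.

```python
# Stack slots copied from the dds output above (address -> 32-bit value).
# Only the slots needed for the last five frames are included.
STACK = {
    0x0184FF84: 0x0184FF8C, 0x0184FF88: 0x77C5778F,
    0x0184FF8C: 0x0184FFAC, 0x0184FF90: 0x77C5F7DD,
    0x0184FFAC: 0x0184FFB8, 0x0184FFB0: 0x77C5DE88,
    0x0184FFB8: 0x0184FFEC, 0x0184FFBC: 0x77E6608B,
    0x0184FFEC: 0x00000000, 0x0184FFF0: 0x00000000,
}


def walk_ebp_chain(stack, ebp, max_frames=100):
    """Follow saved-EBP links and return (ChildEBP, RetAddr) pairs.

    Assumes [ebp] holds the caller's EBP and [ebp+4] holds the
    return address, which is true only for frames that keep EBP
    as a frame pointer.
    """
    frames = []
    while ebp in stack and len(frames) < max_frames:
        saved_ebp = stack[ebp]
        frames.append((ebp, stack.get(ebp + 4, 0)))
        # A valid link must point to a higher address (an older frame);
        # zero or a backward link ends the walk.
        if saved_ebp <= ebp:
            break
        ebp = saved_ebp
    return frames


if __name__ == "__main__":
    print("ChildEBP RetAddr")
    for child_ebp, ret_addr in walk_ebp_chain(STACK, 0x0184FF84):
        print("%08x %08x" % (child_ebp, ret_addr))
```

Starting from 0184ff84, the walk produces the same ChildEBP and RetAddr pairs as the last five frames of the k output. Going in the opposite direction, from the bottom of the stack towards the top, is less mechanical because a stale copy of a frame address can remain on the stack: in the dump above, 0184ff30 also holds 0184ff84.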

- Dmitry Vostokov @ DumpAnalysis.org -

Posted in Crash Dump Analysis, Crash Dump Patterns, WinDbg Tips and Tricks | 23 Comments »

StressPrinters: Stressing Printer Autocreation

April 3rd, 2007

Printer drivers are a great source of crash dumps, especially in Citrix and Microsoft terminal services environments. Bad printer drivers crash or hang the spooler service (spoolsv.exe) when multiple users connect to a server.

Most bad drivers were designed and implemented for use in a single-user environment without multithreading in mind. Some bad drivers display a dialog box every time a printer is created, and because this happens on the server side, users cannot dismiss it unless the spooler service is configured to interact with the desktop and an administrator sees the dialog box. Some drivers are linked to a debug run-time library, and every exception brings up a dialog, effectively hanging the thread and sometimes the whole spooler service if there was heap corruption, for example.

Therefore, before allowing terminal services users to use certain printers, it is good to simulate multiple users trying to create those printers in order to identify bad drivers and other printer components. Originally Citrix had a very popular command line AddPrinter tool for this purpose. It has been replaced by the StressPrinters tool, for which I designed and implemented the GUI to set parameters, the orchestration of multiple AddPrinter command line tools launched simultaneously with different parameters, and overall log file management. You can even export your settings to a file and import them on another server. The tool also has 64-bit executables to test printer autocreation on x64 Windows.

Here are some screenshots:

The tool detects spooler crashes (if spoolsv.exe suddenly disappears from the process list), so you can check for saved crash dumps if you have set up a default postmortem debugger (Dr. Watson or NTSD). If you see the progress bar hanging for a long time, you can dump the spooler service using MS userdump.exe to check for any stuck threads and resource contention.

You can register for free to read the documentation and download this tool from the Citrix support web site:

StressPrinters 1.2 for 32-bit and 64-bit platforms

- Dmitry Vostokov -

Posted in Announcements, Citrix, Tools | 4 Comments »

Crash Dump Analysis Patterns (Part 10)

March 19th, 2007

Sometimes a change of operating system version or the installation of an intrusive product reveals hidden bugs in software that was working perfectly before.

What happens after installing new software? If you look at a process dump you will see many DLLs loaded at their specific virtual addresses. Here is the output from the lm WinDbg command after attaching to an iexplore.exe process running on my Windows XP SP2 workstation:

0:000> lm
start    end      module name
00400000 00419000 iexplore
01c80000 01d08000 shdoclc
01d10000 01fd5000 xpsp2res
022b0000 022cd000 xpsp3res
02680000 02946000 msi
031f0000 031fd000 LvHook
03520000 03578000 PortableDeviceApi
037e0000 037f7000 odbcint
0ffd0000 0fff8000 rsaenh
20000000 20012000 browselc
30000000 302ee000 Flash9b
325c0000 325d2000 msohev
4d4f0000 4d548000 WINHTTP
5ad70000 5ada8000 UxTheme
5b860000 5b8b4000 NETAPI32
5d090000 5d12a000 comctl32_5d090000
5e310000 5e31c000 pngfilt
63000000 63014000 SynTPFcs
662b0000 66308000 hnetcfg
66880000 6688c000 ImgUtil
6bdd0000 6be06000 dxtrans
6be10000 6be6a000 dxtmsft
6d430000 6d43a000 ddrawex
71a50000 71a8f000 mswsock
71a90000 71a98000 wshtcpip
71aa0000 71aa8000 WS2HELP
71ab0000 71ac7000 WS2_32
71ad0000 71ad9000 wsock32
71b20000 71b32000 MPR
71bf0000 71c03000 SAMLIB
71c10000 71c1e000 ntlanman
71c80000 71c87000 NETRAP
71c90000 71cd0000 NETUI1
71cd0000 71ce7000 NETUI0
71d40000 71d5c000 actxprxy
722b0000 722b5000 sensapi
72d10000 72d18000 msacm32
72d20000 72d29000 wdmaud
73300000 73367000 vbscript
73760000 737a9000 DDRAW
73bc0000 73bc6000 DCIMAN32
73dd0000 73ece000 MFC42
74320000 7435d000 ODBC32
746c0000 746e7000 msls31
746f0000 7471a000 msimtf
74720000 7476b000 MSCTF
754d0000 75550000 CRYPTUI
75970000 75a67000 MSGINA
75c50000 75cbe000 jscript
75cf0000 75d81000 mlang
75e90000 75f40000 SXS
75f60000 75f67000 drprov
75f70000 75f79000 davclnt
75f80000 7607d000 BROWSEUI
76200000 76271000 mshtmled
76360000 76370000 WINSTA
76390000 763ad000 IMM32
763b0000 763f9000 comdlg32
76600000 7661d000 CSCDLL
767f0000 76817000 schannel
769c0000 76a73000 USERENV
76b20000 76b31000 ATL
76b40000 76b6d000 WINMM
76bf0000 76bfb000 PSAPI
76c30000 76c5e000 WINTRUST
76c90000 76cb8000 IMAGEHLP
76d60000 76d79000 iphlpapi
76e80000 76e8e000 rtutils
76e90000 76ea2000 rasman
76eb0000 76edf000 TAPI32
76ee0000 76f1c000 RASAPI32
76f20000 76f47000 DNSAPI
76f60000 76f8c000 WLDAP32
76fc0000 76fc6000 rasadhlp
76fd0000 7704f000 CLBCATQ
77050000 77115000 COMRes
77120000 771ac000 OLEAUT32
771b0000 77256000 WININET
773d0000 774d3000 comctl32
774e0000 7761d000 ole32
77920000 77a13000 SETUPAPI
77a20000 77a74000 cscui
77a80000 77b14000 CRYPT32
77b20000 77b32000 MSASN1
77b40000 77b62000 appHelp
77bd0000 77bd7000 midimap
77be0000 77bf5000 MSACM32_77be0000
77c00000 77c08000 VERSION
77c10000 77c68000 msvcrt
77c70000 77c93000 msv1_0
77d40000 77dd0000 USER32
77dd0000 77e6b000 ADVAPI32
77e70000 77f01000 RPCRT4
77f10000 77f57000 GDI32
77f60000 77fd6000 SHLWAPI
77fe0000 77ff1000 Secur32
7c800000 7c8f4000 kernel32
7c900000 7c9b0000 ntdll
7c9c0000 7d1d5000 SHELL32
7dc30000 7df20000 mshtml
7e1e0000 7e280000 urlmon
7e290000 7e3ff000 SHDOCVW

Installing or upgrading software can change the set of loaded DLLs and their addresses. This also happens when you install monitoring software, which usually injects its DLLs into every process. As a result, some DLLs might be relocated or new ones might appear. This might influence a 3rd-party program’s behavior, exposing hidden bugs that were dormant when the process ran in the old environment. I call this pattern Changed Environment.
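One way to spot such a change is to compare the lm output saved before and after the installation. The Python sketch below parses the three-column "start end module" format shown above and reports new, missing and relocated modules. The BEFORE lines are taken from the listing above, while the AFTER listing, including the NewComponent module and the new Flash9b address, is invented for illustration.

```python
def parse_lm(text):
    """Parse 'start end module' lines from lm output into a dict."""
    modules = {}
    for line in text.splitlines():
        parts = line.split()
        if len(parts) < 3:
            continue  # prompt or blank line
        try:
            start, end = int(parts[0], 16), int(parts[1], 16)
        except ValueError:
            continue  # header line
        modules[parts[2]] = (start, end)
    return modules


def compare_layouts(before, after):
    """Return (added, removed, relocated) module name lists."""
    added = sorted(set(after) - set(before))
    removed = sorted(set(before) - set(after))
    relocated = sorted(
        name for name in set(before) & set(after)
        if before[name][0] != after[name][0]
    )
    return added, removed, relocated


# Three lines from the listing above.
BEFORE = """0:000> lm
start    end      module name
30000000 302ee000 Flash9b
63000000 63014000 SynTPFcs
7c900000 7c9b0000 ntdll
"""

# Hypothetical listing after installing new software (invented values).
AFTER = """0:000> lm
start    end      module name
03800000 03aee000 Flash9b
30000000 30100000 NewComponent
63000000 63014000 SynTPFcs
7c900000 7c9b0000 ntdll
"""

if __name__ == "__main__":
    added, removed, relocated = compare_layouts(parse_lm(BEFORE), parse_lm(AFTER))
    print("added:", added)
    print("removed:", removed)
    print("relocated:", relocated)
```

The comparison is by module name only, so a module loaded twice under different names (such as comctl32 and comctl32_5d090000 above) is treated as two separate modules.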

Let’s look at a hypothetical example. Suppose your program has the following code fragment:

if (*p)
{
   // do something useful
}

Suppose the pointer p is invalid: it is dangling or its value has been overwritten because of some bug. Even though it is invalid, the pointer can still point to a valid memory location, and the value it points to is most likely non-zero. Therefore the body of the “if” statement will be executed. Suppose this happens every time you run the program, and every time the value of the pointer happens to be the same. Here is the picture illustrating the point:

For some reason the pointer value 0x40010024 always points to the value 0x00BADBAD. Although in the correct program the pointer itself should have had a completely different value and pointed to 0x1, for example, we see that dereferencing its current invalid value doesn’t crash the process.

After installing the new software, the NewComponent DLL is loaded at the address range previously occupied by ComponentC:

Now the address 0x40010024 happens to be completely invalid, and we get an access violation and a crash dump.
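The effect can be imitated with a toy model of the process address space. In the Python sketch below, the pointer value 0x40010024 and the value 0x00BADBAD come from the example above, but the module address ranges are invented, and read_dword and AccessViolation are made-up names. It is a simulation for illustration and does not touch real process memory.

```python
class AccessViolation(Exception):
    """Raised when an address is not inside any mapped region."""


def read_dword(regions, memory, address):
    """Return the value at address if it lies in a mapped region."""
    for start, end in regions.values():
        if start <= address < end:
            # Unwritten locations read as 0 in this toy model.
            return memory.get(address, 0)
    raise AccessViolation("read at 0x%08x" % address)


P = 0x40010024  # the stale pointer value from the text

# Hypothetical layouts: the module ranges are invented for illustration.
OLD_LAYOUT = {"ComponentC": (0x40000000, 0x40020000)}
NEW_LAYOUT = {
    "NewComponent": (0x40000000, 0x40010000),
    "ComponentC": (0x50000000, 0x50020000),  # relocated
}
OLD_MEMORY = {P: 0x00BADBAD}
NEW_MEMORY = {}

if __name__ == "__main__":
    # Old environment: the stale pointer happens to hit mapped memory,
    # so the body of the "if" statement runs and the bug stays hidden.
    if read_dword(OLD_LAYOUT, OLD_MEMORY, P):
        print("old environment: body of the if statement executes")

    # New environment: the same pointer value is no longer mapped.
    try:
        read_dword(NEW_LAYOUT, NEW_MEMORY, P)
    except AccessViolation as error:
        print("new environment: access violation,", error)
```

The program logic is identical in both runs. Only the layout of the address space differs, which is why the crash appears to be caused by the newly installed software.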

- Dmitry Vostokov @ DumpAnalysis.org -

Posted in Crash Dump Analysis, Crash Dump Patterns | 6 Comments »
