Archive for the ‘Assembly Language’ Category

Crash Dump Analysis Patterns (Part 36)

Wednesday, November 14th, 2007

The pattern I should have written about as one of the first is called Local Buffer Overflow. It is observed on x86 platforms when a local variable and a function return address and/or the saved frame pointer EBP are overwritten with some data. As a result, the instruction pointer EIP becomes a Wild Pointer and we have a process crash in user mode or a bugcheck in kernel mode. Sometimes this pattern is diagnosed by looking at mismatched EBP and ESP values. In the case of an ASCII or UNICODE buffer overflow, the EIP register may contain a 4-char or 2-wchar_t value, and ESP, EBP or both registers might point at some string fragment, as in the example below:

0:000> r
eax=000fa101 ebx=0000c026 ecx=01010001 edx=bd43a010 esi=000003e0 edi=00000000
eip=0048004a esp=0012f158 ebp=00510044 iopl=0  nv up ei pl nz na po nc
cs=001b ss=0023 ds=0023 es=0023 fs=0038 gs=0000 efl=00000202
0048004a 0000 add     byte ptr [eax],al  ds:0023:000fa101=??

0:000> kL
ChildEBP RetAddr 
WARNING: Frame IP not in any known module. Following frames may be wrong.
0012f154 00420047 0x48004a
0012f158 00440077 0x420047
0012f15c 00420043 0x440077
0012f160 00510076 0x420043
0012f164 00420049 0x510076
0012f168 00540041 0x420049
0012f16c 00540041 0x540041
...
...
...
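
To make the 2-wchar_t observation concrete, here is a minimal sketch (not part of the original analysis) that reinterprets a 32-bit register value as two little-endian UTF-16 code units. The function name utf16_pair_from_register is hypothetical, and the sketch assumes the overflowing string was plain printable ASCII text stored as UNICODE.

```cpp
#include <cstdint>
#include <string>

// Interpret a 32-bit register value as two little-endian UTF-16 code units.
// Returns the two characters in memory order, or an empty string if either
// unit is not printable ASCII (the value is then unlikely to be text).
std::string utf16_pair_from_register(std::uint32_t value) {
    const std::uint16_t units[2] = {
        static_cast<std::uint16_t>(value & 0xFFFFu),  // stored at the lower address
        static_cast<std::uint16_t>(value >> 16)       // stored at the higher address
    };
    std::string text;
    for (std::uint16_t unit : units) {
        if (unit < 0x20 || unit > 0x7E) {
            return std::string();
        }
        text.push_back(static_cast<char>(unit));
    }
    return text;
}
```

Applied to the register dump above, EIP (0048004a) decodes to "JH" and EBP (00510044) decodes to "DQ", which is consistent with both having been loaded from an overwritten UNICODE string.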

Good buffer overflow case studies with complete analysis, including an assembly language tutorial, can be found in the book Buffer Overflow Attacks.


- Dmitry Vostokov @ DumpAnalysis.org -

XEN from a system programmer’s perspective

Wednesday, October 3rd, 2007

I’m trying to educate myself about virtualization and XEN and found the book The Definitive Guide to the Xen Hypervisor written by David Chisnall that is about to be published:

Buy from Amazon

Table of Contents

I’ll write a review once I get it and read it.  

- Dmitry Vostokov @ DumpAnalysis.org -

BIOS Internals

Friday, August 10th, 2007

The life of an OS starts with the BIOS. If you are curious about BIOS technology, x86 computer architecture and security, the following book, which I recently discovered and bought, will help you:

BIOS Disassembly Ninjutsu Uncovered

- Dmitry Vostokov @ DumpAnalysis.org -

Reconstructing Stack Trace Manually

Wednesday, July 25th, 2007

This is a small case study to complement the Incorrect Stack Trace pattern and show how to reconstruct a stack trace manually, based on an example with complete source code.

I created a small working multithreaded program:

#include "stdafx.h"
#include <stdio.h>
#include <string.h>   // memcpy
#include <process.h>

typedef void (*REQ_JUMP)();
typedef void (*REQ_RETURN)();

const char str[] = "\0\0\0\0\0\0\0";

bool loop = true;

void return_func()
{
  puts("Return Func");
  loop = false;
  _endthread();
}

void jump_func()
{
  puts("Jump Func");
}

void internal_func_2(void *param_jump,void *param_return)
{
  REQ_JUMP f_jmp = (REQ_JUMP)param_jump;
  REQ_RETURN f_ret = (REQ_RETURN)param_return;

  puts("Internal Func 2");
  // Uncomment memcpy to crash the program
  // Overwrite f_jmp and f_ret with NULL
  // memcpy(&f_ret, str, sizeof(str));
  __asm
  {
     push f_ret;
     mov  eax, f_jmp
     mov  ebp, 0 // use ebp as a general purpose register
     jmp  eax
  }
}

void internal_func_1(void *param)
{
  puts("Internal Func 1");
  internal_func_2(param, &return_func);
}

void thread_request(void *param)
{
  puts("Request");
  internal_func_1(param);
}

int _tmain(int argc, _TCHAR* argv[])
{
  _beginthread(thread_request, 0, (void *)jump_func);
  while (loop);
  return 0;
}

To build it I had to disable optimizations in the Visual C++ compiler; otherwise most of the code would have been eliminated because the program is very small and easy for the code optimizer to simplify. If we run the program it displays the following output:

Request
Internal Func 1
Internal Func 2
Jump Func
Return Func

internal_func_2 gets two parameters: the address of the function to jump to and the address of the function to call upon return. The latter sets the loop variable to false in order to break the infinite main thread loop and calls _endthread. Why such complexity in so small a sample? I wanted to simulate FPO optimization in an inner function call and also gain control over the return address. This is why I set EBP to zero before jumping and pushed a custom return address which I can change at any time. If I had used the call instruction, the processor would have pushed the address of the next instruction as the return address.

The code also copies the two internal_func_2 parameters into local variables f_jmp and f_ret because the commented-out memcpy call is crafted to overwrite them with zeroes without touching the saved EBP, the return address and the function arguments. All this makes the stack trace incorrect but at the same time makes manual stack reconstruction as easy as possible in this example.
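
The effect of the commented-out memcpy can be modelled portably, without inline assembly. This is only a sketch under stated assumptions: the hypothetical FrameLocals struct stands in for the two adjacent 4-byte locals of the x86 build (f_ret at the lower address), and overwrite_locals is an illustrative helper, not code from the sample program.

```cpp
#include <cstdint>
#include <cstring>

// Stand-in for the two adjacent 4-byte locals of internal_func_2 on x86:
// f_ret at the lower address, f_jmp directly above it.
struct FrameLocals {
    std::uint32_t f_ret;
    std::uint32_t f_jmp;
};

// Seven explicit '\0' characters plus the implicit terminator give 8 bytes.
const char zeros[] = "\0\0\0\0\0\0\0";

// Copies sizeof(zeros) bytes over the locals, as the sample's memcpy does.
FrameLocals overwrite_locals(FrameLocals locals) {
    static_assert(sizeof(zeros) == sizeof(FrameLocals),
                  "the copy covers both locals and nothing else");
    std::memcpy(&locals, zeros, sizeof(zeros));
    return locals;
}
```

The static_assert captures the design point of the sample: the source buffer is exactly as large as the two locals together, so the saved EBP and the return address above them stay intact.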

Let’s suppose that the memcpy call is a bug that overwrites local variables. Then we obviously have a crash because EAX is zero and a jump to address zero causes an access violation. EBP is also 0 because we assigned 0 to it explicitly. Let’s pretend that we wanted to pass some constant via EBP and it happened to be zero.

What we have now:

EBP is 0
EIP is 0
the return address is 0

As you might have expected, when you load the crash dump WinDbg is utterly confused because it has no clue how to reconstruct the stack trace:

This dump file has an exception of interest stored in it.
The stored exception information can be accessed via .ecxr.
(bd0.ec8): Access violation - code c0000005 (first/second chance not available)
eax=00000000 ebx=00595620 ecx=00000002 edx=00000000 esi=00000000 edi=00000000
eip=00000000 esp=0069ff54 ebp=00000000 iopl=0 nv up ei pl nz ac po nc
cs=0023 ss=002b ds=002b es=002b fs=0053 gs=002b efl=00010212
00000000 ??              ???

0:001> kv
ChildEBP RetAddr  Args to Child
WARNING: Frame IP not in any known module. Following frames may be wrong.
0069ff50 00000000 00000000 00000000 0069ff70 0x0

Fortunately ESP is not zero so we can look at raw stack:

0:001> dds esp
0069ff54  00000000
0069ff58  00000000
0069ff5c  00000000
0069ff60  0069ff70
0069ff64  0040187f WrongIP!internal_func_1+0x1f
0069ff68  00401830 WrongIP!jump_func
0069ff6c  00401840 WrongIP!return_func
0069ff70  0069ff7c
0069ff74  0040189c WrongIP!thread_request+0xc
0069ff78  00401830 WrongIP!jump_func
0069ff7c  0069ffb4
0069ff80  78132848 msvcr80!_endthread+0x4b
0069ff84  00401830 WrongIP!jump_func
0069ff88  aa75565b
0069ff8c  00000000
0069ff90  00000000
0069ff94  00595620
0069ff98  c0000005
0069ff9c  0069ff88
0069ffa0  0069fb34
0069ffa4  0069ffdc
0069ffa8  78138cd9 msvcr80!_except_handler4
0069ffac  d207e277
0069ffb0  00000000
0069ffb4  0069ffec
0069ffb8  781328c8 msvcr80!_endthread+0xcb
0069ffbc  7d4dfe21 kernel32!BaseThreadStart+0x34
0069ffc0  00595620
0069ffc4  00000000
0069ffc8  00000000
0069ffcc  00595620
0069ffd0  c0000005

Here we can start searching for the following pairs:

EBP:         PreviousEBP
             Function return address



PreviousEBP: PrePreviousEBP
             Function return address


for example:

0:001> dds esp
0069ff54 00000000
0069ff58 00000000
0069ff5c 00000000
0069ff60 0069ff70
0069ff64 0040187f WrongIP!internal_func_1+0x1f
0069ff68 00401830 WrongIP!jump_func
0069ff6c 00401840 WrongIP!return_func
0069ff70 0069ff7c
0069ff74 0040189c WrongIP!thread_request+0xc
0069ff78 00401830 WrongIP!jump_func
0069ff7c 0069ffb4

This is based on the fact that a function call saves its return address, and the standard function prolog then saves the previous EBP value and sets EBP to point to it:

push ebp
mov ebp, esp

Therefore our stack looks like this:

0:001> dds esp
0069ff54 00000000
0069ff58 00000000
0069ff5c 00000000
0069ff60 0069ff70
0069ff64 0040187f WrongIP!internal_func_1+0x1f
0069ff68 00401830 WrongIP!jump_func
0069ff6c 00401840 WrongIP!return_func
0069ff70 0069ff7c
0069ff74 0040189c WrongIP!thread_request+0xc
0069ff78 00401830 WrongIP!jump_func
0069ff7c 0069ffb4
0069ff80 78132848 msvcr80!_endthread+0x4b
0069ff84 00401830 WrongIP!jump_func
0069ff88 aa75565b
0069ff8c 00000000
0069ff90 00000000
0069ff94 00595620
0069ff98 c0000005
0069ff9c 0069ff88
0069ffa0 0069fb34
0069ffa4 0069ffdc
0069ffa8 78138cd9 msvcr80!_except_handler4
0069ffac d207e277
0069ffb0 00000000
0069ffb4 0069ffec
0069ffb8 781328c8 msvcr80!_endthread+0xcb
0069ffbc 7d4dfe21 kernel32!BaseThreadStart+0x34
0069ffc0 00595620
0069ffc4 00000000
0069ffc8 00000000
0069ffcc 00595620
0069ffd0 c0000005
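
The search for EBP:PreviousEBP pairs can be sketched as a small frame-chain walker. This is a simplified model, not what WinDbg does internally: the stack is represented as a hypothetical map from address to the double word stored there, and walk_ebp_chain is an illustrative name.

```cpp
#include <cstdint>
#include <map>
#include <vector>

// Model of the raw stack: address -> double word stored at that address.
typedef std::map<std::uint32_t, std::uint32_t> Stack;

// Follow saved-EBP links starting at frame_ptr and collect the return
// address stored right above each saved EBP.
std::vector<std::uint32_t> walk_ebp_chain(const Stack& stack,
                                          std::uint32_t frame_ptr) {
    std::vector<std::uint32_t> return_addresses;
    for (;;) {
        Stack::const_iterator saved_ebp = stack.find(frame_ptr);
        Stack::const_iterator ret = stack.find(frame_ptr + 4);
        if (saved_ebp == stack.end() || ret == stack.end()) {
            break;  // ran off the part of the stack we know about
        }
        return_addresses.push_back(ret->second);
        if (saved_ebp->second <= frame_ptr) {
            break;  // a valid chain always moves towards higher addresses
        }
        frame_ptr = saved_ebp->second;
    }
    return return_addresses;
}
```

Starting at 0069ff60 with the values from the dump above, the walker visits 0069ff70, 0069ff7c and 0069ffb4 and returns the four return addresses stored next to those saved EBP values.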

We also double check the return addresses to see whether they point to valid code. The best way is to disassemble backwards from them. This should show the call instructions that resulted in the saved return addresses:

0:001> ub WrongIP!internal_func_1+0x1f
WrongIP!internal_func_1+0x1:
00401871 mov     ebp,esp
00401873 push    offset WrongIP!GS_ExceptionPointers+0x38 (00402124)
00401878 call    dword ptr [WrongIP!_imp__puts (004020ac)]
0040187e add     esp,4
00401881 push    offset WrongIP!return_func (00401850)
00401886 mov     eax,dword ptr [ebp+8]
00401889 push    eax
0040188a call    WrongIP!internal_func_2 (004017e0)

0:001> ub WrongIP!thread_request+0xc
WrongIP!internal_func_1+0x2d:
0040189d int     3
0040189e int     3
0040189f int     3
WrongIP!thread_request:
004018a0 push    ebp
004018a1 mov     ebp,esp
004018a3 mov     eax,dword ptr [ebp+8]
004018a6 push    eax
004018a7 call    WrongIP!internal_func_1 (00401870)

0:001> ub msvcr80!_endthread+0x4b
msvcr80!_endthread+0x2f:
7813282c pop     esi
7813282d push    0Ch
7813282f push    offset msvcr80!__rtc_tzz+0x64 (781b4b98)
78132834 call    msvcr80!_SEH_prolog4 (78138c80)
78132839 call    msvcr80!_getptd (78132e29)
7813283e and     dword ptr [ebp-4],0
78132842 push    dword ptr [eax+58h]
78132845 call    dword ptr [eax+54h]

0:001> ub msvcr80!_endthread+0xcb
msvcr80!_endthread+0xaf:
781328ac mov     edx,dword ptr [ecx+58h]
781328af mov     dword ptr [eax+58h],edx
781328b2 mov     edx,dword ptr [ecx+4]
781328b5 push    ecx
781328b6 mov     dword ptr [eax+4],edx
781328b9 call    msvcr80!_freefls (78132e41)
781328be call    msvcr80!_initp_misc_winxfltr (781493c1)
781328c3 call    msvcr80!_endthread+0x30 (7813282d)

0:001> ub BaseThreadStart+0x34
kernel32!BaseThreadStart+0x10:
7d4dfdfd mov     eax,dword ptr fs:[00000018h]
7d4dfe03 cmp     dword ptr [eax+10h],1E00h
7d4dfe0a jne     kernel32!BaseThreadStart+0x2e (7d4dfe1b)
7d4dfe0c cmp     byte ptr [kernel32!BaseRunningInServerProcess (7d560008)],0
7d4dfe13 jne     kernel32!BaseThreadStart+0x2e (7d4dfe1b)
7d4dfe15 call    dword ptr [kernel32!_imp__CsrNewThread (7d4d0310)]
7d4dfe1b push    dword ptr [ebp+0Ch]
7d4dfe1e call    dword ptr [ebp+8]

Now we can use the extended version of the k command and supply custom EBP, ESP and EIP values. We set both EBP and ESP to the address of the first EBP:PreviousEBP pair we found (0069ff60) and set EIP to 0:

0:001> k L=0069ff60 0069ff60 0
ChildEBP RetAddr
WARNING: Frame IP not in any known module. Following frames may be wrong.
0069ff5c 0069ff70 0x0
0069ff60 0040188f 0x69ff70
0069ff70 004018ac WrongIP!internal_func_1+0x1f
0069ff7c 78132848 WrongIP!thread_request+0xc
0069ffb4 781328c8 msvcr80!_endthread+0x4b
0069ffb8 7d4dfe21 msvcr80!_endthread+0xcb
0069ffec 00000000 kernel32!BaseThreadStart+0x34

The stack trace looks good because it ends with BaseThreadStart, the thread start function.

From the backwards disassembly of the return address WrongIP!internal_func_1+0x1f we see that internal_func_1 calls internal_func_2, so we can disassemble the latter function:

0:001> uf internal_func_2
Flow analysis was incomplete, some code may be missing
WrongIP!internal_func_2:
   28 004017e0 push    ebp
   28 004017e1 mov     ebp,esp

   28 004017e3 sub     esp,8
   29 004017e6 mov     eax,dword ptr [ebp+8]
   29 004017e9 mov     dword ptr [ebp-4],eax

   30 004017ec mov     ecx,dword ptr [ebp+0Ch]
   30 004017ef mov     dword ptr [ebp-8],ecx
   32 004017f2 push    offset WrongIP!GS_ExceptionPointers+0x28 (00402114)
   32 004017f7 call    dword ptr [WrongIP!_imp__puts (004020ac)]
   32 004017fd add     esp,4
   33 00401800 push    8
   33 00401802 push    offset WrongIP!GS_ExceptionPointers+0x8 (004020f4)
   33 00401807 lea     edx,[ebp-8]
   33 0040180a push    edx
   33 0040180b call    WrongIP!memcpy (00401010)
   33 00401810 add     esp,0Ch
   35 00401813 push    dword ptr [ebp-8]
   36 00401816 mov     eax,dword ptr [ebp-4]
   37 00401819 mov     ebp,0
   38 0040181e jmp     eax

We see that it pushes the value from [ebp-8], loads the value from [ebp-4] into EAX and then jumps to that address. The function uses the standard prolog (push ebp followed by mov ebp,esp), so EBP-4 is a local variable. From the code we see that its value comes from [EBP+8], which is the first function parameter:

EBP+C: second parameter
EBP+8: first parameter
EBP+4: return address
EBP:   previous EBP
EBP-4: local variable
EBP-8: local variable

If we examine the first parameter we would see it is a valid function address that we were supposed to call:

0:001> kv L=0069ff60 0069ff60 0
ChildEBP RetAddr  Args to Child
WARNING: Frame IP not in any known module. Following frames may be wrong.
0069ff5c 0069ff70 0040188f 00401830 00401850 0x0
0069ff60 0040188f 00401830 00401850 0069ff7c 0x69ff70
0069ff70 004018ac 00401830 0069ffb4 78132848 WrongIP!internal_func_1+0x1f
0069ff7c 78132848 00401830 6d5ba283 00000000 WrongIP!thread_request+0xc
0069ffb4 781328c8 7d4dfe21 00595620 00000000 msvcr80!_endthread+0x4b
0069ffb8 7d4dfe21 00595620 00000000 00000000 msvcr80!_endthread+0xcb
0069ffec 00000000 7813286e 00595620 00000000 kernel32!BaseThreadStart+0x34

0:001> u 00401830
WrongIP!jump_func:
00401830 push    ebp
00401831 mov     ebp,esp
00401833 push    offset WrongIP!GS_ExceptionPointers+0x1c (00402108)
00401838 call    dword ptr [WrongIP!_imp__puts (004020ac)]
0040183e add     esp,4
00401841 pop     ebp
00401842 ret
00401843 int     3

However if we look at the code we would see that we call memcpy with ebp-8 address and the number of bytes to copy is 8. In pseudo-code it would look like:

memcpy(ebp-8, 004020f4, 8);

   33 00401800 push    8
   33 00401802 push    offset WrongIP!GS_ExceptionPointers+0x8 (004020f4)
   33 00401807 lea     edx,[ebp-8]
   33 0040180a push    edx
   33 0040180b call    WrongIP!memcpy (00401010)
   33 00401810 add     esp,0Ch

If we examine 004020f4 address we would see that it contains 8 zeroes:

0:001> db 004020f4 l8
004020f4  00 00 00 00 00 00 00 00
       

Therefore memcpy overwrites our local variables that contain a jump address with zeroes. This explains why we have jumped to 0 address and why EIP was zero.

Finally our reconstructed stack trace looks like this:

WrongIP!internal_func_2+offset ; here we jump
WrongIP!internal_func_1+0x1f
WrongIP!thread_request+0xc
msvcr80!_endthread+0x4b
msvcr80!_endthread+0xcb
kernel32!BaseThreadStart+0x34

This was based on the fact that ESP was valid. If we have a zero or invalid ESP we can look at the entire raw stack range from the thread environment block (TEB). Use the !teb command to get the thread stack range. In my example this command doesn’t work due to the lack of proper MS symbols, but it reports the TEB address and we can dump it:

0:001> !teb
TEB at 7efda000
error InitTypeRead( TEB )...

0:001> dd 7efda000 l3
7efda000 0069ffa4 006a0000 0069e000

Usually the second double word is the stack base and the third is the stack limit, so we can dump the range between them and start reconstructing the stack trace for our example from the bottom of the stack (BaseThreadStart) or look after the exception handling calls (for example, KiUserExceptionDispatcher):

0:001> dds 0069e000 006a0000
0069e000  00000000
0069e004  00000000
...
...
...
0069fb24  7d535b43 kernel32!UnhandledExceptionFilter+0x851



0069fbb0  0069fc20
0069fbb4  7d6354c9 ntdll!RtlDispatchException+0x11f
0069fbb8  0069fc38
0069fbbc  0069fc88
0069fc1c  00000000
0069fc20  00000000
0069fc24  7d61dd26 ntdll!NtRaiseException+0x12
0069fc28  7d61ea51 ntdll!KiUserExceptionDispatcher+0x29
0069fc2c  0069fc38



0069ff38  00000000
0069ff3c  00000000
0069ff40  00000000
0069ff44  00000000
0069ff48  00000000
0069ff4c  00000000
0069ff50  00000000
0069ff54  00000000
0069ff58  00000000
0069ff5c  00000000
0069ff60  0069ff70
0069ff64  0040188f WrongIP!internal_func_1+0x1f
0069ff68  00401830 WrongIP!jump_func
0069ff6c  00401850 WrongIP!return_func
0069ff70  0069ff7c
0069ff74  004018ac WrongIP!thread_request+0xc
0069ff78  00401830 WrongIP!jump_func
0069ff7c  0069ffb4
0069ff80  78132848 msvcr80!_endthread+0x4b
0069ff84  00401830 WrongIP!jump_func
0069ff88  6d5ba283
0069ff8c  00000000
0069ff90  00000000
0069ff94  00595620
0069ff98  c0000005
0069ff9c  0069ff88
0069ffa0  0069fb34
0069ffa4  0069ffdc
0069ffa8  78138cd9 msvcr80!_except_handler4
0069ffac  152916af
0069ffb0  00000000
0069ffb4  0069ffec
0069ffb8  781328c8 msvcr80!_endthread+0xcb
0069ffbc  7d4dfe21 kernel32!BaseThreadStart+0x34
0069ffc0  00595620
0069ffc4  00000000
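
The three double words dumped from the TEB earlier can be given names. The sketch below assumes the x86 NT_TIB layout, where the exception list head is followed by the stack base and then the stack limit. TibHeader and is_within_stack are hypothetical names used only for illustration.

```cpp
#include <cstdint>

// First three 4-byte fields at the start of the x86 TEB (the NT_TIB header).
struct TibHeader {
    std::uint32_t exception_list;  // head of the exception registration chain
    std::uint32_t stack_base;      // highest address of the thread stack
    std::uint32_t stack_limit;     // lowest committed address of the stack
};

// True if address lies inside [stack_limit, stack_base).
bool is_within_stack(const TibHeader& tib, std::uint32_t address) {
    return address >= tib.stack_limit && address < tib.stack_base;
}
```

With the values from the dd output (0069ffa4, 006a0000, 0069e000), the ESP value 0069ff54 from the crash falls inside the range, while a zero ESP would not, which is the case where dumping the whole range from the TEB becomes necessary.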


- Dmitry Vostokov @ DumpAnalysis.org -

GDB for WinDbg Users (Part 1)

Monday, June 25th, 2007

I recently started using GDB on FreeBSD and found the AT&T syntax for Intel assembly language uncomfortable. The same applies when using GDB on Windows. Source and destination operands are reversed, and negative offsets like -4 are shown in hexadecimal format like 0xfffffffc. This is OK for small assembly language fragments but very confusing when looking at several pages of code. Here is an example of AT&T syntax:

C:\MinGW\bin>gdb a.exe
GNU gdb 5.2.1
Copyright 2002 Free Software Foundation, Inc.
GDB is free software, covered by the GNU General Public License, and you are welcome to change it and/or distribute copies of it under certain conditions.
Type "show copying" to see the conditions.
There is absolutely no warranty for GDB. Type "show warranty" for details.
This GDB was configured as "i686-pc-mingw32"...(no debugging symbols found)...
(gdb) disas main
Dump of assembler code for function main:
0x4012f0 <main>:        push   %ebp
0x4012f1 <main+1>:      mov    %esp,%ebp
0x4012f3 <main+3>:      sub    $0x8,%esp
0x4012f6 <main+6>:      and    $0xfffffff0,%esp
0x4012f9 <main+9>:      mov    $0x0,%eax
0x4012fe <main+14>:     add    $0xf,%eax
0x401301 <main+17>:     add    $0xf,%eax
0x401304 <main+20>:     shr    $0x4,%eax
0x401307 <main+23>:     shl    $0x4,%eax
0x40130a <main+26>:     mov    %eax,0xfffffffc(%ebp)
0x40130d <main+29>:     mov    0xfffffffc(%ebp),%eax
0x401310 <main+32>:     call   0x401850 <_alloca>
0x401315 <main+37>:     call   0x4014f0 <__main>
0x40131a <main+42>:     leave
0x40131b <main+43>:     ret
0x40131c <main+44>:     nop
0x40131d <main+45>:     nop
0x40131e <main+46>:     nop
0x40131f <main+47>:     nop
End of assembler dump.

To my relief, I found that I can change AT&T flavour to Intel using the following command:

(gdb) set disassembly-flavor intel

The same function now looks more familiar:

(gdb) disas main
Dump of assembler code for function main:
0x4012f0 <main>:        push   ebp
0x4012f1 <main+1>:      mov    ebp,esp
0x4012f3 <main+3>:      sub    esp,0x8
0x4012f6 <main+6>:      and    esp,0xfffffff0
0x4012f9 <main+9>:      mov    eax,0x0
0x4012fe <main+14>:     add    eax,0xf
0x401301 <main+17>:     add    eax,0xf
0x401304 <main+20>:     shr    eax,0x4
0x401307 <main+23>:     shl    eax,0x4
0x40130a <main+26>:     mov    DWORD PTR [ebp-4],eax
0x40130d <main+29>:     mov    eax,DWORD PTR [ebp-4]
0x401310 <main+32>:     call   0x401850 <_alloca>
0x401315 <main+37>:     call   0x4014f0 <__main>
0x40131a <main+42>:     leave
0x40131b <main+43>:     ret
0x40131c <main+44>:     nop
0x40131d <main+45>:     nop
0x40131e <main+46>:     nop
0x40131f <main+47>:     nop
End of assembler dump.
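
The 0xfffffffc displacement in the AT&T listing and the -4 in the Intel listing are the same 32-bit two's complement value. A minimal sketch of the conversion, with the hypothetical helper name signed_displacement:

```cpp
#include <cstdint>

// Convert a 32-bit displacement printed as unsigned hex into the signed
// offset it represents in two's complement.
std::int64_t signed_displacement(std::uint32_t raw) {
    // Values with the top bit set stand for raw - 2^32.
    if (raw >= 0x80000000u) {
        return static_cast<std::int64_t>(raw) - 0x100000000LL;
    }
    return static_cast<std::int64_t>(raw);
}
```

The arithmetic is done in 64 bits so that the result does not depend on how a particular compiler converts out-of-range unsigned values to a signed 32-bit type.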

- Dmitry Vostokov @ DumpAnalysis.org -

Detecting loops in code

Saturday, June 23rd, 2007

Sometimes when we look at a stack trace and disassembled code we see that a crash couldn’t have happened if the code path were linear. In such cases we need to see if there is any loop that changes some variables. This is greatly simplified if we have source code, but even without access to source code it is still possible to detect loops. We just need to find an unconditional (JMP) or conditional (Jxxx, for example, JE) jump instruction after the crash point that branches back to the beginning of the loop before the crash point, as shown in the following pseudo code:

set the pointer value

label:

>>> crash when dereferencing the pointer

change the pointer value

jmp label
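
The rule in the pseudo code can be written down as a simple filter over the jump instructions of a function. This is a sketch under stated assumptions: the branch list is collected by hand from the disassembly, and Branch and back_edges_around are hypothetical names.

```cpp
#include <cstdint>
#include <vector>

// One jump instruction taken from a disassembly listing.
struct Branch {
    std::uint32_t address;  // address of the JMP/Jxx instruction itself
    std::uint32_t target;   // address it branches to
};

// Return the branches that are located after the crash point and jump to the
// crash point or before it, i.e. candidate loops containing the crash.
std::vector<Branch> back_edges_around(const std::vector<Branch>& branches,
                                      std::uint32_t crash_address) {
    std::vector<Branch> result;
    for (const Branch& branch : branches) {
        if (branch.address > crash_address && branch.target <= crash_address) {
            result.push_back(branch);
        }
    }
    return result;
}
```

A non-empty result does not prove that the loop body changes the faulting pointer. It only tells us where to look for instructions that modify it between the crash point and the backward jump.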

Let’s look at one example I found very interesting because it also shows the __thiscall calling convention for C++ code generated by the Visual C++ compiler. Before we look at the dump, let me quickly remind you how C++ non-static class methods are called. Let’s first look at a non-virtual method call.

class A
{
public:
        int foo() { return i; }
        virtual int bar() { return i; }
private:
        int i;
};

Internally class members are accessed via implicit this pointer (passed via ECX):

int A::foo() { return this->i; }

Suppose we have an object instance of class A and we call its foo method:

A obj;
obj.foo();

The compiler has to generate code which calls foo function and the code inside the function has to know which object it is associated with. So internally the compiler passes implicit parameter - a pointer to that object. In pseudo code:

int foo_impl(A *this)
{
return this->i;
}

A obj;
foo_impl(&obj);

In x86 assembly language it should be similar to this code:

lea ecx, obj
call foo_impl

If you have obj declared as a local variable:

lea ecx, [ebp-N]
call foo_impl

If you have a pointer to an obj then the compiler usually generates mov instruction instead of lea instruction:

A *pobj;
pobj->foo();

mov ecx, [ebp-N]
call foo_impl

If there are other function parameters, they are pushed on the stack from right to left. This is the __thiscall calling convention. For a virtual function call we have an indirect call through the virtual function table. The pointer to that table is the first member in the object layout, and in the latter case, where the pointer to obj is declared as a local variable, we have the following x86 code:

A *pobj;
pobj->bar();

mov ecx, [ebp-N]
mov eax, [ecx]
call [eax]
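
The foo_impl pseudo code above cannot be compiled as written because this is a keyword. A compilable sketch of the same idea follows, with the parameter renamed to self. The constructor, the friend declaration and the helper call_three_ways are additions made only so that the example is self-contained.

```cpp
class A {
public:
    explicit A(int value) : i(value) {}
    virtual ~A() {}
    int foo() const { return i; }          // the object pointer is passed implicitly
    virtual int bar() const { return i; }  // reached through the virtual function table
private:
    int i;
    friend int foo_impl(const A* self);    // lets the free function read the private member
};

// Free-function form of A::foo with the object pointer written out explicitly.
int foo_impl(const A* self) { return self->i; }

// Runs the same logic three ways: member call on an object, explicit pointer
// argument, and virtual call through a pointer.
int call_three_ways(int value) {
    A obj(value);
    const A* pobj = &obj;
    return obj.foo() + foo_impl(&obj) + pobj->bar();
}
```

All three calls read the same member through the same object address. The only difference is how that address reaches the callee and, for bar, how the callee is found.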

Now let’s look at the crash point and stack trace:

0:021> r
eax=020864ee ebx=00000000 ecx=0000005c edx=7518005c esi=020864dc edi=00000000
eip=67dc5dda esp=075de820 ebp=075dea78 iopl=0 nv up ei pl nz na po nc
cs=001b ss=0023 ds=0023 es=0023 fs=003b gs=0000 efl=00010202
component!CDirectory::GetDirectory+0x8a:
67dc5dda 8b03 mov eax,dword ptr [ebx] ds:0023:00000000=????????

0:021> k
ChildEBP RetAddr
075dea78 004074f0 component!CDirectory::GetDirectory+0x8a
075deaac 0040e4fc component!CDirectory::FindFirstFileW+0xd0
075dffb8 77e64829 component!MonitorThread+0x13
075dffec 00000000 kernel32!BaseThreadStart+0x34

If we look at GetDirectory code we would see:

0:021> .asm no_code_bytes
Assembly options: no_code_bytes

0:021> uf component!CDirectory::GetDirectory
component!CDirectory::GetDirectory:
67dc5d50 push    ebp
67dc5d51 mov     ebp,esp
67dc5d53 push    0FFFFFFFFh
67dc5d55 push    offset component!CreateErrorInfo+0x553 (67ded93b)
67dc5d5a mov     eax,dword ptr fs:[00000000h]
67dc5d60 push    eax
67dc5d61 mov     dword ptr fs:[0],esp
67dc5d68 sub     esp,240h
67dc5d6e mov     eax,dword ptr [component!__security_cookie (67e0113c)]
67dc5d73 mov     dword ptr [ebp-10h],eax
67dc5d76 mov     eax,dword ptr [ebp+8]
67dc5d79 test    eax,eax
67dc5d7b push    ebx
67dc5d7c mov     ebx,ecx
67dc5d7e mov     dword ptr [ebp-238h],ebx
67dc5d84 je      component!CDirectory::GetDirectory+0x2a1 (67dc5ff1)

component!CDirectory::GetDirectory+0x3a:
67dc5d8a cmp     word ptr [eax],0
67dc5d8e je      component!CDirectory::GetDirectory+0x2a1 (67dc5ff1)

component!CDirectory::GetDirectory+0x44:
67dc5d94 push    esi
67dc5d95 push    eax
67dc5d96 call    dword ptr [component!_imp__wcsdup (67df050c)]
67dc5d9c add     esp,4
67dc5d9f mov     dword ptr [ebp-244h],eax
67dc5da5 mov     dword ptr [ebp-240h],eax
67dc5dab push    5Ch
67dc5dad lea     ecx,[ebp-244h]
67dc5db3 mov     dword ptr [ebp-4],0
67dc5dba call    component!CStrToken::Next (67dc4f80)
67dc5dbf mov     esi,eax
67dc5dc1 test    esi,esi
67dc5dc3 je      component!CDirectory::GetDirectory+0x28c (67dc5fdc)

component!CDirectory::GetDirectory+0x79:
67dc5dc9 push    edi
67dc5dca lea     ebx,[ebx]

component!CDirectory::GetDirectory+0x80:
67dc5dd0 cmp     word ptr [esi],0
67dc5dd4 je      component!CDirectory::GetDirectory+0x28b (67dc5fdb)

component!CDirectory::GetDirectory+0x8a:
>>> 67dc5dda mov     eax,dword ptr [ebx]
67dc5ddc mov ecx,ebx

If we trace EBX backwards from the crash point we see that it comes from ECX (mov ebx,ecx at 67dc5d7c), so ECX could be considered an implicit this pointer according to the __thiscall calling convention. Therefore it looks like the caller passed a NULL this pointer via ECX.

Let’s look at the caller. To see the code we can either disassemble FindFirstFileW or disassemble backwards at the GetDirectory return address. I’ll do the latter:

0:021> k
ChildEBP RetAddr
075dea78 004074f0 component!CDirectory::GetDirectory+0x8a
075deaac 0040e4fc component!CDirectory::FindFirstFileW+0xd0
075dffb8 77e64829 component!MonitorThread+0x13
075dffec 00000000 kernel32!BaseThreadStart+0x34

0:021> ub 004074f0
component!CDirectory::FindFirstFileW+0xbe:
004074de pop     ebp
004074df clc
004074e0 mov     ecx,dword ptr [esi+8E4h]
004074e6 mov     eax,dword ptr [ecx]
004074e8 push    0
004074ea push    0
004074ec push    edx
004074ed call    dword ptr [eax+10h]

We see that ECX is our this pointer. However the virtual table pointer is taken from the memory it references:

004074e6 mov eax,dword ptr [ecx]


004074ed call dword ptr [eax+10h]

Had ECX been NULL we would have had our crash at this point. However, the crash is in the called function, so ECX couldn’t have been NULL. There is a contradiction here. The only plausible explanation is that there is a loop in the GetDirectory function that changes EBX. If we take a second look at the code we see that EBX is saved in the [ebp-238h] local variable before it is used:

0:021> uf component!CDirectory::GetDirectory
component!CDirectory::GetDirectory:
67dc5d50 push    ebp
67dc5d51 mov     ebp,esp
67dc5d53 push    0FFFFFFFFh
67dc5d55 push    offset component!CreateErrorInfo+0x553 (67ded93b)
67dc5d5a mov     eax,dword ptr fs:[00000000h]
67dc5d60 push    eax
67dc5d61 mov     dword ptr fs:[0],esp
67dc5d68 sub     esp,240h
67dc5d6e mov     eax,dword ptr [component!__security_cookie (67e0113c)]
67dc5d73 mov     dword ptr [ebp-10h],eax
67dc5d76 mov     eax,dword ptr [ebp+8]
67dc5d79 test    eax,eax
67dc5d7b push    ebx
67dc5d7c mov     ebx,ecx
67dc5d7e mov     dword ptr [ebp-238h],ebx
67dc5d84 je      component!CDirectory::GetDirectory+0x2a1 (67dc5ff1)

component!CDirectory::GetDirectory+0x3a:
67dc5d8a cmp     word ptr [eax],0
67dc5d8e je      component!CDirectory::GetDirectory+0x2a1 (67dc5ff1)

component!CDirectory::GetDirectory+0x44:
67dc5d94 push    esi
67dc5d95 push    eax
67dc5d96 call    dword ptr [component!_imp__wcsdup (67df050c)]
67dc5d9c add     esp,4
67dc5d9f mov     dword ptr [ebp-244h],eax
67dc5da5 mov     dword ptr [ebp-240h],eax
67dc5dab push    5Ch
67dc5dad lea     ecx,[ebp-244h]
67dc5db3 mov     dword ptr [ebp-4],0
67dc5dba call    component!CStrToken::Next (67dc4f80)
67dc5dbf mov     esi,eax
67dc5dc1 test    esi,esi
67dc5dc3 je      component!CDirectory::GetDirectory+0x28c (67dc5fdc)

component!CDirectory::GetDirectory+0x79:
67dc5dc9 push    edi
67dc5dca lea     ebx,[ebx]

component!CDirectory::GetDirectory+0x80:
67dc5dd0 cmp     word ptr [esi],0
67dc5dd4 je      component!CDirectory::GetDirectory+0x28b (67dc5fdb)

component!CDirectory::GetDirectory+0x8a:
>>> 67dc5dda mov     eax,dword ptr [ebx]
67dc5ddc mov ecx,ebx

If we look further past the crash point we see that the [ebp-238h] value is changed and then used again to change EBX:

component!CDirectory::GetDirectory+0x80:
67dc5dd0 cmp word ptr [esi],0
67dc5dd4 je component!CDirectory::GetDirectory+0x28b (67dc5fdb)

component!CDirectory::GetDirectory+0x8a:
>>> 67dc5dda mov eax,dword ptr [ebx]
67dc5ddc mov ecx,ebx



component!CDirectory::GetDirectory+0x11e:
67dc5e6e mov     eax,dword ptr [ebp-23Ch]
67dc5e74 mov     ecx,dword ptr [eax]
67dc5e76 mov     dword ptr [ebp-238h],ecx
67dc5e7c jmp     component!CDirectory::GetDirectory+0x20e (67dc5f5e)



component!CDirectory::GetDirectory+0x23e:
67dc5f8e cmp     esi,edi
67dc5f90 mov     ebx,dword ptr [ebp-238h]
67dc5f96 jne     component!CDirectory::GetDirectory+0x80 (67dc5dd0)

We see that after changing EBX the code jumps to address 67dc5dd0, which is just before our crash point. It looks like a loop, so there is no contradiction. ECX was passed as a valid, non-NULL this pointer. Before the loop started, its value was copied to EBX. In the loop body EBX was changed, and after some iterations the new value became NULL. It could be that the loop code has no checks for NULL pointers.

- Dmitry Vostokov @ DumpAnalysis.org -

Yet another look at Zw* and Nt* functions

Tuesday, April 10th, 2007

While reading the new book “Professional Rootkits” by Ric Vieler I encountered the following macro definition to get function index in system service table:

#define HOOK_INDEX(function2hook) *(PULONG)((PUCHAR)function2hook+1)

I couldn’t understand the code until I looked at the disassembly of typical ntdll!Zw and nt!Zw functions (x86 W2K3):

lkd> u ntdll!ZwCreateProcess
ntdll!NtCreateProcess:
7c821298 b831000000      mov     eax,31h
7c82129d ba0003fe7f      mov     edx,offset SharedUserData!SystemCallStub (7ffe0300)
7c8212a2 ff12            call    dword ptr [edx]
7c8212a4 c22000          ret     20h
7c8212a7 90              nop
ntdll!ZwCreateProcessEx:
7c8212a8 b832000000      mov     eax,32h
7c8212ad ba0003fe7f      mov     edx,offset SharedUserData!SystemCallStub (7ffe0300)
7c8212b2 ff12            call    dword ptr [edx]

lkd> u nt!ZwCreateProcess
nt!ZwCreateProcess:
8083c2a3 b831000000      mov     eax,31h
8083c2a8 8d542404        lea     edx,[esp+4]
8083c2ac 9c              pushfd
8083c2ad 6a08            push    8
8083c2af e8c688ffff      call    nt!KiSystemService (80834b7a)
8083c2b4 c22000          ret     20h
nt!ZwCreateProcessEx:
8083c2b7 b832000000      mov     eax,32h
8083c2bc 8d542404        lea     edx,[esp+4]

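Each Zw stub starts with mov eax,<index>, encoded as the opcode byte b8 followed by the 4-byte index, so the ULONG at the function address + 1 is the index in the system service table. Notice also that the user space ntdll!Nt and ntdll!Zw variants are the same. This is not the case in kernel space: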

lkd> u nt!NtCreateProcess
nt!NtCreateProcess:
808f80ea 8bff            mov     edi,edi
808f80ec 55              push    ebp
808f80ed 8bec            mov     ebp,esp
808f80ef 33c0            xor     eax,eax
808f80f1 f6451c01        test    byte ptr [ebp+1Ch],1
808f80f5 0f8549d10600    jne     nt!NtCreateProcess+0xd (80965244)
808f80fb f6452001        test    byte ptr [ebp+20h],1
808f80ff 0f8545d10600    jne     nt!NtCreateProcess+0x14 (8096524a)

nt!Zw functions are dispatched through the service table. nt!Nt functions are the actual code.

For completeness let’s look at AMD x64 W2K3. User space x64 call:

0:001> u ntdll!ZwCreateProcess
ntdll!NtCreateProcess:
00000000`78ef1ab0 4c8bd1          mov     r10,rcx
00000000`78ef1ab3 b882000000      mov     eax,82h
00000000`78ef1ab8 0f05            syscall
00000000`78ef1aba c3              ret
00000000`78ef1abb 666690          xchg    ax,ax
00000000`78ef1abe 6690            xchg    ax,ax
ntdll!NtCreateProfile:
00000000`78ef1ac0 4c8bd1          mov     r10,rcx
00000000`78ef1ac3 b883000000      mov     eax,83h

User space x86 call in x64 W2K3:

0:001> u ntdll!ZwCreateProcess
ntdll!ZwCreateProcess:
7d61d428 b882000000      mov     eax,82h
7d61d42d 33c9            xor     ecx,ecx
7d61d42f 8d542404        lea     edx,[esp+4]
7d61d433 64ff15c0000000  call    dword ptr fs:[0C0h]
7d61d43a c22000          ret     20h
7d61d43d 8d4900          lea     ecx,[ecx]
ntdll!ZwCreateProfile:
7d61d440 b883000000      mov     eax,83h
7d61d445 33c9            xor     ecx,ecx

Kernel space call in x64 W2K3:

kd> u nt!ZwCreateProcess nt!ZwCreateProcess+20
nt!ZwCreateProcess:
fffff800`0103dd70 488bc4          mov     rax,rsp
fffff800`0103dd73 fa              cli
fffff800`0103dd74 4883ec10        sub     rsp,10h
fffff800`0103dd78 50              push    rax
fffff800`0103dd79 9c              pushfq
fffff800`0103dd7a 6a10            push    10h
fffff800`0103dd7c 488d057d380000  lea     rax,[nt!KiServiceLinkage (fffff800`01041600)]
fffff800`0103dd83 50              push    rax
fffff800`0103dd84 b882000000      mov     eax,82h
fffff800`0103dd89 e972310000      jmp     nt!KiServiceInternal (fffff800`01040f00)
fffff800`0103dd8e 6690            xchg    ax,ax

kd> u nt!NtCreateProcess
nt!NtCreateProcess:
fffff800`01245832 53               push    rbx
fffff800`01245833 4883ec50         sub     rsp,50h
fffff800`01245837 4c8b9c2488000000 mov     r11,qword ptr [rsp+88h]
fffff800`0124583f b801000000       mov     eax,1
fffff800`01245844 488bd9           mov     rbx,rcx
fffff800`01245847 488b8c2490000000 mov     rcx,qword ptr [rsp+90h]
fffff800`0124584f 41f6c301         test    r11b,1
fffff800`01245853 41ba00000000     mov     r10d,0

The picture here is the same as in the x86 kernel: Zw functions are dispatched, but Nt functions are the actual code. To remember which variant is dispatched and which is the actual code, I propose the mnemonic “Z-dispatch”.
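
The split between a dispatched stub and the actual code can be illustrated with a toy model in C. This is only a sketch of the idea, not the kernel mechanism: the real dispatcher does considerably more than a table lookup, and every name here (toy_dispatch, toy_NtDouble, toy_ZwDouble and so on) is made up for illustration:

```c
#include <stddef.h>

typedef int (*toy_service_fn)(int arg);

/* "Nt"-style functions: the actual code. */
static int toy_NtDouble(int arg) { return arg * 2; }
static int toy_NtNegate(int arg) { return -arg; }

/* Toy service table indexed by service number. */
static const toy_service_fn toy_service_table[] = {
    toy_NtDouble,   /* service 0 */
    toy_NtNegate,   /* service 1 */
};

/* Toy dispatcher: validates the service number, then calls through the
   table. Returns 0 on success and -1 for an unknown service number. */
int toy_dispatch(unsigned service, int arg, int *result)
{
    const size_t count = sizeof toy_service_table / sizeof toy_service_table[0];

    if (result == NULL || service >= count)
        return -1;
    *result = toy_service_table[service](arg);
    return 0;
}

/* "Zw"-style stubs: no logic of their own, only a service number handed
   to the dispatcher (compare mov eax,82h / jmp nt!KiServiceInternal). */
int toy_ZwDouble(int arg)
{
    int result = 0;
    toy_dispatch(0, arg, &result);
    return result;
}

int toy_ZwNegate(int arg)
{
    int result = 0;
    toy_dispatch(1, arg, &result);
    return result;
}
```

Calling toy_ZwDouble(21) reaches toy_NtDouble only through the table, which mirrors why a disassembled Zw function shows a service number and a jump, while the Nt function shows a normal prologue and body.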

- Dmitry Vostokov -

lvalues, rvalues and pointers

Monday, March 19th, 2007

I’ve just published a stripped-down HTML version of my old Code Reading lectures explaining the lvalue and rvalue terminology (used in the C and C++ standard documents) and pointers. It also explains how to read complex pointer declarations, const pointers and the relationship between pointers and arrays. Understanding pointers is a must in low-level debugging. It can be found here:

lvalues, rvalues and pointers
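
As a small taste of these topics, here is a sketch in C that is not taken from the lectures themselves. It touches a const pointer, a pointer to const, the array/pointer relationship and one declaration that has to be read inside out. All names are illustrative:

```c
#include <stddef.h>

/* a[i] is defined as *(a + i), so an array can be walked with a pointer. */
int sum_ints(const int *first, size_t count)
{
    int total = 0;
    for (const int *p = first; p != first + count; ++p)
        total += *p;                 /* *p is an lvalue, read-only through p */
    return total;
}

static int add_one(int x) { return x + 1; }
static int twice(int x)   { return x * 2; }

/* Read inside out: ops is an array of 2 const pointers to functions
   taking int and returning int. */
int (*const ops[2])(int) = { add_one, twice };

int pointer_demo(void)
{
    int values[3] = { 10, 20, 30 };
    int *const fixed = &values[0];   /* const pointer: cannot be reseated   */
    const int *reader = values + 2;  /* pointer to const: cannot write *reader */

    *fixed = 11;                     /* *fixed is a modifiable lvalue       */
    /* values + 2 is an rvalue; *(values + 2) is an lvalue naming values[2] */
    return sum_ints(values, 3) + *reader + ops[0](1) + ops[1](4);
}
```

Uncommenting an assignment such as *reader = 0 or fixed = values + 1 makes the compiler reject the code, which is a quick way to check a reading of a declaration.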

- Dmitry Vostokov -

Practical Foundations of Debugging (x64)

Monday, March 19th, 2007

I’ve just published HTML slides for the first two original lectures on debugging on the x64 platform, written last year. Basically, they are an improved version of my old lectures on debugging on x86 platforms, developed in 2004. I now put more emphasis on using WinDbg as a debugging tool and, as before, no assembly language background is assumed. The new lectures use x64 assembly language throughout. I’m planning to adapt them to the ILP 32-32-64 (integer-long-pointer) model of Windows x64 and to port more old lectures this year.
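
The 32-32-64 integer-long-pointer model mentioned above is commonly called LLP64, in contrast to the 32-64-64 model (LP64) typical of 64-bit Unix systems. A minimal C sketch, with hypothetical names, that classifies a data model from the three type sizes:

```c
#include <stddef.h>

enum data_model { DM_ILP32, DM_LLP64, DM_LP64, DM_OTHER };

/* Classifies a data model from the sizes (in bytes) of int, long and a
   data pointer. Only the three common combinations are recognized. */
enum data_model classify_data_model(size_t int_bytes, size_t long_bytes,
                                    size_t ptr_bytes)
{
    if (int_bytes == 4 && long_bytes == 4 && ptr_bytes == 4)
        return DM_ILP32;    /* 32-32-32: 32-bit Windows and Unix */
    if (int_bytes == 4 && long_bytes == 4 && ptr_bytes == 8)
        return DM_LLP64;    /* 32-32-64: Windows x64             */
    if (int_bytes == 4 && long_bytes == 8 && ptr_bytes == 8)
        return DM_LP64;     /* 32-64-64: typical 64-bit Unix     */
    return DM_OTHER;
}

/* Data model of the compiler and target this file is built with. */
enum data_model current_data_model(void)
{
    return classify_data_model(sizeof(int), sizeof(long), sizeof(void *));
}
```

The practical consequence for debugging is that on Windows x64 a long still occupies 4 bytes in memory, while any pointer occupies 8.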

- Dmitry Vostokov -

Asmpedia.org update (2007, Week 6)

Tuesday, February 6th, 2007

Added EFLAGS/RFLAGS template:

http://www.asmpedia.org/index.php?title=EFLAGS/RFLAGS
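
To connect the flag bits with what WinDbg prints in register output, here is a rough C sketch that turns an EFLAGS value into the two-letter mnemonics. For example, efl=00000202 is displayed as nv up ei pl nz na po nc. The function name format_eflags is hypothetical, and only the eight flags shown in that display are covered:

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Writes the eight two-letter flag mnemonics, separated by spaces, into
   out. The buffer must hold at least 24 bytes (8 * 2 letters, 7 spaces,
   terminating NUL). Returns 0 on success, -1 if the buffer is too small. */
int format_eflags(uint32_t efl, char *out, size_t out_size)
{
    static const struct {
        unsigned bit;
        const char *set;
        const char *clear;
    } flags[] = {
        { 11, "ov", "nv" },   /* OF: overflow       */
        { 10, "dn", "up" },   /* DF: direction      */
        {  9, "ei", "di" },   /* IF: interrupts     */
        {  7, "ng", "pl" },   /* SF: sign           */
        {  6, "zr", "nz" },   /* ZF: zero           */
        {  4, "ac", "na" },   /* AF: auxiliary carry */
        {  2, "pe", "po" },   /* PF: parity         */
        {  0, "cy", "nc" },   /* CF: carry          */
    };
    const size_t count = sizeof flags / sizeof flags[0];

    if (out == NULL || out_size < count * 3)
        return -1;
    for (size_t i = 0; i < count; ++i) {
        const char *name = ((efl >> flags[i].bit) & 1u) ? flags[i].set
                                                        : flags[i].clear;
        memcpy(out + i * 3, name, 2);
        out[i * 3 + 2] = (i + 1 < count) ? ' ' : '\0';
    }
    return 0;
}
```

In 0x202 only bit 9 (IF) and the always-set reserved bit 1 are on, which is why every mnemonic except ei is in its cleared form.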

The AAA, MOV and NOP instructions have been updated to include EFLAGS (all other instructions will have it automatically).

Added 16-bit addressing ModRegRM table (for the sake of completeness):

http://www.asmpedia.org/index.php?title=ModRegRM_byte_%2816-bit_addressing%29

Added SIB byte translation table:

http://www.asmpedia.org/index.php?title=SIB_byte_%2832/64-bit_addressing%29
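
The SIB byte packs three fields: scale in bits 7-6, index in bits 5-3 and base in bits 2-0. A minimal sketch of the split in C, ignoring REX prefix extensions and using hypothetical names:

```c
#include <stdint.h>

struct sib_fields {
    unsigned scale;   /* multiplier 1, 2, 4 or 8 (from bits 7-6)          */
    unsigned index;   /* index register number (bits 5-3), 4 = no index   */
    unsigned base;    /* base register number (bits 2-0)                  */
};

/* Splits a SIB byte into its fields (no REX prefix assumed). */
struct sib_fields decode_sib(uint8_t sib)
{
    struct sib_fields f;
    f.scale = 1u << (sib >> 6);
    f.index = (sib >> 3) & 7u;
    f.base  = sib & 7u;
    return f;
}
```

For instance, the encoding 8d 54 24 04 (lea edx,[esp+4]) has SIB byte 24h, which decodes to scale 1, index 4 (no index register) and base 4 (ESP).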

- Dmitry Vostokov -

Venerable NOP instruction and more

Monday, January 29th, 2007

Does the NOP instruction have “parameters”? Yes, it does (depending on the CPU model). As I promised earlier, I continue to update Asmpedia. Today I added a ModRegRM table, useful for both disassembling and assembling, and a NOP description together with WinDbg output:

ModRegRM byte (32/64-bit addressing) 

Note: REX prefix information will be added later
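
What the ModRegRM table encodes can also be sketched in a few lines of C: mod is in bits 7-6, reg in bits 5-3 and r/m in bits 2-0. The names below are hypothetical and REX extensions are ignored:

```c
#include <stdint.h>

struct modrm_fields {
    unsigned mod;   /* bits 7-6: 3 = register operand, 0-2 = memory forms */
    unsigned reg;   /* bits 5-3: register number or opcode extension      */
    unsigned rm;    /* bits 2-0: register number or addressing form       */
};

/* Splits a ModRegRM byte into its three fields (no REX prefix assumed). */
struct modrm_fields decode_modrm(uint8_t modrm)
{
    struct modrm_fields f;
    f.mod = modrm >> 6;
    f.reg = (modrm >> 3) & 7u;
    f.rm  = modrm & 7u;
    return f;
}

/* In 32/64-bit addressing, r/m = 4 with mod != 3 means a SIB byte follows. */
int modrm_has_sib(struct modrm_fields f)
{
    return f.mod != 3 && f.rm == 4;
}
```

Two encodings that show up in WinDbg disassembly illustrate the difference: 8b ec (mov ebp,esp) has ModRegRM byte ECh, giving mod 3, reg 5 (EBP) and r/m 4 (ESP) with no SIB byte, whereas 8d 54 24 04 (lea edx,[esp+4]) has 54h, giving mod 1, reg 2 (EDX) and r/m 4, so a SIB byte and an 8-bit displacement follow.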

NOP instruction

I’ll keep you up to date with Asmpedia updates on a weekly basis.

- Dmitry Vostokov -

Asmpedia

Saturday, January 6th, 2007

As a part of my Master’s thesis I founded a Wintel assembly language encyclopedia: www.asmpedia.org.

It is based on MediaWiki and I will start populating it from the end of January onwards. Information will be presented from a dump analysis and reverse engineering perspective.

So far I have created a few entries for testing and to collect comments, for example:

MOV instruction (x64 opcodes will be added later)

Instruction description will include:

  • definition and examples
  • x86 and x64 differences
  • C-style pseudo-code
  • annotated WinDbg disassembly
  • C/C++ compiler translation examples

Opcodes and mnemonics are cross-referenced, for example:

0xBB
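
As an illustration of how an opcode maps to a mnemonic, 0xBB belongs to the B8+r family, MOV r32, imm32, where the low three bits of the opcode select the register (so BBh is mov ebx,imm32 and B8h is mov eax,imm32). A minimal decoder sketch in C, assuming 32-bit operand size and no prefixes, with hypothetical function names:

```c
#include <stddef.h>
#include <stdint.h>

static const char *const reg32_names[8] = {
    "eax", "ecx", "edx", "ebx", "esp", "ebp", "esi", "edi"
};

/* Name of a 32-bit general purpose register by its encoding number. */
const char *reg32_name(unsigned index)
{
    return index < 8 ? reg32_names[index] : "?";
}

/* Decodes opcodes B8h..BFh (MOV r32, imm32; 32-bit operand size, no
   prefixes). On success stores the register number and the immediate and
   returns the instruction length (5); returns 0 if the bytes do not match. */
size_t decode_mov_r32_imm32(const uint8_t *code, size_t len,
                            unsigned *reg, uint32_t *imm)
{
    if (code == NULL || reg == NULL || imm == NULL || len < 5)
        return 0;
    if ((code[0] & 0xf8) != 0xb8)
        return 0;

    *reg = code[0] & 7u;
    /* The immediate is stored little-endian. */
    *imm = (uint32_t)code[1]
         | ((uint32_t)code[2] << 8)
         | ((uint32_t)code[3] << 16)
         | ((uint32_t)code[4] << 24);
    return 5;
}
```

The bytes b8 32 00 00 00, as seen in WinDbg disassembly of mov eax,32h, decode to register 0 (eax) with immediate 32h, and bb 78 56 34 12 decodes to register 3 (ebx) with immediate 12345678h.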

I use the Intel and AMD manuals and disassembly output from WinDbg as references.

Finally I can fulfill my desire to learn all x86 instructions :-)

Further plans are to start with ARM assembly language as soon as I finish most of the Wintel part, because I do development for Windows Mobile and I’m interested in low-level stuff there.

- Dmitry Vostokov -

Crash Dump Analysis Blog

Saturday, December 23rd, 2006

Welcome to the new blog location at dumpanalysis.org/blog/ 

Its feed address is

http://feeds.feedburner.com/CrashDumpAnalysis

The blog has been moved from its original location at

citrite.org/blogs/dmitryv/ 

in order to bring all crash dump analysis and debugging information to one place including www.dumpanalysis.org/forum and the forthcoming online encyclopedia about assembly languages:

www.asmpedia.org

Thank you and sorry for any inconvenience this might have caused.

Merry Christmas and Happy Debugging in the New Year!

- Dmitry Vostokov -