Horrors of debugging legacy code

CARE: Crash Analysis Report Environment

DATA (Dump Analysis + Trace Analysis) Facebook group
Please join the community of memory (dump) and trace analysis engineers. This group promotes scientific methods and memory dump-based worldview.

Twitter @ DumpAnalysis
You can now follow portal and blog news at DumpAnalysis on Twitter

LinkedIn Group Dr. Watson Enthusiasts
All about Dr. Watson errors and more. Get news, excerpts and progress reports about the forthcoming book The Science of Dr. Watson: An Illustrated History of Debugging (ISBN 978-1906717070)

2010 (0x7DA) - The Year of Dump Analysis
2011 (0x7DB) - 2020 (0x7E4) The Debugging Decade

We all know that macro definitions in C and C++ are evil. They cause maintenance nightmares by introducing subtle bugs. I never took that seriously until last weekend I was debugging my old code written 10 years ago which uses macros written 15 years ago :-) 

My Windows Mobile 5.0 application was crashing when I was using POOM COM interfaces (Pocket Outlook Object Model). The crash never pointed to my code. It always happened after pimstore.dll and other MS modules were loaded and COM interfaces started to return errors. I first suspected that I was using POOM incorrectly and rewrote all code several times and in different ways. No luck. Then I tried PoomMaster sample from Windows Mobile 5.0 SDK and it worked well. So I rewrote my code in exactly the same way as in that sample. No luck. My last hope was that moving code from my DLL to EXE (as in sample SDK project) would eliminate crashes but it didn’t help too. Then I slowly started to realize that the problem might have been in my old code and I also noticed that one old piece of code had never been used before. So I started debugging by elimination (commenting out less and less code) until I found a macro. I had to stare at it for couple of minutes until I realized that one pair of brackets was missing and that caused allocating less memory and worse: the returned pointer to allocated memory was multiplied by 2! So the net result was the pointer pointing to other modules and subsequent string copy was effectively overwriting their memory and eventually causing crashes inside MS dlls.  

Here is that legacy macro:

#define ALLOC(t, p, s)
((p)=(t)GlobalLock(GlobalAlloc(GHND, (s))))

It allocates memory and returns a pointer. It should have been called like this (size parameter is highlighted in blue):

if (ALLOC(LPWSTR,lpm->lpszEvents,
(lstrlen(lpszMacro)+1)*sizeof(WCHAR)))
{
lstrcpy(lpm->lpszEvents, lpszMacro);
    lpm->nEvents=lstrlen(lpm->lpszEvents)+1;
}

What I found is the missing bracket before lstrlen and last enclosing bracket (size parameter is highlighted in red):

if (ALLOC(LPWSTR,lpm->lpszEvents,
lstrlen(lpszMacro)+1)*sizeof(WCHAR))
{
lstrcpy(lpm->lpszEvents, lpszMacro);
    lpm->nEvents=lstrlen(lpm->lpszEvents)+1;
}

The resulted code after macro expansion looks like this

if (lpm->lpszEvents=(LPWSTR)GlobalLock(GlobalAlloc(GHND,
lstrlen(lpszMacro)+1))*sizeof(WCHAR))

You see that the pointer to allocated memory is multiplied by two and string copy is performed to a random place in the address space of other loaded dlls corrupting their data and causing the process to crash later.

- Dmitry Vostokov -

           

Announcements

Coming Soon:

Debugging Notebook: Essential Concepts, WinDbg Commands and Tools

Crash Dump Analysis for System Administrators and Support Engineers

New Magazines:

Debugged! MZ/PE: MagaZine for/from Practicing Engineers


New Books:

Memory Dump Analysis Anthology, Volume 3

First Fault Software Problem Solving: A Guide for Engineers, Managers and Users

x64 Windows Debugging: Practical Foundations

Also available:

Windows Debugging: Practical Foundations

DLL List Landscape: The Art from Computer Memory Space

Dumps, Bugs and Debugging Forensics: The Adventures of Dr. Debugalov

WinDbg: A Reference Poster and Learning Cards

Memory Dump Analysis Anthology, Volume 2

Memory Dump Analysis Anthology, Volume 1

New Children's Book:

Baby Turing

Leave a Reply