Race conditions on a uniprocessor machine
It is well known that hidden race conditions in code manifest themselves more frequently on a multiprocessor machine than on a uniprocessor machine. To illustrate this, I wrote the following code, which was motivated by similar kernel-level code and by a discussion on the Russian Software Development Network forum:
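#include <assert.h>
#include <process.h>   // _beginthread
#include <tchar.h>     // _tmain, _TCHAR

volatile bool b;

// Keeps writing true to the shared flag.
void thread_true(void *)
{
    while (true)
    {
        b = true;
    }
}

// Keeps writing false to the shared flag.
void thread_false(void *)
{
    while (true)
    {
        b = false;
    }
}

int _tmain(int argc, _TCHAR* argv[])
{
    _beginthread(thread_true, 0, NULL);
    _beginthread(thread_false, 0, NULL);
    while (true)
    {
        // b is read twice here; a write between the two reads breaks the check
        assert (b == false || b == true);
    }
    return 0;
}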
The program has three threads. Two of them keep setting the same boolean variable b to different values, and the main thread checks that its value is either true or false. The assertion should fail in the following scenario: the first thread (thread_true) sets b to true, so the first comparison in the assertion (b == false) fails. We expect the second comparison (b == true) to succeed, but before it is evaluated the main thread is preempted by the second thread (thread_false), which sets b to false, and therefore the second comparison fails too. In a debug build we get an assertion dialog showing that the boolean variable b is neither true nor false!
I compiled and ran the program, and it ran for hours without failing on my uniprocessor laptop. On a multiprocessor machine it failed within a couple of minutes. If we look at the assembly language code of the assertion, we see that it is very short, so statistically the chances that the main thread is preempted in the middle of the assertion are very low. This is because on a uniprocessor machine threads do not run in parallel: each thread runs until its quantum expires, and only then does another thread get the processor. So we should make the assertion code longer, ideally long enough to exceed the quantum. To simulate this I added a call to the SwitchToThread API. When the assertion code yields execution to another thread, that thread may be thread_false, and as soon as the main thread runs again we get the assertion failure:
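#include <windows.h>   // SwitchToThread
#include <assert.h>
#include <process.h>   // _beginthread
#include <tchar.h>     // _tmain, _TCHAR

volatile bool b;

// Yields the processor in the middle of the assertion and returns false
// so that the final comparison is still evaluated.
bool SlowOp()
{
    SwitchToThread();
    return false;
}

void thread_true(void *)
{
    while (true)
    {
        b = true;
    }
}

void thread_false(void *)
{
    while (true)
    {
        b = false;
    }
}

int _tmain(int argc, _TCHAR* argv[])
{
    _beginthread(thread_true, 0, NULL);
    _beginthread(thread_false, 0, NULL);
    while (true)
    {
        assert (b == false || SlowOp() || b == true);
    }
    return 0;
}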
I compiled and ran the program again and still saw no failure for a long time. It looks like thread_false always runs just before the main thread, so b is false when the main thread evaluates the assertion, and due to the short-circuit evaluation rule of the || operator SlowOp() never gets a chance to execute. Then I added a fourth thread, thread_true_2, so that twice as many threads set b to true as set it to false (2 to 1), which gives b a better chance of being true just before the assertion executes:
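#include <windows.h>   // SwitchToThread
#include <assert.h>
#include <process.h>   // _beginthread
#include <tchar.h>     // _tmain, _TCHAR

volatile bool b;

// Yields the processor in the middle of the assertion and returns false
// so that the final comparison is still evaluated.
bool SlowOp()
{
    SwitchToThread();
    return false;
}

void thread_true(void *)
{
    while (true)
    {
        b = true;
    }
}

// Second writer of true: shifts the odds towards b == true (2 to 1).
void thread_true_2(void *)
{
    while (true)
    {
        b = true;
    }
}

void thread_false(void *)
{
    while (true)
    {
        b = false;
    }
}

int _tmain(int argc, _TCHAR* argv[])
{
    _beginthread(thread_true, 0, NULL);
    _beginthread(thread_false, 0, NULL);
    _beginthread(thread_true_2, 0, NULL);
    while (true)
    {
        assert (b == false || SlowOp() || b == true);
    }
    return 0;
}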
Now when I ran the new program I got the assertion failure in a couple of minutes! It is hard to make race conditions manifest themselves on a uniprocessor machine.
- Dmitry Vostokov -
September 2nd, 2009 at 4:34 pm
[…] Incorrect sharing of memory example (p. 171) - although context switches emulate multiprocessor systems single-processor machines experience the same error conditions less frequently: http://www.dumpanalysis.org/blog/index.php/2007/04/14/race-conditions-on-a-uniprocessor-machine/ […]