Race conditions on a uniprocessor machine

Sponsored link: Memory Dump Analysis Services

Debugging Experts Magazine Online

It is a known fact that hidden race conditions in code are manifested more frequently on a multiprocessor machine than on a uniprocessor machine. I was trying to create an example to illustrate this and wrote the following code which was motivated by the similar kernel level code and the discussion on Russian Software Development Network forum:

volatile bool b;
void thread_true(void *)
{
 while(true)
 {
  b = true;
 }
}
void thread_false(void *)
{
 while(true)
 {
  b = false;
 }
}
int _tmain(int argc, _TCHAR* argv[])
{
 _beginthread(thread_true, 0, NULL);
 _beginthread(thread_false, 0, NULL);
 while(true)
 {
  assert (b == false || b == true);
 }
 return 0;
}

The program has three threads. Two of them are trying to set the same boolean variable b to different values and the main thread checks that its value is either true or false. The assertion should fail in the following scenario: the first thread (thread_true) sets b variable to true value so the first comparison in assertion fails and we expect the second comparison to succeed but the main thread is preempted by the second thread (thread_false) that sets that value to false and therefore the second comparison fails too. We get an assertion dialog in debug build showing that boolean variable b is neither true nor false!

I compiled and ran that program and it wasn’t failing for hours on my uniprocessor laptop. On a multiprocessor machine it was failing in a couple of minutes. If we look at assertion assembly language code we would see that it is very short so statistically speaking the chances that our main thread is preempted in the middle of the assertion are very low. This is because on a uniprocessor machine two threads are running not in parallel but until their quantum is expired. So we should make the assertion code longer to exceed the quantum. To simulate this I added a call to SwitchToThread API. When the assertion code yields execution to another thread then perhaps that thread would be thread_false and as soon as it is preempted by main thread again we get the assertion failure:

volatile bool b;
bool SlowOp()
{
 SwitchToThread();
 return false;
}
void thread_true(void *)
{
 while(true)
 {
  b = true;
 }
}
void thread_false(void *)
{
 while(true)
 {
  b = false;
 }
}
int _tmain(int argc, _TCHAR* argv[])
{
 _beginthread(thread_true, 0, NULL);
 _beginthread(thread_false, 0, NULL);
 while(true)
 {
  assert (b == false || SlowOp() || b == true);
 }
 return 0;
}

I compiled and ran the program again and I couldn’t see any failure for a long time. It looks like thread_false is always running before the main thread and when the main thread is running then due to short-circuit operator || evaluation rule we don’t have a chance to execute SlowOp(). Then I added a fourth thread called thread_true_2 to make the number of threads setting b variable to true value as twice as many as the number of threads setting b variable to false value (2 to 1) so we could have more chances to set b variable to true value before executing the assertion:

volatile bool b;
bool SlowOp()
{
 SwitchToThread();
 return false;
}
void thread_true(void *)
{
 while(true)
 {
  b = true;
 }
}
void thread_true_2(void *)
{
 while(true)
 {
  b = true;
 }
}
void thread_false(void *)
{
 while(true)
 {
  b = false;
 }
}
int _tmain(int argc, _TCHAR* argv[])
{
 _beginthread(thread_true, 0, NULL);
 _beginthread(thread_false, 0, NULL);
 _beginthread(thread_true_2, 0, NULL);
 while(true)
 {
  assert (b == false || SlowOp() || b == true);
 }
 return 0;
}

Now when I ran the new program I got the assertion failure in a couple of minutes! It is hard to make race conditions manifest themselves on a uniprocessor machine.

- Dmitry Vostokov -

           

Museum of Debugging and Memory Dumps

7/7/2010 - 8/8/2010 Annual Competition: Tell Your Windows Debugging Story

Crash and Hang Analysis Audit Service

CARE: Crash Analysis Report Environment

Crash Dump and Software Trace Analysis Training and Seminars

Access OpenTask Titles on Safari Books Online

DATA (Dump Analysis + Trace Analysis) Facebook group
Please join the community of memory (dump) and trace analysis engineers. This group promotes scientific methods and memory dump-based worldview.

Twitter @ DumpAnalysis
You can now follow portal and blog news at DumpAnalysis on Twitter

LinkedIn Group Dr. Watson Enthusiasts
All about Dr. Watson errors and more. Get news, excerpts and progress reports about the forthcoming book The Science of Dr. Watson: An Illustrated History of Debugging (ISBN 978-1906717070)

2010 (0x7DA) - The Year of Dump Analysis
2011 (0x7DB) - 2020 (0x7E4) The Debugging Decade

International Memory Analysts and Debuggers Day:
07.07 and/or 08.08 starting from The Year of Dump Analysis, 2010, 7DA

Announcements

Coming Soon:

Management Bits: An Anthology from Reductionist Manager

Crash Dump Analysis: Practical Foundations (Windows Edition, Systematic Software Fault Analysis Series)

Debugging Notebook: Essential Concepts, WinDbg Commands and Tools

Crash Dump Analysis for System Administrators and Support Engineers

New Magazines:

Debugged! MZ/PE: MagaZine for/from Practicing Engineers


New Books:

Memory Dump Analysis Anthology: Color Supplement for Volumes 1-3

Memory Dump Analysis Anthology, Volume 3

First Fault Software Problem Solving: A Guide for Engineers, Managers and Users

x64 Windows Debugging: Practical Foundations

Also available:

Windows Debugging: Practical Foundations

DLL List Landscape: The Art from Computer Memory Space

Dumps, Bugs and Debugging Forensics: The Adventures of Dr. Debugalov

WinDbg: A Reference Poster and Learning Cards

Memory Dump Analysis Anthology, Volume 2

Memory Dump Analysis Anthology, Volume 1

New Children's Book:

Baby Turing

One Response to “Race conditions on a uniprocessor machine”

  1. Software Generalist » Blog Archive » Reading Notebook: 02-September-09 Says:

    […] Incorrect sharing of memory example (p. 171) - although context switches emulate multiprocessor systems single-processor machines experience the same error conditions less frequently: http://www.dumpanalysis.org/blog/index.php/2007/04/14/race-conditions-on-a-uniprocessor-machine/ […]

Leave a Reply