The Smallest Program

Can the smallest program that crashes be smaller than the smallest program that doesn’t? Depends on a platform and a compiler/linker set. I’ve chosen x64 and MASM64 for my experiments. The smallest working program I came up first was this:

; ml64 /Zi TheSmallestProgram64.asm /link
;   /entry:main /SUBSYSTEM:CONSOLE

_text SEGMENT
main PROC
ret
main ENDP
_text ENDS

END

It compiles and links to an executable with only one byte instruction in its main function:

0:000> u main
TheSmallestProgram64!main:
00000000`00401010 c3              ret
00000000`00401011 cc              int     3
00000000`00401012 cc              int     3
00000000`00401013 cc              int     3
00000000`00401014 cc              int     3
00000000`00401015 cc              int     3
00000000`00401016 cc              int     3
00000000`00401017 cc              int     3

Then I thought about removing ret instruction and supposed that if we compile and link and try to execute the program with 0 bytes we get straight to int 3 instruction and in my case (I have NTSD set as a default postmortem debugger) a dump will be saved. So I did that but I found that unfortunately compiler inserts ret instruction if the procedure body is empty. So I cheated them by putting nop instruction (which is also one byte) and got my dump!

; ml64 /Zi TheSmallestProgramWithBug64.asm /link
;    /entry:main /SUBSYSTEM:CONSOLE

_text SEGMENT
main PROC
nop
main ENDP
_text ENDS

END

Loading Dump File [new_2006-10-25_12-40-06-500_076C.dmp]
…

0:000> kL
TheSmallestProgramWithBug64!main+0×1
kernel32!BaseProcessStart+0×29

0:000> u main
TheSmallestProgramWithBug64!main:
00000000`00401010 90              nop
00000000`00401011 cc              int     3
00000000`00401012 cc              int     3
00000000`00401013 cc              int     3
00000000`00401014 cc              int     3
00000000`00401015 cc              int     3
00000000`00401016 cc              int     3
00000000`00401017 cc              int     3

So one answer to my question: The smallest working program and the smallest crashing program have the same size unless we use some binary editors :-)

Then I tried MS Visual C++ (this time 32-bit project) and came up with the following C or C++ program without any prolog and epilog code:

__declspec(naked) void Main ()
{
}

I changed entry point from standard main function to my own capitalized Main function and compiler/link options:

Compiler:

/Od /GL /D "WIN32" /D "NDEBUG" /D "_CONSOLE"
/D "_UNICODE" /D "UNICODE" /D "_AFXDLL"
/FD /MD /GS- /Fo"Release\"
/Fd"Releasevc80.pdb" /W3 /nologo /c
/Wp64 /Zi /TP /errorReport:prompt

Linker:

/OUT:"SmallestProgram.exe" /INCREMENTAL:NO
/NOLOGO /MANIFEST:NO /NODEFAULTLIB /DEBUG
/PDB:"SmallestProgram.pdb" /SUBSYSTEM:CONSOLE
/OPT:REF /OPT:ICF /LTCG /ENTRY:"Main"
/ERRORREPORT:PROMPT

The program crashes immediately because the body is empty:

Loading Dump File [new_2006-10-25_15-18-03-109_13B0.dmp]

0:000> u Main
SmallestProgram!Main:
00401000 cc              int     3
00401001 0000            add     byte ptr [eax],al
00401003 0000            add     byte ptr [eax],al
00401005 0000            add     byte ptr [eax],al
00401007 0000            add     byte ptr [eax],al
00401009 0000            add     byte ptr [eax],al
0040100b 0000            add     byte ptr [eax],al
0040100d 0000            add     byte ptr [eax],al

0:000> kL
ChildEBP RetAddr
002cfff0 00000000 SmallestProgram!Main

0:000> dds esp
002cffc4  7d4e992a kernel32!BaseProcessStart+0x28
002cffc8  00000000
002cffcc  00000000
002cffd0  7efdf000
002cffd4  80000003
002cffd8  002cffc8
002cffdc  002cfbbc
002cffe0  ffffffff
002cffe4  7d4d8998 kernel32!_except_handler3
002cffe8  7d4e9938 kernel32!`string'+0x28
002cffec  00000000
002cfff0  00000000
002cfff4  00000000
002cfff8  00401000 SmallestProgram!Main
002cfffc  00000000

So another answer to my question: The smallest crashing program can be less than the smallest working program and is actually 0 bytes :-)

- Dmitry Vostokov -

6 Responses to “The Smallest Program”

  1. hught Says:

    Cool. So here’s a challenge: write a program that is in multiple languages - that is, it will successfully compile under two or more different language compilers, and will run without causing an error. Ideally, when run, it should do something smart like print the name of the language it was compiled as.

    I have seen this done before - the same piece of code compiled under three different language compilers.

    There’s a pint offered for the best solution

  2. Dmitry Vostokov Says:

    Cool challenge! Reminds me Obfuscated C Code contests. I devised this algorithm in pseudo code (might require a supercomputer if we assess its complexity for a non trivial program that prints a language it is written in ):
    main()
    {
    input max_length;
    test(”", 0, max_length);
    }

    bool test(string prg, int level, int max)
    {
    if (level == max) return false;
    for (char in [allowed symbols for both languages])
    {
    prg += char;
    if (both compilers compile successfully) { print prg; return true; }

    if (test(prg, level+1, max) == true) return true;
    prg -= char;
    }
    return false;
    }
    I’ll try to implement it in C and let you know Although you have to wait for about 2*10,000,000,000 compilations in the worst case for simple languages (10 letters in alphabet and the length of the program is 10 symbols) because the complexity is O(N^M) where N is alphabet cardinality and M is the maximum program length to try

  3. Dmitry Vostokov Says:

    Here is the program written in Visual C++ and C# and compiles under both compilers. When compiled as C# by C# compiler it prints “Written in C#”. When compiled under Visual C++ compiler it prints nothing because there is not need to say that C++ is the best language ever

    #if !__cplusplus
    class Program
    {
    static void Main()
    {
    System.Console.WriteLine(”Written in C#”);
    }
    }
    #else
    void main() {}
    #endif

  4. hught Says:

    I saw it done many years ago with Fortran IV, AlgolW and BCPL. It was extremely devious and involved lines such as

    COMMENT = 3;

    which in Fortran is a comment (capital C in column 6) and is a variable assignment in the other two. There was generally much abuse of the comment facility to hide things and everything was in capitals. I think I’m showing my age

  5. Crash Dump Analysis » Blog Archive » Opcodism: The Art of Opcodes Says:

    […] The earliest opcodism binary was created on October 25th, 2006 that I now call Nothingness and Crash: The Smallest Program. […]

  6. CyberRax Says:

    Actually they are of the same size: 0 bytes. The problem is that they depend on what’s already in the memory. In your case it’s a bunch of zeoes which cause a crash. But this value depends on the OS. Vanilla DOS doesn’t really clear memory so if the segment where the program is loaded already contains a RET at 100h then it’ll work.
    The RET command actually also depends on the environment. A correct program would be 2 bytes long, containing “INT 20h” (EXIT in DOS 1, but also supported by later versios, altough using INT 21h with the “exit” value is prefferred). Your current implementation depends on the correct value being in the stack, which IIRC is not always the case.

Leave a Reply

You must be logged in to post a comment.