Understanding I/O Completion Ports

Many articles and books explain Windows I/O completion ports in terms of the high-level design considerations that arise when building high-performance server software. Those explanations are hard to recall later when someone asks for one, and not everyone writes such software. Looking at complete memory dumps offers a bottom-up, reverse engineering approach: we see the internals of server software and can immediately grasp how certain architectural and design decisions are implemented.

Consider this thread stack trace, which we can find inside almost any service or network application process:

THREAD 86cf09c0  Cid 05cc.2030  Teb: 7ffd7000 Win32Thread: 00000000 WAIT: (Unknown) UserMode Non-Alertable
    8a3bb970  QueueObject
    86cf0a38  NotificationTimer
Not impersonating
DeviceMap                 e15af5a8
Owning Process            8a3803d8       Image:         svchost.exe
Wait Start TickCount      2131621        Ticks: 1264 (0:00:00:19.750)
Context Switch Count      6            
UserTime                  00:00:00.000
KernelTime                00:00:00.000
Win32 Start Address RPCRT4!ThreadStartRoutine (0x77c5de6d)
Start Address kernel32!BaseThreadStartThunk (0x77e6b5f3)
Stack Init ba276000 Current ba275c38 Base ba276000 Limit ba273000 Call 0
Priority 8 BasePriority 8 PriorityDecrement 0
ChildEBP RetAddr 
ba275c50 8083d3b1 nt!KiSwapContext+0x26
ba275c7c 8083dea2 nt!KiSwapThread+0x2e5
ba275cc4 8092b205 nt!KeRemoveQueue+0x417
ba275d48 80833a6f nt!NtRemoveIoCompletion+0xdc

ba275d48 7c82ed54 nt!KiFastCallEntry+0xfc
0093feac 7c821bf4 ntdll!KiFastSystemCallRet
0093feb0 77e66142 ntdll!NtRemoveIoCompletion+0xc
0093fedc 77c604c3 kernel32!GetQueuedCompletionStatus+0x29

0093ff18 77c60655 RPCRT4!COMMON_ProcessCalls+0xa1
0093ff84 77c5f9f1 RPCRT4!LOADABLE_TRANSPORT::ProcessIOEvents+0x117
0093ff8c 77c5f7dd RPCRT4!ProcessIOEventsWrapper+0xd
0093ffac 77c5de88 RPCRT4!BaseCachedThreadRoutine+0x9d
0093ffb8 77e6608b RPCRT4!ThreadStartRoutine+0x1b
0093ffec 00000000 kernel32!BaseThreadStart+0x34

We see that an I/O completion port is implemented via a kernel queue object, so requests (work items, completion notifications, etc.) are stored in that queue for further processing by threads. The number of active threads processing requests is bounded by a maximum value that usually corresponds to the number of processors:

0: kd> dt _KQUEUE 8a3bb970
ntdll!_KQUEUE
   +0x000 Header           : _DISPATCHER_HEADER
   +0x010 EntryListHead    : _LIST_ENTRY [ 0x8a3bb980 - 0x8a3bb980 ]
   +0x018 CurrentCount     : 0
   +0x01c MaximumCount     : 2
   +0x020 ThreadListHead   : _LIST_ENTRY [ 0x86cf0ac8 - 0x89ff9520 ]

0: kd> !smt
SMT Summary:
------------
   KeActiveProcessors: **------------------------------ (00000003)
        KiIdleSummary: **------------------------------ (00000003)
No PRCB     Set Master SMT Set                                     IAID
 0 ffdff120 Master     **------------------------------ (00000003)  00
 1 f772f120 ffdff120   **------------------------------ (00000003)  01
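
The match between the queue's MaximumCount and the processor count can be checked with a little arithmetic. The following is only a sketch in Python: the values are copied from the dt and !smt output above, and the function name is hypothetical, chosen for illustration.

```python
def count_processors(active_mask: int) -> int:
    """Count the set bits in a processor mask such as KeActiveProcessors."""
    return bin(active_mask).count("1")


KE_ACTIVE_PROCESSORS = 0x00000003  # mask shown by !smt
QUEUE_MAXIMUM_COUNT = 2            # MaximumCount shown by dt _KQUEUE 8a3bb970

processors = count_processors(KE_ACTIVE_PROCESSORS)
print(processors, processors == QUEUE_MAXIMUM_COUNT)  # prints "2 True"
```

In this dump the limit equals the processor count, but that is the usual default rather than a rule, so the comparison may well fail on other systems.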

Kernel work queues are also implemented via the same kind of queue object, as we might have guessed already:

THREAD 8a777660  Cid 0004.00d0  Teb: 00000000 Win32Thread: 00000000 WAIT: (Unknown) UserMode Non-Alertable
    808b707c  QueueObject
Not impersonating
DeviceMap                 e1000928
Owning Process            8a780818       Image:         System
Wait Start TickCount      2615           Ticks: 2130270 (0:09:14:45.468)
Context Switch Count      301            
UserTime                  00:00:00.000
KernelTime                00:00:00.000
Start Address nt!ExpWorkerThread (0x8082d92b)
Stack Init f71e0000 Current f71dfcec Base f71e0000 Limit f71dd000 Call 0
Priority 12 BasePriority 12 PriorityDecrement 0
Kernel stack not resident.
ChildEBP RetAddr 
f71dfd04 8083d3b1 nt!KiSwapContext+0x26
f71dfd30 8083dea2 nt!KiSwapThread+0x2e5
f71dfd78 8082d9c1 nt!KeRemoveQueue+0x417
f71dfdac 809208fc nt!ExpWorkerThread+0xc8
f71dfddc 8083fc9f nt!PspSystemThreadStartup+0x2e
00000000 00000000 nt!KiThreadStartup+0x16

0: kd> dt _KQUEUE 808b707c
ntdll!_KQUEUE
   +0x000 Header           : _DISPATCHER_HEADER
   +0x010 EntryListHead    : _LIST_ENTRY [ 0x808b708c - 0x808b708c ]
   +0x018 CurrentCount     : 0
   +0x01c MaximumCount     : 2
   +0x020 ThreadListHead   : _LIST_ENTRY [ 0x8a77a128 - 0x8a777768 ]
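
The ThreadListHead values in both _KQUEUE dumps can be tied back to the THREAD addresses shown in the stack traces. Below is a small sketch in Python, assuming that each thread is linked into its queue through a list entry embedded at offset 0x108 in the thread structure. That offset is not taken from symbols; it is inferred from the addresses in this particular dump, where it happens to fit both queues, and it will likely differ on other Windows builds. The helper name is hypothetical.

```python
# Assumption: offset of the queue list entry inside the thread structure,
# inferred from this dump only (not from symbol information).
QUEUE_LIST_ENTRY_OFFSET = 0x108


def thread_from_queue_link(link_address: int,
                           offset: int = QUEUE_LIST_ENTRY_OFFSET) -> int:
    """Recover the owning thread address from a ThreadListHead link."""
    return link_address - offset


# (ThreadListHead link, THREAD address from the corresponding stack trace)
observations = [
    (0x86CF0AC8, 0x86CF09C0),  # completion port queue 8a3bb970, first link
    (0x8A777768, 0x8A777660),  # worker queue 808b707c, second link
]

for link, thread in observations:
    recovered = thread_from_queue_link(link)
    print(f"{link:08x} -> {recovered:08x}", recovered == thread)
```

Both comparisons hold for the addresses above, which suggests that the threads from the two stack traces are indeed linked into the thread lists of the queues they wait on.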

I’ve created a simple UML diagram showing the high-level relationships between the various objects seen in crash dumps. Note that an Active Thread object can process items from more than one completion port if its wait is satisfied first for one port and then for another, but I have never seen this. Obviously, a Waiting Thread can wait on only one completion port at a time.
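
The relationship between a queue, its active threads and its waiting threads can also be expressed as a toy model. This is a simplified sketch in Python, not the kernel implementation: the class and attribute names are hypothetical (loosely mirroring the _KQUEUE fields), the most-recent-first wake-up order is an assumption of the model, and the case where an active thread blocks on something else is not modelled.

```python
from collections import deque


class QueueModel:
    """Toy model of a kernel queue with a limit on active threads."""

    def __init__(self, maximum_count):
        self.entries = deque()          # queued items (cf. EntryListHead)
        self.maximum_count = maximum_count
        self.active = set()             # threads currently processing an item
        self.waiting = []               # threads blocked on this queue
        self.handed_over = {}           # item given directly to a woken thread

    @property
    def current_count(self):
        """Number of active threads (cf. CurrentCount)."""
        return len(self.active)

    def insert(self, item):
        # Wake a waiting thread only while the concurrency limit allows it.
        if self.waiting and self.current_count < self.maximum_count:
            thread = self.waiting.pop()  # assumed most-recent-first wake-up
            self.active.add(thread)
            self.handed_over[thread] = item
        else:
            self.entries.append(item)

    def remove(self, thread):
        # A thread asking for more work has finished its previous item.
        self.active.discard(thread)
        if self.entries and self.current_count < self.maximum_count:
            self.active.add(thread)
            return self.entries.popleft()
        self.waiting.append(thread)      # otherwise the thread waits here
        return None


port = QueueModel(maximum_count=2)
for name in ("t1", "t2", "t3"):
    port.remove(name)                    # empty queue: all three threads wait
for item in ("a", "b", "c"):
    port.insert(item)                    # only two threads may become active
print(port.current_count, sorted(port.handed_over), list(port.entries), port.waiting)
```

With a limit of two, the third item stays queued and the third thread keeps waiting until one of the active threads comes back for more work.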

- Dmitry Vostokov @ DumpAnalysis.org -

4 Responses to “Understanding I/O Completion Ports”

  1. Marc Sherman Says:

    Is “dt _KQUEUE 8a1e7150” a typo? Shouldn’t the address be 8a3bb970 to match the QueueObject address highlighted in blue?

  2. Dmitry Vostokov Says:

    Corrected. Thanks!
