I'll try to be as concise as possible. Any assistance is much appreciated. The problem is a memory leak.
I have a 32-bit DLL written in C# that runs as a COM component on a Windows 2003 server. This component processes BizTalk 2002 messages. The process stays under 200 MB most of the time, then spikes above 1 GB and runs out of memory at around 2 GB unless it’s recycled. The memory is never released, even when the process has been idle for some time.
I used perfmon to verify that when the process is using 1 GB of memory, only 30-40 MB are on the managed heap, so I know it’s not .NET memory. !address -summary shows the growth is in the native heap, and !heap shows that one heap (the first one) holds 95% of the memory used. Typically at this point I would use DebugDiag to give me a summary of allocations. When I ran the process under DebugDiag… no leak for a month. I detached DebugDiag and the process blew up in a day. So I tried umdh.exe next. I turned on user-mode stack traces with “gflags -i dllhost +ust”, restarted the process, and took an initial snapshot with umdh.exe. Again, the process did not leak for a week. I then turned off user-mode stack traces and restarted the process 4 hours ago, and it is at 600 MB already. I grabbed a dump of it when it was around 500 MB.
So that’s my first issue. Why does it not leak with heap traces on? It’s possible that the timing is a coincidence, but the odds of that seem low. As I understand it, turning on heap traces and/or running under a debugger disables the LFH, turns on page heap, and tweaks some flags on the heap, so technically there is a difference in heap behavior. With traces both on and off, !heap -s shows “L” rather than “LFH” in the fast heap column for the large problem heap, if that helps.
I decided to take a look at the 500 MB dump I got without stack traces and poke around in the heap to see what was there. Heap 00090000 has 450+ MB of data.
!heap -stat -h 00090000 
 heap @ 00090000
group-by: TOTSIZE max-display: 20
    size     #blocks     total     ( %) (percent of total busy bytes)
    21120 116 - 23e98c0  (13.73)
    3c 769ef - 1bcd404  (10.63)
    2e 66d0f - 12798b2  (7.06)
    1c 667e8 - b35d60  (4.29)
    24 4f942 - b30d48  (4.28)
    2a 3a03c - 9849d8  (3.64)
    3a 27c6d - 9030b2  (3.45)
    7c80 116 - 873300  (3.23)
    6c5c 116 - 75abe8  (2.81)
…
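For anyone reading along, all three numeric columns in that output are hex. A small Python sketch (the rows are copied from the output above; the names `ROWS` and `decode_row` are just made up for this illustration) converts them to decimal and checks that each total equals size times block count:

```python
# Rows copied from the !heap -stat output above: (size, #blocks, total), all hex.
ROWS = [
    ("21120", "116", "23e98c0"),
    ("3c", "769ef", "1bcd404"),
    ("2e", "66d0f", "12798b2"),
    ("1c", "667e8", "b35d60"),
    ("24", "4f942", "b30d48"),
    ("2a", "3a03c", "9849d8"),
    ("3a", "27c6d", "9030b2"),
    ("7c80", "116", "873300"),
    ("6c5c", "116", "75abe8"),
]


def decode_row(size_hex, blocks_hex, total_hex):
    """Convert one row to decimal and check total == size * blocks."""
    size = int(size_hex, 16)
    blocks = int(blocks_hex, 16)
    total = int(total_hex, 16)
    return size, blocks, total, size * blocks == total


for row in ROWS:
    size, blocks, total, consistent = decode_row(*row)
    # Print size in bytes, block count, and total in MB (1 MB = 2**20 bytes).
    print(f"{size:>7} bytes x {blocks:>7} blocks = "
          f"{total / 2**20:6.1f} MB  consistent={consistent}")
```

Decoded this way, the 21120 allocations are 135,456 bytes each, and three of the listed sizes (21120, 7c80 and 6c5c) share the same block count, 0x116 = 278.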
I thought those large allocations of size 21120 looked odd, so I dumped them:
!heap -flt s 21120
    _HEAP @ 90000
      HEAP_ENTRY Size Prev Flags    UserPtr UserSize - state
        1bbd0040 4225 0000  [01]   1bbd0048    21120 - (busy)
          ? <Unloaded_elp.dll>+1b6dc7b7
        1be12fc8 4225 4225  [01]   1be12fd0    21120 - (busy)
          ? <Unloaded_elp.dll>+1b6c21a7
        1bf24188 4225 4225  [01]   1bf24190    21120 - (busy)
          ? <Unloaded_elp.dll>+1b5f68ef
        1c011000 4225 4225  [01]   1c011008    21120 - (busy)
          ? <Unloaded_elp.dll>+1b72542f
        1c074320 4225 4225  [01]   1c074328    21120 - (busy)
          ? <Unloaded_elp.dll>+1b84dc37
        1c096fe0 4225 4225  [01]   1c096fe8    21120 - (busy)
          ? <Unloaded_elp.dll>+1bad1a57
        1c0daf98 4225 4225  [01]   1c0dafa0    21120 - (busy)
          ? <Unloaded_elp.dll>+1b961f7f
        1c0fc128 4225 4225  [01]   1c0fc130    21120 - (busy)
I did the same for the next-largest allocation size, 3c, and got similar results for 90% of the entries:
1b7d61e8 0009 0009  [01]   1b7d61f0    0003c - (busy)
          ? <Unloaded_dll>+2d0b22
The lm output shows this for unloaded modules:
Unloaded modules:
00320033 00960061   Unknown_Module_00320033
Missing image name, possible paged-out or corrupt data.
00680063 00cc00a8   Unknown_Module_00680063
0000f50f 0075f572   dll     
00000001 45d70a37   elp.dll 
71af0000 71b12000   ShimEng.dll
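One sanity check on the "? <Unloaded_...>+offset" annotations, as a rough sketch only. It assumes the debugger prints each annotation as (raw value minus the module's listed start address), which is my guess and not something the output confirms. Under that assumption the raw values can be recovered by adding the start address back. The names `UNLOADED_START` and `raw_value` are made up for this illustration:

```python
# Start addresses as listed by lm for the unloaded modules above.
UNLOADED_START = {"elp.dll": 0x00000001, "dll": 0x0000F50F}

# Offsets copied from the "? <Unloaded_elp.dll>+..." lines in the !heap -flt output.
ELP_OFFSETS = [
    0x1B6DC7B7, 0x1B6C21A7, 0x1B5F68EF, 0x1B72542F,
    0x1B84DC37, 0x1BAD1A57, 0x1B961F7F,
]
# Offset copied from the "? <Unloaded_dll>+2d0b22" line.
DLL_OFFSET = 0x2D0B22


def raw_value(module, offset):
    """ASSUMPTION: annotation = raw value - module start, so raw = start + offset."""
    return UNLOADED_START[module] + offset


for off in ELP_OFFSETS:
    value = raw_value("elp.dll", off)
    # Report whether the recovered value is 8-byte aligned.
    print(f"elp.dll+{off:08x} -> {value:08x}  8-byte aligned: {value % 8 == 0}")

print(f"dll+{DLL_OFFSET:x} -> {raw_value('dll', DLL_OFFSET):08x}")
```

If the assumption holds, every elp.dll value comes out 8-byte aligned and close to the address range of the blocks themselves, which might mean the debugger is labelling ordinary data with a bogus module entry and not pointing at real code. I can't confirm that from the dump alone.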
Any tips on next steps here? Spot-checking those addresses with the db command doesn’t show any strings or recognizable patterns. I cannot find any information on this elp.dll anywhere on the internet, and there is no such DLL on the server. When I randomly break into the live process with the debugger and dump modules, elp.dll is always in the unloaded modules area. And what’s with the address for elp.dll, 00000001?
			
		