Running with Code

Like scissors, only more dangerous


In C#, Visual Basic .NET, C++/CLI, J# - the list goes on - we're freed from having to worry about our memory management.  Objects take care of themselves, and when the CLR garbage collector detects that an object is no longer in use, it frees the associated memory.  That doesn't mean that we should run around allocating and deallocating objects all willy-nilly; in fact, since we have less control over memory, we arguably have the opportunity to be more careful with the way we use high-frequency objects.

Memory Regions in .NET

In .NET, and generally in most programming, we can think of two places in which we can access memory: the stack and the heap.  We can think of stack memory as temporary workspace, or scratch space; when we leave a function, its stack frame goes away.  Way, way down in the machine architecture, the stack also stores function parameters and the return addresses of functions.  It's generally very orderly and inexpensive, but its volatile nature makes it a poor candidate for long-term storage.  In .NET, instances of types that derive from the ValueType class - types defined with the struct and enum keywords, as well as all of the primitive types except string (including int, double, and bool) - are stored inline: on the stack when they're local variables or parameters, or inside their containing object when they're fields of a heap object.  They only get a heap allocation of their own when they are boxed into an object reference.
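To make the boxing point concrete, here's a small sketch (the Point struct is a hypothetical example type): assigning a value type to an object reference copies it onto the heap, and the copy is independent of the original.

```csharp
using System;

struct Point  // a value type: lives inline, not as a separate heap object
{
    public int X, Y;
    public Point(int x, int y) { X = x; Y = y; }
}

class Program
{
    static void Main()
    {
        Point p = new Point(3, 4);   // local value: stack-allocated, no GC involvement
        object boxed = p;            // boxing: a copy of p is allocated on the heap

        p.X = 99;                    // modifies only the original copy

        Console.WriteLine(((Point)boxed).X);  // prints 3 - the boxed copy is independent
    }
}
```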

Heap memory is another matter.  The heap is a region of memory reserved for the use of the program and is intended to store objects that aren't quite so transient.  This might be something like a database connection, a file or buffer, or a window.

The Enemy: Fragmentation

Over time, objects are allocated and eventually released, and because there's not really any rhyme or reason, the heap becomes chaotic.  Allocations grab the first free block that's large enough (sometimes larger than necessary) and hold onto it until they go into the abyss.  This leads to fragmentation - all the free memory must be tracked somehow, and here's the real killer: contiguous blocks of free memory may not always be recognized as such.  Check this out: let's say we have a heap allocated at memory location 0x4000 that is 32 bytes wide:

An unused heap

(Yes, my awesome artwork was done with none other than Excel!)

Suppose we allocate an 8-byte object and another 8-byte object, then a 16-byte object.  The first is in red, the second in orange, and the third in gray:

The heap after it was filled

Now I'll free the first and the third objects; we'll have 24 bytes of total free memory:

A fragmented heap

Either we keep track of every little piece of free memory, which might be the fastest approach for releasing but is slow for allocating (not to mention potentially VERY wasteful), or we come up with another solution.  This type of memory fragmentation is referred to as external fragmentation.

The Garbage Collector and Compaction

The garbage collector does two jobs: it determines which objects are still in use, and it compacts the heap.  Rather than counting references, the collector walks the application's roots (local variables, static fields, CPU registers, and so on) and marks every object it can reach; anything left unmarked is garbage.  This frees programmers from having to explicitly destroy objects (as is the practice in C++ with the delete operator, or in C with the free function).  The collector is then able to release and compact memory as needed, avoiding much of the overhead of external fragmentation and also allowing unused memory to be reclaimed.  The garbage collector in .NET is generational; it checks the newest objects first (what are called "gen-0"), and if the newest objects are still in use, they get moved to gen-1.  If the memory pressure requires, gen-1 objects are evaluated, and if they are still in use, they get moved to gen-2.  Gen-2 objects are considered long-lasting, and are only checked when memory pressure is severe.
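You can actually watch promotion happen with the standard GC class (a minimal sketch; the exact generation a surviving object lands in can vary with runtime configuration, which is why the comments hedge):

```csharp
using System;

class Program
{
    static void Main()
    {
        object survivor = new object();
        Console.WriteLine(GC.GetGeneration(survivor));  // 0: freshly allocated

        GC.Collect();                     // force a full collection
        GC.WaitForPendingFinalizers();
        Console.WriteLine(GC.GetGeneration(survivor));  // typically 1: survived once

        GC.Collect();
        Console.WriteLine(GC.GetGeneration(survivor));  // typically 2: long-lived now

        GC.KeepAlive(survivor);  // keep the reference rooted through the demo
    }
}
```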

Let's go back to our heap example; supposing I had an 8-byte, another 8-byte, and a 12-byte allocation, here's my heap graph:

A new heap

Object 1 (in red) has gone out of scope, but objects 2 and 3 are sticking around.  Using normal memory freeing rules, the largest object that could be allocated would still only be 8 bytes, because that would be the largest contiguous free space.  However, using .NET's compacting garbage collector, we could expect something along these lines:

A compacted heap graph

Here we can see we've dealt with the problem of external fragmentation by compacting the heap.  This convenience doesn't come without a cost, though; while the garbage collector performed the compaction, all of your application threads were suspended.  The GC can't guarantee object integrity if memory is getting abused during a garbage collection!

Preventing Compaction: Stop Killing off Objects!

Object pooling is a pattern that allows objects to be reused rather than repeatedly allocated and deallocated, which helps to prevent heap fragmentation as well as costly GC compactions.  A pool can be thought of as an object factory; in essence, the most rudimentary pool could look like this:

using System.Collections.Generic;

public class Pool&lt;T&gt; where T : new()
{
    private Stack&lt;T&gt; _items = new Stack&lt;T&gt;();
    private object _sync = new object();

    // Hands out a pooled instance if one is available; otherwise creates a new one.
    public T Get()
    {
        lock (_sync)
        {
            if (_items.Count == 0)
            {
                return new T();
            }
            return _items.Pop();
        }
    }

    // Returns an instance to the pool for later reuse.
    public void Free(T item)
    {
        lock (_sync)
        {
            _items.Push(item);
        }
    }
}

Here, objects are created entirely on-demand and, when freed, are stored in a stack.  The reason we want to use a Stack is the performance characteristic of adding and removing objects; operations are always performed at the end of the list, which makes it highly efficient to add or remove items.  If possible, it may be prudent to pre-create a number of objects for use throughout the lifetime of your application. 
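Usage looks like this (a sketch; MessageBuffer is a hypothetical pooled type, since a raw byte[] can't satisfy the T : new() constraint directly):

```csharp
using System;
using System.Collections.Generic;

// Hypothetical pooled type: a reusable 4 KB buffer wrapper.
public class MessageBuffer
{
    public byte[] Data = new byte[4096];
}

// The pool from the article, condensed.
public class Pool<T> where T : new()
{
    private Stack<T> _items = new Stack<T>();
    private object _sync = new object();

    public T Get()
    {
        lock (_sync)
            return _items.Count == 0 ? new T() : _items.Pop();
    }

    public void Free(T item)
    {
        lock (_sync)
            _items.Push(item);
    }
}

class Program
{
    static void Main()
    {
        var pool = new Pool<MessageBuffer>();

        MessageBuffer buffer = pool.Get();   // first call allocates a new buffer
        // ... fill buffer.Data from the socket, process the message ...
        pool.Free(buffer);                   // return it instead of abandoning it

        MessageBuffer again = pool.Get();    // the same instance comes back
        Console.WriteLine(ReferenceEquals(buffer, again));  // True
    }
}
```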

Here's an example: the project I've been discussing lately uses a pool of byte arrays to handle incoming network messages received and sent via a Socket.  With pooling enabled, over the course of the application's lifetime, there were 17 gen-0 collections, 5 gen-1 collections, and 3 gen-2 collections; a total of 270 byte[] instances were allocated, of which 44 were eligible for pooling and were pooled.  With pooling disabled, there were 22 gen-0 collections, 5 gen-1 collections, and 3 gen-2 collections; a total of 11,660 byte[] instances were allocated, of which approximately 10,900 were eligible for pooling.  That's a lot of memory!

Summary - When and Why

Object pooling is a powerful optimization technique, and if you're already using factory patterns it shouldn't be terribly foreign to you.  The .NET Framework includes the ThreadPool class as part of System.Threading.  Other objects you might consider pooling are database connections, any expensive links to unmanaged code, or anything that needs to be allocated frequently and can then be thrown away.  In my example, byte arrays are exceptionally good for this because they can be overwritten easily.
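The ThreadPool class is itself a working example of the pattern: work items run on reused worker threads rather than on a freshly created thread each time. A minimal sketch:

```csharp
using System;
using System.Threading;

class Program
{
    static void Main()
    {
        using (var done = new ManualResetEvent(false))
        {
            // The callback runs on a pooled worker thread; when it finishes,
            // the thread is returned to the pool rather than destroyed.
            ThreadPool.QueueUserWorkItem(state =>
            {
                Console.WriteLine("On a pooled thread: " +
                                  Thread.CurrentThread.IsThreadPoolThread);  // True
                done.Set();
            });

            done.WaitOne();  // wait for the pooled work item to finish
        }
    }
}
```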

Further Reading

The "Speedy C#" Series:

posted on Thursday, August 7, 2008 9:38 PM

Feedback

# re: Speedy C#, Part 2: Optimizing Memory Allocations - Pooling and Reusing Objects 8/8/2008 1:37 PM Guy Ellis
Excellent explanation Rob - thanks - also really like your little code snippet - immediately makes a lot of sense.

# re: Speedy C#, Part 2: Optimizing Memory Allocations - Pooling and Reusing Objects 8/8/2008 7:18 PM Scott
Where did you get your information that the GC uses a reference counter and a lazy thread? When the GC runs, it actually freezes your application's main thread so it can walk the roots looking for reachable objects.

Reread Richter's article, particularly the section titled "The Garbage Collection Algorithm".

# re: Speedy C#, Part 2: Optimizing Memory Allocations - Pooling and Reusing Objects 8/8/2008 10:36 PM Rob
Hi Scott,

The GC counts outstanding references to objects during compaction and also determines whether the reference holders are themselves eligible for garbage collection (e.g., an object has a reference to a container and the container has a reference to the object, but both are eligible for collection).  I suppose this isn't a "reference counter" in the strictest sense of the word, but since this article wasn't on GC internals, I thought it conveyed the sense of the process well enough.

As to the lazy thread, I meant "lazy" as in "not constantly working." It not only freezes the main thread but all threads in the process, as I said:

"This convenience doesn't come without a cost, though; while the garbage collector performed the compaction, all of your application threads were suspended."

Thanks!

# re: Speedy C#, Part 2: Optimizing Memory Allocations - Pooling and Reusing Objects 8/12/2008 5:06 AM ApacheChief
The code example makes tons of sense, really enjoy reading your stuff, keep it up :)

# re: Speedy C#, Part 2: Optimizing Memory Allocations - Pooling and Reusing Objects 2/28/2009 6:27 PM matt
nicely written. I hope it convinces a couple of guys I work with to follow the pattern. We're acquiring and processing ~200,000 images per day. I wish we were using byte arrays too (I'm still looking into making a Bitmap or one of the standard image objects work here).

thanks!

# re: Speedy C#, Part 2: Optimizing Memory Allocations - Pooling and Reusing Objects 5/28/2009 4:01 AM myrocode
thank you very much for your clear explanation...

i really enjoyed in reading your article. now things are getting brighter!

# re: Speedy C#, Part 2: Optimizing Memory Allocations - Pooling and Reusing Objects 1/30/2013 2:07 AM ArungPalakka
hey, I'm newbie... how to use this code to my app... sorry for my bad English


