Geeks With Blogs
Dynamic Concepts (in) Development Musings of TheCPUWizard

[Originally Published Apr 2004 - Updated October 2009]

Everyone "understands" that Microsoft's .NET and the CLR are a "garbage collector" based environment; but is it really?

First we must establish what is meant by "garbage" in this context. When an object is created there is (typically) one reference by which it can be accessed (the return value of "new"). While the program executes, other references to the same object may be established, and established references may go away. When an object can no longer be reached through any reference, it is deemed to be "garbage". [note: This is a bit of a simplification but will satisfy our needs]

Next we must look at the definition of "collection". Webster's dictionary offers the following:

collection: the act or process of collecting.
collect: to bring together into one body or place.

Now let's look at what happens when a "GC.Collect" occurs. (For simplicity we will look at generation 0, and ignore the impact of "pinned" objects.) The object graph is "walked" starting at the rooted references, and any reachable item that is in Generation 0 is marked. When the walk is complete, the live objects are moved to the Gen1 heap, and the Gen0 heap is reset back to the beginning. The result is that the memory occupied by all of the previous Gen0 residents is now available.
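The steps just described can be sketched as a toy model. This is Python rather than anything the CLR actually runs, and every name in it (Obj, allocate, collect_gen0, the generation field) is hypothetical; it also walks everything reachable from the roots, which is a simplification. The point is only to show that the collection loop touches the reachable objects and never looks at the garbage.

```python
class Obj:
    """A hypothetical managed object: a name, a generation, outgoing references."""
    def __init__(self, name):
        self.name = name
        self.generation = 0
        self.refs = []


def allocate(gen0, name):
    """New objects always start life in generation 0."""
    obj = Obj(name)
    gen0.append(obj)
    return obj


def collect_gen0(roots, gen0, gen1):
    """Toy generation 0 collection; returns the number of objects moved."""
    marked, seen = [], set()
    stack = list(roots)
    while stack:                      # walk the graph from the roots
        obj = stack.pop()
        if id(obj) in seen:
            continue
        seen.add(id(obj))
        if obj.generation == 0:       # mark reachable Gen0 objects
            marked.append(obj)
        stack.extend(obj.refs)
    for obj in marked:                # move the survivors to Gen1
        obj.generation = 1
        gen1.append(obj)
    gen0.clear()                      # stands in for resetting the Gen0 heap
    return len(marked)


gen0, gen1 = [], []
a = allocate(gen0, "a")
b = allocate(gen0, "b")
a.refs.append(b)                      # b is reachable through a
allocate(gen0, "junk1")               # never referenced: garbage
allocate(gen0, "junk2")               # never referenced: garbage
moved = collect_gen0([a], gen0, gen1)
print(moved, [o.name for o in gen1], gen0)   # 2 ['a', 'b'] []
```

Note that "junk1" and "junk2" are never examined by collect_gen0; they simply cease to exist when the Gen0 list is reset.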

This reveals the fundamental problem with calling this process "garbage collection". Absolutely NOTHING is done with the garbage. Specifically there are no operations which involve moving the garbage so it is "brought together in one place".

To see what a "real" garbage collection is, consider an analogy. In one's house, there are likely to be multiple wastebaskets: one in the kitchen, one in the bathroom, and others scattered throughout the residence. On trash day (or earlier if the Wife has anything to say about the matter), one goes through the residence and collects all of the garbage from multiple locations, places it in one bag, and brings it outside to the rubbish container. The amount of work is dependent on the number of original locations of garbage, and the amount of garbage in each location. The number of "precious" (non-garbage) items in the house has absolutely no bearing on the process or the effort it will involve.

But when we look at the .NET situation, the exact opposite is true. It is the number of LIVE objects that impacts the performance, as these are what must be scanned and moved. It does not matter if there is a single small "garbage" object on the heap, or if there are tens of thousands (of varying sizes). Once the live (precious) objects have been moved out of harm's way, it is a single, constant time operation to reset the heap to be ready to get new objects.
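To make the cost argument concrete, here is a small counting experiment, again as a toy Python model and not a measurement of the real CLR. The function name gen0_collection_work and the dictionary-based "objects" are assumptions made purely for illustration. It counts how many objects a Gen0-style collection has to touch for different mixes of live and garbage objects.

```python
def gen0_collection_work(live_count, garbage_count):
    """Return how many objects a toy Gen0 collection touches."""
    nursery = []
    root = None
    # Live objects: a chain reachable from a single root.
    for _ in range(live_count):
        node = {"next": root}
        nursery.append(node)
        root = node
    # Garbage: allocated, but nothing reachable refers to it.
    for _ in range(garbage_count):
        nursery.append({"next": None})

    touched = 0
    survivors = []
    node = root
    while node is not None:           # walk and "move" only the live objects
        survivors.append(node)
        touched += 1
        node = node["next"]
    nursery.clear()                   # stands in for the constant time heap reset
    return touched


print(gen0_collection_work(100, 1))        # 100
print(gen0_collection_work(100, 50_000))   # 100
print(gen0_collection_work(10_000, 1))     # 10000
```

Adding fifty thousand garbage objects changes nothing in the count; adding live objects changes it in direct proportion.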

This shows that .NET implements a Live Object Preservation pattern, and NOT a garbage collection pattern.

While this entire post may seem like a "semantic quibble", it has serious ramifications when dealing with .NET architecture, design, and implementation. In other environments there is NO overhead (aside from the actual memory) to keeping references to heap-based objects which will be needed (or even just possibly needed) later. In many cases, the cost of allocating [always higher in a conventional heap than in a CLR heap] and deleting (updating the free list) far outweighs the memory utilization issue, and so references are kept for an extended period of time.

When this approach is taken in a .NET application, these live objects represent a performance hit every time (neglecting some optimizations) that the GC runs, simply because the GC deals with processing live objects. On the other hand, allocating a (non-large) object in .NET is typically a simple pointer increment, and abandoning it (assuming no finalizer) costs nothing at all.
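A minimal sketch of what "a simple pointer increment" means, written as a toy Python model. The class BumpNursery and its fields are hypothetical names, and the real CLR allocator has details (allocation contexts, zeroing, thresholds) that are deliberately left out here.

```python
class BumpNursery:
    """Toy bump-pointer allocator over a fixed-size region."""
    def __init__(self, size):
        self.size = size
        self.next_free = 0            # the only allocator state

    def allocate(self, nbytes):
        if self.next_free + nbytes > self.size:
            raise MemoryError("nursery full; a collection would run here")
        address = self.next_free
        self.next_free += nbytes      # allocation is just this increment
        return address

    def reset(self):
        """Called after live objects have been moved elsewhere."""
        self.next_free = 0


nursery = BumpNursery(1024)
first = nursery.allocate(24)
second = nursery.allocate(40)
print(first, second, nursery.next_free)   # 0 24 64
```

There is no "free" method at all: abandoning an object means doing nothing, and the space comes back only when the whole region is reset.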

Over the past few years, I have been involved with a number of projects where clients were complaining that ".NET was slow" and could not meet their performance demands. In the vast majority of cases, this was directly traced to the implementation not having proper (for .NET) object lifetime management.

addendum: When one looks at environments such as C/C++, the conventional/standard implementations (pre C++0x) do not include "garbage collection". The heap is (typically, and simplified) implemented as a structure containing the "free blocks" of items that were previously deleted. This means that (a pointer to) memory that is no longer in use [i.e. garbage] IS actually MOVED. Each time there is a call to "delete" or "free(...)", a synchronous [i.e. it completes before delete/free() returns] collection of information about the garbage occurs.
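The free-list idea can be sketched as follows. This is a deliberately simplified first-fit model in Python, not how any particular C or C++ runtime implements malloc/free; the class FreeListHeap and its method names are hypothetical, and there is no coalescing of adjacent free blocks.

```python
class FreeListHeap:
    """Toy first-fit heap: free blocks are tracked as (address, size) pairs."""
    def __init__(self, size):
        self.free_blocks = [(0, size)]
        self.allocated = {}           # address -> size of each live block

    def malloc(self, nbytes):
        # Search the free list for the first block that is big enough.
        for i, (addr, size) in enumerate(self.free_blocks):
            if size >= nbytes:
                if size == nbytes:
                    del self.free_blocks[i]
                else:
                    self.free_blocks[i] = (addr + nbytes, size - nbytes)
                self.allocated[addr] = nbytes
                return addr
        raise MemoryError("no free block large enough")

    def free(self, addr):
        # The "garbage" is collected here, synchronously: its address and
        # size are brought together with the other free blocks.
        size = self.allocated.pop(addr)
        self.free_blocks.insert(0, (addr, size))


heap = FreeListHeap(100)
a = heap.malloc(10)
b = heap.malloc(20)
heap.free(a)
print(heap.free_blocks)               # [(0, 10), (30, 70)]
c = heap.malloc(10)
print(a, b, c)                        # 0 10 0
```

Here both malloc and free do real bookkeeping work on every call, while the live block at address 10 is never touched or moved.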

In .NET the large object heap [LOH] is used for items which exceed a threshold size [85,000 bytes]. This particular heap IS operated in a manner nearly identical to a C/C++ heap, in that the "live" objects are NOT moved, and it is a set of references to the available memory (garbage) that is manipulated.

 

 

Posted on Thursday, September 24, 2009 7:46 PM | Contrary Views


Comments on this post: Sorry Johnny - There is NO Garbage Collection in .NET

# re: Sorry Johnny - There is NO Garbage Collection in .NET
Hi David. You ARE a contrarian, aren't you? While you make a valid case, I don't think you're going to be able to dissuade the software community from using the term Garbage Collector in favor of Live Object Preserver.

Good article :) See you again at the 2010 MVP summit - assuming I'm still in the clan!
Left by Carl Daniel on Sep 24, 2009 8:16 PM

# re: Sorry Johnny - There is NO Garbage Collection in .NET
Carl, good to hear from you. I agree that there is not going to be any change in the usage of the term. My only hope is that developers will realize that it is the complexity of the live object graph that impacts performance rather than the number of garbage objects.

Even as .NET approaches being a decade old, I still frequently encounter developers who increase the complexity of the object graph with the goal of reducing the total number of allocations. Most of them are completely baffled as to why the performance DECREASES as they continue to "optimize" the implementation.
Left by TheCPUWizard on Sep 25, 2009 4:28 AM

# re: Sorry Johnny - There is NO Garbage Collection in .NET
I have found garbage collector performance issues are much more likely to be issues with bad code design. Overuse of finalisers, large complex object graphs, too many objects as roots, etc. are the problem. If most of the objects created during an application's life are collected in the ephemeral segment then the garbage collector runs extremely efficiently (even if objects are pinned). With the .NET 4 CLR we have background collection on Gen2 as well, so it's even faster.
Although theoretically managing memory directly can be faster, with all of the other benefits of a self-tuning garbage collector it's clear why this has become the de facto standard for memory management in standard application development.
Left by Dean Chalk on Oct 14, 2010 7:18 AM



Copyright © David V. Corbin | Powered by: GeeksWithBlogs.net