Geeks With Blogs

@neh123us
  • neh123us T4 sould be your outsourcing strategy https://t.co/1ZceCUKWKE T4 - The Insource Code Monkey about 482 days ago
  • neh123us Had a need for Dynamic Views in MVC today. Wanted to use a Partial view on two different views with different ViewModels about 530 days ago

News Google

Nick Harrison Blog<Nick>.Next()

I have been researching some memory issues and come across something that surprised me the other day. The generic List hides what can potentially be a substantial memory hog.

This is not a memory leak. It is just a fairly benign looking object with some simple looking code that may cause your application to use substantially more memory than you may initially think likely.

It is common to write code similar to this:

           
public static IList<string> GetNames(Type enumType)
{
    if (!enumType.IsEnum)
    {
        throw new ArgumentException("Type '" + enumType.Name + "' is not an enum.");
    }
    List<string> list = new List<string>();
    foreach (FieldInfo info in from field in enumType.GetFields()
        where field.IsLiteral
        select field)
    {
        list.Add(info.Name);
    }
    return list;
}

 

The details are not as important as the pattern, we are looping through a collection and adding items to the List.

Consider the implementation of the Add method. From Reflector we get:

public void Add(T item)

{

if (this._size == this._items.Length)

{

this.EnsureCapacity(this._size + 1);

}

this._items[this._size++] = item;

this._version++;

}

 

 

This looks innocent enough. The problem is in the EnsureCapacity method. This may be a little sublte:

 

 

private 									void 									EnsureCapacity(int min) 

 

{ 

 

 									if (this._items.Length < min) 

 

    { 

 

 									int 									num = (this._items.Length == 0) ? 4 : (this._items.Length * 2); 

 

 									if (num < min) 

 

        { 

 

            num = min; 

 

        } 

 

 									this.Capacity = num; 

 

    } 

}

 

If you follow a common pattern of instantiating your list with the default constructor, we will start with an empty list. The first time we call Add, we will hit the EnsureCapacity. The length will be 0 and num will get a value of 4. After adding a few more items, we will again hit EnsureCapacity and this time we will double the size of the list. Every time after this initial call to EnsureCapacity we will double the size of the list.

If the number of items in the list is close to a power of two but not over it, we will have good memory utilization, but we can get into trouble if we are over by just a little bit. If the size of the list is 255, we will have ideal memory usage, but if there are 257 items in the list, we will have allocated enough room for 512 items even though we only need half that.

Why would we do such a thing? Look at the implementation of the Capacity property:

public int Capacity

{

get

{

return this._items.Length;

}

set

{

if (value != this._items.Length)

{

if (value < this._size)

{

ThrowHelper.ThrowArgumentOutOfRangeException(ExceptionArgument.value,

ExceptionResource.ArgumentOutOfRange_SmallCapacity);

}

if (value > 0)

{

T[] destinationArray = new T[value];

if (this._size > 0)

{

Array.Copy(this._items, 0, destinationArray, 0, this._size);

}

this._items = destinationArray;

}

else

{

this._items = List<T>._emptyArray;

}

}

}

}

 

We run the risk of using substantially more memory than we need to avoid the call to Array.Copy. This method is actually implemented as unmanaged code, so I don't know what it is actually going on there. My guess would be "unsafe" pointer manipulation.

 

My advice, provide a reasonable initial capacity when declaring lists (this will avoid the initial calls to EnsureCapacity). Err on the side of too big because if the List decides it needs to grow, if will double in size compared to you initial estimate. If it could be 500, go ahead and make it 600. As long as the actual values are at least half of what you set, you will probably be better off than if the list grew on its own by powers of 2 while incurring the Copy overhead at each step.

 

What are your thoughts?

 

 

 

 

Posted on Friday, May 13, 2011 3:56 PM | Back to top


Comments on this post: The Perils of List.Add()

comments powered by Disqus

Copyright © Nick Harrison | Powered by: GeeksWithBlogs.net | Join free