Posts
72
Comments
234
Trackbacks
162
August 2011 Entries
Memory Consumption: List<T> vs CompressedIndex

In my last post I did show how you can consume much less memory if you read values into a list which have often the same value. The IndexedList class uses internally a reference cache similar to the String.Intern method. But what if we do read value types such as int, enum, int64, … into a list which do repeat themself also very often? Can we store an enum which can have four values in less memory than an integer? Sure we can. Lets have a look at the following data

Row Values
0 1
1 2
2 3
3 1
4 2
….  

 

The enum in our sample data can take only 3 distinct values so we can use bit arrays to represent each value by its own bit array

Row BitArray 1 BitArray 2 BitArray 3
0 1 0 0
1 0 1 0
2 0 0 1
3 1 0 0
4 0 1 0
….      

 

This way we consume for each distinct value only one bit for each row. In the BCL there is a BitArray class implemented but I did find a Word Aligned BitArray implementation which claims to be much faster than the one of the BCL so I did use this one. If we try to compress a list of integers which has more than 32 distinct values in the data we do read then we would consume more memory as the original data. It is therefore a good idea to limit the number of bit arrays in use and switch back to a conventional list if more distinct values are found.

The best thing about this CompressedIndex is that although there may be many different values possible. Only for the actually occurring values we do allocate a bitvector. If you read e.g. a 500 MB data file which do contain the logs for a process with the id 5100 then we will allocate only one bit array for the value 5100. In the end we use only one bit for each row instead of 32 bits if we would have stored the process id as integer.

I do think you cannot get much better than this. But we have still plenty of room to optimize…

During my journey into the BCL I have found BitVector32 which was unkown to me until now. It does wrap an uint value which allows you to store up to 32 boolean flags in one BitVector32 instance. On the other hand it is no wonder why have never found this class. I did not care so deeply about memory until recently. The old C/C++ tricks are still the best and can be used in C# as well where necessary.

Here is the code for the CompressedIndex

    /// <summary>
    /// Create an index which uses for each distinct value a bitarray. More different values are stored than he has bits
    /// we switch to a plain list.
    /// </summary>
    /// <typeparam name="TKey">Storage type. Must be implementing GetHashCode and equals properly!</typeparam>
    class CompressedIndex<TKey> where TKey : struct
    {
        public CompressedIndex()
        {
            Type t = typeof(TKey);
            if (t.IsEnum) // Marshal can only get the size of primitive types. 
            {
                t = Enum.GetUnderlyingType(t);
            }

            _MaxBits = Marshal.SizeOf(t) * 8;
        }

        public void Add(int row, TKey key)
        {
            // we have more distinct values than the value has bits we need to swicht to a plain list
            if (_Cache != null && _Cache.Count > _MaxBits) 
            {
                ConvertCache();
            }

            if (_Cache != null)
            {
                // get bitarray for this value
                WAHBitArray bitArray;
                if (!_Cache.TryGetValue(key, out bitArray))
                {
                    bitArray = new WAHBitArray();
                    _Cache[key] = bitArray;
                }
                bitArray.Set(row, true);  // set at row index for this value the bitarray value to 1
                _MaxIndex++;
            }
            else
            {
                TKey value = key;
                _ListCache.Add(value);  // store for the same value the same reference to the list
            }
        }

        public TKey[] UniqueValues()
        {
            if (_Cache != null)
            {
                return _Cache.Keys.ToArray();
            }
            else
            {
                HashSet<TKey> valueTypeValues = new HashSet<TKey>();
                List<TKey> uniqueValues = new List<TKey>();
                foreach (var val in _ListCache)
                {
                    if( valueTypeValues.Add(val) )
                    {
                        uniqueValues.Add(val);
                    }
                }

                return uniqueValues.ToArray();
            }
        }

        public TKey this[int index]
        {
            get
            {
                if (_Cache != null)
                {
                    foreach (var kvp in _Cache) // get all bitarrays
                    {
                        if (kvp.Value.Get(index) == true) // The bitarray which has 1 marks the actual value
                        {
                            return kvp.Key;
                        }
                    }

                    return default(TKey);
                }
                else
                {
                    return _ListCache[index];
                }
            }
        }

        /// <summary>
        /// We have reached the limit. Switch back to List
        /// </summary>
        void ConvertCache()
        {
            var newCache = new List<TKey>(_MaxIndex + 1);
            for (int i = 0; i < _MaxIndex; i++)
            {
                newCache.Add(this[i]);
            }
            _ListCache = newCache;
            _Cache = null;
            _MaxIndex = 0;
        }

        /// <summary>
        /// Trim internal buffers to the exact count. After this
        /// operation it can be accessed read only by multiple threads.
        /// </summary>
        public void TrimExcess()
        {
            if (_ListCache != null)
            {
                _ListCache.TrimExcess();
            }

            if (_Cache != null)
            {
                foreach(var kvp in _Cache )
                {
                    // get last value which can cause the bitvector to resize itself
                    // to its final size. Only then the bitvector is thread safe for read only 
                    // access.
                    kvp.Value.Get(_MaxIndex); 
                    kvp.Value.TrimExcess();
                }
            }
        }

        Dictionary<TKey, WAHBitArray> _Cache = new Dictionary<TKey, WAHBitArray>();
        int _MaxIndex = 0;

        List<TKey> _ListCache;
        static int _MaxBits;  // threshold value 
    }

Posted On Friday, August 19, 2011 2:36 AM | Feedback (0)
Memory Consumption: List<T> vs IndexedList<T>

When you want to store millions of strings in a list you will notice in your memory profiler that the biggest memory consumers strings and lists are. If you e.g read many lines from a file with repetitive data inside it you can use a reference cache to consume the actual memory for the same string only once. The CLR offers String.Intern to do exactly that but the string reference cache is cross AppDomain static which is mostly never a good choice. Where could you use this? If you parse log files where many rows of data are often the same like type name, method name, … it can save you a lot of memory.

If you use a List<string> and intern all of your read strings you still store references in your list which will make the object graph more complicated and GC cycles slower. Besides this a reference does cost you 4/8 bytes in 32/64 bit processes. A generic read only index based list to which you simply add your data can make a huge difference. I could read a 200 MB log file in 17s by using a List<string> and it did drop to 10s after switching to IndexedList. Not bad for such a simple thing. But be warned it does only help in your specific case you you now that you are adding redundant data to it. IndexedList will consume twice as much memory compared to List<string> if all strings are different!

As added bonus you just have made searching inside your list a lot faster (at least for strings) by taking advantage of the fact that the same value is the same object. It is therefore enough to check for reference equality. String.Equals does this already so there is no need to use a custom comparer. But for custom objects you should check out if they do implement Equals properly.

As usual her comes the code:

    /// <summary>
    /// Memory conserving list which helps to spare memory if many identical values are added.
    /// Internally it uses integers to lookup its values from a hash table. The base structure is an integer list which 
    /// performs much better if millions of entries with only a few thousand distinct values are added. 
    /// </summary>
    /// <typeparam name="T">List Type</typeparam>
    class IndexedList<T> : IList<T> where T : class
    {
        Dictionary<T, T> _ReferenceCache = new Dictionary<T, T>();            // Use the same reference if the value is the same
        Dictionary<object, int> _Value2Index = new Dictionary<object, int>(); // use reference equality for dictionary
        Dictionary<int, T> _Index2Value = new Dictionary<int, T>();
        List<int> _List = new List<int>();                                    // The list holds at i the index which is the key to _Index2Value 

        public IndexedList()
        {
        }

        #region IList<T> Members

        public int IndexOf(T item)
        {
            throw new NotImplementedException();
        }

        public void Insert(int index, T item)
        {
            throw new NotImplementedException();
        }

        public void RemoveAt(int index)
        {
            throw new NotImplementedException();
        }

        public T this[int index]
        {
            get
            {
                return _Index2Value[_List[index]];
            }
            set
            {
                throw new NotImplementedException();
            }
        }

        #endregion

        #region ICollection<T> Members

        public void Add(T item)
        {
            T cachedReference = null;
            int idx = -1;
            if (!_ReferenceCache.TryGetValue(item, out cachedReference))
            {
                _ReferenceCache[item] = item;
                idx = _ReferenceCache.Count;
                _Index2Value[idx] = item;
                _Value2Index[item] = idx;
            }
            else
            {
                idx = _Value2Index[cachedReference];
            }
            _List.Add(idx);
        }

        public void Clear()
        {
            _Index2Value.Clear();
            _ReferenceCache.Clear();
            _Value2Index.Clear();
        }

        public bool Contains(T item)
        {
            return _ReferenceCache.ContainsKey(item);
        }

        public void CopyTo(T[] array, int arrayIndex)
        {
            throw new NotImplementedException();
        }

        public int Count
        {
            get { return _List.Count;  }
        }

        public bool IsReadOnly
        {
            get { return true; }
        }

        public bool Remove(T item)
        {
            throw new NotImplementedException();
        }

        #endregion

        #region IEnumerable<T> Members

        public IEnumerator<T> GetEnumerator()
        {
            throw new NotImplementedException();
        }

        #endregion

        #region IEnumerable Members

        System.Collections.IEnumerator System.Collections.IEnumerable.GetEnumerator()
        {
            throw new NotImplementedException();
        }

        #endregion

        internal IEnumerable<T> UniqueValues()
        {
            foreach (var key in _ReferenceCache.Keys)
            {
                yield return key;
            }
        }

        internal void TrimExcess()
        {
            _List.TrimExcess();
        }
    }
Posted On Thursday, August 18, 2011 8:02 PM | Feedback (0)
How To Get Faster – Forwarding Empty Values

Last time we did have a look into the issues you get if you create many class instances. The more class instances you have the more complex is your object graph which means more work for the GC. If the GC has more work there is less CPU time for your application left which does make it slower. An easy way to fix this issue is to create a struct which is a value type which is copied by value and does not add complexity to your object graph. Common wisdom suggests that your structs should not get bigger much than the actual reference type size (16 bytes on 32 bit, 24 bytes for 64 bit). This quite big limitation renders structs useless as general purpose data access object containers. Really?

The key insight is that we do not need to store the actual data inside the value type. We must only be able to access it from our struct. This way we can employ the struct as forwarder which does not contain the actual data but uses indexes to retrieve the actual instance data from a DataProvider object.

    public class DataProviderSimple : IList<ForwardingEmptyValue>
    {
        internal List<double> Value1 = new List<double>();
        internal List<int> Value2 = new List<int>();

        #region IList<ForwardingEmptyValue> Members
        public ForwardingEmptyValue this[int index]
        {
            get
            {
                return new ForwardingEmptyValue(this, index); // return forwarding struct to save memory
            }
            set
            {
                throw new NotImplementedException();
            }
        }
    }
    public struct ForwardingEmptyValue
    {
        public ForwardingEmptyValue(DataProviderSimple provider, int row)
        {
            _Provider = provider;
            _Row = row;
        }

        public double V1
        {
            get { return _Provider.Value1[_Row]; }
        }

        public int V2
        {
            get { return _Provider.Value2[_Row]; }
        }

        DataProviderSimple _Provider;
        int _Row;
    }
 

Ok this looks better but we have not gained much since we still have a reference to our provider object. The best solution I came up with so far was to replace the reference to the data provider with a byte which is an index to a static list of data provider objects. We still need to know from which row to read our data from which will result in a 5 byte value type object which can give you an arbitrary amount of data.

Here is my current version of a forwarding struct:

    [StructLayout(LayoutKind.Sequential, Pack=1)]
    public struct ForwardingEmptyValueStruct
    {
        int _Row;
        byte _ProviderIdx;
        

        public ForwardingEmptyValueStruct(byte providerIdx, int row)
        {
            _ProviderIdx = providerIdx;
            _Row = row;
        }

        public double V1
        {
            get { return DataProvider._DataProviders[_ProviderIdx].Value1[_Row];  }
        }

        public int V2
        {
            get { return DataProvider._DataProviders[_ProviderIdx].Value2[_Row];  }
        }
    }

I had to tweak the layout of the struct a bit since .NET has the habit to fill any gaps between natural alignment with zeros which would have resulted in an 8 byte struct instead of 5 bytes. As you can see there are no more object references left which will come in handy if we have millions of instances hanging around. As a general rule are index based data structures more more GC friendly than reference (read pointer) based structures. The Microsoft guys do know this of course which explains why highly efficient data structures like Dictionary<T> stores its values internally in index based structs.

[StructLayout(LayoutKind.Sequential)]
private struct Entry
{
    public int hashCode;
    public int next;
    public TKey key;
    public TValue value;
}

The differences can be substantial if your application is in a high throughput scenarios where you read large amounts of data to calculate some intermediate value (statistics, filter, …) and discard the data immediately after it had been processed. To illustrate this fact I have written a DataProvider which uses a forwarding struct as its DAO object and another one which uses a forwarding class object.

***** UnitTests.DataProviderTests.Read_Via_Class
Did read 10000000 lines in 2,96s Managed Memory: 320.017.224


***** UnitTests.DataProviderTests.Read_Via_Struct
Did read 10000000 lines in 0,84s, Managed Memory: 170.004.248

There is over a factor 3 difference between these two if you are in a high GC scenario. The actual benefit may range from 10% up to factors.

The test I did was

        [Test]
        public void Read_Via_Struct()
        {
            var sw = Stopwatch.StartNew();
            long start = GC.GetTotalMemory(true);
            using (var prov = new DataProvider(NumericData))
            { 
                prov.ReadData();
                Console.WriteLine("Reading did take: {0:F2}", sw.Elapsed.TotalSeconds);
                sw = Stopwatch.StartNew();
                var array = CopyToArray(prov);
                DoGCHeavyThings();
                sw.Stop();
                Console.WriteLine("Did read {0} lines in {1:F2}s, Managed Memory: {2:N0}", prov.Count, sw.Elapsed.TotalSeconds, GC.GetTotalMemory(true)-start);
            }
        }

        T[] CopyToArray<T>(IList<T> data)
        {
            T[] ret = new T[data.Count];
            data.CopyTo(ret, 0);
            return ret;
        }

        private void DoGCHeavyThings()
        {
            List<List<string>> lists = new List<List<string>>();
            for (int i = 0; i < 200 * 1000; i++)
            {
                var lst =  new List<string>(new string [] { new string("Test".ToCharArray()) } );
                lists.Add(lst);
            }
        }

Besides the much better speed we need with the forwarding struct solution needed nearly half of the memory since for values type the 12 (20 on x64) byte object header for class instances does not exist. This may be a major factor if you read millions of rows from a large data file.

 

Below is the source code for a real world DataProvider which reads data from a file which has one double and int per row.

1,1 0
2,1 1
3,1 2
4,1 3
5,1 4

….

I have found it useful to implement IList so I can read the n-th object via the index operator in an intuitive manner. Besides this I can change the underlying data structures much easier if I use the .Count and [] as prime data access method. So far we have only dealt with value types like int or double inside our test files. If we do know more about the actual structure of the data we do read we can save much more memory than already. I have written a simple (okok but it is fast as hell) trace viewer which uses this method and I could beat an already optimized viewer in terms of reading speed by a factor 6 (thanks to the simplified object model and other tricks) and memory consumption by a factor 3. Next time we have a look at the data we read and how to reduce the memory consumption with bit vectors, reference caches and indexed lists.

Below is the implementation of a simple DataProvider which does use the defined ForwardingEmptyValueStruct as data access object. You can store the structs inside an array, sort them and do everything you like like with usual classes. The forwarding struct has reference type semantics which means if you mutate the data through the struct all others will see the change as well since the data is not part of the struct.

 

    /// <summary>
    /// Read a file which has a double and integer value in each line separated by a space.
    /// The read data can be accessed via the index operator. The returned values are forwarding
    /// structs to use as little memory as possible. After you are done you should dispose the dataprovider as usual.
    /// </summary>
    public class DataProvider : IList<ForwardingEmptyValueStruct>, IDisposable
    {
        internal List<double> Value1 = new List<double>();
        internal List<int> Value2 = new List<int>();

        static object _Lock = new object();
        internal static DataProvider[] _DataProviders = new DataProvider[0]; // static list of data providers
        byte _ThisIdx;                                                       // index to _Dataproviders array of current instance
        FileStream _dataSource;                                              // Opened file

        public DataProvider(string file)
        {
            _dataSource = new FileStream(file, FileMode.Open, FileAccess.Read, FileShare.ReadWrite);
            AddThis();
        }

        unsafe public void ReadData()
        {
            string line;
            double v1;
            int v2;
            using (StreamReader reader = new StreamReader(_dataSource))
            {
                while ((line = reader.ReadLine()) != null)
                {
                    int space = 0;
                    fixed (char* pStr = line)
                    {
                        for (int i = 0; i < line.Length; i++)
                        {
                            if (pStr[i] == ' ')
                            {
                                space = i;
                                break;
                            }
                            else if (pStr[i] == ',')
                            {
                                pStr[i] = '.';
                            }
                        }

                        if (space != line.Length)
                        {
                            v1 = double.Parse(new string(pStr,0, space));
                            v2 = IntParseFast(pStr+space, line.Length-(space+1));
                            Value1.Add(v1);
                            Value2.Add(v2);
                        }
                    }

         
                }
            }
            Value1.TrimExcess();
            Value2.TrimExcess();
        }


        unsafe static int IntParseFast(char *pStr, int len)
        {
            int result = 0;
            for (int i = 0; i < len; i++)
            {
                char letter = pStr[i];
                result = 10 * result + (letter - 48);
            }
            return result;
        }

        private void AddThis()
        {
            lock (_Lock)
            {
                var prov = new List<DataProvider>(_DataProviders);

                if (prov.Count != byte.MaxValue)
                {
                    _ThisIdx = (byte)prov.Count;
                    prov.Add(this);
                }
                else // more than 256 files open 
                {
                    int tmpIdx = prov.FindIndex(x => x == null);
                    if (tmpIdx == -1)
                    {
                        throw new NotSupportedException("The dataprovider does support only 256 simultanously open files. If you did not expect this error search for missing dispose calls for unused provider objects.");
                    }
                    _ThisIdx = (byte)tmpIdx;
                    prov[_ThisIdx] = this;

                }
                _DataProviders = prov.ToArray();
            }
        }

        #region IDisposable Members

        public void Dispose()
        {
            lock (_Lock)
            {
                _dataSource.Close();
                _DataProviders[_ThisIdx] = null;
                this.Value1 = null;
                this.Value2 = null;
                if (_DataProviders.All(x => x == null))  // no more files open, reset byte array
                {
                    _DataProviders = new DataProvider[0];
                }
            }
        }

        #endregion



        #region IList<ForwardingEmptyValueStruct> Members

        public int IndexOf(ForwardingEmptyValueStruct item)
        {
            throw new NotImplementedException();
        }

        public void Insert(int index, ForwardingEmptyValueStruct item)
        {
            throw new NotImplementedException();
        }

        public void RemoveAt(int index)
        {
            throw new NotImplementedException();
        }

        public ForwardingEmptyValueStruct this[int index]
        {
            get
            {
                return new ForwardingEmptyValueStruct(_ThisIdx, index); // return forwarding struct to save memory
            }
            set
            {
                throw new NotImplementedException();
            }
        }

        #endregion

        #region ICollection<ForwardingEmptyValueStruct> Members

        public void Add(ForwardingEmptyValueStruct item)
        {
            throw new NotImplementedException();
        }

        public void Clear()
        {
            Value1.Clear();
            Value2.Clear();
        }

        public bool Contains(ForwardingEmptyValueStruct item)
        {
            throw new NotImplementedException();
        }

        public void CopyTo(ForwardingEmptyValueStruct[] array, int arrayIndex)
        {
            for (int i = arrayIndex; i < this.Count; i++)
            {
                array[i] = this[i];
            }
        }

        public int Count
        {
            get { return Value1.Count; }
        }

        public bool IsReadOnly
        {
            get { return true; }
        }

        public bool Remove(ForwardingEmptyValueStruct item)
        {
            throw new NotImplementedException();
        }

        #endregion

        #region IEnumerable<ForwardingEmptyValueStruct> Members

        public IEnumerator<ForwardingEmptyValueStruct> GetEnumerator()
        {
            for (int i = 0; i < this.Count; i++)
            {
                yield return this[i];
            }
        }

        #endregion

        #region IEnumerable Members

        System.Collections.IEnumerator System.Collections.IEnumerable.GetEnumerator()
        {
            throw new NotImplementedException();
        }

        #endregion
    }
Posted On Thursday, August 18, 2011 11:24 AM | Feedback (0)
Do You Know The Costs Of Garbage?

If you want to write scalable applications with a high throughput you need to take care of good data structure design to make your application scale. If you want to read for example 200 MB data from a file and process it you can finish it in 5 or 30s. To be on the 5s side it is educational to see how fast you actually can get. To measure the raw disc performance I do read the file into a pre allocated byte array.

[Test]
public void ReadFile_Store_In_ByteArray()
{
    var sw = Stopwatch.StartNew();
    int read = 0;
    int numRead = 0;
    using (FileStream rStream = new FileStream(TestData, FileMode.Open, FileAccess.Read, FileShare.ReadWrite, 4096, FileOptions.SequentialScan))
    {
        byte[] bytes = new byte[210 * 1024 * 1024];

        while ((numRead = rStream.Read(bytes, read, 4096)) > 0)
        {
            read += numRead;
        }
    }
    Console.WriteLine("Did read {0:N0} bytes in {1:F2}s", read, sw.Elapsed.TotalSeconds);
}

***** UnitTests.PushingLimits.ReadFile_Store_In_ByteArray
Did read 216.000.003 bytes in 0,60s

Not bad. The Windows cache does deliver 360 MB/s to my application. If I read the file  the first time I see the real disc performance which is around 160 MB which is still quite good for my Intel PostVille X25M. One common myth many believe in is that if you are reading from disc you can process the data much faster than you can read. Lets try it out. This time we do simply read the file line by line and store the lines in a List<string>.

 

[Test]
public void ReadFile_Store_InList()
{
    var sw = Stopwatch.StartNew();
    FileStream rStream = new FileStream(TestData, FileMode.Open, FileAccess.Read, FileShare.ReadWrite, 4096, FileOptions.SequentialScan);
    int lineNo = 0;
    List<string> lines = new List<string>(2*1000*1000);
    using (StreamReader reader = new StreamReader(rStream))
    {
        string line;
        while ((line = reader.ReadLine()) != null)
        {
            lineNo++;
            lines.Add(line);
        }
    }
    Console.WriteLine("Did read {0:N0} lines in {1:F2}s", lineNo, sw.Elapsed.TotalSeconds);
}

***** UnitTests.PushingLimits.ReadFile_Store_In_List
Did read 2.000.000 lines in 4,92s

Why the heck ware we over 8 times slower? I did not do anything with them! Ok to read form the files the strings we need a StreamReader which needs to convert the byte array into a char array and then into a string. The input file was an ASCII file which means that while we convert a 200 MB file into strings we consume 400 MB of memory because .NET strings are UTF-16. These things do certainly cost us some performance but not 8 times. Lets do the same but this time we do only count the lines of the file.

 

[Test]
public void ReadFile()
{
    var sw = Stopwatch.StartNew();
    FileStream rStream = new FileStream(TestData, FileMode.Open, FileAccess.Read, FileShare.ReadWrite, 4096, FileOptions.SequentialScan);
    int lineNo = 0;
    using (StreamReader reader = new StreamReader(rStream))
    {
        string line;
        while ((line = reader.ReadLine()) != null)
        {
            lineNo++;
        }
    }
    Console.WriteLine("Did read {0:N0} lines in {1:F2}s", lineNo, sw.Elapsed.TotalSeconds);
}

 

***** UnitTests.PushingLimits.ReadFile
    Did read 2.000.000 lines in 1,57s

Thats more like it. From 0,6s to 1,6s is still slower but here we do see what decoding the byte array into .NET strings actually cost. It is not as bad as it looks since we did not yet process (e.g. parse, convert,…) the lines yet which does also add a significant amount of time. But why are we getting over 3 times faster when we throw away our read data instead of keeping it? The Garbage Collector should not have any work to do if we keep all our data in a list. As it turns out this is the point.

The GC has a lot more to do if we store the strings in a list since we add 2 million string references to an array. The GC kicks in from time to time to clean up any garbage and needs to traverse the complete object graph which just becomes more and more complex if we add more managed references to our list.

If your managed application consumes over a few hundred MB of memory it will become much slower if the data is stored in the object graph which needs to be traversed by the GC. It is as simple as that. The more managed references you add the more work the GC has to do for each full GC. To prove this point lets look what happens if we switch from classes to structs which are value types and not under GC control.

First we define a class which has 10 long values stored

class ClassContainer
{
    public ClassContainer(long value)
    {
        v1 = value++;
        v2 = value++;
        v3 = value++;
        v4 = value++;
        v5 = value++;
        v6 = value++;
        v7 = value++;
        v8 = value++;
        v9 = value++;
        v10 = value++;
    }
    long v1,v2,v3,v4,v5,v6,v7,v8,v9,v10;
}

A long value consumes 8 bytes which attributes to 80 bytes of payload of our class.

[Test]
public void Create_Class_List()
{
    var start = GC.GetTotalMemory(true);
    var sw = Stopwatch.StartNew();
    var arr = new ClassContainer[N];
    for (int i = 0; i < arr.Length; i++)
    {
        arr[i] = new ClassContainer(i);
    }
    sw.Stop();
    Console.WriteLine("Did insert {0:N0} classes into list. {1:F2}s, GC Memory: {2:N0}",
                        arr.Length, sw.Elapsed.TotalSeconds, GC.GetTotalMemory(true)-start);
}

***** UnitTests.StructVsClass.Create_Class_List
Did insert 5.000.000 classes into list. 3,74s, GC Memory: 459.996.312

 

Now lets to the same thing for a struct with the same content.

[Test]
public void Create_Struct_List()
{
    var start = GC.GetTotalMemory(true);
    var sw = Stopwatch.StartNew();
    var arr = new StructContainer[N];
    for (int i = 0; i < arr.Length; i++)
    {
        arr[i] = new StructContainer(i);
    }
    sw.Stop();
    Console.WriteLine("Did insert {0:N0} structs into list. {1:F2}s, GC Memory: {2:N0}",
        arr.Length, sw.Elapsed.TotalSeconds, GC.GetTotalMemory(true)-start);

    Console.WriteLine("Class Memory = N*(10*8+Reference+9) = {0:N0}, Struct Memory = N*10*8 = {1:N0}",
        N * (10 * 8 + IntPtr.Size+9), N * 10 * 8);
}

***** UnitTests.StructVsClass.Create_Struct_List
Did insert 5.000.000 structs into list. 0,58s, GC Memory: 400.00.096
Class Memory = N*(10*8+Reference+9) = 465.000.000, Struct Memory = N*10*8 = 400.000.000

 

Although we do insert the same amount of data the performance does differ over a factor 6! This is the price you pay if the GC needs to do more work. Things will become worse if you are a 64 bit application with several GB of memory consumption. Did you notice that (especially managed) applications with a high memory consumption react slowly on user input? It could be that the application does not do much but the GC brings it down. If you want to write an application with a high data throughput and high memory consumption you will hit a wall if you do not take care to design your application more GC friendly.

To declare all classes as structs is not the solution since you are then copying the data by value with every function call you pass the data around which is for my 80 byte structure a quite big overhead. Luckily you can create data structures which do play well with the GC. I will give you some unique data structures in my next posts. It is possible to use structures as general purpose data container with very little overhead …

Posted On Saturday, August 13, 2011 12:03 AM | Feedback (1)