
C#: The Curious ConcurrentDictionary

Update: I revised some of my initial thoughts after taking a bit of my own advice and thinking about the less-tangible benefits of a ConcurrentDictionary.  This leads me to believe that any time you use a Dictionary in a read/write manner in a multi-threaded environment, you should use ConcurrentDictionary instead for the simplicity and safety.  It’s easier to read and more maintainable, and even if you are write-heavy, it’s not orders of magnitude slower, so I think it’s worth it for the safety and maintainability gains (not to mention the performance gains on read-heavy uses).

In my previous post (here) I did a comparison of the new ConcurrentQueue versus the old standard of a System.Collections.Generic Queue with simple locking.  The results were exactly what I would have hoped: the ConcurrentQueue was faster with multi-threading in almost all situations.  In addition, concurrent collections have the added benefit that you can enumerate them even while they’re being modified.
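As a quick illustration of that last point (this snippet is not from the original post, just a minimal sketch), enumerating a ConcurrentDictionary while another thread is adding to it does not throw, whereas a plain Dictionary modified during enumeration would throw an InvalidOperationException:

// Sketch only: enumerate a concurrent collection while another thread writes to it.
// Assumes usings for System, System.Collections.Concurrent, and System.Threading.Tasks.
var cache = new ConcurrentDictionary<int, string>();

// writer task keeps adding entries
var writer = Task.Factory.StartNew(() =>
{
    for (int i = 0; i < 100000; i++)
        cache[i] = i.ToString();
});

// meanwhile we can enumerate without locking and without exceptions;
// the enumeration may or may not see entries added after it started
int seen = 0;
foreach (var pair in cache)
    seen++;

writer.Wait();
Console.WriteLine("Enumerated {0} entries while the writer was running.", seen);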

So I set out to see what the improvements would be for the ConcurrentDictionary: would it have the same performance benefits the ConcurrentQueue did?  Well, after running some tests and multiple tweaks and tunes, I have good news and bad news.

But first, let’s look at the tests.  Obviously there are many things we can do with a dictionary.  One of the most notable uses in a multi-threaded environment, of course, is a small, local in-memory cache.  So I set about building a very simple simulation of a cache, where I would create a test class that I’ll just call an Accessor.  This accessor will attempt to look up a key in the dictionary, and if the key exists, it stops (i.e. a cache “hit”).  However, if the lookup fails, it will then try to add the key and value to the dictionary (i.e. a cache “miss”).

So here’s the Accessor that will run the tests:

internal class Accessor
{
    public int Hits { get; set; }
    public int Misses { get; set; }
    public Func<int, string> GetDelegate { get; set; }
    public Action<int, string> AddDelegate { get; set; }
    public int Iterations { get; set; }
    public int MaxRange { get; set; }
    public int Seed { get; set; }

    public void Access()
    {
        var randomGenerator = new Random(Seed);

        for (int i = 0; i < Iterations; i++)
        {
            // give a wide spread so we will have some duplicates and some unique keys
            var target = randomGenerator.Next(1, MaxRange);

            // attempt to grab the item from the cache
            var result = GetDelegate(target);

            // if the item doesn't exist, add it
            if (result == null)
            {
                AddDelegate(target, target.ToString());
                Misses++;
            }
            else
            {
                Hits++;
            }
        }
    }
}

Note that, so I could test different implementations, I defined a GetDelegate and an AddDelegate that call the appropriate dictionary methods to retrieve or add items in the cache using the various techniques.
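For context, here is roughly how an Accessor gets wired up and run; the actual harness code isn’t shown in this post, so the values and structure here are just an illustrative sketch:

// Illustrative wiring only -- not the original harness code.
var accessor = new Accessor
{
    Iterations = 10000,
    MaxRange = 1000,
    Seed = Environment.TickCount,
    GetDelegate = getDelegate,   // one of the lookup lambdas shown below
    AddDelegate = addDelegate    // the matching add lambda
};

accessor.Access();
Console.WriteLine("Hits: {0}, Misses: {1}", accessor.Hits, accessor.Misses);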

So let’s examine the three techniques I decided to test:

  • Dictionary with mutex – Just your standard generic Dictionary with a simple lock construct on an internal object.
  • Dictionary with ReaderWriterLockSlim – Same Dictionary, but now using a lock designed to let multiple readers access simultaneously and then locked when a writer needs access.
  • ConcurrentDictionary – The new ConcurrentDictionary from System.Collections.Concurrent that is supposed to be optimized to allow multiple threads to access safely.
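Each of the implementations below works against a few fields on the test class.  Those declarations aren’t shown in the post, but given how they’re used they would look something like this (assuming the usual System.Collections.Generic, System.Threading, and System.Collections.Concurrent usings):

// Assumed backing fields (not shown in the original post)
private readonly object _mutex = new object();
private readonly Dictionary<int, string> _dictionary = new Dictionary<int, string>();
private readonly ReaderWriterLockSlim _readerWriterLock = new ReaderWriterLockSlim();
private readonly ConcurrentDictionary<int, string> _concurrentDictionary = new ConcurrentDictionary<int, string>();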

So the approach to each of these is also fairly straightforward.  Let’s look at the GetDelegate and AddDelegate implementations for the Dictionary with mutex lock:

Action<int, string> addDelegate = (key, val) =>
{
    lock (_mutex)
    {
        _dictionary[key] = val;
    }
};

Func<int, string> getDelegate = key =>
{
    lock (_mutex)
    {
        string val;
        return _dictionary.TryGetValue(key, out val) ? val : null;
    }
};

Nothing new or fancy here, just your basic lock on a private object and then query/insert into the Dictionary.

Now, for the Dictionary with ReaderWriterLockSlim it’s a little more complex:

Action<int, string> addDelegate = (key, val) =>
{
    _readerWriterLock.EnterWriteLock();
    _dictionary[key] = val;
    _readerWriterLock.ExitWriteLock();
};

Func<int, string> getDelegate = key =>
{
    string val;
    _readerWriterLock.EnterReadLock();
    if (!_dictionary.TryGetValue(key, out val))
    {
        val = null;
    }
    _readerWriterLock.ExitReadLock();
    return val;
};
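One side note (mine, not part of the original test code): in production code you would normally wrap the Enter/Exit pairs in try/finally so that an exception can’t leave the ReaderWriterLockSlim held.  A sketch of the more defensive read path:

// A more defensive version of the read path: the lock exit happens in a finally
// block so an exception can't leave the ReaderWriterLockSlim held.
Func<int, string> getDelegate = key =>
{
    _readerWriterLock.EnterReadLock();
    try
    {
        string val;
        return _dictionary.TryGetValue(key, out val) ? val : null;
    }
    finally
    {
        _readerWriterLock.ExitReadLock();
    }
};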

And finally, the ConcurrentDictionary, which, since it does all its own concurrency control, is remarkably elegant and simple:

Action<int, string> addDelegate = (key, val) =>
{
    _concurrentDictionary[key] = val;
};

Func<int, string> getDelegate = key =>
{
    string s;
    return _concurrentDictionary.TryGetValue(key, out s) ? s : null;
};
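It’s also worth noting, though it goes beyond what this test exercises, that ConcurrentDictionary has a GetOrAdd method that collapses the whole miss-then-add pattern into a single call, so two racing threads can’t both insert different values for the same key (the value factory may still run more than once under contention, but only one result wins).  It wouldn’t let the Accessor count hits and misses separately, which is why the test keeps the get and add paths split, but in a real cache it could replace both delegates.  A sketch:

// GetOrAdd performs the lookup and, on a miss, the add in one call
Func<int, string> getOrAddDelegate = key =>
    _concurrentDictionary.GetOrAdd(key, k => k.ToString());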

Then, I set up a test harness that would simply ask the user for the number of concurrent Accessors that should attempt to access the cache (via Accessor.Access() above), then let them all fly and measure how long it took them to complete.  Each of these tests was run with 10,000,000 cache accesses divided among the available Accessor instances.  All times are in milliseconds.
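The harness itself isn’t listed in the post, but conceptually it does something like the sketch below; the names, numbers, and structure here are my own reconstruction, not the original code:

// Rough sketch of the harness described above (my reconstruction, not the original code).
// Assumes the usual usings: System, System.Diagnostics, System.Linq, System.Threading.Tasks.
const int TotalAccesses = 10000000;
int accessorCount = int.Parse(Console.ReadLine());

var accessors = Enumerable.Range(0, accessorCount)
    .Select(i => new Accessor
    {
        Iterations = TotalAccesses / accessorCount,
        MaxRange = 2000000,               // illustrative; this knob drives the hit/miss ratio
        Seed = i,
        GetDelegate = getDelegate,        // one of the implementations above
        AddDelegate = addDelegate
    })
    .ToList();

var stopwatch = Stopwatch.StartNew();

// one task per Accessor, all hammering the same shared dictionary
var tasks = accessors.Select(a => Task.Factory.StartNew(a.Access)).ToArray();
Task.WaitAll(tasks);

stopwatch.Stop();
Console.WriteLine("{0} accessors finished in {1} ms", accessorCount, stopwatch.ElapsedMilliseconds);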

Dictionary with Mutex Locking
---------------------------------------------
Accessors    Mostly Misses    Mostly Hits
1            7916             3285
10           8293             3481
100          8799             3532
1000         8815             3584

Dictionary with ReaderWriterLockSlim Locking
---------------------------------------------
Accessors    Mostly Misses    Mostly Hits
1            8445             3624
10           11002            4119
100          11076            3992
1000         14794            4861

ConcurrentDictionary
---------------------------------------------
Accessors    Mostly Misses    Mostly Hits
1            17443            3726
10           14181            1897
100          15141            1994
1000         17209            2128

The first test I ran across the board was the Mostly Misses category.  The mostly-misses case (more adds, because the data requested was not yet in the dictionary) shows an interesting trend: at every accessor count, the Dictionary with the simple mutex lock is much faster, and the ConcurrentDictionary is the slowest solution.  But this got me thinking, and a little research seemed to confirm it: maybe the ConcurrentDictionary is more optimized for concurrent “gets” than “adds”.  So, since the ratio of misses to hits was 2 to 1, I decided to reverse that and see the results.

So I tweaked the data so that the number of keys was much smaller than the number of iterations, giving me about a 2 to 1 ratio of hits to misses (twice as likely to find the item already in the cache as to need to add it).  And yes, here we see that the ConcurrentDictionary is indeed faster than the standard Dictionary.  I have a strong feeling that as the ratio of hits to misses gets higher and higher, these numbers get even better as well.  This makes sense, since the ConcurrentDictionary is read-optimized.

Also note that I tried the tests with capacity and concurrency hints on the ConcurrentDictionary but saw very little improvement.  I think this is largely because, over a 10,000,000-access test, it quickly ramped up to the correct capacity and concurrency level, and thus the impact was limited to the first few milliseconds of the run.
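For reference, those hints are just constructor arguments on ConcurrentDictionary; the call looks something like the sketch below (the numbers here are illustrative, not the exact values used in the test):

// ConcurrentDictionary<TKey, TValue>(concurrencyLevel, capacity):
//   concurrencyLevel -- estimated number of threads that will update the dictionary concurrently
//   capacity         -- initial capacity, to reduce internal resizing as entries are added
// The specific numbers below are illustrative only.
var concurrentDictionary =
    new ConcurrentDictionary<int, string>(Environment.ProcessorCount * 4, 1000000);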

So the ConcurrentDictionary is very good at read-heavy uses and a bit slower at write-heavy uses, but given the obvious additional safety and maintainability that the ConcurrentDictionary brings, I would recommend the following:

Use System.Collections.Generic.Dictionary when:

  • You need a single-threaded Dictionary (no locking needed).
  • You need a multi-threaded Dictionary that is loaded only once at creation and never modified (no locking needed).

And use System.Collections.Concurrent.ConcurrentDictionary when:

  • You need a multi-threaded Dictionary where both reads and writes are being performed.
  • You need to be able to iterate over the collection without locking it, even if it’s being modified.

Both dictionaries have their strong suits, but the more I think about it, even with the slightly slower speed on write-heavy multi-threaded use, the ConcurrentDictionary is worth it for the safety and maintainability, and for read-heavy use it is clearly faster.

Posted on Wednesday, June 09, 2010 5:45 PM | Filed Under [ My Blog, C#, Software, .NET ]

This article is part of the GWB Archives. Original Author:  James Michael Hare
