James Michael Hare

...hare-brained ideas from the realm of software development...
Welcome to my blog! I'm a Sr. Software Development Engineer in Seattle, WA. I've been doing C++/C#/Java development for over 18 years, but have definitely learned that there is always more to learn!

All thoughts and opinions expressed in my blog and my comments are my own and do not represent the thoughts of my employer.

C#: System.Collections.Concurrent.ConcurrentQueue vs. Queue

I love new toys, so of course when .NET 4.0 came out I felt like the proverbial kid in the candy store!  Now, some people get all excited about the IDE and its new features, or about changes to WPF and Silverlight, and yes, those are all very fine and grand.  But me, I get excited about the things that affect my life on the back end of development.  That's why when I heard there were going to be concurrent container implementations in the latest version of .NET, I was salivating like Pavlov's dog at the dinner bell.

They seem so simple, really, that one could easily overlook them.  Essentially, they are implementations of containers (many mirroring the generic collections, others new) that are completely thread-safe while using very efficient, limited, or even no locking -- and I just had to see what kind of improvement that would translate into.

Since part of my job as a solutions architect here where I work is to help design, develop, and maintain the systems that process tons of requests each second, the thought of extremely efficient thread-safe containers was extremely appealing.  Of course, they also rolled out a whole parallel development framework which I won’t get into in this post but will cover bits and pieces of as time goes by.

This time, I was mainly curious how well these new concurrent containers would perform compared to areas in our code where we manually synchronize with lock or some other mechanism.  So I set about running a processing test with a series of producers and consumers operating on either a traditional System.Collections.Generic.Queue or a System.Collections.Concurrent.ConcurrentQueue.

Now, I wanted to keep the code as common as possible to make sure that the only variance was the container, so I created a test Producer and a test Consumer.  The test Producer takes an Action<string> delegate which is responsible for taking a string and placing it on whichever queue we’re testing in a thread-safe manner:

    internal class Producer
    {
        public int Iterations { get; set; }
        public Action<string> ProduceDelegate { get; set; }

        public void Produce()
        {
            for (int i = 0; i < Iterations; i++)
            {
                ProduceDelegate("Hello");
            }
        }
    }

Then likewise, I created a consumer that took a Func<string> that would read from whichever queue we’re testing and return either the string if data exists or null if not.  Then, if the item doesn’t exist, it will do a 10 ms wait before testing again.  Once all the producers are done and join the main thread, a flag will be set in each of the consumers to tell them once the queue is empty they can shut down since no other data is coming:

    internal class Consumer
    {
        public Func<string> ConsumeDelegate { get; set; }
        public bool HaltWhenEmpty { get; set; }

        public void Consume()
        {
            bool processing = true;

            while (processing)
            {
                string result = ConsumeDelegate();

                if (result == null)
                {
                    if (HaltWhenEmpty)
                    {
                        processing = false;
                    }
                    else
                    {
                        Thread.Sleep(TimeSpan.FromMilliseconds(10));
                    }
                }
                else
                {
                    DoWork();    // do something non-trivial so consumers lag behind a bit
                }
            }
        }
    }

Okay, now that we've done that, we can launch varying numbers of threads using lambdas for each different method of production/consumption.  First, let's look at the lambdas for a typical System.Collections.Generic.Queue with locking:

    // lambda for putting to a typical Queue with locking...
    Action<string> productionDelegate = s =>
        {
            lock (_mutex)
            {
                _mutexQueue.Enqueue(s);
            }
        };

    // and lambda for getting from a typical Queue with locking...
    Func<string> consumptionDelegate = () =>
        {
            lock (_mutex)
            {
                if (_mutexQueue.Count > 0)
                {
                    return _mutexQueue.Dequeue();
                }
            }
            return null;
        };

Nothing new or interesting here.  Just typical locks on an internal object instance.  Now let's look at using a ConcurrentQueue from the System.Collections.Concurrent namespace:

    // lambda for putting to a ConcurrentQueue, notice it needs no locking!
    Action<string> productionDelegate = s =>
        {
            _concurrentQueue.Enqueue(s);
        };

    // lambda for getting from a ConcurrentQueue, once again, no locking required.
    Func<string> consumptionDelegate = () =>
        {
            string s;
            return _concurrentQueue.TryDequeue(out s) ? s : null;
        };

So I pass each of these lambdas, along with the number of producer and consumer threads to launch, and take a look at the timing results.  Basically, I'm timing from the moment all threads start producing/consuming to the moment all threads rejoin.

I won't bore you with the test code; it simply creates the producers and consumers, launches each in its own thread, and waits for them all to rejoin.  The following are the timings from the start of all threads to the Join() of the last thread.  The producers create 10,000,000 items split evenly between themselves, and when all producers are done they signal the consumers to stop once the queue is empty.
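A driver along these lines -- start all the threads, time until the last Join(), and signal the consumers once the producers finish -- might be sketched as follows.  This is my own minimal reconstruction, not the author's actual test code, and all names here (QueueBenchmark, Run) are hypothetical:

```csharp
using System;
using System.Collections.Concurrent;
using System.Diagnostics;
using System.Threading;

// Hypothetical test driver: launches producers/consumers on their own
// threads, waits for all to rejoin, and times the whole run.
internal static class QueueBenchmark
{
    private static volatile bool _producersDone;

    public static long Run(int producerCount, int consumerCount, int totalItems)
    {
        var queue = new ConcurrentQueue<string>();
        _producersDone = false;
        long consumed = 0;

        var producers = new Thread[producerCount];
        for (int i = 0; i < producerCount; i++)
        {
            int perProducer = totalItems / producerCount;
            producers[i] = new Thread(() =>
            {
                for (int j = 0; j < perProducer; j++)
                    queue.Enqueue("Hello");
            });
        }

        var consumers = new Thread[consumerCount];
        for (int i = 0; i < consumerCount; i++)
        {
            consumers[i] = new Thread(() =>
            {
                while (true)
                {
                    string s;
                    if (queue.TryDequeue(out s))
                        Interlocked.Increment(ref consumed);
                    else if (_producersDone)
                        break;              // empty and no more data coming: halt
                    else
                        Thread.Sleep(10);   // empty but producers still running
                }
            });
        }

        var watch = Stopwatch.StartNew();
        foreach (var t in producers) t.Start();
        foreach (var t in consumers) t.Start();
        foreach (var t in producers) t.Join();
        _producersDone = true;              // signal consumers to drain and stop
        foreach (var t in consumers) t.Join();
        watch.Stop();

        Console.WriteLine("{0} items consumed in {1} ms",
            Interlocked.Read(ref consumed), watch.ElapsedMilliseconds);
        return Interlocked.Read(ref consumed);
    }
}
```

Swapping the Enqueue/TryDequeue bodies for the locked Queue lambdas above gives the other half of the comparison.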

These are the results in milliseconds from the ordinary Queue with locking:

   Consumers   Producers   Run 1    Run 2    Run 3    Avg (ms)
   ----------  ----------  ------   ------   ------   ---------
   1           1           4284     5153     4226     4554.33
   10          10          4044     3831     5010     4295.00
   100         100         5497     5378     5612     5495.67
   1000        1000        24234    25409    27160    25601.00

And the following are the results in milliseconds from the ConcurrentQueue with no locking necessary:

   Consumers   Producers   Run 1    Run 2    Run 3    Avg (ms)
   ----------  ----------  ------   ------   ------   ---------
   1           1           3647     3643     3718     3669.33
   10          10          2311     2136     2142     2196.33
   100         100         2480     2416     2190     2362.00
   1000        1000        7289     6897     7061     7082.33

Note that even though 2000 threads is obviously quite extreme, the concurrent queue scales really well, whereas the traditional queue with simple locking scales much more poorly.

I love the new concurrent collections: they keep your code so much simpler by not littering it with locking logic, and they perform much better.  All in all, a great new toy to add to your arsenal of multi-threaded processing!

posted on Monday, June 7, 2010 8:37 PM | Filed Under [ My Blog C# Software .NET ]

Feedback


# re: C#: System.Collections.Concurrent.ConcurrentQueue vs. Queue

I've found that using queue.Count to determine whether a queue has items causes all kinds of performance problems... massive performance problems.  It may even cause some kind of race condition.

In my tests, doing a queue.Dequeue and catching an InvalidOperationException is far more efficient.

In VB, something like this:

    Try
        While True
            Dim data As HitCountObject = Nothing
            SyncLock m_queue
                data = m_queue.Dequeue()
            End SyncLock
            'If data is nothing, Dequeue would have thrown an InvalidOperationException
            LogInformation(data)
        End While
    Catch ex As InvalidOperationException
        'if an exception occurs because of no items, break out and sleep
    End Try

    Threading.Thread.Sleep(1000)

    '.. then retry

When I did a test using this code, the Queue outperformed the ConcurrentQueue (with the ConcurrentQueue using no locking).  Using Queue.Count in place of what I have causes the ConcurrentQueue to outperform the Queue.
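For readers following along in C#, the exception-driven pattern above might translate roughly as follows.  This is an illustrative sketch only; the class and member names (ExceptionDrivenConsumer, DrainOnce) are hypothetical stand-ins for the commenter's code:

```csharp
using System;
using System.Collections.Generic;

// Rough C# equivalent of the VB pattern above: rely on Queue<T>.Dequeue()
// throwing InvalidOperationException to signal "queue empty" rather than
// checking Count first.
internal class ExceptionDrivenConsumer
{
    private readonly object _mutex = new object();
    private readonly Queue<string> _queue = new Queue<string>();

    public void Enqueue(string item)
    {
        lock (_mutex) { _queue.Enqueue(item); }
    }

    // Drains the queue until empty; returns the number of items processed.
    public int DrainOnce()
    {
        int processed = 0;
        try
        {
            while (true)
            {
                string data;
                lock (_mutex)
                {
                    data = _queue.Dequeue();   // throws when the queue is empty
                }
                processed++;                   // stand-in for LogInformation(data)
            }
        }
        catch (InvalidOperationException)
        {
            // Queue was empty; per the VB version, the caller sleeps and retries.
        }
        return processed;
    }
}
```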
8/25/2011 4:13 PM | Tim Doke

# re: C#: System.Collections.Concurrent.ConcurrentQueue vs. Queue

All Count does is return a property, which is extremely cheap and shouldn't make any difference.  Also, if your Count check is NOT inside your lock, then yes, you would get race conditions; keeping Count inside the lock eliminates them.

Now exceptions, on the other hand, are much more expensive when hit.  My gut reaction is that there's something about your test skewing the results: not only because my tests showed differently, but because Microsoft's published papers on the performance of the various concurrent collections vs. non-concurrent locked collections bear out the claim that ConcurrentQueue is faster in most multi-threaded situations.

To make your test valid, you'd have to make sure you have producers and consumers running at the same time AND producing at non-constant rates.  That is, you'd have to have several points where the queue was empty during a timed cycle, to incur the cost of the exception being thrown, which is VERY expensive compared to locking and Count.  Remember, exceptions have to pull a StackFrame, which is a non-trivial task, and allocate memory.

All of this leads me to believe that your exception isn't getting hit as part of your tests (until possibly the end), and that you're either front-loading your queue or producing at a very constant rate faster than consumption.

Also remember that ConcurrentQueue is meant to scale better to active (and possibly multiple) producers and consumers. If there's no contention on the lock, then yes Queue will generally outperform ConcurrentQueue. But typically in multi-threaded applications this is not the case and the lock is hotly contested.

I'd be interested to see the full source of your test. Is it available to send?
8/25/2011 4:45 PM | James Michael Hare

# re: C#: System.Collections.Concurrent.ConcurrentQueue vs. Queue

I used the testing tools included in Visual Studio 2010 (using .loadtest files).  The macro created simply hit a generic handler page (.ashx) 10 times per test and was run 100 times.  I would pass four things in the query string that would then be inserted into a database.  There was a simulated user count of 250.

The queue I was testing had a single consumer that started up in the application start event of the Global.asax (when I went to production, I used a Windows WCF service running a task to process the queue).  When the .ashx page was hit, items would be added to the same queue for processing.  It simply logged data to a database.  The handler added items to the queue and the single processor removed them as fast as possible.

I found using a queue for this scenario allowed for vastly more database inserts than a straight insert from the .ashx page allowed.

As for the exceptions comment: it would only error out when there were no items to insert, and in this testing scenario that would not happen.  Also, checking the count each time is way slower than just dequeuing.  Anyway, I tried using Count early on in my tests and had horrible performance.

Using the queue in the manner of my previous post, I was able to insert 104,000 records in the best test. The concurrent queue never inserted more than 10,000 records.

I was actually interested in what type of collection, allowing for multiple clients and a single processor, would allow for the most inserts.  I was thinking of retesting with the ConcurrentBag collection.  I was satisfied with the Queue and was very disappointed with the ConcurrentQueue, so I never bothered testing other concurrent collection types.
8/25/2011 5:08 PM | Tim Doke

# re: C#: System.Collections.Concurrent.ConcurrentQueue vs. Queue

Yes, I can send the source, if you want. Just let me know where to send it. I would be curious if you think I could squeeze more performance out of it.
8/25/2011 5:12 PM | Tim Doke

# re: C#: System.Collections.Concurrent.ConcurrentQueue vs. Queue

I'll try my tests w/o Count tomorrow, but I can't see how that would make an order of magnitude difference offhand.  Would be very interesting if so.
8/25/2011 5:54 PM | James Michael Hare

# re: C#: System.Collections.Concurrent.ConcurrentQueue vs. Queue

@Tim: I'll try my test without Count tomorrow, but an order of magnitude difference w/o Count seems rather extreme. I'll just have to try to look at the code for both ConcurrentQueue and Queue and we'll see.

It COULD be something relating to the fact that you have a single-processor system and I'm on dual core, but I still wouldn't think that would account for a whole order of magnitude.

Once I get to my main computer tomorrow and run the tests, it should become apparent if it's the scenario (WCF vs Console App) or a fluke with the load test tool or the processor architecture.

On the ConcurrentBag note: ConcurrentBag won't help you here.  ConcurrentBag is optimized for when a single thread is both a consumer and producer of the data, which in your case wouldn't apply.  When the producer and consumer are on different threads, it performs "work stealing," which is less efficient.
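For reference, basic ConcurrentBag usage is just Add and TryTake; the snippet below is a minimal sketch (class name BagExample is hypothetical), showing only the API shape, not the work-stealing behavior:

```csharp
using System;
using System.Collections.Concurrent;

internal static class BagExample
{
    public static int Sum()
    {
        var bag = new ConcurrentBag<int>();
        bag.Add(1);
        bag.Add(2);
        bag.Add(3);

        // TryTake removes *some* element -- ConcurrentBag is unordered,
        // and takes are cheapest on the thread that did the adds.
        int sum = 0, item;
        while (bag.TryTake(out item))
            sum += item;
        return sum;
    }
}
```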
8/25/2011 7:03 PM | James Michael Hare

# re: C#: System.Collections.Concurrent.ConcurrentQueue vs. Queue

@Tim:

Okay, I just recreated my tests and used a single producer creating 10 million items and a single consumer, each on a separate task.

I then started the tasks and let them produce and consume.  I tried a Queue with a lock(), both with Count and without Count using try/catch.  There was no appreciable difference in my test code between using and not using Count (it always ended up at ~8600 ms +/- 10 ms).

So I'd have to see your full code to find any nuance I'm missing; otherwise, I've tried recreating the scenario and my ConcurrentQueue is still outperforming the Queue for me.

8/26/2011 10:35 AM | James Michael Hare

# re: C#: System.Collections.Concurrent.ConcurrentQueue vs. Queue

@Tim: After looking over the code you sent, I really think the issue is the structure of the loop.  That 10 second wait (10,000 ms) is really WAY too big.  I'd eliminate it completely for ConcurrentQueue and instead do a short 10 ms wait after a failed TryDequeue() call.  My guess is that ConcurrentQueue is faster, and thus it drops from the loop quicker and hits that 10 second pause, which kills the performance.  Or the loop structure is incorrect when you use ConcurrentQueue (since it doesn't throw).
8/29/2011 8:57 AM | James Michael Hare

# re: C#: System.Collections.Concurrent.ConcurrentQueue vs. Queue

>All Count does is return a property, which is extremely performant and shouldn't make any difference. Also, if your Count property is NOT inside your lock, then yes you would get race conditions, keeping Count inside the lock eliminates the race conditions.

Just want to point out, this is not true for all collections.  Most concurrent collections explicitly include both Count and IsEmpty properties, because getting an actual count can be extremely expensive compared to checking whether there are any elements at all.
2/3/2012 2:52 PM | Chu