<feed xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:trackback="http://madskills.com/public/xml/rss/module/trackback/" xmlns:wfw="http://wellformedweb.org/CommentAPI/" xmlns:slash="http://purl.org/rss/1.0/modules/slash/" xmlns="http://www.w3.org/2005/Atom" xml:lang="en-GB">
    <title>Simon Cooper</title>
    <link rel="self" type="application/xml" href="http://geekswithblogs.net/simonc/Atom.aspx" />
    <subtitle type="html">Peering into the depths of .NET</subtitle>
    <id>http://geekswithblogs.net/simonc/Default.aspx</id>
    <author>
        <name>simonc</name>
        <uri>http://geekswithblogs.net/simonc/Default.aspx</uri>
    </author>
    <generator uri="http://subtextproject.com" version="Subtext Version 0.0.0.0">Subtext</generator>
    <updated>2012-05-06T17:39:02Z</updated>
    <entry>
        <title>Inside Red Gate - Ricky Leeks</title>
        <link rel="self" type="text/html" href="http://geekswithblogs.net/simonc/archive/2012/05/04/inside-red-gate-ricky-leeks.aspx" />
        <id>http://geekswithblogs.net/simonc/archive/2012/05/04/inside-red-gate-ricky-leeks.aspx</id>
        <published>2012-05-04T15:30:0001:00:00</published>
        <updated>2012-05-04T15:34:43Z</updated>
        <content type="html">&lt;p&gt;So, one of our profilers has a problem. Red Gate produces two .NET profilers - ANTS Performance Profiler (APP) and ANTS Memory Profiler (AMP). Both products help .NET developers solve problems they are virtually guaranteed to encounter at some point in their careers - slow code, and high memory usage, respectively.&lt;/p&gt;

&lt;p&gt;Everyone understands slow code - the symptoms are very obvious (an operation takes 2 hours when it should take 10 seconds), you know when you've solved it (the same operation now takes 15 seconds), and everyone understands how you can use a profiler like APP to help solve your particular problem. High memory usage is a much more subtle and misunderstood concept.&lt;/p&gt;

&lt;h4&gt;How can .NET have memory leaks?&lt;/h4&gt;
&lt;p&gt;The garbage collector, and how the CLR uses and frees memory, is one of the most misunderstood concepts in .NET. There's hundreds of blog posts out there covering various aspects of the GC and .NET memory, some of them helpful, some of them confusing, and some of them are just plain wrong. There's a lot of misconceptions out there. And, if you have got an application that uses far too much memory, it can be hard to wade through all the contradictory information available to even get an idea as to what's going on, let alone trying to solve it.&lt;/p&gt;

&lt;p&gt;That's where a memory profiler, like AMP, comes into play. Unfortunately, that's not the end of the issue. .NET memory management is a large, complicated, and misunderstood problem. Even armed with a profiler, you need to understand what .NET is doing with your objects, how it processes them, and how it frees them, to be able to use the profiler effectively to solve your particular problem.&lt;/p&gt;

&lt;p&gt;And that's what's wrong with AMP - even with all the thought, designs, UX sessions, and research we've put into AMP itself, some users simply don't have the knowledge required to be able to understand what AMP is telling them about how their application uses memory, and so they have problems understanding &amp;amp; solving their memory problem.&lt;/p&gt;

&lt;h4&gt;Ricky Leeks&lt;/h4&gt;

&lt;p&gt;This is where Ricky Leeks comes in. Created by one of the many...colourful...people in Red Gate, he headlines and promotes several tutorials, pages, and articles all with information on how .NET memory management actually works, with the goal to help educate developers on .NET memory management. And educating us all on how far you can push various vegetable-based puns. This, in turn, not only helps them understand and solve any memory issues they may be having, but helps them proactively code against such memory issues in their existing code.&lt;/p&gt;

&lt;p&gt;Ricky's latest outing is an interview on &lt;a href="http://www.dotnetrocks.com/default.aspx?showNum=760"&gt;.NET Rocks&lt;/a&gt;, providing information on the Top 5 .NET Memory Management Gotchas, along with information on a free ebook on .NET Memory Management. Don't worry, there's loads more vegetable-based jokes where those came from...&lt;/p&gt;
&lt;small&gt;Cross posted from &lt;a href="http://www.simple-talk.com/community/blogs/simonc"&gt;Simple Talk&lt;/a&gt;.&lt;/small&gt;&lt;img src="http://geekswithblogs.net/simonc/aggbug/149535.aspx" width="1" height="1" /&gt;</content>
        <wfw:comment>http://geekswithblogs.net/simonc/comments/149535.aspx</wfw:comment>
        <slash:comments>0</slash:comments>
        <wfw:commentRss>http://geekswithblogs.net/simonc/comments/commentRss/149535.aspx</wfw:commentRss>
        <trackback:ping>http://geekswithblogs.net/simonc/services/trackbacks/149535.aspx</trackback:ping>
    </entry>
    <entry>
        <title>Subterranean IL: The ThreadLocal type</title>
        <link rel="self" type="text/html" href="http://geekswithblogs.net/simonc/archive/2012/05/03/subterranean-il-the-threadlocal-type.aspx" />
        <id>http://geekswithblogs.net/simonc/archive/2012/05/03/subterranean-il-the-threadlocal-type.aspx</id>
        <published>2012-05-03T12:11:0001:00:00</published>
        <updated>2012-05-06T17:39:02Z</updated>
        <content type="html">&lt;p&gt;I came across &lt;code&gt;ThreadLocal&amp;lt;T&amp;gt;&lt;/code&gt; while I was researching &lt;code&gt;ConcurrentBag&lt;/code&gt;. To look at it, it doesn't really make much sense. What's all those extra &lt;code&gt;Cn&lt;/code&gt; classes doing in there? Why is there a &lt;code&gt;GenericHolder&amp;lt;T,U,V,W&amp;gt;&lt;/code&gt; class? What's going on? However, digging deeper, it's a rather ingenious solution to a tricky problem.&lt;/p&gt;

&lt;h4&gt;Thread statics&lt;/h4&gt;
&lt;p&gt;Declaring that a variable is thread static, that is, values assigned and read from the field is specific to the thread doing the reading, is quite easy in .NET:&lt;/p&gt;
&lt;pre&gt;[ThreadStatic]
private static string s_ThreadStaticField;&lt;/pre&gt;
&lt;p&gt;&lt;code&gt;ThreadStaticAttribute&lt;/code&gt; is not a &lt;a href="http://www.simple-talk.com/community/blogs/simonc/archive/2010/11/30/95936.aspx"&gt;pseudo-custom attribute&lt;/a&gt;; it is compiled as a normal attribute, but the CLR has in-built magic, activated by that attribute, to redirect accesses to the field based on the executing thread's identity.&lt;/p&gt;

&lt;p&gt;&lt;code&gt;TheadStaticAttribute&lt;/code&gt; provides a simple solution when you want to use a single field as thread-static. What if you want to create an arbitary number of thread static variables at runtime? Thread-static fields can only be declared, and are fixed, at compile time. Prior to .NET 4, you only had one solution - &lt;a href="http://msdn.microsoft.com/en-us/library/6sby1byh.aspx"&gt;thread local data slots&lt;/a&gt;. This is a lesser-known function of &lt;code&gt;Thread&lt;/code&gt; that has existed since .NET 1.1:&lt;/p&gt;

&lt;pre&gt;LocalDataStoreSlot threadSlot = Thread.AllocateNamedDataSlot("slot1");

string value = "foo";
Thread.SetData(threadSlot, value);

string gettedValue = (string)Thread.GetData(threadSlot);&lt;/pre&gt;

&lt;p&gt;Each instance of &lt;code&gt;LocalStoreDataSlot&lt;/code&gt; mediates access to a single slot, and each slot acts like a separate thread-static field.&lt;/p&gt;

&lt;p&gt;As you can see, using thread data slots is quite cumbersome. You need to keep track of &lt;code&gt;LocalDataStoreSlot&lt;/code&gt; objects, it's not obvious how instances of &lt;code&gt;LocalDataStoreSlot&lt;/code&gt; correspond to individual thread-static variables, and it's not type safe. It's also relatively slow and complicated; the internal implementation consists of a whole series of classes hanging off a single thread-static field in &lt;code&gt;Thread&lt;/code&gt; itself, using various arrays, lists, and locks for synchronization. &lt;code&gt;ThreadLocal&amp;lt;T&amp;gt;&lt;/code&gt; is far simpler and easier to use.&lt;/p&gt;

&lt;h4&gt;ThreadLocal&lt;/h4&gt;
&lt;p&gt;&lt;code&gt;ThreadLocal&lt;/code&gt; provides an abstraction around thread-static fields that allows it to be used just like any other class; it can be used as a replacement for a thread-static field, it can be used in a &lt;code&gt;List&amp;lt;ThreadLocal&amp;lt;T&amp;gt;&amp;gt;&lt;/code&gt;, you can create as many as you need at runtime. So what does it do? It can't just have an instance-specific thread-static field, because thread-static fields have to be declared as &lt;code&gt;static&lt;/code&gt;, and so shared between all instances of the declaring type. There's something else going on here.&lt;/p&gt;

&lt;p&gt;The values stored in instances of &lt;code&gt;ThreadLocal&amp;lt;T&amp;gt;&lt;/code&gt; are stored in instantiations of the &lt;code&gt;GenericHolder&amp;lt;T,U,V,W&amp;gt;&lt;/code&gt; class, which contains a single &lt;code&gt;ThreadStatic&lt;/code&gt; field (&lt;code&gt;s_value&lt;/code&gt;) to store the actual value. This class is then instantiated with various combinations of the &lt;code&gt;Cn&lt;/code&gt; types for generic arguments.&lt;/p&gt;

&lt;p&gt;In .NET, each separate instantiation of a generic type has its own static state. For example, &lt;code&gt;GenericHolder&amp;lt;int,C0,C1,C2&amp;gt;&lt;/code&gt; has a completely separate &lt;code&gt;s_value&lt;/code&gt; field to &lt;code&gt;GenericHolder&amp;lt;int,C1,C14,C1&amp;gt;&lt;/code&gt;. This feature is (ab)used by &lt;code&gt;ThreadLocal&lt;/code&gt; to emulate instance thread-static fields.&lt;/p&gt;

&lt;p&gt;Every time an instance of &lt;code&gt;ThreadLocal&lt;/code&gt; is constructed, it is assigned a unique number from the static &lt;code&gt;s_currentTypeId&lt;/code&gt; field using &lt;code&gt;Interlocked.Increment&lt;/code&gt;, in the &lt;code&gt;FindNextTypeIndex&lt;/code&gt; method. The hexadecimal representation of that number then defines the specific &lt;code&gt;Cn&lt;/code&gt; types that instantiates the &lt;code&gt;GenericHolder&lt;/code&gt; class. That instantiation is therefore 'owned' by that instance of &lt;code&gt;ThreadLocal&lt;/code&gt;.&lt;/p&gt;

&lt;p&gt;This gives each instance of &lt;code&gt;ThreadLocal&lt;/code&gt; its own &lt;code&gt;ThreadStatic&lt;/code&gt; field through a specific unique instantiation of the &lt;code&gt;GenericHolder&lt;/code&gt; class. Although &lt;code&gt;GenericHolder&lt;/code&gt; has four type variables, the first one is always instantiated to the type stored in the &lt;code&gt;ThreadLocal&amp;lt;T&amp;gt;&lt;/code&gt;. This gives three free type variables, each of which can be instantiated to one of 16 types (&lt;code&gt;C0&lt;/code&gt; to &lt;code&gt;C15&lt;/code&gt;). This puts an upper limit of 4096 (16&lt;sup&gt;3&lt;/sup&gt;) on the number of &lt;code&gt;ThreadLocal&amp;lt;T&amp;gt;&lt;/code&gt; instances that can be created for each value of T. That is, there can be a maximum of 4096 instances of &lt;code&gt;ThreadLocal&amp;lt;string&amp;gt;&lt;/code&gt;, and separately a maximum of 4096 instances of &lt;code&gt;ThreadLocal&amp;lt;object&amp;gt;&lt;/code&gt;, etc.&lt;/p&gt;

&lt;p&gt;However, there is an upper limit of 16384 enforced on the total number of &lt;code&gt;ThreadLocal&lt;/code&gt; instances in the AppDomain. This is to stop too much memory being used by thousands of instantiations of &lt;code&gt;GenericHolder&amp;lt;T,U,V,W&amp;gt;&lt;/code&gt;, as once a type is loaded into an AppDomain it cannot be unloaded, and will continue to sit there taking up memory until the AppDomain is unloaded. The total number of &lt;code&gt;ThreadLocal&lt;/code&gt; instances created is tracked by the &lt;code&gt;ThreadLocalGlobalCounter&lt;/code&gt; class.&lt;/p&gt;

&lt;p&gt;So what happens when either limit is reached? Firstly, to try and stop this limit being reached, it recycles &lt;code&gt;GenericHolder&lt;/code&gt; type indexes of &lt;code&gt;ThreadLocal&lt;/code&gt; instances that get disposed using the &lt;code&gt;s_availableIndices&lt;/code&gt; concurrent stack. This allows &lt;code&gt;GenericHolder&lt;/code&gt; instantiations of disposed &lt;code&gt;ThreadLocal&lt;/code&gt; instances to be re-used. But if there aren't any available instantiations, then &lt;code&gt;ThreadLocal&lt;/code&gt; falls back on a standard thread local slot using &lt;code&gt;TLSHolder&lt;/code&gt;. This makes it very important to dispose of your &lt;code&gt;ThreadLocal&lt;/code&gt; instances if you'll be using lots of them, so the type instantiations can be recycled.&lt;/p&gt;

&lt;p&gt;The previous way of creating arbitary thread-static variables, thread data slots, was slow, clunky, and hard to use. In comparison, &lt;code&gt;ThreadLocal&lt;/code&gt; can be used just like any other type, and each instance appears from the outside to be a non-static thread-static variable. It does this by using the CLR type system to assign each instance of &lt;code&gt;ThreadLocal&lt;/code&gt; its own instantiated type containing a thread-static field, and so delegating a lot of the bookkeeping that thread data slots had to do to the CLR type system itself! That's a very clever use of the CLR type system.&lt;/p&gt;
&lt;small&gt;Cross posted from &lt;a href="http://www.simple-talk.com/community/blogs/simonc"&gt;Simple Talk&lt;/a&gt;.&lt;/small&gt;&lt;img src="http://geekswithblogs.net/simonc/aggbug/149524.aspx" width="1" height="1" /&gt;</content>
        <wfw:comment>http://geekswithblogs.net/simonc/comments/149524.aspx</wfw:comment>
        <slash:comments>0</slash:comments>
        <wfw:commentRss>http://geekswithblogs.net/simonc/comments/commentRss/149524.aspx</wfw:commentRss>
        <trackback:ping>http://geekswithblogs.net/simonc/services/trackbacks/149524.aspx</trackback:ping>
    </entry>
    <entry>
        <title>Obfuscation is not a panacea</title>
        <link rel="self" type="text/html" href="http://geekswithblogs.net/simonc/archive/2012/04/16/obfuscation-is-not-a-panacea.aspx" />
        <id>http://geekswithblogs.net/simonc/archive/2012/04/16/obfuscation-is-not-a-panacea.aspx</id>
        <published>2012-04-16T10:52:0001:00:00</published>
        <updated>2012-04-16T10:52:00Z</updated>
        <content type="html">&lt;p&gt;So, you want to obfuscate your .NET application. My question to you is:&lt;/p&gt;
&lt;p&gt;Why?&lt;/p&gt;
&lt;p&gt;What are your aims when your obfuscate your application? To protect your IP &amp;amp; algorithms? Prevent crackers from breaking your licensing? Your boss says you need to? To give you a warm fuzzy feeling inside? Obfuscating code correctly can be tricky, it can break your app if applied incorrectly, it can cause problems down the line. Let me be clear - there are some very good reasons why you would want to obfuscate your .NET application. However, you shouldn't be obfuscating for the sake of obfuscating.&lt;/p&gt;
&lt;h4&gt;Security through Obfuscation?&lt;/h4&gt;
&lt;p&gt;Once your application has been installed on a user’s computer, you no longer control it. If they do not want to pay for your application, then nothing can stop them from cracking it, even if the time cost to them is much greater than the cost of actually paying for it. Some people will not pay for software, even if it takes them a month to crack a $30 app. And once it is cracked, there is nothing stopping them from putting the result up on the internet.&lt;/p&gt;
&lt;p&gt;There should be nothing suprising about this; there is no software protection available for general-purpose computers that cannot be cracked by a sufficiently determined attacker. Only by completely controlling the entire stack – software, hardware, and the internet connection, can you have even a chance to be uncrackable. And even then, someone somewhere will still have a go, and probably succeed. Even high-end cryptoprocessors have known vulnerabilities that can be exploited by someone with a scanning electron microscope and lots of free time.&lt;/p&gt;
&lt;p&gt;So, then, why use obfuscation? Well, the primary reason is to protect your IP. What obfuscation is very good at is hiding the overall structure of your program, so that it’s very hard to figure out what exactly the code is doing at any one time, what context it is running in, and how it fits in with the rest of the application; all of which you need to do to understand how the application operates. This is completely different to cracking an application, where you simply have to find a single toggle that determines whether the application is licensed or not, and flip it without the rest of the application noticing.&lt;/p&gt;
&lt;p&gt;However, again, there are limitations. An obfuscated application still has to run in the same way, and do the same thing, as the original unobfuscated application. This means that some of the protections applied to the obfuscated assembly have to be undone at runtime, else it would not run on the CLR and do the same thing. And, again, since we don’t control the environment the application is run on, there is nothing stopping a user from undoing those protections manually, and reversing some of the obfuscation.&lt;/p&gt;
&lt;p&gt;It’s a perpetual arms race, and it always will be. We have plenty of ideas lined about new protections, and the new protections added in SA 6.6 (method parent obfuscation and a new control flow obfuscation level) are specifically designed to be harder to reverse and reconstruct the original structure.&lt;/p&gt;
&lt;p&gt;So then, by all means, obfuscate your application if you want to protect the algorithms and what the application does. That’s what SmartAssembly is designed to do. But make sure you are clear what a .NET obfuscator can and cannot protect you against, and don’t expect your obfuscated application to be uncrackable. Someone, somewhere, will crack your application if they want to and they don’t have anything better to do with their time. The best we can do is dissuade the casual crackers and make it much more difficult for the serious ones.&lt;/p&gt;

&lt;small&gt;Cross posted from &lt;a href="http://www.simple-talk.com/community/blogs/simonc"&gt;Simple Talk&lt;/a&gt;.&lt;/small&gt;&lt;img src="http://geekswithblogs.net/simonc/aggbug/149333.aspx" width="1" height="1" /&gt;</content>
        <wfw:comment>http://geekswithblogs.net/simonc/comments/149333.aspx</wfw:comment>
        <slash:comments>0</slash:comments>
        <wfw:commentRss>http://geekswithblogs.net/simonc/comments/commentRss/149333.aspx</wfw:commentRss>
        <trackback:ping>http://geekswithblogs.net/simonc/services/trackbacks/149333.aspx</trackback:ping>
    </entry>
    <entry>
        <title>.NET vs Windows 8</title>
        <link rel="self" type="text/html" href="http://geekswithblogs.net/simonc/archive/2012/03/28/.net-vs-windows-8.aspx" />
        <id>http://geekswithblogs.net/simonc/archive/2012/03/28/.net-vs-windows-8.aspx</id>
        <published>2012-03-28T10:56:0001:00:00</published>
        <updated>2012-04-03T09:41:00Z</updated>
        <content type="html">&lt;p&gt;So, day 1 of DevWeek. Lots and lots of Windows 8 and WinRT, as you would expect. The keynote had some actual content in it, fleshed out some of the details of how your apps linked into the Metro infrastructure, and confirmed that there would indeed be an enterprise version of the app store available for Metro apps.)&lt;/p&gt;

&lt;p&gt;However, that's, not what I want to focus this post on. What I do want to focus on is this:&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Windows 8 does not make .NET developers obsolete.&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;Phew!&lt;/p&gt;

&lt;h4&gt;.NET in the New Ecosystem&lt;/h4&gt;
&lt;p&gt;In all the hype around Windows 8 the past few months, a lot of developers have got the impression that .NET has been sidelined in Windows 8; C++ and COM is back in vogue, and HTML5 + JavaScript is the New Way of writing applications. You know .NET? It's yesterday's tech. Enter the 21st Century and write &lt;code&gt;&amp;lt;div&amp;gt;&lt;/code&gt;! However, after speaking to people at the conference, and after a couple of talks by Dave Wheeler on the innards of WinRT and how .NET interacts with it, my views on the coming operating system have changed somewhat.&lt;/p&gt;

&lt;p&gt;To summarize what I've picked up, in no particular order (none of this is official, just my sense of what's been said by various people):
&lt;/p&gt;&lt;ul&gt;
&lt;li&gt;Metro apps do not replace desktop apps. That is, Windows 8 fully supports .NET desktop applications written for every other previous version of Windows, and will continue to do so in the forseeable future. There are some apps that simply do not fit into Metro. They do not fit into the touch-based paradigm, and never will. Traditional desktop support is not going away anytime soon.&lt;/li&gt;
&lt;li&gt;The reason Silverlight has been hidden in all the Metro hype is that Metro is essentially based on Silverlight design principles. Silverlight developers will have a much easier time writing Metro apps than desktop developers, as they would already be used to all the principles of sandboxing and separation introduced with Silverlight. It's desktop developers who are going to have to adapt how they work.&lt;/li&gt;
&lt;li&gt;.NET + XAML is equal to HTML5 + JS in importance. Although the underlying WinRT system is built on C++ &amp;amp; COM, most application development will be done either using .NET or HTML5. Both systems have their own wrapper around the underlying WinRT infrastructure, hiding the implementation details.&lt;/li&gt;
&lt;li&gt;The CLR is unchanged; it's the same .NET 4.5 CLR, running IL in .NET assemblies, as on the traditional desktop. The thing that changes between desktop and Metro is the class libraries, which have more in common with the Silverlight libraries than the desktop libraries. In Metro, although all the types look and behave the same to callers, some of the core BCL types are now wrappers around their WinRT equivalents. These wrappers are then enhanced using standard .NET types and code to produce the Metro .NET class libraries.&lt;/li&gt;
&lt;li&gt;You can't simply port a desktop app into Metro. The underlying file IO, network, timing and database access is either completely different or simply missing. Similarly, although the UI is programmed using XAML, the behaviour of the Metro XAML is different to WPF or Silverlight XAML. Furthermore, the new design principles and touch-based interface for Metro applications demand a completely new UI. You will be able to re-use sections of your app encapsulating pure program logic, but everything else will need to be written from scratch. Microsoft has taken the opportunity to remove a whole raft of types and methods from the Metro framework that are obsolete (non-generic collections) or break the sandbox (synchronous APIs); if you use these, you will have to rewrite to use the alternatives, if they exist at all, to move your apps to Metro.&lt;/li&gt;
&lt;li&gt;If you want to write public WinRT components in .NET, there are some quite strict rules you have to adhere to. But the compilers know about these rules; you can write them in C# or VB, and the compilers will tell you when you do something that isn't allowed and deal with the translation to WinRT metadata rather than .NET assemblies.&lt;/li&gt;
&lt;li&gt;It is possible to write a class library that can be used in Metro and desktop applications. However, you need to be very careful not to use types that are available in one but not the other. One can imagine developers writing their own abstraction around file IO and UIs (MVVM anyone?) that can be implemented differently in Metro and desktop, but look the same within your shared library.&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;So, if you're a .NET developer, you have a lot less to worry about. .NET is a viable platform on Metro, and traditional desktop apps are not going away. You don't have to learn HTML5 and JavaScript if you don't want to. Hurray!&lt;/p&gt;
&lt;small&gt;Cross posted from &lt;a href="http://www.simple-talk.com/community/blogs/simonc"&gt;Simple Talk&lt;/a&gt;.&lt;/small&gt;&lt;img src="http://geekswithblogs.net/simonc/aggbug/149125.aspx" width="1" height="1" /&gt;</content>
        <wfw:comment>http://geekswithblogs.net/simonc/comments/149125.aspx</wfw:comment>
        <slash:comments>10</slash:comments>
        <wfw:commentRss>http://geekswithblogs.net/simonc/comments/commentRss/149125.aspx</wfw:commentRss>
        <trackback:ping>http://geekswithblogs.net/simonc/services/trackbacks/149125.aspx</trackback:ping>
    </entry>
    <entry>
        <title>Inside the Concurrent Collections: ConcurrentBag</title>
        <link rel="self" type="text/html" href="http://geekswithblogs.net/simonc/archive/2012/03/26/inside-the-concurrent-collections-concurrentbag.aspx" />
        <id>http://geekswithblogs.net/simonc/archive/2012/03/26/inside-the-concurrent-collections-concurrentbag.aspx</id>
        <published>2012-03-26T16:42:4901:00:00</published>
        <updated>2012-03-26T16:42:49Z</updated>
        <content type="html">&lt;p&gt;Unlike the other concurrent collections, &lt;code&gt;ConcurrentBag&lt;/code&gt; does not really have a non-concurrent analogy. As stated in the &lt;a href="http://msdn.microsoft.com/en-us/library/dd381779.aspx"&gt;MSDN documentation&lt;/a&gt;, ConcurrentBag is optimised for the situation where the same thread is both producing and consuming items from the collection. We'll see how this is the case as we take a closer look. Again, I recommend you have &lt;code&gt;ConcurrentBag&lt;/code&gt; open in a decompiler for reference.&lt;/p&gt;

&lt;h4&gt;Thread Statics&lt;/h4&gt;
&lt;p&gt;ConcurrentBag makes heavy use of thread statics - static variables marked with &lt;a href="http://msdn.microsoft.com/en-us/library/System.ThreadStaticAttribute.aspx"&gt;&lt;code&gt;ThreadStaticAttribute&lt;/code&gt;&lt;/a&gt;. This is a special attribute that instructs the CLR to scope any values assigned to or read from the variable to the executing thread, not globally within the AppDomain.&lt;/p&gt;

&lt;p&gt;This means that if two different threads assign two different values to the same thread static variable, one value will not overwrite the other, and each thread will see the value they assigned to the variable, separately to any other thread. This is a very useful function that allows for ConcurrentBag's concurrency properties.&lt;/p&gt;

&lt;p&gt;You can think of a thread static variable:
&lt;/p&gt;&lt;pre&gt;[ThreadStatic]
private static int m_Value;&lt;/pre&gt;
as doing the same as:
&lt;pre&gt;private static Dictionary&amp;lt;Thread, int&amp;gt; m_Values;&lt;/pre&gt;
where the executing thread's identity is used to automatically set and retrieve the corresponding value in the dictionary.&lt;p&gt;&lt;/p&gt;

&lt;p&gt;In .NET 4, this usage of &lt;code&gt;ThreadStaticAttribute&lt;/code&gt; is encapsulated in the &lt;a href="http://msdn.microsoft.com/en-us/library/dd642243.aspx"&gt;&lt;code&gt;ThreadLocal&lt;/code&gt;&lt;/a&gt; class.&lt;/p&gt;

&lt;h4&gt;Lists of lists&lt;/h4&gt;
&lt;p&gt;ConcurrentBag, at its core, operates as a linked list of linked lists:&lt;/p&gt;

&lt;img src="http://www.simple-talk.com/blogbits/simon.cooper/ConcurrentBag.png" /&gt;

&lt;p&gt;Each outer list node is an instance of &lt;code&gt;ThreadLocalList&lt;/code&gt;, and each inner list node is an instance of &lt;code&gt;Node&lt;/code&gt;. Each outer &lt;code&gt;ThreadLocalList&lt;/code&gt; is owned by a particular thread, accessible through the thread local &lt;code&gt;m_locals&lt;/code&gt; variable:
&lt;/p&gt;&lt;pre&gt;private ThreadLocal&amp;lt;ThreadLocalList&amp;lt;T&amp;gt;&amp;gt; m_locals&lt;/pre&gt;
It is important to note that, although the &lt;code&gt;m_locals&lt;/code&gt; variable is thread-local, that only applies to accesses through that variable. The objects referenced by the thread (each instance of the &lt;code&gt;ThreadLocalList&lt;/code&gt; object) are normal heap objects that are not specific to any thread. Thinking back to the Dictionary analogy above, if each value stored in the dictionary could be accessed by other means, then any thread could access the value belonging to other threads using that mechanism. Only reads and writes to the variable defined as thread-local are re-routed by the CLR according to the executing thread's identity.&lt;p&gt;&lt;/p&gt;

&lt;p&gt;So, although &lt;code&gt;m_locals&lt;/code&gt; is defined as thread-local, the &lt;code&gt;m_headList&lt;/code&gt;, &lt;code&gt;m_nextList&lt;/code&gt; and &lt;code&gt;m_tailList&lt;/code&gt; variables aren't. This means that any thread can access all the thread local lists in the collection by doing a linear search through the outer linked list defined by these variables.&lt;/p&gt;

&lt;h4&gt;Adding items&lt;/h4&gt;
&lt;p&gt;So, onto the collection operations. First, adding items. This one's pretty simple. If the current thread doesn't already own an instance of &lt;code&gt;ThreadLocalList&lt;/code&gt;, then one is created (or, if there are lists owned by threads that have stopped, it takes control of one of those). Then the item is added to the head of that thread's list.&lt;/p&gt;
&lt;p&gt;That's it. Don't worry, it'll get more complicated when we account for the other operations on the list!&lt;/p&gt;

&lt;h4&gt;Taking &amp;amp; Peeking items&lt;/h4&gt;
&lt;p&gt;This is where it gets tricky. If the current thread's list has items in it, then it peeks or removes the head item (&lt;em&gt;not&lt;/em&gt; the tail item) from the local list and returns that. However, if the local list is empty, it has to go and steal another item from another list, belonging to a different thread. It iterates through all the thread local lists in the collection using the &lt;code&gt;m_headList&lt;/code&gt; and &lt;code&gt;m_nextList&lt;/code&gt; variables until it finds one that has items in it, and it steals one item from that list. Up to this point, the two threads had been operating completely independently. To steal an item from another thread's list, the stealing thread has to do it in such a way as to not step on the owning thread's toes.&lt;/p&gt;

&lt;p&gt;Recall how adding and removing items both operate on the head of the thread's linked list? That gives us an easy way out - a thread trying to steal items from another thread can pop in round the back of another thread's list using the &lt;code&gt;m_tail&lt;/code&gt; variable, and steal an item from the back without the owning thread knowing anything about it. The owning thread can carry on completely independently, unaware that one of its items has been nicked.&lt;/p&gt;

&lt;p&gt;However, this only works when there are at least 3 items in the list, as that guarantees there will be at least one node between the owning thread performing operations on the list head and the thread stealing items from the tail - there's no chance of the two threads operating on the same node at the same time and causing a race condition.&lt;/p&gt;

&lt;p&gt;If there's less than three items in the list, then there does need to be some synchronization between the two threads. In this case, the lock on the &lt;code&gt;ThreadLocalList&lt;/code&gt; object is used to mediate access to a thread's list when there's the possibility of contention.&lt;/p&gt;

&lt;h4&gt;Thread synchronization&lt;/h4&gt;
&lt;p&gt;In &lt;code&gt;ConcurrentBag&lt;/code&gt;, this is done using several mechanisms:

&lt;/p&gt;&lt;ol&gt;
&lt;li&gt;Operations performed by the owner thread only take out the lock when there are less than three items in the collection. With three or greater items, there won't be any conflict with a stealing thread operating on the tail of the list.&lt;/li&gt;
&lt;li&gt;If a lock isn't taken out, the owning thread sets the list's &lt;code&gt;m_currentOp&lt;/code&gt; variable to a non-zero value for the duration of the operation. This indicates to all other threads that there is a non-locked operation currently occuring on that list.&lt;/li&gt;
&lt;li&gt;The stealing thread always takes out the lock, to prevent two threads trying to steal from the same list at the same time.&lt;/li&gt;
&lt;li&gt;After taking out the lock, the stealing thread spinwaits until &lt;code&gt;m_currentOp&lt;/code&gt; has been set to zero before actually performing the steal. This ensures there won't be a conflict with the owning thread when the number of items in the list is on the 2-3 item borderline. If any add or remove operations are started in the meantime, and the list is below 3 items, those operations try to take out the list's lock and are blocked until the stealing thread has finished.&lt;/li&gt;
&lt;/ol&gt;&lt;p&gt;&lt;/p&gt;

&lt;p&gt;This allows a thread to steal an item from another thread's list without corrupting it. What about synchronization in the collection as a whole?&lt;/p&gt;

&lt;h4&gt;Collection synchronization&lt;/h4&gt;
&lt;p&gt;Any thread that operates on the collection's global structure (accessing anything outside the thread local lists) has to take out the collection's global lock - &lt;code&gt;m_globalListsLock&lt;/code&gt;. This single lock is sufficient when adding a new thread local list, as the items inside each thread's list are unaffected. However, what about operations (such as &lt;code&gt;Count&lt;/code&gt; or &lt;code&gt;ToArray&lt;/code&gt;) that need to access every item in the collection?&lt;/p&gt;

&lt;p&gt;In order to ensure a consistent view, all operations on the collection are stopped while the count or &lt;code&gt;ToArray&lt;/code&gt; is performed. This is done by freezing the bag at the start, performing the global operation, and unfreezing at the end:&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;The global lock is taken out, to prevent structural alterations to the collection.&lt;/li&gt;
&lt;li&gt;&lt;code&gt;m_needSync&lt;/code&gt; is set to true. This notifies all the threads that they need to take out their list's lock irregardless of what operation they're doing.&lt;/li&gt;
&lt;li&gt;All the list locks are taken out in order. This blocks all locking operations on the lists.&lt;/li&gt;
&lt;li&gt;The freezing thread waits for all current lockless operations to finish by spinwaiting on each &lt;code&gt;m_currentOp&lt;/code&gt; field.&lt;/li&gt;
&lt;/ol&gt;

&lt;p&gt;The global operation can then be performed while the bag is frozen, but no other operations can take place at the same time, as all other threads are blocked on a list's lock. Then, once the global operation has finished, the locks are released, &lt;code&gt;m_needSync&lt;/code&gt; is unset, and normal concurrent operation resumes.&lt;/p&gt;

&lt;h4&gt;Concurrent principles&lt;/h4&gt;

&lt;p&gt;That's the essence of how &lt;code&gt;ConcurrentBag&lt;/code&gt; operates. Each thread operates independently on its own local list, except when they have to steal items from another list. When stealing, only the stealing thread is forced to take out the lock; the owning thread only has to when there is the possibility of contention. And a global lock controls accesses to the structure of the collection outside the thread lists. Operations affecting the entire collection take out all locks in the collection to freeze the contents at a single point in time.&lt;/p&gt;

&lt;p&gt;So, what principles can we extract here?&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;&lt;strong&gt;Threads operate independently&lt;/strong&gt;
&lt;p&gt;Thread-static variables and &lt;code&gt;ThreadLocal&lt;/code&gt; makes this easy. Threads operate entirely concurrently on their own structures; only when they need to grab data from another thread is there any thread contention.&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;Minimised lock-taking&lt;/strong&gt;
&lt;p&gt;Even when two threads need to operate on the same data structures (one thread stealing from another), they do so in such a way such that the probability of actually blocking on a lock is minimised; the owning thread always operates on the head of the list, and the stealing thread always operates on the tail.&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;Management of lockless operations&lt;/strong&gt;
&lt;p&gt;Any operations that don't take out a lock still have a 'hook' to force them to lock when necessary. This allows all operations on the collection to be stopped temporarily while a global snapshot is taken. Hopefully, such operations will be short-lived and infrequent.&lt;/p&gt;&lt;/li&gt;
&lt;/ol&gt;

&lt;p&gt;That's all the concurrent collections covered. I hope you've found it as informative and interesting as I have. Next, I'll be taking a closer look at &lt;code&gt;ThreadLocal&lt;/code&gt;, which I came across while analyzing &lt;code&gt;ConcurrentBag&lt;/code&gt;. As you'll see, the operation of this class deserves a much closer look.&lt;/p&gt;

&lt;p&gt;&lt;small&gt;Cross posted from &lt;a href="http://www.simple-talk.com/community/blogs/simonc"&gt;Simple Talk&lt;/a&gt;&lt;/small&gt;&lt;/p&gt;&lt;img src="http://geekswithblogs.net/simonc/aggbug/149106.aspx" width="1" height="1" /&gt;</content>
        <wfw:comment>http://geekswithblogs.net/simonc/comments/149106.aspx</wfw:comment>
        <slash:comments>2</slash:comments>
        <wfw:commentRss>http://geekswithblogs.net/simonc/comments/commentRss/149106.aspx</wfw:commentRss>
        <trackback:ping>http://geekswithblogs.net/simonc/services/trackbacks/149106.aspx</trackback:ping>
    </entry>
    <entry>
        <title>DevWeek 2012</title>
        <link rel="self" type="text/html" href="http://geekswithblogs.net/simonc/archive/2012/03/26/devweek-2012.aspx" />
        <id>http://geekswithblogs.net/simonc/archive/2012/03/26/devweek-2012.aspx</id>
        <published>2012-03-26T11:14:0001:00:00</published>
        <updated>2012-03-26T11:18:48Z</updated>
        <content type="html">For those interested, myself and a few other people from Red Gate will 
be going to DevWeek 2012 in London this week. I'll be mostly around the .NET and C# talks, but may drop into the architecture and agile talks as well. I'll be blogging interesting stuff I come across as well. If you want to meet up, do
 feel free to contact me via the blog or on twitter at &lt;a href="https://twitter.com/#%21/simonmcooper"&gt;@simonmcooper&lt;/a&gt;. 
See you there!&lt;br /&gt;&lt;br /&gt;&lt;small&gt;Cross posted from &lt;a href="http://www.simple-talk.com/community/blogs/simonc"&gt;Simple Talk&lt;/a&gt;.&lt;/small&gt;&lt;img src="http://geekswithblogs.net/simonc/aggbug/149102.aspx" width="1" height="1" /&gt;</content>
        <wfw:comment>http://geekswithblogs.net/simonc/comments/149102.aspx</wfw:comment>
        <slash:comments>0</slash:comments>
        <wfw:commentRss>http://geekswithblogs.net/simonc/comments/commentRss/149102.aspx</wfw:commentRss>
        <trackback:ping>http://geekswithblogs.net/simonc/services/trackbacks/149102.aspx</trackback:ping>
    </entry>
    <entry>
        <title>Why you shouldn't add methods to interfaces in APIs</title>
        <link rel="self" type="text/html" href="http://geekswithblogs.net/simonc/archive/2012/03/08/why-you-shouldnt-add-methods-to-interfaces-in-apis.aspx" />
        <id>http://geekswithblogs.net/simonc/archive/2012/03/08/why-you-shouldnt-add-methods-to-interfaces-in-apis.aspx</id>
        <published>2012-03-08T15:27:3600:00:00</published>
        <updated>2012-03-08T15:28:41Z</updated>
        <content type="html">&lt;p&gt;It is an oft-repeated maxim that you shouldn't add methods to a publically-released interface in an API. Recently, I was hit hard when this wasn't followed.&lt;/p&gt;

&lt;p&gt;As part of the work on ApplicationMetrics, I've been implementing auto-reporting of MVC action methods; whenever an action was called on a controller, ApplicationMetrics would automatically report it without the developer needing to add manual &lt;code&gt;ReportEvent&lt;/code&gt; calls. Fortunately, MVC provides easy hook when a controller is created, letting me log when it happens - the &lt;code&gt;IControllerFactory&lt;/code&gt; interface.&lt;/p&gt;

&lt;p&gt;Now, the dll we provide to instrument an MVC webapp has to be compiled against .NET 3.5 and MVC 1, as the lowest common denominator. This MVC 1 dll will still work when used in an MVC 2, 3 or 4 webapp because all MVC 2+ webapps have a &lt;a href="http://msdn.microsoft.com/en-us/library/eftw1fys.aspx"&gt;binding redirect&lt;/a&gt; redirecting all references to previous versions of System.Web.Mvc to the correct version, and &lt;a href="http://geekswithblogs.net/simonc/archive/2011/12/23/anatomy-of-a-.net-assembly---type-forwards.aspx"&gt;type forwards&lt;/a&gt; taking care of any moved types in the new assemblies. Or at least, it should.&lt;/p&gt;

&lt;h4&gt;IControllerFactory&lt;/h4&gt;

&lt;p&gt;In MVC 1 and 2, &lt;a href="http://msdn.microsoft.com/en-us/library/system.web.mvc.icontrollerfactory.aspx"&gt;&lt;code&gt;IControllerFactory&lt;/code&gt;&lt;/a&gt; was defined as follows:

&lt;/p&gt;&lt;pre&gt;public interface IControllerFactory
{
    IController CreateController(RequestContext requestContext, string controllerName);
    
    void ReleaseController(IController controller);
}&lt;/pre&gt;

So, to implement the logging controller factory, we simply wrap the existing controller factory:

&lt;pre&gt;internal sealed class LoggingControllerFactory : IControllerFactory
{
    private readonly IControllerFactory m_CurrentController;
    
    public LoggingControllerFactory(IControllerFactory currentController)
    {
        m_CurrentController = currentController;
    }
    
    public IController CreateController(
        RequestContext requestContext, string controllerName)
    {
        // log the controller being used
        FeatureSessionData.ReportEvent("Controller used:", controllerName);
        
        return m_CurrentController.CreateController(requestContext, controllerName);
    }
    
    public void ReleaseController(IController controller)
    {
        m_CurrentController.ReleaseController(controller);
    }
}&lt;/pre&gt;

&lt;p&gt;Easy. This works as expected in MVC 1 and 2. However, in MVC 3 this type was throwing a &lt;code&gt;TypeLoadException&lt;/code&gt;, saying a method wasn't implemented. It turns out that, in MVC 3, the definition of &lt;code&gt;IControllerFactory&lt;/code&gt; was changed to &lt;a href="http://msdn.microsoft.com/en-us/library/system.web.mvc.icontrollerfactory%28v=vs.98%29.aspx"&gt;this&lt;/a&gt;:&lt;/p&gt;

&lt;pre&gt;public interface IControllerFactory
{
    IController CreateController(RequestContext requestContext, string controllerName);
    
    SessionStateBehavior GetControllerSessionBehavior(
        RequestContext requestContext, string controllerName);
        
    void ReleaseController(IController controller);
}&lt;/pre&gt;

&lt;p&gt;There's a new method in the interface. So when our MVC 1 dll was redirected to reference System.Web.Mvc v3, &lt;code&gt;LoggingControllerFactory&lt;/code&gt; tried to implement version 3 of &lt;code&gt;IControllerFactory&lt;/code&gt;, was missing the &lt;code&gt;GetControllerSessionBehaviour&lt;/code&gt; method, and so couldn't be loaded by the CLR.&lt;/p&gt;

&lt;h4&gt;Implementing the new method&lt;/h4&gt;

&lt;p&gt;Fortunately, there was a workaround. Because interface methods are normally implemented implicitly in the CLR, if we simply declare a virtual method matching the signature of the new method in MVC 3, then it will be ignored in MVC 1 and 2 and implement the extra method in MVC 3:

&lt;/p&gt;&lt;pre&gt;internal sealed class LoggingControllerFactory : IControllerFactory
{
    ...
    
    public virtual SessionStateBehaviour GetControllerSessionBehaviour(
        RequestContext requestContext, string controllerName) {}
    
    ...
}&lt;/pre&gt;

&lt;p&gt;However, this also has problems - the &lt;a href="http://msdn.microsoft.com/en-us/library/system.web.sessionstate.sessionstatebehavior.aspx"&gt;&lt;code&gt;SessionStateBehaviour&lt;/code&gt;&lt;/a&gt; type only exists in .NET 4, and we're limited to .NET 3.5 by support for MVC 1 and 2.&lt;/p&gt;

&lt;p&gt;This means that the &lt;em&gt;only&lt;/em&gt; solutions to support all MVC versions are:&lt;/p&gt;
&lt;ol&gt;
&lt;li&gt;Construct the &lt;code&gt;LoggingControllerFactory&lt;/code&gt; type at runtime using reflection&lt;/li&gt;
&lt;li&gt;Produce entirely separate dlls for MVC 1&amp;amp;2 and MVC 3.&lt;/li&gt;
&lt;/ol&gt;

&lt;p&gt;Ugh. And all because of that blasted extra method!&lt;/p&gt;

&lt;h4&gt;Another solution?&lt;/h4&gt;

&lt;p&gt;Fortunately, in this case, there is a third option - System.Web.Mvc also provides a &lt;a href="http://msdn.microsoft.com/en-us/library/system.web.mvc.defaultcontrollerfactory.aspx"&gt;&lt;code&gt;DefaultControllerFactory&lt;/code&gt;&lt;/a&gt; type that can provide the implementation of &lt;code&gt;GetControllerSessionBehaviour&lt;/code&gt; for us in MVC 3, while still allowing us to override &lt;code&gt;CreateController&lt;/code&gt; and &lt;code&gt;ReleaseController&lt;/code&gt;.&lt;/p&gt;
&lt;p&gt;However, this does mean that &lt;code&gt;LoggingControllerFactory&lt;/code&gt; won't be able to wrap any calls to &lt;code&gt;GetControllerSessionBehaviour&lt;/code&gt;. This is an acceptable bug, given the other options, as very few developers will be overriding &lt;code&gt;GetControllerSessionBehaviour&lt;/code&gt; in their own custom controller factory.&lt;/p&gt;

&lt;p&gt;So, if you're providing an interface as part of an API, then please please &lt;em&gt;please&lt;/em&gt; don't add methods to it. Especially if you don't provide a 'default' implementing type. Any code compiled against the previous version that can't be updated will have some very tough decisions to make to support both versions.&lt;/p&gt;

&lt;small&gt;Cross-posted from &lt;a href="http://www.simple-talk.com/community/blogs/simonc"&gt;Simple Talk&lt;/a&gt;.&lt;/small&gt;&lt;img src="http://geekswithblogs.net/simonc/aggbug/148936.aspx" width="1" height="1" /&gt;</content>
        <wfw:comment>http://geekswithblogs.net/simonc/comments/148936.aspx</wfw:comment>
        <slash:comments>2</slash:comments>
        <wfw:commentRss>http://geekswithblogs.net/simonc/comments/commentRss/148936.aspx</wfw:commentRss>
        <trackback:ping>http://geekswithblogs.net/simonc/services/trackbacks/148936.aspx</trackback:ping>
    </entry>
    <entry>
        <title>Oh no! My padding's invalid!</title>
        <link rel="self" type="text/html" href="http://geekswithblogs.net/simonc/archive/2012/02/28/oh-no-my-paddings-invalid.aspx" />
        <id>http://geekswithblogs.net/simonc/archive/2012/02/28/oh-no-my-paddings-invalid.aspx</id>
        <published>2012-02-28T12:33:0000:00:00</published>
        <updated>2012-02-28T14:00:34Z</updated>
        <content type="html">&lt;p&gt;Recently, I've been doing some work involving cryptography, and encountered the standard .NET CryptographicException: 'Padding is invalid and cannot be removed.' Searching on StackOverflow produces 57 questions concerning this exception; it's a very common problem encountered. So I decided to have a closer look.&lt;/p&gt;

&lt;p&gt;To test this, I created a simple project that decrypts and encrypts a byte array:&lt;/p&gt;

&lt;pre&gt;// create some random data
byte[] data = new byte[100];
new Random().NextBytes(data);

// use the Rijndael symmetric algorithm
RijndaelManaged rij = new RijndaelManaged();
byte[] encrypted;

// encrypt the data using a CryptoStream
using (var encryptor = rij.CreateEncryptor())
using (MemoryStream encryptedStream = new MemoryStream())
using (CryptoStream crypto = new CryptoStream(
    encryptedStream, encryptor, CryptoStreamMode.Write))
{
    crypto.Write(data, 0, data.Length);
    encrypted = encryptedStream.ToArray();
}

byte[] decrypted;

// and decrypt it again
using (var decryptor = rij.CreateDecryptor())
using (CryptoStream crypto = new CryptoStream(
    new MemoryStream(encrypted), decryptor, CryptoStreamMode.Read))
{
    byte[] decrypted = new byte[data.Length];
    crypto.Read(decrypted, 0, decrypted.Length);
}&lt;/pre&gt;

&lt;p&gt;Sure enough, I got exactly the same CryptographicException when trying to decrypt the data even in this simple example. Well, I'm obviously missing something, if I can't even get this single method right! What does the exception message actually mean? What am I missing?&lt;/p&gt;

&lt;p&gt;Well, after playing around a bit, I discovered the problem was fixed by changing the encryption step to this:
&lt;/p&gt;&lt;pre&gt;// encrypt the data using a CryptoStream
using (var encryptor = rij.CreateEncryptor())
using (MemoryStream encryptedStream = new MemoryStream())
{
    using (CryptoStream crypto = new CryptoStream(
        encryptedStream, encryptor, CryptoStreamMode.Write))
    {
        crypto.Write(data, 0, data.Length);
    }
    encrypted = encryptedStream.ToArray();
}&lt;/pre&gt;

&lt;p&gt;Aaaah, so that's what the problem was. The CryptoStream wasn't flushing all it's data to the MemoryStream before it was being read, and closing the stream causes it to flush everything to the backing stream. But why does this cause an error in padding?

&lt;/p&gt;&lt;h4&gt;Cryptographic padding&lt;/h4&gt;

&lt;p&gt;All symmetric encryption algorithms (of which Rijndael is one) operates on fixed block sizes. For Rijndael, the default block size is 16 bytes. This means the input needs to be a multiple of 16 bytes long. If it isn't, then the input is padded to 16 bytes using one of the &lt;a href="http://msdn.microsoft.com/en-us/library/system.security.cryptography.paddingmode.aspx"&gt;padding modes&lt;/a&gt;. This is only done to the final block of data to be encrypted.&lt;/p&gt;

&lt;p&gt;CryptoStream has a special method to flush this final block of data - &lt;a href="http://msdn.microsoft.com/en-us/library/system.security.cryptography.cryptostream.flushfinalblock.aspx"&gt;&lt;code&gt;FlushFinalBlock&lt;/code&gt;&lt;/a&gt;. Calling &lt;code&gt;Stream.Flush()&lt;/code&gt; does &lt;em&gt;not&lt;/em&gt; flush the final block, as you might expect. Only by closing the stream or explicitly calling &lt;code&gt;FlushFinalBlock&lt;/code&gt; is the final block, with any padding, encrypted and written to the backing stream. Without this call, the encrypted data is 16 bytes shorter than it should be.&lt;/p&gt;

&lt;p&gt;If this final block wasn't written, then the decryption gets to the final 16 bytes of the encrypted data and tries to decrypt it as the final block with padding. The end bytes don't match the padding scheme it's been told to use, therefore it throws an exception stating what is wrong - what the decryptor expects to be padding actually isn't, and so can't be removed from the stream.&lt;/p&gt;

&lt;p&gt;So, as well as closing the stream before reading the result, an alternative fix to my encryption code is the following:&lt;/p&gt;

&lt;pre&gt;// encrypt the data using a CryptoStream
using (var encryptor = rij.CreateEncryptor())
using (MemoryStream encryptedStream = new MemoryStream())
using (CryptoStream crypto = new CryptoStream(
    encryptedStream, encryptor, CryptoStreamMode.Write))
{
    crypto.Write(data, 0, data.Length);

    // explicitly flush the final block of data
    crypto.FlushFinalBlock();

    encrypted = encryptedStream.ToArray();
}&lt;/pre&gt;

&lt;h4&gt;Conclusion&lt;/h4&gt;
&lt;p&gt;So, if your padding is invalid, make sure that you close or call &lt;code&gt;FlushFinalBlock&lt;/code&gt; on any &lt;code&gt;CryptoStream&lt;/code&gt; performing encryption before you access the encrypted data. &lt;code&gt;Flush&lt;/code&gt; isn't enough. Only then will the final block be present in the encrypted data, allowing it to be decrypted successfully.&lt;/p&gt;
&lt;small&gt;Cross posted from &lt;a href="http://www.simple-talk.com/community/blogs/simonc"&gt;Simple Talk&lt;/a&gt;.&lt;/small&gt;&lt;img src="http://geekswithblogs.net/simonc/aggbug/148828.aspx" width="1" height="1" /&gt;</content>
        <wfw:comment>http://geekswithblogs.net/simonc/comments/148828.aspx</wfw:comment>
        <slash:comments>1</slash:comments>
        <wfw:commentRss>http://geekswithblogs.net/simonc/comments/commentRss/148828.aspx</wfw:commentRss>
        <trackback:ping>http://geekswithblogs.net/simonc/services/trackbacks/148828.aspx</trackback:ping>
    </entry>
    <entry>
        <title>Inside the Concurrent Collections: ConcurrentDictionary</title>
        <link rel="self" type="text/html" href="http://geekswithblogs.net/simonc/archive/2012/02/22/inside-the-concurrent-collections-concurrentdictionary.aspx" />
        <id>http://geekswithblogs.net/simonc/archive/2012/02/22/inside-the-concurrent-collections-concurrentdictionary.aspx</id>
        <published>2012-02-22T18:47:2900:00:00</published>
        <updated>2012-02-22T18:53:42Z</updated>
        <content type="html">&lt;p&gt;Using locks to implement a thread-safe collection is rather like using a sledgehammer - unsubtle, easy to understand, and tends to make any other tool redundant. Unlike the previous two collections I looked at, &lt;code&gt;ConcurrentStack&lt;/code&gt; and &lt;code&gt;ConcurrentQueue&lt;/code&gt;, &lt;code&gt;ConcurrentDictionary&lt;/code&gt; uses locks quite heavily. However, it is careful to wield locks only where necessary to ensure that concurrency is maximised.&lt;/p&gt;

&lt;p&gt;This will, by necessity, be a higher-level look than my other posts in this series, as there is quite a lot of code and logic in &lt;code&gt;ConcurrentDictionary&lt;/code&gt;. Therefore, I do recommend that you have &lt;code&gt;ConcurrentDictionary&lt;/code&gt; open in a decompiler to have a look at all the details that I skip over.&lt;/p&gt;

&lt;h4&gt;The problem with locks&lt;/h4&gt;

&lt;p&gt;There's several things to bear in mind when using locks, as encapsulated by the &lt;code&gt;lock&lt;/code&gt; keyword in C# and the &lt;code&gt;System.Threading.Monitor&lt;/code&gt; class in .NET (if you're unsure as to what &lt;code&gt;lock&lt;/code&gt; does in C#, I briefly covered it in my &lt;a href="http://geekswithblogs.net/simonc/archive/2012/01/05/inside-the-concurrent-collections.aspx"&gt;first post in the series&lt;/a&gt;):
&lt;/p&gt;&lt;ol&gt;
&lt;li&gt;&lt;strong&gt;Locks block threads&lt;/strong&gt;
&lt;br /&gt;The most obvious problem is that threads waiting on a lock can't do any work at all. No preparatory work, no 'optimistic' work like in &lt;code&gt;ConcurrentQueue&lt;/code&gt; and &lt;code&gt;ConcurrentStack&lt;/code&gt;, nothing. It sits there, waiting to be unblocked. This is bad if you're trying to maximise concurrency.&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;Locks are slow&lt;/strong&gt;
&lt;br /&gt;Whereas most of the methods on the &lt;code&gt;Interlocked&lt;/code&gt; class can be compiled down to a single CPU instruction, ensuring atomicity at the hardware level, taking out a lock requires some heavy lifting by the CLR and the operating system. There's quite a bit of work required to take out a lock, block other threads, and wake them up again. If locks are used heavily, this impacts performance.&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;Deadlocks&lt;/strong&gt;
&lt;br /&gt;When using locks there's always the possibility of a deadlock - two threads, each holding a lock, each trying to aquire the other's lock. Fortunately, this can be avoided with careful programming and structured lock-taking, as we'll see.&lt;/li&gt;
&lt;/ol&gt;

&lt;p&gt;So, it's important to minimise where locks are used to maximise the concurrency and performance of the collection.&lt;/p&gt;

&lt;h4&gt;Implementation&lt;/h4&gt;

&lt;p&gt;As you might expect, &lt;code&gt;ConcurrentDictionary&lt;/code&gt; is similar in basic implementation to the non-concurrent &lt;code&gt;Dictionary&lt;/code&gt;, which I studied in a &lt;a href="http://geekswithblogs.net/simonc/archive/2011/09/16/the-.net-dictionary.aspx"&gt;previous post&lt;/a&gt;. I'll be using some concepts introduced in that post, so I recommend you have a quick read of it.&lt;/p&gt;

&lt;p&gt;So, if you were implementing a thread-safe dictionary, what would you do? The naive implementation is to simply have a single lock around all methods accessing the dictionary. This would work, but doesn't allow much concurrency.&lt;/p&gt;

&lt;p&gt;Fortunately, the bucketing used by &lt;code&gt;Dictionary&lt;/code&gt; allows a simple but effective improvement to this - one lock per bucket. This allows different threads modifying different buckets to do so in parallel. Any thread making changes to the contents of a bucket takes the lock for that bucket, ensuring those changes are thread-safe. The method that maps each bucket to a lock is the &lt;code&gt;GetBucketAndLockNo&lt;/code&gt; method:&lt;/p&gt;

&lt;pre&gt;private void GetBucketAndLockNo(
        int hashcode, out int bucketNo, out int lockNo, int bucketCount) {
        
    // the bucket number is the hashcode (without the initial sign bit)
    // modulo the number of buckets
    bucketNo = (hashcode &amp;amp; 0x7fffffff) % bucketCount;
	  
    // and the lock number is the bucket number modulo the number of locks
    lockNo = bucketNo % m_locks.Length;
}&lt;/pre&gt;

&lt;p&gt;However, this does require some changes to how the buckets are implemented. The 'implicit' linked list within a single backing array used by the non-concurrent &lt;code&gt;Dictionary&lt;/code&gt; adds a dependency between separate buckets, as every bucket uses the same backing array. Instead, &lt;code&gt;ConcurrentDictionary&lt;/code&gt; uses a strict linked list on each bucket:&lt;/p&gt;

&lt;img src="http://www.simple-talk.com/blogbits/simon.cooper/ConcurrentDictionary1.png" /&gt;

&lt;p&gt;This ensures that each bucket is entirely separate from all other buckets; adding or removing an item from a bucket is independent to any changes to other buckets.&lt;/p&gt;

&lt;h4&gt;Modifying the dictionary&lt;/h4&gt;

&lt;p&gt;All the operations on the dictionary follow the same basic pattern:&lt;/p&gt;

&lt;pre&gt;  void AlterBucket(TKey key, ...) {
      int bucketNo, lockNo;
&lt;strong&gt;1:&lt;/strong&gt;    GetBucketAndLockNo(
          key.GetHashCode(), out bucketNo, out lockNo, m_buckets.Length);

&lt;strong&gt;2:&lt;/strong&gt;    lock (m_locks[lockNo]) {
&lt;strong&gt;3:&lt;/strong&gt;        Node headNode = m_buckets[bucketNo];
        
&lt;strong&gt;4:&lt;/strong&gt;        &lt;em&gt;Mutate the node linked list as appropriate&lt;/em&gt;
      }
  }&lt;/pre&gt;

&lt;p&gt;For example, when adding another entry to the dictionary, you would iterate through the linked list to check whether the key exists already, and add the new entry as the head node. When removing items, you would find the entry to remove (if it exists), and remove the node from the linked list. Adding, updating, and removing items all follow this pattern.&lt;/p&gt;

&lt;h4&gt;Performance issues&lt;/h4&gt;

&lt;p&gt;There is a problem we have to address at this point. If the number of buckets in the dictionary is fixed in the constructor, then the performance will degrade from O(1) to O(n) when a large number of items are added to the dictionary. As more and more items get added to the linked lists in each bucket, the lookup operations will spend most of their time traversing a linear linked list.&lt;/p&gt;

&lt;p&gt;To fix this, the buckets array has to be resized once the number of items in each bucket has gone over a certain limit. &lt;small&gt;(In &lt;code&gt;ConcurrentDictionary&lt;/code&gt; this limit is when the size of the largest bucket is greater than the number of buckets for each lock. This check is done at the end of the &lt;code&gt;TryAddInternal&lt;/code&gt; method.)&lt;/small&gt;&lt;/p&gt;

&lt;p&gt;Resizing the bucket array and re-hashing everything affects every bucket in the collection. Therefore, this operation needs to take out every lock in the collection. Taking out mutiple locks at once inevitably summons the spectre of the deadlock; two threads each hold a lock, and each trying to acquire the other lock.&lt;/p&gt;

&lt;p&gt;How can we eliminate this? Simple - ensure that threads never try to 'swap' locks in this fashion. When taking out multiple locks, always take them out in the same order, and always take out all the locks you need before starting to release them. In &lt;code&gt;ConcurrentDictionary&lt;/code&gt;, this is controlled by the &lt;code&gt;AcquireLocks&lt;/code&gt;, &lt;code&gt;AcquireAllLocks&lt;/code&gt; and &lt;code&gt;ReleaseLocks&lt;/code&gt; methods. Locks are always taken out and released in the order they are in the &lt;code&gt;m_locks&lt;/code&gt; array, and locks are all released right at the end of the method in a &lt;code&gt;finally&lt;/code&gt; block.&lt;/p&gt;

&lt;p&gt;At this point, it's worth pointing out that the locks array is &lt;em&gt;never&lt;/em&gt; re-assigned, even when the buckets array is increased in size. The number of locks is fixed in the constructor by the &lt;code&gt;concurrencyLevel&lt;/code&gt; parameter. This simplifies programming the locks; you don't have to check if the locks array has changed or been re-assigned before taking out a lock object. And you can be sure that when a thread takes out a lock, another thread isn't going to re-assign the lock array. This would create a new series of lock objects, thus allowing another thread to ignore the existing locks (and any threads controlling them), breaking thread-safety.&lt;/p&gt;

&lt;h4&gt;Consequences of growing the array&lt;/h4&gt;

&lt;p&gt;Just because we're using locks doesn't mean that race conditions aren't a problem. We can see this by looking at the &lt;code&gt;GrowTable&lt;/code&gt; method. The operation of this method can be boiled down to:&lt;/p&gt;

&lt;pre&gt;  private void GrowTable(Node[] buckets) {
      try {
&lt;strong&gt;1:&lt;/strong&gt;        &lt;em&gt;Acquire first lock in the locks array&lt;/em&gt;
          // this causes any other thread trying to take out
          // all the locks to block because the first lock in the array
          // is always the one taken out first
        
          // check if another thread has already resized the buckets array
          // while we were waiting to acquire the first lock   
&lt;strong&gt;2:&lt;/strong&gt;        if (buckets != m_buckets) return;
        
&lt;strong&gt;3:&lt;/strong&gt;        &lt;em&gt;Calculate the new size of the backing array&lt;/em&gt;

&lt;strong&gt;4:&lt;/strong&gt;        Node[] array = new array[size];

&lt;strong&gt;5:&lt;/strong&gt;        &lt;em&gt;Acquire all the remaining locks&lt;/em&gt;

&lt;strong&gt;6:&lt;/strong&gt;        &lt;em&gt;Re-hash the contents of the existing buckets into array&lt;/em&gt;

&lt;strong&gt;7:&lt;/strong&gt;        m_buckets = array;
      }
      finally {
&lt;strong&gt;8:&lt;/strong&gt;        &lt;em&gt;Release all locks&lt;/em&gt;
      }
  }&lt;/pre&gt;

&lt;p&gt;As you can see, there's already a check for a race condition at step 2, for the case when the &lt;code&gt;GrowTable&lt;/code&gt; method is called twice in quick succession on two separate threads. One will successfully resize the buckets array (blocking the second in the meantime), when the second thread is unblocked it'll see that the array has already been resized &amp;amp; exit without doing anything.&lt;/p&gt;

&lt;p&gt;There is another case we need to consider; looking back at the &lt;code&gt;AlterBucket&lt;/code&gt; method above, consider the following situation:&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;Thread 1 calls &lt;code&gt;AlterBucket&lt;/code&gt;; step 1 is executed to get the bucket and lock numbers.&lt;/li&gt;
&lt;li&gt;Thread 2 calls &lt;code&gt;GrowTable&lt;/code&gt; and executes steps 1-5; thread 1 is blocked when it tries to take out the lock in step 2.&lt;/li&gt;
&lt;li&gt;Thread 2 re-hashes everything, re-assigns the buckets array, and releases all the locks (steps 6-8).&lt;/li&gt;
&lt;li&gt;Thread 1 is unblocked and continues executing, but the calculated bucket and lock numbers are no longer valid.&lt;/li&gt;
&lt;/ol&gt;

&lt;p&gt;Between calculating the correct bucket and lock number and taking out the lock, another thread has changed where everything is. Not exactly thread-safe. Well, a similar problem was solved in &lt;code&gt;ConcurrentStack&lt;/code&gt; and &lt;code&gt;ConcurrentQueue&lt;/code&gt; by storing a local copy of the state, doing the necessary calculations, then checking if that state is still valid. We can use a similar idea here:&lt;/p&gt;

&lt;pre&gt;void AlterBucket(TKey key, ...) {
    while (true) {
        Node[] buckets = m_buckets;
        
        int bucketNo, lockNo;
        GetBucketAndLockNo(
            key.GetHashCode(), out bucketNo, out lockNo, buckets.Length);

        lock (m_locks[lockNo]) {
            // if the state has changed, go back to the start
            if (buckets != m_buckets) continue;
            
            Node headNode = m_buckets[bucketNo];
            
            &lt;em&gt;Mutate the node linked list as appropriate&lt;/em&gt;
        }
        break;
    }
}&lt;/pre&gt;

&lt;h4&gt;TryGetValue and GetEnumerator&lt;/h4&gt;

&lt;p&gt;And so, finally, we get onto &lt;code&gt;TryGetValue&lt;/code&gt; and &lt;code&gt;GetEnumerator&lt;/code&gt;. I've left these to the end because, well, they don't actually use any locks.&lt;/p&gt;

&lt;p&gt;How can this be? Whenever you change a bucket, you need to take out the corresponding lock, yes? Indeed you do. However, it is important to note that &lt;code&gt;TryGetValue&lt;/code&gt; and &lt;code&gt;GetEnumerator&lt;/code&gt; &lt;em&gt;don't actually change anything&lt;/em&gt;. Just as immutable objects are, by definition, thread-safe, read-only operations don't need to take out a lock because they don't change anything. All lockless methods can happily iterate through the buckets and linked lists without worrying about locking anything.&lt;/p&gt;

&lt;p&gt;However, this does put restrictions on how the other methods operate. Because there could be another thread in the middle of reading the dictionary at any time (even if a lock is taken out), the dictionary has to be in a valid state at all times. Every change to state has to be made visible to other threads in a single atomic operation (all relevant variables are marked &lt;code&gt;volatile&lt;/code&gt; to help with this).&lt;/p&gt;

&lt;p&gt;This restriction ensures that whatever the reading threads are doing, they never read the dictionary in an invalid state (eg items that should be in the collection temporarily removed from the linked list, or reading a node that has had it's key &amp;amp; value removed before the node itself has been removed from the linked list).&lt;/p&gt;

&lt;p&gt;Fortunately, all the operations needed to change the dictionary can be done in that way. Bucket resizes are made visible when the new array is assigned back to the &lt;code&gt;m_buckets&lt;/code&gt; variable. Any additions or modifications to a node are done by creating a new node, then splicing it into the existing list using a single variable assignment. Node removals are simply done by re-assigning the node's &lt;code&gt;m_next&lt;/code&gt; pointer.&lt;/p&gt;

&lt;p&gt;Because the dictionary can be changed by another thread during execution of the lockless methods, the &lt;code&gt;GetEnumerator&lt;/code&gt; method is liable to return dirty reads - changes made to the dictionary after &lt;code&gt;GetEnumerator&lt;/code&gt; was called, but before the enumeration got to that point in the dictionary. It's worth listing at this point which methods are lockless, and which take out all the locks in the dictionary to ensure they get a consistent view of the dictionary:&lt;/p&gt;

&lt;strong&gt;Lockless:&lt;/strong&gt;
&lt;ul&gt;
&lt;li&gt;&lt;code&gt;TryGetValue&lt;/code&gt;&lt;/li&gt;
&lt;li&gt;&lt;code&gt;GetEnumerator&lt;/code&gt;&lt;/li&gt;
&lt;li&gt;The indexer getter&lt;/li&gt;
&lt;li&gt;&lt;code&gt;ContainsKey&lt;/code&gt;&lt;/li&gt;
&lt;/ul&gt;

&lt;strong&gt;Takes out every lock (lockfull?):&lt;/strong&gt;
&lt;ul&gt;
&lt;li&gt;&lt;code&gt;Count&lt;/code&gt;&lt;/li&gt;
&lt;li&gt;&lt;code&gt;IsEmpty&lt;/code&gt;&lt;/li&gt;
&lt;li&gt;&lt;code&gt;Keys&lt;/code&gt;&lt;/li&gt;
&lt;li&gt;&lt;code&gt;Values&lt;/code&gt;&lt;/li&gt;
&lt;li&gt;&lt;code&gt;CopyTo&lt;/code&gt;&lt;/li&gt;
&lt;li&gt;&lt;code&gt;ToArray&lt;/code&gt;&lt;/li&gt;
&lt;/ul&gt;

&lt;h4&gt;Concurrent principles&lt;/h4&gt;
&lt;p&gt;That covers the overall implementation of &lt;code&gt;ConcurrentDictionary&lt;/code&gt;. I haven't even begun to scratch the surface of this sophisticated collection. That I leave to you. However, we've looked at enough to be able to extract some useful principles for concurrent programming:&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;&lt;h4&gt;Partitioning&lt;/h4&gt;
&lt;p&gt;When using locks, the work is partitioned into independant chunks, each with its own lock. Each partition can then be modified concurrently to other partitions.&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;h4&gt;Ordered lock-taking&lt;/h4&gt;
&lt;p&gt;When a method does need to control the entire collection, locks are taken and released in a fixed order to prevent deadlocks.&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;h4&gt;Lockless reads&lt;/h4&gt;
&lt;p&gt;Read operations that don't care about dirty reads don't take out any lock; the rest of the collection is implemented so that any reading thread always has a consistent view of the collection.&lt;/p&gt;&lt;/li&gt;
&lt;/ol&gt;

&lt;p&gt;That leads us to the final collection in this little series - &lt;code&gt;ConcurrentBag&lt;/code&gt;. Lacking a non-concurrent analogy, it is quite different to any other collection in the class libraries. Prepare your thinking hats!&lt;/p&gt;&lt;img src="http://geekswithblogs.net/simonc/aggbug/148789.aspx" width="1" height="1" /&gt;</content>
        <wfw:comment>http://geekswithblogs.net/simonc/comments/148789.aspx</wfw:comment>
        <slash:comments>0</slash:comments>
        <wfw:commentRss>http://geekswithblogs.net/simonc/comments/commentRss/148789.aspx</wfw:commentRss>
        <trackback:ping>http://geekswithblogs.net/simonc/services/trackbacks/148789.aspx</trackback:ping>
    </entry>
    <entry>
        <title>Inside the Concurrent Collections: ConcurrentQueue</title>
        <link rel="self" type="text/html" href="http://geekswithblogs.net/simonc/archive/2012/01/24/inside-the-concurrent-collections-concurrentqueue.aspx" />
        <id>http://geekswithblogs.net/simonc/archive/2012/01/24/inside-the-concurrent-collections-concurrentqueue.aspx</id>
        <published>2012-01-24T22:14:4600:00:00</published>
        <updated>2012-01-24T22:19:06Z</updated>
        <content type="html">&lt;p&gt;&lt;code&gt;ConcurrentQueue&lt;/code&gt; is, like &lt;code&gt;ConcurrentStack&lt;/code&gt;, a lockless collection, in that it is implemented without using any locks at all. However, the semantics required for a queue impose a quite different approach; unlike &lt;code&gt;ConcurrentStack&lt;/code&gt;, which has a single point of concurrent contention, a queue can be changed at both the head and tail. This means that at least two variables are involved, most likely more. The simple approach of atomically modifying a single variable won't work here.&lt;/p&gt;

&lt;h4&gt;What does &lt;code&gt;System.Collections.Generic.Queue&lt;/code&gt; do?&lt;/h4&gt;

&lt;p&gt;Well, let's have a look at the non-concurrent queue. This is implemented as a &lt;a href="http://en.wikipedia.org/wiki/Circular_buffer"&gt;circular buffer&lt;/a&gt;. However, this design is very hard to use in a lockless way; when a new item is enqueued and the backing array is full, the array is doubled in size to accommodate the new item.&lt;/p&gt;

&lt;p&gt;To resize the array, the entire contents of the queue has to be read sequentially without using locks and copied into a brand new array by a single thread, during which other threads can enqueue and dequeue items many times. This is very tricky to implement without being subject to race conditions and still ensure high performance.&lt;/p&gt;

&lt;p&gt;So, instead, ConcurrentQueue forgoes the circular buffer, and uses a linear array instead. More specifically, it needs to be a linear array with a (conceptually) infinite capacity - although there will only be a finite number of items stored in the queue, items can be enqueued and dequeued an unlimited number of times. Such an array can't be created as a single object.&lt;/p&gt;

&lt;p&gt;However, you don't need to keep the entire infinite array in memory, you only need the section of the array that is actually storing items. By splitting the array into finite segments you can create and discard segments as items get enqueued and dequeued to the queue. This is what &lt;code&gt;ConcurrentQueue&lt;/code&gt; does.&lt;/p&gt;

&lt;h4&gt;Queue Segments&lt;/h4&gt;

&lt;p&gt;To demonstrate, I'll be using a queue with a segment size of 4. To start off with, three items are enqueued. These are added to the leftmost available slot of the tail segment:&lt;/p&gt;

&lt;img src="http://www.simple-talk.com/blogbits/simon.cooper/ConcurrentQueue1.png" /&gt;

&lt;p&gt;Three more items are then enqueued. A new segment is created and assigned to the tail:&lt;/p&gt;

&lt;img src="http://www.simple-talk.com/blogbits/simon.cooper/ConcurrentQueue2.png" /&gt;

&lt;p&gt;And another three:&lt;/p&gt;

&lt;img src="http://www.simple-talk.com/blogbits/simon.cooper/ConcurrentQueue3.png" /&gt;

&lt;p&gt;Five items are then dequeued from the head. The head segment is now empty, so it is discarded:&lt;/p&gt;

&lt;img src="http://www.simple-talk.com/blogbits/simon.cooper/ConcurrentQueue4.png" /&gt;

&lt;p&gt;Using this method, the illusion of an infinite linear array can be created using a finite number of segments.&lt;/p&gt;

&lt;h4&gt;Achieving thread-safety&lt;/h4&gt;

&lt;p&gt;&lt;code&gt;ConcurrentQueue&lt;/code&gt; uses a segment size of 32, and along with the backing array each segment stores the index items are to be added (&lt;code&gt;m_high&lt;/code&gt;) and where they are to be removed (&lt;code&gt;m_low&lt;/code&gt;):

&lt;/p&gt;&lt;pre&gt;private class Segment {
    volatile T[] m_array;
    volatile int m_high;
    volatile int m_low;
}&lt;/pre&gt;

&lt;p&gt;Thread-safety, in this case, depends on the operation of both enqueuing and dequeuing. So we'll look at the operations in isolation, then put them both together.&lt;/p&gt;

&lt;h4&gt;Enqueuing&lt;/h4&gt;

&lt;p&gt;To enqueue an item, you need to:
&lt;/p&gt;&lt;ol&gt;
&lt;li&gt;Increment the &lt;code&gt;m_high&lt;/code&gt; variable. This is now the index the item should be inserted at.&lt;/li&gt;
&lt;li&gt;Assign the item to the specified slot in &lt;code&gt;m_array&lt;/code&gt;.&lt;/li&gt;
&lt;/ol&gt;

The key to this is step 1. If step 1 can be done atomically, returning the new incremented value, that gives each thread trying to enqueue an item at the same time separate slots. Step 2 can then be performed concurrently by each thread without having to worry about conflicts. &lt;code&gt;Interlocked.Increment&lt;/code&gt; performs this increment atomically:&lt;p&gt;&lt;/p&gt;

&lt;pre&gt;public void Enqueue(T item) {
    int insertIndex = Interlocked.Increment(ref m_high);
    m_array[insertIndex] = item;
}&lt;/pre&gt;

&lt;p&gt;That's it! Simple, really...&lt;/p&gt;

&lt;h4&gt;Dequeuing&lt;/h4&gt;
&lt;p&gt;Dequeuing acts in a similar way to enqueuing:

&lt;/p&gt;&lt;ol&gt;
&lt;li&gt;Increment the &lt;code&gt;m_low&lt;/code&gt; variable. The previous index is the index of the item to be dequeued.&lt;/li&gt;
&lt;li&gt;Return the item stored at the specified slot in &lt;code&gt;m_array&lt;/code&gt;.&lt;/li&gt;
&lt;/ol&gt;

And, similarly, if each thread trying to dequeue an item can be atomically 'assigned' a slot by performing step 1 using &lt;code&gt;Interlocked.Increment&lt;/code&gt;, then step 2 can be performed concurrently by several threads:

&lt;pre&gt;public T Dequeue() {
    // Increment returns the incremented value of the variable
    int index = Interlocked.Increment(ref m_low);
    return m_array[index-1];
}&lt;/pre&gt;

&lt;p&gt;But hang on, if we're dequeuing items, we need to check whether the queue is empty first. So checks need to be added to determine when a queue is empty. This depends on the segment state that represents an empty segment. In the case of &lt;code&gt;ConcurrentQueue&lt;/code&gt;, this state is when &lt;code&gt;m_low &amp;gt; m_high&lt;/code&gt;.&lt;/p&gt;

&lt;p&gt;&lt;small&gt;&lt;em&gt;The exact reasons for this specific state representing an empty segment are a gory implementation detail; if you're wondering the reasons for this, then start by working out exactly what &lt;code&gt;m_high&lt;/code&gt; and &lt;code&gt;m_low&lt;/code&gt; represent.&lt;/em&gt;&lt;/small&gt;&lt;/p&gt;

&lt;pre&gt;public bool TryDequeue(out T item) {
    if (m_low &amp;gt; m_high) {
        item = default(T);
        return false;
    }
    
    int index = Interlocked.Increment(ref m_low);
    item = array[index-1];
    return true;
}&lt;/pre&gt;

&lt;p&gt;All good? Well, no. We've actually introduced a subtle race condition. If we've got a queue with one item in it, and two threads trying to dequeue that one item, then if both threads perform the &lt;code&gt;if (m_low &amp;gt; m_high)&lt;/code&gt; check before &lt;code&gt;m_low&lt;/code&gt; has been incremented, one will dequeue the item and one will successfully dequeue on an empty queue.&lt;/p&gt;

&lt;p&gt;In &lt;code&gt;ConcurrentStack&lt;/code&gt;, a similar race condition was solved by using &lt;code&gt;Interlocked.CompareExchange&lt;/code&gt;; taking a snapshot, doing some work, then atomically making that work visible to other threads only if the state is still valid. By judicious use of local variables and loops, we can do the same thing here:

&lt;/p&gt;&lt;pre&gt;public bool TryDequeue(out T item) {
    int index;
    do {
        index = m_low;
        
        // check if the queue is empty
        if (index &amp;gt; m_high) {
            item = default(T);
            return false;
        }
        
        // I think m_low has the same value as index. If it is, replace it with index+1.
        // Return to me what m_low actually was.
        if (Interlocked.CompareExchange(ref m_low, index+1, index) == index) {
            // success - index is now a valid index to dequeue, specific to this thread
            item = m_array[index]
            return true;
        }
        
        // m_low has been changed in the meantime. Go back to the start and try again.
    }
    while (true);
}&lt;/pre&gt;

&lt;p&gt;In this case, &lt;code&gt;CompareExchange&lt;/code&gt; is acting as a conditional Increment.

&lt;/p&gt;&lt;h4&gt;Combining the two&lt;/h4&gt;

&lt;p&gt;This is where it gets complicated. The most obvious issue is another race condition, between enqueuing and dequeuing. To reiterate, the steps performed are:&lt;/p&gt;

&lt;strong&gt;Enqueuing&lt;/strong&gt;:
&lt;ol&gt;
&lt;li&gt;Increment the &lt;code&gt;m_high&lt;/code&gt; variable. This is now the index the item should be inserted at.&lt;/li&gt;
&lt;li&gt;Assign the item to the specified slot in &lt;code&gt;m_array&lt;/code&gt;.&lt;/li&gt;
&lt;/ol&gt;

&lt;strong&gt;Dequeuing&lt;/strong&gt;:
&lt;ol&gt;
&lt;li&gt;Check if the queue is empty (&lt;code&gt;m_low &amp;gt; m_high&lt;/code&gt;). If it is, return false.&lt;/li&gt;
&lt;li&gt;Increment &lt;code&gt;m_low&lt;/code&gt; (if &lt;code&gt;m_low&lt;/code&gt; hasn't changed) and return the item.&lt;/li&gt;
&lt;/ol&gt;

&lt;p&gt;Start with an empty queue. Thread 1 executes step 1 of enqueuing; &lt;code&gt;m_high&lt;/code&gt; is incremented. Thread 2 then tries to dequeue an item; step 1 of dequeuing thinks the queue has items because &lt;code&gt;m_low &amp;lt;= m_high&lt;/code&gt;. Step 2 of dequeuing then executes, and the item dequeued is the null value, because thread 1 hasn't yet assigned the enqueued item to the backing array.&lt;/p&gt;

&lt;p&gt;Hmm, how to solve this? There's more than one variable involved here; the race condition involves the relationship between &lt;em&gt;two&lt;/em&gt; variables, &lt;code&gt;m_high&lt;/code&gt; and &lt;code&gt;m_low&lt;/code&gt;, not a single variable being changed behind the scenes, which was the problem solved previously using &lt;code&gt;CompareExchange&lt;/code&gt; and loops.&lt;/p&gt;

&lt;p&gt;Well, we can't swap round the order of operations for enqueuing, because step 1 is what atomically assigns a slot when several threads are enqueuing items at once.&lt;/p&gt;

&lt;p&gt;What we need is some sort of notification that enqueuing has finished assigned the item to the backing array. Then the dequeuing thread can wait for that notification before dequeuing the item from the queue:&lt;/p&gt;

&lt;strong&gt;Enqueuing&lt;/strong&gt;
&lt;ol&gt;
&lt;li&gt;Increment the &lt;code&gt;m_high&lt;/code&gt; variable. This is now the index the item should be inserted at.&lt;/li&gt;
&lt;li&gt;Assign the item to the specified slot in &lt;code&gt;m_array&lt;/code&gt;.&lt;/li&gt;
&lt;li&gt;Set state indicating the item at that index has been enqueued.&lt;/li&gt;
&lt;/ol&gt;

&lt;strong&gt;Dequeuing&lt;/strong&gt;:
&lt;ol&gt;
&lt;li&gt;Check if the queue is empty (&lt;code&gt;m_high &amp;gt;= m_low&lt;/code&gt;). If it is, return false.&lt;/li&gt;
&lt;li&gt;Increment &lt;code&gt;m_low&lt;/code&gt; (if &lt;code&gt;m_low&lt;/code&gt; hasn't changed).&lt;/li&gt;
&lt;li&gt;Wait for the state to be set at the index to dequeue.&lt;/li&gt;
&lt;li&gt;Return the item.&lt;/li&gt;
&lt;/ol&gt;

&lt;p&gt;This is the role played by the &lt;code&gt;m_state&lt;/code&gt; array in &lt;code&gt;ConcurrentQueue.Segment&lt;/code&gt;. Each segment has an &lt;code&gt;int[] m_state&lt;/code&gt; alongside the &lt;code&gt;T[] m_array&lt;/code&gt;; a 1 in a particular slot indicates the item has been set, and a 0 (the default value) indicates it isn't set. This turns the Enqueue method into this:

&lt;/p&gt;&lt;pre&gt;public void Enqueue(T item) {
    int insertIndex = Interlocked.Increment(ref m_high);
    m_array[insertIndex] = item;
    m_state[insertIndex] = 1;
}&lt;/pre&gt;

and TryDequeue into this:

&lt;pre&gt;public bool TryDequeue(out T item) {
    int index;
    do {
        index = m_low;
        
        if (index &amp;gt; m_high) {
            item = default(T);
            return false;
        }
        
        if (Interlocked.CompareExchange(ref m_low, index+1, index) == index) {
            // busywait until m_state[index] is set
            while (m_state[index] == 0) {}
            
            item = m_array[index]
            return true;
        }
    }
    while (true);
}&lt;/pre&gt;

&lt;p&gt;By marking both &lt;code&gt;m_array&lt;/code&gt; and &lt;code&gt;m_state&lt;/code&gt; as volatile, this stops the JIT reordering accesses to those arrays, ensuring that the invariant implicit in this code (&lt;code&gt;m_state[index] == 1&lt;/code&gt; if and only if &lt;code&gt;m_array[index]&lt;/code&gt; has an item in it) doesn't get broken by the JIT or processors reordering instructions.&lt;/p&gt;

&lt;p&gt;Tweaking the control flow a bit, and sprinkling calls to &lt;code&gt;SpinWait.SpinOnce()&lt;/code&gt; around as the abstraction of a busywait, and you've got the code of the enqueue and dequeue methods of &lt;code&gt;ConcurrentQueue&lt;/code&gt;.

&lt;/p&gt;&lt;h4&gt;Final notes&lt;/h4&gt;

&lt;p&gt;There's a couple of things I need to briefly mention. Firstly, I haven't yet covered growing the queue. Creating new segments when the current segment gets full is largely handled by enqueuing. When an item is added to the last slot in a segment, the Enqueue method creates a new segment and assigns it as the next segment in the queue using the &lt;code&gt;m_next&lt;/code&gt; variable. When the dequeue method reaches the end of the current segment, it waits for the &lt;code&gt;m_next&lt;/code&gt; variable to be set before looking inside that segment to see if there's an item to dequeue.&lt;/p&gt;

&lt;p&gt;Secondly, the eagle-eyed among you might have noticed that dequeued items don't get removed from the backing array; they are still referenced from the array when items are dequeued. I can't see any specific threading reason for this, other than giving one less location for possible threading issues. However, more importantly, this guarantees that the invariant I mentioned above is never be broken, which can be important if you're reasoning about the collection.&lt;/p&gt;

&lt;p&gt;Keeping references around longer than they need isn't so big an issue; the items referenced in the array are dereferenced all together when the containing segment is dereferenced. However it is something to bear in mind if you have large objects referenced in a &lt;code&gt;ConcurrentQueue&lt;/code&gt; for a long time; they might not be garbage collected when they otherwise could be.&lt;/p&gt;

&lt;h4&gt;Concurrent principles&lt;/h4&gt;

&lt;p&gt;So what principles can we extract from this examination of &lt;code&gt;ConcurrentQueue&lt;/code&gt;?&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;&lt;h4&gt;Separating threads&lt;/h4&gt;
&lt;p&gt;&lt;code&gt;Increment&lt;/code&gt; and &lt;code&gt;CompareExchange&lt;/code&gt; are used to atomically assign each thread performing an action a separate slot. Each thread can then do work on that slot concurrently without conflicting with each other.&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;h4&gt;Invariants&lt;/h4&gt;
&lt;p&gt;An invariant is used to eliminate a race condition between enqueuing and dequeuing, in this case, &lt;code&gt;m_state[index] == 1&lt;/code&gt; iff &lt;code&gt;m_array[index]&lt;/code&gt; has an item in it. This ensures enqueuing and dequeuing on the same slot works as it should. Note that this invariant is maintained even at the cost of keeping things in memory longer than they should.&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;h4&gt;&lt;code&gt;volatile&lt;/code&gt; used as a memory barrier&lt;/h4&gt;
&lt;p&gt;Furthermore, &lt;code&gt;volatile&lt;/code&gt; is used to add memory barriers ensuring the JIT doesn't reorder &lt;code&gt;m_state&lt;/code&gt; and &lt;code&gt;m_array&lt;/code&gt; array accesses and inadvertantly break the invariant.&lt;/p&gt;
&lt;/li&gt;&lt;/ol&gt;

&lt;p&gt;Well, that's the core of the &lt;code&gt;ConcurrentQueue&lt;/code&gt; class. I haven't even begun to cover the supporting methods, and glossed over most of the gory details. These I leave for you to explore and figure out. It's time to move on! We finally get some locks involved, and peer under the covers at &lt;code&gt;ConcurrentDictionary&lt;/code&gt;.&lt;/p&gt;

&lt;p&gt;&lt;small&gt;Cross posted from &lt;a href="http://www.simple-talk.com/community/blogs/simonc"&gt;Simple Talk&lt;/a&gt;&lt;/small&gt;&lt;/p&gt;&lt;img src="http://geekswithblogs.net/simonc/aggbug/148471.aspx" width="1" height="1" /&gt;</content>
        <wfw:comment>http://geekswithblogs.net/simonc/comments/148471.aspx</wfw:comment>
        <slash:comments>0</slash:comments>
        <wfw:commentRss>http://geekswithblogs.net/simonc/comments/commentRss/148471.aspx</wfw:commentRss>
        <trackback:ping>http://geekswithblogs.net/simonc/services/trackbacks/148471.aspx</trackback:ping>
    </entry>
</feed>
