Alois Kraus

blog

Lazy Vs Eager Init Singletons / Double-Check Lock Pattern

Singletons are the easiest programming pattern to learn, which seems to be the reason why they are so often used in places where they are inappropriate. They are very straightforward to implement and understand: just create a static field and return the static instance from a public get property. A Logger Singleton is a classic example of this pattern:

    public class Logger
    {
        static LogWriter writer = new LogWriter();

        public static LogWriter Writer
        {
            get { return writer; }
        }
    }

If you have written such code before and thought that it works perfectly in all cases, I must disappoint you. There are subtle things going on when static fields are initialized by the CLR. This implementation is thread safe because static fields are initialized exactly once by the CLR before any method can be called or a static field can be accessed from the outside. More details can be found in the ECMA-334 C# Language Specification and ECMA-335 Common Language Infrastructure (CLI). Now comes the tricky part: when do you think the LogWriter instance will be instantiated? To answer this question let's have a look at two nearly identical Singletons.


Two Singleton Implementations

    /// <summary>
    /// .class public auto ansi beforefieldinit EagerInitSingleton
    ///    extends object
    /// </summary>
    public class EagerInitSingleton
    {
        private EagerInitSingleton()
        {
            Console.WriteLine("EagerInitSingleton was instantiated");
        }

        static EagerInitSingleton myInstance = new EagerInitSingleton();

        public static EagerInitSingleton Instance
        {
            get { return myInstance; }
        }
    }

    /// <summary>
    /// .class public auto ansi LazySingleton
    ///    extends object
    /// </summary>
    public class LazySingleton
    {
        static LazySingleton() { } // Explicit type initializer removes the beforefieldinit flag

        private LazySingleton()
        {
            Console.WriteLine("LazySingleton was instantiated");
        }

        static LazySingleton myInstance = new LazySingleton();

        public static LazySingleton Instance
        {
            get { return myInstance; }
        }
    }

 


These two simple Singleton classes differ only in that the LazySingleton class contains an empty type initializer (static ctor). A type initializer is, as the name implies, called before any constructor to initialize the type, e.g. to instantiate its static member variables. If an exception is thrown inside the type initializer you get a TypeInitializationException, which is quite bad because the initializer is run only once for each type and then never again by the runtime. You can "fix" this by calling RuntimeHelpers.RunClassConstructor yourself, but this is a bad hack.
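For completeness, this is roughly what that hack looks like (a minimal sketch, not a recommendation; the helper method name is mine). RunClassConstructor only guarantees that the type initializer has been executed once; it cannot re-run an initializer that has already failed.

    // A sketch of the RunClassConstructor hack, not recommended.
    // Assumes: using System.Runtime.CompilerServices;
    static void ForceTypeInit()
    {
        // Runs the type initializer of LazySingleton if it has not run yet.
        RuntimeHelpers.RunClassConstructor(typeof(LazySingleton).TypeHandle);
    }

Now let's use our Singletons.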

 

    public static void UseSingletons(bool bShouldLog)
    {
        Console.WriteLine("Function Start");
        if (bShouldLog)
        {
            EagerInitSingleton inst1 = EagerInitSingleton.Instance;
            LazySingleton inst2 = LazySingleton.Instance;
        }
        Console.WriteLine("Function End");
    }

 

The UseSingletons method can be called with false, in which case no Singleton is touched at all. Can you guess the outcome?

UseSingletons(false)

EagerInitSingleton was instantiated
Function Start
Function End

UseSingletons(true)

EagerInitSingleton was instantiated
Function Start
LazySingleton was instantiated
Function End


The EagerInitSingleton deserves its name because it is initialized before our function is entered, regardless of whether it is used inside it or not! This behavior is controlled by the beforefieldinit flag at the class declaration in the IL code, which you can check with Reflector in IL mode or by simply looking at the comments above the Singleton class declarations, where I put the output of Reflector. C# does not allow us to control the initialization behavior directly, but if we declare a type initializer inside our Singleton class the C# compiler does not emit the beforefieldinit flag. For performance reasons the C# compiler emits the beforefieldinit flag so that the JITer has more freedom to initialize the type when it thinks it is appropriate. You will not notice the difference until you lose performance to this side effect.
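If you do not have Reflector at hand, reflection exposes the same flag. Here is a minimal sketch (the helper name is my own) that prints whether the bit is set:

    // Checks the beforefieldinit bit via System.Reflection.TypeAttributes.
    // Assumes: using System; using System.Reflection;
    static void PrintBeforeFieldInit(Type t)
    {
        bool hasFlag = (t.Attributes & TypeAttributes.BeforeFieldInit) != 0;
        Console.WriteLine("{0} beforefieldinit: {1}", t.Name, hasFlag);
    }

    // PrintBeforeFieldInit(typeof(EagerInitSingleton));  // True
    // PrintBeforeFieldInit(typeof(LazySingleton));       // False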

Static Type Initialization Order

It is not only tricky when a static field is initialized but also in which order the members of a type are initialized. The ECMA spec (ECMA-334, chapter 17.4.5) states that fields are initialized in the order they are declared, but it is strongly discouraged to rely on this. Static variables are initialized in two stages.
  1. Initialize static field to its default value (0 or null depending on its type).
  2. Initialize static field with its actual value.
The spec clearly states that it is possible to observe a variable in its default but not yet initialized state. This means that under special circumstances it is possible to see a null reference for a statically initialized object! This part of the spec becomes relevant in our life as programmers when we have circular dependencies, which can happen in your code. The following example declares two classes A and B, each of which contains a static member of the other type. To make it clear when the type initializer runs, a console message is printed.

Two classes with static members

    class A
    {
        static B b = new B();

        static A()
        {
            Console.WriteLine("A Type Initializer Thread {0}", Thread.CurrentThread.ManagedThreadId);
        }

        public A()
        {
            Console.WriteLine("A Ctor from Thread {0}, Static Field: {1}",
                    Thread.CurrentThread.ManagedThreadId, (b == null) ? "null" : "Initialized");
            Thread.Sleep(1000);
        }

        public static void AFunction()
        {
            Console.WriteLine("A Static function Thread {0}, Static Field: {1}",
                    Thread.CurrentThread.ManagedThreadId, (b == null) ? "null" : "Initialized");
        }
    }

    class B
    {
        static A a = new A();

        static B()
        {
            Console.WriteLine("B Type Initializer Thread {0}", Thread.CurrentThread.ManagedThreadId);
        }

        public B()
        {
            Console.WriteLine("B Ctor  Thread {0}, Static Field: {1}",
                    Thread.CurrentThread.ManagedThreadId, (a == null) ? "null" : "Initialized");
            Thread.Sleep(1000);
        }

        public static void BFunction()
        {
            Console.WriteLine("B Static function Thread {0}, Static Field: {1}",
                    Thread.CurrentThread.ManagedThreadId, (a == null) ? "null" : "Initialized");
        }
    }



Such a circular dependency occurs in many programs, although it will usually not be as direct as in this example. What will happen if we call one of the static functions of A or B?

        public static void StaticInit()
        {
            Console.WriteLine("Function Start");
            A.AFunction();
            Console.WriteLine("Function End");
        }

        Function Start
        A Ctor from Thread 1, Static Field: null
        B Type Initializer Thread 1
        B Ctor  Thread 1, Static Field: Initialized
        A Type Initializer Thread 1
        A Static function Thread 1, Static Field: Initialized
        Function End

Here A::b == null inside the ctor of class A, although we would expect a non-null value. I did tell you that type initializers are run before any ctor, right? It should be impossible to encounter a null reference for a statically initialized member, but this assumption no longer holds true in the case of circular dependencies. To avoid endless recursion, other rules apply here. If we set a breakpoint with our debugger inside the ctor of A we can see what is going on:

            StaticInit.A..ctor()
            StaticInit.B..cctor()
            [PrestubMethodFrame: 0012e6b8] StaticInit.B..ctor()
            StaticInit.A..cctor()
            [PrestubMethodFrame: 0012f424] StaticInit.A.AFunction()

Before the method StaticInit.A.AFunction can be called the following things happen:

  1. Invoke A's type initializer (cctor)
  2. Instantiate member variable A::b = new B(); which causes the CLR to prepare a call to B's ctor
  3. Invoke B's type initializer before the ctor of B can be run
  4. Invoke B::a = new A(); which calls the ctor of A

The last call to the ctor of A should cause a call to the type initializer of A before the ctor can run, but since we are already in the middle of initializing A we would end up in a recursion. To prevent such bad things the CLI gives up at this point and lets us get away with static fields that are set to their default values but not yet initialized. This behavior is explained in greater detail in ECMA-335 (chapter 9.5.3.3: Races and Deadlocks). Similar mechanisms exist to prevent deadlocks if we initialize different types from within different threads (a sketch of this case follows after the list below). The spec outlines the deadlock/race resolution algorithm for type initialization as follows:

    Two separate threads might start attempting to access static variables of separate
    types (A and B) and then each would have to wait for the other to complete initialization. 

  1. At class load time (hence prior to initialization time) store zero or null into all static fields of the type.
  2. If the type is initialized you are done.
  2.1. If the type is not yet initialized, try to take an initialization lock.
  2.2. If successful, record this thread as responsible for initializing the type and proceed to step 2.3.
  2.2.1. If not, see whether this thread or any thread waiting for this thread to complete already holds the lock.
  2.2.2. If so, return since blocking would create a deadlock. This thread will now see an incompletely initialized state for the type, but no deadlock will arise.
  2.2.3. If not, block until the type is initialized then return.
  2.3. Initialize the parent type and then all interfaces implemented by this type.
  2.4. Execute the type initialization code for this type.
  2.5. Mark the type as initialized, release the initialization lock, awaken any threads waiting for this type to be initialized, and return.
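
To see the cross-thread case in action, here is a minimal sketch of my own that races the type initializers of the A and B classes from above. The Sleep calls in their ctors make it likely that both initializers are in flight at the same time; the CLI resolves the circular wait as described in steps 2.2.1 and 2.2.2, so one thread may observe the other type with its static field still null, but no deadlock occurs.

    // Hedged sketch: trigger the initialization of A and B from two threads.
    // Assumes: using System.Threading;
    public static void StaticInitFromTwoThreads()
    {
        Thread t1 = new Thread(new ThreadStart(A.AFunction));
        Thread t2 = new Thread(new ThreadStart(B.BFunction));
        t1.Start();
        t2.Start();
        t1.Join();
        t2.Join();
    }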

Luckily you rarely need to know this stuff in such great detail, but it does not hurt your brain much either ;-). What have we learned from this? Variable initialization is quite a complex process, but the CLI does all the heavy lifting behind the scenes. To initialize a type with statics inside it in a thread safe manner, special care has to be taken during your class design.



Deterministic And Thread Safe Initialization

The oddities of static variable initialization can be circumvented by doing the variable initialization ourselves, where we have full control. In this case the Double-Check Lock pattern is perfectly suited for the task, but unfortunately there is big confusion about what a correct multithreaded implementation looks like. There were changes in the memory model of the CLR between 1.0 and 2.0 which are not very well documented. This does not make it easier to determine whether one or the other implementation works correctly with every CLR version, because you need to know the underlying memory model of your language. Java, for example, has a memory model where this pattern does not work. Things might have changed with Java 1.5, but it was hard enough to get correct information about the behavior of the Double-Check Lock pattern for C#. To be able to show you that my implementation of this pattern is correct, we need to know how volatile affects code generation.

What the volatile keyword in C# really does

There are many myths around what the volatile keyword in C# really does. The only way to know what really happens is to look at the disassembled code. I used a small code snippet that double-checks a variable's reference, so we can study how the volatile keyword affects code generation. The following code was used:

        Char[] cArr = new Char[20];

        public void Func()
        {
            object secondRef = cArr;
            if (secondRef == cArr)
            {
                if (secondRef == cArr)
                    return;
            }
        }

It simply checks whether the reference is still the same as before. The JIT compiler optimizes this code quite well by storing the first object reference in a register, so there is no need to check the memory location of the cArr object a second or third time:

        public void Func()
        {
            object secondRef = cArr;
00000000  mov         edx,dword ptr [ecx+4]
00000003  mov         eax,edx
            if (secondRef == cArr)
00000005  cmp         eax,edx
00000007  jne         00000009
00000009  ret     

        }

When we decorate the array declaration with the volatile keyword the JITer is no longer allowed to keep a previously read address of the array in a register. Each compare will need to fetch the array's address from memory again.
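The only source change for the following listing should be the field declaration (my reconstruction, matching the snippet shown later in this article):

        // Same test code as above, but with the field declared volatile.
        volatile Char[] cArr = new Char[20];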

        public void Func()
        {
            object secondRef = cArr;
00000000  mov         eax,dword ptr [ecx+4]
            if (secondRef == cArr)
00000003  cmp         eax,dword ptr [ecx+4]
00000006  jne         0000000B
            {
                if( secondRef == cArr )
00000008  cmp         dword ptr [ecx+4],eax
0000000b  ret   

         }

Did you notice the differences in the generated code? If you omit the volatile keyword the second check is simply removed and a register compare is performed against the previously determined address of our array. cmp eax,edx becomes, with volatile, cmp eax,dword ptr [ecx+4], which in effect works like a memory barrier. In other words, the volatile keyword forces the JITer to access the memory location of the volatile variable each time it is read. No register caching will ever happen.

Does The "lock" Keyword Affect Memory Barriers?

We can check this quite easily by placing a lock(this) around the code of our test function and declaring the array as a non-volatile field. The disassembled code reflects only the behavior of the JITer for x86 32-bit code; on other CPU platforms it will look different.

        public void Func()
        {
            lock (this)
00000000  push        ebp 
00000001  mov         ebp,esp
00000003  push        edi 
00000004  push        esi 
00000005  push        ebx 
00000006  sub         esp,18h
00000009  xor         eax,eax
0000000b  mov         dword ptr [ebp-24h],eax
0000000e  xor         eax,eax
00000010  mov         dword ptr [ebp-18h],eax
00000013  mov         dword ptr [ebp-24h],ecx
00000016  call        79173A71
            {
                object secondRef = cArr;
0000001b  mov         eax,dword ptr [ebp-24h]
0000001e  mov         edx,dword ptr [eax+4]
00000021  mov         ecx,edx
                if (secondRef == cArr)
00000023  cmp         ecx,edx
00000025  jne         00000040
                {
                    if (secondRef == cArr)
00000027  cmp         ecx,edx
00000029  jne         00000040
0000002b  mov         dword ptr [ebp-1Ch],0
00000032  mov         dword ptr [ebp-18h],0FCh
00000039  push        0D00140h
0000003e  jmp         00000055
00000040  mov         dword ptr [ebp-1Ch],0
00000047  mov         dword ptr [ebp-18h],0FCh
0000004e  push        0D00149h
00000053  jmp         00000055
                        return;
00000055  mov         ecx,dword ptr [ebp-24h]
00000058  call        79173CEB
0000005d  pop         eax 
0000005e  jmp         eax 
00000060  lea         esp,[ebp-0Ch]
                }
            }
        }

The compare operations in question are the two cmp ecx,edx instructions at offsets 00000023 and 00000027. As it turns out, the lock keyword does NOT prevent any JIT optimizations of variable access inside the lock statement. But it does create a read memory barrier when the lock statement is entered, which causes all variable references to be reloaded from memory: register-cached object addresses are not reused after the lock is entered. This is true for the code generated on an Intel P4, but on other CPUs the picture can look quite different. When we deal with multiple CPUs, branch prediction, speculative execution, ... things get very complicated, and we have to consult the CPU vendor's manual to see how the JIT-generated assembler code behaves. The Itanium CPU, for example, has load and store operations with acquire/release semantics (ld.acq (read barrier), st.rel (write barrier)) to ensure that our load and store operations are not reorganized by the CPU in a way we do not want.
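For reference, the lock statement in the listing above is just compiler sugar. In the C# 2.0 time frame it expands roughly to the following Monitor calls (a sketch; the two call instructions in the disassembly are presumably the corresponding runtime helpers), and the read barrier comes from entering the monitor:

        // Rough equivalent of lock(this) { ... } as emitted by the compiler.
        // Assumes: using System.Threading;
        public void Func()
        {
            Monitor.Enter(this);
            try
            {
                object secondRef = cArr;
                if (secondRef == cArr)
                {
                    if (secondRef == cArr)
                        return;
                }
            }
            finally
            {
                Monitor.Exit(this);
            }
        }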

Is "volatile" Needed For Double Checked-Locking?

        volatile Char[] cArr = new Char[20];

        public void Func()
        {
            object secondRef = cArr;
            if (secondRef == cArr)
            {
                lock (this)
                {
                    if (secondRef == cArr)
                        cArr = new Char[0x10];
                }
            }
        }

I have compared the generated assembler code of both versions (with and without volatile) and found, to my surprise, no differences in the generated code on my Intel P4. You cannot draw from this result the conclusion that there will be no differences between these two variants on other CPU platforms. Joe Duffy has an extensive article about the Double-Check Lock pattern online where he talks about the IA64 memory model. The answer to this question is: you need the volatile keyword if you want to ensure that the Double-Check Lock pattern works on all platforms.

The Correct Double Checked-Lock Pattern Implementation

To cut a long story short: if you stick with the following Double-Check Lock pattern implementation for reference types you are on the safe side. You need at least
  • A volatile instance field
  • A lock after the first null check


Double-Checked Lock Singleton

    class DoubleLockSingleton
    {
        private DoubleLockSingleton() { }

        static volatile DoubleLockSingleton instance;
        static object myLock = new object();

        public static DoubleLockSingleton Instance
        {
            get
            {
                if (instance == null)
                {
                    lock (myLock)
                    {
                        if (instance == null)
                        {
                            instance = new DoubleLockSingleton();
                        }
                    }
                }
                return instance;
            }
        }
    }
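
A quick usage sketch of my own: two threads read the Instance property concurrently and must always observe the same object reference.

    // Hedged test sketch for the Singleton above.
    // Assumes: using System; using System.Threading;
    public static void TestDoubleLockSingleton()
    {
        DoubleLockSingleton a = null, b = null;
        Thread t1 = new Thread(delegate() { a = DoubleLockSingleton.Instance; });
        Thread t2 = new Thread(delegate() { b = DoubleLockSingleton.Instance; });
        t1.Start();
        t2.Start();
        t1.Join();
        t2.Join();
        Console.WriteLine("Same instance: {0}", object.ReferenceEquals(a, b)); // prints True
    }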


Second Variant with an explicit volatile read

    public class StructDoubleLockChecking
    {
        private static object myLock = new object();
        private static DateTime instance;  // we have a struct instance which we cannot check against null
        private static int initialized;

        public DateTime Instance
        {
            get
            {
                if (Thread.VolatileRead(ref initialized) == 0)
                {
                    lock (myLock)
                    {
                        if (initialized == 0)
                        {
                            instance = DateTime.Now;
                            initialized = 1;
                        }
                    }
                }
                return instance;
            }
        }
    }

Since some CPUs can reorder your reads and writes even above your lock statement, we need to do the read of our initialized variable as a proper volatile read. This can be achieved by using the volatile modifier or the more explicit Thread.VolatileRead function. I hope you have enjoyed reading this article as much as I enjoyed debugging into the CLI ;-).
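For completeness, here is the volatile-modifier variant of the same flag-based check (a sketch of my own; only the flag declaration changes, the double-checked structure stays the same):

    // Hedged sketch: same idea as StructDoubleLockChecking, but the flag is
    // declared volatile instead of being read with Thread.VolatileRead.
    // Assumes: using System; using System.Threading;
    public class VolatileFlagDoubleLockChecking
    {
        private static object myLock = new object();
        private static DateTime instance;          // struct, cannot be checked against null
        private static volatile int initialized;   // volatile read/write of the flag

        public DateTime Instance
        {
            get
            {
                if (initialized == 0)
                {
                    lock (myLock)
                    {
                        if (initialized == 0)
                        {
                            instance = DateTime.Now;
                            initialized = 1;
                        }
                    }
                }
                return instance;
            }
        }
    }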

posted on Sunday, September 10, 2006 8:30 PM

Feedback

# re: Lazy Vs Eager Init Singletons / Double-Check Lock Pattern 9/11/2006 8:44 PM Marc Brooks
You need neither volatile nor locking to do this fully lazily in a cross-platform way. All you really need is a nested class that does the lazy instantiation like this:

public sealed class Singleton
{
    Singleton()
    {
    }

    public static Singleton Instance
    {
        get
        {
            return Nested.instance;
        }
    }

    class Nested
    {
        // Explicit static constructor to tell C# compiler
        // not to mark type as beforefieldinit
        static Nested()
        {
        }

        internal static readonly Singleton instance = new Singleton();
    }
}

Courtesy of Yoda himself :)

http://www.yoda.arachsys.com/csharp/singleton.html

# re: Lazy Vs Eager Init Singletons / Double-Check Lock Pattern 9/11/2006 8:47 PM Keith Rull
Very nice explanation! two-thumbs up!

# re: Lazy Vs Eager Init Singletons / Double-Check Lock Pattern 9/12/2006 9:32 AM Alois Kraus
Hi Marc,

I have carefully read Yoda's thoughts but I disagree with what he claims is the best solution. My main objection to this solution is that it shields any errors during initialization. If you throw an exception inside the ctor of Singleton (at least it should do something useful) you get a TypeInitializationException with your original exception as inner exception. This is rather bad because now people will be confused about what caused this odd exception, which occurs only once and never again. All following calls will cause a NullReferenceException with no sign of what went wrong before. If the first user of your Singleton did catch the TypeInitializationException, you make it much harder to debug.

Yours,
Alois Kraus


# re: Lazy Vs Eager Init Singletons / Double-Check Lock Pattern 2/9/2007 9:51 AM Cosmin
Thread.VolatileRead(ref ...) is not supported in .NET 2.0 framework

# re: Lazy Vs Eager Init Singletons / Double-Check Lock Pattern 5/28/2007 4:16 PM Vlad
Why not, Cosmin, it is supported:

http://msdn2.microsoft.com/en-us/library/bah54t54(VS.80).aspx




# re: Lazy Vs Eager Init Singletons / Double-Check Lock Pattern 1/8/2009 4:34 AM Stacy Vicknair
What I don't understand is your double-checked lock: if it just uses an arbitrary object when locking, why is it necessary for the singleton's unique instance to still be volatile?

# re: Lazy Vs Eager Init Singletons / Double-Check Lock Pattern 1/8/2009 5:11 AM Alois Kraus
If you do not use volatile it could be that the next thread still sees a null reference although it has been initialized. That could lead to double initializations.

Yours,
Alois Kraus


# re: Lazy Vs Eager Init Singletons / Double-Check Lock Pattern 5/25/2009 11:52 AM Antoniu S
Stacy

1) first thread checks if singleton var is null; it is null
2) second thread instantiates class in the meantime
3) first thread acquires lock
4) when the first thread tries to compare null to the singleton var, it does that using a cached register that still holds null
5) so it reinitializes the singleton var again


# re: Lazy Vs Eager Init Singletons / Double-Check Lock Pattern 5/25/2009 12:45 PM Antoniu S
public DateTime Instance
{
    get { ... }
}

I guess this property needs to be declared static.
