Alois Kraus

blog

  Home  |   Contact  |   Syndication    |   Login
  113 Posts | 8 Stories | 297 Comments | 162 Trackbacks

News



Article Categories

Archives

Post Categories

Image Galleries

Programming

Recently I have found an interesting issue with Thread.Interrupt during application shutdown. Some application was crashing once a week and we had not really a clue what was the issue. Since it happened not very often it was left as is until we have got some memory dumps during the crash. A memory dump usually means WindDbg which I really like to use (I know I am one of the very few fans of it).  After a quick analysis I did find that the main thread already had exited and the thread with the crash was stuck in a Monitor.Wait. Strange Indeed. Running the application a few thousand times under the debugger would potentially not have shown me what the reason was so I decided to do what I call constructive debugging. With constructive I mean that I do not try to analyze the running application and try to repro it there. Instead I try to create a synthetic sample application which tries to reproduce the issue. I am done when I get the same crash and callstack as in the production application. This approach is usually quite successful.

I did create a simple Console application project and try to simulate the exact circumstances when the crash did happen from the information I have via memory dump and source code reading. The thread that was  crashing was actually MS code from an old version of the Microsoft Caching Application Block. From reading the code I could conclude that the main thread did call the Dispose method on the CacheManger class which did call Thread.Interrupt on the cache scavenger thread which was just waiting for work to do.

My first version of the repro looked like this

 

    static void Main(string[] args)
    {
        Thread t = new Thread(ThreadFunc)
        {
            IsBackground = true,
            Name = "Test Thread"
        };
        t.Start();

        Console.WriteLine("Interrupt Thread");
        t.Interrupt();
    }

    static void ThreadFunc()
    {
        while (true)
        {
            object value = Dequeue(); // block until unblocked or awaken via ThreadInterruptedException
        }
    }

    static object WaitObject = new object();

    static object Dequeue()
    {
        object lret = "got value";
        try
        {
            lock (WaitObject)
            {
            }
        }
        catch (ThreadInterruptedException)
        {
            Console.WriteLine("Got ThreadInterruptException");
            lret = null;
        }
        return lret;
    }

I do start a background thread and call Thread.Interrupt on it and then directly let the application terminate. The thread in the meantime does plenty of Monitor.Enter/Leave calls to simulate work on it. This first version did not crash. So I need to dig deeper. From the memory dump I did know that the finalizer thread was doing just some critical finalizers which were closing file handles. Ok lets add some long running finalizers to the sample.

class FinalizableObject : CriticalFinalizerObject
{
    ~FinalizableObject()
    {
        Console.WriteLine("Hi we are waiting to finalize now and block the finalizer thread for 5s.");
        Thread.Sleep(5000);
    }
}

class Program
{
    static void Main(string[] args)
    {
        FinalizableObject fin = new FinalizableObject();
        Thread t = new Thread(ThreadFunc)
        {
            IsBackground = true,
            Name = "Test Thread"
        };
        t.Start();

        Console.WriteLine("Interrupt Thread");
        t.Interrupt();
        GC.KeepAlive(fin); // prevent finalizing it too early
        // After leaving main the other thread is woken up via Thread.Abort
        // while we are finalizing. This causes a stackoverflow in the CLR ThreadAbortException handling at this time.
    }

With this changed Main method and a blocking critical finalizer I did get my crash just like the real application. The funny thing is that this is actually a CLR bug. When the main method is left the CLR does suspend all threads except the finalizer thread and declares all objects as garbage. After the normal finalizers were called the critical finalizers are executed to e.g. free OS handles (usually). Remember that I did call Thread.Interrupt as one of the last methods in the Main method. The Interrupt method is actually asynchronous and does wake a thread up and throws a ThreadInterruptedException only once unlike Thread.Abort which does rethrow the exception when an exception handling clause is left.

It seems that the CLR does not expect that a frozen thread does wake up again while the critical finalizers are executed. While trying to raise a ThreadInterrupedException the CLR goes down with an stack overflow. Ups not so nice. Why has this nobody noticed for years is my next question. As it turned out this error does only happen on the CLR for .NET 4.0 (x86 and x64). It does not show up in earlier or later versions of the CLR.

I have reported this issue on connect here but so far it was not confirmed as a CLR bug. But I would be surprised if my console application was to blame for a stack overflow in my test thread in a Monitor.Wait call.

What is the moral of this story? Thread.Abort is evil but Thread.Interrupt is too. It is so evil that even the CLR of .NET 4.0 contains a race condition during the CLR shutdown. When the CLR gurus can get it wrong the chances are high that you get it wrong too when you use this constructs. If you do not believe me see what Patrick Smacchia does blog about Thread.Abort and List.Sort. Not only the CLR creators can get it wrong. The BCL writers do sometimes have a hard time with correct exception handling as well. If you do tell me that you use Thread.Abort frequently and never had problems with it I do suspect that you do not have looked deep enough into your application to find such sporadic errors.

posted on Monday, October 22, 2012 10:04 AM

Feedback

# re: Thread.Interrupt Is Evil 10/23/2012 7:50 PM Mark
Does this still crash if:
GC.KeepAlive(t);
is inserted after:
t.Interrupt();
?

# re: Thread.Interrupt Is Evil 10/23/2012 9:13 PM Alex
Sometimes Thread.Abort and its sibling are the only practical solutions to a problem, which I assume is why they're still in the Framework.

It seems to be relatively 'safe' doing an Abort if the resulting unwind of the stack discards all state being modified by that thread, and you've got no cross-thread interactions like locking.

# re: Thread.Interrupt Is Evil 10/24/2012 1:33 AM Rodrigo
You guys must find a Girlfriend.
Badly

# re: Thread.Interrupt Is Evil 10/24/2012 3:46 AM Chris
Interrupting threads should only be a last resort. But in large, complex systems, it's not unreasonable that this is absolutely necessary at some point. If it isn't necessary, they should remove the method.

You could argue that the reason nobody found this race condition, is that it isn't used enough in the first place. Which means that the vast majority of people agree with you: Thread.Interrupt is evil.

# re: Thread.Interrupt Is Evil 10/24/2012 4:50 AM Dan Sutton
Can't you use Thread.Join?

# re: Thread.Interrupt Is Evil 10/24/2012 9:16 AM Alois Kraus
@Mark: GC.KeepAlive is not necessary since the finalizers are not processed until the next GC does run which is in this little app surely not the case. I have added it just for completeness.

@Alex: Thread.Abort is a solution but still a bad one. The CLR can abort a thread only cooperatively when it does wait but not when the call reaches into the OS like waiting for file IO to return. When a kernel driver interaction is blocking you cannot terminate the whole process with TerminateProcess. That said there are still more dangerous things like TerminateThread of the Windows API which is also strongly discouraged.

@Rodrigo: Done ;-). Married, two kids.

@Dan: When you want to shut down a thread it is bad practice to wait for it to terminate. It can happen that it will never terminate and you could block the shutdown of the rest. Besides teh shutdown scenario it might be that the component is created and destroyed very often. Waiting on the thread to terminate would introduce unnecessary wait times. The common sense approach is to signal the thread somehow that it should exit. Usually you do this with two events. One is the work event the other is the exit event. With WaitHandle.WaitAny you can wait for more than one event.






# Can't Reproduce this Behaviour in VS2012 10/26/2012 1:52 AM AndrewJacksonZA
I can't reproduce this behaviour in VS2012 on a .NET 4.5 project in Win7 x64. Pity. (Well, I suppose it's a *good* thing it doesn't happen, but I would've liked to be able to reproduce it :-)

# re: Thread.Interrupt Is Evil 10/26/2012 10:43 PM Alois Kraus
I have tried this on .NET 3.5, 4.0 and 4.5 both x64 and x86. I was only able to let it crash with 4.0 (x64 and x86). It happened nearly every time under the debugger and 8/20 times when I did let it run on console.


# re: Thread.Interrupt Is Evil 10/27/2012 3:21 PM ChadF
WARNING - BAD PUNS AHEAD!

@Rodrigo:

They try.. but the computer keeps Interrupting common activities, the relationship eventually Aborts, and the Threads that binds them stop working.

=)


Post A Comment
Title:
Name:
Email:
Comment:
Verification: