How To Break The CLR For A Cool Feature

Did you ever want a feature but the system got in your way? In my case I wanted to to get the last thrown exception outside of any catch block. Be warned that this is a deep dive into the guts of the Common Language Runtime. Language architects and Api designers should stop reading from here on. Yes I did break the rules of encapsulation and many other things. I am perfectly fine with a reputation to use Apis in not intended ways but otherwise it would have been impossible create a tracing library which does automatically trace the thrown exceptions without the need for any catch handler.

Still reading? Ok. Recently I did a lot debugging with Sos and Psscor2 (waaay better than Sos) to track down weird problems. One of the coolest features is that you can display all managed threads along with the last thrown exception for each thread. I did wonder how the debugger knows what the last thrown exception in a thread was. Luckily we can look at the SOS source code because the SSCLI reference implementation does contain nearly the same code as the official .NET Framework 2.0. After poking around for some time I hit gold in sscli20\clr\src\vm\threads.h(2612):

private :

    // Stores the most recently thrown exception. We need to have a handle in
    // case a GC occurs before we catch so we don't lose the object. Having a
    // static allows others to catch outside of COM+ w/o leaking a handler and
    // allows rethrow outside of COM+ too. Differs from m_pThrowable in that it
    // doesn't stack on nested exceptions.
    OBJECTHANDLE m_LastThrownObjectHandle;  // Unsafe to use directly.  Use
                                            // accessors instead.

That does mean that the CLR itself does store the last thrown exception inside the unmanaged implementation of each managed thread object. Why on earth could that give me a cool feature? Lets have a look at this C# code

try
                    {
                        ThrowsException();
                    }
                    finally
                    {
                        Thread.Sleep(1000);
                    }

It is sometimes important to know if the finally block executes due to an thrown exception or all went well and we are executing the happy path. Three is a simple method to find out if you are in an exception unwind scenario or not which I have published here. That alone gives you the ability to write e.g. maintainable unit tests. You can when you exit the test with an exception print the contents of the tests out to Console only when you need to look at it. That does greatly minimize the output of your unit tests and you can examine the faulting values right away without the need for a debugger. Some tests of my ApiChange tool use this method like here:

 [Test]
        public void GetVirtualEvents()
        {
            EventQuery query = new EventQuery("virtual * *");
            var events = query.GetMatchingEvents(myClassWithManyEvents);
            try
            {
                Assert.AreEqual(1, events.Count, "Event count");
            }
            finally
            {
                if (ExceptionHelper.InException)
                {
                    foreach (var ev in events)
                    {
                        Console.WriteLine("{0}", ev.Print());
                    }
                }
            }
        }

This test will only print all the events that were found when something was wrong. The ApiChange tool does also contain a small Tracing library which allows you to trace entering and leaving of methods with very little overhead. The Api for the Tracer looks like this:

 public class TracerTests
    {
        // should be static since the construction of this type is costly (some string splits and array magic)
        static TypeHashes myType = new TypeHashes(typeof(TracerTests));
 
        
        [Test]
        public void Trace_Instance_Messages_To_DebugOutput()
        {
            TracerConfig.Reset("debugoutput");
 
            using (Tracer t = new Tracer(myType, "Test_Method"))
            {
                t.Info("Info message");
                t.Warning("Warning message");
                t.Error("Error message");
            }
        }

But back to our killer feature. What I really wanted was automatic tracing of exceptions when the trace object was disposed. This should be done automatically without the need to catch every exception upon method leave in a catch handler which would lead to very awkward code.

 [Test]
        [ExpectedException(typeof(NotImplementedException))]
        public void Demo_Show_Leaving_Trace_With_Exception()
        {
            TracerConfig.Reset("console");
 
            SomeMethod();
        }
 
        void SomeMethod()
        {
            using (Tracer t = new Tracer(myType, "SomeMethod"))
            {
                SomeOtherMethod();
            }
        }
 
        private void SomeOtherMethod()
        {
            using (Tracer t = new Tracer(myType, "SomeOtherMethod"))
            {
                FaultyMethod();
            }
        }
 
        private void FaultyMethod()
        {
            throw new NotImplementedException("Hi this a fault");
        }

And the output is:

***** ApiChange.IntegrationTests.Diagnostics.TracingTests.Demo_Show_Leaving_Trace_With_Exception
18:57:46.665  03064/05180 <{{         > ApiChange.IntegrationTests.Diagnostics.TracingTests.SomeMethod 
18:57:46.668  03064/05180 <{{         > ApiChange.IntegrationTests.Diagnostics.TracingTests.SomeOtherMethod 
18:57:46.670  03064/05180 <         }}< ApiChange.IntegrationTests.Diagnostics.TracingTests.SomeOtherMethod Exception thrown: System.NotImplementedException: Hi this a fault
   at ApiChange.IntegrationTests.Diagnostics.TracingTests.FaultyMethod()
   at ApiChange.IntegrationTests.Diagnostics.TracingTests.SomeOtherMethod()
   at ApiChange.IntegrationTests.Diagnostics.TracingTests.SomeMethod()
   at ApiChange.IntegrationTests.Diagnostics.TracingTests.Demo_Show_Leaving_Trace_With_Exception()
18:57:46.670  03064/05180 <         }}< ApiChange.IntegrationTests.Diagnostics.TracingTests.SomeOtherMethod Duration 2ms
18:57:46.689  03064/05180 <         }}< ApiChange.IntegrationTests.Diagnostics.TracingTests.SomeMethod Duration 24ms

Did you notice that for the user code nothing needed to be done to notify tracing about the exception? You simply wrap your methods with using statements to ensure that upon method enter and leave the corresponding enter and leave trace method is printed to the configured output device. The exception is only printed out once to prevent cluttering your output with the same message over and over again. In the case you want to try this out by yourself you only need to download the latest source of ApiChange and reference ApiChange.Api.dll from your project. Then litter your code with tracer instances and set the environment variable _Trace to one of the output devices. If you are not sure you only need to remember _Trace and set it to some random value and the help is traced to the console and debugoutput. The currently released binaries do not contain this exception printing feature.

Here is the help:

Tracing can be enabled via the environment variable _Trace

Format: <Output Device>;<Type Filter> <Message Filter>; <Type Filter> <Message Filter>; ...
    <Output Device>  Can be: Default, File, Debugoutput or Console.

Output Device	Description
Default	Write traces to configured trace listeners with the source name _Trace read from App.config
File [FileName]	Write traces to given file name. The AppDomain name and process id are prepended to the file name to make it unique. If none given a trace file where the executable is located is created.
DebugOutput	Write to windows kernel via OutputDebugString which can be best viewed with Dbgview from SysInternals
Console	Write to stdout

<[!]TypeFilter>  It is basically the full qualified type name (case insensitive).
                     If the TypeFilter begins with a ! character it is treated as exclusion filter.
                     Example: ApiChange.Infrastructure.AggregateException or * or ApiChange.* partial name matches like Api* are NOT supported

    <Message Filter> Enable a specific trace level and/or severity filter
                     Several filters can be combined with the + sign.
                     Allowed trace Levels are: Level*, Level1, Level2, ... Level5, LevelDispose. Shortcuts are l*, l1, ... l5, ldispose.
                     Severity Filters are:  All, InOut, Info, I, Information, Warning, Warn, W, Error, E, Exception, Ex 
                     Example: Level1+InOut+Info
                     When no severity and or trace level is specified all levels/severities are enabled.

Example:

Enable full tracing to debugoutput for all types except the ones which reside in the ApiChange.Infrastructure namespace
_Trace=debugoutput; ApiChange.* all;!ApiChange.Infrastructure.* all

Trace all exceptions in the first method where it is first encountered
_Trace=file c:\temp\exceptions.txt; * Exception

The configuration can be as easy as _Trace=console and everything is traced to console or you can be more restrictive and you can trace only a specific class. That is kind of cool. But how did you get your hands on the Exception object? This did involve quite some reverse engineering with Windbg to find out. The trickier part was how I could do this in a safe way. First I needed to get my hands on the unmanaged Thread object which is easy if you examine the managed thread object. There is a little field called DONT_USE_InternalThread.

Since I am a bad guy the first thing is to use this do not use field to get the address of the unmanaged Thread object. From there we know that it does contain an OBJECTHANDLE to the last thrown exception object. The next thing is to calculate the offset where the m_LastThrownObjectHandle resides. These offsets are likely to change and it is therefore a bad practice to do such things. After doing some tests I found the offsets for ppEx in the unmanaged thread object:

	.NET 2	.NET 4
32 Bit	0x180	0x188
64 Bit	0x240	0x250

The tricky part is that this pointer does not hold the address of the managed exception object but a pointer pointer. To find this out did take some time. But it should become worse. Now I have got the address which !pe in Windbg would display me as well. How can I convert this back to a managed object reference? I did manage to use Marshal.UnsafeAddrOfPinnedArrayElement to get the address of a managed object inside an array and write to this address with Marshal.WriteIntPtr. But this is a recipe for disaster since the GC is free to move objects around and I am writing potentially to a memory location which is after a GC in the middle of my method part of a completely different managed object. I am corrupting the managed heap with this method! I did not manage to get it right so I asked at the CLR forum for help and got some hints how I could solve this problem. The solution I finally got working employs some dynamic IL code generation to cast an IntPtr to a managed object back to a managed reference. This prevents the problem that I write to some arbitrary location into the managed heap. One race condition does still exist though: When I try to dereference the pointer to a managed object a GC might move the object to another address. Then I am pointing to some random memory which does cause an ExecutionEngineException. I will check how I can get around this one but I am pretty certain that you will never see such an error because the methods employed here have a very small window where something could get wrong.

Here is the class to get the last thrown exception from the current thread:

class LastException
    {
        static FieldInfo myThreadPointerFieldInfo;
        static int myThreadOffset = -1;
 
        PtrConverter<Exception> myConverter = new PtrConverter<Exception>();
 
        public LastException()
        {
            if (myThreadPointerFieldInfo == null)
            {
                // Dont read the don´t or you will get scared. 
                // I prefer to read this more like: If you dont know the rules you will
                // get biten by the next .NET Framework update
                myThreadPointerFieldInfo = typeof(Thread).GetField("DONT_USE_InternalThread",
                   BindingFlags.Instance | BindingFlags.NonPublic);
            }
 
            if (myThreadOffset == -1)
            {
                ClrContext context = ClrContext.None;
                context |= (IntPtr.Size == 8) ? ClrContext.Is64Bit : ClrContext.Is32Bit;
                context |= (Environment.Version.Major == 2) ? ClrContext.IsNet2 : ClrContext.None;
                context |= (Environment.Version.Major == 4) ? ClrContext.IsNet4 : ClrContext.None;
 
                switch(context)
                {
                    case ClrContext.Is32Bit|ClrContext.IsNet2:
                        myThreadOffset = 0x180;
                        break;
                    case ClrContext.Is32Bit|ClrContext.IsNet4:
                        myThreadOffset = 0x188;
                        break;
                    case ClrContext.Is64Bit|ClrContext.IsNet2:
                        myThreadOffset = 0x240;
                        break;
                    case ClrContext.Is64Bit|ClrContext.IsNet4:
                        myThreadOffset = 0x250;
                        break;
                }
            }
        }
 
        public Exception GetLastException()
        {
            Exception lret = null;
            if (myThreadPointerFieldInfo != null)
            {
                IntPtr pInternalThread = (IntPtr)myThreadPointerFieldInfo.GetValue(Thread.CurrentThread);
                IntPtr ppEx = Marshal.ReadIntPtr(pInternalThread, myThreadOffset);
                if (ppEx != IntPtr.Zero)
                {
                    IntPtr pEx = Marshal.ReadIntPtr(ppEx);
                    if( pEx != IntPtr.Zero )
                    {
                        lret = myConverter.ConvertFromIntPtr(pEx);
                    }
                }
            }
            return lret;
        }
    }
 
    /// <summary>
    /// There are currently 4 combinations possible but I want to 
    /// stay extensible since some .NET Framework patch might need
    /// additional treatement.
    /// </summary>
    [Flags]
    enum ClrContext
    {
        None,
        Is32Bit = 1,
        Is64Bit = 2,
        IsNet2 = 4,
        IsNet4 = 8
    }

It does dereference the pointer to the managed exception and converts it back to a managed reference. The code to cast a pointer back to a managed reference is a bit trickier but I wanted this feature as safe as possible:

 /// <summary>
    /// This class can convert any pointer to a managed object into a true object reference back.
    /// </summary>
    /// <typeparam name="T"></typeparam>
    public class PtrConverter<T>
    {
        delegate U Void2ObjectConverter<U>(IntPtr pManagedObject);
        static Void2ObjectConverter<T> myConverter;
 
        // The type initializer is run every time the converter is instantiated with a different 
        // generic argument. 
        static PtrConverter()
        {
            GenerateDynamicMethod();
        }
 
        static void GenerateDynamicMethod()
        {
            if (myConverter == null)
            {
                DynamicMethod method = new DynamicMethod("ConvertPtrToObjReference", typeof(T),
                               new Type[] { typeof(IntPtr) }, StaticModule.UnsafeModule);
                var gen = method.GetILGenerator();
                // Load first argument 
                gen.Emit(OpCodes.Ldarg_0);
                // return it directly. The Clr will take care of the cast!
                // this construct is unverifiable so we need to plug this into an assembly with 
                // IL Verification disabled
                gen.Emit(OpCodes.Ret);
                myConverter = (Void2ObjectConverter<T>)method.CreateDelegate(typeof(Void2ObjectConverter<T>));
            }
        }
 
        /// <summary>
        /// Convert a pointer to a managed object back to the original object reference
        /// </summary>
        /// <param name="pObj">Pointer to managed object</param>
        /// <returns>Object reference</returns>
        /// <exception cref="ExecutionEngineException">When the pointer does not point to valid CLR object. This can happen when the GC decides to move object references to new memory locations. 
        /// Beware this possibility exists all the time (although the probability should be very low)!</exception>
        public T ConvertFromIntPtr(IntPtr pObj)
        {
            return myConverter(pObj);
        }
    }
 
    class StaticModule
    {
        const string ModuleAssemblyName = "AloisDynamicCaster";
 
        static Module myUnsafeModule;
        public static Module UnsafeModule
        {
            get
            {
                if (myUnsafeModule == null)
                {
                    AssemblyName assemblyName = new AssemblyName(ModuleAssemblyName);
                    AssemblyBuilder aBuilder = AppDomain.CurrentDomain.DefineDynamicAssembly(assemblyName,
                                                                                 AssemblyBuilderAccess.Run);
                    ModuleBuilder mBuilder = aBuilder.DefineDynamicModule(ModuleAssemblyName);
                    // set SkipVerification=true on our assembly to prevent VerificationExceptions which warn
                    // about unsafe things but we want to do unsafe things after all.
                    Type secAttrib = typeof(SecurityPermissionAttribute);
                    var secCtor = secAttrib.GetConstructor(new Type[] { typeof(SecurityAction) });
                    CustomAttributeBuilder attribBuilder = new CustomAttributeBuilder(secCtor,
                        new object[] { SecurityAction.Assert },
                        new PropertyInfo[] { secAttrib.GetProperty("SkipVerification", BindingFlags.Instance | BindingFlags.Public) },
                        new object[] { true });
 
                    aBuilder.SetCustomAttribute(attribBuilder);
                    TypeBuilder tb = mBuilder.DefineType("MyDynamicType", TypeAttributes.Public);
                    myUnsafeModule = tb.CreateType().Module;
                }
 
                return myUnsafeModule;
            }
        }
    }

Just to give you an idea how fast the trace library is here are the perf numbers from the unit tests. The actual number of traced lines is a factor 5 bigger since I did measure only how much a typical trace compound statement would cost:

void GenerateTraces()
        {
            using (Tracer t = new Tracer(myType, "GenerateTraces"))
            {
                t.Info("Infoxx message: {0}", "Some other info string");
                t.Warning("Warning message: {0}", "Some other warning string");
                t.Error("Error message {0}", "Some other error string");
            }
        }


Output Device	GenerateTraces/s
Off	4 562 044
Null - not matching	2 380 953
Null	110 571
File	18 957
DebugOutput	17 331
DebugOutput (DbgView)	156

The generated output file was 330 MB in 2.53s for 50000 calls of GenerateTraces which is quite good. The most important perf aspect is the trace filter performance (Null – matching) which means that you can enable tracing with nearly no overhead for only some classes. The filter code does employ a cool trick. The name TypeHashes may point you to the right direction when you guess how this ultra fast filter performance could be achieved. You can use this facility for your own good to enable alternate execution paths via configuration.

 [Test]
        public void Demo_Alternate_Flow()
        {
            using(Tracer t = new Tracer(myType, "Demo_Alternate_Flow"))
            {
                t.InfoExecute(()=>
                    {
                        Console.WriteLine("Execute code only when info level is enabled");
                    });
            }
        }

Conditional execution of delegates comes in handy if you want to pipe massive amounts of formatted data to the configured output device but you need to do some custom formatting which should only happen when the trace level is actually enabled.

This is much much faster than any other logging library. Log4Net, NLog, Enterprise Library are magnitudes slower because these libraries do not differentiate between logging and tracing. If you want to follow your multi threaded application flow you must have a fast tracing library like the one that is built into the ApiChange tool.

I still have many ideas what you can do with these classes. E.g you could write a Windbg script that does give you all object instances of a specific type you do not have access to and use the gained addresses to cast them back to managed references to get access to them. Another cool thing would be to extend tracing so you can take a minidump when you encounter a known sporadic exception from some class. That is easy now since we can check the exception type inside the Dispose method of the tracer. Have fun using these things.

Sunday, July 04, 2010 6:11 AM

This article is part of the GWB Archives. Original Author: Alois kraus

Replatforming Guide: Pros, Cons, and Impact

Deciding to replatform is no small feat; it’s like setting sails for new horizons with your digital presence. Weighing the

Cypress vs Selenium: Why Cypress is Better!

Navigating the competitive landscape of web testing tools, Cypress emerges as a noteworthy contender, outshining Selenium with its cutting-edge advantages.

How To Break The CLR For A Cool Feature

Related Posts

Replatforming Guide: Pros, Cons, and Impact

Cypress vs Selenium: Why Cypress is Better!