Alois Kraus

blog

  Home  |   Contact  |   Syndication    |   Login
  133 Posts | 8 Stories | 368 Comments | 162 Trackbacks

News



Article Categories

Archives

Post Categories

Programming

Threading can sometimes lead to impossible bugs where you have a hard time to find out how you got into this state. Lets suppose you have a nice WPF application which you dutifully test via the UIAutomation provider which can alter some state in your application.

    /// <summary>
    /// Interaction logic for TestWindow.xaml
    /// </summary>
    public partial class TestWindow : Window
    {
        int Data;

        public TestWindow()
        {
            InitializeComponent();
        }

        public void AlterData(int newValue)
        {
            Debug.Print("Executing AlterData {0}", Environment.StackTrace);
            Data = newValue;
        }

        protected override AutomationPeer OnCreateAutomationPeer()
        {
            var lret = new OwnPeer(this);
            return lret;
        }

        class OwnPeer : WindowAutomationPeer
        {
            public OwnPeer(TestWindow owner)
                : base(owner)
            { }

            protected override HostedWindowWrapper GetHostRawElementProviderCore()
            {
                Debug.Print("GetHostRawElementProviderCore called");
                ((TestWindow)base.Owner).AlterData(500);
                return base.GetHostRawElementProviderCore();
            }
        }
}

When you look at the call stacks all is normal:

Executing AlterData    at System.Environment.GetStackTrace(Exception e, Boolean needFileInfo)
   at System.Environment.get_StackTrace()
   at ThreadingIsHard.TestWindow.AlterData(Int32 newValue) in d:\Source\ThreadingIsHard\ThreadingIsHard\TestWindow.xaml.cs:line 52
   at ThreadingIsHard.TestWindow.OwnPeer.GetHostRawElementProviderCore() in d:\Source\ThreadingIsHard\ThreadingIsHard\TestWindow.xaml.cs:line 69
   at MS.Internal.Automation.ElementProxy.InContextGetHostRawElementProvider(Object unused)
   at MS.Internal.Automation.ElementUtil.<>c__DisplayClass1.<Invoke>b__0(Object unused)
   at System.Windows.Threading.ExceptionWrapper.InternalRealCall(Delegate callback, Object args, Int32 numArgs)
   at MS.Internal.Threading.ExceptionFilterHelper.TryCatchWhen(Object source, Delegate method, Object args, Int32 numArgs, Delegate catchHandler)
   at System.Windows.Threading.DispatcherOperation.InvokeImpl()
   at System.Threading.ExecutionContext.RunInternal(ExecutionContext executionContext, ContextCallback callback, Object state, Boolean preserveSyncCtx)
   at System.Threading.ExecutionContext.Run(ExecutionContext executionContext, ContextCallback callback, Object state, Boolean preserveSyncCtx)
   at System.Threading.ExecutionContext.Run(ExecutionContext executionContext, ContextCallback callback, Object state)
   at System.Windows.Threading.DispatcherOperation.Invoke()
   at System.Windows.Threading.Dispatcher.ProcessQueue()
   at System.Windows.Threading.Dispatcher.WndProcHook(IntPtr hwnd, Int32 msg, IntPtr wParam, IntPtr lParam, Boolean& handled)
   at MS.Win32.HwndWrapper.WndProc(IntPtr hwnd, Int32 msg, IntPtr wParam, IntPtr lParam, Boolean& handled)
   at MS.Win32.HwndSubclass.DispatcherCallbackOperation(Object o)
   at System.Windows.Threading.ExceptionWrapper.InternalRealCall(Delegate callback, Object args, Int32 numArgs)
   at MS.Internal.Threading.ExceptionFilterHelper.TryCatchWhen(Object source, Delegate method, Object args, Int32 numArgs, Delegate catchHandler)
   at System.Windows.Threading.Dispatcher.LegacyInvokeImpl(DispatcherPriority priority, TimeSpan timeout, Delegate method, Object args, Int32 numArgs)
   at MS.Win32.HwndSubclass.SubclassWndProc(IntPtr hwnd, Int32 msg, IntPtr wParam, IntPtr lParam)
   at MS.Win32.UnsafeNativeMethods.DispatchMessage(MSG& msg)
   at MS.Win32.UnsafeNativeMethods.DispatchMessage(MSG& msg)
   at System.Windows.Threading.Dispatcher.PushFrameImpl(DispatcherFrame frame)
   at System.Windows.Application.RunInternal(Window window)
   at System.Windows.Application.Run()
   at ThreadingIsHard.App.Main() in d:\Source\UIAutomationSample\UIAutomationTest\obj\Debug\App.g.cs:line 0

So far so expected. Now lets make our well behaving application at little more unexpected and take a lock for some time. To make it simple I deadlock the main UI thread by trying to enter a lock which is owned by another thread like this

        private void Button_Click(object sender, RoutedEventArgs e)
        {
            Deadlock();
        }

        public void Deadlock()
        {
            Barrier b = new Barrier(2);
            Thread t = new Thread(() =>
            {
                lock (this)
                {
                    b.SignalAndWait(); // let main UI thread run into lock
                    Thread.Sleep(Timeout.Infinite); // never leave the lock
                }
            });
            t.Start();

            b.SignalAndWait();
            lock (this) // deadlock which never releases lock
            {
            }
        }

If you have never seen Barrier it is a class introduced with .NET 4.0 which is very handy to synchronize threads to a specific time point. Only when both threads have signaled then both threads are released at the same time. In our case this ensures a nice deadlock. Here is the quiz:

Can something bad happen while the main UI thread is deadlocked?

As it turns out it can. Lets have a look at this impossible call stack:

Executing AlterData    at System.Environment.GetStackTrace(Exception e, Boolean needFileInfo)
   at System.Environment.get_StackTrace()
   at ThreadingIsHard.TestWindow.AlterData(Int32 newValue) in d:\Source\ThreadingIsHard\ThreadingIsHard\TestWindow.xaml.cs:line 25
   at ThreadingIsHard.TestWindow.OwnPeer.GetHostRawElementProviderCore() in d:\Source\ThreadingIsHard\ThreadingIsHard\TestWindow.xaml.cs:line 44
   at MS.Internal.Automation.ElementProxy.InContextGetHostRawElementProvider(Object unused)
   at MS.Internal.Automation.ElementUtil.<>c__DisplayClass1.<Invoke>b__0(Object unused)
   at System.Windows.Threading.ExceptionWrapper.InternalRealCall(Delegate callback, Object args, Int32 numArgs)
   at MS.Internal.Threading.ExceptionFilterHelper.TryCatchWhen(Object source, Delegate method, Object args, Int32 numArgs, Delegate catchHandler)
   at System.Windows.Threading.Dispatcher.LegacyInvokeImpl(DispatcherPriority priority, TimeSpan timeout, Delegate method, Object args, Int32 numArgs)
   at MS.Internal.Automation.ElementUtil.Invoke(AutomationPeer peer, DispatcherOperationCallback work, Object arg)
   at MS.Internal.Automation.ElementProxy.get_HostRawElementProvider()
   at MS.Internal.Automation.UiaCoreProviderApi.RawUiaReturnRawElementProvider(IntPtr hwnd, IntPtr wParam, IntPtr lParam, IRawElementProviderSimple el)
   at MS.Internal.Automation.UiaCoreProviderApi.RawUiaReturnRawElementProvider(IntPtr hwnd, IntPtr wParam, IntPtr lParam, IRawElementProviderSimple el)
   at System.Windows.Automation.Provider.AutomationInteropProvider.ReturnRawElementProvider(IntPtr hwnd, IntPtr wParam, IntPtr lParam, IRawElementProviderSimple el)
   at System.Windows.Interop.HwndTarget.CriticalHandleWMGetobject(IntPtr wparam, IntPtr lparam, Visual root, IntPtr handle)
   at System.Windows.Interop.HwndTarget.HandleMessage(WindowMessage msg, IntPtr wparam, IntPtr lparam)
   at System.Windows.Interop.HwndSource.HwndTargetFilterMessage(IntPtr hwnd, Int32 msg, IntPtr wParam, IntPtr lParam, Boolean& handled)
   at MS.Win32.HwndWrapper.WndProc(IntPtr hwnd, Int32 msg, IntPtr wParam, IntPtr lParam, Boolean& handled)
   at MS.Win32.HwndSubclass.DispatcherCallbackOperation(Object o)
   at System.Windows.Threading.ExceptionWrapper.InternalRealCall(Delegate callback, Object args, Int32 numArgs)
   at MS.Internal.Threading.ExceptionFilterHelper.TryCatchWhen(Object source, Delegate method, Object args, Int32 numArgs, Delegate catchHandler)
   at System.Windows.Threading.Dispatcher.LegacyInvokeImpl(DispatcherPriority priority, TimeSpan timeout, Delegate method, Object args, Int32 numArgs)
   at MS.Win32.HwndSubclass.SubclassWndProc(IntPtr hwnd, Int32 msg, IntPtr wParam, IntPtr lParam)
   at System.Threading.SynchronizationContext.WaitHelper(IntPtr[] waitHandles, Boolean waitAll, Int32 millisecondsTimeout)
   at System.Windows.Threading.DispatcherSynchronizationContext.Wait(IntPtr[] waitHandles, Boolean waitAll, Int32 millisecondsTimeout)
   at System.Threading.Monitor.Enter(Object obj)
   at ThreadingIsHard.TestWindow.Deadlock() in d:\Source\ThreadingIsHard\ThreadingIsHard\TestWindow.xaml.cs:line 69
   at ThreadingIsHard.TestWindow.Button_Click(Object sender, RoutedEventArgs e) in d:\Source\ThreadingIsHard\ThreadingIsHard\TestWindow.xaml.cs:line 52
   at System.Windows.EventRoute.InvokeHandlersImpl(Object source, RoutedEventArgs args, Boolean reRaised)
   at System.Windows.UIElement.RaiseEventImpl(DependencyObject sender, RoutedEventArgs args)
   at System.Windows.Controls.Button.OnClick()
   at System.Windows.Controls.Primitives.ButtonBase.OnMouseLeftButtonUp(MouseButtonEventArgs e)
   at System.Windows.RoutedEventArgs.InvokeHandler(Delegate handler, Object target)
   at System.Windows.EventRoute.InvokeHandlersImpl(Object source, RoutedEventArgs args, Boolean reRaised)
   at System.Windows.UIElement.ReRaiseEventAs(DependencyObject sender, RoutedEventArgs args, RoutedEvent newEvent)
   at System.Windows.RoutedEventArgs.InvokeHandler(Delegate handler, Object target)
   at System.Windows.EventRoute.InvokeHandlersImpl(Object source, RoutedEventArgs args, Boolean reRaised)
   at System.Windows.UIElement.RaiseEventImpl(DependencyObject sender, RoutedEventArgs args)
   at System.Windows.UIElement.RaiseTrustedEvent(RoutedEventArgs args)
   at System.Windows.Input.InputManager.ProcessStagingArea()
   at System.Windows.Input.InputProviderSite.ReportInput(InputReport inputReport)
   at System.Windows.Interop.HwndMouseInputProvider.ReportInput(IntPtr hwnd, InputMode mode, Int32 timestamp, RawMouseActions actions, Int32 x, Int32 y, Int32 wheel)
   at System.Windows.Interop.HwndMouseInputProvider.FilterMessage(IntPtr hwnd, WindowMessage msg, IntPtr wParam, IntPtr lParam, Boolean& handled)
   at System.Windows.Interop.HwndSource.InputFilterMessage(IntPtr hwnd, Int32 msg, IntPtr wParam, IntPtr lParam, Boolean& handled)
   at MS.Win32.HwndWrapper.WndProc(IntPtr hwnd, Int32 msg, IntPtr wParam, IntPtr lParam, Boolean& handled)
   at MS.Win32.HwndSubclass.DispatcherCallbackOperation(Object o)
   at System.Windows.Threading.ExceptionWrapper.InternalRealCall(Delegate callback, Object args, Int32 numArgs)
   at MS.Internal.Threading.ExceptionFilterHelper.TryCatchWhen(Object source, Delegate method, Object args, Int32 numArgs, Delegate catchHandler)
   at System.Windows.Threading.Dispatcher.LegacyInvokeImpl(DispatcherPriority priority, TimeSpan timeout, Delegate method, Object args, Int32 numArgs)
   at MS.Win32.HwndSubclass.SubclassWndProc(IntPtr hwnd, Int32 msg, IntPtr wParam, IntPtr lParam)
   at MS.Win32.UnsafeNativeMethods.DispatchMessage(MSG& msg)
   at MS.Win32.UnsafeNativeMethods.DispatchMessage(MSG& msg)
   at System.Windows.Threading.Dispatcher.PushFrameImpl(DispatcherFrame frame)
   at System.Windows.Application.RunInternal(Window window)
   at System.Windows.Application.Run()
   at ThreadingIsHard.App.Main() in d:\Source\UIAutomationSample\UIAutomationTest\obj\Debug\App.g.cs:line 0

Hm the lock is still held but how can AlterData be called while we are trying to enter the lock which was clearly not released? We do see it even in the call stack (green line)? To understand this you need to take a deep look back in history about STA threads and window message pumping. Monitor.Enter is not a replacement for WaitForSingleObject on a STA thread. Instead it calls into CoWaitForMultipleHandles which ultimately calls MsgWaitForMultipleObjectsEx. Here is the (simplified) call stack:

USER32!MsgWaitForMultipleObjectsEx
combase!CCliModalLoop::SetPointerInputModeAsAppropriateForQueueAttachmentConditions
combase!CCliModalLoop::BlockFn
combase!ClientCallRetryContext::StartTimer
combase!ClassicSTAThreadWaitForHandles
combase!CoWaitForMultipleHandles
System.Windows.Threading.DispatcherSynchronizationContext.Wait(IntPtr[], Boolean, Int32)
System.Threading.Monitor.Enter(System.Object) 
ThreadingIsHard.TestWindow.Deadlock() [d:\Source\ThreadingIsHard\ThreadingIsHard\TestWindow.xaml.cs @ 69] 
ThreadingIsHard.TestWindow.Button_Click(System.Object, System.Windows.RoutedEventArgs) [d:\Source\ThreadingIsHard\ThreadingIsHard\TestWindow.xaml.cs @ 52] 
System.Windows.EventRoute.InvokeHandlersImpl(System.Object, System.Windows.RoutedEventArgs, Boolean)

MsgWaitForMultipleObjectsEx is just like WaitForMultipleObjects with the important exception that the wait can also return when specific window messages are put into the window message queue. The official docs are silent about which window messages are pumped. I have tried

  • Keyboard press
  • Mouse move
  • Mouse button
  • WM_PAINT
  • BeginInvoke
  • Invoke
  • Dispatcher.Invoke

NONE of these window messages are pumped while holding a lock. Chris Brumme says about this in his blog

On lower operating systems, the CLR uses MsgWaitForMultipleHandles / PeekMessage / MsgWaitForMultipleHandlesEx and whatever else is available on that version of the operating system to try to mirror the behavior of CoWaitForMultipleHandles. The net effect is that we will always pump COM calls waiting to get into your STA. And any SendMessages to any windows will be serviced. But most PostMessages will be delayed until you have finished blocking.

Although the blog post is over 10 years old the information is still relevant. What I have seen so far all Window messages will be blocked except for COM calls as it is the case for UIAutomation calls which use COM as transport vehicle to marshal your UI automation calls across processes. If you have a custom UI Automation provider you should check twice if you are calling into state altering methods of your UI objects which can be reentrant called while your application was trying get access to a lock. While you were waiting for the lock your window could already be disposed! Bad things can and will happen. UIAutomation is not only used for integration testing but also by screen readers and tools for people who have problems reading a computer screen. It is therefore possible that these issues not only show up during tests but also on customer machines. If you get mysterious crashes at places where you would not have expected them check the call stack  for AutomationInteropProvider calls. If there are some you most likely have hit a problem due to unexpected "continuations" while taking a lock. Unfortunately there is not really a good solution known to me except to pay vigilant attention to your UI Automation provider objects which must ensure not to call any methods which could alter the state of objects. If you have pending Begin/Invoke calls pending in your message queue these messages will still not get pumped. Only the direct COM calls and their side effects can tunnel through blocking calls. Perhaps I am telling you things you already knew but this one was quite surprising. Perhaps it would be the best to stop pumping COM messages at all so your calls nicely line up with the other window messages as it should be. But I guess there is still too much legacy code out there which would be deadlocked if no message pumping occurs anymore. The current Windows 10 Build still pumps messages as always. Perhaps we could get a flag to switch over to the "new" behavior to prevent such nasty problems.

posted on Tuesday, October 7, 2014 10:40 AM