I have been working on a problem for a couple of weeks now. I even wrote about it in Windows CE: Finding an Out of Control Thread where I declared that I had found the thread that had the problem. It turns out that I was wrong, and I am willing to admit it.
In Windows CE: Finding an Out of Control Thread I showed how I found the thread that was using 100% of the CPU time by finding the thread priority of the thread. I didn’t discuss it, but I did consider priority inversion and ruled that out. It turns out that I misinterpreted the data when ruling out priority inversion, and to top it off I didn’t consider that the priority inversion could be occurring in the customer’s application.
Priority Inversion is something that occurs when two threads are synchronized with one of the synchronization objects. When a high priority thread is waiting to get a synchronization object and a low priority thread has the synchronization object, the priority of the low priority thread will be increased up to the priority of the high priority thread so that it can release the synchronization object.
In Windows CE: Finding an Out of Control Thread I used ShowRunningProcesses() from Windows CE: Using ToolHelpAPI to get more information about running processes to output the process and thread information when I detected that one of the threads was out of control. The output looks like this:
Process                     PID Base Priority # Threads Base Addr Access Key
NK.EXE                  7FFF002             3         2 C2000000          1
                Usage
Thread
Owner
Base
Delta
Flags
Access
Current
                Count
ID
 
Priority
Priority
 
Key
Thread ID
                1
7FFCFDA
7FFF002
1
0
2
1
7FFF002
                1
7FFF266
7FFF002
255
0
5
FFFFFFFF
7FFF002
filesys.exe            87FE8B52             3         4   4000000          2
                Usage
Thread
Owner
Base
Delta
Flags
Access
Current
                Count
ID
 
Priority
Priority
 
Key
Thread ID
                1
E7F09EE2
87FE8B52
251
0
4
3
87FE8B52
                1
47F7F836
87FE8B52
251
0
2
3
87FE8B52
                1
27F93956
87FE8B52
251
0
2
3
87FE8B52
                1
87FE8B2E
87FE8B52
251
0
2
3
87FE8B52
device.exe              7F7A616             3        78   6000000          4
                Usage
Thread
Owner
Base
Delta
Flags
Access
Current
                Count
ID
 
Priority
Priority
 
Key
Thread ID
                1
8783EF6A
7F7A616
100
0
2
5
7F7A616
                1
8783E906
7F7A616
91
0
2
5
7F7A616
                1
2784AF02
7F7A616
150
0
2
5
7F7A616
                1
7854642
7F7A616
101
0
2
5
7F7A616
                1
87859FB2
7F7A616
251
0
4
5
7F7A616
                1
87859882
7F7A616
10
0
0
5
7F7A616
gwes.exe                7841DAA             3         7   8000000          8
                Usage
Thread
Owner
Base
Delta
Flags
Access
Current
                Count
ID
 
Priority
Priority
 
Key
Thread ID
                1
675B683A
7841DAA
249
0
2
9
7841DAA
                1
75C810A
7841DAA
249
0
2
9
7841DAA
                1
675CDD72
7841DAA
251
0
2
9
7841DAA
App1.exe          871DE152             3         4   E000000         40
                Usage
Thread
Owner
Base
Delta
Flags
Access
Current
                Count
ID
 
Priority
Priority
 
Key
Thread ID
                1
26FD1FBE
871DE152
251
0
5
41
871DE152
                1
87206DF6
871DE152
248
0
1
41
871DE152
                1
E7206D8A
871DE152
248
0
2
41
871DE152
                1
870A254A
871DE152
249
0
2
49
7841DAA
CSWrapper.exe          A5BE8D46             3        28 1C000000       2000
                Usage
Thread
Owner
Base
Delta
Flags
Access
Current
                Count
ID
 
Priority
Priority
 
Key
Thread ID
                1
E43A1FAE
A5BE8D46
251
0
5
2001
A5BE8D46
                1
243C3C82
A5BE8D46
251
0
1
2001
A5BE8D46
                1
C447DDC2
A5BE8D46
251
160
1
2007
7F7A616
                1
8447DD12
A5BE8D46
251
0
1
2001
A5BE8D46
 LowPriorityThread Priority 251 setting 250
LowPriorityThread Priority 250 setting 249
LowPriorityThread Priority 249 setting 248
… Lines of output cut from here
LowPriorityThread Priority 94 setting 93
LowPriorityThread Priority 93 setting 92
LowPriorityThread Priority 92 setting 91
Based on this output, I can tell that the out of control thread is running at priority 91. I know this because as soon as my low priority thread's priority is set to priority 91 it starts getting its share of quantum and starts running again. That is at priority 92 and lower, it does not get any quantum because the out of control thread doesn’t yield to any of the lower priority threads.
Priority 91 shows up clearly on the blue line above as a thread running device.exe. I should note here, that I removed a lot of lines from this output that didn’t add value to this article, so the green highlighted line wasn’t so obvious buried in the output.   I will get back to the green highlighted line later.
I know exactly which thread is running at priority 91, so I thought I was well on my way to solving this. So I did some work to narrow down the cause of the problem. From that work, I could see exactly what was going on in the thread that I was looking at when the problem occurred.
When the problem occurred, the thread was stuck in EnterCriticalSection(). Not only was it stuck in EnterCriticalSection(), but it was using 100% of the CPU at the time. I was stunned because that just didn’t make sense and it probably meant that I was going to need to contact Microsoft for help. Imagine that call, they would think that I was crazy because this would mean that they had a serious problem if I was right.  
I knew that it would make sense if while the thread was in EnterCriticalSecion() another thread was having its priority raised due to priority inversion, but I wasn’t seeing any threads with raised priorities. Being a little wise, I decided to review the output again. That is when I noticed the thread with a value other than zero in the Delta Priority column. To be honest, I was so sure that the out of control thread was in a driver, that I hadn’t looked for one in a user application.
Delta Priority is a value subtracted from the Base Priority to during priority inversion to set the current priority of a thread. In the case of the green highlighted line, the base priority is 251 and the delta priority is 160, so the current priority is 91. That means that there are two threads running at priority 91 when the problem occurs and that I have more work to do to narrow this problem down.
I could stop here, but how could a user application keep the driver thread from getting the critical section. Critical sections can only be used within a single process, there is no way for multiple processes to use a critical section, or is there? The answer is sort of.
When the user application makes a call into a driver, like WriteFile(), the driver’s implementation of WriteFile() runs as the application’s thread. So the driver in device.exe is able to use the critical section in the context of the user application.
 
Copyright © 2009 – Bruce Eitman
All Rights Reserved