I have been working on a problem for a couple of weeks now. I even wrote about it in
Windows CE: Finding an Out of Control Thread where I declared that I had found the thread that had the problem. It turns out that I was wrong, and I am willing to admit it.
In
Windows CE: Finding an Out of Control Thread I showed how I found the thread that was using 100% of the CPU time by finding the thread priority of the thread. I didn’t discuss it, but I did consider priority inversion and ruled that out. It turns out that I misinterpreted the data when ruling out priority inversion, and to top it off I didn’t consider that the priority inversion could be occurring in the customer’s application.
Priority Inversion is something that occurs when two threads are synchronized with one of the synchronization objects. When a high priority thread is waiting to get a synchronization object and a low priority thread has the synchronization object, the priority of the low priority thread will be increased up to the priority of the high priority thread so that it can release the synchronization object.
Process PID Base Priority # Threads Base Addr Access Key
NK.EXE 7FFF002 3 2 C2000000 1
|
Usage
|
Thread
|
Owner
|
Base
|
Delta
|
Flags
|
Access
|
Current
|
|
Count
|
ID
|
|
Priority
|
Priority
|
|
Key
|
Thread ID
|
|
1
|
7FFCFDA
|
7FFF002
|
1
|
0
|
2
|
1
|
7FFF002
|
|
1
|
7FFF266
|
7FFF002
|
255
|
0
|
5
|
FFFFFFFF
|
7FFF002
|
filesys.exe 87FE8B52 3 4 4000000 2
|
Usage
|
Thread
|
Owner
|
Base
|
Delta
|
Flags
|
Access
|
Current
|
|
Count
|
ID
|
|
Priority
|
Priority
|
|
Key
|
Thread ID
|
|
1
|
E7F09EE2
|
87FE8B52
|
251
|
0
|
4
|
3
|
87FE8B52
|
|
1
|
47F7F836
|
87FE8B52
|
251
|
0
|
2
|
3
|
87FE8B52
|
|
1
|
27F93956
|
87FE8B52
|
251
|
0
|
2
|
3
|
87FE8B52
|
|
1
|
87FE8B2E
|
87FE8B52
|
251
|
0
|
2
|
3
|
87FE8B52
|
device.exe 7F7A616 3 78 6000000 4
|
Usage
|
Thread
|
Owner
|
Base
|
Delta
|
Flags
|
Access
|
Current
|
|
Count
|
ID
|
|
Priority
|
Priority
|
|
Key
|
Thread ID
|
|
1
|
8783EF6A
|
7F7A616
|
100
|
0
|
2
|
5
|
7F7A616
|
|
1
|
8783E906
|
7F7A616
|
91
|
0
|
2
|
5
|
7F7A616
|
|
1
|
2784AF02
|
7F7A616
|
150
|
0
|
2
|
5
|
7F7A616
|
|
1
|
7854642
|
7F7A616
|
101
|
0
|
2
|
5
|
7F7A616
|
|
1
|
87859FB2
|
7F7A616
|
251
|
0
|
4
|
5
|
7F7A616
|
|
1
|
87859882
|
7F7A616
|
10
|
0
|
0
|
5
|
7F7A616
|
gwes.exe 7841DAA 3 7 8000000 8
|
Usage
|
Thread
|
Owner
|
Base
|
Delta
|
Flags
|
Access
|
Current
|
|
Count
|
ID
|
|
Priority
|
Priority
|
|
Key
|
Thread ID
|
|
1
|
675B683A
|
7841DAA
|
249
|
0
|
2
|
9
|
7841DAA
|
|
1
|
75C810A
|
7841DAA
|
249
|
0
|
2
|
9
|
7841DAA
|
|
1
|
675CDD72
|
7841DAA
|
251
|
0
|
2
|
9
|
7841DAA
|
App1.exe 871DE152 3 4 E000000 40
|
Usage
|
Thread
|
Owner
|
Base
|
Delta
|
Flags
|
Access
|
Current
|
|
Count
|
ID
|
|
Priority
|
Priority
|
|
Key
|
Thread ID
|
|
1
|
26FD1FBE
|
871DE152
|
251
|
0
|
5
|
41
|
871DE152
|
|
1
|
87206DF6
|
871DE152
|
248
|
0
|
1
|
41
|
871DE152
|
|
1
|
E7206D8A
|
871DE152
|
248
|
0
|
2
|
41
|
871DE152
|
|
1
|
870A254A
|
871DE152
|
249
|
0
|
2
|
49
|
7841DAA
|
CSWrapper.exe A5BE8D46 3 28 1C000000 2000
|
Usage
|
Thread
|
Owner
|
Base
|
Delta
|
Flags
|
Access
|
Current
|
|
Count
|
ID
|
|
Priority
|
Priority
|
|
Key
|
Thread ID
|
|
1
|
E43A1FAE
|
A5BE8D46
|
251
|
0
|
5
|
2001
|
A5BE8D46
|
|
1
|
243C3C82
|
A5BE8D46
|
251
|
0
|
1
|
2001
|
A5BE8D46
|
|
1
|
C447DDC2
|
A5BE8D46
|
251
|
160
|
1
|
2007
|
7F7A616
|
|
1
|
8447DD12
|
A5BE8D46
|
251
|
0
|
1
|
2001
|
A5BE8D46
|
LowPriorityThread Priority 251 setting 250
LowPriorityThread Priority 250 setting 249
LowPriorityThread Priority 249 setting 248
… Lines of output cut from here
LowPriorityThread Priority 94 setting 93
LowPriorityThread Priority 93 setting 92
LowPriorityThread Priority 92 setting 91
Based on this output, I can tell that the out of control thread is running at priority 91. I know this because as soon as my low priority thread's priority is set to priority 91 it starts getting its share of quantum and starts running again. That is at priority 92 and lower, it does not get any quantum because the out of control thread doesn’t yield to any of the lower priority threads.
Priority 91 shows up clearly on the blue line above as a thread running device.exe. I should note here, that I removed a lot of lines from this output that didn’t add value to this article, so the green highlighted line wasn’t so obvious buried in the output. I will get back to the green highlighted line later.
I know exactly which thread is running at priority 91, so I thought I was well on my way to solving this. So I did some work to narrow down the cause of the problem. From that work, I could see exactly what was going on in the thread that I was looking at when the problem occurred.
When the problem occurred, the thread was stuck in EnterCriticalSection(). Not only was it stuck in EnterCriticalSection(), but it was using 100% of the CPU at the time. I was stunned because that just didn’t make sense and it probably meant that I was going to need to contact Microsoft for help. Imagine that call, they would think that I was crazy because this would mean that they had a serious problem if I was right.
I knew that it would make sense if while the thread was in EnterCriticalSecion() another thread was having its priority raised due to priority inversion, but I wasn’t seeing any threads with raised priorities. Being a little wise, I decided to review the output again. That is when I noticed the thread with a value other than zero in the Delta Priority column. To be honest, I was so sure that the out of control thread was in a driver, that I hadn’t looked for one in a user application.
Delta Priority is a value subtracted from the Base Priority to during priority inversion to set the current priority of a thread. In the case of the green highlighted line, the base priority is 251 and the delta priority is 160, so the current priority is 91. That means that there are two threads running at priority 91 when the problem occurs and that I have more work to do to narrow this problem down.
I could stop here, but how could a user application keep the driver thread from getting the critical section. Critical sections can only be used within a single process, there is no way for multiple processes to use a critical section, or is there? The answer is sort of.
When the user application makes a call into a driver, like WriteFile(), the driver’s implementation of WriteFile() runs as the application’s thread. So the driver in device.exe is able to use the critical section in the context of the user application.
Copyright © 2009 – Bruce Eitman
All Rights Reserved