I have written about problems with timer interrupt handlers before in "Windows CE: Why does my system halt for 20 minutes?", but that article is really aimed at the OEM responsible for writing the interrupt service routine for the timer interrupt.
Today, I have been fielding questions from an end user who thinks that he might have this problem. He needs to convince the OEM that the OEM has the problem and needs to fix it. Of course the first thing to do is determine if he really does have the problem.
Let’s start with this. Ask the OEM where they obtained the interrupt handler for the timer and OEMIdle. If the answer is Intel, then ask how they fixed the timer interrupt problems that seem to always exist with Intel’s timer handling for ARM processors. If they don’t have a good answer for that, then they probably have a problem.
But even with that, you will probably need to prove it to them. One reason is that this problem might not show up for many of your OEM’s customers, but it will show up for some. Of those who experience it, only a few will be savvy enough to realize how serious the problem is and report it to the OEM. Some will just think that it happens once in a while, so they reset the system and keep using it. But you are more concerned because it shows up more often for you, and you found it early in your development cycle, so you want it fixed.
In "Windows CE: Why does my system halt for 20 minutes?" I suggested a simple test to determine if the problem that you are seeing is really caused by the timer interrupt. The test runs a thread that logs the current time in hours, minutes and seconds along with the return value from GetTickCount(). Here is an example thread that you can use:
DWORD WINAPI TestForTimeInterruptProblem(LPVOID p)
{
    DWORD Tick = 0;
    DWORD LastTick = GetTickCount();
    DWORD TickDelta = 0;
    SYSTEMTIME stNow;

    while( 1 )
    {
        // Unsigned subtraction stays correct even when the tick count wraps
        Tick = GetTickCount();
        TickDelta = Tick - LastTick;

        // Log the wall-clock time next to the tick count so the two can be compared
        GetSystemTime(&stNow);
        RETAILMSG(1, (TEXT(" %2.2d/%2.2d/%4.4d %2.2d:%2.2d:%2.2d.%3.3d "),
                      stNow.wMonth,
                      stNow.wDay,
                      stNow.wYear,
                      stNow.wHour,
                      stNow.wMinute,
                      stNow.wSecond,
                      stNow.wMilliseconds));
        // DWORD is unsigned, so print with %u; end the line so each sample is on its own row
        RETAILMSG(1, (TEXT("TC %u Delta %u\r\n"), Tick, TickDelta));
        LastTick = Tick;
        Sleep( 300 );
    }
}
This will output the current time and the current tick count. Many systems always report zero milliseconds, so on those systems the logged time will change only on every third or fourth pass through the loop. The tick count should increment by approximately 300 every time the loop runs because of the Sleep(300).
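If you would rather have the check done in code than by reading the log, the tick delta can be tested directly. The following is only a sketch: the helper names and the 50 ms tolerance are my assumptions, not part of the test thread above, and you may need a wider tolerance on your system. It uses plain C integer types so that it can be tried on a desktop compiler as well.

```c
#include <stdint.h>
#include <stdbool.h>

/* Hypothetical values: SLEEP_MS matches the Sleep(300) in the test thread,
   TOLERANCE_MS is an assumed allowance for scheduling jitter. */
#define SLEEP_MS      300u
#define TOLERANCE_MS  50u

/* Unsigned 32-bit subtraction yields the correct elapsed ticks even when
   the tick counter has wrapped past 0xFFFFFFFF between the two samples. */
static uint32_t tick_delta(uint32_t now, uint32_t last)
{
    return now - last;
}

/* Returns true when the delta is within TOLERANCE_MS of the sleep time. */
static bool tick_delta_is_expected(uint32_t delta)
{
    return delta >= SLEEP_MS - TOLERANCE_MS &&
           delta <= SLEEP_MS + TOLERANCE_MS;
}
```

In the test thread, TickDelta could be passed to tick_delta_is_expected() and a message logged only when it returns false.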
Let this run while you test the system. If the problem occurs, wait until the system starts running okay again and then check the debug output. If you have a timer interrupt problem, the logged time will jump by about 20 minutes between two consecutive samples, while the tick count will still have incremented by only approximately 300.
Note: If this thread runs at a low priority and the system is busy running threads at a higher priority, then the tick count will change by more than 300 because this thread will be starved for CPU time.
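The two cases can be told apart by comparing how far the clock moved with how far the tick count moved between two samples. Here is a rough sketch of that comparison. The enum, the function name and both tolerances are assumptions of mine for illustration; the wall-clock delta is assumed to have been converted to milliseconds by the caller, and the 2 second allowance is there because the logged time may only change once per second.

```c
#include <stdint.h>

/* Hypothetical result codes for one pair of consecutive samples. */
typedef enum {
    SAMPLE_OK,            /* clock and tick count both advanced normally   */
    SAMPLE_STARVED,       /* both advanced by much more than the sleep     */
    SAMPLE_TIMER_PROBLEM  /* clock jumped, tick count did not keep up      */
} sample_result;

#define EXPECTED_SLEEP_MS  300u   /* matches Sleep(300) in the test thread */
#define TICK_TOLERANCE_MS  100u   /* assumed allowance for jitter          */
#define WALL_TOLERANCE_MS  2000   /* assumed allowance for a 1 s clock     */

static sample_result classify_sample(uint32_t tick_delta_ms,
                                     int64_t wall_delta_ms)
{
    /* How much further the clock moved than the tick count did. */
    int64_t gap = wall_delta_ms - (int64_t)tick_delta_ms;

    if (gap > WALL_TOLERANCE_MS)
        return SAMPLE_TIMER_PROBLEM;

    /* Clock and ticks agree, but the loop took far longer than the sleep:
       the thread was most likely waiting for CPU time. */
    if (tick_delta_ms > EXPECTED_SLEEP_MS + TICK_TOLERANCE_MS)
        return SAMPLE_STARVED;

    return SAMPLE_OK;
}
```

With these assumed tolerances, a sample where the clock moved 20 minutes (1,200,000 ms) and the tick count moved 300 is classified as a timer problem, while a sample where both moved by 5000 is classified as starvation.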
Copyright © 2009 – Bruce Eitman
All Rights Reserved