Geeks With Blogs

Charles Young
UPDATE: 22/09/2010. I have posted an update on this issue. My root problem turned out to be overheating, although I still advocate a number of optimisations for VirtualBox.
 
I have a Dell laptop with 8GB RAM and a dual core processor running Windows 7. I use my machine quite intensively. Quite often I run virtual images under Oracle's VirtualBox. I'm currently using an image that contains an installation of the BizTalk Server 2010 beta.
 
For months, I found that every time I fired up images under VirtualBox, I would run into problems. I might be fine for a while, but then, at some point, problems would arise. The machine would become totally unresponsive for a few minutes. Task manager would report 100% CPU usage. After a while, things would revert to normal. A few minutes later, I would have to endure another period of unresponsiveness.   This would continue for an hour or two, and then everything would revert to normal.
 
The same problem arose on my machine every Wednesday lunchtime, regardless of the use of VirtualBox. My machine is a company laptop. It has Forefront installed and scheduled to run a full scan at this time. I can't change the schedule because of policies applied to the machine, although our systems people assure me they have been careful to ensure that Forefront excludes very large files such as virtual images from the scan.
 
I've lived with this issue for a long time. Normally, I would hope to be able to spot a rogue Windows process that is hogging all the processor cycles. However, in this case, there was never any indication of any process causing problems. The combined count of all CPU usage of all processes, including processes belonging to other users, was always substantially below 100%. This was true for processes on the host as well as those running under VirtualBox. Another strange thing was that, over many weeks, I kept on seeing the emergence of a pattern (not counting the Wednesday lunchtime issue, which had an obvious link to Forefront).   I might have three days when the problem occurred at the same time each day. Then the pattern would always disappear and some different pattern would later emerge.
 
Of course, I used various tools to try to track down the problem. With no rogue processes, my best guess was that the problem must be occurring at a lower level, perhaps with some badly behaving driver. I found no hint of any problems with drivers, though. I've seen rather similar behaviour in the past due to interrupt conflicts, but again, no sign of any issues. I spent quite some time with SysInternals Process Explorer trying to track down the problem, but to no avail.
 
Then, a few days ago, I got my first solid clue as to what was happening.   It was Process Explorer that helped. My attention was drawn to an instance of the Windows service host that I could see was using up a few % of CPU cycles. I opened it up and had a look.   One of the nice things about Process Explorer is that it provides a graph of CPU usage at the process level.   The graph I saw grabbed my full attention.   There, before my eyes, was a lovely trace showing clearly that, at just the same time my machine had gone into 100% CPU usage, this process had suddenly started using a few % of CPU cycles. At the moment the CPU usage dropped back down, so did the graph.
 
Process Explorer allows you to see all the services that are running in an instance of the service host. I set things up and waited for the problem to re-emerge. Sure enough, after a minute or two, CPU usage rocketed sky high. I had previously discovered that if I paused VirtualBox, the CPU usage would drop back to about 80%. The machine was still very sluggish, but could be used. So, I paused VirtualBox, waiting an eternity for the mouse click event to be processed, and then got to work. As quickly as I could, I worked through the list of services. The host was running exactly ten Windows services. I stopped the Desktop Window Manager Session Manager [UxSms] service (bang went my Aero interface), then the Distributed Link Tracking Client [TrkWks] service (no change), then the Human Interface Device Access [hidserv] service, and so on. At last, on the seventh service, I stopped SuperFetch and, after the service took ages to stop, everything burst into life.
 
I have been running VirtualBox constantly since then, over several days. I have yet to see any recurrence of the issue. Last Wednesday, for the first time in a very long time, Forefront completed a full scan without issues. Wonderful.
 
Is SuperFetch at fault? I can't say. Is it just a bad installation of Windows 7? Maybe. Perhaps VirtualBox is the true culprit. That's possible. I have no idea. All I know is that my productivity is now significantly higher after switching SuperFetch off. To switch it off, I simply opened the 'Services' management console and disabled the SuperFetch Windows service.
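For anyone who would rather script this step than click through the Services console, here is a minimal sketch. It assumes the service's internal name is SysMain (the name Windows 7 uses behind the "Superfetch" display name) and that the script is run from an elevated prompt. The function names are made up for illustration. On a non-Windows machine it only prints the commands it would run.

```python
import subprocess
import sys

# Assumption: "SysMain" is the internal service name behind "Superfetch".
SERVICE_NAME = "SysMain"


def superfetch_disable_commands(service=SERVICE_NAME):
    """Return the sc.exe invocations that stop the service and disable it."""
    return [
        ["sc", "stop", service],
        # sc.exe expects "start=" and its value as two separate tokens.
        ["sc", "config", service, "start=", "disabled"],
    ]


def disable_superfetch():
    """Run the commands. Only meaningful on Windows, in an elevated prompt."""
    for command in superfetch_disable_commands():
        result = subprocess.run(command, capture_output=True, text=True)
        print(" ".join(command), "-> exit code", result.returncode)


if __name__ == "__main__":
    if sys.platform == "win32":
        disable_superfetch()
    else:
        # Dry run: just show what would be executed on Windows.
        for command in superfetch_disable_commands():
            print(" ".join(command))
```

Replacing "disabled" with "auto" in the second command and starting the service again should reverse the change.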
 
I discussed my experience on the BizTalk Gurus newsgroup and two other people responded that they had been having similar issues. They have both switched SuperFetch off. One, Randal van Splunteren, got back to me to say that his VirtualBox problems were significantly reduced by switching SuperFetch off. However, as he pointed out, there was still a fairly high on-going CPU usage when VirtualBox was running (45-50% on my box). As I understand it, VirtualBox always soaks up CPU cycles, even when the image is idle, due to timing interrupts, which I presume have something to do with synchronising the virtual image to the actual hardware. This is to be expected, but the CPU usage did seem too high for comfort. Randal investigated further and came up with a further improvement. Both he and I had configured our images to use two logical CPUs. You can control this on the Processor tab under Settings/System for a specific image. Reducing this to 1 significantly reduced the CPU usage.
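The same setting can be changed from the command line with VBoxManage, which ships with VirtualBox. The sketch below only builds and prints the command unless it is run with --apply. The image name "BizTalk2010Beta" is a hypothetical placeholder, and the image must be powered off before the change is made.

```python
import subprocess
import sys


def set_cpu_count_command(vm_name, cpus=1):
    """Build the VBoxManage call that sets the number of virtual CPUs."""
    if cpus < 1:
        raise ValueError("a virtual machine needs at least one CPU")
    return ["VBoxManage", "modifyvm", vm_name, "--cpus", str(cpus)]


if __name__ == "__main__":
    # "BizTalk2010Beta" is a placeholder; substitute your own image name.
    command = set_cpu_count_command("BizTalk2010Beta", cpus=1)
    print(" ".join(command))
    if "--apply" in sys.argv:
        # Requires VBoxManage on the PATH and a powered-off image.
        subprocess.run(command, check=True)
```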
 
Randal reports that this only really helped once he had reverted to an older version of VirtualBox. Under Oracle's ownership, we are currently at version 3.2.8. Randal recommends ditching this version in favour of version 3.1.8, which belongs to the Sun era. I experimented on my machine with both versions. My experience was that setting the number of logical processors to 1 made a significant difference on both versions, but the effect was a little greater under the older version. I get about 20% CPU usage under 3.2.8, but perhaps only 15% under 3.1.8. Randal has decided to use the older version. I've decided to stick with the current version, at least for now.
 
So, a combination of disabling SuperFetch and configuring a single logical processor has made all the difference. If you have also been having problems running VirtualBox, then there may also be merit in reverting to an older version.
 
Update: 9th Sept 2010
 
Things went well for a week or two. Then, out of the blue, the behaviour started re-asserting itself. At first, I observed short bursts of 100% CPU usage every few minutes. Over a few days, things seemed to get progressively worse. In the end, I was seeing long (several minutes) bouts of lock-up interspersed with much shorter periods of usability. If I closed my image and restarted VirtualBox, the problem would generally go away for a couple of hours, but it would always come back, seemingly worse than before!
 
I took advice and tried several new approaches to discovering what was going on. However, nothing shed any further light on my problems.   Eventually, in frustration, I followed Randal's advice, un-installed Oracle VirtualBox and rolled back to version 3.1.8 of Sun VirtualBox.
 
The problems completely went away and have not come back. Of course, this has only got me back to the position I was in after switching SuperFetch off, and there is no certainty that VirtualBox will remain usable.   If the problem re-asserts itself, I'll issue another update!
 
Update 2: 17th Sept 2010
 
Well, dear reader, the problem did come back! However, I have now worked out what is happening. Being a software guy, it took a long time for me to consider it might be a hardware problem. In fact, I can now reveal that the issue is due to overheating. The combination of the Dell Vostro 1720 notebook and Windows 7 can't cope with any significant and sustained CPU-bound load. Using the RightMark CPU Clock utility, I can see that, whenever the CPU usage rises to, say, 40-50% for a few minutes, the core temperature slowly rises to above 80°C. After a while, the power manager decides to throttle back the CPU from 2.6GHz to about 300MHz! My machine locks up, and reports 100% CPU usage. The core temperature slowly falls to below 80°C and, after a bit, the CPU returns to full speed. The cycle then repeats endlessly.
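A rough bit of arithmetic shows why a moderate load turns into a reported 100% once the clock drops. The sketch below uses the figures quoted above (2.6GHz, about 300MHz, a 40-50% load) and makes one simplifying assumption: the work a load represents scales linearly with clock speed. The function names are made up for illustration.

```python
def demand_mhz(load_fraction, full_mhz):
    """Clock cycles per second (in MHz) the workload needs at full speed."""
    return load_fraction * full_mhz


def reported_cpu_percent(load_fraction, full_mhz, current_mhz):
    """CPU usage the OS would report at the current clock, capped at 100%."""
    needed = demand_mhz(load_fraction, full_mhz)
    return min(100.0, 100.0 * needed / current_mhz)


def oversubscription(load_fraction, full_mhz, current_mhz):
    """How many times more work is queued than the throttled CPU can do."""
    return demand_mhz(load_fraction, full_mhz) / current_mhz


if __name__ == "__main__":
    full, throttled = 2600.0, 300.0  # MHz, figures from the post
    for load in (0.40, 0.45, 0.50):
        print(
            f"load {load:.0%}: "
            f"{reported_cpu_percent(load, full, full):.0f}% at full speed, "
            f"{reported_cpu_percent(load, full, throttled):.0f}% when throttled, "
            f"{oversubscription(load, full, throttled):.1f}x oversubscribed"
        )
```

Under this simplified model, a 45% load at 2.6GHz needs roughly 1170MHz of work, which is nearly four times what a 300MHz clock can deliver, so the machine saturates and everything queues up behind it.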
 
All my previous efforts certainly helped to lessen the problem by reducing the load on the machine. However, ultimately there are always going to be times when the CPU usage goes up to 50% for a while. So, nothing I did really solved the underlying problem. I will leave the post in place, though, in case my journey helps anyone else.
 
One of our systems guys tells me that this kind of overheating problem has become fairly common in the Vista/Windows 7 era. He says there is little to be done about it, short of buying a cooling pad. So, I have ordered one today. If you haven't come across these gizmos, they are simply a little platform on which you can place your notebook. They have a dinky little fan built in to force greater airflow through the machine. The pad will arrive in a few days' time, and I will report back later on how well it works. In the meantime, propping my machine up on the edges of a couple of books to increase airflow is helping somewhat.
Posted on Monday, August 23, 2010 9:20 AM


Comments on this post: VirtualBox and 100% CPU usage - the SuperFetch effect

# re: VirtualBox and 100% CPU usage - the SuperFetch effect
Thanks for a really thorough write-up on your SuperFetch and overheating venture.

I'm having an issue with a Windows 7 guest running on an Ubuntu 11.04 host. I'm currently running VB 4.0.12 r72916. My host CPU usage is stuck at 75% (Core 2 Duo ThinkPad & 4GB RAM) with the guest Windows 7 machine VM process showing %CPU of 102 in /usr/bin/top.

Will surely give your suggestion of turning SuperFetch off a try when I get to starting that VM in a few days' time.

Hopefully I'll remember to come back here and post an update!

Cheers,
ak.
Left by Anthony K on Aug 17, 2011 7:50 AM

# re: VirtualBox and 100% CPU usage - the SuperFetch effect
Thanks! I've been through a very similar process of trying to figure out what is causing the 100% CPU, with no apparent process to blame. I've re-installed USB drivers, re-installed VirtualBox, un-installed misc software, stopped a lot of services, looked at system logs and what-not. While this has improved system performance, the key trick was this:

Change the power manager's cooling policy from Passive to Active. Control Panel\All Control Panel Items\Power Options\Edit Plan Settings, Change advanced power settings, Processor power management, System cooling policy, set to Active.

So now the fan speed will go up, also on battery, before CPU speed is lowered.
Left by MST xx on Nov 19, 2013 8:23 AM
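The cooling-policy change described in the comment above can probably also be made with the built-in powercfg tool. The sketch below is an assumption-laden illustration: it takes SCHEME_CURRENT, SUB_PROCESSOR and SYSCOOLPOL to be the aliases for the current plan, the processor subgroup and the system cooling policy, and 1 to mean Active (0 being Passive). Check the alias list printed by "powercfg -aliases" on your own machine before relying on it. The script only prints the commands unless run with --apply.

```python
import subprocess
import sys

ACTIVE = "1"  # Assumption: 1 = Active cooling, 0 = Passive cooling.


def cooling_policy_commands(value=ACTIVE):
    """Build powercfg calls that set the cooling policy on AC and battery."""
    setting = ["SCHEME_CURRENT", "SUB_PROCESSOR", "SYSCOOLPOL", value]
    return [
        ["powercfg", "-setacvalueindex"] + setting,  # plugged in
        ["powercfg", "-setdcvalueindex"] + setting,  # on battery
        # Re-activate the current plan so the new values take effect.
        ["powercfg", "-setactive", "SCHEME_CURRENT"],
    ]


if __name__ == "__main__":
    for command in cooling_policy_commands():
        print(" ".join(command))
        if "--apply" in sys.argv and sys.platform == "win32":
            # Needs an elevated prompt on Windows.
            subprocess.run(command, check=True)
```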

# re: VirtualBox and 100% CPU usage - the SuperFetch effect
I found that the version (6.1.7600.16385) of Superfetch provided with Windows 7 (clean install) was hogging my CPU and creating constant hard drive activity when I was logged in. So I had to disable it...

...Until I (finally) discovered this update:

http://support.microsoft.com/kb/2555428

The description "...Windows 7 startup process is slow when you create many restore points" doesn't sound right.

But it updated my Superfetch to version 6.1.7600.16819

(this update also provides an updated 6.1.7601.XXXXX version of Superfetch for Windows 7 SP1)

Bang! And the hard drive is behaving now with the Superfetch service enabled.
Left by Alex T on Jan 02, 2014 6:03 PM


Copyright © Charles Young | Powered by: GeeksWithBlogs.net