I have installed Windows Vista Ultimate since some month now and I still wonder what system service I have missed to get a well behaved system. To troubleshoot heavy disk access I use Process Monitor from SysInternals. It is an invaluable tool to find out who did access what on the system. It can monitor every module load, process start, thread creation, registry or disk access and network activity you can imagine. If you are developing unmanaged code you can even view the call stacks who did access this file or registry key not only from which application but it captures also (when the symbols are loaded) the full call stack. To make the call stack work you need to download the symbols which is quite easy and I recommend to do it. Below is a typical screen shot from Process Monitor when the system is doing seemingly nothing:
As you can see there is still quite a lot activity going on behind the scenes. Process Monitor makes the Windows black box a white box and you can look directly at the system while it is working. If you ask with what black art this information is obtained: Process Monitor installs a filter driver (at run time only) which allows it to redirect nearly all IO related activities to the filter driver. To check if the driver is runnung you can use the fltmc command:
Filtername Number of Instances Height Frame
----------------------------- -------------------- ---- -----
PROCMON11 0 385001 0
luafv 1 135000 0
FileInfo 8 45000 0
If you wan to find out who did who did open the files simply go to Tools - Stack Trace
and you will see even the call stack who did try to access a specific file or registry key. I did try this with a small C++ application to get a full call stack but the latest version (2.02) seems have problems to pick up the correct events. I did not see my file IO at all (Windows Vista 32-bit). Ok that means that I can finish my post. But wait. My initial goal was to tell you a little known secret: XPerf is your secret weapon when you want to see what the Kernel really does.
Some introduction links are here:
Starting with Windows 2000 the ETW (Event Tracing for Windows) was introduced but it was a little known feature which did not get much attention. The main reason is that it was quite complicated to implement and use. Xperf has been used within MS since years but it was so useful they have published it. The tool can be used to capture traces on Server 2003 and XP systems but you must analyze them on a Vista system to decode them.
Its main uses are (where I have used it so far):
- Find out the difference why the first start of your application takes so much longer than the second start
- Who is accessing the file x zillion times?
- Are my disk access patterns inefficient?
To get started simply install the tool and start a trace session with some default event providers (xperf -providers lists all of them).
Now you can execute your use case, start your application, ... Then you can stop the trace session and save the traces to disk
- xperf -d c:\savedTraces.etl
To view the saved traces simply type
- xperf c:\savedTraces.etl or double click on the file in explorer if you have a file association with .etl files.
You will see some output like this:
As you can see there is CPU usage, Disk IO, Disk IO per process, .... Today application was scraping over my hard disk again and I never figured out what it was. Process Monitor shows you very much details but sometimes the amount of data can be overwhelming and I was not able to spot the issue. But here we see a nice 100% spike in the Disk Utilization by Process. Lets select the region of interest and click on Detail Graph (this is a context menu):
This graph shows you the time on the horizontal axis and the logical cluster number of you hard disk on the vertical axis. To make it even more interesting it shows you only true disk accesses. If all data is in the file system cache this graph will contain only a few dots which tells you that the hard disk had nothing to do with your slow system. From this graph you can conclude
- Big vertical changes means long disk seeks thus slow performance. An optimal disk access pattern (linear read) would be a straight line.
- Big horizontal gaps means that the disk or the application was busy. You just did wait for something to complete (random access) quite long.
Additionally the Summary table shows you also a wealth of information
In my case I did find that FireFox has its own database (urlclassifer3.sqlite) which does update its index from time to time. The database itself contains a list of URLs which are considered harmful. Firefox will warn you when you try to navigate to potential fraud urls. You can disable this thing under
Options - Security and uncheck
Tell me if the site I´m visiting is a suspected attack site.
Tell me if the site I´m visiting is a suspected forgery.
Shut down Firefox and delete the database. Voila no more hard disk scraping in the background and one reason less to blame Windows Vista. Finally this case can be closed. With this tool I managed also to find out what my little cpptest.exe application was doing:
When the _NT_SYMBOL_PATH is set to c:\windows\symbols;SRV*C:\Windows\Symbols*http://msdl.microsoft.com/download/symbols and you add the cpptest pdb directory you will get this output. From this stack trace we can conclude that cpptest.exe does call from its main method some long lasting functions such as WriteFile which are responsible for the disk IO we where seeing in the System process. One important thing about file IO with Windows Vista is that the OS seems to be veeeery lazy with file writes. If I write e.g. a 100 MB file it is has 0 bytes until I close the file! To force the Vista to write the contents of a file to disk you need to call FlushFileBuffers or use unbuffered IO.