Alois Kraus

blog

  Home  |   Contact  |   Syndication    |   Login
  114 Posts | 8 Stories | 297 Comments | 162 Trackbacks

News



Article Categories

Archives

Post Categories

Image Galleries

Programming

Sunday, December 2, 2012 #

SpeedTrace is a relatively unknown profiler made a company called Ipcas. A single professional license does cost 449€+VAT. For the test I did use SpeedTrace 4.5 which is currently Beta. Although it is cheaper than dotTrace it has by far the most options to influence how profiling does work.

First you need to create a tracing project which does configure tracing for one process type. You can start the application directly from the profiler or (much more interesting) it does attach to a specific process when it is started. For this you need to check “Trace the specified …” radio button and enter the process name in the “Process Name of the Trace” edit box. You can even selectively enable tracing for processes with a specific command line. Then you need to activate the trace project by pressing the Activate Project button and you are ready to start VS as usual. If you want to profile the next 10 VS instances that you start you can set the Number of Processes counter to e.g. 10. This is immensely helpful if you are trying to profile only the next 5 started processes.

image

As you can see there are many more tabs which do allow to influence tracing in a much more sophisticated way. SpeedTrace is

the only a profiler which does not rely entirely on the profiling Api of .NET. Instead it does modify the IL code (instrumentation on the fly) to write tracing information to disc which can later be analyzed. This approach is not only very fast but it does give you unprecedented analysis capabilities. Once the traces are collected they do show up in your workspace where you can open the trace viewer.

image

I do skip the other windows because this view is the most useful one. You can sort the methods not only by Wall Clock time but also by CPU consumption and wait time which none of the other products support in their views at the same time. If you want to optimize for CPU consumption sort by CPU time. If you want to find out where most time is spent you need Clock Total time and Clock Waiting. There you can directly see if the method did take long because it did wait on something or it did really execute stuff that did take so long.

Once you have found a method you want to drill deeper you can double click on a method to get to the Caller/Callee view which is similar to the JetBrains Method Grid view. But this time you do see much more. In the middle is the clicked method. Above are the methods that call you and below are the methods that you do directly call. Normally you would then start digging deeper to find the end of the chain where the slow method worth optimizing is located. But there is a shortcut.

You can press the magic image  button to calculate the aggregation of all called methods. This is displayed in the lower left window where you can see each method call and how long it did take. There you can also sort to see if this call stack does only contain methods (e.g. WCF connect calls which you cannot make faster) not worth optimizing. YourKit has a similar feature where it is called Callees List.

image

In the Functions tab you have in the context menu also many other useful analysis options

image

One really outstanding feature is the View Call History Drilldown. When you select this one you get not a sum of all method invocations but a list with the duration of each method call. This is not surprising since SpeedTrace does use tracing to get its timings. There you can get many useful graphs how this method did behave over time. Did it become slower at some point in time or was only the first call slow? The diagrams and the list will tell you that.

image image

image

That is all fine but what should I do when one method call was slow? I want to see from where it was coming from. No problem select the method in the list hit F10 and you get the call stack. This is a life saver if you e.g. search for serialization problems. Today Serializers are used everywhere. You want to find out from where the 5s XmlSerializer.Deserialize call did come from? Hit F10 and you get the call stack which did invoke the 5s Deserialize call. The CPU timeline tab is also useful to find out where long pauses or excessive CPU consumption did happen. Click in the graph to get the Thread Stacks window where you can get a quick overview what all threads were doing at this time. This does look like the Stack Traces feature in YourKit. Only this time you get the last called method first which helps to quickly see what all threads were executing at this moment. YourKit does generate a rather long list which can be hard to go through when you have many threads.

image

The thread list in the middle does not give you call stacks or anything like that but you see which methods were found most often executing code by the profiler which is a good indication for methods consuming most CPU time.

This does sound too good to be true? I have not told you the best part yet. The best thing about this profiler is the staff behind it. When I do see a crash or some other odd behavior I send a mail to Ipcas and I do get usually the next day a mail that the problem has been fixed and a download link to the new version. The guys at Ipcas are even so helpful to log in to your machine via a Citrix Client to help you to get started profiling your actual application you want to profile. After a 2h telco I was converted from a hater to a believer of this tool. The fast response time might also have something to do with the fact that they are actively working on 4.5 to get out of the door. But still the support is by far the best I have encountered so far.

The only downside is that you should instrument your assemblies including the .NET Framework to get most accurate numbers. You can profile without doing it but then you will see very high JIT times in your process which can severely affect the correctness of the measured timings. If you do not care about exact numbers you can also enable in the main UI in the Data Trace tab logging of method arguments of primitive types. If you need to know what files at which times were opened by your application you can find it out without a debugger.

Since SpeedTrace does read huge trace files in its reader you should perhaps use a 64 bit machine to be able to analyze bigger traces as well. The memory consumption of the trace reader is too high for my taste. But they did promise for the next version to come up with something much improved.


JustTrace is made by Telerik which is mainly known for its collection of UI controls. The current version (2012.3.1127.0) does include a performance and memory profiler which does cost 614€ and is currently with a special offer for 306€ on sale. It does include one year of free upgrades. The uneven € numbers are calculated from the 799€ and 399$ discount price. The UI is already in Metro style and simple to use. Multi process,

attach, Update (attach is supported when you select as Profilee type “Running Application” in sampling mode) method recording filter are not supported.

image image

It looks like JustTrace is like Ants a Just My Code profiler by default.

At least I have not (yet) found an option to see all methods. Update: There is a button there as well. I did not press it since it did look disabled to me. I guess I just too old to spot the difference between disabled and not enabled. You need to press the Minor Methods button to see all methods. Now I do see much more but unfortunately I still cannot sort by total time which is strange for a profiler. In the All Threads window there is a Total Time column but I do not get it why I do not have it in the All Methods view. For stuff where you do not have the pdbs or you want to dig deeper into the BCL code you will not get far. After getting the profile data you get in the All Methods grid a plain list with hit count and own time. The method list for all methods is also suspiciously short which is a clear sign that you will not get far during the analysis of foreign code. It looks like this is intentional because I cannot configure the displayed columns where I get at least the total time as well.

image

But at least there is also a memory profiler included. For this I have to choose in the first window for Profiling Type “Memory Profiler” to check the memory consumption of VS.  There are some interesting number to see but I do really miss from YourKit the thread stack window. How am I supposed to get a clue when much memory is allocated and the CPU consumption is high in which places I should look?

clip_image001

The Snapshot summary gives a rough overview which is ok for a first impression.

image

Next is Assemblies? This gives you a list of all loaded assemblies. Not terribly useful.

 

image

The By Type view gives you exactly what it is supposed to do. You have to keep in mind that this list is filtered by the types you did check in the Assemblies list.

image

The By Type instance list does only show types from assemblies which do not originate from Microsoft. By default mscorlib and System are not checked. That is the reason why for the first time my By Type window looked like

image

The idea behind this feature is to show only your instances because you are ultimately responsible for the overall memory consumption. I am not sure if I do like this feature because by default it does hide too much. I do want to see at least how many strings and arrays are allocated. A simple namespace filter would also do it in my opinion. Now you can examine all string instances and look who in the object graph does keep a reference on them.

image image

That is nice but YourKit has the big plus that you can also look into the string contents.  I am also not sure how in the graph cycles are visualized and what will happen if you have thousands of objects referencing you. That's pretty much it about JustTrace. It can help the average developer to pinpoint performance and memory issues by just looking at his own code and instances. Showing them more will not help them because the sheer amount of information will overwhelm them. And you need to have a pretty good understanding how the GC and the CLR does work. When you have a performance issue at a customer machine it is sometimes very helpful to be able a bring a profiler onto the machine (no pdbs, …) and to get a full snapshot of all processes which are in the problematic use case involved. For these more advanced use cased JustTrace is certainly the wrong tool.

Next: SpeedTrace


 

The YourKit (v7.0.5) profiler is interesting in terms of price (79€ single place license, 409€ + 1 year support and upgrades) and feature set. You do get a performance and memory profiler in one package for which you normally need also to pay extra from the other vendors. As an interesting side note the profiler UI is written in Java because they do also sell Java profilers with the same feature set. To get all methods of a VS startup you need first to configure it to include System* in the profiled methods and you need to configure * to measure wall clock time. By default it does record only CPU times which allows you to optimize CPU hungry operations. But you will never see a Thread.Sleep(10000) in the profiler blocking the UI in this mode.

image image

It can profile as all others processes started from within the profiler but it can also profile the next or all started processes.

image image

As usual it can profile in sampling and tracing mode. But since it is a memory profiler as well it does by default also record all object allocations > 1MB. With allocation recording enabled VS2012 did crash but without allocation recording there were no problems. The CPU tab contains the time line of the application and when you click in the graph you the call stacks of all threads at this time. This is really a nice feature. When you select a time region you the CPU Usage estimation for this time window.

image

I have seen many applications consuming 100% CPU only because they did create garbage like crazy. For this is the Garbage Collection tab interesting in conjunction with a time range.

image

This view is like the CPU table only that the CPU graph (green) is missing. All relevant information except for GCs/s is already visible in the CPU tab. Very handy to pinpoint excessive GC or CPU bound issues.

The Threads tab does show the thread names and their lifetime. This is useful to see thread interactions or which thread is hottest in terms of CPU consumption.

image

On the CPU tab the call tree does exist in a merged and thread specific view. When you click on a method you get below a list of all called methods. There you can sort for methods with a high own time which are worth optimizing.

image

In the Method List you can select which scope you want to see. Back Traces are the methods which did call you.

image

Callees ist the list of methods called directly or indirectly by your method as a flat list. This is not a call stack but still very useful to see which methods were slow so you can see the “root” cause quite quickly without the need to click trough long call stacks.

image

The last view Merged Calles is a call stacked view of the previous view. This does help a lot to understand did call each method at run time. You would get the same view with a debugger for one call invocation but here you get the full statistics (invocation count) as well.

image

Since YourKit is also a memory profiler you can directly see which objects you have on your managed heap and which objects do hold most of your precious memory. You can in in the Object Explorer view also examine the contents of your objects (strings or whatsoever) to get a better understanding which objects where potentially allocating this stuff.

 

image

YourKit is a very easy to use combined memory and performance profiler in one product. The unbeatable single license price makes it very attractive to straightly buy it. Although it is a Java UI it is very responsive and the memory consumption is considerably lower compared to dotTrace and ANTS profiler. What I do really like is to start the YourKit ui and then start the processes I want to profile as usual. There is no need to alter your own application code to be able to inject a profiler into your new started processes. For performance and memory profiling you can simply select the process you want to investigate from the list of started processes.

image

That's the way I like to use profilers. Just get out of the way and let the application run without any special preparations.

 

Next: Telerik JustTrace