Geeks With Blogs
MightyZot

Finding Binary Compatibility Issues

While many programmers are moving their infrastructures away from VB6 and COM to platforms like .NET, there are still plenty of applications out there written in VB6, and probably even more that still use COM.  It is likely that we’ll be seeing and supporting VB6 and COM-based applications for the foreseeable future, so maybe you shouldn’t file away your VB6 and COM troubleshooting skills just yet.  Even if you have a good strategy for preserving binary compatibility from version to version of your software, you may still struggle periodically with “ActiveX can’t create object” errors and “Type Mismatch” problems.  This article describes how I found the source of a binary compatibility issue in one of our VB6 applications.

Background

Within the context of COM, binary compatibility means preserving the GUIDs (such as CLSIDs and interface IDs), ProgIDs, and DispIDs between versions of your software or COM components.  VB6 attempts to help you preserve binary compatibility by letting you specify the locations of previously compiled programs or components.  During the build, VB6 uses the type library information stored in those previously compiled programs or components to set the COM attributes for your newly compiled programs and components.  When you add objects, properties, or methods that were not present when your compatibility files were built, VB6 assigns them new COM attributes.

If you do not update your compatibility files after adding new properties and methods, VB6 gives those additions brand new COM attributes on every recompile.  This means that any objects not protected by your compatibility files are given new GUIDs, making them incompatible with prior builds.  Naturally, accidentally deploying files where compatibility has not been maintained causes problems.  VB6 helps you out in this situation by throwing errors with absolutely no useful information in them.

Descriptive Error Messages (NOT!)

[Image: the error message dialog]

To the left, you can see an error message that was presented to me by one of the QA technicians where I work.  It resulted from the application of a hotfix.  To start troubleshooting the problem, I had him install the Sysinternals Suite from Microsoft’s www.sysinternals.com website.  The Sysinternals Suite includes a bunch of utilities that are handy for troubleshooting problems like this one.  My thought was that I would use Process Monitor to see the registry or file accesses going on just prior to the error.  As the VB runtime tries to load your COM objects, it hits the registry to get information about the binaries that implement them.  So, you’ll typically see a bunch of registry reads looking for GUIDs that correspond to your CLSIDs, or class IDs, as VB instantiates your objects.  You’ll also see file system events as your binaries are loaded.

Much to my chagrin, Process Monitor showed neither in this case.  The last registry access before the error showed that the application was trying to find the configuration options for a feature.  I have to admit that this was rather embarrassing, because there were three people standing over my shoulder watching me.  I wanted to show them how quickly and impressively I could find the issue using the “tools of the trade.”  There is no doubt that it was spectacular, but it was spectacular of the failure kind and I don’t do failure very well.  So, I had the QA technician install Debugging Tools for Windows, so that I could take a look at the problem in WinDbg.  WinDbg is a low-level debugger provided by Microsoft.

After installing WinDbg, I attached to the program’s process before executing the feature that caused the type mismatch error.  WinDbg breaks into debugging mode automatically after attaching to a process.  I used the “g”, or “go”, command to continue the application and chose the feature causing the exception.  With the exception dialog displayed, I broke into the debugger to poke around in the application’s stack traces.  Interestingly, the first thing that I noticed was that there were multiple threads.  VB6 is largely single-threaded, or so they tell me, but I think a common misconception is that this means there will only be one thread in a VB6 program’s process, and that is flat out not the case!  So, I examined the stacks of all of the threads and found one that was waiting on a dialog box.  “~” displays the threads of a process in WinDbg, and “~2s” changes the debugger context to the thread at index 2, which is the third thread in the list (the list is zero-based).  To make a long story short, I repeatedly stepped through the code and created breakpoints until I found the chain of events just prior to the exception.  You can see that chain of events in the stack trace in the screen shot below.

[Screen shot: WinDbg stack trace showing the chain of events just prior to the exception]

As you can see in the stack shown in WinDbg, the exception happens following a call to MSVBVM60!EVENT_SINK_Invoke.  Prior to that invoke call there is a call to something called MSVBVM60!BASIC_CLASS_QueryInterface.  This is exactly what I would expect to see.  The call to QueryInterface is getting a pointer to an interface and Invoke is invoking a method.  Since the error occurs after the Invoke, I can tell that the interface was probably retrieved successfully and the program is invoking a method on that interface when it throws the error.

Notice the function just before all of that, “Query+0x1e758e.”  Because this is a production application, it was not compiled with symbols.  That is why the offset of the function is shown and not the name of the function.  How do you get the name of the function?

There are a couple of ways that you can get the name of a function based upon its offset.  You can recompile the application with symbols, or you can utilize map files.  Map files are text files listing the segments, offsets, and function names contained within a program or library.  Apparently, the only way that you can generate map files in VB6 is to create a replacement for link.exe, which is located in the Microsoft Visual Studio\VB98 folder, that adds the “/map” option to the arguments and then passes the arguments to the original linker.  I’ve created a utility that does this.  If you need it, follow me on Twitter, send me a direct message, and I’ll get it to you.
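To give a rough idea of what such a linker replacement does, here is a minimal sketch in Python.  It is not the utility I mentioned above.  It assumes that the original link.exe has been renamed to link_original.exe (a name I made up for this example) and that the script is packaged as an executable named link.exe so that VB6 will launch it.  The “/MAP” switch itself is a standard option of the Microsoft linker.

```python
import subprocess

# Assumption: the original VB98\link.exe was renamed to this hypothetical name.
REAL_LINKER = "link_original.exe"


def add_map_option(args):
    """Return a copy of the linker arguments with /MAP appended if it is absent."""
    for arg in args:
        upper = arg.upper()
        if upper == "/MAP" or upper.startswith("/MAP:"):
            return list(args)  # the caller already asked for a map file
    return list(args) + ["/MAP"]


def main(argv):
    """Forward the arguments VB6 passed in, plus /MAP, to the real linker."""
    result = subprocess.run([REAL_LINKER] + add_map_option(argv[1:]))
    return result.returncode

# A real wrapper would finish with: sys.exit(main(sys.argv))
```

The important part is that every argument VB6 supplies is passed through untouched and the linker’s exit code is returned, so that VB6 still sees build failures.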

Using a map file that I generated for the application, I could see that Query+0x1e758e is in the menu handler for the feature that generates the error.  That makes sense, but I wanted to make sure that the instruction pointer (eip on x86) was close to the menu handler and not in a child routine.  Looking at the code for the menu handler, there are several COM objects created before the error.  One of the lines was instantiating an object and then calling a method to create another object, which looks suspiciously similar to what the code is doing in the stack trace.  I decided to look at the parameters for MSVBVM60!EVENT_SINK_Invoke, so I listed out another stack trace showing the parameters on the stack.  You can see an expanded stack trace in the screen shot below.
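If you would rather not search a map file by eye, the lookup can be sketched in a few lines of Python.  This is only an illustration under some assumptions: it expects the usual Microsoft linker layout, with a “Preferred load address is” line and a “Publics by Value” table whose “Rva+Base” column holds each symbol’s address, and the symbol names and addresses in SAMPLE_MAP are invented for the example rather than taken from our application.

```python
import bisect
import re

# Matches lines such as " 0001:001e6500  SomeSymbol  005e7500 f  frmMain.obj".
SYMBOL_LINE = re.compile(
    r"^\s*[0-9A-Fa-f]{4}:[0-9A-Fa-f]{8}\s+(\S+)\s+([0-9A-Fa-f]{8})\s"
)
BASE_LINE = re.compile(r"Preferred load address is ([0-9A-Fa-f]+)")

# Hypothetical map file excerpt; names and addresses are made up.
SAMPLE_MAP = """\
 Preferred load address is 00400000

  Address         Publics by Value              Rva+Base     Lib:Object

 0001:001e6000       mnuOpen_Click              005e7000 f   frmMain.obj
 0001:001e6500       mnuReport_Click            005e7500 f   frmMain.obj
 0001:001e6800       mnuExit_Click              005e7800 f   frmMain.obj
"""


def parse_map(text):
    """Return (preferred_base, sorted list of (address, name)) from map text."""
    base = 0
    symbols = []
    for line in text.splitlines():
        found = BASE_LINE.search(line)
        if found:
            base = int(found.group(1), 16)
            continue
        found = SYMBOL_LINE.match(line)
        if found:
            symbols.append((int(found.group(2), 16), found.group(1)))
    symbols.sort()
    return base, symbols


def resolve(text, module_offset):
    """Map an offset from the module base to (symbol name, offset into symbol)."""
    base, symbols = parse_map(text)
    target = base + module_offset
    addresses = [address for address, _ in symbols]
    # Pick the last symbol that starts at or before the target address.
    index = bisect.bisect_right(addresses, target) - 1
    if index < 0:
        return None
    address, name = symbols[index]
    return name, target - address
```

Calling resolve(SAMPLE_MAP, 0x1e758e) on the sample data picks the symbol that starts closest below the offset.  A small remainder suggests that the instruction pointer really is inside that routine, while a very large one hints that the address belongs to code the map does not list.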

[Screen shot: expanded WinDbg stack trace showing the parameters on the stack]

0x43b51c looks like an interesting parameter.  To view the contents of that memory address, I used the “dd” command in WinDbg.  Jackpot!  When I listed out the contents of that memory location, I found something that looks like a GUID: “04d9bdff 4f9a1186 3d4c6daa 12a4d675”!

Using OleView, which is included with Visual Studio, I looked at the CLSIDs for the suspect object – the one returned from the function that I think is executing.  The GUIDs didn’t match.  What?

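The QA technician was able to find a version of the library, containing the target of the invocation, that roughly matched the timeframe corresponding to the hotfix.  Examining that DLL in OleView revealed the returned type to have a GUID of (wait for it) “04D9BDFF-1186-4F9A-AA6D-4C3D75D6A412”!  The two values look different only because of how a GUID is laid out in memory.  The first three fields are stored little-endian and “dd” displays memory as 32-bit values, so the second and third fields appear swapped within a single value and the final eight bytes appear reversed within each group of four.  Accounting for that, I had found a match.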
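If you don’t feel like shuffling the bytes in your head, a few lines of Python will do the conversion.  This is just a sketch of the idea using the standard struct and uuid modules; guid_from_dwords is a helper name I made up, and it assumes the four values are copied exactly as “dd” displays them on a little-endian (x86) machine.

```python
import struct
import uuid


def guid_from_dwords(dwords):
    """Rebuild a GUID from four 32-bit values as displayed by WinDbg's dd."""
    # dd shows each little-endian DWORD as a number, so packing the values
    # back as little-endian recovers the original 16 bytes in memory order.
    raw = struct.pack("<4I", *dwords)
    # bytes_le matches the in-memory GUID layout: the first three fields are
    # little-endian and the last eight bytes are stored as-is.
    return uuid.UUID(bytes_le=raw)


guid = guid_from_dwords([0x04D9BDFF, 0x4F9A1186, 0x3D4C6DAA, 0x12A4D675])
print(str(guid).upper())  # 04D9BDFF-1186-4F9A-AA6D-4C3D75D6A412
```

The printed value is the same GUID that OleView shows for the type in the older library.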

At this point the picture became very clear, and I was able to explain to the QA technician, and the other programmers working on the problem, what had happened.  We had distributed a hotfix and neglected to update compatibility.  The hotfix was being applied to a machine that had a newer version of the library containing the type causing the problem, so the GUIDs didn’t match.

Problem Solved…fix compatibility for the hotfix, or distribute the appropriate version of the child library.

Conclusion

You’re not forced to put up with the terse error messages thrown by VB6 when you experience ActiveX instantiation problems, whether those are “ActiveX can’t create object” errors or “Type Mismatch” problems.  There are also many more debugging tools at your disposal than the Visual Studio IDEs.  Although I didn’t use the Sysinternals tools to solve this particular problem, they are very handy for quickly diagnosing similar problems.  With a little bit of patience, you may be able to figure out the problems that defeat tools like Process Monitor by using something a little more powerful, such as WinDbg.  Following are some great sources that can help you sharpen your Windows debugging skills:

Advanced Windows Debugging, by Mario Hewardt and Daniel Pravat
Advanced .NET Debugging, by Mario Hewardt
http://www.solsem.com/, David Solomon’s Website

Posted on Friday, June 25, 2010 5:06 PM | Windows API, Troubleshooting


Comments on this post: Finding Binary Compatibility Issues



Copyright © MightyZot | Powered by: GeeksWithBlogs.net