Wow... that's a lot of capitals and acronyms to squeeze into a single title (well, ok... one acronym, and two initialisms, if you want to split hairs). Anyhoo... after roundly dismissing the WPF WebBrowser control in an earlier article, I now find it's (ahem) not quite so bad after all.
I needed to create a WYSIWYG HTML editor in WPF for a project I'm currently working on. There are quite a few commercial WYSIWYG editors based on the Windows Forms WebBrowser control, and also quite a few examples of free ones on CodeProject etc, but I didn't spot any out there using the WPF WebBrowser control at the time.
We didn't really want another interoperability layer between our WPF app and the ActiveX control which forms the backbone of the WPF WebBrowser, so we wanted to avoid the Windows Forms solutions, although it is quite possible to embed one of these in a WPF Window using a WindowsFormsHost. In the end, we decided to roll our own, though it should be pointed out that it owes a debt of gratitude to the existing Windows Forms implementations such as the one linked to above.
The main issue to overcome when developing the control is the necessity of working with the Microsoft HTML Object Library COM component (MSHTML for short). If you want to execute any functionality against the document exposed by the WebBrowser control, you will have to start getting familiar with this component (for anyone who has not used it before: in Visual Studio, click Add Reference, navigate to the COM tab, and scroll down until you see Microsoft HTML Object Library. Once it is added, you need to import the mshtml namespace in your class).
The document exposed by the WebBrowser control is initially null until you browse to a valid URI or load some content into the control using the NavigateToString or NavigateToStream methods. Once loaded, the document is of type HTMLDocumentClass, which you will find in the mshtml namespace. Once you have cast the document to this type, you can access its properties and methods.
It turns out there is a wealth of useful functionality available here which isn't exposed via the WebBrowser control itself: functionality for formatting HTML in the WebBrowser, and (crucially) for accessing the designMode property of the document, which effectively changes it from a read-only view of the HTML to an editable one which accepts user input (which is 90% of the work done already, in a single line of code!). Of course, the Windows Forms WebBrowser control exposed this property itself, and I'm sure that the WPF one will also do so in its next release, but for the time being, the only way to get at it is via the native HTMLDocument.
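For anyone who wants to see roughly what that single line looks like in context, here is a minimal sketch rather than the code from the download. It assumes a project reference to the Microsoft HTML Object Library, and it casts to the IHTMLDocument2 interface instead of HTMLDocumentClass, since that interface is where designMode lives. The class and method names (EditModeSketch, LoadEditableDocument) are made up for illustration.

```csharp
using System.Windows.Controls;
using System.Windows.Navigation;

public static class EditModeSketch
{
    // Hypothetical helper: loads an empty page, then makes it editable.
    public static void LoadEditableDocument(WebBrowser browser)
    {
        browser.LoadCompleted += OnLoadCompleted;

        // Document stays null until some content has been loaded.
        browser.NavigateToString("<html><body></body></html>");
    }

    private static void OnLoadCompleted(object sender, NavigationEventArgs e)
    {
        var browser = (WebBrowser)sender;
        browser.LoadCompleted -= OnLoadCompleted;

        // Requires a reference to the Microsoft HTML Object Library (mshtml).
        var document = browser.Document as mshtml.IHTMLDocument2;
        if (document != null)
        {
            // The "single line": switch the document from read-only to editable.
            document.designMode = "On";
        }
    }
}
```

Waiting for LoadCompleted before touching the document is a choice made for this sketch, so that the cast is never attempted while the document is still null.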
I also made the decision during development of the control to avoid any explicit non-.NET dependencies. In particular, this meant that I couldn't add a direct reference to the Microsoft HTML Object Library. Why did I do this? Well, for one thing, we work in a continuous integration environment, but our continuous integration build box has an older version of the MSHTML library, so every time I checked in the code with an MSHTML reference, it was breaking the build.
I solved this by late binding to the MSHTML library, though not in the way you might think...
As a C# developer, my first instinct was to rewrite the code without the MSHTML reference, using reflection to perform the late binding. However, I quickly found it was turning out to be a pretty ugly coding experience, with quite a lot of nested GetMethod and InvokeMember calls. I then remembered that VB.NET is able to do late binding at the language level: if you switch Option Strict off, you essentially turn VB.NET almost into a scripting language, which is exactly what I wanted here. So I created a VB.NET wrapper for the HTMLDocument which exposes the functionality I need. It works very well indeed, and the code is far more readable than the C# equivalent (although it will be possible to achieve the same thing in C# 4.0 when it comes out, using the new dynamic features).
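To give a flavour of the reflection route I abandoned, here is a small sketch (not the code from the project) of late binding through Type.InvokeMember. The LateBinding class and its two methods are names invented for this example; the commented usage at the bottom assumes the object passed in is the WebBrowser's Document, and that the underlying COM document accepts the usual designMode and execCommand members by name.

```csharp
using System;
using System.Reflection;

public static class LateBinding
{
    // Sets a property by name, with no compile-time knowledge of the type.
    public static void SetProperty(object target, string name, object value)
    {
        target.GetType().InvokeMember(
            name,
            BindingFlags.SetProperty | BindingFlags.Public | BindingFlags.Instance,
            null, target, new[] { value });
    }

    // Calls a method by name and returns whatever it returns.
    public static object Invoke(object target, string name, params object[] args)
    {
        return target.GetType().InvokeMember(
            name,
            BindingFlags.InvokeMethod | BindingFlags.Public | BindingFlags.Instance,
            null, target, args);
    }
}

// Assumed usage against the browser's document (no MSHTML reference needed):
//   LateBinding.SetProperty(browser.Document, "designMode", "On");
//   LateBinding.Invoke(browser.Document, "execCommand", "Bold", false, null);
```

Two helpers like these hide the worst of it, but every member name becomes a string and every argument an object. In VB.NET with Option Strict off, the first of those calls is simply document.designMode = "On".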
All well and good so far... I now had the HTML document doing what I wanted, and I even nicked a nifty colour picker from the good folks at Microsoft, in order to enable font colour and highlight colour selection.
I exposed a few methods for loading HTML content: from a specific URI, from text content, or from a stream. I then added a routed event which reports when the current HTML is edited.
The next task was to replicate a few pieces of functionality people generally expect to be available in HTML editors, but which aren't provided via the MSHTML object model. The first is the ability to create ordered and unordered lists, the second is the inserting of hyperlinks, and the third is the ability to align the selected text. I accomplished each of these tasks with the help of the indispensable HTML Agility Pack HTML parser, which I used to parse the HTML and manipulate the DOM of the loaded document directly.
Then, just when I thought I had it all cracked, I integrated it into the app, and found that the pesky ActiveX control was not quite behaving as expected. Every time the user edited the HTML document, then attempted to load a new one, a dialogue box was being generated prompting the user that the current content had changed, and asking them if they wanted to save this updated HTML content to disk, or discard the changes.
Given that our end users have most likely never even heard of HTML, let alone have any wish to save some to their disks, this was not good. Try as I might, I simply couldn't find any good way to suppress these dialogue boxes. Whatever properties I set on the document, or on the control, up it popped each time, like that hugely irritating paperclip in older versions of Office.
Eventually, I found an article which had gone some way towards solving this exact problem. It provides a suitably tortuous hack which works by hooking the WndProc and listening out for the message which is raised when new windows are created. As the dialogue box is always opened with the same text in the title bar, each time a new window is created, its title bar text is examined; if it matches the dialogue window's text, the message is swallowed before the window is shown. Excellent!
However, one problem remained: because the window that is created by the control is a Yes/No/Cancel dialogue, if you just kill the window, it returns a dialogue result of 'Cancel', which results in nothing happening. Remember, the window is asking the user to choose whether they want to (A) save the updated HTML content to disk, (B) discard the changes and continue, or (C) cancel the loading of the new document. So, the final task was to find a way to programmatically click the 'No' button, so the modifications are discarded without the user ever seeing the window. I accomplished this by using the EnumChildWindows Win32 function to search through the child windows of the dialogue until it finds the 'No' button, then the SendMessage Win32 function to simulate a button click. This, finally, cured the problem of the recurring popup.
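For the curious, the button-clicking part can be sketched along these lines. This is an illustration under a few assumptions, not the exact code in the download: the DialogDismisser class and its methods are names made up for the example, the caption is assumed to be the English "No" (stored by Windows as "&No" because of the keyboard accelerator), and the handle of the dialogue is assumed to have been obtained already by the WndProc hook.

```csharp
using System;
using System.Runtime.InteropServices;
using System.Text;

public static class DialogDismisser
{
    private const uint BM_CLICK = 0x00F5; // Win32 message that simulates a click

    private delegate bool EnumWindowsProc(IntPtr hWnd, IntPtr lParam);

    [DllImport("user32.dll")]
    [return: MarshalAs(UnmanagedType.Bool)]
    private static extern bool EnumChildWindows(
        IntPtr hWndParent, EnumWindowsProc callback, IntPtr lParam);

    [DllImport("user32.dll", CharSet = CharSet.Unicode)]
    private static extern int GetWindowText(IntPtr hWnd, StringBuilder text, int maxCount);

    [DllImport("user32.dll", CharSet = CharSet.Unicode)]
    private static extern int GetClassName(IntPtr hWnd, StringBuilder name, int maxCount);

    [DllImport("user32.dll", CharSet = CharSet.Unicode)]
    private static extern IntPtr SendMessage(
        IntPtr hWnd, uint msg, IntPtr wParam, IntPtr lParam);

    // Strips the accelerator marker, so "&No" compares equal to "No".
    public static string NormalizeCaption(string caption)
    {
        return caption.Replace("&", string.Empty).Trim();
    }

    // Clicks the first button under dialogHandle whose caption matches.
    // Returns true if a matching button was found and clicked.
    public static bool ClickButton(IntPtr dialogHandle, string caption)
    {
        bool clicked = false;
        EnumWindowsProc callback = (child, lParam) =>
        {
            var className = new StringBuilder(256);
            GetClassName(child, className, className.Capacity);
            var text = new StringBuilder(256);
            GetWindowText(child, text, text.Capacity);

            bool isButton = string.Equals(
                className.ToString(), "Button", StringComparison.OrdinalIgnoreCase);
            if (isButton && NormalizeCaption(text.ToString()) == caption)
            {
                SendMessage(child, BM_CLICK, IntPtr.Zero, IntPtr.Zero);
                clicked = true;
                return false; // stop enumerating
            }
            return true; // keep looking
        };

        EnumChildWindows(dialogHandle, callback, IntPtr.Zero);
        return clicked;
    }
}
```

A call such as DialogDismisser.ClickButton(dialogHandle, "No") would then discard the changes. Matching on the caption text is fragile on non-English versions of Windows, so treat it as a starting point only.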
All in all, quite a bit of work to re-implement a lot of functionality which came for free with the Windows Forms HTML editor, but a good learning exercise along the way. The source code can be downloaded from the link at the top of the article.
Update (9/4/2009): I have uploaded version 1.1 of the editor, which fixes a couple of bugs around the alignment and formatting.
Update (20/06/2009): I have uploaded version 1.2 today with a couple more stability fixes.