UML Comics and more from Martin L. Shoemaker (The UML Guy),
Offering UML Instruction and Consulting for your projects and teams.
Coding for the Microsoft Managed Speech API.
So in Part 4, I said that recognizing the music key would be tricky. But why? Didn't I spend most of Part 3 explaining how cleverly I used M-SAPI so that users only had to say partial names to be recognized? Well, yes; but I've long said that programming has a Conservation of Complexity law: the less complex for the users, the more complex for the programmers. (Be glad: that's the short version. My long discussion on Conservation of Complexity would take up the rest of this post.) The reason why ......
In Part 3, we built a Grammar for Dee Jay to recognize. Update to Part 3: Driving around last night, it occurred to me that I can let the user specify what sort of media is expected. For example, I could say "Dee Jay, play song Has Been" to play the song, or "Dee Jay, play album Has Been" to play the album. This specifier should be optional, so the user only has to use it when the user knows there's a potential conflict. Besides making my Dee Jay experience a little more convenient, this also gives ......
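To make the optional specifier concrete, here is a minimal sketch of the command shape in plain Python. It is not M-SAPI code and not Dee Jay's actual implementation: it works on already-recognized text, and the pattern and the `parse_command` helper are hypothetical names chosen for illustration. It only shows that "song" or "album" may be present or absent without changing how the title is picked out.

```python
import re

# Hypothetical command shape: "Dee Jay, play [song|album] <title>".
# The media specifier is an optional group, so commands without it still match.
COMMAND = re.compile(
    r"^dee jay,?\s+play\s+(?:(?P<kind>song|album)\s+)?(?P<title>.+)$",
    re.IGNORECASE,
)


def parse_command(text):
    """Return (kind, title), or None if the text is not a play command.

    kind is "song", "album", or None when no specifier was spoken.
    """
    match = COMMAND.match(text.strip())
    if match is None:
        return None
    kind = match.group("kind")
    return (kind.lower() if kind else None, match.group("title"))
```

When the specifier is missing, `kind` comes back as `None`, and the caller is free to search every kind of media, which matches the idea that the user only adds "song" or "album" when there is a potential conflict.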
In Part 2, we dug a little bit into MPM (Media Player Magic) to build a JukeBoxPhraseMap, mapping phrases from the Media Player to songs, albums, and collections. Now we need to turn those phrases into M-SAPI commands. In concept, we want a Choices object, which represents a choice between two or more alternate phrases. We could turn the whole map into one giant Choices, and we will; but that Choices would be pretty unusable. No user is going to remember and correctly speak some of the song titles ......
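As a rough illustration of mapping phrases to media, the sketch below builds a phrase map in plain Python. It is an assumption, not the JukeBoxPhraseMap from the post: the excerpt does not say how partial names are derived, so this version simply treats every contiguous run of words in a title as a phrase the user might say. `build_phrase_map` is a hypothetical helper, and no M-SAPI types are used.

```python
from collections import defaultdict


def build_phrase_map(titles):
    """Map each contiguous run of words in a title to the titles containing it.

    Assumption for illustration: any run of consecutive words counts as a
    speakable partial name. Phrases are lowercased; titles keep their case.
    """
    phrase_map = defaultdict(set)
    for title in titles:
        words = title.lower().split()
        for start in range(len(words)):
            for end in range(start + 1, len(words) + 1):
                phrase_map[" ".join(words[start:end])].add(title)
    return dict(phrase_map)
```

Each key in the resulting map is a candidate alternative for a choice between phrases, and a key that maps to more than one title marks exactly the kind of conflict the recognizer's caller has to resolve.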
In Part 1, we saw how the process of building a grammar is similar to the Decorator or Composite patterns, building a larger structure out of smaller pieces. In Part 2, we'll build and recognize a grammar to see how to define and identify parts of a command. In some ways, I wish I had chosen a different example for my first speech application. I think Dee Jay is a really cool app, and I use it every day on my drive to work; but the Media Player programming is complex enough to be worthy of a few blog ......
To understand the code behind Dee Jay, we first need to understand the basics of the M-SAPI speech recognition system. That means we need to understand three concepts: SpeechRecognitionEngine. This is the class that will listen for commands and phrases and fire events when it recognizes something. We're not ready to understand this class yet, even though it's a very simple class. Before we can look at the SpeechRecognitionEngine, though, we need to look at Grammar. Grammar. This class describes a ......
I wrote Dee Jay as an example for a proposed talk for the Ann Arbor Day of .NET, and as a way to learn more about the Managed Speech API in Microsoft Windows Vista. Dee Jay works with M-SAPI and Windows Media Player to give you a totally voice-controlled way to play your music. You simply say a command like "Dee Jay, play some Dire Straits", and it searches your song catalog for songs by Dire Straits, picks one, and plays it. Or you can name a specific title, or even a genre. If there are multiple ......
...and they're listening to him. Jason built a C# implementation of a Z-machine, the engine that powered classic old text adventures. Now James Ashley has added a Managed SAPI user interface, allowing you to talk to the game and have it respond. Jason knows I'm very excited by M-SAPI, so he sent me a link. Now I'm sharing it with what few readers I have; and I'll be keeping an eye on James's blog. And yes, Jason, I am very excited about M-SAPI. Witness my next post ......