Speech-recognition software now faster, more accurate than ever
Two speech-recognition applications can be used for controlling your computer as well as for dictating documents. And both can be trained to better recognize your voice.
Special to The Seattle Times
Dragon NaturallySpeaking 11.5 Premium
Processor: 1 GHz Intel Pentium or equivalent AMD processor or 1.66 GHz Intel Atom. 1.8 GHz Intel dual core or equivalent AMD recommended
Cache: Minimum 512 KB, 2 MB recommended
Hard disk space: 2.5 GB
Operating system: Windows 7, Vista, Windows XP (32-bit only), or Windows Server 2003 and 2008
RAM: Minimum 1 GB for Windows XP and Windows Vista, 2 GB for Windows 7 and Windows Server 2003/2008
Sound: Creative Labs Sound Blaster 16 or equivalent sound card supporting 16-bit recording
DVD drive: Required for installation
Peripherals: Nuance-approved noise-canceling headset microphone (included in purchase)
I had no idea when I started this review just how much I would need it. Several weeks ago, on a rainy Saturday morning, my motorcycle got a little too intimate with a van. Unfortunately, I was on it. My broken wrist is going to be just a little more expensive to fix than the motorcycle.
So, yes, I need the money I make from this review. But more to the point, I need the speech-recognition software to write it.
I started out with the intent of comparing Nuance's oddly named Dragon NaturallySpeaking with the free Speech Recognition utility that comes with Windows.
Both applications can be used for controlling your computer as well as for dictating documents. And both can be trained to better recognize your voice.
But it didn't take me long to drop the idea of comparing the programs. Microsoft's Speech Recognition might be OK for the simplest tasks, but if you care at all about accurate speech recognition — especially in dictating documents — you want to shell out for Dragon NaturallySpeaking.
As a journalist who does a lot of interviewing, I've been using Dragon NaturallySpeaking for about 20 years. With the new version 11.5, however, I can report that the product has gotten fast and accurate enough to count on regularly for transcription and dictation.
After only about 10 minutes of training the software to my voice, I dictated a full page of text, reading from a newspaper story, and Dragon made only two errors. By comparison, Microsoft Speech Recognition produced a dozen errors, many of them so serious that I could not determine the correct text without referring to the original.
While Dragon's speech recognition is surprisingly accurate from the get-go, you can improve its performance in a variety of ways.
For starters, you will be prompted to create a user profile. In doing so, you will select the closest "speech model." You can, for example, choose "Australian accented English." There even are variants of the speech models that take into account whether you have a single core or a multicore computer and whether you're using a Bluetooth device. Each model is optimized to produce better results under those conditions.
And if you find that the program has difficulty with a certain word or phrase, you can train it specifically for that word or phrase. Just go to the audio menu and select "Improve recognition of word or phrase." You'll be prompted to type in the word or phrase and then to speak it.
Hearing your speech
Another option on the audio menu gives you access to a dozen or so readings you can perform at your leisure to further improve Dragon's accuracy. The more the program hears of your speech, the better it is at transcribing. Dragon also offers a variety of ways to have the program remember corrections you make so that future recognition is more accurate.
You can also improve accuracy by selecting recognition modes. The most flexible is normal mode. In this mode Dragon allows you to issue control commands and perform dictation. It tries to detect which you're doing by analyzing the context and what you say between pauses.
I found the program does a pretty good job of this, but there inevitably are some confusions. You can eliminate these confusions by choosing another, narrower mode, such as dictation, command, numbers or spell.
Finally, I found that I got noticeably better results using a headset microphone rather than a desktop microphone. Microphone quality is critical to recognition accuracy, so you may want a better microphone. Background noise can also significantly affect performance.
Nuance allows you to control almost everything about Dragon's configuration. And we're not just talking about where the toolbar is anchored. You can specify whether voice commands should be used to perform a variety of tasks, including managing email, searching the desktop, navigating the Web and so forth.
You can even have Dragon recognize digital files from your voice recorder, and you can use your smartphone as a wireless microphone.
Other capabilities include a text-to-speech technology that reads on-screen text in human-sounding synthesized speech. Dragon Voice Shortcuts let you create email, schedule appointments and search your desktop using voice commands. And you can also create commands for inserting boilerplate text or graphics into documents.
The Windows version of the program is designed to work inside of virtually any Windows application, and I found it did so without a hitch.
Not surprisingly, Dragon is more effective when it is tailored for a particular user audience. Versions are available for individual users, corporate users and small businesses.
Tailored to professions
The product has also been tailored for those in various professions, including health care, law, education and even automotive. There are versions that include headsets, Bluetooth headsets and other accessories.
I tested the standard Windows Premium version, which has a list price of $199.99. (By the way, Dragon accurately transcribed my saying the price in normal fashion.)
The long and the short of it is that, after decades of wishing that the program were better, I think the program is finally there.
Now if only I could get it to accurately recognize voices for which it hasn't been trained. Then I'd be able to feed into it recordings of my interviews. I'm sure the National Security Agency has something capable of doing that, but for now at least Dragon NaturallySpeaking 11.5 is the best on the consumer market.
Patrick Marshall writes the weekly Q&A column in Personal Technology.