Customer Advisory: Voice Recognition Software
I leaned back in the overstuffed chair with the natty green upholstery.
The chair swiveled slightly to the right so that I could get a better
view out the window at the snowy eastern Connecticut scenery as it
passed by. The old rail car creaked and rumbled as it slowly made its
way down the tracks. I could see the steam that was billowing out of the
engine two cars in front of me. Although it was winter and the windows
were closed, I could smell that the firemen had recently stoked the
engine with fresh, black coal. Just as I reached for my glass so that I
could have another sip of what was left of my drink, I suddenly heard a
faint chirping sound. As I listened more carefully I realized that this
was not the noise of some melodious caged bird, unless this bird had
been taught one of the works of Mozart. I swiveled back in my chair to
see that the person next to me had pulled out her cell phone and was
just beginning a conversation with someone in California.
No, this is not the start of some cheesy science fiction short story
(although I have read many that began in a similar way). This was an
actual event that occurred a few months ago. I had gone with some
friends to a festival in eastern Connecticut and when we got there we
discovered that a steam train excursion was one of the features of the
festival. I couldn't help enjoying the incongruity between very old and
very new technology. While riding on this antique through the pastoral
countryside I struck up a conversation with one of my friends about some
of the features in her new cell phone that she had just used to hold a
conversation with a brother on the west coast. From this starting point
we talked about the latest features of the newest wireless devices. Soon
we were off to various other geeky tangents.
Eventually (though I'm not sure by what route) we found ourselves
discussing voice recognition software. I must admit the concept was
appealing to me. After all, whenever I have tried to write something
I've always found myself having to say the text aloud, then type what I
have just said, and finally say it one more time just to see if it
really made as much sense as I thought it should. Wouldn't it be
wonderful to be able to cut out one or more of those steps?
A few weeks later, with renewed enthusiasm and a little extra money burning
a hole in my purse, I ventured forth to the software store ready to find
the program that would surely change my life forever.
On the recommendation of some friends who have perhaps even less
knowledge than I do about computers, I looked most carefully at IBM's
ViaVoice. There were two versions, the standard and the advanced. They
were about $40 and $60 respectively. Both versions came with a headset
microphone. I surmised that the cost of the headset was about $20 so
the software couldn't be all that expensive. I looked carefully at both
boxes - the Advanced Edition was clearly more expensive but I couldn't
figure out exactly why. There was no substantial difference in what was
said on either box. The boxes were definitely different colors and the
"advanced" was $20 more, and it was "advanced" after all. It was
clearly the better choice.
So I got my new toy home and installed it. The first thing you have to
do with it is train it to understand the way you speak. You have to read
stories to it so that it begins to learn how you pronounce certain
words. There was something strangely fun about reading to a computer and
knowing that it was trying to learn something from what you were
reading. Then, of course, there is the "Star Trek" factor - just
speaking a command will often make it happen (like saying "print" will
actually send the document I'm working on to the printer).
That isn't to say, however, that there still aren't some flaws in this
program. I am using the program right now to type this article, and when
I typed the last line of the previous paragraph I inadvertently said the
word that is spelled p-r-i-n-t and immediately Microsoft Word decided I
was asking for hard copy. So you can see that when a word comes up in
text that may also be a command you can get yourself in a little
trouble. You're supposed to pause before uttering commands, and the
program should recognize them as such. But who doesn't talk like William
Shatner at least once in a while? It's the pauses that will get you into
trouble. If, for example, you want to describe something out your
window but think for a moment between "your" and "window", suddenly
you're staring at a pull-down menu you didn't want (as I am right now).
And then of course, the program is still learning and until it is better
educated it will continue making various mistakes. Furthermore, if I get
tired and my vocal inflexion begins to change, the program has a harder
time understanding me and it begins to seem like it's getting a little
stupid, or I am beginning to say some very bizarre things.
Like now, for example, I'm not speaking the same way I did when I started
writing. So the program is having a harder time keeping up with trying
to understand what I want it to type. To make this problem a bit
clearer, I will no longer correct what the program types into Word from
this point on. I will just keep going and hope that it makes some kind
of sense somehow.
Tallinn all, despite its flaws were I find it to be one route to the
relatively enjoyable program to use. Once a while of course the program
makes an educated guess a cat that it thinks you want to say. Sometimes
it's right and sometimes it's wildly wrong. When it is wrong, you could
use a special utility to teach and what the right word is to go with the
sound made. I would describe that but doing so would actually make it
happen and I really don't want to do that now. Another inevitable
problem is that the microphone sometimes picks things up like when I
scratch my nose and pull the microphone and it thinks that the noise is
some sort of word such as this: Pope.
And when you see something like
the Platte appear just because you scratch sure those care a slate, you
really have to wonder just what the hell thises. But as I said we come
up with, I am generally satisfied with this program and it is much
smarter than it used to pay. When perhaps the most annoying experience I
ever had with it though was once when I was just beginning to use it and
the phone rang. Where there is a command to simply tell all the
microphone to consult ought. But instead of turning itself off, it typed
in the words instead read and I tried to get to the phone will saying
many different silly things and less of them ended up in my document.
What that made me feel very foolish. And there is nothing worse than
being embarrassed by software. But what I set? It takes time to get used
to essentially a noise free environment and certainly aboon to to the
Ayatollah SFO. And is there anyone pokeweeds meandered? I think not!