speech recognition... is it really the future?

Today when I was watching TV, there was a commercial talking about the “kitchen of the future,” and it said voice recognition would play a vital role. That got me thinking. I am constantly seeing advanced voice recognition software associated with the word “future.” We find it integrated into new cars, telephones, computer writing software, etc. Yes, there are certain professions and uses that benefit from v.r., but as a standard? Personally, I would never use v.r.-related stuff. I don’t want a future of talking to ovens, closets, and elevators. Heck, I don’t even like talking on the phone. I will always prefer typing and pressing buttons to talking.

I was wondering what your opinions are on this. Are vocal commands to appliances and such really the future? Voice recognition software is rapidly improving, and it may very well interpret sounds as well as humans do someday, but do you actually like it?

It will become more prevalent, that’s for sure. My car (Acura TL) has voice recognition. For some things (particularly phone calls, though phones have had voice recognition for a while) it’s very useful. For others (turning on the defroster) it ends up being nothing more than a party trick. It’s easier for me to push one button than it is to push a button, say the command, and wait for it to kick in. Perhaps if it were more seamless and easy to use people would be fonder of it, but buttons aren’t going away anytime soon - and there are plenty of times I’d rather push a button discreetly than talk to something.

We see it happening already…look at Microsoft Sync.

I’m not too familiar with it myself, but from what I know the biggest problem seems to be confusion of words. My phone has it, but it’s such a pain in the ass to use. You say ‘Call home’, you get ‘Dialing Laurie’.

It’d be cool to see it actually happen and be dependable. I’d consider speech and blinking the 2 fastest ways to perform an action.

They’re wrong about technology in the kitchen.
Have they been in a Williams Sonoma or watched the Food Network lately?

Even assuming the technology were 100% reliable, I think adoption would be constrained by culture, attitudes and behaviors that reject it as a mode of man-machine interaction.

Kitchens are not cars.

I dunno - telling my kitchen to “Make me a sandwich” might be pretty sweet.

It’s fairly accurate in my car but it’s a little confusing because if you issue a command for a screen you’re not on (like telling it to play a CD while on the Nav screen instead of the audio screen) it’ll get confused and do something stupid like turn the air conditioning off.
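For what it’s worth, the behavior described above looks like a routing problem: the command is matched against whatever screen happens to be active instead of against the screen that owns it. Here is a rough Python sketch of the two approaches. Everything in it (screen names, phrases, function names) is made up for illustration; it says nothing about how the Acura system is actually built.

```python
# Hypothetical sketch: these screens, phrases and function names are
# invented for illustration and do not describe any real car system.
SCREEN_COMMANDS = {
    "audio": {"play cd", "next track"},
    "nav": {"set destination", "zoom in"},
    "climate": {"defrost on", "air conditioning off"},
}


def route_screen_scoped(active_screen, phrase):
    """Accept a phrase only if it belongs to the currently active screen."""
    if phrase in SCREEN_COMMANDS[active_screen]:
        return (active_screen, phrase)
    return None  # reject rather than guess at some other command


def route_global(phrase):
    """Find whichever screen owns the phrase, regardless of the active one."""
    for screen, phrases in SCREEN_COMMANDS.items():
        if phrase in phrases:
            return (screen, phrase)
    return None  # unknown phrase: do nothing
```

With the screen-scoped version, “play cd” spoken on the nav screen is simply rejected; with the global version it is routed to the audio screen. Either one seems preferable to picking a near match from the wrong screen.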

“Make me a sandwich” would only work IF you had the robotic chef in there as well (arms, knives, etc.). Even then you would have to state the condiments (amount and type), or have “sandwich” preloaded into its memory.

I agree the kitchen is very cultural, a vestige of our hunter-gatherer times. Things will change slowly there (remember, the microwave took 25 years to become a standard item).

With limited vocabularies, speech recognition has improved, but there are still other issues with it that will always be a problem. One major one is ambient noise and voices. Another, as mentioned before, is privacy and feeling like an idiot. (For people talking to themselves: Bluetooth or Crazy?)
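To illustrate why a limited vocabulary helps, here is a minimal Python sketch that snaps a noisy transcription to the closest known command and refuses to act when nothing is close enough. It is a toy under stated assumptions: text similarity from the standard library’s `difflib` stands in for acoustic similarity, and the command list, the `recognize` function and the 0.8 cutoff are all hypothetical.

```python
import difflib

# Hypothetical command list; a real system would match audio, not text.
VOCABULARY = ["call home", "call laurie", "play cd", "defrost on"]


def recognize(heard, cutoff=0.8):
    """Return the closest known command, or None if nothing is close enough.

    `cutoff` is an assumed similarity threshold between 0 and 1. A higher
    value means fewer wrong actions but more "please repeat" responses.
    """
    matches = difflib.get_close_matches(heard, VOCABULARY, n=1, cutoff=cutoff)
    return matches[0] if matches else None
```

Here `recognize("call hone")` resolves to “call home”, while `recognize("make me a sandwich")` returns `None` instead of dialing someone at random. The trade-off is that a stricter cutoff makes the system ask you to repeat yourself more often, which is its own kind of annoying.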

For unlimited vocabularies the progress in accuracy will probably be very slow, because we often slur our words and depend on top-down processing to figure out what people are saying, which involves using your brain to impose pattern matching on incoming auditory information (ex. “Jeet yet? No, jew” translation: did you eat yet? No, did you?)

I think that this commercial you saw was an example of “technology in search of an application” rather than the other way around. If, however, there were situations with a quiet, private environment where other input mechanisms were unavailable (e.g. disabled users) or already in use, perhaps it would make more sense.

With regard to cultural norms, this can change faster than people may think. (Remember the Walkman?)


Speech recognition is a dead end. The computer cannot differentiate between what we want and what we are talking about. Look at computers. There is no barrier in software design to speech recognition, yet it is barely used.

I don’t think it is cultural acceptance. We have been trained through 40-50 years of Buck Rogers, Star Trek, Star Wars, Babylon 5 and whatever other sci-fi story you wish to add to believe that one day we will work more efficiently by speaking to computers. So much so that people keep presenting concepts like this. I remember someone posted a YouTube video from the ’50s of a “kitchen of the future” that was voice controlled too. There is a deep desire to make something this easy, but there are big problems. Props to the designer/engineer/inventor who figures out how to solve those problems!

good post.

i remember my first encounter with speech recognition on the first imac. it worked (sorta), but i had to pretty much yell the commands in broken english and felt like i was some sorta savant or had tourettes. (“open folder. OPEN FOLDER. x@$%^&* FOLDER!!!”).

even now on my new iphone, speech recognition is one of those technologies that, i believe, is better in theory than in practice, as many here have mentioned.

i’d lump it in with other “great” future technologies that have been promised for at least the last 50 years such as videophones, sliding doors (a la star trek), and hover cars. possible, perhaps, but practical and gonna go mainstream anytime soon…think not.

all that being said, i still think that good speech recognition is possible and really just depends on a more realistic, human approach. if apple were to seriously tackle the issue, i’m sure the result could work. the tech is there (saw a great TED video recently on the topic), but i think it’s more an issue of UI and design at this point than technology or processing power.


The whole purpose of speech control is that you don’t need to face a screen. The way I picture it, your fridge tells you you’re running out of milk and while you are walking around the house opening your mail or doing the dishes you tell it to order some more.

I know this includes some other technology as well but my point is that I don’t think speech control will -replace- input apparatuses like the mouse and keyboard, but rather complement them.

Thinking back about my encounter with speech mouse software:
“Left, left, left, faster, faster, faster, STOP!!, right, right, slower, right, slower, STOP!, click!, right, slower, right, stop, doubleclick!”