When it comes to designing intuitive, compelling user interfaces, Apple is hands-down the best. Starting with the Mac but most evident with each new generation of “i” products — iMac, iPod and iPhone — the company has demonstrated time and again what so many other device makers and mobile operators have failed to understand: It’s the UI, stupid! So when Apple features Voice Control in commercials for the newest iPhone 3GS, the mobile industry should sit up and take notice.

While the marriage of speech technologies and mobile is under way and irreversible, the transition won’t be a smooth one. First, many undoubtedly remember past speech applications that didn’t work very well. That perception will need to be overcome; implementing speech with simple applications, as Apple has done with Voice Control, is a good way to start. Second, some applications are more compatible with speech than others. Selecting and listening to music, for instance, is a natural application: the number of songs and artists is limited, which improves the accuracy of speech recognition, and users typically listen to music in a closed environment or with a headset (ideally one with a built-in microphone), which reduces ambient noise and makes it easier for voice commands to be understood.

Much as RIM has carved out a loyal following by developing solutions optimized for email, there is a significant opportunity for operators and OEMs to incorporate speech into mobile devices and applications in a comprehensive way. Apple is leading the way, and others will likely follow suit.

One April, 11 and a half years ago, Hollywood actor Richard Dreyfuss presented a new type of software that was going to 'revolutionise business'. He had been paid to host the launch of Dragon's NaturallySpeaking application, which could faultlessly translate spoken words into text. If this worked, we could chuck away our keyboards. Productivity would multiply. Dragon would become the new Microsoft and a new era of IT would dawn.

And work it did too -- in the demonstration. But not everything about the event was quite so well stage-managed. New York was suffering its worst ever blizzard and few made it through the snow. One year later, founders Janet and Jim Baker hadn't found the mass market they may have anticipated. That year, a Belgian firm called Lernout & Hauspie introduced Voice-Express, another desktop speech software product that could potentially free us all from the tyranny of crouching over a keyboard, ruining our posture and giving ourselves RSI. In a demo, it even outperformed the world's fastest typist.

So why aren't we using this software on every computer in the land? Why aren't we talking to computers, telling them what we want to do? How come Windows and Mac OS remained the user interfaces of choice, when voice commands would be so much more efficient and user friendly? Especially as speech dictation has become part of so many phone calls to buy tickets, report meter readings and query bills?

Nik Sargent

GetDesign | Nik Sargent | design interaction blog

Is iPhone’s Voice Control the Sound of Things to Come?

Time to reappraise speech recognition systems?
