Google finding its voice


Google's Mike Cohen won't be satisfied until anyone who wants to talk to their computer can do so without laughing at the hideous translation or sighing in frustration.
Cohen, a leading figure in speech technology circles, heads up Google's efforts to advance the science of speech technology while applying it to as many products as possible. "Google's mission is to organize the world's information, and it turns out a lot of the world's information is spoken," Cohen said, in a recent interview with CNET about the search giant's speech ambitions.
Google is attempting to produce voice-recognition technology that fits in with its view that the computing universe is shifting toward mobile devices and browser-based applications. That is, easy-to-use software that does the heavy lifting at the data center in order to run over the Internet on a mobile device with limited hardware.
Computer speech recognition seems like it has been five to 10 years away for decades. Indeed, the electronics and computer industries have been chasing the goal of voice-directed computers for nearly 100 years, ever since Radio Rex, a simple wooden toy dog released in 1911, first captivated children and adults by shooting out of his doghouse (at least some of the time) when his owners called for "Rex!" (Cohen owns one of the few remaining gadgets.)
Huge advances have obviously been made since the 1920s, yet few of us use our computers like HAL in "2001: A Space Odyssey" or KITT, the computerized car in "Knight Rider." Cohen, however, believes the industry is about to silence the jokes about amusingly garbled voice mails as speech recognition models grow more sophisticated, engineers pack mobile computing devices with more sophisticated hardware, and users start to realize that performance has made great strides.
"The goal is complete ubiquity of spoken input and output," Cohen said. "Wherever it makes sense, we want it to be available with very high performance."



YouTube Launches Auto-Captioning for Videos

Mike Cohen, part of Google’s Speech Technology team (as a note, he is also deaf), used sign language to talk about his team’s work on video. This press conference is about YouTube and its accessibility for people with disabilities, specifically the deaf. It’s also about YouTube’s new auto-captioning technology, which is rolling out to everybody today.

ChaCha Beats Google and Yahoo in Mobile Voice Search Tests


Mobile analyst firm MSearchGroove has just published the results of a series of tests which show that the mobile search service ChaCha beat out two other voice-enabled search applications on the iPhone when it comes to search query accuracy. [Update, Ed: a commenter points out that the report was actually sponsored by ChaCha] To test this, the researchers used Google's own mobile application and Vlingo for iPhone, an app that lets you search either Google or Yahoo. Oddly, they ignored Yahoo's mobile app, which also has voice search built in.

The results of their study aren't entirely shocking: if you want to be understood, ask a human, not a computer.

The Mobile Search Tests

ChaCha's mobile search service can be accessed both by SMS and by calling a toll-free 1-800 number. Since these tests focused on voice search, the phone-in method was used. When using ChaCha, the service identified the queries accurately in 94.4% of the cases and delivered accurate search results 88.9% of the time. Vlingo, which the researchers used to test Yahoo search, only interpreted queries correctly in 72.2% of the cases and delivered accurate results 27.8% of the time. Google, surprisingly, fared worst of all. Their mobile application only understood spoken queries in 16.7% of tests and delivered accurate results 22.2% of the time.

To test the applications, the researchers conducted two rounds of tests using both keyword search and natural language queries where they asked questions using sentences. The queries represented a cross-section of typical mobile searches in categories like navigation, directions, local search, general information, social search, and long-tail search.

It's not all that surprising to find that ChaCha outperformed the other voice-enabled applications - after all, they have real, live humans on the other end of the line to interpret the spoken questions. What is surprising, though, is how wide the gap is between the human-powered search and the speech recognition apps, especially when contrasting ChaCha with Google.



Now search is just a call away

Google allows users to phone a toll-free number and make a query.

Sharad Nanda from Pune is a kebab (meat dish) fan. On a trip to Delhi, he wanted to savour some kebabs. He could have easily phoned his friends for the exact location of eateries. Instead, he took out his cellphone, dialed a toll-free number and asked for kebab joints in a specified location. In less than a minute, he was given a choice of three places. He has ‘voice search’ to thank, a service that internet search giant Google recently introduced in Hyderabad, Delhi, Mumbai and Bangalore.

Google Voice: A push to rewire your phone service

Google Voice, the new version of the GrandCentral technology Google acquired in July 2007, has the potential to make the search giant a middleman in an important part of people's lives: telephone communications. With the service, people can pick a new phone number from Google Voice; when others call it, Google can ring all the actual phones a person uses and handle voice mail.

The old version let people centralize telephone services, screen their calls, and listen to voice mail over the Web. The new version, though, offers several significant new features. Google now uses its speech-to-text technology to transcribe voice mail, making it possible to search for particular words. Gmail's contact list is now used to instruct Google Voice how to treat various callers. And Google Voice can now send and receive SMS text messages and set up conference calls.

Lost in translation: The iPhone's accents problem

[nik's note: this is funny]

SEASONAL scene somewhere in Scottish theatre land:
"Whit did ye get her fir Christmas?"

"Ah firgoat."

"Ye firgoat? Aw, did she gie ye hell?"


"Well ye said ye firgoat"

"Naw, ah fir goat."

And so the pantomime joke continues, for as long as the colourfully clad dames can draw it out. Eventually the smaller, fatter ugly sister will understand that her taller, scrawnier stage sibling has given a present of a fur coat. But the confusion inevitably won't end there. "Whit fir?" The answer: "Fir tae keep her warm." Obviously. 

The Google application for iPhone, which was developed in the US, is supposed to allow users to search for information by recognising the words they say. Unfortunately, there have been some serious transatlantic translation glitches.

Google Voice Search Turns 'Fish' To 'Sex'

Users of the latest release of Google Mobile App for the iPhone have complained that the voice-recognition program doesn't understand some British accents.

Users posting on the Google Mobile Blog have complained that Google's app doesn't understand some British accents, a claim that isn't entirely surprising given the use of English subtitles on certain television shows imported from the United Kingdom to the United States.

"Awesome job, Google," wrote someone posting under the name Kevin. "Only problem is every time I say the word 'fish' it registers as 'sex.' " 

Google audio search graduates to lab project

Google has elevated the profile of its attempt to make videos searchable through speech recognition technology, a move that portends a potentially more financially successful YouTube division.
The speech recognition technology was used in an online application, launched in July, that let people search political speeches, and now the Gaudi (Google Audio Indexing) project has an official interface at Google Labs.