Now search is just a call away

Google allows users to phone a toll-free number and make a query.

Sharad Nanda from Pune is a kebab (meat dish) fan. On a trip to Delhi, he wanted to savour some kebabs. He could easily have phoned his friends for the exact location of eateries. Instead, he took out his cellphone, dialed a toll-free number and asked for kebab joints in a specified location. In less than a minute, he was given a choice of three places. For that he can thank ‘voice search’, which internet search giant Google recently introduced in Hyderabad, Delhi, Mumbai and Bangalore.

Nu Echo Announces General Availability of NuGram IDE Basic Edition

NuGram IDE is a fully integrated, Eclipse-based grammar development environment. Used to create, debug, optimize and maintain static and dynamic grammars, it is one of the two components of the NuGram Platform, first presented at SpeechTEK 2008 in New York City as the first complete solution for speech grammar development and deployment.

Following a 6-month beta program involving several hundred users worldwide, Nu Echo is now making NuGram IDE Basic Edition generally available free of charge to the developer community. Designed to support the rigorous grammar development processes required in order to efficiently produce solid grammars, NuGram IDE Basic Edition:

 

  • Enables developers to author grammars using one single concise and legible format, regardless of the target speech recognition engine.
  • Offers a grammar editor with several advanced features, including syntax coloring, content assist (code completion, quick fixes, code templates, etc.) and sophisticated refactoring tools.
  • Provides powerful grammar analysis, visualization, and debugging tools.
  • Provides tools to test grammar coverage and semantic interpretation correctness.

Talking while driving still not safe?

A study by researchers at Harvard and the University of Warwick examined the safety of talking while driving and found that people engaged in conversation drove much more poorly than those in the other conditions.

"The worst results came from the subjects tasked with listening to a list of words and then speaking new words that began with the same letters as each word on the list. Those "drivers" had a 480 millisecond delay, which at 60 miles per hour would mean 42.3 additional feet traveled before applying the brakes." 

This task is similar to using an in-vehicle system for command and control purposes. The driver speaks to the system, waits for its response and possibly speaks again. A mitigating factor is that speech typically offers shortcuts that activate functionality more quickly, reducing the time that drivers spend interacting with the system.

In-vehicle systems are not totally hands-free, however; they are usually "push to talk", like a Nextel phone or a walkie-talkie. The driver is endpointing their own speech, which makes the system's job easier.

In-vehicle speech recognition is worth watching, and it may be safer than alternatives, but it still hasn't been shown to actually be safe.
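The distance figure in the quoted passage can be checked with a unit conversion. The sketch below is only an illustration of that arithmetic; the function name and constants are chosen here and do not come from the study. With the exact conversion of 60 miles per hour to 88 feet per second, a 480 millisecond delay works out to about 42.2 feet, in line with the quoted figure of 42.3.

```python
FEET_PER_MILE = 5280
SECONDS_PER_HOUR = 3600


def extra_distance_feet(speed_mph: float, delay_ms: float) -> float:
    """Return the distance (in feet) covered during a reaction delay."""
    # Convert miles per hour to feet per second.
    feet_per_second = speed_mph * FEET_PER_MILE / SECONDS_PER_HOUR
    # Convert the delay from milliseconds to seconds and multiply.
    return feet_per_second * (delay_ms / 1000.0)


if __name__ == "__main__":
    # 60 mph and a 480 ms delay, the values given in the quote.
    print(f"{extra_distance_feet(60, 480):.2f} feet")
```

The same function can be used to compare other speeds or delays, since the extra distance scales linearly with both.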

Nuance prepares to exit Israel

In late 2007, Nuance laid off most of its Israeli staff, leaving behind 30 to 40 employees. Nuance's R&D lab in Israel was made up of people it had acquired through ART and Phonetic Systems in 2004.

ART provided mobile speech recognition (an area in which Nuance has made other acquisitions) and handwriting recognition (which Nuance has seemingly allowed to go fallow). Phonetic Systems had developed its own technology to provide Directory Assistance information. The closure of these operations is in line with Paul Ricci's dedication to running a relatively lean organization and his concerns about how current economic conditions are affecting the markets in which Nuance participates. He has noted that Enterprise Speech Rec partner revenue has been much lower than anticipated. In addition, growth in mobile has slowed.

Update: Nuance acquires parts of IBM's speech technology

Nuance recently announced that it will be purchasing some of IBM's speech technology. While the addition of IBM's source code will enable Nuance to make improvements to its embedded and network-based speech recognition technology, the acquisition and ensuing relationship has prompted questions over Nuance's technology and IBM's motives.

Nuance recently announced that it has acquired parts of IBM's speech recognition technology; namely, the source code from IBM's research and development team, which will enhance its speech capabilities in the areas of network-based and embedded text-to-speech (TTS) and automatic speech recognition (ASR). Nuance intends to combine the source code with its own over the next two years to improve the performance of its speech recognition engine.

Despite initial speculation that IBM would no longer compete in this market, the company will continue to develop its speech capabilities independently in these areas. It has sold Nuance a past release of its code for its embedded ViaVoice software and its WebSphere Voice Server middleware. The key motive for IBM in making this transaction is to gain some return on investment for its speech recognition technology, which is not unusual, as it regularly sells patent licenses to other vendors.

The purchase of IBM's technology reinforces Nuance's aims to develop leading speech technology. However, it has also led to speculation that IBM's technology was in fact superior to Nuance's; if true, Nuance's decision to acquire this technology was a prudent one.

 

Study Finds Consumers Increasingly Relying on Voice-Enabled Navigation and In-Car Systems

Nuance Communications today announced findings that show consumers will take advantage of automobile voice recognition capabilities if they’re built in. In fact, eight out of nine respondents who own speech-enabled in-car systems and navigation devices regularly use the voice recognition capabilities. The Automotive Voice Interface User Survey, conducted by Maix Research and Consulting, also found that 73 percent of users reported a high degree of satisfaction, one that will lead them to recommend the technology to friends and family and to plan on buying automobiles with speech-enabled functions again in the future.

Conversive Announces New Speech Solution

Through its work in Web chat, Conversive has learned that many questions can be ably handled with one answer. Unlike previous ‘natural conversation’ speech recognition technologies, the company says, its speech solution readily recognizes the wide range of ways in which customers ask a single question.

Conversive’s speech solution works well both in standalone applications and in inbound call streams, where it diverts calls to self-service. One of its key appeals is its ability to resolve a customer’s issues while they are on hold. Instead of listening to music or a solemn voice indicating how many minutes remain before a contact center agent can take the call, a customer hears a voice asking: “How may I help you?” The Conversive Speech Recognition solution turns the customer’s spoken response into text. Then, the Conversive Automated Conversation Engine matches the text string to the most appropriate response. The answer is played to the customer in a recorded voice.
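The matching step described above, in which many phrasings of a question map to one answer, can be sketched in a few lines. This is not Conversive's engine, whose internals are not described here. It is a minimal illustration that assumes the speech has already been transcribed to text, and it uses simple word-overlap (Jaccard) scoring as a stand-in technique. The `ANSWERS` table, the function names and the threshold are all hypothetical.

```python
import re

# Hypothetical answer table: several example phrasings share one response.
ANSWERS = {
    "hours": {
        "examples": [
            "what are your opening hours",
            "when are you open",
            "what time do you close",
        ],
        "response": "We are open from 9 am to 5 pm, Monday to Friday.",
    },
    "balance": {
        "examples": [
            "what is my account balance",
            "how much money do i have",
        ],
        "response": "Your balance is available after identity verification.",
    },
}


def tokens(text):
    """Lowercase the text and return its set of words."""
    return set(re.findall(r"[a-z']+", text.lower()))


def jaccard(a, b):
    """Word-overlap similarity between two token sets (0.0 to 1.0)."""
    if not a or not b:
        return 0.0
    return len(a & b) / len(a | b)


def best_response(transcript, threshold=0.2):
    """Return the response whose example phrasing best matches the
    transcript, or None if nothing scores at or above the threshold."""
    query = tokens(transcript)
    best_score, best = 0.0, None
    for entry in ANSWERS.values():
        for example in entry["examples"]:
            score = jaccard(query, tokens(example))
            if score > best_score:
                best_score, best = score, entry["response"]
    return best if best_score >= threshold else None


if __name__ == "__main__":
    print(best_response("When are you open on weekdays?"))
```

Returning `None` below the threshold leaves room for a fallback, such as keeping the caller in the queue for an agent.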

The power of (synthetic) speech

A single spoken word takes three breaths and a great deal of willpower for Geoffrey Roberts: like thousands of people, he has cerebral palsy, which also makes his speech nearly unintelligible to those who do not know him. His limited speech is a source of great frustration for his otherwise nimble mind.

That is why a technology being developed at Barnsley Hospital and Sheffield University is nothing short of revolutionary for people like him. The Voice Input Voice Output Communication Aid (Vivoca) uses speech recognition technology to translate severely distorted words into clear sentences. It can also communicate entire sentences having heard only one or two key words.

The aid consists of a handheld computer and a wireless Bluetooth headset. Users will also be able to choose from a range of male and female recorded voices and regional dialects. Voices in the bank already include the Barnsley poet Ian McMillan and the Yorkshire BBC newsreader Christa Ackroyd. People who are slowly losing their speech - through Parkinson's or motor neurone disease, for instance - can record their voice before it has completely deteriorated.

Air France extends its use of Nuance speech recognition for telephone information services

Air France originally began using Nuance technology in October 2006 to provide English and French speaking customers with access to dial-up information services. The application has since been expanded to support Spanish, Italian, German and Portuguese as well.

The airline has now further expanded the service to offer customers the ability to book, purchase and change tickets or obtain refunds from an Air France agent; access real-time flight information; obtain information about Air France coach services; check on required vaccinations; and monitor luggage in the event of an incident.

Google Voice: A push to rewire your phone service

Google Voice, the new version of the GrandCentral technology Google acquired in July 2007, has the potential to make the search giant a middleman in an important part of people's lives: telephone communications. With the service, people can pick a new phone number from Google Voice; when others call it, Google can ring all the actual phones a person uses and handle voice mail.

The old version let people centralize telephone services, screen their calls, and listen to voice mail over the Web. The new version adds several significant features. Google now uses its speech-to-text technology to transcribe voice mail, making it possible to search for particular words. Gmail's contact list is now used to tell Google Voice how to treat various callers. And Google Voice can now send and receive SMS text messages and set up conference calls.