Lyrix and Loquendo Partner for Improved Speech Recognition

In a bid to further improve the functionality of their Mobiso service, Lyrix, a provider of speech-enabled collaboration solutions for large and distributed enterprises, has partnered with speech technologies provider Loquendo. Lyrix’s speech-enhanced directories are used for Speech Attendants and Speech-assisted Mobile Address Books. The company’s PeopleFind platform also helps businesses with increasingly mobile workforces communicate more efficiently while reducing costs.

Report: iPhone 3.0 may include voice recognition, synthesis features

Rumors suggest that the iPhone may soon be more receptive to my pleas: Ars Technica reports that uncovered software frameworks in the iPhone 3.0 beta might represent speech recognition and synthesis systems.

Of course, this isn’t exactly out of left field. Both the latest iPod nano and iPod shuffle include speech synthesis capabilities, dubbed Spoken Menus in the 4G nano and VoiceOver in the iPod shuffle, that allow users to navigate the devices without having to look at the screen. In both cases it helps people who are visually impaired use the devices, but in the shuffle it’s also a necessity, since the device has no screen and a potentially confusing control scheme.

The iPhone and iPod touch are extremely difficult for visually impaired users to interact with, as they have little in the way of tactile cues or feedback. Voice recognition in particular has been a heavily requested feature, as it could improve not just accessibility, but everyday tasks such as dialing a number without having to look at the phone’s screen—handy for when you’re driving, for example.

Cordic Picks Loquendo for Speech Technologies

Fleet management solutions provider Cordic offers a highly innovative solution for taxi companies that automates taxi bookings. To voice-enable this offering, the company has chosen speech technologies provider Loquendo for its ASR and TTS technologies. Under the partnership with Loquendo, Cordic’s cPAQ taxi dispatch system now includes IVR capabilities so that customers can book jobs, check on the progress of their bookings, and get information without having to speak to an operator or staff member.

AppTek Bolsters Media Monitoring with Hybrid Machine Translation

In addition to providing dialect coverage for automated speech recognition, MediaSphere now offers customers AppTek’s hybrid machine translation (HMT) system.

MediaSphere is a software solution that offers multilingual transcripts of various television and radio stations for many domestic and international news bureaus. To adjust to changes in dialect and language in real-time, MediaSphere makes use of AppTek’s speaker adaptive speech recognition engine. Offering a unified and scalable solution, the updated media monitoring software seamlessly integrates AppTek’s HMT system with its ASR engine.
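The integration described above is, at its core, a two-stage pipeline: the ASR engine produces a transcript, and the machine translation system renders it in the target language. The sketch below illustrates that chaining in the simplest possible form; all function names and data here are hypothetical stand-ins, not AppTek's actual API.

```python
# Minimal sketch of an ASR -> MT pipeline of the kind MediaSphere
# integrates. Everything here is a toy stand-in for illustration.

def transcribe(audio_segment: str) -> str:
    """Stand-in ASR step: map an audio segment ID to a transcript."""
    fake_asr_output = {
        "clip_001": "las noticias de hoy",
    }
    return fake_asr_output[audio_segment]

def translate(text: str, target_lang: str = "en") -> str:
    """Stand-in hybrid MT step: rule lookup with a generic fallback,
    mimicking how a hybrid system combines rules with a statistical
    backstop."""
    rules = {"las noticias de hoy": "today's news"}
    return rules.get(text, f"[{target_lang} translation of: {text}]")

def monitor(audio_segment: str) -> str:
    """Chain ASR and MT, as a media-monitoring pipeline would."""
    return translate(transcribe(audio_segment))

print(monitor("clip_001"))  # today's news
```

The value of a unified product is exactly this kind of seamless hand-off: the monitoring customer sees only the translated transcript, never the intermediate stages.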

East Kent cuts turnaround times with SRC

East Kent Hospitals University NHS Foundation Trust has been able to reduce cancer diagnosis times by deploying digital dictation with speech recognition from SRC.

The new technology is allowing pathologists to dictate results in real-time. As a result, histology reporting turnaround times have been cut from a week to often the same day.

Paul Williams, head of BMS cellular pathology at East Kent, said: “It has been invaluable in helping us to improve reporting turnaround times and eradicate typing backlogs in spite of increasing workloads. Another unexpected benefit has been freeing up secretarial staff to train in laboratory duties.”

Now search is just a call away

Google allows users to phone a toll-free number and make a query.

Sharad Nanda from Pune is a kebab (meat dish) fan. On a trip to Delhi, he wanted to savour some kebabs. He could have easily phoned his friends for the exact location of eateries. Instead, he took out his cellphone, dialed a toll-free number and asked for kebab joints in a specified location. In less than a minute, he was given a choice of three places. For that he can thank ‘voice search’, which internet search giant Google recently introduced in Hyderabad, Delhi, Mumbai and Bangalore.

Nu Echo Announces General Availability of NuGram IDE Basic Edition

NuGram IDE is a fully-integrated, Eclipse-based, grammar development environment. Used to create, debug, optimize and maintain static and dynamic grammars, it is one of the two components of the NuGram Platform, first presented at SpeechTEK 2008 in New York City as the first complete solution for speech grammar development and deployment.

Following a 6-month beta program involving several hundred users worldwide, Nu Echo is now making NuGram IDE Basic Edition generally available free of charge to the developer community. Designed to support the rigorous grammar development processes required in order to efficiently produce solid grammars, NuGram IDE Basic Edition:

  • Enables developers to author grammars in a single, concise, legible format, regardless of the target speech recognition engine.
  • Offers a grammar editor with several advanced features, including syntax coloring, content assist (code completion, quick fixes, code templates, and more), and sophisticated refactoring tools.
  • Provides powerful grammar analysis, visualization, and debugging tools.
  • Provides tools to test grammar coverage and semantic interpretation correctness.
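The last item, grammar coverage testing, amounts to checking whether each phrase in a test set can actually be generated by the grammar. The toy sketch below shows the idea; the grammar format here is a made-up miniature, not NuGram's actual ABNF/GrXML representation.

```python
# A minimal sketch of a grammar-coverage check, the kind of test a
# grammar IDE automates. The grammar format is a toy for illustration.
import itertools

# Toy command grammar: each slot lists its allowed words, and a valid
# utterance picks one word per slot, in order.
GRAMMAR = [
    ("call", "dial"),           # verb slot
    ("john", "mary", "sarah"),  # contact slot
]

def covered(utterance: str) -> bool:
    """True if the utterance is generated by the toy grammar."""
    words = tuple(utterance.lower().split())
    return words in set(itertools.product(*GRAMMAR))

# Run a small test set against the grammar.
for phrase in ["call john", "dial mary", "phone sarah"]:
    print(phrase, covered(phrase))
```

Here "phone sarah" fails coverage because "phone" is not in the verb slot, which is exactly the kind of gap a coverage report would flag for the grammar author.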

Talking while driving still not safe?

A study by Harvard and the University of Warwick examined the safety of talking while driving and found that people engaged in conversation drove much more poorly than those in other test conditions.

"The worst results came from the subjects tasked with listening to a list of words and then speaking new words that began with the same letters as each word on the list. Those "drivers" had a 480 millisecond delay, which at 60 miles per hour would mean 42.3 additional feet traveled before applying the brakes." 

This task is similar to using an in-vehicle system for command and control. The driver speaks to the system, waits for its response, and possibly speaks again. A mitigating factor is that speech typically offers shortcuts that activate functionality more quickly, reducing the time drivers spend interacting with the system.

In-vehicle systems are not totally hands-free, however; they are usually "push to talk," like a Nextel phone or a walkie-talkie. The driver is endpointing their own speech, making the system's job easier.

In-vehicle speech recognition is worth watching, and it may be safer than alternatives, but it still hasn't been shown to actually be safe.