Lyrix and Loquendo Partner for Improved Speech Recognition

In a bid to further improve the functionality of their Mobiso service, Lyrix, a provider of speech-enabled collaboration solutions for large and distributed enterprises, has partnered with speech technologies provider Loquendo. Lyrix’s speech-enhanced directories are used for Speech Attendants and Speech-assisted Mobile Address Books. The company’s PeopleFind platform also helps businesses with increasingly mobile workforces communicate more efficiently while reducing costs.

Report: iPhone 3.0 may include voice recognition, synthesis features

Rumors suggest that the iPhone may soon be more receptive to my pleas: Ars Technica reports that uncovered software frameworks in the iPhone 3.0 beta might represent speech recognition and synthesis systems.

Of course, this isn’t exactly out of left field. Both the latest iPod nano and iPod shuffle include speech synthesis capabilities, dubbed Spoken Menus in the 4G nano and VoiceOver in the iPod shuffle, that allow users to navigate the devices without having to look at the screen. In both cases it helps people who are visually impaired use the devices, but in the shuffle it’s also a necessity, since the device has no screen and a potentially confusing control scheme.

The iPhone and iPod touch are extremely difficult for visually impaired users to interact with, as they have little in the way of tactile cues or feedback. Voice recognition in particular has been a heavily requested feature, as it could improve not just accessibility, but everyday tasks such as dialing a number without having to look at the phone’s screen—handy for when you’re driving, for example.

Cordic Picks Loquendo for Speech Technologies

Fleet management solutions provider Cordic offers a highly innovative solution for taxi companies that automates taxi bookings. To voice-enable this offering, the company has chosen speech technologies provider Loquendo for its ASR and TTS technologies. Under the partnership with Loquendo, Cordic’s cPAQ taxi dispatch system now includes IVR capabilities so that customers can book jobs, check on the progress of their bookings, and get information without having to speak to an operator or staff member.

AppTek Bolsters Media Monitoring with Hybrid Machine Translation

In addition to providing dialect coverage for automated speech recognition, MediaSphere now offers customers AppTek’s hybrid machine translation (HMT) system.

MediaSphere is a software solution that offers multilingual transcripts of various television and radio stations for many domestic and international news bureaus. To adjust to changes in dialect and language in real-time, MediaSphere makes use of AppTek’s speaker adaptive speech recognition engine. Offering a unified and scalable solution, the updated media monitoring software seamlessly integrates AppTek’s HMT system with its ASR engine.
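The integration described above is, at its core, a two-stage pipeline: the ASR engine produces a transcript, and the machine translation system renders it in the target language. The sketch below illustrates that chaining in the simplest possible form; all function names and data here are hypothetical stand-ins, not AppTek's actual API.

```python
# Minimal sketch of an ASR -> MT pipeline of the kind MediaSphere
# integrates. Everything here is a toy stand-in for illustration.

def transcribe(audio_segment: str) -> str:
    """Stand-in ASR step: map an audio segment ID to a transcript."""
    fake_asr_output = {
        "clip_001": "las noticias de hoy",
    }
    return fake_asr_output[audio_segment]

def translate(text: str, target_lang: str = "en") -> str:
    """Stand-in hybrid MT step: rule lookup with a generic fallback,
    mimicking how a hybrid system combines rules with a statistical
    backstop."""
    rules = {"las noticias de hoy": "today's news"}
    return rules.get(text, f"[{target_lang} translation of: {text}]")

def monitor(audio_segment: str) -> str:
    """Chain ASR and MT, as a media-monitoring pipeline would."""
    return translate(transcribe(audio_segment))

print(monitor("clip_001"))  # today's news
```

The value of a unified product is exactly this kind of seamless hand-off: the monitoring customer sees only the translated transcript, never the intermediate stages.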

East Kent cuts turnaround times with SRC

East Kent Hospitals University NHS Foundation Trust has been able to reduce cancer diagnosis times by deploying digital dictation with speech recognition from SRC.

The new technology is allowing pathologists to dictate results in real-time. As a result, histology reporting turnaround times have been cut from a week to often the same day.

Paul Williams, head of BMS cellular pathology at East Kent, said: “It has been invaluable in helping us to improve reporting turnaround times and eradicate typing backlogs in spite of increasing workloads. Another unexpected benefit has been freeing up secretarial staff to train in laboratory duties.”

Now search is just a call away

Google allows users to phone a toll-free number and make a query.

Sharad Nanda from Pune is a kebab (meat dish) fan. On a trip to Delhi, he wanted to savour some kebabs. He could have easily phoned his friends for the exact location of eateries. Instead, he took out his cellphone, dialed a toll-free number and asked for kebab joints in a specified location. In less than a minute, he was given a choice of three places. For that he can thank ‘voice search’, which internet search giant Google recently introduced in Hyderabad, Delhi, Mumbai and Bangalore.

Nu Echo Announces General Availability of NuGram IDE Basic Edition

NuGram IDE is a fully-integrated, Eclipse-based, grammar development environment. Used to create, debug, optimize and maintain static and dynamic grammars, it is one of the two components of the NuGram Platform, first presented at SpeechTEK 2008 in New York City as the first complete solution for speech grammar development and deployment.

Following a 6-month beta program involving several hundred users worldwide, Nu Echo is now making NuGram IDE Basic Edition generally available free of charge to the developer community. Designed to support the rigorous grammar development processes required in order to efficiently produce solid grammars, NuGram IDE Basic Edition:

  • Enables developers to author grammars in a single, concise, legible format, regardless of the target speech recognition engine.
  • Offers a grammar editor with several advanced features, including syntax coloring, content assist (code completion, quick fixes, code templates, and more), and sophisticated refactoring tools.
  • Provides powerful grammar analysis, visualization, and debugging tools.
  • Provides tools to test grammar coverage and semantic interpretation correctness.
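The last item, grammar coverage testing, amounts to checking whether each phrase in a test set can actually be generated by the grammar. The toy sketch below shows the idea; the grammar format here is a made-up miniature, not NuGram's actual ABNF/GrXML representation.

```python
# A minimal sketch of a grammar-coverage check, the kind of test a
# grammar IDE automates. The grammar format is a toy for illustration.
import itertools

# Toy command grammar: each slot lists its allowed words, and a valid
# utterance picks one word per slot, in order.
GRAMMAR = [
    ("call", "dial"),           # verb slot
    ("john", "mary", "sarah"),  # contact slot
]

def covered(utterance: str) -> bool:
    """True if the utterance is generated by the toy grammar."""
    words = tuple(utterance.lower().split())
    return words in set(itertools.product(*GRAMMAR))

# Run a small test set against the grammar.
for phrase in ["call john", "dial mary", "phone sarah"]:
    print(phrase, covered(phrase))
```

Here "phone sarah" fails coverage because "phone" is not in the verb slot, which is exactly the kind of gap a coverage report would flag for the grammar author.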

Talking while driving still not safe?

A study by Harvard and the University of Warwick examined the safety of talking while driving and found that people engaged in conversation drove much more poorly than those in other test conditions.

"The worst results came from the subjects tasked with listening to a list of words and then speaking new words that began with the same letters as each word on the list. Those "drivers" had a 480 millisecond delay, which at 60 miles per hour would mean 42.3 additional feet traveled before applying the brakes." 

This task is similar to using an in-vehicle system for command and control. The driver speaks to the system, waits for its response, and possibly speaks again. A mitigating factor is that speech typically offers shortcuts that activate functionality more quickly, reducing the time drivers spend interacting with the system.

In-vehicle systems are not totally hands-free, however; they are usually "push to talk," like a Nextel phone or a walkie-talkie. The driver is endpointing their own speech, making the system's job easier.

In-vehicle speech recognition is worth watching, and it may be safer than alternatives, but it still hasn't been shown to actually be safe.