www.getdesign.in - My periodic blog exploring the world of business, experience design and interaction, with a smattering of gadgetry and social media. A world where business, people and technology meet.

Let's Fix Things: For over two decades I've been consulting in Communications Design: Everything from business strategy and processes, through to technology, interaction and customer experience. The thoughts here are my own, not necessarily those of my employer.

I have a penchant for spotting patterns and fixing broken user and customer experiences. Even my Bumblebee project hasn't escaped - I've been using Six Sigma techniques to study and predict the bees' behaviour patterns. ☺

July 03, 2008

Rest those tired hands. Voice-recognition software finally works.

July 03, 2008/ Nik Sargent

For decades, computer scientists have dreamed of computers that respond to human voice. But until recently speech-recognition systems could be a nightmare. New users had to recite long scripts to train the software to the peculiarities of their voices, and the software's translations could still be as mistake-prone as a first-year foreign-language student. But lately the technology has improved dramatically. Last summer Nuance Corp., the industry's big player, released a new version of Dragon that's winning raves. This year Microsoft included a voice-recognition feature in its new Vista operating system and dropped a reported $800 million to acquire a speech-software start-up called Tellme. Nuance and other companies—including Google—are working on systems that allow voice to replace the frenzied pecking on BlackBerrys and other mobile devices. "The technology has kind of snuck up on everyone," says Bill Meisel, publisher of Speech Strategy News.
PC-based voice recognition is different from the "call center" systems you encounter when calling banks or airlines. Telephone systems recognize only simple vocabularies and are designed to work with any voice. In contrast, PC-based systems adapt to a single user's speech, gaining accuracy over time. Nuance cites several reasons the software has improved lately. As more Dragon users began to have broadband connections, the company started remotely collecting data on the particular words and phrases that Dragon screwed up, allowing researchers to tweak their black-box algorithms to better target trouble spots. [click heading for more]

July 01, 2008

OnMobile acquires Telisma

July 01, 2008/ Nik Sargent

OnMobile, India's Telecom Value Added Services (VAS) provider, today announced that it has acquired 100% of the leading European Speech Recognition company, telisma.
The addition of telisma’s standards-compliant speech recognition products and expertise will enable OnMobile to accelerate its penetration into fast-growing emerging markets by developing new speech recognition language models. This technology enables quick and easy access to mobile applications and content and also strengthens OnMobile’s mobile applications product suite. [click heading for more]

June 29, 2008

Healthcare driving speech recognition technology growth

June 29, 2008/ Nik Sargent

The automation of healthcare processes is the main driving force behind the growth of speech recognition technology, according to a report released by Datamonitor this week.
Healthcare currently represents 85% of the market for PC- and server-based speech recognition technologies.
“Patient information is gradually becoming digitised in order to address issues with delivering records and test results faster,” said Aphrodite Brinsmead, analyst at Datamonitor and author of the report. Speech recognition is also being used for medical transcription, easing pressure on transcriptionists and allowing healthcare providers to save on staffing costs. Medical transcription is estimated to be a multi-billion dollar market and speech recognition vendors are taking advantage of this. [click heading for more]

June 27, 2008

RadiSys Announces New Speech Capabilities for Convedia Media Server Family

June 27, 2008/ Nik Sargent

RadiSys® Corporation (NASDAQ: RSYS) today announced that its market-leading Convedia® media server family now supports automatic speech recognition (ASR), or converting human speech to computer data, and text-to-speech (TTS) capabilities in multiple languages for IP contact center application developers and service providers. The company’s integration of a standards-based Media Resource Control Protocol (MRCP) with leading speech servers results in better resource utilization and economics, less equipment to procure and manage in large deployments, and improved scalability. [click heading for more]

June 26, 2008

Speech-Recognition Company "Travelling Wave" Goes After Small, Crowded Niche

June 26, 2008/ Nik Sargent

Every once in a while you find a company that has the odds stacked against it. TravellingWave, which occupies a three-room office just outside downtown Seattle, has just five employees, including the founders, and is in the highly competitive space of trying to figure out how to use voice recognition, which has historically been plagued by inaccuracies, as a way to input information into small devices. To boot, the company's list of competitors includes not only giants such as Microsoft (NSDQ: MSFT) and IBM, but also public companies like Nuance Communications. Founder and CEO Ashwin Rao said it perfectly: "We are in a small niche that's a crowded market. It's dumb unless we have a solid differentiator."
The company has one official year under its belt and a couple of unofficial ones, and Rao gave me a sneak peek of an announcement it plans to release today that may just be the thing that puts the company on the map. To date, its big differentiator has been combining voice recognition with some texting. For instance, when sending an SMS, a user would first speak the word "hello," and then hit the "h" key, which would bump up the software's accuracy. If a person came across a word that proved more difficult, they would keep typing letters of the word until it was recognized. With one letter, the accuracy increases to 90 percent; with two letters, 95 percent; and with three, 99 percent, the company claims. In addition, it says people input text three times faster with a quarter of the key presses. [click heading for more]
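[nik's note: a rough sketch of the "speak, then type a letter or two" idea described above. This is not TravellingWave's actual algorithm, which the article doesn't detail. It assumes the recogniser returns a list of candidate words with confidence scores; the function name `rerank` and the sample scores are made up for illustration.]

```python
def rerank(hypotheses, typed_prefix):
    """Keep only the recogniser's candidates that start with the typed
    letters, ordered from highest to lowest confidence score."""
    prefix = typed_prefix.lower()
    matches = [(word, score) for word, score in hypotheses
               if word.lower().startswith(prefix)]
    return sorted(matches, key=lambda pair: pair[1], reverse=True)


# Hypothetical candidate list for the spoken word "hello" (invented scores).
n_best = [("yellow", 0.41), ("hello", 0.38), ("hollow", 0.12), ("mellow", 0.09)]

for typed in ["", "h", "he"]:
    candidates = [word for word, _ in rerank(n_best, typed)]
    print(repr(typed), candidates)
```

Each extra keystroke shrinks the candidate list, which is one plausible reason accuracy would climb as more letters are typed.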

June 26, 2008

BroadSoft Unveils RESTful APIs for Carrier-Grade Voice Application Mashups

June 26, 2008/ Nik Sargent

[nik's note: is this the beginning of the end for voicexml ALREADY?? :-) ]

BroadSoft, Inc. today announced the availability of its Xtended Services Interface (Xsi), new RESTful application programming interfaces (APIs) that will allow Web developers to integrate BroadSoft's carrier-grade voice applications with unified communications solutions and Web-based business and consumer applications, such as Salesforce.com and Facebook.
The Xsi is the latest component to be announced as part of the BroadSoft® Xtended Program, BroadSoft's initiative for the creation of mashups that integrate BroadSoft's BroadWorks® VoIP platform with other applications that are already being used by millions. The RESTful-based Xsi allows subscriber and call resources to be accessed and used via HTTP and simple XML. This approach requires less client-side software to be written than other approaches and is becoming the overwhelming choice for developers to create Web applications. [click heading for more]

June 26, 2008

Why We Won't Have Fully Conversational Robots

June 26, 2008/ Nik Sargent

[nik's note: do you agree with the hypothesis in this article? Can we never replicate human capability at speech? ]

John Seabrook wrote a recent feature in The New Yorker about interactive-voice-response systems (I.V.R.) commonly used with customer service and tech support telephone hotlines. Seabrook spent time at B.B.N. Technologies watching these systems transcribe callers' words and analyze the tone of voice for emotions present. While breaking down the history of automated telephone services and voice recognition innovations, he attempts to tackle the larger question of whether or not we can create a fully conversational, quasi-conscious robot, akin to 2001: A Space Odyssey's HAL 9000. Judging from the number of experts interviewed for the piece, the answer is a resounding no. [click heading for more]

June 26, 2008

Vlingo's Speech Recognition Features Come to BlackBerry Devices

June 26, 2008/ Nik Sargent

One of the world’s most popular mobile devices is getting an upgrade today as it integrates a broad suite of speech recognition functions developed by a Cambridge, Massachusetts-based company.
Starting today, Research In Motion’s BlackBerry devices are integrating with a voice-powered interface from vlingo, a technology that company officials say unlocks access to mobile phone wireless data services. [click heading for more]

June 26, 2008

Translation Systems Speak Up

June 26, 2008/ Nik Sargent

Wars often boost technological development. In Iraq the armed forces have faced a shortage of translators, both from within their own ranks and from bilingual locals whose lives can be put in peril if they are found to be working for the foreigners. This has created a demand for machines that can translate between Arabic and English. Although some experimental devices have proved unreliable, they are now improving.
A number of two-way translating devices have been under development as part of the Spoken Language Communication and Translation System for Tactical Use (TRANSTAC) programme run by the Defence Advanced Research Projects Agency, known as DARPA. There are three main participants: IBM, BBN Technologies and SRI International.
SRI said recently that it had sold 150 machines to the American government for use in Iraq. IBM has provided troops with 1,000 of its devices which run MASTOR, its multilingual automatic speech translator. Both systems can translate tens of thousands of words between Iraqi Arabic and American English, even when people are speaking outside the laboratory. [click heading for more]

June 22, 2008

GOOG-411 arrives in Canada

June 22, 2008/ Nik Sargent

Search engine giant Google has launched its GOOG-411 service in Canada.
Until now, the service was available only in the US market.
GOOG-411 is Google’s voice-recognition local search phone service which enables anyone with a telephone to dial in and ask for nearby destinations. [click heading for more]
