How Things Work: Speech Recognition

“Any sufficiently advanced technology is indistinguishable from magic,” said Arthur C. Clarke, renowned science fiction author of the 20th century — and even today, we see the truth of this statement everywhere before our eyes.
“Open Sesame!” chanted Ali Baba to open the treasure cavern of the Forty Thieves. Today, we deal with much the same in our daily lives: from phones that dial numbers at the sound of a name to automated voice menus in various phone services, voice recognition is a little bit of magic that technology has introduced into our lives.
Voice recognition, however, is a far trickier business than it sounds. If even real-life, human students have trouble understanding professors’ thick accents, what hope do machines have in this regard?

What is VoiceXML?

VoiceXML (VXML) is the W3C's standard XML format for specifying interactive voice dialogues between a human and a computer. It is fully analogous to HTML, and brings the same advantages of web application development and deployment to voice applications that HTML brings to visual applications. Just as HTML documents are interpreted by a visual web browser, VoiceXML documents are interpreted by a voice browser. A common architecture is to deploy banks of voice browsers attached to the public switched telephone network (PSTN) so that users can simply pick up a phone to interact with voice applications.
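To make the HTML analogy concrete, here is a minimal sketch of what a VoiceXML document can look like, held in a Python string and parsed with the standard library. The form id and the prompt wording are placeholders invented for this illustration, and the helper `prompts_in` is hypothetical; a real voice browser would speak the prompt rather than return it as text.

```python
import xml.etree.ElementTree as ET

# A minimal VoiceXML 2.0 document: one form containing one spoken prompt.
# The form id and prompt text are placeholders chosen for this illustration.
VXML_DOCUMENT = """<?xml version="1.0" encoding="UTF-8"?>
<vxml version="2.0" xmlns="http://www.w3.org/2001/vxml">
  <form id="greeting">
    <block>
      <prompt>Hello, and welcome to the voice application.</prompt>
    </block>
  </form>
</vxml>
"""

# Namespace prefix in the form ElementTree uses for qualified tag names.
VXML_NS = "{http://www.w3.org/2001/vxml}"


def prompts_in(document):
    """Return the text of every <prompt> element in a VoiceXML document."""
    root = ET.fromstring(document.encode("utf-8"))
    return [p.text.strip() for p in root.iter(VXML_NS + "prompt") if p.text]


if __name__ == "__main__":
    # Stand-in for a voice browser: list what would be spoken to the caller.
    for line in prompts_in(VXML_DOCUMENT):
        print(line)
```

Much as a web server returns HTML to a visual browser, a server would return a document like this to a voice browser, which then plays the prompt to the caller over the phone.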

Lightning-speed demonstration of building a speech recognition application with a service creation tool


This quick demonstration shows the main concepts of the Vicorp xMP Director service creation tool by building a very simple call flow, which includes an agent transfer and database look-up. No coding is required, and prompts can be configured at a later date (by the VUI designer) in the xMP Studio tool.


Do we need natural language?

So, do we need natural language? If speech recognition is a tool, like a keyboard, and if we can build useful applications based on the recognition of a few words, why do we need sophisticated natural language understanding? The reason is that it is not always possible to get away with keywords.
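The limits of keywords can be sketched with a toy keyword spotter. Everything here is an assumption made for illustration: the banking intents, the keyword table, and the function `spot_intent` are hypothetical and do not come from the original post.

```python
from typing import Optional

# Hypothetical keyword-to-intent table for a banking voice menu.
KEYWORD_INTENTS = {
    "balance": "check_balance",
    "transfer": "transfer_funds",
    "agent": "speak_to_agent",
}


def spot_intent(utterance: str) -> Optional[str]:
    """Return the intent of the first known keyword in the utterance, if any."""
    for word in utterance.lower().split():
        # Strip trailing punctuation so "balance," still matches "balance".
        intent = KEYWORD_INTENTS.get(word.strip(".,!?"))
        if intent is not None:
            return intent
    return None


if __name__ == "__main__":
    # A short command is handled correctly.
    print(spot_intent("Check my balance please"))
    # A negated keyword is still matched, so the caller is misrouted.
    print(spot_intent("I do not want my balance, I want to transfer money"))
    # A request phrased without any keyword is not recognised at all.
    print(spot_intent("How much money do I have?"))
```

The first call routes correctly, the second picks `check_balance` even though the caller asked for a transfer, and the third finds nothing. Handling the last two cases takes some understanding of the sentence, not just word spotting.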

Solving your Outlook contact woes

This post isn't strictly on topic for the blog, but it's my blog so I can break my own rules. Anyway - I figured this was too useful not to share!

For the last month or two I've been plagued by my Outlook contacts folder containing incomplete records, duplicates, missing email addresses, and even as many as four versions of the same contact. This has been a real headache, and with the number of contacts running to many hundreds, fixing it by hand has been almost impossible.

The situation arose due to my change of employer. I exported my old contacts from Outlook and re-imported them into my new instance, but even this process didn't work properly: some fields got lost, others mangled. It really was disappointing. On top of that, I hadn't realised that contacts I'd added from my old employer's Global Address List simply did not resolve to a set of complete details (most notably, email addresses were missing).

To compound this, I then had the bright idea of trying to synchronise contacts across all my devices and domains - mobile phone, PDA, personal PC (using Outlook Express) - with PLAXO as the 'glue'. Sure, I got all my contacts in one place, but it created a mess.

For over a month I've been occasionally hand-fixing the odd record here and there, and looking on the web to find something to make the job easy. Amazingly, Microsoft has no such tool to help - there is not even a "find duplicates" function in Outlook, which alone would be helpful.

Then I found "duplicate killer" by 4team.

This wonderful bit of software is more powerful than its name suggests. As well as being able to identify and "kill" duplicates, perhaps more usefully it is able to merge records. You can do this fully automatically (identifying duplicates any way you choose, e.g. by names, by email etc.) or you can work through the gathered list of duplicates step-by-step, manually changing the fixes that the tool suggests. I have to say, so far I have found I am only having to change about 5% of the suggestions - but being the cautious type I am sticking with the manual preview.

The tool can also do similar tricks on your inbox and calendar - perhaps not something you would use quite as often, although nonetheless still useful (I do sometimes end up with duplicate appointments where both I and someone else create an appointment for the same meeting).

I thoroughly recommend "duplicate killer" if your Outlook contact database needs a good spring clean.

Voice enabling XML, Part 1: Develop a voice-enabled RSS reader

RSS is a hot topic these days, as it provides an easy way to stream data online. This article, the first of a four-part series on developing VoiceXML applications, shows you how to develop a voice-enabled RSS reader. The input to the application is RSS data, and the output is VoiceXML that can be read and spoken by your favorite compatible voice application.
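The RSS-in, VoiceXML-out idea can be sketched in a few lines of Python. This is not the method used in the article series, which is not reproduced here; the sample feed, the form id, and the function `rss_to_vxml` are all assumptions made for this illustration.

```python
import xml.etree.ElementTree as ET

# Hypothetical RSS 2.0 feed used as sample input.
SAMPLE_RSS = """<rss version="2.0">
  <channel>
    <title>Example Feed</title>
    <item><title>First headline</title><description>First summary.</description></item>
    <item><title>Second headline</title><description>Second summary.</description></item>
  </channel>
</rss>"""


def rss_to_vxml(rss_text: str) -> str:
    """Turn an RSS 2.0 feed into a VoiceXML document with one prompt per item."""
    channel = ET.fromstring(rss_text).find("channel")
    vxml = ET.Element(
        "vxml", {"version": "2.0", "xmlns": "http://www.w3.org/2001/vxml"}
    )
    form = ET.SubElement(vxml, "form", {"id": "headlines"})
    block = ET.SubElement(form, "block")

    # Announce the feed first, then read each item's title and description.
    intro = ET.SubElement(block, "prompt")
    intro.text = "Headlines from " + channel.findtext("title", default="the feed")
    for item in channel.iter("item"):
        prompt = ET.SubElement(block, "prompt")
        title = item.findtext("title", default="")
        summary = item.findtext("description", default="")
        prompt.text = f"{title}. {summary}"

    return ET.tostring(vxml, encoding="unicode")


if __name__ == "__main__":
    print(rss_to_vxml(SAMPLE_RSS))
```

The sketch reads only the channel title and each item's title and description. A fuller reader would presumably also let the caller skip or repeat items by voice, which needs VoiceXML fields and grammars rather than plain prompts.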