Voice Recognition Software Helps Florida Caseworkers Work Faster


A solution is emerging to make reporting more efficient. Since July 2008 the department has been deploying voice recognition technology designed to let fieldworkers dictate their notes while in the field. Software converts the dictation into typed copy, letting investigators spend more time on the road. Once back in the office, the fieldworker plugs the dictation device into a PC and gets a printed report.

Time to reappraise speech recognition systems?

One April, 11 and a half years ago, Hollywood actor Richard Dreyfuss presented a new type of software that was going to 'revolutionise business'. He had been paid to host the launch of Dragon's NaturallySpeaking application, which could faultlessly translate spoken words into text. If this worked, we could chuck away our keyboards. Productivity would multiply. Dragon would become the new Microsoft and a new era of IT would dawn.


And work it did too -- in the demonstration. But not everything about the event was quite so well stage-managed. New York was suffering its worst-ever blizzard and few made it through the snow. One year later, founders Janet and Jim Baker hadn't found the mass market they may have anticipated. That year, a Belgian firm called Lernout & Hauspie introduced Voice Xpress, another desktop speech software product that could potentially free us all from the tyranny of crouching over a keyboard, ruining our posture and giving ourselves RSI. In a demo, it even outperformed the world's fastest typist.

So why aren't we using this software on every computer in the land? Why aren't we talking to computers, telling them what we want to do? How come Windows and Mac OS have remained the user interfaces of choice, when voice commands would be so much more efficient and user friendly? Especially as speech recognition has become part of so many phone calls to buy tickets, report meter readings and query bills?

Loquendo Speech Engines Featured in Interact's VIP V6 Platform

In a bid to make it even easier for customers to create advanced speech-enabled applications, Interact Incorporated has announced a partnership with Loquendo to include Loquendo's speech engines on the company's new Voice and Information Processing (VIP) V6 platform.

 
Users have Loquendo TTS and Loquendo ASR technologies right at their fingertips thanks to the Loquendo MRCP Server, so they can create high-quality speech solutions and save on costs.

We can give you the wrong answer much faster

PROBABLY the CTO of every large technology company has to be a futurist. But it's a rare CTO who speaks at the Singularity Summit to consider the prospects for an artificial general intelligence surpassing humans. Intel's CTO, Justin Rattner, did just that, laying out the future of Moore's Law to a packed auditorium for whom computational speed is a near-religious experience.

Yet Rattner says afterwards that raw speed won't be enough. "I once asked our speech recognition team if there was any direct relationship between machine computing speed and recognition accuracy and after a long pause, they said – because they knew I was not going to be happy with the answer – no." He asked why: "Our recognition performance is limited by our algorithmic understanding, not by our instruction speed. We can give you the wrong answer much faster, but we can't give you the right answer much faster."
Speech recognition is, of course, just one of many tasks even a very young human can do routinely and simultaneously.
"It's clearly a case where, until we have the right algorithms, no amount of performance improvement is going to give us the recognition performance a young child can deliver. That's why I try to separate out these notions a little. I have little doubt that when we figure it out we'll require lots of computing power, so there's no sense in abandoning it." 

North American Directory Assistance Is "Best of the Best."

Over the past eighteen months, user-paid directory assistance (DA) providers performed at levels that are almost as high as statistically possible. This makes the United States' DA service "the best of the best."

There are three components that drive the accuracy of DA: the automated front-end systems, the operators and the databases. According to the Fall 2008 National Directory Assistance Performance Index℠, an independent analysis published semi-annually by The Paisley Group, Ltd. (PGL), automated systems are performing at 98.7% accuracy, operators at 99.0% accuracy and databases at 95.7% accuracy. This results in 94.3% of all calls being handled accurately. The margin of error is +/- 2.6%.
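
As a rough sanity check on those figures, the three component accuracies can be multiplied together. This is only a sketch: it assumes every call passes through all three components and that their errors are independent, neither of which the excerpt states, and the variable names are illustrative. PGL's own method for arriving at 94.3% is not described here.

```python
# Component accuracies quoted from the PGL index, as fractions.
automated = 0.987
operators = 0.990
databases = 0.957

# Figures reported in the excerpt.
reported_overall = 0.943
margin_of_error = 0.026

# Assumption (not from the report): each call touches all three components
# and their errors are independent, so the accuracies simply multiply.
naive_overall = automated * operators * databases

gap = reported_overall - naive_overall

print(f"Naive combined accuracy: {naive_overall:.1%}")
print(f"Reported overall accuracy: {reported_overall:.1%}")
print(f"Gap within margin of error: {abs(gap) <= margin_of_error}")
```

Under these assumptions the product comes to about 93.5%, a little below the reported 94.3% but well inside the stated +/- 2.6% margin, which suggests the published figure was measured or weighted differently rather than derived by simple multiplication.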

Nexidia Speech Recognition Technology to Assist with Enhanced Video Search


Nexidia, provider of rich media search and speech analytics software, announced today that it has been chosen by Thought Equity Motion to augment Thought Equity's online content tagging capabilities, enabling an enhanced, more effective video search experience for Thought Equity Motion clients.
Thought Equity Motion has more content under management than anyone in the world, so making that content accessible quickly and accurately is imperative. Thought Equity Motion already sets the industry standard in text and context driven metadata; adding phonetic-based search capabilities adds significant value for customers looking for spoken-word content.

Who Says The Government Can't Read Your Mind?

[nik's note: not the first time I have seen this story and I have to say, the headlines are more enticing than the actual reality, but still - interesting developments... ]

A new study published in Science has found a neural fingerprint for speech recognition, from which researchers can determine not only what was said, but who said it.
The study used a combination of functional magnetic resonance imaging (fMRI) and a data-mining algorithm to take a look at the neural response to speech. Seven study subjects listened to three different speech sounds (the vowels /a/, /i/ and /u/) spoken by three different people. The neural patterns contained both pieces of information: which sound was spoken and who spoke it. The 'neural fingerprint' reflected the sound being made, and yet there were specific 'speaker's fingerprints' which were maintained between the different sounds.

Voice dictation tools helpful, but still have kinks

[nik's note: some opinion - not mine]


“If you simply ran a stopwatch and compared how long it took to dictate and correct a document versus simply typing it, voice recognition doesn’t seem much faster and, indeed, is sometimes slower,” wrote Kennerly of the Philadelphia, Pa.-based Beasley Firm LLC in an e-mail. “The critical difference is fatigue. After I type a document, I usually feel tired and unwilling to move on to my next task. Voice recognition software dramatically reduces that fatigue.

“The difference is often greatest at the end of the day. Instead of leaving your office with pain in your hands, wrists, and forearms, you leave feeling productive and ready to go back the next day.” 

Future Phones to Read Your Voice, Gestures

Buttons are on their way out.

Five years from now, it is likely that the mobile phone you will be holding will be a smooth, sleek brick — a piece of metal and plastic with a few grooves in it and little more.
Like the iPhone, it will be mostly display; unlike the iPhone, it will respond to voice commands and gestures as well as touch.
"So much of how we understand technology is visually driven," says Rachel Hinman, a strategist with Adaptive Path, a user-experience and design-consulting firm. "Mobile interface design has to mimic the touch, sight, gesture and auditory feeds that we use to interact with our environment."
That means speaking to your phone rather than typing, pointing with your finger instead of clicking on buttons, and gesturing instead of touching. You could listen to music, access the internet, use the camera and shop for gadgets by just telling your phone what you want to do, by waving your fingers at it, or by aiming its camera at an object you're interested in buying. 