[nik's note: not about speech technology per se, but an interesting insight into human language acquisition]
People comprehend their native language with great speed and accuracy, and without visible effort. Indeed, our ability to perform linguistic computations is remarkable, especially when compared with other cognitive domains in which our computational abilities may be rather modest. My work deals with one aspect of language processing, namely, the identification of sounds, which is needed for subsequent word recognition. Sound recognition is a complex task, because the same sounds may be spoken differently depending on the speaker’s sex, age, pitch of the voice or mood. In addition, people may whisper or shout, be in a quiet room or a noisy street. All of these, and many other factors, lead to huge variation in individual acoustic instances of the same sound. It is precisely this acoustic variation that for decades has caused problems for computational linguists and speech engineers building automatic speech recognition systems. Humans, however, even five-year-olds, can successfully recognise sounds and words and understand what other people say almost instantly.
So what allows humans to be so efficient at sound recognition, and how does that impact on our ability to learn a new language?