Edinburgh hosted the Listening Talker workshop on 2-3 May. Junichi Yamagishi gave an invited talk, "HMM-based speech synthesis adapted to listeners' and talkers' conditions".
Abstract: It is known that the intelligibility of state-of-the-art hidden Markov model (HMM) generated synthetic speech can be comparable to that of natural speech in clean environments. However, the situation is quite different when the listener's and/or talker's conditions differ. If the listener's environment is noisy, natural speech is usually still more intelligible than synthetic speech. If the talker's speech is disordered due to vocal disabilities such as neurological degenerative diseases, it may be unintelligible even in clean environments.
In this talk, we introduce our recent approaches to these problems. To improve the intelligibility of synthetic speech in noise, we have proposed two promising approaches, one based on statistical modelling and one on signal processing. In the statistical modelling approach, we use speech waveforms and articulatory movements recorded in parallel by electromagnetic articulography, and attempt to create hyper-articulated speech from normal speech by manipulating the articulatory movements predicted from the HMM. The signal processing approach is a new cepstral analysis and transformation method based on an objective intelligibility measure for speech in noise, the Glimpse Proportion measure. This method modifies the spectral envelope of the clean speech in order to increase its intelligibility in noise. Finally, we mention other work in which we create natural and intelligible synthetic voices even from the disordered, unintelligible speech of individuals suffering from motor neurone disease.
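The core idea behind the Glimpse Proportion measure is that speech is intelligible in noise when enough spectro-temporal regions ("glimpses") have a local speech-to-noise ratio above some threshold. The sketch below is only an illustration of that idea, not the published measure (which uses an auditory filterbank and calibrated thresholds rather than a plain STFT); the function name, frame sizes, and the 3 dB threshold are assumptions chosen for illustration.

```python
import numpy as np

def glimpse_proportion(speech, noise, frame_len=256, hop=128, threshold_db=3.0):
    """Toy glimpse-proportion sketch: the fraction of time-frequency cells
    in which the local speech power exceeds the local noise power by
    threshold_db. Illustrative only; the real measure operates on an
    auditory (gammatone) filterbank, not a raw STFT."""
    def stft_power(x):
        # Frame the signal, apply a Hann window, and take the power spectrum.
        window = np.hanning(frame_len)
        n_frames = 1 + (len(x) - frame_len) // hop
        frames = np.stack([x[i * hop : i * hop + frame_len] * window
                           for i in range(n_frames)])
        return np.abs(np.fft.rfft(frames, axis=1)) ** 2

    s_pow = stft_power(speech)
    n_pow = stft_power(noise)
    # Local SNR per time-frequency cell, in dB (epsilon avoids log of zero).
    local_snr_db = 10.0 * np.log10((s_pow + 1e-12) / (n_pow + 1e-12))
    # Proportion of cells that count as "glimpses".
    return float(np.mean(local_snr_db > threshold_db))
```

Under this view, the spectral modification described above can be seen as reshaping the clean-speech envelope so that more cells clear the threshold for a given noise, i.e. so that the glimpse proportion of the modified speech in that noise increases.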