WWW pages of 3rd European Master School on Language and Speech

Evaluating Prosody Prediction in Synthesis with respect to Modern Greek Prenuclear Accents

Elisabeth Chorianopoulou
(University of Edinburgh)

Improving the naturalness of synthetic speech is the focus of much recent research. Most modern state-of-the-art Text-to-Speech systems aim to produce a neutral prosody, which ensures intelligibility. Naturalness, on the other hand, appears to be much harder to achieve.

Previous work on intonational phonology by Arvaniti and Ladd (1999) has revealed that the Greek prenuclear accent consists of two independently aligned tonal targets. A low (L) tone is followed by a high (H) one, and both are anchored to specific points in the segmental string. Thus, the duration and the slope of the pitch movement depend entirely on the segmental composition of the accented word.
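The dependence of duration and slope on the two anchor points can be illustrated with a minimal sketch. It assumes each tonal target is described by a time and an F0 value; the names TonalTarget and rise_duration_and_slope, and all the numbers, are hypothetical and are not taken from the study. With the same L and H pitch levels, a word whose segments place the H target later yields a longer and shallower rise.

```python
from dataclasses import dataclass


@dataclass
class TonalTarget:
    """A tonal target anchored in the segmental string (hypothetical model)."""
    time_s: float  # time of the target, in seconds
    f0_hz: float   # pitch at the target, in Hz


def rise_duration_and_slope(low: TonalTarget, high: TonalTarget):
    """Return (duration in s, slope in Hz/s) of the rise from L to H."""
    duration = high.time_s - low.time_s
    if duration <= 0:
        raise ValueError("the H target must follow the L target")
    slope = (high.f0_hz - low.f0_hz) / duration
    return duration, slope


# Illustrative values only: identical pitch levels, different anchor times.
short_word = rise_duration_and_slope(TonalTarget(0.10, 120.0),
                                     TonalTarget(0.30, 180.0))
long_word = rise_duration_and_slope(TonalTarget(0.10, 120.0),
                                    TonalTarget(0.40, 180.0))

print(short_word)  # shorter, steeper rise
print(long_word)   # longer, shallower rise
```

Because the function takes only the anchor times and pitch values as input, shifting either anchor in time changes both outputs, which is the kind of alignment manipulation the project goes on to examine.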

This project examines whether changes in tonal alignment affect how natural synthetic speech sounds to native Greek speakers. If so, how are their judgements distributed, and how could we define a range within which the output of the synthesizer would sound natural to most listeners? Could determining such a range contribute to the design of more natural-sounding systems?