The Centre for Speech Technology Research, The University of Edinburgh

30 Oct 2001

Madoka Tsuchiya, Prof. D. Robert Ladd & Dr. Mits Ota


Evidence for effects of prosodic structure on mora duration in Japanese

The conventional wisdom, based on a paper by Port, Dalby and O'Dell (JASA, 1986), is that the mora can be treated as a basic unit of utterance timing in Japanese: the duration of an utterance is a linear function of the number of moras it contains. However, even Port et al. acknowledge that the linear function may mask subtler effects. The experiments reported here investigated whether the supposed organisation of Japanese prosody in terms of bimoraic feet (e.g. Poser 1990, Ito 1990) is reflected in mora duration.
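The linear timing claim can be made concrete with a minimal sketch. The code below fits utterance duration as a linear function of mora count by ordinary least squares. The mora counts and durations are hypothetical values invented for illustration; they are not data from Port et al. or from the present study, and the function name `fit_line` is likewise an assumption.

```python
def fit_line(xs, ys):
    """Ordinary least-squares fit of ys = intercept + slope * xs."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Sum of squared deviations of x, and of cross-products of x and y
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return intercept, slope


# Hypothetical mora counts and utterance durations in ms (NOT measured data)
mora_counts = [2, 3, 4, 5, 6]
durations_ms = [300.0, 450.0, 600.0, 750.0, 900.0]

intercept, slope = fit_line(mora_counts, durations_ms)
# slope estimates the duration added per mora; intercept is the fitted offset
print(f"duration = {intercept:.1f} + {slope:.1f} * moras (ms)")
```

Under such a model, effects of the kind investigated here would show up as systematic patterns in the residuals of individual moras rather than in the overall fit.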

We measured mora durations in 4-mora surnames (e.g. Yamamoto) embedded in carrier sentences. (Methodological details will be reported in the talk.) There was clear evidence that foot-final moras are longer than foot-initial moras, and that foot-final moras in word-final feet are longer than those in word-initial feet. However, there was also clear evidence of an interaction whereby moras in absolute word-initial position are longer than others. Closer investigation suggested that this word-initial effect is due to lengthening of the word-initial consonant (cf. the work on word-initial strengthening by Keating et al., 1998, LabPhon 6). It can therefore be separated out in a model from the foot-based effects, which seem to be genuine.

A possible confound in 4-mora names such as Yamamoto is that foot structure is isomorphic with morpheme structure (yama+moto). Three control studies explored whether the effects discovered in the main experiment were better explained in terms of foot structure or morpheme structure. The findings of the control studies can be summarised as follows: foot structure has a more robust and consistent effect than morpheme structure, although we cannot entirely rule out a possible effect of the latter.

It should be noted that all the effects we are dealing with here, though quite consistent, are extremely small: the extra duration exhibited by foot-final moras is on average no more than 10 ms in a total duration of 130-160 ms. This raises a lot of questions about perceptibility, naturalness of synthetic speech, etc., which we will probably not discuss in the talk.
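To give a sense of scale, the short calculation below expresses the 10 ms upper bound as a percentage of the two ends of the quoted total-duration range. It uses only the figures given above; the function name `relative_lengthening` is a hypothetical helper, and the result is an upper bound because 10 ms is itself stated as a maximum average.

```python
def relative_lengthening(extra_ms, total_ms):
    """Return the extra duration as a percentage of the total duration."""
    return 100.0 * extra_ms / total_ms


# Figures quoted in the text: at most 10 ms extra in a total of 130-160 ms
at_shortest = relative_lengthening(10, 130)  # roughly 7.7 percent
at_longest = relative_lengthening(10, 160)   # 6.25 percent

print(f"{at_longest:.2f}% to {at_shortest:.2f}%")
```

On these figures the foot-final lengthening amounts to no more than about 6 to 8 percent of the total duration.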

