The Centre for Speech Technology Research, The University of Edinburgh

PWorkshop Archives: Spring Term 2004

20 Jan 2004

Alice Turk, Satsuki Nakai and Sari Kunnari (University of Oulu)

Stages of prosodic and segmental processing in speech perception

Current models of human speech recognition such as TRACE and Merge successfully explain word recognition in connected speech, even when obvious prosodic cues to word boundaries are absent, through the activation of, and competition between, lexical candidates. Yet we know little about how and when prosodic cues to syntactic boundaries are processed, despite evidence from speech perception studies that such prosodic information is indeed harnessed by the listener in parsing speech.

We propose to use a Garner task to test the relative timing of prosodic (phrase boundary and prominence) vs. segmental (phonemic) processing. If the dependency relationship between the two processes is unidirectional (one process depends on the other, but the latter is independent of the former), this would suggest that the dependent process takes place after the independent one. For instance, the view that prosodic and detailed segmental processing proceeds in a bottom-up fashion, as suggested by Pierrehumbert's Fast Phonological Processor, predicts that segmental processing is dependent on prosodic processing, but not vice versa.
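(For readers unfamiliar with the paradigm, the Python sketch below illustrates the inference logic only. The condition labels and reaction times are invented for illustration and are not data or analysis code from this study; interference from orthogonal variation in one dimension, but not in the other, would point to a unidirectional dependency.)

# Hypothetical illustration of the Garner-task logic described above.
# All condition names and reaction times are invented for the example.

from statistics import mean

# Mean reaction times (ms) per condition: subjects classify one dimension
# (segmental identity or prosodic boundary/prominence) while the other,
# irrelevant dimension is either held constant ("baseline") or varies
# freely ("orthogonal").
rt = {
    ("segmental", "baseline"):   [512, 498, 520, 505],
    ("segmental", "orthogonal"): [561, 570, 548, 555],  # slower -> interference
    ("prosodic",  "baseline"):   [601, 590, 612, 598],
    ("prosodic",  "orthogonal"): [604, 593, 610, 600],  # ~equal -> no interference
}

def interference(task: str) -> float:
    """Garner interference: the RT cost of orthogonal variation in the
    irrelevant dimension, relative to the baseline condition."""
    return mean(rt[(task, "orthogonal")]) - mean(rt[(task, "baseline")])

for task in ("segmental", "prosodic"):
    print(f"{task:9s} classification: interference = {interference(task):+.1f} ms")

# A pattern of interference on segmental but not prosodic classification
# would suggest a unidirectional dependency: segmental processing draws on
# the outcome of prosodic processing, consistent with a bottom-up account.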

We compare the dependency relationship between the two processes in a non-quantity language (English) and two quantity languages (Finnish and Japanese). Finnish and Japanese, unlike English, have phonemic contrasts between long and short segments and thus use duration as a primary source of information about the identity of these segments. We investigate whether differing degrees to which acoustic cues are shared by prosodic and segmental information lead to different patterns of prosodic-segmental processing interaction.



27 Jan 2004

Bob Ladd

ToBI or not ToBI: phonetic transcription and the categories of intonation

The ToBI transcription system was intended as a tool for labelling prosodic features of speech databases in standard English. The "tonal" (To) notation in ToBI is based on Pierrehumbert's analysis of English intonation and claims to express PHONOLOGICAL categories, whereas the source of the "break index" (BI) notation was a preliminary IMPRESSIONISTIC PHONETIC transcription comparable to the IPA alphabet. The finished ToBI system modifies the original break index notation to give it a phonological role for which it was not intended; yet at the same time, the tonal notation makes concessions to users who want to transcribe certain impressionistically observable "sub-phonemic" details of alignment and pitch level. Overall, therefore, the theoretical underpinnings of the system are at best somewhat confused.

This inadequate theoretical foundation is increasingly a matter for concern, as ToBI-like systems are being designed for many languages besides English. Designers of such systems generally say that they are basing their notation systems on the "principles" underlying the original ToBI and adapting them to the intonational systems of each new language, but since these principles are not very coherent, in practice the family of ToBI transcriptions is becoming more and more similar to the IPA alphabet. This is particularly true of the "star" notation, which has largely lost its original phonological function and has taken on an implicit phonetic interpretation related to F0/segment alignment.

The use of the star notation for phonetic alignment bears a striking resemblance to the way IPA notational devices represent differences of voice onset time. I argue that in fact ToBI has developed empirical inadequacies that are strongly analogous to the known empirical inadequacies of the IPA alphabet. Since we understand the strengths and weaknesses of IPA transcription fairly clearly, we should refrain from taking ToBI further in the direction of impressionistic phonetic transcription without a clearer acknowledgement of what we are doing and why.



03 Feb 2004

Hanne Anderson

Danish learners' perception and production of English vowels — a tough case for current models of L2 acquisition


17 Feb 2004

Sue Peppe (QMUC)

Perception of pitch-accent in rising contours

From data collected in the course of administering a prosody assessment procedure, we show that pitch accent is apparently harder to distinguish in rising contours than in falling ones. When an utterance with a rising contour is spliced into alternative contexts, the place of pitch accent within the utterance is likely to be assigned differently, according to the influence of each context on the listener's pitch-accent expectations. Furthermore, using data from judgment reliability testing, we show that listeners tend to be certain of where pitch accent occurs in what they hear, and are reluctant to opt for a judgment of 'ambiguous'. We explore why judgments are more likely to concur for falling than for rising contours, and suggest some implications that errors in pitch-accent assignment may have for communication, particularly in varieties of English where rising contours (high terminals) are the norm.



24 Feb 2004

Mits Ota, Rob Hartsuiker and Sarah Haywood

Fry Me to the Moon (In Other Words)

We know that L1 phonology influences the perception and production of L2 sounds, but does it also affect the phonological representations of L2 words in our lexicon? Spelling errors such as the one in our talk title (taken from a real CD sold in Japan) suggest that it does. More specifically, the lexical contrast between L2 words appears to be indeterminate when the relevant sound contrast is absent from the L1. The aim of this study was to examine this hypothesis, building on findings from reading research which show that homophones induce lexical categorization errors (e.g., subjects tend to positively identify <PAIR> as "a type of fruit"; Van Orden, 1987; Van Orden & Goldinger, 1994). To the extent that identification of visually presented words in such a task is mediated by some level of phonological representation, we predicted that lexical indeterminacy in L2 would give rise to a similar type of error with near-homophones (e.g., native speakers of Japanese misidentifying <FRY> as "to move through the air").

Twenty native speakers of English, twenty Japanese-speaking learners of English and twenty Spanish-speaking learners of English participated in a lexical categorization task involving real homophones (SON vs. SUN) and near-homophones with an /l/-/r/ contrast (LOCK vs. ROCK), a /b/-/v/ contrast (BAN vs. VAN), and an /æ/-/ʌ/ contrast (FAN vs. FUN). The English speakers made more errors with homophones than with their spelling controls, but showed no difference in their categorization of near-homophones. The Japanese speakers showed categorization confusion with homophones and with all three types of near-homophones. The Spanish speakers, like the English speakers, showed only homophone effects. The results of the English and Japanese groups confirm our hypothesis that near-homophones can cause lexical categorization errors when they involve contrasts lacking in the L1. But the results from the Spanish group suggest that the effects may be interrupted by the grapheme-phoneme correspondence rules of the L1.



02 Mar 2004

Patrick Honeybone (English Language)

Voicing and non-voicing in English fricatives, past and present

Two sets of assumptions are common in phonological theory in connection with obstruent laryngeal specifications: (i) where a language has a contrast between two series of both plosives and fricatives, their laryngeal states are characterised in the same way; and (ii) except where obvious glottalisation is involved, this contrast is made either by the privative presence vs absence of a 'voicing' gesture/element/feature, or by a binary feature [+/- voice].

In this work-in-progress talk, I explore a set of issues which arise when both these assumptions are rejected. I believe that, if (i) is rejected, a neat explanation arises for certain developments in the fricatives of Old English; I go on to show that, if this is correct, (ii) must also be rejected to account for the fricatives of Present-Day English. Finally, I show that certain other historical developments make this scenario highly plausible.



09 Mar 2004

Bert Remijsen

Segmental and tonal factors in a Matbat vowel change

Matbat is an endangered Austronesian language spoken on Misol Island (Indonesia). Its phonological system presents two interesting phenomena. One is its lexical tone system, with six lexically contrastive tonemes. The other is its vowel system, which has expanded from the five vowels of its protolanguage to seven. In this paper, I present phonological and phonetic data on the Matbat tones and vowels. Particular attention is paid to the interaction between vowels and tones. It is well known that vowels have an intrinsic influence on fundamental frequency (f0), with high vowels having higher f0 than low vowels in the same context. The tonal conditioning of the Matbat vowel system suggests that, the other way around, f0 can also affect vowel quality.



23 Mar 2004

Bart de Boer (Vrije Universiteit Brussel) and Jelle Zuidema

From Holistic to Combinatorial Signals

The signals that all human languages use are combinatorial: a limited number of basic signals (phonemes or syllables) can be combined into an enormous number of possible complex signals. Primate signal systems, in contrast, are not combinatorial. A number of theories and models have been developed to explain this evolutionary transition, but some major problems remain. We present a simulation to investigate the hypothesis that combinatorial phonology is a side effect of optimizing signal systems for acoustic distinctiveness. Crucially, signals in our model are trajectories in an (abstract) acoustic space. Hence, both holistic and combinatorial signals have a temporal structure. We believe the model shows a possible evolutionary pathway to the first half of the "duality of patterning".
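(As a rough sketch of the kind of optimization the abstract describes, and not the authors' actual simulation, the toy Python program below treats each signal as a short trajectory in a two-dimensional acoustic space and hill-climbs toward a more distinctive signal set. The number of signals, trajectory length, perturbation size, and distance measure are all invented for the illustration.)

# Toy illustration of the general idea described above: signals as
# trajectories in an abstract 2-D acoustic space, optimized for mutual
# distinctiveness by random hill-climbing. This is NOT the authors' model;
# all parameters are invented for the sketch.

import math
import random

N_SIGNALS, N_POINTS, STEPS = 6, 4, 5000
random.seed(1)

def new_signal():
    # A trajectory is a short sequence of points in the unit square.
    return [(random.random(), random.random()) for _ in range(N_POINTS)]

def distance(a, b):
    # Point-by-point Euclidean distance between two equal-length trajectories.
    return sum(math.dist(p, q) for p, q in zip(a, b))

def min_pairwise(signals):
    return min(distance(signals[i], signals[j])
               for i in range(len(signals)) for j in range(i + 1, len(signals)))

signals = [new_signal() for _ in range(N_SIGNALS)]
best = min_pairwise(signals)

for _ in range(STEPS):
    # Perturb one point of one signal; keep the change only if the signal
    # set becomes more distinctive (larger minimum pairwise distance).
    i, j = random.randrange(N_SIGNALS), random.randrange(N_POINTS)
    old = signals[i][j]
    x, y = old
    signals[i][j] = (min(1.0, max(0.0, x + random.gauss(0, 0.05))),
                     min(1.0, max(0.0, y + random.gauss(0, 0.05))))
    score = min_pairwise(signals)
    if score >= best:
        best = score
    else:
        signals[i][j] = old  # revert the perturbation

print(f"minimum pairwise distance after optimization: {best:.3f}")
# One could then inspect whether the optimized trajectories come to reuse
# similar sub-trajectories -- a crude proxy for emerging combinatorial
# structure of the kind the abstract is concerned with.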


