The Centre for Speech Technology Research, The University of Edinburgh

PWorkshop Archives: Autumn Term 2001

04 Sep 2001

Dr Corine Astésano & Dr Mireille Besson (CNRS Marseille)

Brain Potentials Investigation of Semantic and Prosodic Processing During Spoken Language Comprehension    (Corine Astésano)

Few neurophysiological experiments have been aimed at understanding the role of prosodic cues in spoken language comprehension. Here, we used Event-Related brain Potentials (ERPs) to study the relationship between semantic and prosodic processing. Results showed that an N400 was associated with semantic mismatch (right centro-parietal scalp distribution) and that a P800 was elicited by prosodic mismatch (left parietal scalp distribution). These topographic differences may indicate that different underlying generators are responsible for the semantic and prosodic effects observed at the scalp. Moreover, we were able to demonstrate that semantic information takes processing precedence over prosodic information even under different task demands. The temporal alignment and the magnitude of the ERP components suggest that semantic processing is allocated more processing resources and that, at least under the specific experimental conditions of our experiment, prosodic information is underspecified during sentence comprehension.

The specificity of language processing: Electrophysiological approach    (Mireille Besson)



02 Oct 2001

Dr James M Scobbie (Queen Margaret University College)

Sounds and structures: covert contrast and non-phonemic aspects of the phonological inventory

I argue that the study of child phonology is hampered by a transcriptional methodology which fails to detect phonemically crucial aspects of the child's actual phonetic output and renders it almost impossible to study the acquisition of phonetic systems. Additionally, I will argue that the study of child phonology is, in any case, not well served by its focus on the emergence of phonemic contrast.

First I will show that many types of phonemic contrast can be "covert", drawing on evidence from normally-developing children and from so-called "phonologically disordered" child speech. Such covert contrasts arise when adult listeners, including trained phonetic transcribers, are unable to detect auditorily a phonemic contrast which a child is in fact producing. In all such cases of mismatch, transcription underestimates the ability of the child to produce contrasts, and results in theoretical analyses postulating phonological neutralisation whereas in fact there is none. I will discuss the implications of these findings. I will then argue that to account for the patterns and the variation in child speech, we need to appeal to a coherent phonetic-phonological model of an integrated sound system.

I will also argue that there is an overwhelming tendency in child language acquisition research to focus on something like a phonemic inventory rather than on wider aspects of the sound system. Clearly, "low level" phonetic implementation is little understood, but my critique is aimed at "inventory-based" research which fails to address categorical phonological allophony. Such a narrow focus is theoretically stifling, both for researchers in surface-oriented phonological theory, and for those who take a broader view of the challenges facing children who have to learn all the rich non-universal detail of their language.



16 Oct 2001

Dr Ineke Mennen (QMUC) & Dr Gerry Docherty (University of Newcastle)

Cross-linguistic differences in pitch characteristics: a study of mono- and bi-lingual speakers


23 Oct 2001

Dr. James M. Scobbie (Queen Margaret University College)

Report on two conferences:

Workshop on Early Phonological Acquisition

Conference on the Phonetics-Phonology Interface



30 Oct 2001

Madoka Tsuchiya, Prof. D. Robert Ladd & Dr. Mits Ota

Evidence for effects of prosodic structure on mora duration in Japanese

The conventional wisdom, based on a paper by Port, Dalby and O'Dell (JASA, 1986), is that the mora can be treated as a basic unit of utterance timing in Japanese: the duration of an utterance is a linear function of the number of moras it contains. However, even Port et al. acknowledge that the linear function may mask subtler effects. The experiments reported here investigated whether the supposed organisation of Japanese prosody in terms of bimoraic feet (e.g. Poser 1990, Ito 1990) is reflected in mora duration.
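The linear mora-timing claim can be illustrated with a toy least-squares fit. The numbers below are invented purely for illustration and are not Port et al.'s data:

```python
# Illustrative sketch of Port et al.'s claim that utterance duration is
# roughly a linear function of mora count: duration ≈ a + b * n_moras.
# We fit that line with ordinary least squares, standard library only.

def fit_line(xs, ys):
    """Least-squares fit of y = a + b*x; returns (a, b)."""
    n = len(xs)
    mx = sum(xs) / n
    my = sum(ys) / n
    b = (sum((x - mx) * (y - my) for x, y in zip(xs, ys))
         / sum((x - mx) ** 2 for x in xs))
    a = my - b * mx
    return a, b

# Hypothetical utterances: mora counts and durations in ms (toy data,
# constructed to be perfectly linear).
moras = [2, 3, 4, 5, 6, 8]
durations = [80 + 150 * m for m in moras]

a, b = fit_line(moras, durations)
print(f"intercept ≈ {a:.1f} ms, slope ≈ {b:.1f} ms per mora")
```

Under the strict linear hypothesis the residuals around such a fit would be negligible; the experiments described next look precisely for the small, systematic deviations (foot-based effects) that the linear function can mask.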

We measured mora durations in 4-mora surnames (e.g. Yamamoto) embedded in carrier sentences. (Methodological details will be reported in the talk.) There was clear evidence that foot-final moras are longer than foot-initial moras, and that foot-final moras in word-final feet are longer than those in word-initial feet. However, there was also clear evidence of an interaction whereby moras in absolute word-initial position are longer than others. Closer investigation suggested that this word-initial effect is due to lengthening of the word-initial consonant (cf. the work on word-initial strengthening by Keating et al. (1998, Labphon 6)). It can therefore be separated out in a model from the foot-based effects, which seem to be genuine.

A possible confound in the 4-mora names such as Yamamoto is that foot structure is isomorphic with morpheme structure (yama+moto). Three control studies explored whether the effects discovered in the main experiment were better explained in terms of foot structure or morpheme structure. The findings of the control studies can be summarised as follows: foot structure has a more robust and consistent effect than morpheme structure, although we cannot entirely rule out a possible effect of the latter.

It should be noted that all the effects we are dealing with here, though quite consistent, are extremely small: the extra duration exhibited by foot-final moras is on average no more than 10 ms in a total duration of 130-160 ms. This raises a lot of questions about perceptibility, naturalness of synthetic speech, etc., which we will probably not discuss in the talk.



06 Nov 2001

Cristine Haunz

The Role of Perception in Loanword Adaptation

When foreign words enter a borrowing language, they often undergo drastic changes. Sounds of the input word can be changed, or deleted, and epenthetic vowels can break up illegal clusters. These processes have been viewed as a valuable source of information about what happens when two phonologies clash. The main focus of loanword research to date has been on how and why the representation of the borrowed word is changed as a result of this clash.

However, little attention has been paid to the perception process that creates this representation from the acoustic input that the speaker hears. Where perception has been mentioned, it has either been claimed to be universal and thus to perceive all foreign sounds without difficulty (Jacobs and Gussenhoven 1999), or to be strictly limited to the native system for mono- as well as bilingual speakers (Silverman 1992, Yip 1993). As speech perception research shows that L2 learners do have difficulty with foreign sounds, but often improve greatly over time, both these positions are untenable.

Experiments are therefore under way to test the discrimination and identification abilities of speakers in the perception of those foreign sounds that are adapted in loanwords. This will examine to what extent, if any, adaptations may take place in (mis)perception rather than at a different level. Cases of interest are Spanish adaptations of the high front and back vowels of English (assimilation in perception or at a different stage?), Hindi adaptations of the English interdental fricative and alveolar stop (counter to phonetic similarity), and French perception of the English dental fricative depending on context (a possible influence of phonotactics in perception).



20 Nov 2001

Dr. Martin Meyer (Institute for Adaptive & Neural Computation)

Brain responses to affective and non-affective prosody


27 Nov 2001

Ben Matthews (Queen Margaret University College)

On Variability and the Acquisition of Vowels in Normally Developing Scottish Children (18-36 months)

The acquisition of the vowel systems of American and RP accents of English has been studied in increasing detail over recent years, but no studies have been undertaken of the acquisition of the Scottish vowel system, which is radically different from other vowel systems of English. This talk presents a longitudinal study of 7 Scottish children, aged 18 to 36 months during the data collection period. Recordings took place in a naturalistic setting, using a standard set of toys and books as stimuli, and were obtained on a monthly basis over a period of one year. Analysis focused on 3 sessions for each child, at 4-month intervals, and consisted mainly of transcription analysis.

Individual patterns of vowel production showed a great deal of variability, both among tokens of individual words and in terms of systematic development across children. Transcription analysis revealed certain vowels (such as /ʉ/ in FOOT and GOOSE) which were consistently less adult-like than others within each child's individual sessions. The theoretical distinction between vowels and approximants (i.e. liquids and glides) was also addressed and investigated in detail. Both /l/ and /r/ were frequently vocalised, and often had a large effect on the accuracy of adjacent vowels. Patterns also emerged showing that /j/ and /w/ vary between consonantal and vocalic realisations. General developmental trends for Scottish English were identified, but the findings highlight variable patterns of development as well as casting doubt on the theoretical status of corner vowels.


04 Dec 2001

Mika Ito

Japanese politeness and suprasegmentals - An approach for more natural speech materials

Here we discuss some of the problems regarding the unnaturalness of the speech data currently used to research spoken Japanese politeness, and propose improved techniques. To elicit natural unscripted utterances within a well-defined set of vocabulary and contexts, a Map Task was employed, with the social status of the participants controlled. To examine the perception side of spoken Japanese politeness, a formality rating experiment was conducted: lexically similar but not identical tokens, without any manipulation, were presented to raters, who scored the various utterances for their degree of formality. Magnitude Estimation (ME) was employed so that the ratings would reflect perceived differences in degree. The results show the following. On the production side, raising the fundamental frequency (F0) is not always correlated with increasing formality or politeness; one speaker also showed no significant difference in speech rate. Since neither F0 nor speech rate seems to be a dominant factor in judging formality, there is a need to explore other acoustic cues for conveying formality.
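The abstract does not specify how the Magnitude Estimation scores were normalised, but a common practice is to divide each rater's scores by that rater's geometric mean, so that ratio judgements are comparable across raters who use different numeric ranges. A minimal sketch with hypothetical data (the rater names and scores below are invented for illustration):

```python
import math

def normalise_me(scores_by_rater):
    """Divide each rater's magnitude-estimation scores by that rater's
    geometric mean, making ratios comparable across raters."""
    normalised = {}
    for rater, scores in scores_by_rater.items():
        gmean = math.exp(sum(math.log(s) for s in scores) / len(scores))
        normalised[rater] = [s / gmean for s in scores]
    return normalised

# Two hypothetical raters scoring the same three utterances for formality;
# rater B uses numbers ten times larger, but with the same ratios.
raw = {"A": [10.0, 20.0, 40.0], "B": [100.0, 200.0, 400.0]}
for rater, scores in normalise_me(raw).items():
    print(rater, [round(s, 2) for s in scores])
```

After normalisation both raters yield the same profile across utterances, which is what licenses pooling ME ratings from different raters before testing for effects of F0 or speech rate.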



11 Dec 2001

Bettina Braun & Tiina Karsikas (Saarland University)

Evoking and Assessing Cooperation in Dialogue [Systems]    (Bettina Braun)

Dialogue systems in which humans communicate with machines will become more and more common. This is especially true in situations in which speech interaction is easier than manual operation (e.g. in a car environment). Current dialogue systems, however, still have a very rigid dialogue structure and inappropriate speech output, which deters many users from accepting these systems.

Within the large field of research in man-machine interaction I am mainly interested in canned speech output and in how cooperation can be conveyed/simulated with this technique. Thus, I want to find out which behaviour is judged as cooperative.

Under the assumption that humans behave cooperatively in dialogues, a pilot study with human operators was performed. They were told to interact with real users but were restricted to a set of utterances (to simulate a real dialogue system). The analysis of the data showed that only a few human operators behaved as expected. Reasons for this may partly be found in the experimental setup. I will discuss this difficulty of evoking cooperative utterances in dialogue in a controlled way in slightly more detail.

The last part of my talk will be about the problem of assessing cooperation in dialogue systems. Even if we had a maximally cooperative way of reacting to different situations, it is still questionable whether this behaviour would be appropriate for machines. A perception study aims at ratings of operator behaviours in two conditions (once under the assumption that human-human interaction is to be judged, once under the assumption of man-machine interaction). For this part only preliminary ideas can be presented.

Non-Native Speech in Automatic Speech Recognition    (Tiina Karsikas)



<owner-pworkshop@ling.ed.ac.uk>