The Centre for Speech Technology Research, The University of Edinburgh

PWorkshop Archives: Spring Term 2002

29 Jan 2002

Dr Philip Jackson (University of Birmingham)

Data-driven, non-linear, formant-to-acoustic mapping for ASR

The underlying dynamics of speech can be captured in an automatic speech recognition system via an articulatory representation, which resides in a domain other than that of the acoustic observations. Thus, given a set of models in this hidden domain, a mapping is needed to relate the intermediate representation to the acoustic domain. In this talk, two methods for mapping from formants to short-term spectra will be compared: multi-layer perceptrons (MLPs) and radial basis function (RBF) networks. Both are capable of providing non-linear transformations, and both were trained using features extracted from the TIMIT database. Various schemes for dividing the frames of speech data according to their phone class will also be discussed. Results show that the RBF networks perform approximately 10% better than the MLPs in terms of RMS error, and that a classification based on discrete regions of the articulatory space gives the greatest improvements over a single network.
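As an illustration of the comparison reported above, the following minimal sketch computes an RMS error between predicted and target spectra and the relative improvement of one mapping over another. It assumes that "10% better" means a 10% relative reduction in RMS error, which the abstract does not state explicitly. The function names and the toy values are hypothetical and are not taken from the talk.

```python
import math


def rms_error(predicted, target):
    """Root-mean-square error over all frames and spectral channels."""
    squared = [
        (p - t) ** 2
        for pred_frame, target_frame in zip(predicted, target)
        for p, t in zip(pred_frame, target_frame)
    ]
    return math.sqrt(sum(squared) / len(squared))


def relative_improvement(baseline_error, new_error):
    """Fractional reduction in error relative to the baseline."""
    return (baseline_error - new_error) / baseline_error


# Toy data (hypothetical): two frames, two spectral channels each.
target = [[1.0, 2.0], [3.0, 4.0]]
mlp_out = [[2.0, 3.0], [4.0, 5.0]]  # every value off by 1.0
rbf_out = [[1.9, 2.9], [3.9, 4.9]]  # every value off by 0.9

mlp_rms = rms_error(mlp_out, target)
rbf_rms = rms_error(rbf_out, target)
gain = relative_improvement(mlp_rms, rbf_rms)
print(f"MLP RMS {mlp_rms:.3f}, RBF RMS {rbf_rms:.3f}, improvement {gain:.1%}")
```

In this toy case the second mapping has an RMS error about one tenth lower than the first, which is the kind of difference the abstract describes.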



05 Feb 2002

Linguistics Circle

See here for title and abstract.


12 Feb 2002

Bob Ladd & Michaela Atterer

Phonological vs. phonetic accounts of intonational variation

Regional differences in intonation are often treated in phonological terms. For example, Northern and Southern German are sometimes said to have categorically distinct pitch patterns for accented syllables (e.g. in ToBI-style notation H* or L+H* for Northern and L*+H for Southern). Similar claims have been made in studies of other languages. However, this paper argues that some such differences must be treated as different phonetic realisations of "the same" phonological category. This conclusion is based on a close acoustic analysis of the alignment of accentual rises with segmental landmarks (e.g. beginning and end of stressed vowel) in Northern and Southern German, and comparison of the results with past work on Greek, English, and Dutch. Differences between varieties and between languages are shown to be small and not plausibly related to phonologically distinct patterns of tonal association.



19 Feb 2002

Mits Ota, Cecile Pereira, Bob Ladd, Antonella Sorace & Cathy Sotillo

Cross-linguistic conditioning of near-native L2 phonological representations


26 Feb 2002

David Rojas (Cognitive Science & Natural Language) & Iskra Iskrova (Indiana U)

Nasality Correlates and Variation in Haitian and Louisiana Creoles

In this pilot study we investigate possible acoustic correlates of nasality in two French-based Creoles (FbC): Haitian Creole (HC) and Louisiana Creole (LC). Both are assumed to originate from St. Kitts and therefore share some common Caribbean FbC features. In HC and in LC nasal vowels are phonemically contrastive, and the four-way opposition in (1) can be observed in both languages.
(1) v[nas] : v : v[nas]N : vN
Assuming that there are three different types of nasal or nasalized vowels in these languages (underlyingly nasal, nasalized by local regressive assimilation in the environment of a nasal consonant, or nasalized by spreading from nucleus to nucleus), it would be valuable to determine whether these three phonologically distinct types differ in their phonetic realizations. Previous investigations (Chen 1997, Feng and Castelli 1996, Hawkins and Stevens 1985) have quantified acoustic correlates of nasality in French, English, and other languages. It is commonly reported that a prominent characteristic of nasality is the amplitude of two resonant frequencies (P0 and P1) around 250-350 Hz and 950-1050 Hz, respectively. By analysing tokens of the low vowel /a/ representative of each of the environments mentioned, we attempt to determine how well this means of analysing acoustic correlates applies to LC and HC vowels. We also examine the effects of phonetic environment and nasality on vowel duration. The findings of this rather broad, exploratory treatment include main effects on vowel duration that differ between LC and HC. The nasal correlates discussed also seem to pattern differently in each language, though further exploration is clearly in order.
The findings support the usefulness of applying these approaches to the examination of nasality in LC and HC, which could potentially lead to several interesting cross-linguistic typological studies on a number of regional French varieties and related French-based Creoles.
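To make the P0 and P1 measure more concrete, here is a minimal sketch that finds the strongest spectral component within each of the two frequency bands mentioned in the abstract. It is not the authors' procedure: the function `band_peak`, the synthetic two-tone signal, and the sampling parameters are all assumptions made for illustration, and a real analysis would work on windowed vowel spectra rather than pure tones.

```python
import math


def band_peak(signal, sample_rate, low_hz, high_hz):
    """Return (frequency_hz, amplitude) of the strongest DFT bin in a band.

    Only bins inside the band are evaluated. The factor 2/n converts the
    DFT magnitude to sinusoid amplitude; it is valid for bins other than
    0 Hz and the Nyquist frequency, which lie outside the bands used here.
    """
    n = len(signal)
    best = (0.0, 0.0)
    for k in range(n // 2 + 1):
        freq = k * sample_rate / n
        if not low_hz <= freq <= high_hz:
            continue
        re = sum(x * math.cos(2 * math.pi * k * i / n) for i, x in enumerate(signal))
        im = -sum(x * math.sin(2 * math.pi * k * i / n) for i, x in enumerate(signal))
        amplitude = 2 * math.hypot(re, im) / n
        if amplitude > best[1]:
            best = (freq, amplitude)
    return best


# Synthetic stand-in for a vowel token (hypothetical): two tones placed
# inside the bands reported for P0 and P1.
sample_rate = 8000
signal = [
    math.sin(2 * math.pi * 300 * i / sample_rate)
    + 0.5 * math.sin(2 * math.pi * 1000 * i / sample_rate)
    for i in range(800)
]

p0_freq, p0_amp = band_peak(signal, sample_rate, 250, 350)
p1_freq, p1_amp = band_peak(signal, sample_rate, 950, 1050)
difference_db = 20 * math.log10(p0_amp / p1_amp)
print(f"P0 band peak at {p0_freq:.0f} Hz, P1 band peak at {p1_freq:.0f} Hz")
print(f"Level difference: {difference_db:.1f} dB")
```

The direct DFT is used here only to keep the sketch free of external dependencies; with longer signals an FFT routine would be the usual choice.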



12 Mar 2002

Cassie Mayo & Alice Turk

Development of Acoustic Cue Weighting in Children's Speech Perception



19 Mar 2002

Bob Ladd & Robin Lickley

Lab speech is real speech: the case of Dutch falling-rising questions

This talk has a dual purpose. First, it presents experimental evidence that the post-nuclear F0 minimum in Dutch falling-rising questions is aligned with the segmental string in a way that is determined by the location of secondary stress: the minimum aligns with a post-nuclear secondary stress if one is present, and with the beginning of the final syllable otherwise. This is as predicted by Grice, Ladd and Arvaniti (2000, Phonology), on the assumption that the F0 minimum represents a "low phrase accent" that seeks secondary association with a prominent syllable. In the second part of the talk, we examine the phonetic details of approximately 35 falling-rising questions in a Dutch map task corpus, and show that the alignment of the F0 minimum is determined by the same principles as in the experimental data. This finding allays the concerns of those investigators who claim that intonation is most appropriately investigated only by means of corpora of natural speech, and that controlled speech materials read aloud in the lab are not a valid source of evidence.
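The alignment rule stated in the abstract can be written as a small decision function. The sketch below is only an illustration of that rule: the data layout (a list of syllables following the nuclear accent, each flagged for secondary stress), the function name, and the choice of the first stressed syllable when several are present are all assumptions and are not taken from the talk.

```python
def f0_minimum_anchor(post_nuclear):
    """Index of the syllable predicted to anchor the F0 minimum.

    post_nuclear: list of (label, has_secondary_stress) pairs for the
    syllables that follow the nuclear accent, in order.
    """
    if not post_nuclear:
        return None  # nothing follows the nucleus, so the rule does not apply
    for index, (_label, has_secondary_stress) in enumerate(post_nuclear):
        if has_secondary_stress:
            return index  # align with the post-nuclear secondary stress
    return len(post_nuclear) - 1  # otherwise: beginning of the final syllable


# Placeholder syllable labels (hypothetical), with and without a
# post-nuclear secondary stress.
with_stress = [("s1", False), ("s2", True), ("s3", False)]
without_stress = [("s1", False), ("s2", False), ("s3", False)]

anchor_with = f0_minimum_anchor(with_stress)
anchor_without = f0_minimum_anchor(without_stress)
print(with_stress[anchor_with][0], without_stress[anchor_without][0])
```

Measured alignment in either lab or corpus data could then be compared against the syllable this function selects.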




<owner-pworkshop@ling.ed.ac.uk>