PWorkshop Archives: Summer Term 2003
15 Apr 2003 | Mariko Sugahara |
The Right vs. Left Asymmetry of Post-FOCUS Prosodic Phrase
Boundaries in Tokyo Japanese
This paper examines the presence or absence of prosodic boundaries in the post-FOCUS part of an utterance in Tokyo Japanese and provides a phonological account of it within the framework of Optimality Theory. | |
22 Apr 2003 | Christine Haunz |
Grammatical and Non-Grammatical Factors in Loanword Adaptation
This talk aims to include not only phonological differences between the borrowing and donor languages in the study of loanwords, but also factors which may not depend solely on these differences, e.g. similarity, frequency and gradient grammaticality. The influence of these factors on the performance of English speakers in a shadowing task of Russian words with English-illegal initial clusters was tested. The frequency of potential adapted onsets in the English lexicon does not correlate with the strategy of adaptation. Judgments about the grammaticality of words containing illegal initial clusters and the similarity between pairs of words partially containing illegal onsets were obtained from English native speakers. Similarity of a target to an adaptation was shown to be a predictor of its rate of use. The perceived grammaticality of a target cluster influenced performance in two ways: high-grammaticality target clusters were modified less often, and low-grammaticality clusters were mostly associated with vowel epenthesis. | |
29 Apr 2003 | Bob Ladd and Caroline Ekelund |
Downstep, Emphasis, and overall pitch range | |
06 May 2003 | Yiya Chen |
Prosody and systematic variation in the F0 realization of lexical tones in Standard Chinese | |
14 May 2003 | Laura Redi (Harvard, MIT) |
Categories of intonational representation: Some effects of alignment and pitch range | |
20 May 2003 | Abigail Cohn (Cornell University) |
Superheavy Monosyllables in American English: The Role of the Mora
Words with diphthong or tense vowel nuclei and post-vocalic liquids, such as flour and eel, are an area of considerable interest in American English due to their variability. Native speakers are not in agreement as to their syllable count and this can differ between dialects. Drawing on a variety of phonological and morphological evidence, we conclude that these words are monosyllabic, but superheavy. We argue that such superheavy syllables are best represented as being trimoraic, due to a requirement that liquids in the rime bear a mora. Results of an acoustic study lend support to our analysis of these words as superheavy monosyllables represented moraically. The universal markedness of trimoraic syllables makes them vulnerable to resolution, which is manifested in different ways in different dialects of American English. | |
21 May 2003 | Julian Bradfield (Computer Science) |
Concurrency and Phonology
This talk presents some very preliminary ideas on the use of concurrency, and in particular the rich computer science notion of concurrency, in phonetics and phonology. After a brief description of the CS concept, I'll consider, as time permits, its application to formal models of phonology, to relating different phonological theories, to click languages and to speech recognition. I hope for your feedback! A more detailed abstract can be found here. | |
03 Jun 2003 | Antti-Veikko Rosti (Cambridge University) |
Switching linear dynamical systems for speech recognition
Currently the most popular acoustic model for speech recognition is the hidden Markov model (HMM). However, HMMs are based on a series of assumptions, some of which are known to be poor, in particular the assumption that successive speech frames are conditionally independent given the state that generated them. To overcome this, segment models have been proposed. These model whole segments of frames rather than individual ones. One form is the stochastic segment model (SSM), which uses a standard linear dynamical system to model the sequence of observations within a segment. Here the dynamics are modelled by a first-order Gauss-Markov process in some low-dimensional state space. The feature vector is a noise-corrupted linear transformation of the state vector. Though the training and recognition algorithms are more complex than those for HMMs, it is feasible to use standard techniques for inference with SSMs. For the SSM, segments are assumed to be independent. Intuitively, this is not always valid due to co-articulation between the modelling units. Switching linear dynamical systems (SLDS) have therefore been proposed. In SLDS, the posterior distribution of the state vector is propagated between segments. Unfortunately, exact inference in SLDS is not tractable due to exponential growth of components in time. In this talk, approximate methods for inference in SLDSs will be presented. First, there are approximate methods based on a heuristic Viterbi-like algorithm. Alternatively, variational learning may be used. Finally, approaches based on Markov chain Monte Carlo methods can be used, including a training scheme based on stochastic expectation maximisation (SEM). For the SEM scheme, convergence and implementation issues for use with SLDS will be discussed in detail. | |
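For readers unfamiliar with the model, the linear dynamical system described in the abstract above can be written in the usual state-space form. This is a generic textbook sketch, not the speaker's own notation: the symbols A, C, Q, R, the initial-state parameters, and the segment label q_t are all assumptions introduced here for illustration.

```latex
% State evolution: first-order Gauss-Markov process in a low-dimensional state space
\mathbf{x}_{t+1} = \mathbf{A}\,\mathbf{x}_t + \mathbf{w}_t,
\qquad \mathbf{w}_t \sim \mathcal{N}(\mathbf{0}, \mathbf{Q})

% Observation: the feature vector is a noise-corrupted linear transformation of the state
\mathbf{y}_t = \mathbf{C}\,\mathbf{x}_t + \mathbf{v}_t,
\qquad \mathbf{v}_t \sim \mathcal{N}(\mathbf{0}, \mathbf{R})

% Initial state (assumed Gaussian)
\mathbf{x}_1 \sim \mathcal{N}(\boldsymbol{\mu}_1, \boldsymbol{\Sigma}_1)

% Switching variant (assumed notation): parameters depend on a discrete segment label q_t,
% and the state is carried across segment boundaries instead of being reset
\mathbf{x}_{t+1} = \mathbf{A}_{q_{t+1}}\,\mathbf{x}_t + \mathbf{w}_t,
\qquad
\mathbf{y}_t = \mathbf{C}_{q_t}\,\mathbf{x}_t + \mathbf{v}_t
```

In this form, if the label sequence is unknown, the exact state posterior at each frame is a mixture over every possible label history, so the number of Gaussian components grows exponentially with time. That is the intractability the abstract refers to.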
01 Jul 2003 | Nikola Ikonomov (Bulgarian Academy of Sciences) |
Preservation and Digital Restoration of Audio Archives
Problems related to the storage and handling of audio archives on magnetic tapes were examined. A corresponding set of measures and restoration utilities was defined, developed and implemented. The results offer an efficient way to overcome problems related to the safekeeping and restoration of sound recordings and could be successfully applied in institutions with similar audio archives (linguistic and speech research, dialectological research, folklore, history-related recordings, etc.). | |
14 Jul 2003 | Alan W Black, Tanja Schultz, and Robert Frederking (CMU) |
Towards Communicating with Dolphins
After working in the area of rapid development of speech-to-speech translation systems for human languages with limited resources, we were recently contacted about applying our techniques to communication with dolphins. Of course, although full translation is not feasible, there are a number of ways speech technology can help in dolphin research. Working with the Wild Dolphin Project, who have almost 20 years of experience with a pod of spotted dolphins 40 miles off the Bahamas, we are using their existing recordings for this work, and are currently designing new equipment to allow collection of more data. After a general description of dolphin acoustics, this talk will describe some areas where speech recognition technology can be used to better classify dolphin recordings, and present initial results on a simple dolphin ID system based on signature whistles. We will also describe a framework for an experiment we intend to run later this summer, to investigate how dolphins may relate to synthesized noises in their acoustic domain, and how they may mimic them. | |
31 Jul 2003 | ICPhS Practice Talks |
Corine Astesano and Ellen Bard
Rob Clark
Christine Haunz
Bob Ladd
Richard Mullooly
Mits Ota, Bob Ladd and Madoka Tsuchiya
Susana Cortes Pomacondor
Alice Turk
Posters
Mika Ito
Cassie Mayo and Alice Turk
Sherry Ou
James M Scobbie and Alan A Wrench
Sarah Creer and Maria Wolters | |
12 Aug 2003 | Nobuaki Minematsu (University of Tokyo; Royal Institute of Technology, Stockholm) |
Phonetic Tree Analysis
Two new techniques are proposed to characterize accented pronunciation. The first technique, Phonetic Tree Analysis, extracts the phonetic tree structure embedded in a student's utterances. Results of analyzing Japanese English clearly and visually present well-known Japanese habits in speaking English. The second technique automatically estimates segmental intelligibility, based not upon acoustic matching with native speakers' utterances but upon matching between two structures: the phonetic structure extracted from the student's pronunciation and the lexical structure of the target language's vocabulary. The estimation is done using one word perception model, the Cohort Model, and the estimated cohort size is interpreted as the degree of segmental unintelligibility. Experimental results show good agreement between the estimated intelligibility and the segmental proficiency rated by teachers. Further, some possible applications based upon the two proposed techniques are also shown. | |
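As a rough illustration of the cohort-size idea mentioned in the abstract above, the sketch below counts how many lexicon entries remain compatible with an input as each successive phoneme arrives. It is a toy under stated assumptions, not the speaker's method: the function `cohort_sizes`, the five-word `TOY_LEXICON`, its phoneme labels, and the `merge` map (which stands in for a learner collapsing two phonemes into one) are all hypothetical.

```python
from typing import Dict, List, Optional, Sequence, Tuple


def cohort_sizes(
    heard: Sequence[str],
    lexicon: Dict[str, Tuple[str, ...]],
    merge: Optional[Dict[str, str]] = None,
) -> List[int]:
    """Return the number of word candidates left after each phoneme of `heard`.

    `merge` (hypothetical) maps a phoneme to the class it is confused with,
    so that merged phonemes can no longer tell words apart.
    """
    mapping = merge or {}

    def normalise(seq: Sequence[str]) -> Tuple[str, ...]:
        # Replace each phoneme by its merged class, if any.
        return tuple(mapping.get(p, p) for p in seq)

    target = normalise(heard)
    cohort = [normalise(pron) for pron in lexicon.values()]
    sizes: List[int] = []
    for i, phoneme in enumerate(target):
        # Keep only candidates that still match the input at position i.
        cohort = [pron for pron in cohort if len(pron) > i and pron[i] == phoneme]
        sizes.append(len(cohort))
    return sizes


# Hypothetical toy lexicon with made-up phoneme labels.
TOY_LEXICON = {
    "light": ("l", "ay", "t"),
    "right": ("r", "ay", "t"),
    "ride": ("r", "ay", "d"),
    "rice": ("r", "ay", "s"),
    "lice": ("l", "ay", "s"),
}

if __name__ == "__main__":
    print(cohort_sizes(("r", "ay", "t"), TOY_LEXICON))
    print(cohort_sizes(("r", "ay", "t"), TOY_LEXICON, merge={"l": "r"}))
```

With the assumed merge of "l" into "r", more words stay in the cohort at every step, which is the sense in which a larger cohort can be read as lower segmental intelligibility.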
26 Aug 2003 | Laurence White and Alice Turk |
Polysyllabic shortening revisited: word length and the attenuation of accentual lengthening | |
<owner-pworkshop@ling.ed.ac.uk> |