|
[1]
|
Peter Bell, Myroslava Dzikovska, and Amy Isard.
Designing a spoken language interface for a tutorial dialogue system.
In Proc. Interspeech, Portland, Oregon, USA, September 2012.
[ bib |
.pdf ]
We describe our work in building a spoken language
interface for a tutorial dialogue system. Our goal is
to allow natural, unrestricted student interaction with
the computer tutor, which has been shown to improve the
student's learning gain, but presents challenges for
speech recognition and spoken language understanding.
We discuss the choice of system components and present
the results of development experiments in both acoustic
and language modelling for speech recognition in this
domain.
|
|
[2]
|
Myroslava Dzikovska, Amy Isard, Peter Bell, Johanna Moore, Natalie Steinhauser,
and Gwendolyn Campbell.
Beetle II: an adaptable tutorial dialogue system.
In Proceedings of the SIGDIAL 2011 Conference, demo session,
pages 338-340, Portland, Oregon, June 2011. Association for Computational
Linguistics.
[ bib |
http ]
We present Beetle II, a tutorial dialogue system which
accepts unrestricted language input and supports
experimentation with different tutorial planning and
dialogue strategies. Our first system evaluation
compared two tutorial policies and demonstrated that
the system can be used to study the impact of different
approaches to tutoring. The system is also designed to
allow experimentation with a variety of natural
language techniques, and discourse and dialogue
strategies.
|
|
[3]
|
Myroslava Dzikovska, Amy Isard, Peter Bell, Johanna D. Moore, Natalie B.
Steinhauser, Gwendolyn E. Campbell, Leanne S. Taylor, Simon Caine, and
Charlie Scott.
Adaptive intelligent tutorial dialogue in the Beetle II system.
In Artificial Intelligence in Education - 15th International
Conference (AIED 2011), interactive event, volume 6738 of Lecture Notes
in Computer Science, page 621, Auckland, New Zealand, 2011. Springer.
[ bib |
DOI ]
|
|
[4]
|
Helen Wright-Hastie, Massimo Poesio, and Stephen Isard.
Automatically predicting dialogue structure using prosodic features.
Speech Communication, 36(1-2):63-79, 2002.
[ bib ]
|
|
[5]
|
Sue Fitt and Steve Isard.
Synthesis of regional English using a keyword lexicon.
In Proc. Eurospeech 1999, volume 2, pages 823-826, Budapest,
September 1999.
[ bib |
.ps |
.pdf ]
We discuss the use of an accent-independent keyword
lexicon to synthesise speakers with different regional
accents. The paper describes the system architecture
and the transcription system used in the lexicon, and
then focuses on the construction of word-lists for
recording speakers. We illustrate by mentioning some of
the features of Scottish and Irish English, which we
are currently synthesising, and describe how these are
captured by keyword synthesis.
|
|
[6]
|
H. Wright, Massimo Poesio, and Stephen Isard.
Using high level dialogue information for dialogue act recognition
using prosodic features.
In Proceedings of an ESCA Tutorial and Research Workshop on
Dialogue and Prosody, pages 139-143, Eindhoven, The Netherlands, 1999.
[ bib |
.ps |
.pdf ]
|
|
[7]
|
John McKenna and Stephen Isard.
Tailoring Kalman filtering towards speaker characterisation.
In Proc. Eurospeech '99, volume 6, pages 2793-2796,
Budapest, 1999.
[ bib |
.ps |
.pdf ]
|
|
[8]
|
Simon King, Todd Stephenson, Stephen Isard, Paul Taylor, and Alex Strachan.
Speech recognition via phonetically featured syllables.
In Proc. ICSLP '98, pages 1031-1034, Sydney, Australia,
December 1998.
[ bib |
.ps |
.pdf ]
We describe a speech recogniser which uses a speech
production-motivated phonetic-feature description of
speech. We argue that this is a natural way to describe
the speech signal and offers an efficient intermediate
parameterisation for use in speech recognition. We also
propose to model this description at the syllable
rather than phone level. The ultimate goal of this work
is to generate syllable models whose parameters
explicitly describe the trajectories of the phonetic
features of the syllable. We hope to move away from
Hidden Markov Models (HMMs) of context-dependent phone
units. As a step towards this, we present a preliminary
system which consists of two parts: recognition of the
phonetic features from the speech signal using a neural
network; and decoding of the feature-based description
into phonemes using HMMs.
|
|
[9]
|
Sue Fitt and Steve Isard.
Representing the environments for phonological processes in an
accent-independent lexicon for synthesis of English.
In Proc. ICSLP 1998, volume 3, pages 847-850, Sydney,
Australia, December 1998.
[ bib |
.ps |
.pdf ]
This paper reports on work developing an
accent-independent lexicon for use in synthesising
speech in English. Lexica which use phonemic
transcriptions are only suitable for one accent, and
developing a lexicon for a new accent is a long and
laborious process. Potential solutions to this problem
include the use of conversion rules to generate lexica
of regional pronunciations from standard accents and
encoding of regional variation by means of keywords.
The latter proposal forms the basis of the current
work. However, even if we use a keyword system for
lexical transcription there are a number of remaining
theoretical and methodological problems if we are to
synthesise and recognise accents to a high degree of
accuracy; these problems are discussed in the following
paper.
|
|
[10]
|
Paul A. Taylor, S. King, S. D. Isard, and H. Wright.
Intonation and dialogue context as constraints for speech
recognition.
Language and Speech, 41(3):493-512, 1998.
[ bib |
.ps |
.pdf ]
|
|
[11]
|
Laurence Molloy and Stephen Isard.
Suprasegmental duration modelling with elastic constraints in
automatic speech recognition.
In ICSLP, volume 7, pages 2975-2978, Sydney, Australia, 1998.
[ bib |
.ps |
.pdf ]
|
|
[12]
|
Briony J. Williams and Stephen Isard.
A keyvowel approach to the synthesis of regional accents of
English.
In Eurospeech 97, Rhodes, Greece, 1997.
[ bib |
.ps |
.pdf ]
|
|
[13]
|
Jean Carletta, Amy Isard, Stephen Isard, Jacqueline C. Kowtko, Gwyneth
Doherty-Sneddon, and Anne H. Anderson.
The reliability of a dialogue structure coding scheme.
Computational Linguistics, 23(1):13-31, 1997.
[ bib |
.ps |
.pdf ]
|
|
[14]
|
Beth Ann Hockey, Deborah Rossen-Knill, Beverly Spejewski, Matthew Stone, and
Stephen Isard.
Can you predict responses to yes/no questions? Yes, no, and stuff.
In Eurospeech '97, pages 2267-2270, 1997.
[ bib ]
|
|
[15]
|
Paul A. Taylor, Simon King, Stephen Isard, Helen Wright, and Jacqueline Kowtko.
Using intonation to constrain language models in speech recognition.
In Proc. Eurospeech '97, Rhodes, 1997.
[ bib |
.pdf ]
This paper describes a method for using intonation to
reduce word error rate in a speech recognition system
designed to recognise spontaneous dialogue speech. We
use a form of dialogue analysis based on the theory of
conversational games. Different move types under this
analysis conform to different language models.
Different move types are also characterised by
different intonational tunes. Our overall recognition
strategy is first to predict from intonation the type
of game move that a test utterance represents, and then
to use a bigram language model for that type of move
during recognition.
|
|
[16]
|
Paul A. Taylor, Hiroshi Shimodaira, Stephen Isard, Simon King, and Jacqueline
Kowtko.
Using prosodic information to constrain language models for spoken
dialogue.
In Proc. ICSLP '96, Philadelphia, 1996.
[ bib |
.ps |
.pdf ]
We present work intended to improve speech recognition
performance for computer dialogue by taking into
account the way that dialogue context and intonational
tune interact to limit the possibilities for what an
utterance might be. We report here on the extra
constraint achieved in a bigram language model,
expressed in terms of entropy, by using separate
submodels for different sorts of dialogue acts and
trying to predict which submodel to apply by analysis
of the intonation of the sentence being recognised.
|
|
[17]
|
A. Conkie and Stephen D. Isard.
Optimal coupling of diphones.
In J. P. H. van Santen, R. W. Sproat, J. P. Olive, and J. Hirschberg,
editors, Progress in Speech Synthesis. Springer, 1996.
[ bib ]
|
|
[18]
|
Jean Carletta, Amy Isard, Stephen Isard, Jacqueline Kowtko, Gwyneth
Doherty-Sneddon, and Anne H. Anderson.
The coding of dialogue structure in a corpus.
In J.A. Andernach, S.P. van de Burgt, and G.F. van der Hoeven,
editors, Proceedings of the Ninth Twente Workshop on Language
Technology: Corpus-based Approaches to Dialogue Modelling. Universiteit
Twente, Enschede, 1995.
[ bib ]
|
|
[19]
|
Stephen Isard, Simon King, Paul A. Taylor, and Jacqueline Kowtko.
Prosodic information in a speech recognition system intended for
dialogue.
In IEEE Workshop in speech recognition, Snowbird, Utah, 1995.
[ bib ]
We report on an automatic speech recognition system
intended for use in dialogue, whose original aspect is
its use of prosodic information for two different
purposes. The first is to improve the word level
accuracy of the system. The second is to constrain the
language model applied to a given utterance by taking
into account the way that dialogue context and
intonational tune interact to limit the possibilities
for what an utterance might be.
|
|
[20]
|
Paul A. Taylor and S. D. Isard.
A new model of intonation for use with speech recognition and
synthesis.
In International Conference on Spoken Language Processing,
Banff, Canada, 1992.
[ bib |
.ps |
.pdf ]
|
|
[21]
|
W. N. Campbell and Stephen D. Isard.
Segmental durations in a syllable frame.
Journal of Phonetics, 19:37-47, 1991.
[ bib ]
|
|
[22]
|
Paul A. Taylor and Stephen D. Isard.
Automatic diphone segmentation.
In Proc. Eurospeech '91, Genova, Italy, 1991.
[ bib ]
|
|
[23]
|
Paul A. Taylor and Stephen D. Isard.
Automatic diphone segmentation using hidden Markov models.
In SST-90, Third International Australian Conference in Speech
Science and Technology, Melbourne, Australia, 1990.
[ bib ]
|
|
[24]
|
W. N. Campbell, Stephen D. Isard, A. I. C. Monaghan, and J. Verhoeven.
Duration, pitch and diphones in the CSTR TTS system.
In ICSLP '90, 1990.
[ bib ]
|
|
[25]
|
Stephen D. Isard and Mark Pearson.
A repertoire of British English contours for speech synthesis.
In SPEECH '88, 7th FASE Symposium, London, 1988.
[ bib ]
|
|
[26]
|
Stephen D. Isard and D. A. Miller.
Diphone synthesis techniques.
In IEEE Conference Publication no 258, pages 77-82, 1986.
[ bib ]
|