|
[1]
|
C. Mayo, V. Aubanel, and M. Cooke.
Effect of prosodic changes on speech intelligibility.
In Proc. Interspeech, Portland, OR, USA, 2012.
[ bib ]
|
|
[2]
|
M. Koutsogiannaki, M. Pettinato, C. Mayo, V. Kandia, and Y. Stylianou.
Can modified casual speech reach the intelligibility of clear speech?
In Proc. Interspeech, Portland, OR, USA, 2012.
[ bib ]
|
|
[3]
|
V. Aubanel, M. Cooke, E. Foster, M. L. Garcia-Lecumberri, and C. Mayo.
Effects of the availability of visual information and presence of
competing conversations on speech production.
In Proc. Interspeech, Portland, OR, USA, 2012.
[ bib ]
|
|
[4]
|
C. Mayo, R. A. J. Clark, and S. King.
Listeners' weighting of acoustic cues to synthetic speech
naturalness: A multidimensional scaling analysis.
Speech Communication, 53(3):311-326, 2011.
[ bib |
DOI ]
The quality of current commercial speech synthesis
systems is now so high that system improvements are
being made at subtle sub- and supra-segmental levels.
Human perceptual evaluation of such subtle improvements
requires a highly sophisticated level of perceptual
attention to specific acoustic characteristics or cues.
However, it is not well understood what acoustic cues
listeners attend to by default when asked to evaluate
synthetic speech. It may, therefore, be potentially
quite difficult to design an evaluation method that
allows listeners to concentrate on only one dimension
of the signal, while ignoring others that are
perceptually more important to them. The aim of the
current study was to determine which acoustic
characteristics of unit-selection synthetic speech are
most salient to listeners when evaluating the
naturalness of such speech. This study made use of
multidimensional scaling techniques to analyse
listeners' pairwise comparisons of synthetic speech
sentences. Results indicate that listeners place a
great deal of perceptual importance on the presence of
artifacts and discontinuities in the speech, somewhat
less importance on aspects of segmental quality, and
very little importance on stress/intonation
appropriateness. These relative differences in
importance will impact on listeners' ability to attend
to these different acoustic characteristics of
synthetic speech, and should therefore be taken into
account when designing appropriate methods of synthetic
speech evaluation.
Keywords: Speech synthesis; Evaluation; Speech perception;
Acoustic cue weighting; Multidimensional scaling
|
|
[5]
|
Vasilis Karaiskos, Simon King, Robert A. J. Clark, and Catherine Mayo.
The Blizzard Challenge 2008.
In Proc. Blizzard Challenge Workshop, Brisbane, Australia,
September 2008.
[ bib |
.pdf ]
The Blizzard Challenge 2008 was the fourth annual
Blizzard Challenge. This year, participants were asked
to build two voices from a UK English corpus and one
voice from a Mandarin Chinese corpus. This is the
first time that a language other than English has been
included and also the first time that a large UK
English corpus has been available. In addition, the
English corpus contained somewhat more expressive
speech than that found in corpora used in previous
Blizzard Challenges. To assist participants with
limited resources or limited experience in
UK-accented English or Mandarin, unaligned labels
were provided for both corpora and for the test
sentences. Participants could use the provided labels
or create their own. An accent-specific pronunciation
dictionary was also available for the English speaker.
A set of test sentences was released to participants,
who were given a limited time in which to synthesise
them and submit the synthetic speech. An online
listening test was conducted to evaluate
naturalness, intelligibility and degree of similarity
to the original speaker.
Keywords: Blizzard
|
|
[6]
|
F. Gibbon and C. Mayo.
Adults' perception of conflicting acoustic cues associated with
EPG-defined undifferentiated gestures.
In 4th International EPG Symposium, Edinburgh, UK, 2008.
[ bib ]
|
|
[7]
|
Robert A. J. Clark, Monika Podsiadlo, Mark Fraser, Catherine Mayo, and Simon
King.
Statistical analysis of the Blizzard Challenge 2007 listening
test results.
In Proc. Blizzard 2007 (in Proc. Sixth ISCA Workshop on Speech
Synthesis), Bonn, Germany, August 2007.
[ bib |
.pdf ]
Blizzard 2007 is the third Blizzard Challenge, in
which participants build voices from a common dataset.
A large listening test is conducted which allows
comparison of systems in terms of naturalness and
intelligibility. New sections were added to the
listening test for 2007 to test the perceived
similarity of the speaker's identity between natural
and synthetic speech. In this paper, we present the
results of the listening test and the subsequent
statistical analysis.
Keywords: Blizzard
|
|
[8]
|
C. Mayo, R. A. J. Clark, and S. King.
Multidimensional scaling of listener responses to synthetic speech.
In Proc. Interspeech 2005, Lisbon, Portugal, September 2005.
[ bib |
.pdf ]
|
|
[9]
|
C. Mayo and A. Turk.
The influence of spectral distinctiveness on acoustic cue weighting
in children's and adults' speech perception.
Journal of the Acoustical Society of America, 118:1730-1741,
2005.
[ bib |
.pdf ]
|
|
[10]
|
C. Mayo and A. Turk.
No available theories currently explain all adult-child cue weighting
differences.
In Proc. ISCA Workshop on Plasticity in Speech Perception,
London, UK, 2005.
[ bib |
.pdf ]
|
|
[11]
|
C. Mayo and A. Turk.
The development of perceptual cue weighting within and across
monosyllabic words.
In LabPhon 9, University of Illinois at Urbana-Champaign, 2004.
[ bib ]
|
|
[12]
|
C. Mayo and A. Turk.
Adult-child differences in acoustic cue weighting are influenced by
segmental context: Children are not always perceptually biased towards
transitions.
Journal of the Acoustical Society of America, 115:3184-3194,
2004.
[ bib |
.pdf ]
|
|
[13]
|
C. Mayo and A. Turk.
Is the development of cue weighting strategies in children's speech
perception context-dependent?
In XVth International Congress of Phonetic Sciences, Barcelona,
2003.
[ bib |
.pdf ]
|
|
[14]
|
C. Mayo, J. Scobbie, N. Hewlett, and D. Waters.
The influence of phonemic awareness development on acoustic cue
weighting in children's speech perception.
Journal of Speech, Language and Hearing Research,
46:1184-1196, 2003.
[ bib |
.pdf ]
|
|
[15]
|
C. Mayo, A. Turk, and J. Watson.
Development of cue weighting strategies in children's speech
perception.
In Proceedings of TIPS: Temporal Integration in the Perception
of Speech, Aix-en-Provence, 2002.
[ bib ]
|
|
[16]
|
C. Mayo, A. Turk, and J. Watson.
Flexibility of acoustic cue weighting in children's speech
perception.
Journal of the Acoustical Society of America, 109:2313, 2001.
[ bib |
.pdf ]
|
|
[17]
|
C. Mayo.
The relationship between phonemic awareness and cue weighting in
speech perception: longitudinal and cross-sectional child studies.
PhD thesis, Queen Margaret University College, 2000.
[ bib |
.pdf ]
|
|
[18]
|
C. Mayo.
Perceptual weighting and phonemic awareness in pre-reading and
early-reading children.
In XIVth International Congress of Phonetic Sciences, San
Francisco, 1999.
[ bib |
.pdf ]
|
|
[19]
|
C. Mayo.
The development of phonemic awareness and perceptual weighting in
relation to early and later literacy acquisition.
In 20th Annual Child Phonology Conference, Bangor, Wales, 1999.
[ bib ]
|
|
[20]
|
C. Mayo.
The developmental relationship between perceptual weighting and
phonemic awareness.
In LabPhon 6, University of York, UK, 1998.
[ bib ]
|
|
[21]
|
C. Mayo.
A longitudinal study of perceptual weighting and phonemic awareness.
In Chicago Linguistics Society 34, 1998.
[ bib ]
|
|
[22]
|
C. Mayo, M. Aylett, and D. R. Ladd.
Prosodic transcription of Glasgow English: an evaluation study of
GlaToBI.
In Intonation: Theory, Models and Applications, 1997.
[ bib |
.pdf ]
|