[1]
Maria Wolters, Karl Isaac, and Jason Doherty.
Hold that thought: are spearcons less disruptive than spoken
reminders?
In CHI '12 Extended Abstracts on Human Factors in Computing
Systems, CHI EA '12, pages 1745–1750, New York, NY, USA, 2012. ACM.
[ bib | DOI | http ]
Keywords: irrelevant speech effect, reminders, spearcon, speech,
working memory
[2]
Maria Klara Wolters, Christine Johnson, and Karl B. Isaac.
Can the Hearing Handicap Inventory for Adults be used as a screen for
perception experiments?
In Proc. ICPhS XVII, Hong Kong, 2011.
[ bib | .pdf ]
When screening participants for speech perception
experiments, formal audiometric screens are often not
an option, especially when studies are conducted over
the Internet. We investigated whether a brief
standardized self-report questionnaire, the screening
version of the Hearing Handicap Inventory for Adults
(HHIA-S), could be used to approximate the results of
audiometric screening. Our results suggest that while
the HHIA-S is useful, it needs to be used with
extremely strict cut-off values that could exclude
around 25% of people with no hearing impairment who
are interested in participating. Well-constructed,
standardized single questions might be a more feasible
alternative, in particular for web experiments.
[3]
Maria K. Wolters, Karl B. Isaac, and Steve Renals.
Evaluating speech synthesis intelligibility using Amazon Mechanical
Turk.
In Proc. 7th Speech Synthesis Workshop (SSW7), pages 136–141, 2010.
[ bib | .pdf ]
Microtask platforms such as Amazon Mechanical Turk
(AMT) are increasingly used to create speech and
language resources. AMT in particular allows
researchers to quickly recruit a large number of fairly
demographically diverse participants. In this study, we
investigated whether AMT can be used for comparing the
intelligibility of speech synthesis systems. We
conducted two experiments in the lab and via AMT, one
comparing US English diphone to US English
speaker-adaptive HTS synthesis and one comparing UK
English unit selection to UK English speaker-dependent
HTS synthesis. While word error rates on AMT were higher
than in the lab, the AMT results were more sensitive
to relative differences between systems, mainly
because of the larger number of listeners. Boxplots and
multilevel modelling allowed us to identify listeners
who performed particularly badly, while thresholding
was sufficient to eliminate rogue workers. We conclude
that AMT is a viable platform for synthetic speech
intelligibility comparisons.