The Centre for Speech Technology Research, The University of Edinburgh

Publications by Maria Wolters

[1] Maria K. Wolters. The minimal effective dose of reminder technology. In Proceedings of the extended abstracts of the 32nd annual ACM conference on Human factors in computing systems - CHI EA '14, pages 771-780, New York, New York, USA, April 2014. ACM Press. [ bib | DOI | http ]
Remembering to take one's medication on time is hard work. This is true for younger people with no chronic illness as well as older people with many co-morbid conditions that require a complex medication regime. Many technological solutions have been proposed to help with this problem, but is more IT really the solution? In this paper, I argue that technological help should be limited to the minimal effective dose, which depends on the person and their living situation, and may well be zero.

Keywords: alerts, ehealth, medication, reminders, telecare
[2] Maria K. Wolters, Elaine Niven, and Robert H. Logie. The art of deleting snapshots. In Proceedings of the extended abstracts of the 32nd annual ACM conference on Human factors in computing systems - CHI EA '14, pages 2521-2526, New York, New York, USA, April 2014. ACM Press. [ bib | DOI | http ]
In this paper, we investigate why people decide to delete snapshots. 74 participants took snapshots of a street festival every three minutes for an hour and were then asked to cull pictures immediately or after a delay of a day, a week, or a month. We found that the ratio of kept to deleted pictures was fairly constant. Deletion criteria fell into six main categories that mostly involved subjective assessments such as whether a photo was sufficiently characteristic. We conclude that automatic tagging of photos for deletion is problematic; interfaces should instead make it easy for users to find and compare similar photos.

Keywords: forgetting, photowork, preservation
[3] Mark Hartswood, Maria Wolters, Jenny Ure, Stuart Anderson, and Marina Jirotka. Socio-material design for computer mediated social sensemaking. In Proc. CHI Workshop on Explorations in Social Interaction Design, April 2013. [ bib | .pdf ]
Telemonitoring healthcare solutions often struggle to provide the hoped-for efficiency improvements in managing chronic illness because of the difficulty of interpreting sensor data remotely. Computer-Mediated Social Sensemaking (CMSS) is an approach to solving this problem that leverages the patient's social network to supply the missing contextual detail so that remote doctors can make more accurate decisions. However, implementing CMSS solutions is difficult because users need to know who can see which information, and whether private and confidential information is adequately protected. In this paper, we wish to explore how socio-material design solutions might offer ways of making properties of a CMSS solution tangible to participants so that they can control and understand the implications of their participation.

[4] Ravichander Vipperla, Maria Wolters, and Steve Renals. Spoken dialogue interfaces for older people. In Kenneth J. Turner, editor, Advances in Home Care Technologies. IOS Press, 2012. [ bib | .pdf ]
Although speech is a highly natural mode of communication, building robust and usable speech-based interfaces is still a challenge, even if the target user group is restricted to younger users. When designing for older users, there are added complications due to cognitive, physiological, and anatomical ageing. Users may also find it difficult to adapt to the interaction style required by the speech interface. In this chapter, we summarise the work on spoken dialogue interfaces that was carried out during the MATCH project. After a brief overview of relevant aspects of ageing and previous work on spoken dialogue interfaces for older people, we summarise our work on managing spoken interactions (dialogue management), understanding older people's speech (speech recognition), and generating spoken messages that older people can understand (speech synthesis). We conclude with suggestions for design guidelines that have emerged from our work and suggest directions for future research.

[5] Christopher Burton, Brian McKinstry, Aurora Szentagotai Tatar, Antoni Serrano-Blanco, Claudia Pagliari, and Maria Wolters. Activity monitoring in patients with depression: A systematic review. Journal of Affective Disorders, (0):-, 2012. [ bib | DOI | http ]
Background: Altered physical activity is an important feature of depression. It is manifested in psychomotor retardation, agitation and withdrawal from engagement in normal activities. Modern devices for activity monitoring (actigraphs) make it possible to monitor physical activity unobtrusively but the validity of actigraphy as an indicator of mood state is uncertain. We carried out a systematic review of digital actigraphy in patients with depression to investigate the associations between measured physical activity and depression. Methods: Systematic review and meta-analysis. Studies were identified from Medline, EMBASE and Psycinfo databases and included if they were either case control or longitudinal studies of actigraphy in adults aged between 18 and 65 diagnosed with a depressive disorder. Outcomes were daytime and night-time activity and actigraphic measures of sleep. Results: We identified 19 eligible papers from 16 studies (412 patients). Case control studies showed less daytime activity in patients with depression (standardised mean difference −0.76, 95% confidence intervals −1.05 to −0.47). Longitudinal studies showed moderate increase in daytime activity (0.53, 0.20 to 0.87) and a reduction in night-time activity (−0.36, −0.65 to −0.06) over the course of treatment. Limitations: All study participants were unblinded. Only seven papers included patients treated in the community. Conclusions: Actigraphy is a potentially valuable source of additional information about patients with depression. However, there are no clear guidelines for use of actigraphy in studies of patients with depression. Further studies should investigate patients treated in the community. Additional work to develop algorithms for differentiating behaviour patterns is also needed.

[6] Maria Wolters, Karl Isaac, and Jason Doherty. Hold that thought: are spearcons less disruptive than spoken reminders? In CHI '12 Extended Abstracts on Human Factors in Computing Systems, CHI EA '12, pages 1745-1750, New York, NY, USA, 2012. ACM. [ bib | DOI | http ]
Keywords: irrelevant speech effect, reminders, spearcon, speech, working memory
[7] Maria Wolters and Colin Matheson. Designing Help4Mood: Trade-offs and choices. In Juan Miguel Garcia-Gomez and Patricia Paniagua-Paniagua, editors, Information and Communication Technologies applied to Mental Health. Editorial Universitat Politecnica de Valencia, 2012. [ bib ]
[8] Maria Wolters, Lucy McCloughan, Martin Gibson, Chris Weatherall, Colin Matheson, Tim Maloney, Juan Carlos Castro-Robles, and Soraya Estevez. Monitoring people with depression in the community - regulatory aspects. In Workshop on People, Computers and Psychiatry at the British Computer Society's Conference on Human Computer Interaction, pages 1745-1750, 2012. [ bib ]
[9] Claudia Pagliari, Maria Wolters, Chris Burton, Brian McKinstry, Aurora Szentagotai, Antoni Serrano-Blanco, Daniel David, Luis Ferrini, Susanna Albertini, Joan Carlos Castro, and Soraya Estévez. Psychosocial implications of avatar use in supporting therapy of depression. In CYBER17 - 17th Annual CyberPsychology & CyberTherapy Conference, 2012. [ bib ]
[10] Maria Wolters, Louis Ferrini, Juan Martinez-Miranda, Helen Hastie, and Chris Burton. Help4Mood - a flexible solution for supporting people with depression in the community across Europe. In Proceedings of The International eHealth, Telemedicine and Health ICT Forum For Education, Networking and Business (MedeTel, 2012). International Society for Telemedicine & eHealth (ISfTeH), 2012. [ bib ]
[11] Soraya Estevez, Juan Carlos Castro-Robles, and Maria Wolters. Help4Mood: First release of a computational distributed system to support the treatment of patients with major depression. In Proceedings of The International eHealth, Telemedicine and Health ICT Forum For Education, Networking and Business (MedeTel, 2012), pages 1745-1750. International Society for Telemedicine & eHealth (ISfTeH), 2012. [ bib ]
[12] Maria Wolters, Juan Martínez-Miranda, Helen Hastie, and Colin Matheson. Managing data in Help4Mood. In The 2nd International Workshop on Computing Paradigms for Mental Health - MindCare 2012, 2012. [ bib ]
[13] Maria Klara Wolters, Christine Johnson, and Karl B Isaac. Can the hearing handicap inventory for adults be used as a screen for perception experiments? In Proc. ICPhS XVII, Hong Kong, 2011. [ bib | .pdf ]
When screening participants for speech perception experiments, formal audiometric screens are often not an option, especially when studies are conducted over the Internet. We investigated whether a brief standardized self-report questionnaire, the screening version of the Hearing Handicap Inventory for Adults (HHIA-S), could be used to approximate the results of audiometric screening. Our results suggest that while the HHIA-S is useful, it needs to be used with extremely strict cut-off values that could exclude around 25% of people with no hearing impairment who are interested in participating. Well constructed, standardized single questions might be a more feasible alternative, in particular for web experiments.

[14] Andi K. Winterboer, Martin I. Tietze, Maria K. Wolters, and Johanna D. Moore. The user-model based summarize and refine approach improves information presentation in spoken dialog systems. Computer Speech and Language, 25(2):175-191, 2011. [ bib | .pdf ]
A common task for spoken dialog systems (SDS) is to help users select a suitable option (e.g., flight, hotel, and restaurant) from the set of options available. As the number of options increases, the system must have strategies for generating summaries that enable the user to browse the option space efficiently and successfully. In the user-model based summarize and refine approach (UMSR, Demberg and Moore, 2006), options are clustered to maximize utility with respect to a user model, and linguistic devices such as discourse cues and adverbials are used to highlight the trade-offs among the presented items. In a Wizard-of-Oz experiment, we show that the UMSR approach leads to improvements in task success, efficiency, and user satisfaction compared to an approach that clusters the available options to maximize coverage of the domain (Polifroni et al., 2003). In both a laboratory experiment and a web-based experimental paradigm employing the Amazon Mechanical Turk platform, we show that the discourse cues in UMSR summaries help users compare different options and choose between options, even though they do not improve verbatim recall. This effect was observed for both written and spoken stimuli.

[15] Kallirroi Georgila, Maria Wolters, Johanna D. Moore, and Robert H. Logie. The MATCH corpus: A corpus of older and younger users' interactions with spoken dialogue systems. Language Resources and Evaluation, 44(3):221-261, March 2010. [ bib | DOI ]
We present the MATCH corpus, a unique data set of 447 dialogues in which 26 older and 24 younger adults interact with nine different spoken dialogue systems. The systems varied in the number of options presented and the confirmation strategy used. The corpus also contains information about the users' cognitive abilities and detailed usability assessments of each dialogue system. The corpus, which was collected using a Wizard-of-Oz methodology, has been fully transcribed and annotated with dialogue acts and “Information State Update” (ISU) representations of dialogue context. Dialogue act and ISU annotations were performed semi-automatically. In addition to describing the corpus collection and annotation, we present a quantitative analysis of the interaction behaviour of older and younger users and discuss further applications of the corpus. We expect that the corpus will provide a key resource for modelling older people's interaction with spoken dialogue systems.

Keywords: Spoken dialogue corpora, Spoken dialogue systems, Cognitive ageing, Annotation, Information states, Speech acts, User simulations, Speech recognition
[16] Maria Wolters and Marilyn McGee-Lennon. Designing usable and acceptable reminders for the home. In Proc. AAATE Workshop AT Technology Transfer, Sheffield, UK, 2010. [ bib | .pdf ]
Electronic reminders can play a key role in enabling people to manage their care and remain independent in their own homes for longer. The MultiMemoHome project aims to develop reminder designs that are accessible and usable for users with a range of abilities and preferences. In an initial exploration of key design parameters, we surveyed 378 adults from all age groups online (N=206) and by post (N=172). The wide spread of preferences that we found illustrates the importance of adapting reminder solutions to individuals. We present two reusable personas that emerged from the research and discuss how questionnaires can be used for technology transfer.

[17] Maria Wolters, Klaus-Peter Engelbrecht, Florian Gödde, Sebastian Möller, Anja Naumann, and Robert Schleicher. Making it easier for older people to talk to smart homes: Using help prompts to shape users' speech. Universal Access in the Information Society, 9(4):311-325, 2010. [ bib | DOI ]
It is well known that help prompts shape how users talk to spoken dialogue systems. This study investigated the effect of help prompt placement on older users' interaction with a smart home interface. In the dynamic help condition, help was only given in response to system errors; in the inherent help condition, it was also given at the start of each task. Fifteen older and sixteen younger users interacted with a smart home system using two different scenarios. Each scenario consisted of several tasks. The linguistic style users employed to communicate with the system (interaction style) was measured using the ratio of commands to the overall utterance length (keyword ratio) and the percentage of content words in the user's utterance that could be understood by the system (shared vocabulary). While the timing of help prompts did not affect the interaction style of younger users, early task-specific help supported older users in adapting their interaction style to the system's capabilities. Well-placed help prompts can significantly increase the usability of spoken dialogue systems for older people.

[18] Maria K. Wolters, Karl B. Isaac, and Steve Renals. Evaluating speech synthesis intelligibility using Amazon Mechanical Turk. In Proc. 7th Speech Synthesis Workshop (SSW7), pages 136-141, 2010. [ bib | .pdf ]
Microtask platforms such as Amazon Mechanical Turk (AMT) are increasingly used to create speech and language resources. AMT in particular allows researchers to quickly recruit a large number of fairly demographically diverse participants. In this study, we investigated whether AMT can be used for comparing the intelligibility of speech synthesis systems. We conducted two experiments in the lab and via AMT, one comparing US English diphone to US English speaker-adaptive HTS synthesis and one comparing UK English unit selection to UK English speaker-dependent HTS synthesis. While AMT word error rates were worse than lab error rates, AMT results were more sensitive to relative differences between systems. This is mainly due to the larger number of listeners. Boxplots and multilevel modelling allowed us to identify listeners who performed particularly badly, while thresholding was sufficient to eliminate rogue workers. We conclude that AMT is a viable platform for synthetic speech intelligibility comparisons.

[19] Kallirroi Georgila, Maria Wolters, and Johanna D. Moore. Learning dialogue strategies from older and younger simulated users. In Proc. SIGDIAL, 2010. [ bib | .pdf ]
Older adults are a challenging user group because their behaviour can be highly variable. To the best of our knowledge, this is the first study where dialogue strategies are learned and evaluated with both simulated younger users and simulated older users. The simulated users were derived from a corpus of interactions with a strict system-initiative spoken dialogue system (SDS). Learning from simulated younger users leads to a policy which is close to one of the dialogue strategies of the underlying SDS, while the simulated older users allow us to learn more flexible dialogue strategies that accommodate mixed initiative. We conclude that simulated users are a useful technique for modelling the behaviour of new user groups.

[20] Maria K. Wolters, Florian Gödde, Sebastian Möller, and Klaus-Peter Engelbrecht. Finding patterns in user quality judgements. In Proc. ISCA Workshop Perceptual Quality of Speech Systems, Dresden, Germany, 2010. [ bib | .pdf ]
User quality judgements can show a bewildering amount of variation that is difficult to capture using traditional quality prediction approaches. Using clustering, an exploratory statistical analysis technique, we reanalysed the data set of a Wizard-of-Oz experiment where 25 users were asked to rate the dialogue after each turn. The sparse data problem was addressed by careful a priori parameter choices and comparison of the results of different cluster algorithms. We found two distinct classes of users, positive and critical. Positive users were generally happy with the dialogue system, and did not mind errors. Critical users downgraded their opinion of the system after errors, used a wider range of ratings, and were less likely to rate the system positively overall. These user groups could not be predicted by experience with spoken dialogue systems, attitude to spoken dialogue systems, affinity with technology, demographics, or short-term memory capacity. We suggest that evaluation research should focus on critical users and discuss how these might be identified.

[21] Maria Wolters, Ravichander Vipperla, and Steve Renals. Age recognition for spoken dialogue systems: Do we need it? In Proc. Interspeech, September 2009. [ bib | .pdf ]
When deciding whether to adapt relevant aspects of the system to the particular needs of older users, spoken dialogue systems often rely on automatic detection of chronological age. In this paper, we show that vocal ageing as measured by acoustic features is an unreliable indicator of the need for adaptation. Simple lexical features greatly improve the prediction of both relevant aspects of cognition and interaction style. Lexical features also boost age group prediction. We suggest that adaptation should be based on observed behaviour, not on chronological age, unless it is not feasible to build classifiers for relevant adaptation decisions.

[22] Christine Johnson, Pauline Campbell, Christine DePlacido, Amy Liddell, and Maria Wolters. Does peripheral hearing loss affect RGDT thresholds in older adults? In Proceedings of the American Auditory Society Conference, March 2009. [ bib | .pdf ]

[23] Ravi Chander Vipperla, Maria Wolters, Kallirroi Georgila, and Steve Renals. Speech input from older users in smart environments: Challenges and perspectives. In Proc. HCI International: Universal Access in Human-Computer Interaction. Intelligent and Ubiquitous Interaction Environments, number 5615 in Lecture Notes in Computer Science. Springer, 2009. [ bib | DOI | http | .pdf ]
Although older people are an important user group for smart environments, there has been relatively little work on adapting natural language interfaces to their requirements. In this paper, we focus on a particularly thorny problem: processing speech input from older users. Our experiments on the MATCH corpus show clearly that we need age-specific adaptation in order to recognize older users' speech reliably. Language models need to cover typical interaction patterns of older people, and acoustic models need to accommodate older voices. Further research is needed into intelligent adaptation techniques that will allow existing large, robust systems to be adapted with relatively small amounts of in-domain, age appropriate data. In addition, older users need to be supported with adequate strategies for handling speech recognition errors.

[24] Maria Wolters, Kallirroi Georgila, Sarah MacPherson, and Johanna Moore. Being old doesn't mean acting old: Older users' interaction with spoken dialogue systems. ACM Transactions on Accessible Computing, 2(1):1-39, 2009. [ bib | http ]
Most studies on adapting voice interfaces to older users work top-down by comparing the interaction behavior of older and younger users. In contrast, we present a bottom-up approach. A statistical cluster analysis of 447 appointment scheduling dialogs between 50 older and younger users and 9 simulated spoken dialog systems revealed two main user groups, a “social” group and a “factual” group. “Factual” users adapted quickly to the systems and interacted efficiently with them. “Social” users, on the other hand, were more likely to treat the system like a human, and did not adapt their interaction style. While almost all “social” users were older, over a third of all older users belonged in the “factual” group. Cognitive abilities and gender did not predict group membership. We conclude that spoken dialog systems should adapt to users based on observed behavior, not on age.

[25] Maria Wolters, Kallirroi Georgila, Robert Logie, Sarah MacPherson, Johanna Moore, and Matt Watson. Reducing working memory load in spoken dialogue systems. Interacting with Computers, 21(4):276-287, 2009. [ bib | .pdf ]
We evaluated two strategies for alleviating working memory load for users of voice interfaces: presenting fewer options per turn and providing confirmations. Forty-eight users booked appointments using nine different dialogue systems, which varied in the number of options presented and the confirmation strategy used. Participants also performed four cognitive tests and rated the usability of each dialogue system on a standardised questionnaire. When systems presented more options per turn and avoided explicit confirmation subdialogues, both older and younger users booked appointments more quickly without compromising task success. Users with lower information processing speed were less likely to remember all relevant aspects of the appointment. Working memory span did not affect appointment recall. Older users were slightly less satisfied with the dialogue systems than younger users. We conclude that the number of options is less important than an accurate assessment of the actual cognitive demands of the task at hand.

[26] Maria Wolters, Pauline Campbell, Christine DePlacido, Amy Liddell, and David Owens. Adapting Speech Synthesis Systems to Users with Age-Related Hearing Loss. In Beiträge der 8. ITG Fachtagung Sprachkommunikation, September 2008. [ bib | .pdf ]
This paper summarises the main results of a pilot study into the effect of auditory ageing on the intelligibility of synthetic speech. 32 older and 12 younger users had to answer simple questions about a series of meeting reminders and medication reminders. They also underwent an extensive battery of audiological and cognitive assessments. Older users only had more difficulty understanding the synthetic voice than younger people if they had elevated pure-tone thresholds and if they were asked about unfamiliar medication names. We suggest that these problems can be remedied by better prompt design. User interviews show that the synthetic voice used was quite natural. Problems mentioned by users fit the results of a previous error analysis.

[27] Florian Gödde, Sebastian Möller, Klaus-Peter Engelbrecht, Christine Kühnel, Robert Schleicher, Anja Naumann, and Maria Wolters. Study of a speech-based smart home system with older users. In International Workshop on Intelligent User Interfaces for Ambient Assisted Living, pages 17-22, 2008. [ bib ]
[28] Kallirroi Georgila, Maria Wolters, Vasilis Karaiskos, Melissa Kronenthal, Robert Logie, Neil Mayo, Johanna Moore, and Matt Watson. A fully annotated corpus for studying the effect of cognitive ageing on users' interactions with spoken dialogue systems. In Proceedings of the 6th International Conference on Language Resources and Evaluation, 2008. [ bib ]
[29] Sebastian Möller, Florian Gödde, and Maria Wolters. A corpus analysis of spoken smart-home interactions with older users. In Proceedings of the 6th International Conference on Language Resources and Evaluation, 2008. [ bib ]
[30] Maggie Morgan, Marilyn R. McGee-Lennon, Nick Hine, John Arnott, Chris Martin, Julia S. Clark, and Maria Wolters. Requirements gathering with diverse user groups and stakeholders. In Proc. 26th Conference on Computer-Human Interaction, Florence, 2008. [ bib ]
[31] Maria Wolters, Pauline Campbell, Christine DePlacido, Amy Liddell, and David Owens. The effect of hearing loss on the intelligibility of synthetic speech. In Proc. Intl. Conf. Phon. Sci., August 2007. [ bib | .pdf ]
Many factors affect the intelligibility of synthetic speech. One aspect that has been severely neglected in past work is hearing loss. In this study, we investigate whether pure-tone audiometry thresholds across a wide range of frequencies (0.25-20kHz) are correlated with participants' performance on a simple task that involves accurately recalling and processing reminders. Participants' scores correlate not only with thresholds in the frequency ranges commonly associated with speech, but also with extended high-frequency thresholds.

[32] Maria Wolters, Pauline Campbell, Christine DePlacido, Amy Liddell, and David Owens. Making synthetic speech accessible to older people. In Proc. Sixth ISCA Workshop on Speech Synthesis, Bonn, Germany, August 2007. [ bib | .pdf ]
In this paper, we report on an experiment that tested users' ability to understand the content of spoken auditory reminders. Users heard meeting reminders and medication reminders spoken in both a natural and a synthetic voice. Our results show that older users can understand synthetic speech as well as younger users provided that the prompt texts are well-designed, using familiar words and contextual cues. As soon as unfamiliar and complex words are introduced, users' hearing affects how well they can understand the synthetic voice, even if their hearing would pass common screening tests for speech synthesis experiments. Although hearing thresholds correlate best with users' performance, central auditory processing may also influence performance, especially when complex errors are made.

[33] Maria Wolters, Pauline Campbell, Christine DePlacido, Amy Liddell, and David Owens. The role of outer hair cell function in the perception of synthetic versus natural speech. In Proc. Interspeech, August 2007. [ bib | .pdf ]
Hearing loss as assessed by pure-tone audiometry (PTA) is significantly correlated with the intelligibility of synthetic speech. However, PTA is a subjective audiological measure that assesses the entire auditory pathway and does not discriminate between the different afferent and efferent contributions. In this paper, we focus on one particular aspect of hearing that has been shown to correlate with hearing loss: outer hair cell (OHC) function. One role of OHCs is to increase sensitivity and frequency selectivity. This function of OHCs can be assessed quickly and objectively through otoacoustic emissions (OAE) testing, which is little known outside the field of audiology. We find that OHC function affects the perception of human speech, but not that of synthetic speech. This has important implications not just for audiological and electrophysiological research, but also for adapting speech synthesis to ageing ears.

[34] David Owens, Pauline Campbell, Amy Liddell, Christine DePlacido, and Maria Wolters. Random gap detection threshold: A useful measure of auditory ageing? In Proc. Europ. Cong. Fed. Audiol. Heidelberg, Germany, June 2007. [ bib | .pdf ]

[35] Amy Liddell, David Owens, Pauline Campbell, Christine DePlacido, and Maria Wolters. Can extended high frequency hearing thresholds be used to detect auditory processing difficulties in an ageing population? In Proc. Europ. Cong. Fed. Audiol. Heidelberg, Germany, June 2007. [ bib ]

[36] Marilyn McGee-Lennon, Maria Wolters, and Tony McBryan. Auditory reminders in the home. In Proc. Intl. Conf. Auditory Display (ICAD), Montreal, Canada, June 2007. [ bib ]

[37] David Beaver, Brady Zack Clark, Edward Flemming, T. Florian Jaeger, and Maria Wolters. When semantics meets phonetics: Acoustical studies of second occurrence focus. Language, 83(2):245-276, 2007. [ bib | .pdf ]
[38] Heike Penner, Nicholas Miller, and Maria Wolters. Motor speech disorders in three Parkinsonian syndromes: A comparative study. In Proc. Intl. Conf. Phon. Sci., 2007. [ bib ]