CSTR Projects
Active Research Projects
- Acoustic-articulatory inversion: This inversion project aims to estimate the articulatory movements which underpin an acoustic speech signal.
- Combilex: Combilex is a high-quality multi-accent pronunciation lexicon for English with several advanced features.
- Deep architectures for statistical speech synthesis: This fellowship is concerned with developing a new model for statistical speech synthesis which allows us to include more information about how speech is produced, as well as information about how it is perceived and how external factors, such as background noise, affect speech.
- The Edinburgh Speech Tools: The Speech Tools are a set of core libraries used by Festival and various other applications.
- EU-Bridge: EU-Bridge is a three-year project which will develop automatic transcription and translation technology to enable innovative multimedia captioning and translation services for audiovisual documents between European and non-European languages. The project will provide streaming technology that can convert speech from lectures, meetings, and telephone conversations into text in another language. Within Edinburgh, CSTR will work closely with the Statistical Machine Translation Group.
- The Festival speech synthesis system: Festival is a free, general-purpose, multi-lingual speech synthesis system developed at CSTR.
- InEvent: InEvent is a three-year project whose main goal is to develop new means to structure, retrieve, and share large archives of networked, dynamically changing multimedia recordings, mainly consisting of meetings, video-conferences, and lectures.
- INSPIRE: INSPIRE is a Marie Curie Initial Training Network, concerned with investigating speech processing in realistic environments.
- LISTA: The Listening Talker: LISTA is an EU project about speaker- and environment-adaptive speech synthesis and speech modification.
- MultiMemoHome: MultiMemoHome is a research project aiming to develop user-friendly, accessible and effective reminder systems in order to improve home care.
- Natural Speech Technology: Natural Speech Technology (NST) is a 5-year EPSRC Programme Grant with the aim of significantly advancing the state-of-the-art in speech technology by making it more natural, approaching human levels of reliability, adaptability and conversational richness. NST is a collaboration between CSTR, the Speech Group at the University of Cambridge and the Speech and Hearing Research Group (SpandH), University of Sheffield.
- RSE / NSFC Bilateral Research Award: The Royal Society of Edinburgh / National Natural Science Foundation of China travel grant has been awarded to CSTR and USTC to support further joint research on our novel framework for speech synthesis.
- SALB: Speech synthesis of Auditory Lecture books for Blind children: In this project we want to evaluate HMM-based synthesis of different language varieties (standard, dialect, sociolect) for audio lecture books. Moreover, we want to analyse the influence of different social roles (teacher vs. student), as well as of the self-perception and perception of others that exist between the listener and the person whose voice is synthesized.
- SCALE: SCALE is a Marie Curie Initial Training Network. Its research themes are automatic speech recognition, machine learning, speech synthesis, signal processing, and human speech recognition.
- Simple4All: The Simple4All project will create speech synthesis technology which learns from data with little or no expert supervision, and continually improves simply by being used.
- SSPNet: The Social Signal Processing Network: SSPNet is an EU FP7 Network of Excellence project about social signal processing.
- uDialogue: Joint with the Nagoya Institute of Technology, uDialogue is a five-year project concerned with crowdsourcing multimodal dialogue systems, speech synthesis, and speech recognition.
- Ultrax: Ultrax aims to develop ultrasound scanning technology into a useful and effective tool for child speech therapy.
- Voicebank: The Voicebank project aims to develop clinical applications of HMM-based speech synthesis such as personalised voices for communication aids. The project is a collaboration between CSTR, the Euan MacDonald Centre for Motor Neurone Disease and the Anne Rowling Regenerative Neurology Clinic.
- Voice Building KTP: This is a Knowledge Transfer Partnership with Orange/France Telecom. The aim of the KTP is to improve automatic voice building through the development and integration of novel automatic speech recognition techniques, and to build commercial-grade systems that bring personalised speech technology to Orange customers.
Project Archives
(this is an incomplete list)
- AMI: Augmented Multiparty Interaction: AMI is an EU Integrated Project about computer-enhanced multi-modal interaction in the context of meetings.
- AMIDA: Augmented Multiparty Interaction with Distance Access: AMIDA is an EU Integrated Project, following on from AMI, about computer-enhanced multi-modal interaction in the context of meetings.
- The Articulatory Database Registry: A directory of articulatory data resources
- BARKS: The BARKS project explored the use of switching linear dynamic models for automatic speech recognition.
- Cougar: The Cougar project investigated the use of an articulatory(-like) domain for calculating join costs, in conjunction with smoothing unit transitions, in unit selection speech synthesis.
- EMIME: Enhanced Multilingual Interaction in Mobile Environments: EMIME is an EU project about personalised speech synthesis and speech-to-speech translation
- ESLASR: ESLASR aimed to improve the quality of automatic speech recognition using loosely-coupled HMMs with articulatory-acoustic features.
- Espresso: Novel acoustic models for ASR
- Expressions: Expressions aimed to improve the quality of prosody and intonation in unit selection speech synthesis.
- EUSTACE: Edinburgh University Speech Timing Archive and Corpus of English: The EUSTACE corpus comprises 4608 spoken sentences designed to examine a number of durational effects in speech.
- Feature MLPs: A freely available set of articulatory feature MLPs trained on 2000 hours of conversational telephone speech.
- Evaluating pitch determination algorithms: Paul Bagshaw's database for evaluating pitch determination algorithms
- ID4S: Intonation in Dialogue for Speech Recognition
- Instrumented Meeting Room (AMIDA WP3): Smart room environments for communication scene analysis
- M4: Multimodal Meeting Manager: M4 was concerned with the construction of a demonstration system to enable structuring, browsing and querying of an archive of automatically analysed meetings.
- SABLE: The Sable Consortium
- Satissfy: Statistical Analysis of Text In Speech Synthesis
- Sole: The Spoken Output Labelling Explorer
- SVitchboard: Small vocabulary tasks from Switchboard
- TESS: Testing Evaluation of Speech Synthesis: TESS is a project designed to investigate the psychoacoustic processes underlying human auditory evaluation of synthetic speech, with the goal of developing more perceptually principled evaluation methods.
- Unisyn Lexicon: A keysymbol lexicon
- Voice transformation: Transforming the quality and intonation of the speech of one speaker so that it sounds like another speaker
- Welsh: Welsh Speech Database