The Centre for Speech Technology Research, The University of Edinburgh

Opportunities

Funded Research Studentships

We currently have fully-funded PhD positions in Speech Technology, for the following projects. You are encouraged to make informal contact with the named supervisor(s) before submitting a formal application: we will assist you in preparing a strong application.

1. Broadcast Quality End-to-end Speech Synthesis

In collaboration with the BBC

The studentship will be jointly supervised by Simon King and Oliver Watts at CSTR, University of Edinburgh; and by Andrew McParland at BBC R&D.

Advances in neural networks made jointly in the fields of automatic speech recognition and speech synthesis, amongst others, have led to a new understanding of their capabilities as generative models. Neural networks can now directly generate synthetic speech waveforms, without the quality limitations imposed by a vocoder.

We have made separate advances, using neural networks to discover representations of spoken and written language that have applications in lightly-supervised text processing for almost any language, and for adaptation of speaker identity and style. The project will combine these techniques into a single end-to-end model for speech synthesis. This will require new techniques to learn from both text and speech data, which may have other applications, such as automatic speech recognition.

Funding: This PhD studentship is jointly supported by the EPSRC and the BBC under the Industrial CASE scheme. The studentship includes full EPSRC funding for 4 years, and includes internships and collaboration with BBC R&D. We would like the successful applicant to start by September 2017, but a later start date may be negotiable.

Eligibility: Applicants must meet the usual EPSRC eligibility requirements, which are summarised as: "Normally, to be eligible for a full award a student must have no restrictions on how long they can stay in the UK and have been ordinarily resident in the UK for at least 3 years prior to the start of the studentship (with some further constraint regarding residence for education)." Please contact us if you would like confirmation of your eligibility.

How to apply: read more details about studying for a PhD in CSTR / ILCC, and how to apply. Please indicate in the application that you wish to be considered for this studentship. Applications will be considered on a rolling basis until the position is filled.


2. Automatic Extraction of Rich Metadata from Broadcast Speech

In collaboration with the BBC

This EPSRC iCASE studentship will be jointly supervised by: Steve Renals and Mirella Lapata at the University of Edinburgh; and by Andrew McParland at BBC R&D.

The research studentship will be concerned with automatically learning to extract rich metadata from broadcast television recordings, using speech recognition and natural language processing techniques. We will build on recent advances in convolutional and recurrent neural networks, using architectures which learn representations jointly, considering both acoustic and textual data. The project will build on our current work in the rich transcription of broadcast speech using neural-network-based speech recognition systems, along with neural network approaches to machine reading and summarisation. In particular, we are interested in developing approaches to transcribing broadcast speech in a way appropriate to the particular context. This may include compression or distillation of the content (perhaps to fit in with the constraints of subtitling), transforming conversational speech into a form that is easier to read as text, or transcribing broadcast speech in a way appropriate for a particular reading age.

Funding: This PhD studentship is jointly supported by the EPSRC and the BBC under the Industrial CASE scheme. The studentship includes full EPSRC funding for 4 years, and includes internships and collaboration with BBC R&D. The successful applicant must start no later than September 2017.

Eligibility: Applicants must meet the usual EPSRC eligibility requirements, which are summarised as: "Normally, to be eligible for a full award a student must have no restrictions on how long they can stay in the UK and have been ordinarily resident in the UK for at least 3 years prior to the start of the studentship (with some further constraint regarding residence for education)." Please contact us if you would like confirmation of your eligibility.

How to apply: read more details about studying for a PhD in CSTR / ILCC, and how to apply. Please indicate in the application that you wish to be considered for this studentship. Applications will be considered on a rolling basis until the position is filled.


3. Distant Speech Recognition of Overlapping Speech

In collaboration with Toshiba Research Europe

This studentship will be supervised by Prof Steve Renals at the University of Edinburgh and by Dr Rama Doddipatla and Prof Yannis Stylianou at Toshiba.

The PhD research project will focus on the recognition of overlapped speech. There has been a variety of work in speech separation and recognition, dating back to the speech separation challenges of a decade ago. More recent work has addressed the problem using deep learning, including stacked LSTM networks and deep clustering.

In this project we shall investigate jointly modelling speech separation and speech recognition in order to better recognise overlapped speech, exploring both the single channel and multichannel cases. We shall initially investigate two principal research approaches: (1) Multi-task and adversarial learning, in which convolutional and recurrent networks will be trained with combined objective functions for both recognition and enhancement (or separation). (2) Attentional approaches that explicitly aim to attend to a single source at a time, building on gating architectures used in LSTM and deep feed-forward networks.
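As a rough illustration of what a combined objective function for approach (1) could look like, the sketch below forms a weighted sum of a recognition loss and an enhancement loss for a single frame. It is a minimal sketch under stated assumptions, not the method the project will use: the choice of cross-entropy and mean squared error, the function names, and the weight parameter are all hypothetical.

```python
import math

def cross_entropy(posteriors, target_index):
    """Recognition loss: negative log-probability of the correct class."""
    return -math.log(posteriors[target_index])

def mean_squared_error(estimate, reference):
    """Enhancement loss: mean squared difference between enhanced and clean features."""
    return sum((e - r) ** 2 for e, r in zip(estimate, reference)) / len(reference)

def combined_loss(posteriors, target_index, enhanced, clean, weight=0.5):
    """Weighted sum of the two task losses.

    `weight` is a hypothetical interpolation parameter: 1.0 trains for
    recognition only, 0.0 for enhancement only.
    """
    recognition = cross_entropy(posteriors, target_index)
    enhancement = mean_squared_error(enhanced, clean)
    return weight * recognition + (1.0 - weight) * enhancement

# Toy single-frame example with made-up values.
loss = combined_loss([0.7, 0.2, 0.1], 0, [0.9, 0.1], [1.0, 0.0], weight=0.5)
```

In a real system both losses would be computed from the outputs of a shared network and minimised together, so that the shared layers are shaped by both tasks.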

The project will also explore domain adaptive training. This is a necessity for distant speech recognition, as there are relatively few corpora with multiple distant microphone recordings.

The work will be mainly carried out using publicly available databases (in particular the AMI and ICSI corpora) and will use open source software toolkits (in particular, Kaldi).

Funding: This PhD studentship is supported by Toshiba Research Europe. The studentship includes full funding (fees + stipend) for 3.5 years, and includes an internship at Toshiba Research Europe's Cambridge laboratory. We would like the successful applicant to start by September 2017, but a later start date may be negotiable.

Eligibility: This studentship is subject to the usual eligibility requirements for PhD study at Edinburgh. The studentship can support either Home/EU or Overseas fees.

How to apply: read more details about studying for a PhD in CSTR / ILCC, and how to apply. Please indicate in the application that you wish to be considered for this studentship. Applications will be considered on a rolling basis until the position is filled.


General applications for our PhD programme

Please read about studying for a PhD at CSTR if you would like to apply to us with your own research proposal.

Further information

PhD Studentships in Data Science

Ten fully funded PhD studentships are available in data science as part of a new Centre for Doctoral Training in Data Science. Topics include machine learning, databases, natural language processing, speech processing, and other areas. This is a 4-year PhD programme that includes MSc-level coursework, and so is suitable for students coming directly from a Bachelor's degree. The studentships are funded by EPSRC and the University of Edinburgh. More information at datascience.inf.ed.ac.uk

Masters Programmes

There are several Masters programmes that provide excellent preparation for speech and language research.

Internships

We occasionally offer internships in CSTR. These are mainly for PhD students at other institutions, but we do consider undergraduate and Masters applicants. Opportunities change every year, so please contact us for more information.