WWW pages of 3rd European Master School on Language and Speech

Using the Keyword Lexicon for Speech Recognition

Christophe Van Bael
(University of Edinburgh)

Recent work at the Centre for Speech Technology Research (CSTR, The University of Edinburgh) has produced the Keyword Lexicon, a dialect/accent-independent lexicon for speech synthesis. Its main purpose is to bypass the problems and cost of writing a new lexicon for every dialect or accent needed for synthesis.

How to deal with pronunciation variation in automatic speech recognition (ASR) is currently an active research topic. It is clear that there is variation in pronunciation (at least across speakers) which cannot be handled by the acoustic models alone. Therefore, multiple pronunciations for some words must be included in the lexicon. This can improve recognition accuracy, but if too many variants are included, accuracy actually decreases. Since variation within a single speaker is relatively small, we propose a novel method using speaker-specific lexica for ASR, which include only those pronunciation variants appropriate for a single speaker's accent. The main aim of this project is therefore to use the Keyword Lexicon to generate such speaker-specific lexica and to use these lexica in a standard HMM-based recognition system. We will evaluate the new method on a standard speech recognition benchmark task (WSJCAM0, a British English version of the Wall Street Journal corpus).
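As a rough illustration of the idea of a speaker-specific lexicon, the following Python sketch filters a pool of pronunciation variants down to those matching one speaker's accent. Everything in it is an assumption made for illustration: the accent tags, the phone strings, the toy entries and the function name speaker_lexicon are hypothetical, and the sketch does not reflect the actual format of the Keyword Lexicon or the method used in the project.

```python
from collections import defaultdict

# Hypothetical toy data: (word, accent tag, phone string).
# The tags and phone symbols are invented for this sketch only.
VARIANTS = [
    ("bath", "short_a", "b a th"),
    ("bath", "long_a", "b aa th"),
    ("car", "rhotic", "k aa r"),
    ("car", "non_rhotic", "k aa"),
    ("market", "rhotic", "m aa r k i t"),
    ("market", "non_rhotic", "m aa k i t"),
]


def speaker_lexicon(variants, speaker_accents):
    """Keep only the variants whose accent tag applies to this speaker."""
    lexicon = defaultdict(list)
    for word, accent, phones in variants:
        # Skip variants for other accents; avoid duplicate pronunciations.
        if accent in speaker_accents and phones not in lexicon[word]:
            lexicon[word].append(phones)
    return dict(lexicon)


# A hypothetical speaker with a long "a" in "bath" and a non-rhotic accent.
lexicon = speaker_lexicon(VARIANTS, {"long_a", "non_rhotic"})
for word, prons in sorted(lexicon.items()):
    print(word, prons)
```

In this toy case the pooled lexicon has two variants per word, while the speaker-specific one has a single variant per word, which is the kind of reduction the proposed method aims for.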