The Centre for Speech Technology Research, The University of Edinburgh

08 Jun 2004

Moritz Neugebauer (University College Dublin)


Tree-based Acoustic Modelling with Phonological Constraints

Decision-tree based state tying has become increasingly popular for modelling context dependency in large-vocabulary speech recognition, for two reasons. First, because decision trees classify and predict, they can supply models for units or contexts that do not occur in the training data. Second, the node-splitting procedure of decision-tree based state tying is a model selection process: it balances model complexity, and hence the number of parameters, against the limited amount of training data, so that the model parameters can be estimated robustly.
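The first property can be illustrated with a minimal sketch. The tree, the phoneme classes and the tied-state names below are hypothetical and chosen only for illustration; the point is that a triphone context never seen in training still reaches a leaf, and therefore still receives an acoustic model.

```python
from dataclasses import dataclass
from typing import Optional, Set


@dataclass
class Node:
    """A tree node: either a question (side + phones) or a leaf (tied_state)."""
    side: Optional[str] = None          # "left" or "right" context
    phones: Optional[Set[str]] = None   # phoneme class the question asks about
    yes: Optional["Node"] = None
    no: Optional["Node"] = None
    tied_state: Optional[str] = None    # set only on leaves


def lookup(node: Node, left: str, right: str) -> str:
    """Walk the tree for a (left, right) context and return the tied state."""
    while node.tied_state is None:
        context = left if node.side == "left" else right
        node = node.yes if context in node.phones else node.no
    return node.tied_state


# Hypothetical phoneme classes and tree for one state of one centre phoneme.
NASALS = {"m", "n", "ng"}
VOWELS = {"a", "e", "i", "o", "u"}

tree = Node(
    "left", NASALS,
    yes=Node(tied_state="tied_1"),
    no=Node(
        "right", VOWELS,
        yes=Node(tied_state="tied_2"),
        no=Node(tied_state="tied_3"),
    ),
)

# A context that may never have occurred in training is still assigned a model.
print(lookup(tree, "ng", "t"))
```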

Decision trees are built from a set of phonetic questions which refer to classes such as vowels or plosives in order to assign triphone states to appropriate acoustic models. The assumption behind the choice of phoneme classes is that phonemes which belong to the same class have a similar acoustic effect on neighbouring sounds. The standard approach to deriving the phonetic questions for a particular task with a specific phoneme set is to rely on a human expert.
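How a phonetic question is chosen at a node can be sketched as follows. The abstract does not state the splitting criterion, so this sketch assumes the commonly used likelihood-gain criterion with a single one-dimensional Gaussian per cluster; the statistics, question names and phoneme classes are made up for illustration.

```python
import math
from dataclasses import dataclass


@dataclass
class StateStats:
    """Sufficient statistics of one triphone state (1-D, illustrative)."""
    left: str     # left context phoneme
    right: str    # right context phoneme
    count: float  # state occupancy
    mean: float
    var: float


def pooled_loglik(states):
    """Log-likelihood of the data if all states share one Gaussian."""
    n = sum(s.count for s in states)
    if n == 0:
        return 0.0
    mean = sum(s.count * s.mean for s in states) / n
    second = sum(s.count * (s.var + s.mean ** 2) for s in states) / n
    var = max(second - mean ** 2, 1e-6)  # floor to avoid log(0)
    return -0.5 * n * (math.log(2 * math.pi * var) + 1.0)


def best_question(states, questions):
    """Return (name, gain) of the question with the largest likelihood gain."""
    parent = pooled_loglik(states)
    best_name, best_gain = None, 0.0
    for name, (side, phones) in questions.items():
        yes = [s for s in states if getattr(s, side) in phones]
        no = [s for s in states if getattr(s, side) not in phones]
        if not yes or not no:
            continue  # the question does not split this node
        gain = pooled_loglik(yes) + pooled_loglik(no) - parent
        if gain > best_gain:
            best_name, best_gain = name, gain
    return best_name, best_gain


# Hypothetical question set and statistics.
questions = {
    "L_Nasal": ("left", {"m", "n", "ng"}),
    "R_Vowel": ("right", {"a", "e", "i", "o", "u"}),
}
states = [
    StateStats("m", "a", 10, 2.0, 1.0),
    StateStats("n", "t", 10, 2.1, 1.0),
    StateStats("p", "a", 10, -2.0, 1.0),
    StateStats("t", "t", 10, -1.9, 1.0),
]
name, gain = best_question(states, questions)
```

In this toy data the states with a nasal left context have similar means, so the left-nasal question separates them cleanly and is selected. In a full system the split would typically be accepted only if the gain and the occupancy of both children exceed thresholds, which is where the model selection aspect comes in.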

This presentation introduces a new method which defines this question set automatically. To this end, tree learning algorithms are interleaved with a knowledge representation component. Decision trees are explicitly linked to deduced phonological feature descriptions, which then provide a classification scheme of phoneme contexts. The following components will be presented in detail: (a) the automatic learning of paradigmatic phonological constraints, (b) their formal representation and (c) their application to tree-based state tying for speech recognition.
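The general idea of turning feature descriptions into a question set can be sketched as below. This is not the method of the talk, whose details the abstract does not give; it only assumes that each phoneme has a set of phonological features and that every feature defines a class of phonemes which can be asked about on either side. The feature table is hypothetical.

```python
# Hypothetical phoneme-to-feature table (illustrative only).
FEATURES = {
    "p": {"plosive", "voiceless", "labial"},
    "b": {"plosive", "voiced", "labial"},
    "t": {"plosive", "voiceless", "alveolar"},
    "m": {"nasal", "voiced", "labial"},
    "n": {"nasal", "voiced", "alveolar"},
    "a": {"vowel", "voiced", "open"},
    "i": {"vowel", "voiced", "close"},
}


def feature_classes(features):
    """Group phonemes by feature, dropping classes that contain every phoneme."""
    classes = {}
    for phone, feats in features.items():
        for feat in feats:
            classes.setdefault(feat, set()).add(phone)
    return {f: ps for f, ps in classes.items() if len(ps) < len(features)}


def build_questions(features):
    """Create one left-context and one right-context question per class."""
    questions = {}
    for feat, phones in sorted(feature_classes(features).items()):
        questions[f"L_{feat}"] = ("left", phones)
        questions[f"R_{feat}"] = ("right", phones)
    return questions


questions = build_questions(FEATURES)
for name in sorted(questions):
    side, phones = questions[name]
    print(name, side, sorted(phones))
```

The resulting dictionary has the same shape a hand-written question set would have, so it could be passed to a tree-building procedure unchanged.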

