BARKS
Project Summary
The BARKS project is exploring the use of switching linear dynamic models for automatic speech recognition.
Project Details
BARKS stands for 'Better Recognition through Kalman Switching'.
This project concerns the application of linear dynamic models (LDMs)
to automatic speech recognition (ASR). The LDM is a generative model
which gives a time-varying multivariate Gaussian distribution over the
observations. Underlying dynamics are modelled by the state, which
evolves according to a first-order auto-regressive (AR) process. The
potential benefits of using such a model for ASR compared to hidden
Markov models (HMMs) include:
- the first-order dynamics of the state give a model of inter-frame correlations.
- spatial correlations can be modelled fully, or approximated via projection of a lower-dimensional state.
- passing state information across phone boundaries relaxes the assumption of segmental independence.
- a continuous underlying representation reflects known properties of speech production.
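As an illustration of the model described above, the following is a minimal sketch of sampling from a toy LDM. It is not the project's implementation: the function name sample_ldm, the scalar state, the two-dimensional observation, and all parameter values are assumptions chosen for brevity. The state follows a first-order AR process, and each observation is a noisy projection of the lower-dimensional state.

```python
import random


def sample_ldm(n_frames, f=0.9, q=0.1, h=(1.0, -0.5), r=(0.05, 0.05),
               x0=0.0, seed=0):
    """Draw states and observations from a toy LDM (illustrative only).

    State:        x_t = f * x_{t-1} + w_t,   w_t ~ N(0, q)
    Observation:  y_t = h * x_t + v_t,       v_t ~ N(0, diag(r))

    The state is scalar and the observation has len(h) dimensions;
    both choices are assumptions made to keep the sketch short.
    """
    rng = random.Random(seed)
    x = x0
    states, observations = [], []
    for _ in range(n_frames):
        # First-order auto-regressive state evolution.
        x = f * x + rng.gauss(0.0, q ** 0.5)
        # Project the state into observation space and add Gaussian noise.
        y = tuple(h_i * x + rng.gauss(0.0, r_i ** 0.5)
                  for h_i, r_i in zip(h, r))
        states.append(x)
        observations.append(y)
    return states, observations
```

Because every observation dimension is driven by the same state, the dimensions are correlated with each other and across frames, even though the observation noise here is diagonal.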
Previous work has demonstrated a benefit from the addition of a hidden dynamic state. The current project extends this by developing a switching system which will allow:
- multimodal output distributions without introducing problems of computational intractability.
- approximation of non-linear dynamics whilst retaining the
linear-Gaussian properties which make filtering simple.
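To show why the linear-Gaussian properties keep filtering simple, here is a minimal sketch of a scalar Kalman filter whose parameters are swapped according to a regime label for each frame. It assumes the regime sequence is already known, which is a simplification made for illustration; the names kalman_step and filter_with_regimes, the parameter dictionary layout, and the scalar state and observation are all hypothetical and do not describe the project's actual switching system.

```python
def kalman_step(mean, var, y, f, q, h, r):
    """One predict/update step of a scalar Kalman filter."""
    # Predict: push the state estimate through the linear dynamics.
    mean_pred = f * mean
    var_pred = f * var * f + q
    # Update: correct the prediction using the new observation y.
    innovation = y - h * mean_pred
    innovation_var = h * var_pred * h + r
    gain = var_pred * h / innovation_var
    mean_new = mean_pred + gain * innovation
    var_new = (1.0 - gain * h) * var_pred
    return mean_new, var_new


def filter_with_regimes(ys, regimes, params, mean0=0.0, var0=1.0):
    """Filter a sequence, choosing LDM parameters by regime label.

    params maps each regime label to a dict with keys f, q, h, r.
    The state estimate is carried across regime changes rather than
    being reset at each boundary.
    """
    mean, var = mean0, var0
    estimates = []
    for y, regime in zip(ys, regimes):
        p = params[regime]
        mean, var = kalman_step(mean, var, y, p["f"], p["q"], p["h"], p["r"])
        estimates.append((mean, var))
    return estimates
```

Within each regime the posterior over the state remains a single Gaussian, so every step costs the same closed-form update. Carrying the estimate over a regime change is one way to pass state information across a boundary.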
Personnel
Funding Source
The Engineering and Physical Sciences Research Council (EPSRC grant GR/S21281/01)