11 Nov 2003
Steve Renals
Speech and crosstalk detection in multi-channel audio
Analysing scenarios in which several microphones record the activity of speakers, such as a roundtable meeting, presents a number of computational challenges. For example, if each participant wears a microphone, that microphone can pick up speech from both its wearer (local speech) and other participants (crosstalk). The audio recorded on each channel can be broadly classified into four classes: local speech, crosstalk plus local speech, crosstalk alone, and silence. In this talk I shall discuss some investigations into the automatic classification of audio into these four classes. In particular, I shall discuss the utility of various acoustic features (e.g. kurtosis, cross-correlation metrics and fundamentalness) for this problem, and the construction of some simple statistical models that use these features.
(Joint work with Stuart Wrigley and Guy Brown.)
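
As a rough illustration of two of the feature types named in the abstract, the sketch below computes the excess kurtosis of a single-channel frame and the peak normalised cross-correlation between two channels. It is not the speakers' implementation: the function names, the frame handling, and the use of NumPy are all assumptions made for this example. A commonly cited motivation for kurtosis is that a single talker's waveform tends to be more heavy-tailed than a mixture of talkers, but the sketch only shows how the quantities are computed.

```python
import numpy as np

def excess_kurtosis(frame):
    """Excess kurtosis of one frame (0 for a Gaussian signal).

    Hypothetical helper for illustration only.
    """
    x = np.asarray(frame, dtype=float)
    x = x - x.mean()
    var = np.mean(x ** 2)
    if var == 0.0:
        return 0.0  # silent or constant frame: define the feature as 0
    return float(np.mean(x ** 4) / var ** 2 - 3.0)

def peak_normalised_xcorr(a, b):
    """Largest absolute cross-correlation over all lags, scaled to [0, 1].

    Hypothetical helper; a high value suggests the two channels
    contain the same source.
    """
    a = np.asarray(a, dtype=float)
    b = np.asarray(b, dtype=float)
    a = a - a.mean()
    b = b - b.mean()
    denom = np.sqrt(np.sum(a ** 2) * np.sum(b ** 2))
    if denom == 0.0:
        return 0.0
    return float(np.max(np.abs(np.correlate(a, b, mode="full"))) / denom)

if __name__ == "__main__":
    rng = np.random.default_rng(0)
    heavy_tailed = rng.laplace(size=512)   # stand-in for a peaky signal
    gaussian = rng.normal(size=512)        # stand-in for a diffuse mixture
    print(excess_kurtosis(heavy_tailed), excess_kurtosis(gaussian))
    print(peak_normalised_xcorr(heavy_tailed, heavy_tailed))
```

In a classifier of the kind the abstract describes, per-frame values such as these would be fed to a statistical model; which model and which frame length to use are left open here.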
<owner-pworkshop@ling.ed.ac.uk>