11 Nov 2003

Steve Renals

Speech and crosstalk detection in multi-channel audio

Analysing recordings in which several microphones capture the activity of a group of speakers, as in a roundtable meeting, presents a number of computational challenges. For example, if each participant wears a microphone, that microphone can pick up speech both from its wearer (local speech) and from other participants (crosstalk). The audio on each channel can therefore be broadly classified into four categories: local speech, crosstalk plus local speech, crosstalk alone, and silence. In this talk I shall discuss some investigations into the automatic classification of audio into these four classes. In particular, I shall discuss the utility of various acoustic features (e.g. kurtosis, cross-correlation metrics and fundamentalness) for this problem, and the construction of some simple statistical models that use these features.
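As a rough illustration of two of the features named above, the following Python sketch computes the excess kurtosis of a single frame and the peak normalised cross-correlation between two channels. It is not the implementation used in the work described; the function names, the use of NumPy, and the choice of normalisation are assumptions made for this example. The motivation, stated loosely, is that a single talker's waveform tends to be more heavy-tailed than a sum of several talkers, and that crosstalk tends to appear as a delayed, attenuated copy of another channel.

```python
import numpy as np


def excess_kurtosis(frame):
    """Excess kurtosis of one audio frame (0 for a Gaussian signal).

    Hypothetical helper for illustration only.
    """
    x = np.asarray(frame, dtype=float)
    x = x - x.mean()
    var = np.mean(x ** 2)
    if var == 0.0:
        # A constant (e.g. digitally silent) frame has no defined kurtosis;
        # returning 0.0 here is an arbitrary convention for this sketch.
        return 0.0
    return float(np.mean(x ** 4) / var ** 2 - 3.0)


def peak_cross_correlation(a, b):
    """Largest absolute cross-correlation over all lags, scaled to [0, 1].

    Hypothetical helper for illustration only. Both frames are mean-removed
    and the result is divided by the product of their energies' square roots.
    """
    a = np.asarray(a, dtype=float)
    b = np.asarray(b, dtype=float)
    a = a - a.mean()
    b = b - b.mean()
    denom = np.sqrt(np.sum(a ** 2) * np.sum(b ** 2))
    if denom == 0.0:
        return 0.0
    xc = np.correlate(a, b, mode="full")
    return float(np.max(np.abs(xc)) / denom)
```

In a classifier of the kind described, values like these would be computed frame by frame on each channel and passed, together with other features, to a statistical model over the four classes; that step is not shown here.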

(Joint work with Stuart Wrigley and Guy Brown.)

