Detecting unknown words in spontaneous speech
Josep Casarramona

This presentation
deals with the problem of out-of-vocabulary (OOV) words. An important
task for a speech recognizer is to detect those words that have probably
been recognized incorrectly. A special case is words that are not included in the
vocabulary. The speech recognizer behaves as usual when an OOV
word is presented: it searches its vocabulary for the word that best matches
what it has heard, so the resulting hypothesis is always an
in-vocabulary word. During recognition, however, it may be
possible to detect indicators of this misrecognition. In this work,
a number of such features are collected and fed
into a classifier, which yields a confidence measure for each
word. We extend the framework of confidence estimation so that, for
each word, a probability of belonging to each of the three classes COR, INC
and OOV is obtained. The results show that this approach is
promising, and that the alignment, the classifier and the structure of the
classification are all very important for obtaining good results.
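
As an illustration of the three-class confidence estimate described above, the following minimal sketch maps a word's feature vector to probabilities for COR, INC and OOV. It assumes a linear scoring model followed by a softmax; the abstract does not say which classifier or which features were actually used, so the function names, the two example features and all weight values are hypothetical.

```python
import math

CLASSES = ("COR", "INC", "OOV")


def softmax(scores):
    """Turn raw scores into probabilities that sum to 1."""
    m = max(scores)  # subtract the maximum for numerical stability
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def class_posteriors(features, weights, biases):
    """Return a probability for each class given one word's features.

    Assumption: one linear score per class. This is an illustrative
    stand-in, not the classifier used in the presented work.
    """
    scores = [
        sum(w * x for w, x in zip(weights[c], features)) + biases[c]
        for c in CLASSES
    ]
    return dict(zip(CLASSES, softmax(scores)))


# Hypothetical features for one hypothesized word (invented values).
features = [0.2, 1.5]

# Hypothetical, hand-picked parameters; a real system would train them.
weights = {"COR": [2.0, -1.0], "INC": [0.0, 0.5], "OOV": [-2.0, 1.0]}
biases = {"COR": 0.0, "INC": 0.0, "OOV": 0.0}

posteriors = class_posteriors(features, weights, biases)
label = max(posteriors, key=posteriors.get)  # most probable class
```

Here `posteriors` holds one probability per class for the word, and `label` is the class with the highest probability. A flat three-way decision like this is only one possible structure; the abstract notes that the structure of the classification itself affects the results.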