JP2005164837A5

Info

Publication number: JP2005164837A5
Application number: JP2003401724A
Authority: JP
Filing date: 2003-12-01
Publication date: 2005-09-22
Anticipated expiration: 2023-12-01

Claims

A speech recognition result reliability verification apparatus that receives, from a speech recognition decoder, a speech recognition result representing a plurality of hypothesis word strings, each consisting of words to which word posterior probabilities are assigned, and that verifies the reliability of the speech recognition result based on the word posterior probabilities, the apparatus comprising:
generalized word posterior probability calculating means for calculating, for each word included in the speech recognition result, a generalized word posterior probability based on the word posterior probabilities of the words included in the speech recognition result;
updating means for updating the word posterior probability of each word included in the speech recognition result with the generalized word posterior probability calculated by the generalized word posterior probability calculating means;
search means for searching, based on the speech recognition result in which the word posterior probabilities have been updated by the updating means, the plurality of hypothesis word strings for the hypothesis word string that maximizes the sum of the word posterior probabilities of the words it contains; and
determination means for verifying the reliability of the speech recognition result by determining whether the sum of the word posterior probabilities of the hypothesis word string found by the search means satisfies a predetermined condition.
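
The flow recited in this claim (update, search, determine) can be sketched as follows. This is a minimal illustration, not the patented implementation: the names `Word`, `verify`, and `generalize` are hypothetical, the generalized posterior calculation is passed in as a callable, and the "predetermined condition" is assumed here to be a simple lower threshold on the summed posteriors.

```python
from dataclasses import dataclass, replace
from typing import Callable, List, Tuple


@dataclass(frozen=True)
class Word:
    text: str
    posterior: float  # word posterior probability assigned by the decoder


Hypothesis = List[Word]


def verify(
    hypotheses: List[Hypothesis],
    generalize: Callable[[Word, List[Word]], float],
    threshold: float,
) -> Tuple[Hypothesis, float, bool]:
    """Update posteriors, pick the best hypothesis, and test its score.

    Raises ValueError if `hypotheses` is empty.
    """
    # Every word in the recognition result, across all hypotheses.
    all_words = [w for h in hypotheses for w in h]

    # Updating step: overwrite each posterior with its generalized value.
    updated = [
        [replace(w, posterior=generalize(w, all_words)) for w in h]
        for h in hypotheses
    ]

    # Search step: hypothesis whose summed word posteriors are largest.
    def score(h: Hypothesis) -> float:
        return sum(w.posterior for w in h)

    best = max(updated, key=score)

    # Determination step (assumed condition: score at or above a threshold).
    return best, score(best), score(best) >= threshold


def same_text_ratio(target: Word, all_words: List[Word]) -> float:
    """Stand-in generalizer for the demo: ignores timing information."""
    total = sum(w.posterior for w in all_words)
    matched = sum(w.posterior for w in all_words if w.text == target.text)
    return matched / total if total > 0 else 0.0


demo = [
    [Word("a", 0.5), Word("b", 0.25)],
    [Word("a", 0.125), Word("c", 0.125)],
]
best, best_score, reliable = verify(demo, same_text_ratio, threshold=0.8)
```

The stand-in `same_text_ratio` is only there to make the sketch runnable; the time-overlap version described in the dependent claims would be substituted in practice.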

The speech recognition result reliability verification apparatus according to claim 1, further comprising means for selecting, prior to the calculation of the generalized word posterior probabilities by the generalized word posterior probability calculating means, only those word strings in the speech recognition result whose likelihood is higher than a threshold determined by a predetermined criterion, and for supplying the selected word strings to the generalized word posterior probability calculating means.
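
The pre-selection step in this claim can be sketched as a filter over scored hypotheses. The claim does not say how the threshold is determined; the sketch below assumes, purely for illustration, a beam relative to the best log-likelihood. The names `prune_hypotheses` and `beam` are hypothetical.

```python
from typing import List, Sequence, Tuple

# A scored hypothesis: (word string, log-likelihood from the decoder).
Scored = Tuple[Sequence[str], float]


def prune_hypotheses(scored: List[Scored], beam: float) -> List[Sequence[str]]:
    """Keep only word strings whose likelihood exceeds the threshold.

    Assumed criterion: threshold = best log-likelihood minus `beam`.
    """
    if not scored:
        return []
    threshold = max(loglik for _, loglik in scored) - beam
    # Strictly "higher than the threshold", as the claim words it.
    return [words for words, loglik in scored if loglik > threshold]


kept = prune_hypotheses(
    [(["a", "b"], -10.0), (["a", "c"], -12.0), (["d"], -30.0)],
    beam=5.0,
)
```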

The speech recognition result reliability verification apparatus according to claim 1 or 2, wherein each word included in the speech recognition result is further provided with information identifying the time period that the word occupies in the utterance input to the speech recognition decoder, and
the generalized word posterior probability calculating means comprises:
word search means for searching, for each word included in the speech recognition result, the speech recognition result for words that match the word and whose time periods overlap the time period of the word; and
means for calculating the generalized word posterior probability of each word based on the sum of the word posterior probabilities of the words found by the word search means and the sum of the word posterior probabilities of all words included in the speech recognition result.

The speech recognition result reliability verification apparatus according to claim 3, wherein the means for calculating the generalized word posterior probability comprises means for calculating, as the generalized word posterior probability of each word, the ratio of the sum of the word posterior probabilities of the words found by the word search means to the sum of the word posterior probabilities of all words included in the speech recognition result.
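
The overlap search and the ratio described in the two preceding claims can be sketched as follows. This is an illustrative reading, not the patented implementation: the names `TimedWord`, `overlaps`, and `generalized_posterior` are hypothetical, time periods are assumed to be half-open intervals, and "matches" is assumed to mean identical word text.

```python
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class TimedWord:
    text: str
    start: float      # start time of the word's time period
    end: float        # end time of the word's time period
    posterior: float  # word posterior probability from the decoder


def overlaps(a: TimedWord, b: TimedWord) -> bool:
    """True if the two time periods share any interval of time."""
    return a.start < b.end and b.start < a.end


def generalized_posterior(target: TimedWord, all_words: List[TimedWord]) -> float:
    """Ratio of matching, overlapping posterior mass to total posterior mass."""
    matched = sum(
        w.posterior
        for w in all_words
        if w.text == target.text and overlaps(w, target)
    )
    total = sum(w.posterior for w in all_words)
    return matched / total if total > 0 else 0.0


words = [
    TimedWord("hello", 0.0, 5.0, 0.5),
    TimedWord("hello", 1.0, 6.0, 0.25),   # same word, shifted boundaries
    TimedWord("yellow", 0.0, 5.0, 0.125),
    TimedWord("world", 5.0, 10.0, 0.125),
]
gwpp_hello = generalized_posterior(words[0], words)
gwpp_world = generalized_posterior(words[3], words)
```

Relaxing exact time alignment to mere overlap is what lets the two "hello" entries, which differ only in their boundaries, pool their posterior mass.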

The speech recognition result reliability verification apparatus according to claim 1, wherein the generalized word posterior probability p([w; s, t] | x_1^T) of a word w in a hypothesis word string, where s and t are the start time and end time of the time period of the word w, is given by the formula

[formula not reproduced in this text]

where x_1^T = x_1, ..., x_T is the observed speech sequence, M is the number of words in a hypothesis of the speech recognition result, s_n and t_n are the start time and end time of the n-th word w_n, p(x_{s_m}^{t_m} | w_m) is the acoustic likelihood, p(w_m | w_1^M) is the language likelihood, p(x_1^T) is the acoustic observation likelihood, and α and β are predetermined constants.
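
The formula this claim refers to does not appear in the text. One form consistent with the symbols the claim defines is given below as an assumption, not as the wording of the patent. In particular, the use of α and β as exponents on the acoustic and language likelihoods, and the summation over all hypotheses that contain w in an overlapping time period, are inferred from the definitions and from the overlap condition in the other claims.

```latex
p\bigl([w; s, t] \mid x_1^T\bigr)
  = \sum_{\substack{M,\ [w; s, t]_1^M \\
      \exists n,\ 1 \le n \le M:\ w = w_n, \\
      [s_n, t_n] \cap [s, t] \neq \emptyset}}
    \frac{\displaystyle\prod_{m=1}^{M}
          p^{\alpha}\bigl(x_{s_m}^{t_m} \mid w_m\bigr)\,
          p^{\beta}\bigl(w_m \mid w_1^M\bigr)}
         {p\bigl(x_1^T\bigr)}
```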

A computer program that, when executed by a computer, causes the computer to function as each means of the speech recognition result reliability verification apparatus according to any one of claims 1 to 5.

A computer programmed by the computer program according to claim 6.