JPH03141398A - Syntax processing device for continuous sound recognition - Google Patents

Syntax processing device for continuous sound recognition

Info

Publication number
JPH03141398A
Authority
JP
Japan
Prior art keywords
word
candidate
syntax
storage means
input
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Granted
Application number
JP1280442A
Other languages
Japanese (ja)
Other versions
JPH077273B2 (en)
Inventor
Yasushi Ishikawa
泰 石川
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Mitsubishi Electric Corp
Original Assignee
Mitsubishi Electric Corp
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Mitsubishi Electric Corp
Priority to JP1280442A
Publication of JPH03141398A
Publication of JPH077273B2
Anticipated expiration
Expired - Fee Related

Landscapes

  • Document Processing Apparatus (AREA)
  • Machine Translation (AREA)

Abstract

PURPOSE: To reduce the enormous amount of processing without discarding the correct hypothesis, by temporarily storing input word candidates that have a low likelihood and, when an input word candidate is unlikely to be connectable to a word string candidate even though its acoustic distance value is small, going back to the stored word candidates to perform syntax determination. CONSTITUTION: A threshold determining means 1 first judges how likely an input word candidate 11 is to be included in the speech recognition result, and input word candidates 11 with a low likelihood are stored temporarily in a word candidate storage means 8. For an input word candidate 11 that a syntax determining means 2 has judged not connectable to any word string candidate 22, a backward syntax determining means 3 applies reverse lookup syntax rules 44 to determine whether a stored word candidate 55 can be connected in front of the input word candidate 11 and whether that word candidate 55 can in turn be connected after a word string candidate 22. A connectable input word candidate 11 is joined to the word candidate 55 and the word string candidate 22, and the result is additionally stored as a new word string candidate 22. In this way the amount of processing is reduced without lowering the sentence recognition rate, and real-time processing becomes possible.

Description

[Detailed Description of the Invention]

[Field of Industrial Application]

The present invention relates to a syntax processing device that, in a continuous speech recognition apparatus, receives input word candidates from an acoustic processing device that acoustically processes input speech, and applies syntax rules to obtain a speech recognition result.

[Prior Art]

FIG. 7 is a functional block diagram of a conventional syntax processing device for continuous speech recognition, as shown for example in the literature (Kyoritsu Shuppan, Information Science Course E, 19.3, Speech Recognition, pp. 139-142, 1979).

In the figure, (20) is a syntax determining means that applies syntax rules (33) retrieved from a syntax rule storage means (60) to an input word candidate (11), obtained by acoustically processing the input speech, and to a word string candidate (22) read from a word string candidate storage means (50), performs a syntax test, and causes the word string candidate storage means (50) to additionally store a new word string candidate (22) formed by connecting the input word candidate (11) to a word string candidate (22) that satisfies the connection condition. (40) is a recognition result determining means that, once all input word candidates (11) have been input, reads the word string candidate (22) with the minimum total acoustic distance value from the word string candidate storage means (50) and outputs it as the speech recognition result (77).

(50) is the word string candidate storage means, which stores the word string candidates (22) generated by syntax processing, and (60) is the syntax rule storage means, which stores the syntax rules (33) described as a syntactic state transition network.

The conventional syntax processing device for continuous speech recognition described above operates as follows.

First, the syntax determining means (20) uses the forward method (left-to-right method), which processes the input speech from its beginning along the time axis as shown in FIG. 8, and applies the syntax rules (33) retrieved from the syntax rule storage means (60) to test whether the word name w of an input word candidate (11), obtained by acoustically processing the input speech, is accepted in the final state p of a word string candidate (22) read from the word string candidate storage means (50). For all of the word string candidates (sentence hypotheses) accepted and generated at this point, hypothesis testing is performed by the beam search method, as shown in FIG. 9: candidates whose constituent words have a large total acoustic distance value (an overall evaluation of the acoustic distance values), that is, a low likelihood, are rejected, and only those with a small value, that is, a high likelihood, are adopted as the final word string candidates at that point before the next input word candidate (11) is processed.
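The pruning step of the beam search described above can be sketched roughly as follows. This is only an illustration under stated assumptions: the `Hypothesis` class, the `prune` function, and the fixed `beam_width` parameter are hypothetical names that do not appear in the patent, which does not say whether the beam is a fixed width or a distance margin.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Hypothesis:
    words: tuple      # word names in the sentence hypothesis so far
    distance: float   # total acoustic distance (smaller means more likely)


def prune(hypotheses, beam_width):
    """Keep the beam_width hypotheses with the smallest total distance."""
    return sorted(hypotheses, key=lambda h: h.distance)[:beam_width]


# Hypothetical sentence hypotheses after one input word candidate.
hypotheses = [
    Hypothesis(("show",), 3.0),
    Hypothesis(("so",), 1.5),
    Hypothesis(("snow",), 2.0),
]
survivors = prune(hypotheses, beam_width=2)
```

In this toy case the hypothesis with distance 3.0 is rejected. If it had been the correct one, a purely forward method would have no way to recover it, which is the weakness the patent goes on to discuss.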

In FIG. 9, the numbers attached to the nodes indicate likelihoods, circles indicate hypotheses whose testing continues, and black dots indicate rejected hypotheses. As a result of this syntax test, a new word string candidate (22) formed by connecting the input word candidate (11) to a word string candidate (22) that satisfies the connection condition is additionally stored in the word string candidate storage means (50).

Next, when the utterance ends and all input word candidates (11) have been input, the recognition result determining means (40) reads the word string candidate (22) with the minimum total acoustic distance value from the word string candidate storage means (50) and outputs it as the speech recognition result (77).

In contrast to the forward method described above, there is also a method that uses the island-driven method as the processing direction: as shown in FIG. 10, all input word candidates (11) are first stored, the most probable word candidate among them is taken as the initial hypothesis, and the remaining portions are processed in both directions along the time axis while the initial hypothesis is extended toward the beginning or the end of the sentence.

[Problems to Be Solved by the Invention]

In conventional syntax processing devices for continuous speech recognition as described above, a method using the forward method evaluates and verifies hypotheses along the time axis, so once a correct hypothesis has been rejected midway, the correct recognition result cannot be obtained. In addition, word candidates near the beginning of the utterance do not necessarily have small acoustic distance values, so many hypotheses must be generated to obtain the correct recognition result. Furthermore, to reduce rejections and improve the sentence recognition rate, word candidates with relatively large acoustic distance values must be handled in the same way, and the amount of processing becomes enormous.

On the other hand, a method using the island-driven method is unlikely to reject the correct hypothesis early in processing, so a high recognition rate can be expected. However, taking one word candidate in the uttered sentence as the initial hypothesis means hypothesizing every syntactic state in which that word can appear, so the amount of processing needed for verification becomes large. Moreover, because the initial hypothesis is determined only after all word candidates in the uttered sentence have been detected, real-time processing is not possible.

The conventional technology has the problems described above.

The problem to be solved by the present invention is to enable a continuous speech recognition apparatus that uses the syntax processing device of the invention to reduce the amount of processing without lowering the sentence recognition rate compared with conventional apparatuses, and to make real-time processing possible.

[Means for Solving the Problems]

To achieve the above object, the syntax processing device for continuous speech recognition of the present invention is characterized by including the following means.

In a syntax processing device that, in a continuous speech recognition apparatus, receives input word candidates from an acoustic processing device that acoustically processes input speech and applies syntax rules to obtain a speech recognition result, the following storage means are provided: a word string candidate storage means that stores word string candidates generated by syntax processing; a syntax rule storage means that stores syntax rules described as a syntactic state transition network, each represented by a state, a word name that can be accepted from that state, and the state reached when that word is accepted; a reverse lookup syntax rule storage means that stores reverse lookup syntax rules described as a syntactic state transition network, each represented by a word name, a word name that can be connected before that word, and the state reached when they are connected; and a word candidate storage means that stores word candidates that are unlikely to belong to a word string candidate.

The device further comprises: a threshold determining means that performs a threshold test on the acoustic distance value of the input word candidate and decides whether to pass it to syntax determination or to store it temporarily in the word candidate storage means; a syntax determining means that reads from the word string candidate storage means the word string candidates to which the input word candidate from the threshold determining means may be connected, performs syntax determination by applying syntax rules retrieved from the syntax rule storage means, and decides whether to have the result additionally stored in the word string candidate storage means as a new word string candidate or to pass the input word candidate on to backward syntax determination; a backward syntax determining means that reads from the word candidate storage means the word candidates to which the input word candidate from the syntax determining means can be connected, applies reverse lookup syntax rules retrieved from the reverse lookup syntax rule storage means, further reads from the word string candidate storage means the word string candidates to which that word candidate can be connected, likewise applies the reverse lookup syntax rules to perform backward syntax determination, and decides whether or not to have the result additionally stored in the word string candidate storage means as a new word string candidate; and a recognition result determining means that, when acoustic processing of the input speech has finished and an acoustic processing end signal is input, reads from the word string candidate storage means the word string candidate whose end position is near the end position of the input speech and whose total acoustic distance value is the minimum, and outputs it as the speech recognition result.
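The forward syntax rules described above (a state, a word name acceptable from that state, and the state reached on acceptance) map naturally onto a lookup table keyed by state and word name. The following is a minimal sketch assuming a tiny made-up grammar; the state labels, the words, and the `accept` function are hypothetical and are not taken from the patent.

```python
# Hypothetical toy grammar: (state, word name) -> state reached on acceptance.
SYNTAX_RULES = {
    ("S0", "show"): "S1",
    ("S1", "the"): "S2",
    ("S2", "map"): "S3",
}


def accept(state, word):
    """Return the state reached when `word` is accepted in `state`, else None."""
    return SYNTAX_RULES.get((state, word))


next_state = accept("S0", "show")   # a rule exists for this pair
rejected = accept("S0", "map")      # no rule, so the word is not accepted
```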

[Operation]

In the syntax processing device for continuous speech recognition configured as above, the threshold determining means first judges how likely an input word candidate, obtained by acoustically processing the input speech, is to be included in the speech recognition result, and input word candidates with a low likelihood are stored temporarily.
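The threshold step can be sketched as a simple routing decision. This is an illustration only: the `WordCandidate` fields follow the information the patent says an input word candidate carries (word name, start and end positions, acoustic distance value), while the class name, the `route` function, and the numeric values are assumptions.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class WordCandidate:
    name: str        # word name
    start: int       # start position (e.g. a frame index, an assumption)
    end: int         # end position
    distance: float  # acoustic distance value (smaller means more likely)


def route(candidate, threshold, word_store):
    """Return True if the candidate should go to syntax determination.

    Otherwise the candidate is stored temporarily and False is returned.
    """
    if candidate.distance <= threshold:
        return True
    word_store.append(candidate)
    return False


word_store = []
likely = WordCandidate("show", 0, 10, 0.4)
unlikely = WordCandidate("the", 11, 15, 2.0)
```

Storing rather than discarding the unlikely candidate is what later lets the backward syntax determination recover it.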

Next, the syntax determining means applies the syntax rules to determine whether an input word candidate that the threshold determining means judged likely to be included in the speech recognition result can be connected after a word string candidate. A connectable input word candidate is connected to that word string candidate, and the result is additionally stored as a new word string candidate.

Furthermore, for an input word candidate that the syntax determining means judged not connectable to any word string candidate, the backward syntax determining means applies the reverse lookup syntax rules to determine whether a stored word candidate can be connected in front of the input word candidate and whether that word candidate can in turn be connected after a word string candidate. A connectable input word candidate is joined to that word candidate and word string candidate, and the result is additionally stored as a new word string candidate.

Finally, when the acoustic processing end signal is input, the recognition result determining means outputs as the speech recognition result the word string candidate, among those generated by the above syntax processing, whose end position is near the end position of the input speech and whose total acoustic distance value is the minimum.

[Embodiment]

FIG. 1 is a functional block diagram of a syntax processing device for continuous speech recognition according to one embodiment of the present invention.

In the figure, (1) is a threshold determining means that performs a threshold test on the acoustic distance value of an input word candidate (11), obtained by acoustically processing the input speech, and decides whether to pass the input word candidate (11) to syntax determination or to store it temporarily as a word candidate.

(2) is a syntax determining means. When the start position of the input word candidate (11) from the threshold determining means is adjacent to the end position of a word string candidate (22) read from the word string candidate storage means (5), and a syntax rule (33) indicating that the word name w of the input word candidate (11) is accepted in the final state p of that word string candidate (22) can be retrieved from the syntax rule storage means (6), it performs a threshold test on the total acoustic distance value of the new word string candidate formed by connecting the word string candidate (22) and the input word candidate (11), and decides whether to have it additionally stored as a new word string candidate (22) or, if there is no connectable word string candidate, to pass the input word candidate (11) on to backward syntax determination.

(3) is a backward syntax determining means. When the start position of the input word candidate (11) from the syntax determining means is adjacent to the end position of a word candidate (55) read from the word candidate storage means (8), the start position of that word candidate (55) is adjacent to the end position of a word string candidate (22) read from the word string candidate storage means (5), and a reverse lookup syntax rule (44) indicating that, in the final state s of that word string candidate (22), the word name w' of the word candidate (55) and the word name w of the input word candidate (11) that follows it can be connected can be retrieved from the reverse lookup syntax rule storage means (7), it performs a threshold test on the total acoustic distance value of the new word string candidate formed by connecting the word string candidate (22), the word candidate (55), and the input word candidate (11), and decides whether or not to have it additionally stored as a new word string candidate (22).

(4) is a recognition result determining means that, when all input word candidates (11) have been input and the acoustic processing end signal (66) is received, reads from the word string candidate storage means (5) the word string candidate, among those generated by the above syntax processing, whose end position is near the end position of the input speech and whose total acoustic distance value is the minimum, and outputs it as the speech recognition result (77).

(5) is a word string candidate storage means that stores the word string candidates (22) generated by syntax processing in the layout shown in FIG. 2: for each word in a word string, its address, word name, word acoustic distance value, total acoustic distance value of the word string, start position, end position, number of words from the beginning of the sentence, state, and the address at which the preceding word is stored.

(6) is a syntax rule storage means that stores in advance the syntax rules (33), each describing a state p, a word name w that can be accepted from that state, and the state q reached when that word is accepted, as a syntactic state transition network as shown in FIG. 3. As shown in FIG. 4, it is organized so that, using a state p_i and a word name w_ij as the key, the transition state q_ij can be found by comparison with the leading two words of each stored entry.
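The storage layout listed for the word string candidate storage means, in which each word entry holds the address of the preceding word, behaves like a table of back-pointers. A minimal sketch follows, assuming list indices stand in for addresses and that the total acoustic distance of a word string is the plain sum of its word distances. The patent only calls it an overall evaluation value, so the sum, like the class and method names, is an assumption.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Entry:
    word: str                # word name
    word_distance: float     # acoustic distance of this word
    total_distance: float    # total acoustic distance of the word string
    start: int               # start position
    end: int                 # end position
    n_words: int             # number of words from the sentence start
    state: str               # syntactic state after accepting this word
    prev: Optional[int]      # address of the preceding word, None at the start


class WordStringStore:
    def __init__(self):
        self.entries = []

    def append(self, word, distance, start, end, state, prev=None):
        """Store one word entry and return its address (list index)."""
        base = self.entries[prev] if prev is not None else None
        total = distance + (base.total_distance if base else 0.0)
        count = 1 + (base.n_words if base else 0)
        self.entries.append(
            Entry(word, distance, total, start, end, count, state, prev))
        return len(self.entries) - 1

    def words(self, address):
        """Follow the back-pointers to recover the word string."""
        out = []
        while address is not None:
            out.append(self.entries[address].word)
            address = self.entries[address].prev
        return out[::-1]


store = WordStringStore()
first = store.append("show", 0.5, start=0, end=10, state="S1")
second = store.append("map", 0.25, start=11, end=20, state="S2", prev=first)
```

Because each entry points only at its predecessor, several word string candidates can share a common prefix without copying it.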

(7) is a reverse lookup syntax rule storage means that stores in advance the reverse lookup syntax rules (44), each describing a word name w, a word name w' that can be connected before that word, and the states s, p, and q reached when they are connected, as a syntactic state transition network as shown in FIG. 5. As shown in FIG. 6, it is organized so that a search going backward along the time axis can find that the word name w_i can be preceded by the word name w'_ij and that the resulting transition states are s_ij, p_ij, and q_ij. (8) is a word candidate storage means that stores word candidates (11) that are unlikely to belong to a word string candidate (22).
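One way to picture the reverse lookup rules is as an index keyed by a word name that lists each word allowed to precede it together with the states involved. The sketch below derives such an index from a forward transition table. That derivation is an assumption made for illustration: the patent only says the reverse lookup rules are stored in advance, and the reading of s, p, and q as the states before the preceding word, between the two words, and after the keyed word is likewise an interpretation.

```python
from collections import defaultdict


def build_reverse_rules(forward):
    """Build word -> [(preceding word, s, p, q)] from (state, word) -> state.

    Each tuple represents the path s --preceding word--> p --word--> q.
    """
    reverse = defaultdict(list)
    for (s, w_prev), p in forward.items():
        for (p2, w), q in forward.items():
            if p2 == p:
                reverse[w].append((w_prev, s, p, q))
    return dict(reverse)


# Hypothetical toy grammar, not taken from the patent.
forward = {("S0", "show"): "S1", ("S1", "the"): "S2", ("S2", "map"): "S3"}
reverse = build_reverse_rules(forward)
```

With this index, a lookup by the later word immediately yields the earlier words that may come before it, which is the direction the backward syntax determination needs.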

The syntax processing device for continuous speech recognition of the above embodiment operates as follows.

First, it receives an input word candidate (11) that is generated by the acoustic processing device, which acoustically processes the input speech in the continuous speech recognition apparatus, and that contains a word name, start and end positions, and an acoustic distance value.

Next, when the acoustic distance value of the input word candidate (11) is at or below the threshold, the threshold determining means (1) judges that the input word candidate (11) is likely to be included in the speech recognition result (77) and sends it to the syntax determining means (2). When the value exceeds the threshold, the likelihood is judged to be low, and the input word candidate (11) is stored temporarily in the word candidate storage means (8). In the syntax determining means (2), when the start position of the input word candidate (11) is adjacent to the end position of a word string candidate (22) read from the word string candidate storage means (5), and a syntax rule (33) indicating that the word name w_ij of the input word candidate (11) is accepted in the final state p_i of that word string candidate (22) can be retrieved from the syntax rule storage means (6), then, if the total acoustic distance value of the new word string candidate formed by connecting the word string candidate (22) and the input word candidate (11) is at or below the threshold, the input word candidate (11) is judged connectable to the word string candidate (22) that was read, and the result is additionally stored in the word string candidate storage means (5) as a new word string candidate (22). If the value exceeds the threshold and no word string candidate exists to which the input word candidate (11) can be connected, the input word candidate (11) is sent to the backward syntax determining means (3).
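The forward syntax determination described above combines three checks: position adjacency, a matching syntax rule, and a threshold on the total acoustic distance. A rough sketch follows, using plain dictionaries for brevity. The `max_gap` tolerance for adjacency, the additive total distance, and all names and values are assumptions made for illustration.

```python
def forward_connect(candidate, strings, rules, threshold, max_gap=1):
    """Try to append `candidate` to each stored word string candidate.

    Returns the new word string candidates. An empty list means the
    candidate would be handed to backward syntax determination.
    """
    new = []
    for ws in strings:
        adjacent = 0 <= candidate["start"] - ws["end"] <= max_gap
        next_state = rules.get((ws["state"], candidate["name"]))
        total = ws["total"] + candidate["distance"]
        if adjacent and next_state is not None and total <= threshold:
            new.append({
                "words": ws["words"] + [candidate["name"]],
                "end": candidate["end"],
                "state": next_state,
                "total": total,
            })
    return new


# Hypothetical data: one stored word string and two input word candidates.
strings = [{"words": ["show"], "end": 10, "state": "S1", "total": 0.5}]
rules = {("S1", "the"): "S2", ("S2", "map"): "S3"}
the = {"name": "the", "start": 11, "end": 15, "distance": 0.25}
map_ = {"name": "map", "start": 11, "end": 20, "distance": 0.25}

connected = forward_connect(the, strings, rules, threshold=1.0)
unconnected = forward_connect(map_, strings, rules, threshold=1.0)
```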

Furthermore, in the backward syntax determining means (3), when the start position of the input word candidate (11) is adjacent to the end position of a word candidate (55) read from the word candidate storage means (8), the start position of that word candidate (55) is adjacent to the end position of a word string candidate (22) read from the word string candidate storage means (5), and a reverse lookup syntax rule (44) indicating that, in the final state s_ij of that word string candidate (22), the word name w'_ij of the word candidate (55) and the word name w_i of the input word candidate (11) that follows it can be connected can be retrieved from the reverse lookup syntax rule storage means (7), then, if the total acoustic distance value of the new word string candidate formed by connecting the word string candidate (22), the word candidate (55), and the input word candidate (11) is at or below the threshold, the input word candidate (11) is judged connectable to the word candidate (55) and the word string candidate (22) that were read, and the result is additionally stored in the word string candidate storage means (5) as a new word string candidate (22).
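The backward syntax determination can be sketched in the same style: it looks for a stored word candidate that ends where the input word candidate starts, then for a word string candidate that ends where that stored word starts, and checks a reverse lookup rule for the pair. As before, this is an illustration under assumptions: the dictionary layout, the `max_gap` adjacency tolerance, the additive total distance, and the tuple form of the reverse rules (preceding word, s, p, q) are not specified in the patent.

```python
def backward_connect(candidate, stored_words, strings, reverse_rules,
                     threshold, max_gap=1):
    """Connect word string + stored word + input candidate where allowed."""
    new = []
    for w in stored_words:
        if not 0 <= candidate["start"] - w["end"] <= max_gap:
            continue  # stored word does not end where the candidate starts
        for ws in strings:
            if not 0 <= w["start"] - ws["end"] <= max_gap:
                continue  # word string does not end where the stored word starts
            for prev_name, s, _p, q in reverse_rules.get(candidate["name"], []):
                if prev_name != w["name"] or s != ws["state"]:
                    continue
                total = ws["total"] + w["distance"] + candidate["distance"]
                if total <= threshold:
                    new.append({
                        "words": ws["words"] + [w["name"], candidate["name"]],
                        "end": candidate["end"],
                        "state": q,
                        "total": total,
                    })
    return new


# Hypothetical data: "the" was stored earlier because its distance was large.
strings = [{"words": ["show"], "end": 10, "state": "S1", "total": 0.5}]
stored_words = [{"name": "the", "start": 11, "end": 15, "distance": 2.0}]
reverse_rules = {"map": [("the", "S1", "S2", "S3")]}
map_ = {"name": "map", "start": 16, "end": 25, "distance": 0.25}

recovered = backward_connect(map_, stored_words, strings, reverse_rules,
                             threshold=3.0)
```

Here the poorly scored word "the" is recovered only because a well scored later word needs it, so the extra search is limited to cases where the forward step has already failed.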

The threshold determination, syntax determination, and backward syntax determination described above are performed for every input word candidate (11), and when all input has finished, the acoustic processing end signal (66) is received from the acoustic processing device.

Finally, the recognition result determining means (4) reads from the word string candidate storage means (5) the word string candidate (22), among those generated by the above processing, whose end position is near the end position of the input speech and whose total acoustic distance value is the minimum, and outputs it as the speech recognition result (77).
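The final selection can be sketched as a filter followed by a minimum. The `tolerance` parameter, which stands in for the patent's "near the end position", and the example values are assumptions.

```python
def pick_result(strings, speech_end, tolerance):
    """Return the word string ending near speech_end with the smallest total.

    Returns None when no word string candidate ends near the end of speech.
    """
    near_end = [ws for ws in strings if abs(speech_end - ws["end"]) <= tolerance]
    if not near_end:
        return None
    return min(near_end, key=lambda ws: ws["total"])


# Hypothetical word string candidates at the end of the utterance.
strings = [
    {"words": ["show"], "end": 10, "total": 0.5},
    {"words": ["show", "the", "map"], "end": 25, "total": 2.75},
    {"words": ["so", "the", "nap"], "end": 24, "total": 3.5},
]
best = pick_result(strings, speech_end=25, tolerance=2)
```

The end position filter matters because a short partial word string, such as the one-word candidate above, can have a smaller total distance than any complete sentence.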

[Effects of the Invention]

The present invention is configured as described above. Instead of performing syntax determination on every input word candidate as in the conventional approach, it performs syntax determination only on input word candidates that have a small acoustic distance value and are therefore likely to be included in the speech recognition result, and temporarily stores those that are unlikely. When an input word candidate is unlikely to be connectable to a word string candidate even though its acoustic distance value is small, the device goes back to the stored word candidates and performs syntax determination. The enormous amount of processing that was the problem with the conventional forward method can therefore be reduced without rejecting the correct hypothesis. In addition, although some processing runs backward, processing basically proceeds along the time axis, so the problem of the conventional island-driven method, that processing cannot start until the utterance has ended, is also eliminated.

Accordingly, a continuous speech recognition apparatus that uses the syntax processing device of the present invention reduces the amount of processing without lowering the sentence recognition rate compared with conventional apparatuses, and makes real-time processing possible.

[Brief Description of the Drawings]

FIG. 1 is a functional block diagram of a syntax processing device for continuous speech recognition according to one embodiment of the present invention. FIG. 2 shows an example of the storage layout of the word string candidate storage means of FIG. 1. FIGS. 3 and 4 show the syntactic state transition network describing the syntax rules of the syntax rule storage means of FIG. 1 and an example of its storage layout. FIGS. 5 and 6 show the syntactic state transition network describing the reverse lookup syntax rules of the reverse lookup syntax rule storage means of FIG. 1 and an example of its storage layout. FIG. 7 is a functional block diagram of a conventional syntax processing device for continuous speech recognition. FIGS. 8, 9, and 10 show the forward method, the beam search method, and the island-driven method as used in the conventional example of FIG. 7.

In the figures, (1) is the threshold determining means, (2) the syntax determining means, (3) the backward syntax determining means, (4) the recognition result determining means, (5) the word string candidate storage means, (6) the syntax rule storage means, (7) the reverse lookup syntax rule storage means, (8) the word candidate storage means, (11) an input word candidate, (22) a word string candidate, (33) a syntax rule, (44) a reverse lookup syntax rule, (55) a word candidate, (66) the acoustic processing end signal, and (77) the speech recognition result. The same reference numerals denote the same or corresponding parts throughout the figures.

Claims (1)

【特許請求の範囲】[Claims] 連続音声認識装置で入力音声を音響的に処理する音響処
理装置から入力単語候補を受け構文規則を適用して音声
認識結果を得る構文処理装置において、構文処理で生成
される単語列候補を記憶する単語列候補記憶手段と、あ
る状態とその状態から受理できる単語名と、その単語を
受理したときに遷移する状態とで表わされる構文状態遷
移網で記述した構文規則を記憶する構文規則記憶手段と
、ある単語名とその単語の前に接続できる単語名と接続
したときに遷移する状態とで表わされる構文状態遷移網
で記述した逆引き構文規則を記憶する逆引き構文規則記
憶手段と、単語列候補に属する可能性の低い単語候補を
記憶する単語候補記憶手段とを設け、前記入力単語候補
の音響距離値(入力音声の音響的特徴量と標準パターン
の基準値との差違)の閾値判定をし、前記入力単語候補
を構文判定させるかいったん前記単語候補記憶手段に記
憶させるかを判断する閾値判定手段、この閾値判定手段
からの前記入力単語候補が接続できる可能性のある単語
列候補を前記単語列候補記憶手段から読出し前記構文規
則記憶手段から検索した構文規則を適用して構文判定を
し、新たな単語列候補として前記単語列候補記憶手段に
追加記憶をさせるか前記入力単語候補を更に後向き構文
判定させるかを判断する構文判定手段、この構文判定手
段からの前記入力単語候補が接続できる単語候補を前記
単語候補記憶手段から読出し前記逆引き構文規則記憶手
段から検索した逆引き構文規則を適用し、更にその単語
候補が接続できる単語列候補を前記単語列候補記憶手段
から読出し同様に逆引き構文規則を適用して後向き構文
判定をし、新たな単語列候補として前記単語列候補記憶
手段に追加記憶させるか否かを判断する後向き構文判定
手段、入力音声の音響的処理を終えて音響処理終了信号
を入力すると前記単語列候補記憶手段から終端位置が入
力音声の終端位置付近で総合音響距離値が最小の単語列
候補を読出し音声認識結果として出力する認識結果判定
手段を備えたことを特徴とする連続音声認識用構文処理
装置。
1. A syntax processing device for continuous speech recognition, which receives input word candidates from an acoustic processing device that acoustically processes input speech, and applies syntax rules to them to obtain a speech recognition result, the syntax processing device comprising:
word string candidate storage means for storing word string candidates generated by syntax processing;
syntax rule storage means for storing syntax rules described as a syntactic state transition network, each rule being represented by a state, a word name that can be accepted from that state, and the state to which a transition is made when that word is accepted;
reverse syntax rule storage means for storing reverse syntax rules described as a syntactic state transition network, each rule being represented by a word name, a word name that can be connected before that word, and the state to which a transition is made when they are connected;
word candidate storage means for storing word candidates that are unlikely to belong to a word string candidate;
threshold determination means for applying a threshold test to the acoustic distance value of an input word candidate (the difference between an acoustic feature of the input speech and the reference value of a standard pattern) and determining whether the input word candidate is to be subjected to syntax determination or temporarily stored in the word candidate storage means;
syntax determination means for reading a word string candidate from the word string candidate storage means, applying the syntax rules retrieved from the syntax rule storage means to perform syntax determination, and determining whether to additionally store the result in the word string candidate storage means as a new word string candidate or to subject the input word candidate to backward syntax determination;
backward syntax determination means for reading, from the word candidate storage means, a word candidate to which the input word candidate from the syntax determination means can be connected, applying the reverse syntax rules retrieved from the reverse syntax rule storage means, further reading from the word string candidate storage means a word string candidate to which that word candidate can be connected, performing backward syntax determination by likewise applying the reverse syntax rules, and determining whether to additionally store the result in the word string candidate storage means as a new word string candidate; and
recognition result determination means for, upon receiving an acoustic processing end signal after the acoustic processing of the input speech has finished, reading from the word string candidate storage means the word string candidate whose end position is near the end position of the input speech and whose total acoustic distance value is the smallest, and outputting it as the speech recognition result.
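The flow of the claimed means can be illustrated with a minimal Python sketch. It is not the patent's implementation: the class and field names (`Word`, `WordString`, `SyntaxProcessor`), the frame-exact adjacency test, the `"<s>"` sentence-start marker, the `tolerance` parameter, and the toy grammar are all assumptions made for illustration. The sketch shows a weak word candidate being set aside by the threshold test and later recovered by backward syntax determination when a strong candidate cannot be connected directly.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class Word:
    name: str
    start: int    # start frame in the input speech
    end: int      # end frame
    dist: float   # acoustic distance value (smaller = closer match)

@dataclass(frozen=True)
class WordString:
    words: tuple  # word names connected so far
    end: int      # end frame of the last word
    state: str    # current state in the syntax network
    dist: float   # total acoustic distance value

class SyntaxProcessor:
    def __init__(self, rules, reverse_rules, threshold, start_state="S0"):
        self.rules = rules                  # (state, word) -> next state
        self.reverse_rules = reverse_rules  # (word, preceding word) -> state after connecting
        self.threshold = threshold
        self.strings = [WordString((), 0, start_state, 0.0)]  # word string candidate store
        self.deferred = []                  # word candidate store

    def accept(self, w):
        # Threshold determination: weak candidates are only stored for later.
        if w.dist > self.threshold:
            self.deferred.append(w)
        elif not self._forward(w):
            self._backward(w)

    def _forward(self, w):
        # Syntax determination: extend every adjacent string whose state accepts w.
        new = [WordString(s.words + (w.name,), w.end,
                          self.rules[(s.state, w.name)], s.dist + w.dist)
               for s in self.strings
               if s.end == w.start and (s.state, w.name) in self.rules]
        self.strings.extend(new)
        return bool(new)

    def _backward(self, w):
        # Backward syntax determination: find a stored word d that may precede w,
        # then a word string that d may follow, using only the reverse rules.
        new = []
        for d in self.deferred:
            if d.end != w.start or (w.name, d.name) not in self.reverse_rules:
                continue
            state = self.reverse_rules[(w.name, d.name)]
            for s in self.strings:
                prev = s.words[-1] if s.words else "<s>"  # assumed start marker
                if s.end == d.start and (d.name, prev) in self.reverse_rules:
                    new.append(WordString(s.words + (d.name, w.name), w.end,
                                          state, s.dist + d.dist + w.dist))
        self.strings.extend(new)

    def result(self, end_frame, tolerance=2):
        # Recognition result determination: best string ending near the end of input.
        finals = [s for s in self.strings
                  if s.words and abs(s.end - end_frame) <= tolerance]
        return min(finals, key=lambda s: s.dist, default=None)

# Toy grammar (hypothetical): S0 -please-> S1 -call-> S2 -home-> S3
rules = {("S0", "please"): "S1", ("S1", "call"): "S2", ("S2", "home"): "S3"}
reverse_rules = {("call", "please"): "S2", ("home", "call"): "S3"}

proc = SyntaxProcessor(rules, reverse_rules, threshold=3.0)
proc.accept(Word("please", 0, 10, 1.0))  # connected by forward syntax determination
proc.accept(Word("call", 10, 20, 5.0))   # above threshold: stored, not yet parsed
proc.accept(Word("home", 20, 30, 1.0))   # not connectable forward: backward path
best = proc.result(end_frame=30)
print(best.words, best.dist)
```

In this sketch the weak candidate "call" costs no syntax processing when it arrives; it is examined only once the strong candidate "home" fails to connect to any stored word string.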
JP1280442A 1989-10-27 1989-10-27 Syntax processor for continuous speech recognition Expired - Fee Related JPH077273B2 (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
JP1280442A JPH077273B2 (en) 1989-10-27 1989-10-27 Syntax processor for continuous speech recognition

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
JP1280442A JPH077273B2 (en) 1989-10-27 1989-10-27 Syntax processor for continuous speech recognition

Publications (2)

Publication Number Publication Date
JPH03141398A (en) 1991-06-17
JPH077273B2 (en) 1995-01-30

Family

ID=17625115

Family Applications (1)

Application Number Title Priority Date Filing Date
JP1280442A Expired - Fee Related JPH077273B2 (en) 1989-10-27 1989-10-27 Syntax processor for continuous speech recognition

Country Status (1)

Country Link
JP (1) JPH077273B2 (en)

Cited By (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
JPH05323989A (en) * 1992-05-19 1993-12-07 Fujitsu Ltd Speech recognizing system
US7240008B2 (en) 2001-10-03 2007-07-03 Denso Corporation Speech recognition system, program and navigation system

Families Citing this family (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
WO2000010160A1 (en) * 1998-08-17 2000-02-24 Sony Corporation Speech recognizing device and method, navigation device, portable telephone, and information processor

Also Published As

Publication number Publication date
JPH077273B2 (en) 1995-01-30

Similar Documents

Publication Publication Date Title
EP0590925B1 (en) Method of speech modelling and a speech recognizer
US5212730A (en) Voice recognition of proper names using text-derived recognition models
US6182039B1 (en) Method and apparatus using probabilistic language model based on confusable sets for speech recognition
CN1321401C (en) Speech recognition apparatus, speech recognition method, conversation control apparatus, conversation control method
JP2000122691A (en) Automatic recognizing method for spelling reading type speech speaking
CN104978963A (en) Speech recognition apparatus, method and electronic equipment
JPS5991500A (en) Voice analyzer
KR101317339B1 (en) Apparatus and method using Two phase utterance verification architecture for computation speed improvement of N-best recognition word
JP4950024B2 (en) Conversation system and conversation software
JP2000293191A (en) Device and method for voice recognition and generating method of tree structured dictionary used in the recognition method
US11615787B2 (en) Dialogue system and method of controlling the same
CN112863496B (en) Voice endpoint detection method and device
JP2002358097A (en) Voice recognition device
JPH03141398A (en) Syntax processing device for continuous sound recognition
Johansen A comparison of hybrid HMM architecture using global discriminating training
Adel et al. Text-independent speaker verification based on deep neural networks and segmental dynamic time warping
JP3104900B2 (en) Voice recognition method
JPH06266386A (en) Word spotting method
JP2002215184A (en) Speech recognition device and program for the same
JP3494338B2 (en) Voice recognition method
US7818172B2 (en) Voice recognition method and system based on the contexual modeling of voice units
JP3315565B2 (en) Voice recognition device
JPS645320B2 (en)
JPH09244691A (en) Input speech rejecting method and device for executing same method
JP3357752B2 (en) Pattern matching device

Legal Events

Date Code Title Description
LAPS Cancellation because of no payment of annual fees