JPH06102897A

JPH06102897A - Continuous sentence speech recognition system

Info

Publication number: JPH06102897A
Application number: JP4252767A
Authority: JP
Inventors: Hideki Kojima; 英樹小島
Original assignee: Fujitsu Ltd
Current assignee: Fujitsu Ltd
Priority date: 1992-09-22
Filing date: 1992-09-22
Publication date: 1994-04-15
Anticipated expiration: 2018-07-07
Also published as: JP3425165B2

Abstract

PURPOSE: To provide a continuous sentence speech recognition system that efficiently recognizes a sentence while taking into account the dependency relations, semantic relations, and relevance between phrases. CONSTITUTION: Phrases are classified into phrase groups by case and meaning, and the classification is stored in a case-governing grammar table 3. A phrase selection part 2 sets, in a phrase group flag table 4, the flag of each phrase group that may appear in the input speech. A collation part 1 collates the input speech with the phrase templates 5 of the phrases whose flags are set. The phrase selection part 2 resets a flag once a phrase of the flagged group has appeared in the input speech. The result of collating the phrases in the input speech with the phrase templates is output as a recognition score, and the continuous input speech is thereby recognized. In addition, a beam search pruning part 6 narrows down the phrase candidates and a relevance score calculation part 7 calculates a relevance score, so that continuous sentence speech is recognized faster and more accurately.

Description

Detailed Description of the Invention

[0001]

[Field of Industrial Application] The present invention relates to a continuous sentence speech recognition system for recognizing continuously uttered sentence speech. More particularly, it relates to a continuous sentence speech recognition system that uses the dependency relations, semantic relations, and relevance between phrases to recognize sentences faster and more accurately.

[0002]

[Prior Art] Conventionally, continuously uttered sentence speech has been recognized by a method that handles sentences of a context-free grammar by combining a parser such as CYK or Earley (a method that processes a sentence by dividing it into its constituents) with DP (dynamic programming, hereinafter DP) matching.

[0003]

[Problems to Be Solved by the Invention] The conventional method described above can handle only context-free grammars, and it is unsuitable for expressing the dependency relations, semantic relations, and relevance between phrases found in Japanese. That is, the method is suited to recognizing sentence speech in which the order of the phrases carries meaning, as in English, but it is not suited to recognizing sentence speech that, as in Japanese, depends heavily on dependency relations (the linking of phrases by particles), semantic relations, relevance, and the like.

[0004] The present invention has been made in view of the above problems of the prior art. Its object is to provide a continuous sentence speech recognition system that can recognize continuous sentence speech efficiently while taking into account dependency relations, semantic relations, and the relevance between phrases.

[0005]

[Means for Solving the Problems] FIG. 1 is a block diagram showing the principle of the present invention. In the figure, 1 is a collation unit that collates the input speech with phrase templates and outputs a recognition score; 2 is a phrase selection unit that refers to the phrase group flags and selects the phrase to be collated next in the collation unit 1; 3 is a case-governing grammar table; 4 is a phrase group flag table storing the phrase group flags that indicate the phrases to be selected next; 5 is a phrase template storage unit storing the phrase templates that the collation unit 1 collates with the input speech; 6 is a beam search pruning unit for performing a beam search; and 7 is a relevance score calculation unit.

[0006] To solve the above problems, the invention of claim 1 classifies phrases in advance into phrase groups according to their case and meaning and stores the classification in the case-governing grammar table 3. Referring to this classification, it uses the dependency relations and semantic relations of the phrases to set flags in the phrase group flag table 4 for the phrase groups that may appear in the input speech. A flag is inherited for as long as the flagged phrase group may still appear in the input speech.

[0007] Then, among the templates stored in the phrase template storage unit 5, the collation unit 1 collates the phrase templates of the flagged phrases with the phrases of the input speech. When this collation shows that a phrase belonging to a flagged phrase group has actually appeared in the input speech, the flag is reset, which narrows down the candidates for the next phrase.

[0008] In the invention of claim 2, based on the invention of claim 1, the beam search pruning unit 6 uses a beam search to narrow the phrase candidates to be collated with the phrases of the input speech down to a fixed number. In the invention of claim 3, based on the invention of claim 1 or claim 2, the relevance score calculation unit 7 calculates a relevance score from the relevance between phrases and adds the calculated relevance score to the collation result for the input speech.

[0009] In the invention of claim 4, based on the invention of claim 3, co-occurrence data between phrases is used as the relevance data between phrases. In the invention of claim 5, based on the invention of claim 3, adjacency data between phrases is used as the relevance data between phrases. In the invention of claim 6, based on the invention of claim 3, relevance data defined only between two phrases is used as the relevance data. Each time a phrase of the input speech is collated, the relevance score with respect to the previously collated phrase is calculated and stored, and the relevance score for the current step is calculated from the previously calculated relevance score and the relevance score calculated this time.

[0010] In the invention of claim 7, based on the invention of claim 6, the relevance score for the current step is calculated as the sum of the previously calculated relevance score and the relevance score calculated this time. In the invention of claim 8, based on the invention of claim 6, the relevance score for the current step is calculated as the product of the previously calculated relevance score and the relevance score calculated this time.

[0011] In the invention of claim 9, based on the invention of claim 6, the larger of the previously calculated relevance score and the relevance score calculated this time is taken as the relevance score for the current step. In the invention of claim 10, based on the invention of claim 6, the smaller of the previously calculated relevance score and the relevance score calculated this time is taken as the relevance score for the current step.

[0012] In the invention of claim 11, based on the invention of claim 6, the relevance score for the current step is obtained as the average of the previously calculated relevance score and the relevance score calculated this time.

[0013]

[Operation] In the invention of claim 1, phrases are classified into phrase groups according to their case and meaning and stored in the case-governing grammar table 3. Referring to this classification, flags are set, on the basis of the dependency relations and semantic relations of the phrases, for the phrase groups that may appear in the input speech, and the flagged phrases are collated with the phrases of the input speech. When a phrase belonging to a flagged phrase group actually appears in the input speech, the flag is reset. Continuous sentence speech input can therefore be recognized efficiently with dependency relations and semantic relations taken into account, and the performance of continuous sentence speech recognition is improved.

[0014] In the invention of claim 2, the phrase candidates to be collated with the phrases of the input speech are narrowed down to a fixed number by a beam search, so the amount of computation required for collation is reduced. In the inventions of claims 3 to 11, a relevance score is calculated from the relevance between phrases and added to the collation result for the input speech, so continuous sentence speech input is recognized more accurately.

[0015]

[Embodiments] FIG. 2 shows a first embodiment of the present invention. In the figure, 11 is a DP collation unit that collates the input speech with phrase templates by dynamic programming and outputs a recognition score; 12 is a phrase selection unit that refers to the phrase group flags and selects the phrase to be collated next in the DP collation unit 11; 13 is a case-governing grammar table storing the case-governing grammar; 14 is a phrase group flag table storing the phrase group flags that indicate the phrases to be selected next; and 15 is a phrase template storage unit storing the phrase templates that the DP collation unit 11 collates with the input speech.

[0016] FIG. 3 shows an example of the structure of the case-governing grammar table 13. As shown in the figure, the table stores the dependency relations of verbs, noun phrases, and so on; that is, for each phrase such as a verb or noun phrase, it stores which phrases may come next when that phrase occurs. For example, it records that the phrases that may depend on the verb "iku" ("go") include "(person) ga" phrases such as "watashi ga" ("I", as subject) and "Taro ga" ("Taro", as subject), and "(place) e" phrases such as "gakkou e" ("to school") and "byouin e" ("to the hospital"). The fact that an adjective may come before a noun phrase, for example, is recorded in the same way.

[0017] In sentence order, the verb "iku" generally comes at the end of the sentence. In Japanese, however, the phrase that follows is determined by dependency relations that run from earlier phrases to later ones, so performing recognition from the end of the sentence reflects the grammatical constraints more clearly. Recognition therefore starts from the sentence-final "iku", and "iku" is followed by "(place) e", "(person) ga", or the like.

[0018] FIG. 4(a) shows the structure of the phrase group flag table 14. As shown in the figure, the table stores verbs, noun phrases, adjectives, and so on together with their phrase group flags. As a result of referring to the case-governing grammar table 13, a ○ is assigned to each phrase group predicted to come next after the collated phrase, and a × is assigned to all other phrase groups and to phrase groups that have already been collated.

[0019] FIG. 4(b) shows how the phrase group flags in the phrase group flag table 14 change, using the states of the flags when the sentence speech "watashi ga gakkou e iku" ("I go to school") is recognized. The assignment of phrase group flags for this sentence is explained below with reference to the figure. As described above, a continuous sentence is recognized from the end of the sentence speech, and in a Japanese sentence the verb normally comes at the end, so initially a ○ is assigned to all verbs in the phrase group flag table 14. When "iku" is then recognized as the verb, a × is assigned to the phrase group flag of the verbs stored in the table. The case-governing grammar table 13 is consulted to predict the phrases that come next after the verb "iku". For "iku", the related phrases found are "(place) e" and "(person) ga", so, as shown in FIG. 4 (イ), a ○ is assigned to "(place) e" and "(person) ga" and a × is assigned to "(place) de". When the phrase following "iku" is collated and, for example, the phrase "gakkou e" is recognized, a × is assigned to the phrase group "(place) e". This relies on the rule that, in a simple sentence, the same kind of case-governed phrase almost never appears two or more times.

[0020] As a result, the phrase group flags in the phrase group flag table 14 are in the state shown in FIG. 4 (ロ), in which a ○ is assigned only to "(person) ga". When the phrase following "gakkou e" is collated and recognized as "watashi ga", a × is assigned to the flag of the phrase group "(person) ga" in the table, and the phrase group flags take the state shown in FIG. 4 (ハ).

[0021] All phrase group flags are now ×, which means that recognition of this sentence is complete. In this example every flag became × and recognition could be terminated, but depending on how the grammar is written, the end of the sentence might never be detected. This problem can be avoided, for example, by limiting the number of phrases that can be recognized.
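The flag transitions walked through in paragraphs [0019] to [0021] can be sketched in a few lines of Python. This is only an illustrative sketch: the dictionaries `GRAMMAR` and `GROUP_OF`, the romanized phrase strings, and the function names are assumptions introduced here, and a set of group names stands in for the ○ entries of the phrase group flag table 14.

```python
# Hypothetical stand-in for the case-governing grammar table 13:
# recognized phrase -> phrase groups that may depend on it.
GRAMMAR = {
    "iku": {"(place) e", "(person) ga"},
}

# Hypothetical mapping from a phrase to its phrase group.
GROUP_OF = {
    "iku": "verb",
    "gakkou e": "(place) e",
    "watashi ga": "(person) ga",
}


def update_flags(flags, phrase):
    """Return the set of flagged (○) groups after `phrase` is recognized."""
    new_flags = set(flags)
    new_flags.discard(GROUP_OF[phrase])       # reset the group that just appeared
    new_flags |= GRAMMAR.get(phrase, set())   # flag groups that may come next
    return new_flags                          # untouched flags are inherited


def trace(phrases_from_end):
    """Apply the updates to phrases given in recognition order (sentence end first)."""
    flags = {"verb"}                          # initially only verbs are flagged
    history = [flags]
    for phrase in phrases_from_end:
        if GROUP_OF[phrase] not in flags:
            raise ValueError(f"{phrase!r} is not allowed by the current flags")
        flags = update_flags(flags, phrase)
        history.append(flags)
    return history


history = trace(["iku", "gakkou e", "watashi ga"])
for step, flags in enumerate(history):
    print(step, sorted(flags))
```

The last entry of `history` is the empty set, which corresponds to the state in which every flag is × and recognition of the sentence ends.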

[0022] FIG. 5 is a flowchart showing the processing in the phrase selection unit 12. The first embodiment shown in FIG. 2 is described below with reference to FIG. 5, FIG. 3, and FIG. 4. The phrase selection unit 12 first puts the phrase group flags and the initial state of the DP (the state in which no collation result has been entered) into a queue (step S1 in FIG. 5). In the initial state, as described above, a ○ is assigned to the verb phrases in the phrase group flags.

[0023] The DP collation result is a sequence of numbers made up of the collation results up to each part of the phrase; in the initial state, the symbol for infinity is recorded in this sequence. Next, in step S2, it is determined whether the queue is empty, and if it is empty the processing ends. If the queue is not empty (in the initial state it holds the phrase group flags and the initial state of the DP entered in step S1), the processing goes to step S3, where the phrase group flags and the DP result are taken out of the queue and I is set to its initial value, I = 1.

[0024] In step S4, it is determined whether the value of I is greater than the number of phrases registered in the phrase template storage unit 15 of FIG. 2. If the value of I has become greater than the number of registered phrases, collation with the phrase templates is regarded as complete for that phrase, and the processing returns to step S2, where it is determined whether the queue is empty; if it is not empty, the processing goes to step S3.

[0025] If the value of I does not exceed the number of registered phrases, the processing goes to step S5, where the I-th phrase taken from the phrase templates is assigned to B. The processing then goes to step S6, where the phrase group flag table 14 is consulted to determine whether the phrase group flag of this I-th phrase is set (that is, whether a ○ is assigned). If the flag is not set, the processing goes to step S8, where 1 is added to I, and then returns to step S4.

[0026] If the phrase group flag is set, the processing goes to step S7, where the DP collation unit 11 of FIG. 2 collates the input speech data with the I-th phrase B taken from the phrase templates, and the phrase group flags stored in the phrase group flag table 14 are updated. That is, as described above, the phrase group flag of the collated phrase is changed from ○ to ×.

[0027] When this processing is finished, the collation result from the DP collation unit 11 and the updated phrase group flag table are put into the queue. The processing then goes to step S8, adds 1 to I, and returns to step S4, and the same DP collation and phrase group flag update are performed for the next phrase template stored in the phrase template storage unit 15.

[0028] When all the phrase templates stored in the phrase template storage unit 15 have been collated with the input speech in this way, the processing returns from step S4 to step S2, where it is determined whether the queue is empty; if it is not empty, the processing goes to step S3. In step S3, the phrase group flags and the DP result are taken out of the queue, I is set to 1, and, for the next phrase of the input speech, DP collation and the phrase group flag update are performed as above on the basis of the phrase group flags and the DP result taken from the queue.

[0029] As described above, in the processing of FIG. 5, among the templates stored in the phrase template storage unit 15, the templates of the phrases whose phrase group flags are set are collated one after another in the DP collation unit 11 with the first phrase of the input speech (the sentence-final phrase), the phrase group flags are updated, and the collation results and the updated phrase group flags are stored in the queue.

[0030] Next, the following phrase of the input speech is collated in the same way with the templates, among those stored in the phrase template storage unit 15, of the phrases whose phrase group flags are set, and the DP collation result is connected to the DP collation result of the first phrase stored in the queue. The DP collation result and the updated phrase group flags are then stored in the queue as described above.

[0031] In the same way, each phrase of the input speech is collated in turn with the phrase templates, and, as described above, collation ends when all phrase group flags are ×. The DP collation unit 11 then obtains and outputs recognition scores from the multiple DP collation results for the continuous sentence speech, each obtained by connecting the DP collation results of the individual phrases, and the result with the highest recognition score is taken as the recognition result for the input speech.
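The queue-driven loop of FIG. 5 (steps S1 to S8) can be sketched as below. This is a simplified illustration under several assumptions: the input is taken to be already segmented into phrases and ordered from the sentence end, the hypothetical `exact_match` function stands in for DP collation of speech against a template, and all `DEMO_*` data and function names are invented for the example. Higher scores are treated as better, following the statement that the highest recognition score wins.

```python
from collections import deque


def recognise(segments, templates, group_of, grammar, match, max_phrases=10):
    """Return (score, phrases) for the best hypothesis, or None if there is none."""
    # S1: enqueue the initial flags (verbs only) and an empty matching result.
    queue = deque([(frozenset({"verb"}), (), 0.0)])
    finished = []
    while queue:                                   # S2: stop when the queue is empty
        flags, path, score = queue.popleft()       # S3: take flags and result
        done = not flags or len(path) == len(segments) or len(path) >= max_phrases
        if done:                                   # all flags reset, or a limit reached
            finished.append((score, path))
            continue
        segment = segments[len(path)]
        for template in templates:                 # S4, S5, S8: loop over templates
            group = group_of[template]
            if group not in flags:                 # S6: skip unflagged phrases
                continue
            # S7: match, reset the matched group, flag the groups it governs.
            new_flags = (flags - {group}) | grammar.get(template, frozenset())
            new_score = score + match(segment, template)
            queue.append((new_flags, path + (template,), new_score))
    return max(finished, key=lambda h: h[0]) if finished else None


def exact_match(segment, template):
    """Toy stand-in for DP collation: 1.0 on an exact match, otherwise 0.0."""
    return 1.0 if segment == template else 0.0


DEMO_TEMPLATES = ["iku", "kiku", "gakkou e", "byouin e", "watashi ga"]
DEMO_GROUP_OF = {
    "iku": "verb", "kiku": "verb",
    "gakkou e": "(place) e", "byouin e": "(place) e",
    "watashi ga": "(person) ga",
}
DEMO_GRAMMAR = {
    "iku": {"(place) e", "(person) ga"},
    "kiku": {"(person) ga"},
}
DEMO_SEGMENTS = ["iku", "gakkou e", "watashi ga"]   # sentence end first

best = recognise(DEMO_SEGMENTS, DEMO_TEMPLATES, DEMO_GROUP_OF, DEMO_GRAMMAR, exact_match)
print(best)
```

The `max_phrases` argument reflects the remark in paragraph [0021] that limiting the number of recognizable phrases guards against grammars for which the flags never all become ×.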

[0032] FIG. 6 shows a second embodiment of the present invention. In the figure, components identical to those of the first embodiment shown in FIG. 2 carry the same reference numerals. This embodiment adds a beam search pruning unit 21, a relevance score processing unit 22, and a relevance data storage unit 23 to the first embodiment; the rest of the configuration is the same as in the first embodiment.

[0033] FIG. 7 illustrates the concept of the beam search, and the beam search in this embodiment is described with reference to it. As described above, continuous sentence speech recognition can basically be realized by connecting the DP collation results of the individual phrases of the input speech (hereinafter called DP planes). FIG. 7 illustrates these DP planes using the sentence "watashi ga gakkou e iku" ("I go to school") as an example.

[0034] As shown in FIG. 7, a continuous sentence is recognized by first creating the DP planes of the phrases that appear first (the sentence-final phrases; the figure shows the DP planes of "iku" ("go"), "kiku" ("listen"), and "miru" ("see") as examples) and then, for each phrase, connecting the DP planes of the phrases that can follow it (in the figure, the DP planes of "gakkou e" and "watashi ga" are connected after the sentence-final phrase). As noted above, performing recognition from the end of the sentence reflects the grammatical constraints more clearly, so in the example of FIG. 7 collation starts from the sentence-final verb such as "iku".

[0035] In the first embodiment shown in FIG. 2, a DP plane is created for every result of collating each phrase of the input speech with the phrase templates, and these DP planes are all extended, so DP matching is performed for every possibility and the amount of computation becomes enormous. If, at the stage where each phrase has been collated, the number of DP planes to be extended is limited to a fixed number by pruning on the basis of the collation results, the amount of computation can be kept down.

[0036] This technique is the beam search, and FIG. 7 shows an example in which the number of DP planes to be extended is limited to one. In the first embodiment shown in FIG. 2, the DP planes of the next phrase of the input speech are connected after all of the DP planes of "iku", "kiku", and "miru", so the number of branches of the DP planes grows and the amount of computation becomes enormous. As in FIG. 7, extending the DP planes only from "iku", which has the highest recognition score among "iku", "kiku", and "miru", keeps the amount of computation down.

[0037] Specifically, in the flowchart of FIG. 5, when the DP collation result and the phrase group flags are to be put into the queue, the DP collation result is first sent to the beam search pruning unit 21 of FIG. 6, which narrows the number of phrase candidates to within the beam width before they are sent to the queue of the phrase selection unit 12. That is, among the DP collation results, the one or more phrases with high recognition scores selected by the beam search pruning unit 21 are sent to the queue of the phrase selection unit 12 as phrase candidates (in the example of FIG. 7, "iku" is selected as the phrase candidate).
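The pruning step performed by the beam search pruning unit 21 amounts to keeping only the best-scoring candidates before they are queued. The sketch below is a minimal illustration; the function name `prune` and the scores in `candidates` are hypothetical values chosen so that "iku" ranks first, as in the FIG. 7 example.

```python
import heapq


def prune(hypotheses, beam_width):
    """Keep the `beam_width` hypotheses with the highest recognition score.

    Each hypothesis is a (score, phrase) pair; higher scores are better.
    """
    return heapq.nlargest(beam_width, hypotheses, key=lambda h: h[0])


# Hypothetical recognition scores for the sentence-final phrase candidates.
candidates = [(0.6, "kiku"), (0.9, "iku"), (0.4, "miru")]

survivors = prune(candidates, beam_width=1)
print(survivors)
```

With a beam width of 1 only the DP plane of "iku" is extended; a wider beam trades more computation for a smaller risk of discarding the correct phrase early.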

[0038] FIG. 8 shows an example of the relevance data stored in the relevance data storage unit 23. The relevance data shown are values indicating the likelihood that two phrases appear together in one sentence (called co-occurrence data). In the example of the figure, the probability that the phrase "kyou" ("today") appears before the phrase "iku" is 0.3, and the probability that the phrase "kyou" appears twice is 0.

[0039] Besides the above example, data indicating the likelihood that one phrase is adjacent to another (called adjacency data) can also be used as the relevance data. The relevance score processing unit 22 in FIG. 6 is a means for calculating a relevance score by referring to the relevance data stored in the relevance data storage unit 23. When the phrase selection unit 12 selects a phrase, the unit 22 calculates its relevance to the phrases already selected in the sentence and outputs, as the recognition score, either the collation result of the DP collation unit 11 with the relevance added or the product of the collation result and the relevance.

[0040] As an example of how the relevance is calculated, when the phrase selection unit 12 selects "gakkou e" as the phrase before "iku", the relevance is found to be 0.8 from the relevance data shown in FIG. 8. Next, when "watashi wa" ("I", as topic) is selected as the phrase before "gakkou e", the relevance data of FIG. 8 give a relevance of 0.4 between "gakkou e" and "watashi wa"; adding this value to the relevance of 0.8 between "iku" and "gakkou e" gives a relevance of 1.2.

【００４１】Further, when "today" is selected as the next phrase after "I", the degree of association is 0.2 from FIG. 8, so adding 0.2 to the above 1.2 gives a degree of association of 1.4. In other words, the degree of association is obtained by accumulating the values of the degree-of-association data shown in FIG. 8. The above example obtains the degree of association of several phrases by summing, but various other calculation methods may be adopted; for example, a product may be used instead of a sum. In that case, when "go", "to school", and "I" are selected as in the above example, the degree of association is 0.8 × 0.4 = 0.32, and when "today" is further selected it becomes 0.32 × 0.2 = 0.064.
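The accumulation described above can be sketched in a few lines of Python. This is only an illustration: the list `PAIR_VALUES` holds the three values quoted from FIG. 8 in the order the phrases are selected, and the function name `running_relevance` is hypothetical, not something defined in this publication.

```python
from itertools import accumulate
import operator

# Pairwise values quoted in the worked example, in selection order:
# ("go", "to school") = 0.8, ("to school", "I") = 0.4, then "today" = 0.2.
PAIR_VALUES = [0.8, 0.4, 0.2]

def running_relevance(values, combine=operator.add):
    """Return the accumulated degree of association after each selection.

    `combine` is operator.add for the sum variant and operator.mul for
    the product variant described in the text.
    """
    return list(accumulate(values, combine))

sums = running_relevance(PAIR_VALUES)                    # sum variant
products = running_relevance(PAIR_VALUES, operator.mul)  # product variant
print([round(v, 3) for v in sums])
print([round(v, 3) for v in products])
```

The rounded output follows the text step by step (0.8, 1.2, 1.4 for the sum; 0.8, 0.32, 0.064 for the product). Rounding is used only because binary floating point does not represent these decimals exactly.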

【００４２】As further methods, the degree of association may be calculated as the maximum value max of the degrees of association obtained from the data of FIG. 8 (in which case the degree of association when "go", "to school", and "I" are selected as above is 0.8), as the minimum value min, or as the average of the individual values.
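As a rough illustration of these alternatives, the following Python sketch applies the maximum, minimum, and average to the two values quoted for "go", "to school", and "I". The names `AGGREGATORS` and `aggregate_relevance` are assumptions introduced here for illustration.

```python
# Alternative ways to condense the pairwise values into one degree of
# association; the mode names are illustrative only.
AGGREGATORS = {
    "max": max,
    "min": min,
    "mean": lambda values: sum(values) / len(values),
}

def aggregate_relevance(values, mode="max"):
    """Condense a non-empty list of pairwise values using the chosen mode."""
    if not values:
        raise ValueError("at least one pairwise value is required")
    return AGGREGATORS[mode](values)

# Values for ("go", "to school") and ("to school", "I") from the example.
example = [0.8, 0.4]
print(aggregate_relevance(example, "max"))  # the 0.8 quoted in the text
print(aggregate_relevance(example, "min"))
print(round(aggregate_relevance(example, "mean"), 3))
```

Which variant works best is not stated in the text; the sketch only shows that they differ in the final reduction step.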

【００４３】Next, the second embodiment of FIG. 6 will be described with reference to FIGS. 7 and 8. As in the embodiment shown in FIG. 2, the DP matching unit 11 matches the first phrase of the input speech (the phrase at the end of the sentence) against those templates stored in the phrase template storage unit 15 whose phrase group flag is set, and the phrase group flags are updated. The matching result is sent from the DP matching unit 11 to the beam search pruning unit 21, which uses the recognition scores to narrow the phrase candidates obtained by DP matching down to a fixed number within the beam width.
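The pruning step can be sketched as keeping only the best-scoring candidates. The sketch below assumes that a higher recognition score is better and that candidates are simple `(phrase, score)` tuples; both are assumptions made for illustration. If the DP matching result were a distance, `heapq.nsmallest` would be used instead.

```python
import heapq

def prune_beam(candidates, beam_width):
    """Keep at most `beam_width` candidates with the highest scores.

    `candidates` is a list of (phrase, recognition_score) tuples.
    Higher scores are assumed to be better (an assumption of this sketch).
    """
    return heapq.nlargest(beam_width, candidates, key=lambda c: c[1])

# Hypothetical candidates for one phrase position.
candidates = [("to school", 0.9), ("to the station", 0.5), ("today", 0.2)]
print(prune_beam(candidates, beam_width=2))
```

Limiting the candidate list in this way bounds the number of templates that must be matched at the next phrase position.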

【００４４】The phrase candidates narrowed down by the beam search pruning unit 21 are sent to the phrase selection unit 12 and placed in a queue. The next phrase of the input speech is then matched, in the same manner as above, against those templates stored in the phrase template storage unit 15 whose phrase group flag is set. The DP matching result is then connected to the DP plane of the candidates for the preceding phrase held in the queue.

【００４５】In the same way, each phrase of the input speech is matched in turn against the phrase templates, and, as described above, matching ends when all the phrase group flags are in the × state. When a phrase is selected by the phrase selection unit 12, the relevance score processing unit 22 refers to the degree-of-association data stored in the degree-of-association data storage unit 23, calculates the degree of association by the method described above, combines it with the matching result of the DP matching unit 11 (for example, as a sum or a product), and outputs the result as the recognition score.

【００４６】FIG. 9 shows a third embodiment of the present invention. In FIG. 9, components identical to those of the second embodiment shown in FIG. 6 are given the same reference numerals. In this embodiment, the relevance score processing unit 22 of the second embodiment is made up of a relevance score addition unit 24 and a relevance score calculation unit 25; the rest of the configuration is the same as in the embodiment of FIG. 6.

【００４７】In the embodiment of FIG. 9, the relevance score calculation unit 25 refers to the degree-of-association data stored in the degree-of-association data storage unit 23, calculates a relevance score from the sum, product, or the like of the degree-of-association data as described above, and outputs it to the relevance score addition unit 24. The relevance score addition unit 24 applies the relevance score obtained by the relevance score calculation unit 25 to the recognition score output by the DP matching unit 11 (for example, by adding the relevance score to the recognition score or multiplying by it, as described above) and outputs the resulting recognition score.
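The division of work between the two units might be sketched as two small functions, one per unit. The function names, the mode strings, and the DP score of 2.0 are all hypothetical; the text only states that a sum or a product may be used at each stage.

```python
import math

def calculate_relevance_score(pair_values, mode="sum"):
    """Role of the relevance score calculation unit 25 (sketch).

    Condenses pairwise degree-of-association values into one score.
    """
    if mode == "sum":
        return sum(pair_values)
    if mode == "product":
        return math.prod(pair_values)
    raise ValueError(f"unknown mode: {mode}")

def apply_relevance_score(dp_score, relevance_score, mode="add"):
    """Role of the relevance score addition unit 24 (sketch).

    Combines the DP matching score with the relevance score.
    """
    if mode == "add":
        return dp_score + relevance_score
    if mode == "multiply":
        return dp_score * relevance_score
    raise ValueError(f"unknown mode: {mode}")

# Hypothetical DP score of 2.0 combined with the values 0.8 and 0.4.
score = calculate_relevance_score([0.8, 0.4], mode="sum")
print(round(apply_relevance_score(2.0, score, mode="add"), 3))
```

Keeping the two steps separate mirrors the split into units 25 and 24 and lets the accumulation rule and the combination rule be chosen independently.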

【００４８】[0048]

[Effects of the Invention] As is apparent from the above description, in the present invention phrases are classified in advance into phrase groups according to their case and meaning. With reference to this classification, flags are set for the phrase groups that may appear in the input speech, based on the dependency relations and semantic relations of the phrases, and the flagged phrases are matched against the phrases of the input speech. When a phrase belonging to a flagged phrase group actually appears in the input speech, the flag is reset. Continuous sentence speech input can therefore be recognized efficiently, taking dependency and semantic relations into account, and the performance of continuous sentence speech recognition can be improved.
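The flag mechanism summarized above can be illustrated with a minimal Python sketch. The group names, the member phrases, and the class `PhraseGroupFlags` are hypothetical; in the described system the groups and the flags to set would come from the case dominance grammar table, which is not modeled here.

```python
# Hypothetical phrase groups (classification by case and meaning).
PHRASE_GROUPS = {
    "agent": ["I", "you"],
    "destination": ["to school", "to the station"],
    "time": ["today", "tomorrow"],
}

class PhraseGroupFlags:
    """Tracks which phrase groups may still appear in the input."""

    def __init__(self, expected_groups):
        # A flag is set for each group that may appear in the sentence.
        self.flags = {g: (g in expected_groups) for g in PHRASE_GROUPS}

    def candidates(self):
        """Phrases whose templates should be matched next."""
        return [p for g, on in self.flags.items() if on
                for p in PHRASE_GROUPS[g]]

    def observe(self, phrase):
        """Reset the flag of the group that the recognized phrase belongs to."""
        for group, members in PHRASE_GROUPS.items():
            if phrase in members:
                self.flags[group] = False

    def finished(self):
        """True when no flagged group remains, so matching can stop."""
        return not any(self.flags.values())

flags = PhraseGroupFlags({"agent", "destination"})
print(flags.candidates())
flags.observe("to school")
print(flags.candidates())
```

Each recognized phrase removes a whole group from the candidate set, which is how the number of templates to match shrinks as the sentence is processed.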

【００４９】In addition, narrowing the phrase candidates to be matched against the phrases of the input speech down to a fixed number by beam search reduces the amount of computation needed for matching. Furthermore, calculating a relevance score from the degree of association between phrases and applying it to the matching result for the input speech allows continuous sentence speech input to be recognized more accurately.

[Brief description of drawings]

FIG. 1 is a block diagram showing the principle of the present invention.

FIG. 2 is a diagram showing a first embodiment of the present invention.

FIG. 3 is a diagram showing an example of the case dominance grammar table.

FIG. 4 is a diagram showing an example of the phrase group flag table.

FIG. 5 is a flowchart showing the processing in the phrase selection unit.

FIG. 6 is a diagram showing a second embodiment of the present invention.

FIG. 7 is a diagram showing the concept of beam search.

FIG. 8 is a diagram showing an example of the degree-of-association data.

FIG. 9 is a diagram showing a third embodiment of the present invention.

[Explanation of symbols]

11 DP matching unit
12 Phrase selection unit
13 Case dominance grammar table
14 Phrase group flag table
15 Phrase template storage unit
21 Beam search pruning unit
22 Relevance score processing unit
23 Degree-of-association data
24 Relevance score addition unit
25 Relevance score calculation unit

Claims

[Claims]

1. A continuous sentence speech recognition system that recognizes continuously spoken sentence speech by phrase matching, wherein: phrases are classified in advance into phrase groups according to their case and meaning; with reference to the classification result, flags are set for the phrase groups that may appear in the phrases of the input speech, based on the dependency relations and semantic relations of the phrases; the flags are inherited for as long as the flagged phrase groups may still appear in the phrases of the input speech; the phrase templates of the flagged phrases are matched against the phrases of the input speech, and the input speech is recognized on the basis of the matching result; and when, as a result of the matching, a phrase belonging to a flagged phrase group actually appears in the phrases of the input speech, the flag is reset, thereby narrowing down the candidates for the phrase that appears next.

2. The continuous sentence speech recognition system according to claim 1, wherein the phrase candidates to be matched against the phrases of the input speech are narrowed down to a fixed number by beam search.

3. The continuous sentence speech recognition system according to claim 1, wherein a relevance score is calculated from the degree of association between phrases and the calculated relevance score is applied to the matching result for the input speech.

4. The continuous sentence speech recognition system according to claim 3, wherein co-occurrence relation data between phrases are used as the degree-of-association data.

5. The continuous sentence speech recognition system according to claim 3, wherein adjacency relation data between phrases are used as the degree-of-association data.

6. The continuous sentence speech recognition system according to claim 3, wherein: degree-of-association data defined only between two phrases are used as the degree-of-association data; each time a phrase is matched against a phrase of the input speech, a relevance score with respect to the previously matched phrases is calculated and stored; and the relevance score at each such time is calculated from the previously calculated relevance score and the relevance score calculated this time.

7. The continuous sentence speech recognition system according to claim 6, wherein the relevance score at each such time is calculated as the sum of the previously calculated relevance score and the relevance score calculated this time.

8. The continuous sentence speech recognition system according to claim 6, wherein the relevance score at each such time is calculated as the product of the previously calculated relevance score and the relevance score calculated this time.

9. The continuous sentence speech recognition system according to claim 6, wherein the larger of the previously calculated relevance score and the relevance score calculated this time is used as the relevance score at each such time.

10. The continuous sentence speech recognition system according to claim 6, wherein the smaller of the previously calculated relevance score and the relevance score calculated this time is used as the relevance score at each such time.

11. The continuous sentence speech recognition system according to claim 6, wherein the relevance score at each such time is calculated as the average of the previously calculated relevance score and the relevance score calculated this time.