JPH01502610A - Continuous speech recognition system

Info

Publication number: JPH01502610A
Application number: JP50337286A
Authority: JP
Inventors: Ira Alan Gerson (ジャーソン・アイラ　アラン)
Original assignee: Motorola, Inc. (モトローラ・インコーポレーテッド)
Priority date: 1986-06-02
Filing date: 1986-06-02
Publication date: 1989-09-07
Also published as: WO1987007748A1; CA1336017C

Abstract

Method and arrangement for speech recognition for use in a speech recognition system, wherein a grammar model may be stored in memory, the grammar model being composed of nodes connected by arcs, each arc generally having an associated template prestored in memory and respective originating and terminating grammar nodes, and wherein groups of input frames may be compared to templates to generate similarity measure parameters. The invention includes structuring the grammar model such that arcs may be looped within the structure. For implementing this type of structure, the invention includes providing first parameter storage and second parameter storage for one or more nodes including a terminating node for a selected arc, determining similarity measure parameters for the originating node for the selected arc, determining similarity measure parameters for the arc using a template associated with the arc and the originating node's similarity measure parameters. Further, similarity measure parameters are determined at the terminating node and are stored in the first parameter storage for the terminating node. The first parameter storage contents are transferred to the second parameter storage for the terminating node, and the groups of input frames are recognized using the parameter storage contents associated with the terminating node.

Description

Detailed Description of the Invention

Continuous Speech Recognition System

Background of the Invention

The present invention relates to speech recognition systems and, more particularly, to the recognition of speech in which the end points of the spoken words are not predetermined.

Recognition of isolated words from a given grammar for a known speaker has long been known.

The words of the grammar are prestored as individual templates, each template representing the sound pattern of one word in the grammar. When an isolated word is spoken, the system compares the word with each individual template representing the grammar. This method is commonly referred to as whole-word template matching. Many successful recognition systems employ whole-word template matching with dynamic programming to cope with nonlinear time-scale variations between the spoken word and the prestored template.
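By way of illustration only, whole-word template matching with dynamic programming might be sketched as follows. The sketch assumes one scalar feature per frame, an absolute-difference local distance, and a symmetric three-way step pattern; none of these choices is taken from the present description, and the function names are hypothetical.

```python
def dtw_distance(frames, template):
    """Accumulated distance between an input word and one template.

    Assumption: each frame is a single number and the local distance is
    the absolute difference; a real system would compare feature vectors.
    """
    inf = float("inf")
    n, m = len(frames), len(template)
    # acc[i][j] = best accumulated distance aligning frames[:i] with template[:j]
    acc = [[inf] * (m + 1) for _ in range(n + 1)]
    acc[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            local = abs(frames[i - 1] - template[j - 1])
            # Allow the input to stretch, compress, or advance in step.
            acc[i][j] = local + min(acc[i - 1][j], acc[i][j - 1], acc[i - 1][j - 1])
    return acc[n][m]


def recognize_isolated(frames, templates):
    """Return the word whose template has the smallest accumulated distance."""
    return min(templates, key=lambda word: dtw_distance(frames, templates[word]))
```

With templates `{"ONE": [1, 2, 3], "TWO": [5, 5, 1]}`, an input spoken at half speed, `[1, 1, 2, 2, 3, 3]`, aligns with the "ONE" template at zero accumulated distance, which is the nonlinear time-scale variation the dynamic programming is meant to absorb.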

Although this technique is effective for isolated word recognition, many practical applications require continuous word recognition. In continuous word recognition, the number of words in a phrase need not be limited, and the identity of the earlier words can be determined before the phrase ends. In isolated word recognition, by contrast, delimiters are used to identify the beginning and end of the input pattern, and recognition is performed one word at a time. Moreover, a continuous speech recognition system must distinguish the input pattern from other recognizable patterns, from background noise, and from speaker-induced noise such as breath noise, whereas isolated word recognition usually cannot tolerate other recognizable patterns at the beginning or end of a word.

In "Two-Level DP-Matching - A Dynamic Programming-Based Pattern Matching Algorithm for Connected Word Recognition" by H. Sakoe, IEEE Transactions on Acoustics, Speech, and Signal Processing, vol. ASSP-27, No. 6, pp. 588-595 (December 1979), the whole-word template matching method is extended to handle connected word recognition. The paper presents a two-pass dynamic programming algorithm that finds the sequence of word templates best matching the entire input pattern. The first pass generates a score indicating the similarity between each template and each possible portion of the input pattern. The second pass uses these scores to find the best sequence of templates corresponding to the entire input pattern.

This extended method has clear drawbacks. One drawback is the amount of computation time required. Depending on the specific design requirements, this limitation may impose an undue need for an expensive high-speed processor.

Another drawback of this method is that the end point of the input pattern must be predetermined, and the entire input pattern must be stored in the system before template matching can accurately take place. If the input pattern is fairly long, recognition response time suffers substantially. In addition, errors in end-point detection severely degrade recognizer performance. Furthermore, the memory required to store this information can become excessive.

In "Partial Traceback and Dynamic Programming" by P. Brown, J. Spohrer, P. Hochschild, and J. Baker, IEEE Transactions on Acoustics, Speech, and Signal Processing, vol. ASSP-27, No. 6, pp. 588-595 (December 1979), a technique is described that allows continuous speech recognition of input patterns of arbitrary length without predetermined end points. This is accomplished using a technique called partial traceback. With partial traceback, recognized words are output before the complete input pattern has ended, without sacrificing recognizer performance. However, the partial traceback technique described appears to be processor-intensive and cumbersome to implement.

Accordingly, a need exists for a continuous speech recognition system that can be readily implemented and that operates effectively and inexpensively in real time.

Objects and Summary of the Invention

It is an object of the present invention to provide a speech recognition arrangement and method that can be implemented for real-time applications and can recognize continuous speech with inexpensive hardware.

It is a further object of the present invention to provide a speech recognition arrangement and method that allows efficient management of the speech recognition memory during the recognition process.

It is a further object of the present invention to provide a speech recognition arrangement and method that provides for a grammar having loops.

Briefly, the present invention relates to a method and arrangement for speech recognition for use in a speech recognition system, wherein a grammar model is prestored in memory, the grammar model being composed of nodes connected by arcs, each arc having an associated template prestored in memory and respective originating and terminating nodes, and wherein groups of input frames are compared with templates to generate similarity measure parameters. The invention includes structuring the grammar model such that arcs may loop within the structure. To implement this type of structure, the invention includes providing first parameter storage and second parameter storage for one or more nodes, including the terminating node of a selected arc; determining similarity measure parameters for the originating node of the selected arc; and determining similarity measure parameters at the terminating node of that arc using the template associated with the arc and the previously determined similarity measure parameters. Further, the similarity parameters determined at the terminating node are stored in the first parameter storage for the terminating node. The contents of the first parameter storage are transferred to the second parameter storage for the terminating node, and the groups of input frames are recognized using the parameter storage contents associated with the terminating node.

Brief Description of the Drawings

The features of the present invention that are believed to be novel are set forth with particularity in the appended claims. The invention, together with further objects and advantages thereof, may best be understood by reference to the following description taken in conjunction with the accompanying drawings, in the several figures of which like reference numerals identify like elements.

FIG. 1 is a hardware block diagram of a speech recognition system according to the present invention.

FIG. 2 is a diagrammatic representation of a recognition grammar model illustrating one aspect of a speech recognition system implemented according to the present invention.

FIG. 3 is a diagrammatic representation of a speech grammar tree enumerating all possible paths through the speech grammar model of FIG. 2.

FIG. 4 is a diagrammatic representation of a grammar model according to the present invention.

FIGS. 5a, 5b, and 5c are flowcharts depicting the sequence of steps performed in implementing the recognition process according to the present invention.

FIG. 6 is a flowchart illustrating block 72 of FIG. 5c in more detail.

FIGS. 7a, 7b, 7c, and 7d are flowcharts illustrating block 44 of FIG. 5a in more detail.

FIG. 8 is a series of grammar tree diagrams illustrating an example of "traceback" according to the present invention.

Detailed Description of the Preferred Embodiment

Referring to FIG. 1, there is shown a block diagram of a speech recognition system that can be used to practice the present invention.

The block diagram includes a template memory 10 containing a prestored grammar. The formation of a typical prestored grammar is described in "A Simplified, Robust Training Procedure for Speaker Trained, Isolated Word Recognition Systems" by L. R. Rabiner and J. G. Wilpon, Journal of the Acoustical Society of America, 68(5), November 1980.

An acoustic processor 12, such as that described in "On the Effects of Varying Filter Bank Parameters on Isolated Word Recognition" by B. A. Dautrich, L. R. Rabiner, and T. B. Martin, IEEE Transactions on Acoustics, Speech, and Signal Processing, vol. ASSP-31, pp. 793 ff. (August 1983), converts the input speech into a series of speech segments, commonly called "frames". Each frame represents a time segment of the input speech, typically in the form of LPC or filter bank data. Frames from the acoustic processor are passed to recognizer 14.

Recognizer 14 accesses word templates from the grammar prestored in template memory 10 and processes each input frame from acoustic processor 12 using segments of the word templates. Such techniques are inherent in many speech recognition systems and may be referred to as "template processing".

Recognizer 14 has bidirectional access to two tables: link table memory 16 and node table memory 19. Link table memory 16 is used to store five associated arrays. Node table memory 19 is used to store parameters associated with the grammar model. These tables, together with the grammar model, are described further below.

Recognizer 14 may be implemented using two processors: a recognition processor 18 and a link traceback processor 20. Recognition processor 18 handles all template matching, grammar, control, and communication with link traceback processor 20. Link traceback processor 20 is used to maintain the link table memory.

This function includes recording possible template matches as continuous speech is input, storing the associated information in link table memory 16, freeing space in link table memory 16 for other information, and outputting recognition results as the input speech is identified. The functions of recognition processor 18 and link traceback processor 20 may be combined in one processor or separated as shown, in which case recognition processor 18 can be implemented as described in detail in "An Algorithm for Connected Word Recognition" by J. Bridle, M. Brown, and R. Chamberlain, Proceedings of the IEEE International Conference on Acoustics, Speech, and Signal Processing, pp. 899-902 (1982). The link traceback processor, when used in accordance with the present invention, can be implemented using an 8-bit processor such as the Motorola MC6801.

Grammar Modeling

Referring now to FIG. 2, there is shown a simplified recognition grammar model illustrating all the possible word sequences that the system can recognize. The model is called "simplified" because, for purposes of illustration, the grammar shown is severely restricted compared with what is generally required. In FIG. 2 there are six possible word strings, each consisting of two words (further illustrated in FIG. 3). In a typical speech recognition system, the grammar model may have many more possible word strings, each containing several words (arcs). The topology of the grammar model is stored in memory as a network of interconnected arcs, each having an originating node and a terminating node. Each arc may also have one or more pointers to a corresponding template in template memory.
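By way of illustration only, the stored topology of such a grammar might be sketched as a list of arcs, each carrying its originating node, its terminating node, and a key into template memory. The class and function names below are hypothetical. Node numbers 24 and 26 follow the reference numerals used for the tree of FIG. 3, while 99 is merely a placeholder for the final node, whose numeral is not given in this description.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Arc:
    origin: int              # originating grammar node
    terminus: int            # terminating grammar node
    template: Optional[str]  # key into template memory; None if the arc has no template


# 24 and 26 are the numerals used in the description; 99 is a placeholder.
START, MIDDLE, END = 24, 26, 99

GRAMMAR = (
    [Arc(START, MIDDLE, word) for word in ("ONE", "TWO", "THREE")]
    + [Arc(MIDDLE, END, word) for word in ("FOUR", "FIVE")]
)


def arcs_leaving(node, grammar=GRAMMAR):
    """Arcs whose originating node is `node`."""
    return [arc for arc in grammar if arc.origin == node]


def arcs_entering(node, grammar=GRAMMAR):
    """Arcs whose terminating node is `node`."""
    return [arc for arc in grammar if arc.terminus == node]
```

Five arcs suffice to describe all six two-word strings, because the two second-word arcs are shared by every first word.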

Grammar Path Modeling

In FIG. 3, each of the six possible word strings from FIG. 2 is enumerated in a tree diagram. There are three possible first words: "ONE", "TWO", and "THREE". Each possible first word is followed by two possible second words, "FOUR" and "FIVE". During template matching, that is, while input frames are being compared with the prestored word templates, the recognition processor recognizes potential "word ends". A potential word end is found whenever a sequence of input frames potentially matches a word template. The identified word template is added to the tree diagram through link information stored in the previously mentioned link table, together with an accumulated distance indicating a measure of similarity between the sequence of input frames being processed and the templates leading to the node. For example, given the grammar string possibilities of FIGS. 2 and 3, once a sequence of input frames is identified as potentially matching the word "TWO", "TWO" is added to the tree diagram from the starting node, node 24. FIG. 3 shows that, after further frames were input and processed, the word "ONE" became a potential match. It is therefore then also added to the tree diagram at node 24. Next the word "THREE" is added to the tree diagram, then "FOUR" is added at node 26, after which "FIVE" is added at the same node, and so on. This continues, with each potentially matching template being added to the tree diagram as it is identified over time.
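By way of illustration only, the enumeration of word strings that the tree diagram depicts might be sketched as a walk over the grammar arcs. The arcs are written as hypothetical (origin, terminus, word) tuples. Nodes 24 and 26 follow the numerals in the description, and 99 is a placeholder for the final node. The recursion terminates only for a grammar without loops.

```python
# (origin, terminus, word); 99 is a placeholder numeral for the final node.
ARCS = [
    (24, 26, "ONE"), (24, 26, "TWO"), (24, 26, "THREE"),
    (26, 99, "FOUR"), (26, 99, "FIVE"),
]


def word_strings(node, final, arcs):
    """List every word string from `node` to `final` (loop-free grammars only)."""
    if node == final:
        return [[]]
    strings = []
    for origin, terminus, word in arcs:
        if origin == node:
            for rest in word_strings(terminus, final, arcs):
                strings.append([word] + rest)
    return strings
```

Calling `word_strings(24, 99, ARCS)` yields the six two-word strings, one per leaf of the tree. In the recognizer itself the tree is not enumerated in advance but grows link by link as potential word ends are detected.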

The term "tree node", or a reference to a node in a tree diagram, will be used interchangeably with the term "link record" (or "linked record"). In general, a link record is a set of data stored in memory that defines a connection in the tree diagram, including the identification of a particular tree node and its relationship to the previous node in the tree topology.

For each frame at which a potential word end occurs, a new entry, or link record, is added to the link table, corresponding to a link in the representative tree diagram. Very often, a word template, which is typically represented in the form of a state diagram as a series of frames (representing states), will exhibit several potential word ends as input frames are processed. Each time a potential word end is detected, the corresponding template is added to the tree as a new link. In addition, each state of each template records the accumulated distance processed through the current input frame and a link pointer to the link record in the link table corresponding to the tree link from which decoding for that template began. For further details on template matching, reference may be made to "An Algorithm for Connected Word Recognition" by J. Bridle, M. Brown, and R. Chamberlain, Proceedings of the IEEE International Conference on Acoustics, Speech, and Signal Processing, pp. 899-902, 1982.

Unfortunately, with a large grammar, problems arise from continually adding templates to the tree. First, it delays recognition response time. The longer the sequence of input frames, the longer the operator must wait for the system to recognize the words and act on them.

Second, continually adding templates requires a large amount of memory for linking the tree diagram information. If the grammar model is complex and several potential word-end frames match for each potential word, the memory required for the link table (the tree) grows at a very rapid rate. If the growth rate is too high, the memory requirement becomes impractical.

The third problem concerns the actual modeling of the grammar. For continuous speech applications, it is desirable, and sometimes necessary, to be able to recognize indefinitely long connected word sequences.

A model structure having such word sequences will be referred to as an "infinite length model". As noted above, the grammar model of FIG. 2 is somewhat simplified. It should be apparent to those skilled in the art of speech recognition that attempting to represent an infinitely long sequence as a model in memory is impractical. Moreover, computing the segments of such a model is impractical for real-time applications. For these reasons, the present invention accommodates infinite length models with minimal computation and memory requirements.

Infinite Length Model

Referring now to FIG. 4, there is shown a diagrammatic representation of a particular grammar model that can accommodate sequences of infinite length with minimal memory and computation time. Those skilled in the art will appreciate that a given grammar calls for specific modeling criteria. It should therefore be apparent that the example shown in FIG. 4 can be modified to accommodate whatever grammar is required in practice.

The grammar allows recognition of any of the digits 0 through 9, any of which may or may not be separated by a period of silence. In this model, the input to the recognition process is terminated by the word "stop". For example, the sequence "0-1-1-9-8-0-4-stop" can be recognized, with the dashes representing optional silence.

Two features of this type of grammar model are the use of null arcs between nodes and the use of loops within the grammar model. A null arc is essentially a virtual connection between two nodes. It allows sequences of words to be recognized without duplicating the arcs representing those words in the grammar model. For example, at node 27, if the word "stop" were to be recognized without using the null arc, silence would first have to be detected. If silence is not to be required before the word "stop", the grammar model would have to be modified so that an arc representing "stop" also leaves node 27. Thus, the null arc avoids having to duplicate the arcs that end at the null arc's originating node so that they also run from their own originating node to the null arc's terminating node.

The second advantage realized by this type of grammar model is the loop itself. Allowing loops avoids duplicating template representations within the grammar model.

For example, if the word sequence "0-0-0-1-3-9" must be recognized, the progression through the word model can be recognized without duplicating "0" three times in the model.
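By way of illustration only, the combined effect of a loop and a null arc might be sketched at the level of word symbols, leaving the acoustic matching aside. The topology below is an assumption and is not taken from FIG. 4: digit arcs run from an entry node to node 27, a "silence" arc and a null arc both return from node 27 to the entry node, and a "stop" arc leaves the entry node. The node numbers 1 and 2 and all function names are hypothetical.

```python
NULL = None                          # label used here for a null arc
ENTRY, AFTER_DIGIT, FINAL = 1, 27, 2  # 1 and 2 are placeholder numerals

# (origin, terminus, label): one arc per digit, shared by every repetition.
ARCS = [(ENTRY, AFTER_DIGIT, str(d)) for d in range(10)]
ARCS += [
    (AFTER_DIGIT, ENTRY, "silence"),  # optional silence between words
    (AFTER_DIGIT, ENTRY, NULL),       # null arc: skip the silence
    (ENTRY, FINAL, "stop"),
]


def null_closure(nodes):
    """Add every node reachable from `nodes` through null arcs alone."""
    closure, stack = set(nodes), list(nodes)
    while stack:
        node = stack.pop()
        for origin, terminus, label in ARCS:
            if origin == node and label is NULL and terminus not in closure:
                closure.add(terminus)
                stack.append(terminus)
    return closure


def accepts(tokens):
    """True if the token sequence traces a path from ENTRY to FINAL."""
    active = null_closure({ENTRY})
    for token in tokens:
        reached = {t for o, t, label in ARCS if o in active and label == token}
        active = null_closure(reached)
        if not active:
            return False
    return FINAL in active
```

Thirteen arcs accept digit strings of any length, with or without intervening silence, because the loop reuses the same ten digit arcs and the null arc removes the need for a second "stop" arc leaving node 27.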

To accommodate this type of word model, the present invention incorporates special parameters. For each node, these parameters include the current accumulated distance, the current link from which that accumulated distance originated, the previously accumulated distance, and the previous link from which the previously accumulated distance originated. These parameters are stored in the node table, which may be organized as follows.

Node Table

The node table is used to store information temporarily for each node while the arcs leading to the node are processed for an input frame. An arc leading to a node may be a null arc or an arc having an associated template.

For each node, two sets of entries are allocated in the table. The first set of entries is the previously accumulated distance and its associated link. This information is stored temporarily and can be used to determine the accumulated distance for the "best" arc into the node.

The second set of entries is the current accumulated distance and its associated link. Once all arcs leading to a node have been processed for a given input frame, the "previous" parameters are copied to the "current" parameters. That is, the information from the left two columns is copied to the right two columns, thereby retaining the best accumulated distance leading into the node and the link for the arc from which it originated. The accumulated distances in the previous columns are then all made inactive. The node table is described further in connection with the recognition flowcharts of FIGS. 5a and 5b.
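By way of illustration only, the two sets of node table entries and the copy performed at the end of each input frame might be sketched as follows. The use of infinity as the "inactive" marker, the use of 0 as "no link", and all class and function names are assumptions made for this sketch.

```python
INACTIVE = float("inf")  # assumed marker for an inactive accumulated distance


class NodeEntry:
    """One row of the node table: a 'previous' pair and a 'current' pair."""

    def __init__(self):
        self.prev_dist, self.prev_link = INACTIVE, 0  # first set of entries
        self.cur_dist, self.cur_link = INACTIVE, 0    # second set of entries


def offer(entry, dist, link):
    """Called once per arc ending at the node; keeps only the best arc so far."""
    if dist < entry.prev_dist:
        entry.prev_dist, entry.prev_link = dist, link


def end_of_frame(table):
    """After all arcs are processed: copy 'previous' to 'current', then deactivate."""
    for entry in table.values():
        entry.cur_dist, entry.cur_link = entry.prev_dist, entry.prev_link
        entry.prev_dist, entry.prev_link = INACTIVE, 0
```

One plausible reading of the design is that arcs leaving a node read only its "current" pair while arcs entering it write only its "previous" pair, so that a looped arc cannot feed a distance computed during the present frame back into the same frame.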

Link Table

The link table represents all of the potential word sequences under consideration, in the form of a tree network similar to that of FIG. 2. The word sequences are, in effect, concatenated templates having the potential word ends detected during template matching. Setting up the network in this way makes it possible to analyze those links that are an unambiguous part of every potential word sequence. This analysis process is called traceback. Properly used, traceback also provides an efficient way of releasing link records that are clearly no longer part of any sequence still under consideration.

Several kinds of information must be stored for each link, or node-to-node tree connection, in the tree diagram. This information is stored in the L-ACT, L-FWRD, L-BACK, L-WORD, and L-PTR arrays in link table memory 16 of FIG. 1. In this embodiment, each array is 255 bytes long and is located one byte past a 256-byte boundary to allow efficient access. Corresponding elements from each array make up a "link record". The link records are chained into two linked lists. One list holds the free link records, that is, empty record space available for additional links. The second list is the established list, which holds the records of the links currently in use. These lists are chained together by the L-PTR array. An entry in L-PTR indicates the next link record in the table, from either the established list or the free list, each record comprising one byte from each of the five arrays. For example, for a given link record in the established list, if the corresponding byte of the L-PTR array contains the binary representation of the number "2", the next record in the established list resides in the second byte of all five arrays. A "0" entry in the L-PTR array marks the end of a linked list.
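By way of illustration only, the five parallel arrays and the two lists chained through L-PTR might be sketched as follows. The arrays are modeled as 256-element lists in which index 0 is never used as a record, so that records 1 through 255 correspond to the 255 bytes of each array and 0 can serve as the end-of-list marker. The two head pointers, the method names, and the use of underscores in place of hyphens are assumptions of this sketch.

```python
SIZE = 256  # records occupy indices 1..255; index 0 is the end-of-list marker


class LinkTable:
    def __init__(self):
        self.L_ACT = [0] * SIZE
        self.L_FWRD = [0] * SIZE
        self.L_BACK = [0] * SIZE
        self.L_WORD = [0] * SIZE
        self.L_PTR = [0] * SIZE
        # Chain records 1..255 into the free list; L_PTR[255] stays 0.
        for rec in range(1, SIZE - 1):
            self.L_PTR[rec] = rec + 1
        self.free_head = 1         # assumed head pointer of the free list
        self.established_head = 0  # assumed head pointer; 0 means empty

    def add_link(self, back, word):
        """Move one record from the free list to the established list."""
        rec = self.free_head
        if rec == 0:
            raise MemoryError("link table full")
        self.free_head = self.L_PTR[rec]
        self.L_BACK[rec], self.L_WORD[rec] = back, word
        self.L_ACT[rec], self.L_FWRD[rec] = 0, 0
        self.L_PTR[rec] = self.established_head
        self.established_head = rec
        return rec

    def _walk(self, head):
        records = []
        while head != 0:
            records.append(head)
            head = self.L_PTR[head]
        return records

    def established(self):
        """Record numbers currently in use, most recently added first."""
        return self._walk(self.established_head)

    def free_count(self):
        return len(self._walk(self.free_head))
```

Because each record number fits in one byte, a single index register can address the same record in all five arrays, which appears to be the efficiency the boundary alignment is intended to provide on an 8-bit processor.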

The L-BACK and L-WORD arrays hold the actual link information. L-BACK holds a pointer to the previous link in the decoding path, i.e. the previous node of the tree diagram, while L-WORD holds a symbol representing the word decoded at the end of the current link. For example, in Fig. 3, after the word "FOUR" has been appended to tree node 26, L-WORD holds an 8-bit symbol representing the word "FOUR" and L-BACK holds a pointer to the link record corresponding to tree node 26. The other two arrays, L-ACT and L-FWRD, are used for "traceback" through the decoding paths (the possible word sequences). L-ACT is used to mark the decoding paths still under consideration as possible matches, and L-FWRD is used to point to the succeeding node in the tree diagram, i.e. the reverse of L-BACK.
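The role of L-BACK and L-WORD can be illustrated with a short sketch. The following Python fragment is a hypothetical miniature, not the disclosed implementation: the array contents, the function name `decode_path`, and the convention that the root record has an L-BACK value of 0 are all assumptions made for illustration.

```python
# Hypothetical miniature of the link arrays. Index 0 is reserved so that
# a pointer value of 0 can mean "no record"; record 1 is taken as the root.
L_BACK = [0, 0, 1, 2, 2]   # records 3 and 4 both extend record 2
L_WORD = [0, 0, 4, 1, 9]   # made-up 8-bit word symbols; the root has none


def decode_path(record, l_back, l_word):
    """Follow L-BACK from `record` to the root; return word symbols in spoken order."""
    words = []
    while l_back[record] != 0:          # assumption: the root has L-BACK == 0
        words.append(l_word[record])    # word decoded at the end of this link
        record = l_back[record]         # step to the previous link in the path
    words.reverse()                     # collected newest first, so reverse
    return words
```

Calling `decode_path(3, L_BACK, L_WORD)` walks records 3, 2, and then stops at the root, which is how a decoding path is recovered from a single link pointer.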

Hereinafter, a link record flagged as being considered part of a path that may yet be recognized (an active path) is referred to as an active link record.

Because the link records hold traceback information, the arcs through the word model that were used to reach a given state can be identified. Traceback can also prune useless information from the tree, which is necessary to keep information from accumulating needlessly in memory. Traceback can further be used to output words that have been unambiguously recognized, i.e. words common to all active paths. An L-BACK entry of the link table points to the previous entry in the table, which corresponds to the previously connected node in the tree diagram. Traceback is therefore said to be a process of tracing back through the tree diagram to the point, i.e. the tree node, at which all paths converge. The concept of tracing back to the point where all paths converge is well known to those skilled in the art; a general description of traceback may be found in "Partial Traceback and Dynamic Programming," cited above.

The arrays described above are listed below for reference in the description of the subsequent figures.

L-PTR: 255 bytes. Each byte serves as a pointer to the previous link record added, in order of time, to the established list of the tree diagram (table). It is also used to chain the free link records of the free list.

L-BACK: 255 bytes. Each byte serves as a pointer to the previous link record in the tree diagram.

L-WORD: 255 bytes. Each byte serves as a symbol indicating the candidate recognized word corresponding to the current link record.

L-ACT: 255 bytes. Each byte indicates whether the current link record is active (used during traceback).

L-FWRD: 255 bytes. Each byte serves as a pointer to the succeeding valid link record in the tree diagram (used during traceback).

In addition to the arrays described above, five other pointers are used. They are as follows.

HEAD: a 1-byte pointer to the first, i.e. most recently added, link record of the established list, which is chained by the L-PTR array.

FREE: a 1-byte pointer to the first link record of the free list, which is chained by the L-PTR array.

PTR: a 1-byte pointer referencing the current tree node being processed.

TMP1 and TMP2: 1-byte temporary pointers used in the recognition flowcharts.

Structurally, assuming a table with only ten entries in the established list, these arrays could be arranged as follows.

Record No.   L-PTR   L-BACK   L-WORD   L-ACT   L-FWRD
8 5 5 " 9 0 0

The entries of the above table are shown as a tree diagram in Appendix A. Note that HEAD points to record #7 and that the "0" in the L-PTR entry of record #3 indicates the last record of the list. FREE is not shown.

By means of L-PTR, a record can be inserted into the established list simply by taking a record from the free list, making its L-PTR entry point to the HEAD record, and updating HEAD and FREE. When a record is removed from the established list during traceback, it is made available again, without rearranging the table, by linking the record into the free list and using the L-PTR entries to link across the removed record in the established list. The L-ACT and L-FWRD entries are used only during traceback and are otherwise always reset to 0.
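Removal of a record without rearranging the table can be sketched as follows. This is a minimal Python illustration under stated assumptions: the function names `walk` and `remove_record` are hypothetical, lists are indexed so that 0 is the end-of-list marker, and the search for the predecessor by walking the established list is one possible way to "link across" the removed record, not necessarily the one used in the embodiment.

```python
def walk(start, l_ptr):
    """Return the record numbers of a list chained by L-PTR, starting at `start`."""
    records = []
    while start != 0:                 # a 0 entry marks the end of the list
        records.append(start)
        start = l_ptr[start]
    return records


def remove_record(rec, l_ptr, head, free):
    """Unlink `rec` from the established list and push it on the free list.

    Returns the updated (head, free) pair; `l_ptr` is modified in place.
    """
    if head == rec:
        head = l_ptr[rec]             # the removed record was the newest one
    else:
        prev = head
        while prev != 0 and l_ptr[prev] != rec:
            prev = l_ptr[prev]
        if prev == 0:
            raise ValueError("record is not in the established list")
        l_ptr[prev] = l_ptr[rec]      # link across the removed record
    l_ptr[rec] = free                 # chain the record onto the free list
    return head, rec                  # `rec` is the new first free record
```

Only pointer bytes change; the L-BACK and L-WORD bytes of the other records stay where they are, which is the point of chaining through L-PTR.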

Recognition Flowchart

Referring now to Figs. 5a through 5c, a recognition flowchart is shown in accordance with the present invention. The flowchart of Fig. 5a begins at block 30 by resetting the link table and its associated pointers. The reset procedure includes setting each byte of L-FWRD and L-ACT equal to 0, setting the HEAD pointer equal to 1 to indicate the beginning of the established list, and setting L-PTR(1) and L-PTR(255) equal to 0 to indicate the ends of the established list and the free list, respectively.

In addition, the template state memory, which is typically held in memory, is set inactive. The first record thus constitutes the established list, and records 2 through 255 are chained using the L-PTR entries to form the free list. HEAD now points to the beginning of the established list, and FREE points to the beginning of the free list (link record #2).
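The reset state described for block 30 can be written out as a short sketch. The Python below is illustrative only: the function name `reset_link_table`, the dictionary layout, and the use of Python lists with an unused index 0 in place of byte arrays are assumptions, and the template state memory is not modeled.

```python
def reset_link_table(n=255):
    """Build the reset state of an n-record link table (block 30).

    Index 0 of every list is unused so that a pointer value of 0 can
    serve as the end-of-list marker.
    """
    names = ("L_PTR", "L_BACK", "L_WORD", "L_ACT", "L_FWRD")
    table = {name: [0] * (n + 1) for name in names}   # L-ACT, L-FWRD all 0
    for rec in range(2, n):
        table["L_PTR"][rec] = rec + 1   # chain records 2..n into the free list
    table["L_PTR"][1] = 0               # end of the established list
    table["L_PTR"][n] = 0               # end of the free list
    table["HEAD"] = 1                   # record 1 is the whole established list
    table["FREE"] = 2                   # first free record
    return table
```

With the default size, following L-PTR from FREE visits records 2 through 255 in order, while the established list contains record 1 alone.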

At block 32 of Fig. 5a, the recognition grammar model is initialized by initializing the node table. This may be done by arbitrarily assigning a low accumulated distance measure to the previous accumulated distance of the initial node, thereby marking the starting point of the grammar.

The previous link pointer of the initial node is set to 1, corresponding to the first entry of the link table.

All other nodes of the grammar model are initialized as inactive. A grammar node can be set inactive by setting its previous accumulated distance in the node table equal to infinity, which indicates that no possibility exists at the other nodes at the start of processing.
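The node table initialization of block 32 might look like the following sketch. The `NodeEntry` fields, the function name `init_node_table`, and the choice of 0.0 as the "low" starting distance are assumptions for illustration; the text only requires that the initial node receive a low value and that every other node receive infinity.

```python
import math
from dataclasses import dataclass


@dataclass
class NodeEntry:
    """One row of a hypothetical node table; infinity means inactive."""
    prev_dist: float = math.inf   # previous accumulated distance
    prev_link: int = 0            # previous link pointer
    cur_dist: float = math.inf    # current accumulated distance
    cur_link: int = 0             # current link pointer


def init_node_table(num_nodes, initial_node=0, start_dist=0.0):
    """Block 32: activate only the initial node of the grammar model."""
    table = [NodeEntry() for _ in range(num_nodes)]
    table[initial_node].prev_dist = start_dist   # arbitrary low distance
    table[initial_node].prev_link = 1            # first entry of the link table
    return table
```

Using infinity as the inactive marker means a later "is this node active" test reduces to checking whether the stored distance is finite.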

At block 34, a traceback counter is initialized to 10. The traceback counter is used to indicate periodically that the traceback process should be performed. In this embodiment, traceback is performed each time ten input frames have been processed.

At block 36, the next input frame is input to the system for the template matching described earlier. All of the remaining steps of this flowchart relate to the processing of the current input frame.

The traceback counter is decremented at block 38 to indicate that a frame has just been input.

At block 39, the node table is updated before the input frame is processed. The previous node parameters ("parameters" meaning the accumulated distance and the link indicating where it originated) are copied to the current node parameters of the node table. In addition, all of the previously accumulated distances are set inactive.

At block 40, a test is performed to determine whether all nodes of the grammar model have been processed, in other words, whether the input frame has been processed for the entire grammar model. If all nodes of the grammar model have been processed, flow proceeds to block 42 to check whether the traceback counter indicates that traceback should be performed for this frame. If so, traceback is performed by calling traceback subroutine 44, described later with reference to Figs. 7a through 7d. Following traceback, the traceback counter is reset at block 46 before the next input frame is processed at block 36.
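The interplay of the traceback counter with the frame loop (blocks 34 through 46) can be sketched as below. This is a hedged Python outline: the function name `frames_with_traceback` is hypothetical, the node and template processing is left as a comment, and the assumption that a counter value of 0 is what "indicates" traceback is not stated in the text.

```python
TRACEBACK_PERIOD = 10   # frames between tracebacks in this embodiment


def frames_with_traceback(num_frames, period=TRACEBACK_PERIOD):
    """Return the 1-based numbers of the frames on which traceback would run."""
    counter = period                          # block 34: initialize counter
    traceback_frames = []
    for frame in range(1, num_frames + 1):    # block 36: next input frame
        counter -= 1                          # block 38: one more frame seen
        traceback_due = (counter == 0)        # assumed meaning of "indicates"
        # Blocks 39 to 74 (node table update, node and template
        # processing, link insertion) would run here for this frame.
        if traceback_due:                     # block 42
            traceback_frames.append(frame)    # block 44: call traceback
            counter = period                  # block 46: reset the counter
    return traceback_frames
```

For 25 input frames the sketch schedules traceback on frames 10 and 20, matching the stated period of ten frames.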

If not all nodes of the grammar model have been processed, flow proceeds from block 40 of Fig. 5a to block 50 of Fig. 5b. At block 50, processing of the recognition grammar model advances to the next node. The "next node" may be any node that has not yet been processed for this frame and that is not the terminating node of a null arc.

In particular, the nodes must be processed in such an order that the node at which a null arc originates is processed before the node at which that null arc terminates. This ensures that the node accumulated distances and links are updated for the originating nodes of the templates before those templates are processed.

At block 52, a test is performed to determine whether all templates ending at this node have been processed.

It will be appreciated that each template immediately preceding each node of the grammar model is processed before moving on to another node. Once all templates ending at this node have been processed, flow proceeds to block 68 of Fig. 5c, described later. If not all templates have been processed, flow proceeds to block 54 to check whether the next arc to be processed at this node is a null arc. If the next arc is a null arc, flow proceeds to block 55.

At block 55, the accumulated distance and the link leading to the node are set to the previous accumulated distance and link stored in the node table for the originating node of the null arc.

At this point it may be useful to summarize the processing of the recognition grammar model as shown so far in the recognition flowchart. Referring again to Fig. 2, the initial node 22 of the grammar model is set active, and the corresponding link table entry is initialized to mark the origin from which all possible grammar paths (branches of the tree) emanate.

For each input frame processed, the grammar model is traversed one node at a time from the initial node to the final node. Further, for each node of the grammar model, the templates ending at that node are processed one at a time, as described below. Thus, for every input frame each node is processed, and for every node each template ending at that node is processed.

As indicated at block 56, regardless of whether traceback is required, the next template for the current arc is processed at either block 58 or block 60. When the template is matched in either block, the accumulated distances and link pointers for all states of the template are updated on the basis of the current input frame, the template, and the current accumulated distance and link pointer, held in the node table, of the node at which the arc originates. Because there is a one-frame processing delay between the time the current accumulated distance and link were computed (the previous frame) and the time they are used (the current frame), the template processing must use this information as though it belonged to the processing of the previous frame. In other words, the template processing uses the current accumulated distance and current link to finish processing the previous frame before performing the template processing for the current frame. If a possible word end exists for this template at the current frame, the accumulated distance and link pointer corresponding to the possible word end are generated. See the previously cited "An Algorithm for Connected Word Recognition."

If flow proceeds from block 56 to block 60, indicating that traceback is being performed for this frame, then in addition to the template processing described above, all of the L-ACT entries corresponding to the link records pointed to by the link pointers of the active states in the template are set non-zero. An "active" template state is one having a finite accumulated distance.

Flow then proceeds to block 62, where a test is performed to determine whether the accumulated distance of the current template is better than the current accumulated distance corresponding to the best template (arc) previously processed that ends at this node for this frame (which is infinite if this is the first template processed for the node). The result of this test can be true only when template matching of the current input frame indicates a possible word end for the word template. As noted earlier, a possible word end indicates that the sequence of input frames corresponds to, i.e. matches, a word template stored in template memory.

If the template has no possible word end for the current input frame, its associated accumulated distance is infinite.

If the most recently processed template (arc) is found not to have an accumulated distance better than the current accumulated distance stored for the node, flow returns to block 52, where another template ending at the node is processed.

If the accumulated distance for the most recently processed arc is found to be the best processed so far for the node, flow proceeds to block 64 to record this information. At block 64, the accumulated distance and link pointer corresponding to the arc just processed are recorded in the node table as the current accumulated distance and current link for the grammar model node. In addition, the word number, or symbol, representing the template is recorded. The word number is recorded so that the word can subsequently be output if it is later confirmed as recognized. From block 64, flow proceeds to block 52 as described above.
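The comparison at blocks 62 and 64 amounts to keeping a running minimum over the arcs that end at a node. The following Python sketch illustrates this; the function name `best_arc`, the tuple layout, and the reading of "better" as "numerically smaller accumulated distance" are assumptions, the last one suggested by the use of infinity for inactive entries.

```python
import math


def best_arc(candidates):
    """Pick the best word-end result among the arcs ending at one node.

    `candidates` yields (accumulated_distance, link_pointer, word_number)
    tuples, one per arc; a distance of infinity means the arc has no
    possible word end at the current frame.
    """
    best = (math.inf, 0, 0)            # infinite until a first arc qualifies
    for dist, link, word in candidates:
        if dist < best[0]:             # block 62: better than the best so far?
            best = (dist, link, word)  # block 64: record distance, link, word
    return best
```

If every candidate has an infinite distance, the returned distance stays infinite, which corresponds to the node remaining inactive for this frame.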

If block 54 indicates that all arcs ending at the node have been processed, flow proceeds to block 68 of Fig. 5c. In Fig. 5c, blocks 68 through 74 determine whether a link record should be added to the tree and, if so, add the link record to the tree through the link arrays.

At block 68, a test is performed to determine whether the grammar model node is active. The only way a node can be active is if at least one word template processed for the node has a possible word end for the current input frame.

If the node is inactive, flow proceeds to block 40 of Fig. 5a to look for another node to process for the current frame. Otherwise, flow proceeds to block 70.

At block 70, a test is performed to determine whether the best arc ending at the node has a corresponding word template. In some cases the arc may instead be another kind of template, such as a silence template, or a null arc.

In that case, flow proceeds to block 40 of Fig. 5a. A silence template is not added to the tree because silence typically need not be output as recognized. If the best arc ending at the node represents a word template, the LINK subroutine (Fig. 6) is called to add a link record to the tree diagram. Parameters indicating the link pointer corresponding to the link record for that template and the word number representing the template are passed to the LINK subroutine.

As described below, after the link record has been added, a new link pointer is returned from LINK.

At block 74, the current link pointer for the grammar model node is set to the link pointer returned from LINK.

Following block 74, flow proceeds to block 40 of Fig. 5a to check whether all nodes of the grammar tree have been processed for the current input frame.

Adding a Link to the Tree

Referring now to Fig. 6, this subroutine, as described above, adds a link record to the tree diagram as defined by the link arrays. The parameters passed to the subroutine are the word number and the link pointer corresponding to the tree node from which the link is added.

At block 78, a test is performed to determine whether a free link record exists. This is done by comparing FREE with 0. If FREE equals 0, no free link records remain. As explained above, the records in the link arrays are chained together by the L-PTR array and consist of free link records and established link records. The free link records allow further link records to be added to the tree diagram. Thus, if no free link record exists, all link records are in use, and flow proceeds to block 80, where an error is reported and the system is reset. Note that the step at block 80 serves only to protect against abnormal conditions that might overflow the link table. Under normal conditions, the invention uses a link table of adequate length so that the free link records are never exhausted.

If one or more free link records exist, flow proceeds to block 82, where the next available link record is taken from the free list and inserted at the top, i.e. the beginning, of the established list by updating the HEAD and FREE pointers. FREE is set to point to the index of the next free record, and HEAD is set to point to the link record just added. The L-PTR entry of the new HEAD link record is set to point to the previous HEAD link record, which chains the new record into the established list.

At block 86, since HEAD points to the link record just added to the established list, the word number passed to this subroutine is recorded in the L-WORD array for the new record. Likewise, the link pointer passed to this subroutine is recorded in the L-BACK array for the link record.

At block 88, a test is performed to determine whether traceback is required for the current input frame. If so, flow proceeds to block 90, where the newly added link record is marked active. This is done by setting the record's L-ACT entry equal to 1. If traceback is not required for the current input frame, the subroutine ends and flow returns to block 74 of Fig. 5c.
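The steps of the LINK subroutine (blocks 78 through 90) can be gathered into one short sketch. The Python below is illustrative rather than the disclosed code: the names `new_table`, `link`, and `LinkTableFull` are hypothetical, the table is a dictionary of lists with index 0 reserved as the end-of-list marker, and raising an exception stands in for the error report and system reset of block 80.

```python
class LinkTableFull(Exception):
    """Raised when no free link record is left (block 80)."""


def new_table(n):
    """Reset state for an n-record table; record 1 is the established list."""
    names = ("L_PTR", "L_BACK", "L_WORD", "L_ACT", "L_FWRD")
    table = {name: [0] * (n + 1) for name in names}
    for rec in range(2, n):
        table["L_PTR"][rec] = rec + 1           # chain the free records
    table["HEAD"] = 1
    table["FREE"] = 2 if n >= 2 else 0
    return table


def link(table, back_ptr, word, traceback_frame):
    """Add a link record for `word` behind `back_ptr`; return its pointer."""
    rec = table["FREE"]
    if rec == 0:                                # block 78: any free record?
        raise LinkTableFull("no free link record")
    table["FREE"] = table["L_PTR"][rec]         # block 82: pop the free list
    table["L_PTR"][rec] = table["HEAD"]         # chain to the previous HEAD
    table["HEAD"] = rec                         # new record is now newest
    table["L_WORD"][rec] = word                 # block 86: record the word
    table["L_BACK"][rec] = back_ptr             # and the previous link
    if traceback_frame:                         # block 88
        table["L_ACT"][rec] = 1                 # block 90: mark active
    return rec
```

The returned pointer plays the role of the new link pointer that block 74 stores in the node table for the grammar node.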

Tracing Back Through the Tree

Referring now to Figs. 7a through 7d, the traceback subroutine, block 44 of Fig. 5a, is shown in detail. The traceback subroutine searches through the tree diagram for words identified as possible matches and determines whether there is any ambiguity as to the uniqueness of the match. Words that are uniquely identified are output from the recognition system as recognized words. In addition, the traceback subroutine returns all dead link records, i.e. records no longer considered as possible matches, to the free list so that the memory can be used for future link records.

Before traceback is entered, L-ACT has been set, i.e. flagged, for all active link records as described above. At the start of traceback, the active link records represent the ends of all paths through the tree that are still under consideration. The basic idea of traceback is to "trace back" through the tree from the ends of all active paths (initially marked by the L-ACT array) to find where all of the active paths converge. The portion of the tree common to all active paths represents an unambiguous partial path and can be output as recognized. During traceback, the L-FWRD array is used to chain partial paths in the forward direction (toward the end of the tree). As these partial paths are formed, the base node of each partial path is marked active via the L-ACT array. If an attempt is made to extend a partial path (using the L-BACK information) from the current node to a previous node that is already marked active, more than one possible path emanates from that previous node, and the partial paths from both nodes are deleted (the forward pointer chains (L-FWRD) are reset to 0). All nodes marked active are processed in this manner. Nodes are processed in the reverse of the order in time in which the link records were added to the link table; this order is inherent in the structure of the established list. The last node processed is the root node of the tree. At that point, the forward chain (partial path) leaving that node represents the unambiguous partial path, and the corresponding recognized words are output. The traceback procedure also "cleans up" after itself, in that the L-FWRD and L-ACT arrays have been reset to 0 by the time it completes. Furthermore, all link records that are not on an active path are returned to the free list, along with the link records on the unambiguous partial path that has already been output.
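The forward-chaining idea described above can be sketched in Python as follows. This is a simplified illustration under stated assumptions, not the subroutine of Figs. 7a through 7d: the function name `traceback` and the list layout are hypothetical, the root record is assumed to be the one whose L-BACK entry is 0, every L-BACK pointer is assumed to refer to a record added earlier, and the return of dead or already-output records to the free list is omitted.

```python
def traceback(head, l_ptr, l_back, l_word, l_act, l_fwrd):
    """Return the word symbols on the partial path common to all active paths.

    Records are visited newest first by following L-PTR from `head`.
    L-ACT and L-FWRD are left all zero on return.
    """
    def unchain(rec):
        # Reset the forward pointer chain that starts at `rec`.
        while l_fwrd[rec] != 0:
            nxt = l_fwrd[rec]
            l_fwrd[rec] = 0
            rec = nxt

    words = []
    rec = head
    while rec != 0:
        if l_act[rec]:
            prev = l_back[rec]
            if prev == 0:                   # root: output the surviving chain
                nxt = l_fwrd[rec]
                while nxt != 0:
                    words.append(l_word[nxt])
                    nxt = l_fwrd[nxt]
                unchain(rec)
            elif l_act[prev]:               # ambiguity: two paths leave `prev`
                unchain(prev)
                unchain(rec)
            else:                           # extend the partial path backward
                l_act[prev] = 1
                l_fwrd[prev] = rec
            l_act[rec] = 0                  # this record has been processed
        rec = l_ptr[rec]
    return words
```

As a usage note, consider a root record 1, a record 2 (word symbol 4) behind it, records 3 and 4 both extending record 2, and record 5 extending record 3, with records 4 and 5 flagged active. The two active paths diverge at record 2, so only the word of record 2 lies on the common partial path and is returned.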

Before the traceback subroutine is described in detail, it may be helpful to work through an illustrated example. Referring to Fig. 8, diagram A shows the tree before traceback, with the active link records, i.e. the links flagged as active word links, marked by a heavy dot to the right of the link. The first step of traceback is to identify the active link record most recently added in time, in this case the node labeled 25.

A test is performed to determine whether the node immediately preceding this node in the tree is active. If it is, there is an ambiguity as to which path leads back to node 21, in this example whether through node 25 or through node 21. When an ambiguity occurs, the forward pointers of the ambiguous nodes have their chained forward pointers, if any, removed. This is done by inserting 0 into the L-FWRD array for each successive link record.

In this example, none of the nodes has a forward pointer, i.e. L-FWRD = 0.

The next most recently added active link record is then identified (node 24). The preceding link of node 24 is also node 21, and node 24 is processed in the same way as node 25 above.

The next most recently added active node is node 23. Because this node has no active preceding node, the traceback process marks the preceding node active and records the forward pointer (L-FWRD) of the preceding node, node 19, as equal to the node currently being processed, node 23. Diagram B shows diagram A after node 23 has been processed, with the forward pointer added at node 19 drawn as a heavy line. Once a node has been processed, its L-ACT entry indicating that it is an active node is cleared. Tree diagram B therefore no longer depicts nodes 23, 24, and 25 as active.

Node 22 is the next most recently added entry that is an active node. It has no active preceding node. Therefore, as was done for node 23, the forward pointer of node 18 is set equal to node 22 and node 18 is marked active. Diagram C shows the tree after node 22 has been processed.

Node 21 is handled in the same way as node 22 because its predecessor is not active, as shown in Diagram D.

Node 20 is the next active node to be processed. The node preceding node 20 is active, which indicates an ambiguity. When an ambiguity occurs, the forward pointer chains of the ambiguous nodes, in this case nodes 18 and 20, are undone.

In this example only node 18 has a forward pointer. As the heavy line after node 18 shows, the forward pointer of node 18 was set equal to node 22. In Diagram E, the heavy line is therefore removed by setting the forward pointer of node 18 equal to 0.

Node 19 is the next most recently added entry holding an active node. Because its predecessor is active, the ambiguity requires the forward pointer chains of both nodes 16 and 19 to be removed, as shown in Diagram F.

Node 18 is processed next. Its predecessor, node 16, is active, but since neither node 16 nor node 18 has a forward pointer, no action is taken other than removing the heavy dot that marks the node as active.

The next active node is node 16, whose predecessor is not active. In this case the traceback process marks the predecessor as active and records, for the predecessor (node 13), a forward pointer equal to the node currently being processed (node 16). Diagram G shows the tree of Diagram A after node 16 has been processed.

Node 13 is handled in the same way as node 16. In Diagram H the tree is therefore shown with only node 11 marked active, and only the forward pointers for nodes 11 and 13 remain.

Once the traceback process reaches the root node of the tree (node 11), which has no other node preceding it, the words chained through the forward pointers are output as recognized words. This is done by starting at the root node and successively following, in L-FWRD, the link records that have chained forward pointers. As the figure shows, the link records whose L-WORD array entries hold "eight" and "five" are output. In addition, in Diagram H, the link record representing the link between nodes 11 and 16 is removed from the established list and linked onto the free list, as indicated by the L-PTR array and the FREE pointer. The new root node of the tree at this point is node 16. The remaining tree is shown in Diagram I and is used when another input frame is processed, that is, when flow returns to block 46 of Fig. 5a of the recognition flowchart.
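As a rough illustration of the walkthrough above, the following Python sketch applies the same rules to a small hypothetical tree (not the tree of the figure). The array names L_PTR, L_BACK, L_ACT, L_FWRD, and L_WORD follow the description; the function name, the use of index 0 as a null pointer, and the sample data are assumptions made only for this example.

```python
def traceback(l_ptr, l_back, l_act, l_word, head):
    """Return (recognized words, new root) for the established list at head.

    Index 0 is a null pointer (an assumption of this sketch). l_ptr chains the
    records from the most recent (head) to the oldest (the root, l_ptr == 0).
    """
    l_fwrd = [0] * len(l_back)      # forward pointers, all initially empty
    act = list(l_act)               # work on a copy of the active flags

    def unchain(rec):
        # Remove the forward pointer of rec and of every record chained after it.
        while l_fwrd[rec]:
            nxt = l_fwrd[rec]
            l_fwrd[rec] = 0
            rec = nxt

    rec = head
    while l_ptr[rec] != 0:          # stop at the root, the last record in the list
        if act[rec]:
            act[rec] = 0            # this record has now been considered
            pred = l_back[rec]
            if act[pred]:
                # Ambiguity: more than one live path passes through pred.
                unchain(pred)
                unchain(rec)
            else:
                act[pred] = 1
                l_fwrd[pred] = rec  # pred currently has a single live descendant
        rec = l_ptr[rec]

    # Follow the surviving forward chain from the root and collect the words.
    words, node = [], rec
    while l_fwrd[node]:
        node = l_fwrd[node]
        words.append(l_word[node])
    return words, node


# Hypothetical tree: 1 <- 2 <- 3, with 3 branching to 4 <- 6 and to 5.
L_PTR = [0, 0, 1, 2, 3, 4, 5]
L_BACK = [0, 0, 1, 2, 3, 3, 4]
L_ACT = [0, 0, 0, 0, 0, 1, 1]     # records 5 and 6 are the live hypotheses
L_WORD = ["", "", "eight", "five", "two", "three", "nine"]
print(traceback(L_PTR, L_BACK, L_ACT, L_WORD, head=6))
```

With this sample data the two live paths diverge at record 3, so only the words on the shared prefix ("eight", "five") are returned and record 3 becomes the new root.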

The traceback flowchart will now be described in detail with reference to Figs. 7a to 7d. In Fig. 7a, the link table is searched for the most recently added link record that is active. At block 94, a test determines whether the first record of the link table is the only record in the established record list. This is done by looking up the L-PTR array entry at the index indicated by HEAD. As noted earlier, HEAD holds the index of the most recently added link record. If the L-PTR entry corresponding to HEAD equals 0, the chain ends there and the table contains no other records. In this case, flow proceeds to block 96, where the corresponding L-ACT entry is set to inactive.

From block 96, the subroutine returns to block 46 of Fig. 5a of the recognition flowchart.

If another link record exists in the table, a test at block 98 determines whether the first link record is active. A link record is active if its corresponding L-ACT entry is not equal to 0.

If the link record is active, flow proceeds to block 100. At block 100, the link record is set to inactive to indicate that it has already been taken into account. Flow then proceeds to block 110, where the record indicated by HEAD is stored in the temporary pointer PTR. From block 110, flow proceeds to block 120, which is described below.

If the first link record is inactive, flow proceeds from block 98 to block 112. At block 112, the L-PTR chain is traversed until an active link record is found, and the index of that active record is stored in PTR. At block 114, the active indicator is cleared, as at block 100, to indicate that the link record has been processed.

At block 116, the link records found to be inactive, as indicated between the pointers HEAD and PTR, are returned to the free list for future use.

At block 118, a test determines whether there are further links in the link table. This test is the same as the one performed at block 94 above.

If no link records remain in the established list, the subroutine returns to the recognition flowchart at block 46 of Fig. 5a.
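The search of blocks 112 to 116 can be sketched in Python as follows. This is one possible reading, not the patented implementation: it assumes that L_PTR chains both the established list and the free list, that index 0 is a null pointer, and that each inactive record passed on the way is pushed onto the front of the free list. The function name and the sample table are hypothetical.

```python
def skip_inactive(l_ptr, l_act, head, free):
    """Walk from head to the first active record, freeing inactive ones.

    Returns (ptr, free): ptr is the first active record, or the last record
    of the established list (l_ptr == 0) if none is active.
    """
    ptr = head
    while l_ptr[ptr] != 0 and l_act[ptr] == 0:
        nxt = l_ptr[ptr]
        l_ptr[ptr] = free   # link the inactive record onto the free list
        free = ptr
        ptr = nxt
    l_act[ptr] = 0          # mark the record as already considered
    return ptr, free


# Hypothetical table: established list 4 -> 3 -> 2 -> 1, free list 5.
l_ptr = [0, 0, 1, 2, 3, 0]
l_act = [0, 0, 1, 0, 0, 0]  # only record 2 is active
ptr, free = skip_inactive(l_ptr, l_act, head=4, free=5)
```

After the call, records 4 and 3 have moved to the free list (3 -> 4 -> 5) and ptr identifies record 2 as the current link record.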

At block 120, a test determines whether the link (node) preceding the current link is inactive.

This is done by looking up the back pointer of the current link record and then its corresponding L-ACT entry. If the predecessor is active, flow proceeds to block 124 of Fig. 7b to handle the ambiguity problem described earlier. If the predecessor is inactive, flow proceeds to block 142 of Fig. 7c.

Referring now to Fig. 7b, the steps here handle the ambiguity that arises when the predecessor indicates that more than one link record may originate from a node still under consideration for a match. This condition occurs when the node preceding an active node is also active. Three temporary pointers (TMP1, TMP2, and PTR) are used in Fig. 7b to manipulate the link record data. The steps of block 121 remove the forward pointer chain from the previous link; they comprise blocks 124, 126, 128, and 130. Fig. 7b is entered with PTR pointing to the node, or link record, currently being processed.

The L-BACK entry corresponding to a link record points, as described above, to the node immediately preceding the node currently being processed. The L-FWRD entry associated with a link record points only to a possible descendant link record.

At block 124, a pointer to the node immediately preceding the current active node is stored in TMP1. At block 126, the descendant link record indicated by the L-FWRD entry of the predecessor is stored in TMP2.

At block 128, a test determines whether the node indicated by TMP2 has an actual L-FWRD entry or whether that entry is set to 0. If the node indicated by TMP2 has a forward pointer (L-FWRD not equal to 0), flow proceeds to block 130, where the forward pointer for that node is removed. At block 130, the contents of TMP2 are also moved into TMP1, which temporarily shifts the current-node reference to the node indicated by TMP2. The steps above are then repeated for subsequent nodes, starting at block 126, until block 128 finds a node in the forward chain that has no forward pointer, which marks the end of the forward chain.

The steps of block 122 remove the forward pointer chain from the current link, which is indicated by PTR. At block 132, the current link record pointer PTR is stored in TMP1. At block 134, the L-FWRD entry for that link record is stored in TMP2. At block 136, a test determines whether a forward pointer exists for this link record, as at block 128 above. If one exists, flow proceeds to block 138, where the forward pointer is removed and the descendant node is stored in TMP1 so that its forward pointer can be removed as well. Starting at block 134, the steps above are repeated until all forward pointers chained from the current node have been deleted. From block 136, flow then proceeds to block 144 of Fig. 7c to process the next active link record.
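The two chain-removal loops of Fig. 7b can be sketched in Python as below. The variable names tmp1 and tmp2 mirror TMP1 and TMP2; the function names, the null index 0, and the sample arrays are assumptions for illustration. The loop test is written as "TMP2 is not 0", which is how this sketch reads blocks 128 and 136.

```python
def unchain_forward(l_fwrd, start):
    """Zero the forward pointer of start and of every record chained after it."""
    tmp1 = start
    tmp2 = l_fwrd[tmp1]         # blocks 126 / 134
    while tmp2 != 0:            # blocks 128 / 136
        l_fwrd[tmp1] = 0        # blocks 130 / 138: remove the pointer
        tmp1 = tmp2             # move on to the descendant
        tmp2 = l_fwrd[tmp1]


def resolve_ambiguity(l_back, l_fwrd, ptr):
    """Remove the chains from the predecessor (block 121) and from ptr (block 122)."""
    unchain_forward(l_fwrd, l_back[ptr])
    unchain_forward(l_fwrd, ptr)


# Hypothetical data: chain 1 -> 2 -> 4 from the predecessor, chain 3 -> 5 from ptr.
l_back = [0, 0, 1, 1, 2, 3]
l_fwrd = [0, 2, 4, 5, 0, 0]
resolve_ambiguity(l_back, l_fwrd, ptr=3)
```

After the call every entry of l_fwrd is 0, so neither branch below record 1 can be reported as recognized until later frames resolve the ambiguity.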

Referring back to block 120 of Fig. 7a, if the node preceding the current node is inactive, flow proceeds to block 142 of Fig. 7c, which is described next.

At block 142 of Fig. 7c, the predecessor node (link record) is known to be inactive, so that node is set active and its forward pointer is set to point to the current link record. At block 144, the table is searched, starting from the current link record, until the next active link record is found, and PTR is set to point to that record, which becomes the new current link record. At block 146, all inactive records encountered during the step of block 144 are returned to the free list by modifying the appropriate entries in the L-PTR array.

At block 148, the new node is set to inactive to indicate that it has already been taken into account, as at blocks 100 and 114 of Fig. 7a.

At block 152, a test determines whether this new link record is the last in the chain. If it is, all link records have been processed and flow proceeds to block 156 of Fig. 7d to output the words recognized during the traceback process.

If this new link record is not the last in the chain, flow proceeds to block 120 of Fig. 7a for further processing.

Referring now to Fig. 7d, at block 156 the index of the current link record, which is the root node of the tree, is stored in TMP1. At block 158, the node (link record) indicated by the forward pointer of the current node is stored in TMP2.

For example, referring to Diagram H of Fig. 8, TMP1 contains 11 (node 11) and TMP2 contains 13 (node 13).

At block 160, a test determines whether a forward pointer descends from the node stored in TMP1. This is done by comparing the contents of TMP2 with 0. If one exists, flow proceeds to block 162, where the forward pointer for the current node is removed and the current node is advanced to the next node in the forward chain, as indicated by the current node's forward pointer stored in TMP2. At block 164, the word associated with the current link record is output as a recognized word. Starting at block 158, the steps above are repeated until each link record in the forward pointer chain has had its associated word output as a recognized word. When the step at block 160 finds a descendant link record that has no forward pointer, flow proceeds to block 168, where all dead link records, as indicated between TMP1 and PTR, are returned to the free list. Also at block 168, the L-PTR array is updated by setting to 0 the L-PTR entry for the new base node currently indicated by PTR. A 0 in this record marks the root of the tree and the end of the established list of link records. At block 168, traceback is complete and flow proceeds to block 46 of Fig. 5a.
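The output phase of Fig. 7d might look like the following Python sketch. It assumes, as an interpretation of block 168, that the "dead" records are those that follow the new root in the L-PTR chain (that is, the older records that have just been output), and that index 0 is a null pointer. The function name and the sample arrays are hypothetical.

```python
def output_words(l_ptr, l_fwrd, l_word, root, free):
    """Emit the words on the forward chain, then free the records behind the new root.

    Returns (words, new_root, free).
    """
    words = []
    tmp1 = root                     # block 156
    tmp2 = l_fwrd[tmp1]             # block 158
    while tmp2 != 0:                # block 160
        l_fwrd[tmp1] = 0            # block 162: remove the pointer and advance
        tmp1 = tmp2
        words.append(l_word[tmp1])  # block 164: output the associated word
        tmp2 = l_fwrd[tmp1]

    # Block 168: records older than the new root go back to the free list.
    rec = l_ptr[tmp1]
    while rec != 0:
        nxt = l_ptr[rec]
        l_ptr[rec] = free
        free = rec
        rec = nxt
    l_ptr[tmp1] = 0                 # the new root ends the established list
    return words, tmp1, free


# Hypothetical table: established list 4 -> 3 -> 2 -> 1, forward chain 1 -> 2 -> 3.
l_ptr = [0, 0, 1, 2, 3]
l_fwrd = [0, 2, 3, 0, 0]
l_word = ["", "", "eight", "five", "two"]
words, new_root, free = output_words(l_ptr, l_fwrd, l_word, root=1, free=0)
```

Here the words "eight" and "five" are output, record 3 becomes the new root, and records 2 and 1 are linked onto the free list, which keeps the number of stored link records small between tracebacks.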

The present invention thus provides a new and improved system and method for continuous speech recognition. As the efficiently structured flowcharts above illustrate, the invention can readily be implemented to perform real-time recognition on a simple, inexpensive 8-bit processor. The invention further provides a memory management scheme under which only a minimal number of link records need be stored while input frames are processed.

Although the invention has been illustrated and described with reference to particularly preferred embodiments, those skilled in the art will understand that various modifications and changes may be made to the invention described above without departing from its spirit and scope.

International search report

Claims


1. In a speech recognition system in which a grammar model composed of nodes connected by arcs can be stored in memory, each arc having at least one associated template stored in memory and respective originating and terminating nodes, and in which groups of input frames can be compared with the templates to generate similarity measure parameters, an apparatus for recognizing groups of input frames, characterized by comprising: means for structuring the grammar model such that a first predetermined set of arcs, consisting of one or more arcs and their respective nodes, forms a loop within the grammar model; means for providing first and second parameter storage for one or more nodes, including the terminating node for a predetermined arc; means for determining at least one first set of similarity measure parameters for the originating node of the predetermined arc; means for determining at least one second set of similarity measure parameters for the predetermined arc, using one or more templates associated with the predetermined arc and the first set of similarity measure parameters for the originating node; means for determining a third set of similarity measure parameters at the terminating node for the predetermined arc, using the at least one second set of similarity measure parameters determined for the predetermined arc; means for storing the third set of similarity measure parameters determined at the terminating node in the first parameter storage for the terminating node; means for selectively transferring the contents of the first parameter storage to the second parameter storage for the terminating node; and means for recognizing the groups of input frames using the contents of the parameter storage associated with the terminating node.

2. The apparatus for recognizing groups of input frames according to claim 1, wherein each set of similarity measure parameters includes a cumulative distance measure and link information.

3. The apparatus for recognizing groups of input frames according to claim 1, wherein the terminating node is the terminating node for one of the arcs forming the loop.

4. The apparatus for recognizing groups of input frames according to claim 1, further comprising one or more invalid arcs within the grammar model.

5. The apparatus for recognizing groups of input frames according to claim 4, further comprising means for determining a set of similarity measure parameters at the originating node for an invalid arc, and means for selectively storing the set of similarity measure parameters determined at the originating node for the invalid arc in the first parameter storage for the terminating node of the invalid arc.

6. The apparatus for recognizing groups of input frames according to claim 4, further comprising means for processing the nodes in the grammar model such that the originating node for an invalid arc is processed before the terminating node for that invalid arc.

7. In a speech recognition system in which a grammar model composed of nodes connected by arcs can be stored in memory, each arc having at least one associated template stored in memory and respective originating and terminating nodes, and in which groups of input frames can be compared with the templates to generate similarity measure parameters, a method for recognizing groups of input frames, characterized by comprising the steps of: structuring the grammar model such that a first predetermined set of arcs, consisting of one or more arcs and their respective nodes, forms a loop within the grammar model; providing first and second parameter storage for one or more nodes, including the terminating node for a predetermined arc; determining at least one first set of similarity measure parameters for the originating node of the predetermined arc; determining at least one second set of similarity measure parameters for the predetermined arc, using one or more templates associated with the predetermined arc and the first set of similarity measure parameters for the originating node; determining a third set of similarity measure parameters at the terminating node for the predetermined arc, using the at least one second set of similarity measure parameters determined for the predetermined arc; storing the third set of similarity measure parameters determined at the terminating node in the first parameter storage for the terminating node; selectively transferring the contents of the first parameter storage to the second parameter storage for the terminating node; and recognizing the groups of input frames using the contents of the parameter storage associated with the terminating node.

8. The method for recognizing groups of input frames according to claim 7, wherein each set of similarity measure parameters includes a cumulative distance measure and link information.

9. The method for recognizing groups of input frames according to claim 7, wherein the terminating node is the terminating node for one of the arcs forming the loop.

10. The method for recognizing groups of input frames according to claim 7, further comprising the step of associating two or more nodes with at least one invalid arc in the grammar model.

11. The method for recognizing groups of input frames according to claim 10, further comprising the steps of determining a set of similarity measure parameters at the originating node for an invalid arc, and selectively storing the set of similarity measure parameters determined at the originating node for the invalid arc in the first parameter storage for the terminating node of the invalid arc.

12. The method for recognizing groups of input frames according to claim 10, further comprising the step of processing the nodes in the grammar model such that the originating node for an invalid arc is processed before the terminating node for that invalid arc.

13. In a speech recognition system in which input frames are processed by matching against prestored templates representing speech, in which templates under consideration as potentially recognized templates are individually recorded as link records in a linked network, and in which the link records generally have ancestor and descendant link records, an apparatus for recognizing speech patterns, characterized by comprising: means for providing temporary pointers for the link records; means for tracing back through the network while labeling predetermined ones of the link records with the temporary pointers, so as to link the potentially recognized templates corresponding to those link records; means for determining a link record that may have two or more potentially recognized descendant link records; means for deleting the temporary pointer corresponding to the determined link record; and means for outputting data corresponding to the link records still labeled with a temporary pointer.

14. The apparatus for recognizing speech patterns according to claim 13, comprising means for periodically alerting the system after a predetermined number of input frames have been processed.

15. The apparatus for recognizing speech patterns according to claim 13, comprising means for structuring the network according to a predetermined grammar model topology.

16. The apparatus for recognizing speech patterns according to claim 13, comprising means for attaching a link record to the network after a possible word end has been recognized for at least one of the templates.

17. The apparatus for recognizing speech patterns according to claim 13, comprising means for indicating the most recently added link record in the network.

18. The apparatus for recognizing speech patterns according to claim 13, comprising means for indicating that a link record has no ancestor link record.

19. The apparatus for recognizing speech patterns according to claim 13, further comprising means for storing the link records in a table consisting of free link records and established link records.

20. The apparatus for recognizing speech patterns according to claim 19, comprising means for indicating the beginning of the established link records and the beginning of the free link records.

21. The apparatus for recognizing speech patterns according to claim 19, comprising means for indicating the end of the established link records and the end of the free link records.

22. The apparatus for recognizing speech patterns according to claim 13, further comprising means for deleting the temporary pointers for link records that are descendants of the determined link record.

23. In a speech recognition system in which input frames are processed by matching against prestored templates representing speech, in which templates under consideration as potentially recognized templates are individually recorded as link records in a linked network, and in which the link records generally have ancestor and descendant link records, an apparatus for recognizing speech patterns, characterized by comprising: means for storing each of the link records as an indexed data set comprising a symbol representing a template, a sequence indicator representing the relative time at which the link record was stored, a first pointer indicating the link record in the network from which it descends, and a second, temporary pointer; means for tracing back through the network using the indexed data sets, including the second, temporary pointer, to identify at least one unambiguously recognized link record; and means for outputting the unambiguously recognized link record.

24. The apparatus for recognizing speech patterns according to claim 23, comprising means for periodically alerting the system after a predetermined number of input frames have been processed.

25. The apparatus for recognizing speech patterns according to claim 23, comprising means for structuring the network according to a predetermined grammar model topology.

26. The apparatus for recognizing speech patterns according to claim 23, comprising means for attaching a link record to the network after a possible word end has been recognized for at least one of the templates.

27. The apparatus for recognizing speech patterns according to claim 23, comprising means for indicating the most recently added link record in the network.

28. The apparatus for recognizing speech patterns according to claim 23, comprising means for indicating that a link record has no ancestor link record.

29. The apparatus for recognizing speech patterns according to claim 23, further comprising means for storing the link records in a table consisting of free link records and established link records.

30. The apparatus for recognizing speech patterns according to claim 29, comprising means for indicating the beginning of the established link records and the beginning of the free link records.

31. The apparatus for recognizing speech patterns according to claim 29, comprising means for indicating the end of the established link records and the end of the free link records.

32. The apparatus for recognizing speech patterns according to claim 23, further comprising means for deleting the temporary pointers for link records that are descendants of the determined link record.

33. In a speech recognition system in which input frames are processed by matching against prestored templates representing speech, in which templates under consideration as potentially recognized templates are individually recorded as link records in a linked network, and in which the link records generally have ancestor and descendant link records, an apparatus for recognizing speech patterns, characterized by comprising: means for storing each of the link records in a table as an indexed data set comprising a symbol representing a template, a sequence indicator representing the relative time at which the link record was stored, and a pointer indicating the link record in the network from which it descends, the table consisting of a free record space and an established record space, and the link records being stored in the established record space; means for tracing back through the network using the indexed data sets to identify one or more link records whose corresponding templates are unambiguously recognized; and means for outputting data representing the unambiguously recognized link records and removing those link records from the established record space, so that free record space continues to be available for storing link records.

34. The apparatus for recognizing speech patterns according to claim 33, comprising means for periodically alerting the system after a predetermined number of input frames have been processed.

35. The apparatus for recognizing speech patterns according to claim 33, comprising means for structuring the network according to a predetermined grammar model topology.

36. The apparatus for recognizing speech patterns according to claim 33, comprising means for attaching a link record to the network after a possible word end has been recognized for at least one of the templates.

37. The apparatus for recognizing speech patterns according to claim 33, comprising means for indicating the most recently added link record in the network.

38. The apparatus for recognizing speech patterns according to claim 33, comprising means for indicating that a link record has no ancestor link record.

39. The apparatus for recognizing speech patterns according to claim 33, comprising means for determining a link record that may have two or more potentially recognized descendant link records.

40. The apparatus for recognizing speech patterns according to claim 33, further comprising means for selecting, in the established record space, particular link records whose corresponding templates are not unambiguously recognized, and returning them to the free record space.

41. The input frame is matched to a pre-stored template representing audio. Templates under consideration as templates that may be processed and recognized as are recorded separately as link records in the link network, and The link record generally comprises ancestor and descendant link records. In speech recognition systems, providing a temporary pointer to the link record; labeling a predetermined one of the link records with the temporary pointer; While tracing back through the network, the one link record connecting a potentially recognized template corresponding to the code; of the possibility of having more than one descendant link record that could be recognized. determining a link record; delete the temporary pointer corresponding to the determined link record; and the link still labeled with the temporary pointer. ・A step of outputting data corresponding to the record; A method for recognizing speech patterns, comprising:

42. The method of recognizing speech patterns according to claim 41, comprising the step of periodically alerting the system after a predetermined number of input frames have been processed.

43. The method of recognizing speech patterns according to claim 41, comprising the step of constructing the network according to a predetermined grammar model topology.

44. The method of recognizing speech patterns according to claim 41, comprising the step of attaching a link record to the network after a possible word ending for at least one of said templates has been recognized.

45. The method of recognizing speech patterns according to claim 41, comprising the step of indicating the link record most recently attached to the network.

46. The method of recognizing speech patterns according to claim 41, comprising the step of indicating that a link record has no descendant link record.

47. The method of recognizing speech patterns according to claim 41, further comprising the step of storing said link records in a table consisting of free link records and set link records.

48. The method of recognizing speech patterns according to claim 47, comprising the steps of indicating the beginning of the set link records and the beginning of the free link records.

49. The method of recognizing speech patterns according to claim 47, comprising the steps of indicating the end of the set link records and the end of the free link records.

50. The method of recognizing speech patterns according to claim 41, further comprising the step of deleting the temporary pointer for link records that are descendants of the determined link record.

51. In a speech recognition system in which input frames are processed so as to be matched against pre-stored templates representing speech, and in which templates under consideration as potentially recognized templates are recorded individually as link records in a linked network, each link record generally having ancestor and descendant link records, a method for recognizing speech patterns, comprising the steps of: storing each of the link records as an indexed data set including a symbol representing a template, a sequence indicator representing the relative time at which the link record was stored, a first pointer to the link record in the network from which that link record is derived, and a second, temporary pointer; tracing back through the network using the second temporary pointer to identify one link record; and outputting the identified link record.

52. The method of recognizing speech patterns according to claim 51, comprising the step of periodically alerting the system after a predetermined number of input frames have been processed.

53. The method of recognizing speech patterns according to claim 51, comprising the step of constructing the network according to a predetermined grammar model topology.

54. The method of recognizing speech patterns according to claim 51, comprising the step of attaching a link record to the network after a possible word ending for at least one of said templates has been recognized.

55. The method of recognizing speech patterns according to claim 51, comprising the step of indicating the link record most recently attached to the network.

56. The method of recognizing speech patterns according to claim 51, comprising the step of indicating that a link record has no ancestor link record.

57. The method of recognizing speech patterns according to claim 51, further comprising the step of storing said link records in a table consisting of free link records and set link records.

58. The method of recognizing speech patterns according to claim 57, comprising the steps of indicating the beginning of the set link records and the beginning of the free link records.

59. The method of recognizing speech patterns according to claim 57, comprising the steps of indicating the end of the set link records and the end of the free link records.

60. The method of recognizing speech patterns according to claim 51, comprising the step of deleting the temporary pointer for link records that are descendants of the determined link record.

61. In a speech recognition system in which input frames are processed so as to be matched against pre-stored templates representing speech, and in which templates under consideration as potentially recognized templates are recorded individually as link records in a linked network, each link record generally having ancestor and descendant link records, a method for recognizing speech patterns, comprising the steps of: storing each of the link records, as an indexed data set including a symbol representing a template, a sequence indicator representing the relative time at which the link record was stored, and a pointer to the link record in the network from which that link record was derived, in the configuration record space of a table consisting of a free record space and a configuration record space; tracing back through the network using the indexed data sets to identify one or more link records whose corresponding templates are definitely recognized; outputting data representing said definitely recognized link records; and removing those link records from said configuration record space, thereby making the removed records available as free record space for link records stored subsequently.

62. The method of recognizing speech patterns according to claim 61, comprising the step of periodically alerting the system after a predetermined number of input frames have been processed.

63. The method of recognizing speech patterns according to claim 61, comprising the step of constructing the network according to a predetermined grammar model topology.

64. The method of recognizing speech patterns according to claim 61, comprising the step of attaching a link record to the network after a possible word ending for at least one of said templates has been recognized.

65. The method of recognizing speech patterns according to claim 61, comprising the step of indicating the link record most recently attached to the network.

66. The method of recognizing speech patterns according to claim 61, comprising the step of indicating that a link record has no ancestor link record.

67. The method of recognizing speech patterns according to claim 61, further comprising the step of determining a link record that may have more than one descendant link record that could be recognized.

68. The method of recognizing speech patterns according to claim 61, comprising the step of selecting, in said configuration record space, a specific link record whose corresponding template has not been output, and returning that specific link record to the free record space.
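
The link-record bookkeeping recited in claims 41 to 68 can be illustrated with a minimal Python sketch. It is not the patented implementation. The names `LinkRecord`, `LinkTable`, `attach`, `release`, `traceback` and the sentinel `NO_ANCESTOR` are hypothetical, a boolean flag stands in for the temporary pointer, and a plain list of free indices stands in for the free record space. Following the wording of claim 41 literally, the record at which active paths branch is unmarked together with its descendants, so only records older than the branch are output.

```python
from dataclasses import dataclass
from typing import List

NO_ANCESTOR = -1  # hypothetical sentinel: "no ancestor link record"


@dataclass
class LinkRecord:
    symbol: str = ""             # symbol representing a template
    sequence: int = 0            # relative time at which the record was stored
    ancestor: int = NO_ANCESTOR  # index of the record this one is derived from
    marked: bool = False         # assumption: stands in for the temporary pointer


class LinkTable:
    """Fixed-size table split into free indices and in-use (set) indices."""

    def __init__(self, size: int) -> None:
        self.records = [LinkRecord() for _ in range(size)]
        self.free = list(range(size - 1, -1, -1))  # free record space
        self.in_use = set()                        # set record space
        self.clock = 0
        self.latest = NO_ANCESTOR                  # most recently attached record

    def attach(self, symbol: str, ancestor: int = NO_ANCESTOR) -> int:
        idx = self.free.pop()  # raises IndexError if the table is full
        self.clock += 1
        self.records[idx] = LinkRecord(symbol, self.clock, ancestor)
        self.in_use.add(idx)
        self.latest = idx
        return idx

    def release(self, idx: int) -> None:
        """Return a record to the free record space."""
        self.records[idx] = LinkRecord()
        self.in_use.discard(idx)
        self.free.append(idx)

    def path(self, idx: int) -> List[int]:
        """Indices from idx back to a record that has no ancestor."""
        out = []
        while idx != NO_ANCESTOR:
            out.append(idx)
            idx = self.records[idx].ancestor
        return out

    def traceback(self, heads: List[int]) -> List[str]:
        """Output symbols older than the earliest branch, then free them."""
        if not heads:
            return []
        oldest_first = self.path(heads[0])[::-1]
        position = {idx: p for p, idx in enumerate(oldest_first)}
        for idx in oldest_first:
            self.records[idx].marked = True
        keep = len(oldest_first) - 1  # the head itself always stays in the table
        for head in heads[1:]:
            # First record shared with the marked path is the branch record.
            branch = next((i for i in self.path(head) if i in position), None)
            keep = min(keep, position[branch] if branch is not None else 0)
        for idx in oldest_first[keep:]:
            self.records[idx].marked = False  # branch record and its descendants
        settled = [i for i in oldest_first if self.records[i].marked]
        symbols = [self.records[i].symbol for i in settled]
        for idx in settled:
            self.release(idx)
        for idx in self.in_use:  # survivors lose their freed ancestors
            if self.records[idx].ancestor in settled:
                self.records[idx].ancestor = NO_ANCESTOR
        return symbols
```

In this sketch `heads` lists the link records at the ends of the currently active paths; a caller would invoke `traceback` periodically, for example after a fixed number of input frames, and call `release` on records belonging to paths that have been abandoned.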