JPS587697A

JPS587697A - Voice recognizing system

Info

Publication number: JPS587697A
Application number: JP56106507A
Authority: JP
Inventors: 宏明横道
Original assignee: Tokyo Shibaura Electric Co Ltd
Current assignee: Toshiba Corp
Priority date: 1981-07-08
Filing date: 1981-07-08
Publication date: 1983-01-17

Abstract

(57) [Abstract] This publication contains application data from before electronic filing, so no abstract data is recorded.

Description

DETAILED DESCRIPTION OF THE INVENTION The present invention relates to a speech recognition system, and specifically to a technique for speech recognition, and for the response to it, that makes recognition processing more efficient.

At present, established input means for office computers include keyboards, touch-input devices, and sensor panels; established output means include serial printers; and established file storage means include floppy disks and magnetic memories.

Meanwhile, technical innovation in speech synthesis and speech recognition has brought voice input/output means to office computers as well. Voice input depends largely on recognition technology. A speech recognition device that performs word recognition normally stores dictionaries (standard patterns) of registered words in an internal dictionary area and recognizes speech by referring to those dictionaries.

FIG. 1 shows the operational flowchart of a recognition command as processed in a conventional speech recognition device. The operation is outlined below.

(1) Input the voice.

(2) Calculate the distance between the input voice and a word that is a constituent element of the dictionary.

(3) Repeat operation (2) until all words in the dictionary have been processed.

(4) When the minimum distance is less than or equal to a predetermined value, regard the word as recognized and report its word name to the host computer (CPU).
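Steps (1) to (4) of the conventional flow can be sketched as follows. This is a minimal illustration under stated assumptions, not the device's actual firmware: patterns are modeled as equal-length lists of integers, the distance is taken to be a sum of absolute differences, and the names `recognize_conventional`, `dictionary`, and `threshold` are hypothetical.

```python
def recognize_conventional(speech, dictionary, threshold):
    """Scan every registered word and return the best word name, or None.

    `dictionary` maps word name -> standard pattern (hypothetical layout).
    """
    best_word = None
    best_distance = None
    # Steps (2) and (3): compute a distance for every word in the dictionary.
    for word_name, pattern in dictionary.items():
        # Assumed metric: sum of absolute differences over the pattern.
        d = sum(abs(p - s) for p, s in zip(pattern, speech))
        if best_distance is None or d < best_distance:
            best_word, best_distance = word_name, d
    # Step (4): accept only if the minimum distance is within the threshold.
    if best_distance is not None and best_distance <= threshold:
        return best_word
    return None


words = {"start": [10, 21, 29], "stop": [90, 80, 70]}
print(recognize_conventional([10, 20, 30], words, threshold=5))
```

Note that the loop always visits every registered word, whatever category of word the host actually expects at that moment.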

In the conventional example above, however, steps (2) and (3) calculate the distance for every word present in the dictionary area. When only words belonging to a specific category (operator control commands, digits, and so on) need to be recognized, extra time is spent and the system carries a needless load in the meantime.

In addition, when a word outside the category the system expects is input, the word name is reported to the CPU and meaningless processing is then required until the CPU side determines that the word cannot be accepted. The resulting CPU overhead (increased load) and the time spent on this processing were not negligible.

The present invention has been made in view of the above circumstances. Its object is to provide a speech recognition system that reduces the load on the CPU and speeds up processing by setting the name of the dictionary to be referred to during speech recognition as a parameter together with the usual dictionary information, and by transferring the recognized word name and the name of the dictionary containing that word to the requesting device as the response to recognition.

The present invention is described in detail below with reference to the drawings.

FIG. 2 is a block diagram showing a configuration example of a computer system in which the present invention is employed. In the figure, 1 is a main memory that stores programs and various data. 2 is a central processing unit (hereinafter, CPU) that controls the entire system according to the program stored in the main memory 1. 3 is an input/output control device to which input/output devices such as a keyboard, a CRT display, and a serial printer (none of which are shown) are connected. 4 is a speech recognition device in which the present invention is realized; functionally, it consists of a speech recognition control unit 5 and a speech dictionary buffer 6. The speech dictionary buffer 6 stores the standard patterns for part of the registered word group used for speech recognition, and is assumed to have an address map that is an extension of the main memory 1.

Details of the speech recognition control unit 5 are shown in FIG. 2. 7 is a file controller. A fixed magnetic disk device (8; external file memory) is connected to the file controller 7, and the fixed magnetic disk device 8 stores in advance the speech dictionaries (standard patterns of the registered word groups for speech recognition) needed for the series of processing in this computer system. 9 is a system bus consisting of multiple lines for address, data, and control. The devices 1, 2, 3, 4, and 7 are commonly connected to the system bus 9 and together make up the computer system.

FIG. 3 is a block diagram showing the internal configuration of the speech recognition device 4 in FIG. 2. In the figure, 11 is a bus interface unit provided for connecting the speech recognition device 4 to the computer system. The bus interface unit 11 is the hardware mechanism (data path) that handles the interface with the outside: it takes the parameters accompanying a recognition command issued externally (by the CPU 2) into the speech recognition device 4, and it reports the recognition result, namely the word name and the associated dictionary name, to the CPU 2. 12 is a microprocessor, 13 is a ROM, and 14 is a RAM.

The ROM 13 stores the firmware routine that controls the device 4, and the microprocessor 12 controls the entire device based on the contents of this firmware. The processing flowchart of the firmware routine used in the present invention is shown in FIG. 5. The RAM 14 is used as a work area for the firmware, as working storage when the features of the voice input data are normalized, and as a storage area for the normalized data. 15 is the distance calculation logic. This logic is hardware that automatically calculates (by comparison) the similarity between a standard pattern and a voice input pattern: it obtains the absolute value of the difference between a dictionary word obtained from the dictionary buffer 6 and the voice input pattern obtained via the voice input unit 16, described later. When the value obtained is less than or equal to a predetermined value, the word is regarded as recognized, and its word name and the name of the dictionary to which it belongs are reported to the CPU 2.
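The comparison performed by the distance calculation logic 15 can be pictured with a short sketch. The text only says that the absolute value of the difference between the two patterns is obtained; summing those absolute differences over every element of the pattern is an assumption made here for illustration, as are the function names `pattern_distance` and `is_recognized`.

```python
def pattern_distance(standard_pattern, input_pattern):
    """Sum of element-wise absolute differences (assumed form of the distance)."""
    if len(standard_pattern) != len(input_pattern):
        raise ValueError("patterns must have the same length")
    return sum(abs(a - b) for a, b in zip(standard_pattern, input_pattern))


def is_recognized(distance, threshold):
    """A word counts as recognized when its distance is at or below the threshold."""
    return distance <= threshold


d = pattern_distance([200, 15, 0], [190, 20, 3])
print(d, is_recognized(d, threshold=20))
```

A smaller value means a closer match, so the threshold acts as an upper bound on how far an utterance may be from the stored standard pattern.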

16 is the voice input unit. The voice input unit 16 frequency-analyzes the analog voice signal input from a microphone (not shown), sample-holds the peak value of each frequency band, for example once per firmware operation period, and converts it to an 8-bit digital value. The bus interface unit 11, ROM 13, RAM 14, distance calculation logic 15, and voice input unit 16 are commonly connected to the internal bus 17 of the microprocessor 12.
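The peak-hold and 8-bit conversion performed by the voice input unit 16 can be sketched as below. This is only an illustration: the frequency analysis itself is assumed to have been done already (each inner list holds the samples of one frequency band during one sampling period), and the full-scale value, the clipping behaviour, and the name `quantize_band_peaks` are assumptions that do not come from the text.

```python
def quantize_band_peaks(band_samples, full_scale=1.0):
    """Hold the peak of each band and convert it to an 8-bit value (0..255).

    `band_samples` is a list with one non-empty list of samples per band.
    """
    frame = []
    for samples in band_samples:
        peak = max(samples)                               # peak hold for this band
        level = min(max(peak / full_scale, 0.0), 1.0)     # clip to 0.0..1.0
        frame.append(int(level * 255))                    # truncate to 8 bits
    return frame


print(quantize_band_peaks([[0.0, 0.5], [0.2, 1.0], [3.0]]))
```

Each call yields one feature vector with one byte per frequency band, which is the kind of pattern the distance calculation would then compare against the stored standard patterns.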

6 is the speech dictionary buffer shown in FIG. 2; it is connected to the system bus 9 via a bus interface unit 18. The speech dictionary buffer 6 consists of a dictionary area DA, in which multiple dictionaries are stored, and a dictionary information area (DIA). The dictionary information area holds, for the dictionaries in the dictionary area, the number of words in each dictionary.
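One way to picture the two areas of the speech dictionary buffer 6 is the sketch below. The actual memory layout is not given in the text, so the class names, the use of Python mappings, and the `load` helper are all hypothetical; only the split into a dictionary area holding several dictionaries and an information area recording each dictionary's word count is taken from the description.

```python
from dataclasses import dataclass, field


@dataclass
class DictionaryInfo:
    """One entry of the dictionary information area (DIA)."""
    name: str
    word_count: int


@dataclass
class SpeechDictionaryBuffer:
    # DA: dictionary name -> {word name -> standard pattern}
    dictionary_area: dict = field(default_factory=dict)
    # DIA: dictionary name -> DictionaryInfo
    info_area: dict = field(default_factory=dict)

    def load(self, name, words):
        """Store one dictionary and record its word count in the DIA."""
        self.dictionary_area[name] = dict(words)
        self.info_area[name] = DictionaryInfo(name, len(words))


buf = SpeechDictionaryBuffer()
buf.load("digits", {"one": [10, 10], "two": [50, 50]})
buf.load("commands", {"start": [11, 10]})
print(buf.info_area["digits"].word_count)
```

Keeping the word count next to the dictionary name lets the firmware know how many comparisons a given dictionary requires before it starts scanning.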

FIGS. 4(a) and 4(b) show the parameters that accompany the speech recognition command used in the present invention: (a) shows the parameters passed from the CPU 2 to the speech recognition device 4 when the command starts, and (b) shows the parameters passed from the speech recognition device 4 to the CPU 2 after recognition processing.

The operation of the present invention is described in detail below with reference to FIGS. 2 to 5.

First, the CPU 2 issues a voice input command to the speech recognition device 4 via the system bus 9 and sends the parameters shown in FIG. 4(a).

On receiving the command, the speech recognition device 4 starts its own microprocessor 12 and begins processing from the top of the firmware routine shown in FIG. 5. The firmware routine shown in FIG. 5 is, of course, assumed to be stored in the ROM 13.

The flow of the speech recognition operation inside the speech recognition device 4, as carried out by the microprocessor 12, is listed in order below.

(1) Following the format shown in FIG. 4(a), read the number of dictionaries and the dictionary names to be referred to during recognition via the bus interface unit 11.

(2) Accept the input voice at the voice input unit 16 and capture it by the operation described above.

(3) According to the dictionary information for the dictionary name read in (1), calculate the distance between a word stored in the dictionary buffer 6 and the voice captured in (2). This calculation is performed by the distance calculation logic 15 using the procedure described above.

In this embodiment, hardware is used to speed up the distance calculation, but the microprocessor 12 may of course perform this function instead.

(4) Repeat the above distance calculation for all words in one specified dictionary.

(5) Repeat operations (3) and (4) until all specified dictionaries have been processed.

(6) When the minimum distance obtained by the distance calculation is less than or equal to a predetermined value, regard recognition as achieved, and send the name of the dictionary containing that word together with the word name to the CPU 2 via the bus interface unit 11, following the format shown in FIG. 4(b).
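Steps (1) to (6) can be sketched as follows. This is a minimal illustration under stated assumptions rather than the firmware of FIG. 5: the request and response classes only mirror the fields named in the text for FIGS. 4(a) and 4(b), the distance is assumed to be a sum of absolute differences, and every identifier (`RecognitionRequest`, `RecognitionResponse`, `recognize`, `dictionary_area`) is hypothetical.

```python
from dataclasses import dataclass
from typing import Tuple


@dataclass(frozen=True)
class RecognitionRequest:
    """Parameters from the CPU, after FIG. 4(a): which dictionaries to consult."""
    dictionary_names: Tuple[str, ...]

    @property
    def dictionary_count(self):
        return len(self.dictionary_names)


@dataclass(frozen=True)
class RecognitionResponse:
    """Parameters returned to the CPU, after FIG. 4(b)."""
    dictionary_name: str
    word_name: str


def recognize(speech, request, dictionary_area, threshold):
    """Search only the requested dictionaries; return a response or None."""
    best = None  # (distance, dictionary name, word name)
    for dict_name in request.dictionary_names:           # step (5)
        words = dictionary_area.get(dict_name, {})
        for word_name, pattern in words.items():         # step (4)
            # Step (3): assumed metric, sum of absolute differences.
            d = sum(abs(p - s) for p, s in zip(pattern, speech))
            if best is None or d < best[0]:
                best = (d, dict_name, word_name)
    if best is not None and best[0] <= threshold:        # step (6)
        return RecognitionResponse(best[1], best[2])
    return None


area = {
    "digits": {"one": [10, 10], "two": [50, 50]},
    "commands": {"start": [11, 10], "stop": [90, 90]},
}
print(recognize([11, 10], RecognitionRequest(("digits",)), area, threshold=3))
```

In the usage above, the word "start" is closer to the input than any digit, but it is never compared because the request names only the "digits" dictionary. That restriction is what removes both the extra distance calculations and the later rejection work on the CPU side.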

As described above, in the present invention the name of the dictionary to be referred to during speech recognition is set as a parameter together with the usual dictionary information, and the recognized word name and the name of the dictionary containing that word are transferred to the requesting device as the response to recognition. Meaningless processing is thereby omitted, so the processing time required for recognition is reduced and the load on the CPU is lightened. Adopting the present invention makes it possible to realize an efficient speech recognition system.

[Brief Description of the Drawings]

FIG. 1 is a flowchart showing the operation of a recognition command as processed in a conventional speech recognition device; FIG. 2 is a block diagram showing a configuration example of a computer system in which the present invention is employed; FIG. 3 is a block diagram showing the internal configuration of the speech recognition device in FIG. 2; FIGS. 4(a) and 4(b) are diagrams showing the parameters that accompany the speech recognition command used in the present invention; and FIG. 5 is a firmware flowchart schematically showing the operation of the present invention.

2: CPU; 4: speech recognition device; 6: speech dictionary buffer; 11: bus interface unit; 12: microprocessor; 13: ROM; 15: distance calculation logic.

Claims

[Claims]

In a speech-input computer system that stores a plurality of dictionaries in a storage device that can be referenced by a speech recognition device, recognizes input speech using the dictionaries, and treats recognized words as input data,
a speech recognition method characterized in that the speech recognition device includes a first means for receiving input speech, a second means for performing a comparison computation between registered words, which are constituent elements of the dictionary, and the input speech according to the dictionary name and dictionary information obtained from outside and referred to during speech recognition, and a third means for performing recognition discrimination based on the value obtained by the second means, and in that, when the value is within a certain range, the recognized word name and the name of the dictionary containing the word are reported to the external device that issued the request.