JPS6269298A

JPS6269298A - Voice recognition equipment

Info

Publication number: JPS6269298A
Application number: JP60209369A
Authority: JP
Inventors: 正典宮武
Original assignee: Sanyo Electric Co Ltd
Current assignee: Sanyo Electric Co Ltd
Priority date: 1985-09-20
Filing date: 1985-09-20
Publication date: 1987-03-30
Also published as: JPH0562998B2

Abstract

(57) [Summary] This publication consists of application data that predates electronic filing, so no abstract data is recorded.

Description

DETAILED DESCRIPTION OF THE INVENTION

(A) Field of Industrial Application

The present invention relates to a speech recognition device in which the state of the recognition targets for the next recognition process transitions according to the result of speech recognition.

(B) Prior Art

In general, as shown in Japanese Patent Application Laid-Open No. 59-219798, the recognition rate of a speech recognition device falls as the number of words to be recognized grows. To counter this, the recognition words have conventionally been divided into several groups, and the words to be recognized next have been limited according to the result of the preceding recognition, thereby improving the recognition rate. For example, when electrical appliances are controlled by speech recognition, once the utterance "light" has been recognized, only the three words "on", "off", and "cancel" need to be recognition targets for the next utterance. Fig. 2 shows an example of such state transition rule information: for each state (state 1, state 2, state 6, ...) it stores the words to be recognized and the state transition that follows each recognition result. Suppose, for example, that a light, a cooler, and a timer attached to the cooler are to be controlled by speech recognition. The words used are the digits "ichi" (one) through "kyu" (nine), "light", "cooler", "timer", "on", "off", and "torikeshi" (cancel), 15 words in all. In the figure, X means that the word is not a recognition target in that state, and each number is the number of the next state to which the device transitions when that word is recognized.

Assume that the cooler is currently stopped, and call this state 1. In state 1 the only recognition target words are "light" and "cooler". If the utterance "light" is input and recognized, the device transitions to state 2. In state 2, one of "on", "off", or "torikeshi" is recognized, and the device returns to state 1. When "on" or "off" is recognized, the output section outputs a predetermined signal. Here the light being on and the light being off are treated as the same state; if it would be a problem for a light-on signal to be output while the light is already on, separate states can be defined.

If, on the other hand, "cooler" is recognized in state 1, the device transitions to state 6. Here only "on" and "torikeshi" are recognition targets; if "on" is recognized, the device transitions further to state 4 and the cooler starts. Once the cooler has started, the timer can be set, and the recognition target words become the three words "light", "cooler", and "timer".

If "timer" is recognized in state 4, the device transitions to state 7, where the recognition target words are the digits "ichi" through "kyu", "off", and "torikeshi". When a digit is recognized, a signal that sets the time until the cooler stops is output; when "off" is recognized, a signal that cancels the timer setting is output; and when "torikeshi" is recognized, nothing is done. In every case the device returns to state 4.
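
The walk-through above can be read as a lookup table keyed by state and recognized word. The following is a minimal sketch of that reading, not the patent's implementation. It encodes only the transitions that the text spells out, using the state numbers as they appear in the text. The destinations of "torikeshi" in state 6 and of "light" and "cooler" in state 4 are given only in Fig. 2, so they are left out here. The digit readings between "ichi" and "kyu", and the names `RULES`, `active_words`, `step`, and `run`, are assumptions made for illustration.

```python
# Digit words; only "ichi" and "kyu" appear in the text, the rest are assumed readings.
DIGITS = ["ichi", "ni", "san", "shi", "go", "roku", "shichi", "hachi", "kyu"]

# state -> {recognized word: next state}. A word absent from a state's row
# corresponds to an "X" (not a recognition target) in Fig. 2.
RULES = {
    1: {"light": 2, "cooler": 6},
    2: {"on": 1, "off": 1, "torikeshi": 1},
    6: {"on": 4},      # "torikeshi" is also a target; its destination is not stated in the text
    4: {"timer": 7},   # "light" and "cooler" are also targets; destinations not stated in the text
    7: {**{d: 4 for d in DIGITS}, "off": 4, "torikeshi": 4},
}


def active_words(state):
    """Words this sketch treats as recognition targets in the given state."""
    return set(RULES[state])


def step(state, word):
    """Return the next state, rejecting words that are not targets in this state."""
    row = RULES[state]
    if word not in row:
        raise ValueError(f"{word!r} is not a recognition target in state {state}")
    return row[word]


def run(words, state=1):
    """Feed a sequence of recognized words through the table, starting in state 1."""
    for word in words:
        state = step(state, word)
    return state
```

For example, `run(["cooler", "on", "timer", "san"])` follows the path 1, 6, 4, 7 and ends back in state 4, matching the sequence described in the text.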

Conventionally, such state transition rules have been stored and managed by placing the speech recognition device under the control of a host computer, which tells the device the next set of recognition target words according to the recognition result the device outputs. Because this method requires a host computer, however, the system tends to become complicated and large.

Alternatively, the system can be simplified by storing the state transition rules in the speech recognition device and having the device itself track the state transitions and determine the recognition target words. Fig. 5 is a block diagram showing an example configuration of such a system. (1) is a microphone, (2) is a feature extraction section that extracts features of the input speech, and (3) is a pattern creation section that creates an input pattern from the features obtained by the feature extraction section (2). (4) is a standard pattern storage section that stores a plurality of pre-registered standard patterns, (5) is a pattern matching section that compares the input pattern with the standard patterns in the standard pattern storage section (4), and (6) is an output section that outputs a predetermined signal to the outside according to the result of pattern matching. (7) is a state transition rule storage section that stores state transition rules such as those shown in Fig. 2, (8) is a standard pattern designation section that uses the information in the state transition rule storage section (7) to select standard patterns in the standard pattern storage section (4), and (9) is a gate that selects the standard patterns sent to the pattern matching section (5) according to the output of the standard pattern designation section (8).
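
The role of the gate (9) in front of the pattern matching section (5) can be sketched as a filter applied to the stored standard patterns before matching. This is only an illustration under stated assumptions: the patent does not specify the pattern format or the distance measure, so patterns are modeled here as short tuples of numbers and compared by Euclidean distance, and the names `gate`, `match`, and `recognize` are hypothetical.

```python
import math


def gate(templates, active_words):
    """Pass on only the standard patterns whose word is currently a recognition target."""
    return {word: pattern for word, pattern in templates.items() if word in active_words}


def match(input_pattern, templates):
    """Return the word whose standard pattern lies closest to the input pattern.

    Euclidean distance is an assumed stand-in for the unspecified matching method.
    """
    return min(templates, key=lambda word: math.dist(input_pattern, templates[word]))


def recognize(input_pattern, templates, active_words):
    """Match the input against the gated subset of standard patterns only."""
    return match(input_pattern, gate(templates, active_words))
```

With toy patterns `{"light": (0.0, 0.0), "cooler": (1.0, 1.0), "on": (0.1, 0.1)}` and the input `(0.1, 0.1)`, ungated matching picks "on", while gating to the state 1 vocabulary of "light" and "cooler" yields "light". Restricting the candidates in this way is what lets a smaller active vocabulary improve the recognition rate.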

Conventionally, however, the state transition rules were built into the program of the microprocessor that controls the speech recognition device. This made the device single-purpose, with the drawback that its program had to be rewritten for each application.

(C) Problems to be Solved by the Invention

The present invention is intended to resolve the above drawback, and its object is to provide a speech recognition device whose internally stored state transition rules can be changed by simple means.

(D) Means for Solving the Problems

The device comprises a feature extraction unit that extracts features from input speech; an identification unit that identifies the input speech on the basis of the extracted features and outputs a predetermined signal according to the result; and a state transition rule storage unit that stores the transition rules between a plurality of states of the identification targets, which transition according to the identification result of the identification unit. The state transition rule storage unit is characterized by having a structure that is detachable from the identification unit.

Furthermore, the state transition rule storage unit is characterized by being constituted by a ROM (read-only memory) mounted in an IC socket.

(E) Operation

Because the present invention is configured as described above, the identification unit determines the identification targets according to the rules in the state transition rule storage unit. State transition rules suited to the purpose can therefore be added to the speech recognition device simply by exchanging the state transition rule storage unit, for example a ROM.

(F) Embodiment

Fig. 1 shows an embodiment of the present invention. To give concrete form to Fig. 2, the pattern creation section (3), the pattern matching section (5), the standard pattern designation section (8), and the gate (9) are replaced by a microprocessor and an arithmetic circuit; components with the same function as in Fig. 2 carry the same numbers. Speech input through the microphone (1) is amplified by the amplifier (10), divided into frequency bands by the BPF (band-pass filter) (11), and digitized by the ADC (analog-to-digital converter) (12), and the microprocessor (13) creates an input pattern from it. At speech registration, a specific number of these input patterns are stored in the standard pattern storage section (4) as standard patterns; at speech recognition, the microprocessor (13) compares the input pattern with each standard pattern stored in the standard pattern storage section (4), using the arithmetic circuit (14), and the result is output to the outside through the output section (6).

In doing so, the microprocessor (13) determines the recognition target words by referring to the state transition rules stored in the state transition rule storage section (7) on the basis of the previous speech recognition result.

The device of this embodiment differs from the conventional device in the state transition rule storage section (7). That is, the state transition rule storage section (7) consists of a ROM (71) in which the state transition rules are stored and an IC socket into which the ROM is mounted, so the ROM can be exchanged easily.

Thus the ROM (71) of the state transition rule storage section (7) is separate in hardware from the program storage section and can be exchanged for another ROM storing a state transition rule table that matches the intended use of the device. The range of uses of the device can therefore be greatly expanded without modifying the program.
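
The separation between the fixed program and the exchangeable rule ROM can be illustrated by keeping the rule table as a byte image that the recognizer only reads. The sketch below is hypothetical throughout: the patent does not describe the ROM layout, so it assumes one byte per (state, word) cell in row-major order, with `0xFF` standing in for the "X" entries of Fig. 2. The names `encode_rom` and `Recognizer` are likewise invented for illustration.

```python
NOT_TARGET = 0xFF  # assumed encoding of an "X" cell (word is not a recognition target)


def encode_rom(rules, words, n_states):
    """Pack a {state: {word: next_state}} table into a flat byte image (assumed layout)."""
    image = bytearray([NOT_TARGET] * (n_states * len(words)))
    for state, row in rules.items():
        for word, next_state in row.items():
            image[(state - 1) * len(words) + words.index(word)] = next_state
    return bytes(image)


class Recognizer:
    """Stands in for the fixed program: it only reads whatever rule image is plugged in."""

    def __init__(self, rom, words):
        self.rom = rom
        self.words = words
        self.state = 1  # states are numbered from 1, as in the description

    def _row(self):
        start = (self.state - 1) * len(self.words)
        return self.rom[start:start + len(self.words)]

    def active_words(self):
        """Recognition target words in the current state."""
        return [w for w, cell in zip(self.words, self._row()) if cell != NOT_TARGET]

    def accept(self, word):
        """Apply the transition for a recognized word, rejecting non-target words."""
        cell = self._row()[self.words.index(word)]
        if cell == NOT_TARGET:
            raise ValueError(f"{word!r} is not a recognition target in state {self.state}")
        self.state = cell


WORDS = ["light", "cooler", "on", "off", "torikeshi"]
# Two interchangeable rule images for the same, unchanged Recognizer code.
ROM_A = encode_rom({1: {"light": 2}, 2: {"on": 1, "off": 1, "torikeshi": 1}}, WORDS, 2)
ROM_B = encode_rom({1: {"light": 2, "cooler": 2}, 2: {"on": 1, "torikeshi": 1}}, WORDS, 2)
```

Constructing `Recognizer(ROM_A, WORDS)` and `Recognizer(ROM_B, WORDS)` gives two devices with different vocabularies per state from identical code, which mirrors the effect of swapping the ROM in its socket.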

(G) Effects of the Invention

As is clear from the above description, the present invention makes it possible to use state transition rules with a simple structure and without a host computer, and to change those rules easily. It can therefore provide an excellent speech recognition device that has a high recognition rate and can serve many applications.

[Brief explanation of drawings]

Fig. 1 is a block diagram of an embodiment of the speech recognition device of the present invention, Fig. 2 is a schematic diagram of a concrete example of state transition rules, and Fig. 5 is a block diagram of a speech recognition device that uses state transition rules. (1) ... microphone, (4) ... standard pattern storage section, (7) ... state transition rule storage section, (13) ... microprocessor, (14) ... arithmetic section, (71) ... ROM, ... socket.

Claims

[Claims]

(1) A speech recognition device comprising: a feature extraction unit that extracts features from input speech; an identification unit that identifies the input speech from among a number of recognition targets on the basis of the extracted features and outputs a predetermined signal according to the result; and a state transition rule storage unit that stores transition rules between a plurality of states of the identification targets, which transition according to the identification result of the identification unit, wherein the identification unit determines the recognition targets according to the rules in the state transition rule storage unit, which is detachably connected to it, and then performs recognition processing.

(2) The voice recognition device according to claim 1, wherein the state transition rule storage unit is comprised of a ROM (read-only memory) that is removably installed on an IC socket.