JPH0562998B2

Info

Publication number: JPH0562998B2
Application number: JP60209369A
Authority: JP
Inventors: Masanori Myatake
Original assignee: Sanyo Electric Co Ltd
Current assignee: Sanyo Electric Co Ltd
Priority date: 1985-09-20
Filing date: 1985-09-20
Publication date: 1993-09-09
Also published as: JPS6269298A

Description

DETAILED DESCRIPTION OF THE INVENTION

(a) Field of Industrial Application

The present invention relates to a speech recognition device in which the state of the identification targets for the next recognition process transitions according to the result of speech recognition.

(b) Prior Art

In general, as shown in Japanese Patent Application Laid-Open No. 59-219798, the recognition rate of a speech recognition device falls as the number of words to be recognized grows. To prevent this, the recognition words have conventionally been divided into several groups, and the words accepted at the next recognition step have been limited according to the previous recognition result, which improves the recognition rate. For example, when electrical appliances are controlled by speech recognition, once the word "light" has been recognized, only the three words "on", "off", and "torikeshi" (cancel) need to be accepted next.

FIG. 2 shows an example of such state transition rule information. For each state (state 1, state 2, state 3, ...) it stores the words to be recognized and the state transition for each recognition result. Suppose, for example, that a light, a cooler, and a timer attached to the cooler are to be controlled by speech recognition. The vocabulary consists of 15 words: the numbers "ichi" (one) through "kyuu" (nine), "light", "cooler", "timer", "on", "off", and "torikeshi". In the figure, an x means that the word is not a recognition target in that state, and each number gives the next state entered when that word is recognized.
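The table layout described for FIG. 2 can be sketched as one row per state and one column per word. The following Python fragment is only an illustration under stated assumptions: the romanized readings of the numbers, the column order, and the names `WORDS`, `X`, `make_row`, `TABLE`, and `recognition_targets` are hypothetical, and only the rows for states 1 and 2 (as described in the text that follows) are filled in.

```python
# Vocabulary of the 15-word example. The readings between "ichi" and "kyuu"
# and the column order are assumptions; the source names only the endpoints.
WORDS = ["ichi", "ni", "san", "yon", "go", "roku", "nana", "hachi", "kyuu",
         "light", "cooler", "timer", "on", "off", "torikeshi"]

X = None  # stands for the "x" mark: the word is not a recognition target


def make_row(**next_states):
    """Build one table row: the next state per word, X for unlisted words."""
    unknown = set(next_states) - set(WORDS)
    if unknown:
        raise ValueError(f"unknown words: {sorted(unknown)}")
    return [next_states.get(word, X) for word in WORDS]


# Rows for states 1 and 2 only, following the description in the text.
TABLE = {
    1: make_row(light=2, cooler=3),
    2: make_row(on=1, off=1, torikeshi=1),
}


def recognition_targets(state):
    """Return the words whose entry in this state's row is not X."""
    return [w for w, nxt in zip(WORDS, TABLE[state]) if nxt is not X]


print(recognition_targets(1))  # ['light', 'cooler']
print(recognition_targets(2))  # ['on', 'off', 'torikeshi']
```

Keeping every row the same width, with an explicit "not a target" marker, mirrors the figure and makes a row easy to store as a fixed-size record.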

Assume that the cooler is currently stopped, and call this state 1. In state 1 the only recognition targets are "light" and "cooler". If the word "light" is spoken and recognized, the device moves to state 2. In state 2 one of "on", "off", or "torikeshi" is recognized, and the device returns to state 1. When "on" or "off" is recognized, the output unit outputs a predetermined signal. Here the light being on and the light being off are both treated as state 1; if it is undesirable for a light-on signal to be output while the light is already on, a separate state can be defined.

If, on the other hand, "cooler" is recognized in state 1, the device moves to state 3. Here only "on" and "torikeshi" are recognition targets. If "on" is recognized, the device moves on to state 4 and the cooler starts. Once the cooler has started, the timer can be set, and the recognition targets become the three words "light", "cooler", and "timer".

If "timer" is recognized in state 4, the device moves to state 7, where the recognition targets are the numbers "ichi" through "kyuu", "off", and "torikeshi". When a number is recognized, a signal that sets the time until the cooler stops is output; when "off" is recognized, a signal that cancels the timer setting is output; and when "torikeshi" is recognized, nothing is done. In every case the device returns to state 4.
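The walk through states 1, 2, 3, 4, and 7 can be traced with a small simulator. This is a minimal sketch, not the patent's implementation. Several details are assumptions because the text does not give them: the next state for "torikeshi" in state 3 (taken to be 1), the state numbers reached by "light" and "cooler" in state 4 (taken to be 5 and 6, with no rows defined for them), the romanized number readings, and all signal names.

```python
DIGITS = ["ichi", "ni", "san", "yon", "go", "roku", "nana", "hachi", "kyuu"]

# state -> {word: next state}; a missing word corresponds to "x" in FIG. 2.
RULES = {
    1: {"light": 2, "cooler": 3},
    2: {"on": 1, "off": 1, "torikeshi": 1},
    3: {"on": 4, "torikeshi": 1},               # torikeshi -> 1 is assumed
    4: {"light": 5, "cooler": 6, "timer": 7},   # 5 and 6 are assumed numbers
    7: {**{d: 4 for d in DIGITS}, "off": 4, "torikeshi": 4},
}


def output_signal(state, word):
    """Return a hypothetical signal name for the output unit, or None."""
    if state == 2 and word in ("on", "off"):
        return f"light_{word}"
    if state == 3 and word == "on":
        return "cooler_start"
    if state == 7 and word in DIGITS:
        return f"timer_set_{DIGITS.index(word) + 1}"
    if state == 7 and word == "off":
        return "timer_clear"
    return None  # "torikeshi" and pure navigation words emit nothing


def run(words, state=1):
    """Feed already-recognized words through the rules; return final state and trace."""
    trace = []
    for word in words:
        row = RULES[state]
        if word not in row:
            # Not a recognition target in this state: ignore it, stay put.
            trace.append((state, word, None, state))
            continue
        nxt = row[word]
        trace.append((state, word, output_signal(state, word), nxt))
        state = nxt
    return state, trace


final, trace = run(["cooler", "on", "timer", "san"])
for entry in trace:
    print(entry)  # (state, word, signal, next state)
print(final)  # 4
```

In this trace the cooler is started, the timer is set with "san", and the device ends in state 4, matching the sequence described above.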

Conventionally, such state transition rules have been stored and controlled by placing the speech recognition device under the control of a host computer, which tells the device the next set of recognition target words according to the recognition result the device outputs. Because this method requires a host computer, however, the system tends to become complex and large.

The system can be simplified, on the other hand, by storing the state transition rules in the speech recognition device and having the device itself track the state transitions and determine the recognition target words.

FIG. 3 is a block diagram showing a configuration example of such a system. Reference numeral 1 denotes a microphone, 2 a feature extraction unit that extracts features of the input speech, and 3 a pattern creation unit that creates an input pattern from the features obtained by the feature extraction unit 2. Numeral 4 denotes a standard pattern storage unit that stores a plurality of pre-registered standard patterns, 5 a pattern matching unit that compares the input pattern with the standard patterns in the standard pattern storage unit 4, and 6 an output unit that outputs a predetermined signal to the outside according to the pattern matching result. Numeral 7 denotes a state transition rule storage unit that stores state transition rules such as those shown in FIG. 2, 8 a standard pattern designation unit that uses the information in the state transition rule storage unit 7 to select standard patterns in the standard pattern storage unit 4, and 9 a gate that selects the standard patterns to be sent to the pattern matching unit 5 according to the output of the standard pattern designation unit 8.
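The cooperation of units 4, 5, 7, 8, and 9 might be pictured roughly as follows. The source does not specify the feature representation or the matching measure, so the three-element feature vectors, the Euclidean distance, and every name in this sketch (`STANDARD_PATTERNS`, `designate`, `gate`, `match`) are illustrative assumptions only.

```python
import math

# Unit 4: hypothetical standard patterns, one made-up feature vector per word.
STANDARD_PATTERNS = {
    "light":     [0.9, 0.1, 0.3],
    "cooler":    [0.2, 0.8, 0.5],
    "on":        [0.8, 0.2, 0.4],
    "off":       [0.1, 0.1, 0.9],
    "torikeshi": [0.5, 0.5, 0.5],
}

# Unit 7: a fragment of the state transition rules (states 1 and 2 only).
RULES = {
    1: {"light": 2, "cooler": 3},
    2: {"on": 1, "off": 1, "torikeshi": 1},
}


def designate(state):
    """Unit 8: look up which words are recognition targets in this state."""
    return set(RULES[state])


def gate(allowed):
    """Unit 9: pass only the designated standard patterns to the matcher."""
    return {w: p for w, p in STANDARD_PATTERNS.items() if w in allowed}


def match(input_pattern, candidates):
    """Unit 5: pick the candidate closest to the input (Euclidean distance assumed)."""
    return min(candidates, key=lambda w: math.dist(input_pattern, candidates[w]))


utterance = [0.88, 0.12, 0.32]  # a made-up input pattern
print(match(utterance, gate(designate(1))))  # light
print(match(utterance, gate(designate(2))))  # on
```

The same input pattern yields a different result in each state because the gate changes which standard patterns reach the matcher. Restricting the candidates in this way is what keeps the effective vocabulary small at each step.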

Conventionally, however, the state transition rules were built into the program of the microprocessor that controls the speech recognition device. This made the device special-purpose, with the drawback that its program had to be rewritten for each application.

(c) Problems to Be Solved by the Invention

The present invention is intended to resolve the above drawback, and its object is to provide a speech recognition device whose internally stored state transition rules can be changed by simple means.

(d) Means for Solving the Problems

The device comprises a feature extraction unit that extracts feature quantities from input speech; an identification unit that identifies the input speech on the basis of the extracted features and outputs a predetermined signal according to the result; and a state transition rule storage unit that stores the rules for transitions, made according to the identification result of the identification unit, among a plurality of states concerning the identification targets. The state transition rule storage unit is characterized by having a structure that is detachable from the identification unit.

Furthermore, the state transition rule storage unit is characterized by being constituted by a ROM (read-only memory) mounted in an IC socket.

(e) Operation

With the present invention configured as described above, the identification unit determines the identification targets according to the rules in the state transition rule storage unit. State transition rules suited to the application can therefore be given to the speech recognition device simply by replacing the state transition rule storage unit, for example a ROM.

(f) Embodiment

FIG. 1 shows an embodiment of the present invention. To give FIG. 2 a concrete form, the pattern creation unit 3, the pattern matching unit 5, the standard pattern designation unit 8, and the gate 9 are replaced by a microprocessor and an arithmetic circuit; parts having the same functions as those in FIG. 2 are given the same numbers. Speech input through the microphone 1 is amplified by an amplifier unit 10, divided into frequency bands by a BPF (band-pass filter) 11, digitized by an ADC (analog-to-digital converter) 12, and taken into a microprocessor 13. On the basis of this speech information, the microprocessor 13 creates an input pattern of the speech using an arithmetic circuit 14. During speech registration, a specified number of such input patterns are stored as standard patterns in the standard pattern storage unit 4. During speech recognition, the microprocessor 13 uses the arithmetic circuit 14 to compare the input pattern with each standard pattern stored in the standard pattern storage unit 4, and the result is output to the outside through the output unit 6. In doing so, the microprocessor 13 determines the recognition target words by referring to the state transition rules stored in the state transition rule storage unit 7 on the basis of the previous speech recognition result.

The device of this embodiment differs from the conventional device in its state transition rule storage unit 7'. The state transition rule storage unit 7' consists of a ROM 71 in which the state transition rules are stored and an IC socket 72 in which the ROM is mounted, so the ROM can easily be replaced.

Because the ROM 71 of the state transition rule storage unit 7 is thus physically separate from the program storage unit 15 and can be exchanged for another ROM holding a state transition rule table suited to the device's application, the range of applications of the device can be greatly expanded without modifying the program.
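The separation of the rule table from the program can be illustrated in software by treating the table as an opaque byte image that a fixed recognizer reads. The byte layout (one byte per state and word, 0 meaning "not a target"), the names `build_rom` and `Recognizer`, and the second rule set are all hypothetical; the second table follows the earlier remark that a separate state may be defined for the light being on.

```python
def build_rom(words, table, n_states):
    """Pack rules into bytes: one byte per (state, word), 0 = not a target."""
    image = bytearray()
    for state in range(1, n_states + 1):
        row = table.get(state, {})
        image.extend(row.get(w, 0) for w in words)
    return bytes(image)


class Recognizer:
    """The fixed 'program': unchanged no matter which ROM image is plugged in."""

    def __init__(self, words, rom):
        self.words = words
        self.rom = rom
        self.state = 1

    def targets(self):
        """Read the current state's row from the ROM image."""
        base = (self.state - 1) * len(self.words)
        row = self.rom[base:base + len(self.words)]
        return {w: nxt for w, nxt in zip(self.words, row) if nxt != 0}

    def accept(self, word):
        """Apply a recognized word; return False if it is not a target."""
        nxt = self.targets().get(word)
        if nxt is None:
            return False
        self.state = nxt
        return True


WORDS = ["light", "on", "off", "torikeshi"]

# ROM A: light on and light off share state 1, as in the prior-art example.
ROM_A = build_rom(WORDS, {1: {"light": 2},
                          2: {"on": 1, "off": 1, "torikeshi": 1}}, 2)

# ROM B (hypothetical): separate states for "light off" (1) and "light on" (3).
ROM_B = build_rom(WORDS, {1: {"light": 2}, 2: {"on": 3, "torikeshi": 1},
                          3: {"light": 4}, 4: {"off": 1, "torikeshi": 3}}, 4)

for rom in (ROM_A, ROM_B):
    recognizer = Recognizer(WORDS, rom)
    recognizer.accept("light")
    print(sorted(recognizer.targets()))
# prints ['off', 'on', 'torikeshi'] for ROM A and ['on', 'torikeshi'] for ROM B
```

Only the byte image differs between the two runs; the `Recognizer` code is identical, which is the software analogue of exchanging the ROM in its socket without touching the program storage.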

(g) Effects of the Invention

As is clear from the above description, the present invention allows state transition rules to be used with a simple structure that requires no host computer, and allows those rules to be changed easily. It can therefore provide an excellent speech recognition device that has a high recognition rate and can serve many applications.

[Brief Description of the Drawings]

FIG. 1 is a block diagram of an embodiment of the speech recognition device of the present invention, FIG. 2 is a schematic diagram of a specific example of state transition rules, and FIG. 3 is a block diagram of a speech recognition device using state transition rules.

1: microphone; 4: standard pattern storage unit; 7': state transition rule storage unit; 13: microprocessor; 14: arithmetic unit; 71: ROM; 72: socket.

Claims

[Scope of Claims]

1. A speech recognition device comprising: a feature extraction unit that extracts feature quantities from input speech; an identification unit that, on the basis of the extracted features, identifies the input speech from among a large number of recognition targets and outputs a predetermined signal according to the result; and a state transition rule storage unit that stores transition rules among a plurality of states concerning the identification targets, the states transitioning according to the identification result of the identification unit, wherein the identification unit performs identification processing after determining the identification targets according to the rules of the state transition rule storage unit, which is connected in a detachable state.

2. The speech recognition device according to claim 1, wherein the state transition rule storage unit comprises a ROM (read-only memory) detachably mounted in an IC socket.