JPH06337697A - Voice recognition device - Google Patents

Voice recognition device

Info

Publication number
JPH06337697A
Authority
JP
Japan
Prior art keywords
recognition
dsp
amplifier
voice recognition
variable gain
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Granted
Application number
JP5129454A
Other languages
Japanese (ja)
Other versions
JP2975808B2 (en)
Inventor
Koichi Ide
廣一 井出
Toshiyuki Watanabe
俊幸 渡辺
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Sanyo Electric Co Ltd
Original Assignee
Sanyo Electric Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Sanyo Electric Co Ltd
Priority to JP5129454A
Publication of JPH06337697A
Application granted
Publication of JP2975808B2
Anticipated expiration
Expired - Fee Related

Abstract

PURPOSE: To improve recognition accuracy by using the voice recognition result to adjust the input section. CONSTITUTION: Sound input from a microphone 1 is amplified by a preamplifier 2A and a variable gain amplifier 2B, then converted to a digital signal by an A/D converter 3 and supplied to a digital signal processor (DSP) 4. After voice recognition processing in the DSP 4, the result is output from an interface circuit 5 in an arbitrary format or through an arbitrary port. When recognition is unsuccessful, the DSP 4 outputs a signal for adjusting the amplifier gain, and this signal switches a gain switching circuit 2C of the variable gain amplifier 2B. That is, when recognition is unsuccessful, the circuit 2C is switched so as to increase the gain of the variable gain amplifier 2B. Because the success or failure of recognition is probabilistic, a method in which a human feeds back the result, or a method in which the DSP 4 learns probabilistically, may also be used.

Description

Detailed Description of the Invention

[0001]

[Field of Industrial Application] The present invention relates to a voice recognition device that can be used in information equipment and consumer equipment in general that use voice as an input means.

[0002]

[Prior Art] Human speech differs from speaker to speaker, and even the same speaker's speech varies with each utterance. Early speech recognition technology studied how to align these time-domain variations for pattern matching. Furthermore, the introduction of mathematical statistical methods and neural networks has enabled modeling in both the frequency and spectral domains, and the word-level recognition rate for unspecified speakers has improved dramatically.

[0003] With these technical developments, many boards for installation in personal computers that make voice recognition easy have now been commercialized.

[0004] Meanwhile, in the consumer products field, operation by voice input has been attempted as one approach to simplifying the operation of increasingly complex AV equipment, and some products have been commercialized. For example, the magazine 「テレビ技術」 (Television Technology), May 1991 issue, pages 38 to 44, introduces a remote control device using voice recognition technology.

[0005] FIG. 2 shows an example of a circuit configuration used for general voice recognition. Because digital signal processing is generally used, the circuit is divided into an input microphone 1 and amplifier system 2, an AD (analog-to-digital) conversion processing unit 3, a recognition processing unit 4 consisting of a computer, a DSP (digital signal processor), or a dedicated voice recognition IC, and an external interface unit 5.

[0006]

[Problems to Be Solved by the Invention] The technology currently commercialized falls within the category of the word-unit recognition described above or its sequential processing. In this case only predetermined words need to be recognized, so the recognition rate is determined by how well the extracted speech is matched against the prepared database.

[0007] Besides the recognition algorithm itself, how well the peripheral circuitry captures the required speech is therefore also an important factor.

[0008] For simplicity, assume that the input voice signal has a single frequency, as shown in FIG. 3. When the input level is excessive, the amplifier system ahead of the recognition processing unit goes over range. As shown in FIG. 4, the peaks of the waveform are clipped, the waveform becomes trapezoidal, and harmonic components are superimposed. At this excessive input level the waveform itself has changed. In other words, the frequency pattern, which is an important parameter for recognition, changes and recognition becomes impossible.
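The effect described in [0008] can be illustrated numerically. The following is a minimal sketch, not part of the patent: it assumes a unit-amplitude test tone, a hypothetical overdrive factor of 4, and a clipping limit of 1.0, and it measures the third harmonic before and after clipping with a directly computed DFT bin.

```python
import math

def clip(samples, limit):
    """Hard-limit each sample to [-limit, +limit], like an over-range amplifier."""
    return [max(-limit, min(limit, s)) for s in samples]

def bin_amplitude(samples, k):
    """Amplitude of the component that completes k cycles per frame (one DFT bin)."""
    n = len(samples)
    re = sum(s * math.cos(2 * math.pi * k * i / n) for i, s in enumerate(samples))
    im = sum(s * math.sin(2 * math.pi * k * i / n) for i, s in enumerate(samples))
    return 2 * math.hypot(re, im) / n

N = 256        # frame length (assumed)
CYCLES = 8     # the single input frequency completes 8 cycles per frame (assumed)

tone = [math.sin(2 * math.pi * CYCLES * i / N) for i in range(N)]
overdriven = clip([4.0 * s for s in tone], 1.0)   # 4x too much gain, limit 1.0

clean_third = bin_amplitude(tone, 3 * CYCLES)
clipped_third = bin_amplitude(overdriven, 3 * CYCLES)

print(f"third harmonic, clean tone:   {clean_third:.6f}")
print(f"third harmonic, clipped tone: {clipped_third:.6f}")
```

The clean tone has essentially no energy at three times its frequency, while the clipped tone does, which is the change in frequency pattern that the paragraph attributes to over-range input.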

[0009] One way to handle such excessive input is to lower the amplifier gain in advance so that the range is not exceeded, but then the recognition rate for low-level input drops significantly. Another method varies the gain according to the input level. However, unlike a video signal, whose input level is determined in advance to some extent and which has references such as a synchronization signal, an audio signal has no inherent reference for either time axis or level and fluctuates quite dynamically, so this method is risky.
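To make the level-driven alternative in [0009] concrete, here is a small sketch of a control that picks the gain purely from the measured peak of each frame. The target peak, the gain ceiling, and the frame peaks are all hypothetical values chosen for illustration; the patent does not specify such a circuit.

```python
def level_based_gains(frame_peaks, target_peak=0.5, max_gain=16.0):
    """Return the gain a purely level-driven control would pick for each frame.

    Each frame is scaled so that its peak would land on target_peak,
    limited to max_gain for silent or very quiet frames.
    """
    gains = []
    for peak in frame_peaks:
        if peak <= 0:
            gains.append(max_gain)
        else:
            gains.append(min(max_gain, target_peak / peak))
    return gains

# Peaks of four consecutive speech frames (made-up numbers): speech has no
# fixed reference level, so the peaks swing widely from frame to frame.
frame_peaks = [0.05, 0.40, 0.02, 0.25]
gains = level_based_gains(frame_peaks)

for peak, gain in zip(frame_peaks, gains):
    print(f"peak {peak:.2f} -> gain {gain:.2f}")
```

With these inputs the selected gain jumps by roughly an order of magnitude between neighbouring frames, which illustrates why a control that follows the raw level of a dynamically fluctuating signal is described as risky.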

[0010] The present invention therefore aims to improve recognition accuracy by adjusting the input section using the voice recognition result.

[0011]

[Means for Solving the Problems] When the same person speaks into a microphone or the like and utters predetermined words within a certain period of time, the microphone is usually held at roughly the same position and the words are uttered at roughly the same sound pressure. Consequently, if the voice input is distorted by level over-range or the like, the distortion is likely to occur on consecutive utterances.

[0012] Therefore, when voice recognition is unsuccessful, the gain is adjusted. At this time, a technique with a learning effect is used, for example averaging and storing the data from the past several attempts.
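The averaging idea in [0012] could be realised in many ways, and the text does not say which data are averaged. The sketch below assumes, purely for illustration, that the stored data are the gain settings (in dB) at which recent recognitions succeeded, and that their average is used as the starting gain for the next utterance. The class name, the history depth, and the initial gain are hypothetical.

```python
from collections import deque

class GainHistory:
    """Remember the last few gain settings that worked and average them."""

    def __init__(self, depth=4, initial_gain_db=20.0):
        self._recent = deque(maxlen=depth)   # oldest entries drop out automatically
        self._initial_gain_db = initial_gain_db

    def record_success(self, gain_db):
        """Store the gain that was in use when recognition succeeded."""
        self._recent.append(gain_db)

    def starting_gain_db(self):
        """Average of the stored gains, or the initial gain if nothing is stored."""
        if not self._recent:
            return self._initial_gain_db
        return sum(self._recent) / len(self._recent)

history = GainHistory(depth=3)
for gain_db in (26.0, 26.0, 32.0, 32.0):
    history.record_success(gain_db)
print(history.starting_gain_db())
```

Because the same speaker tends to hold the microphone in a similar position, a short running average of this kind settles near a gain that suits that speaker while still following gradual changes.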

[0013]

[Operation] With this arrangement, the recognition rate improves over time because of the utterance characteristics described above.

[0014]

[Embodiment] FIG. 1 shows an embodiment of the present invention.

[0015] Voice input from the microphone 1 is amplified by the preamplifier 2A and the variable gain amplifier 2B, converted into a digital signal by the AD converter 3, and supplied to the DSP 4. After the DSP performs voice recognition processing, the result is output from the interface circuit 5 to an external device in an arbitrary format or through an arbitrary port.
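The analog front end in [0015] can be modelled in a few lines. This is a simplified sketch under stated assumptions: gains are plain linear factors, the AD converter is an ideal 8-bit signed converter with a full-scale input of 1.0, and the sample values are made up. None of these figures come from the patent.

```python
def amplify(samples, preamp_gain, variable_gain):
    """Preamplifier 2A followed by variable gain amplifier 2B (linear gains)."""
    return [s * preamp_gain * variable_gain for s in samples]

def ad_convert(samples, bits=8, full_scale=1.0):
    """Ideal signed AD converter 3: scale, round, and saturate at the code limits."""
    max_code = 2 ** (bits - 1) - 1
    min_code = -(2 ** (bits - 1))
    codes = []
    for s in samples:
        code = round(s / full_scale * max_code)
        codes.append(max(min_code, min(max_code, code)))
    return codes

mic_samples = [0.001, -0.002, 0.05]          # hypothetical microphone output
codes = ad_convert(amplify(mic_samples, preamp_gain=10, variable_gain=4))
print(codes)
```

The last sample exceeds full scale after amplification and saturates at the top code, which is the digital counterpart of the over-range condition that the feedback in the following paragraph is meant to correct.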

[0016] When recognition is unsuccessful, the DSP 4 outputs a signal for adjusting the amplifier gain, and this signal switches the gain switching circuit 2C of the variable gain amplifier 2B. That is, when voice recognition is unsuccessful, the circuit is switched so as to increase the gain of the variable gain amplifier 2B.
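The feedback rule in [0016] amounts to a small state machine. The sketch below models the gain switching circuit 2C as an index into a table of selectable gains and raises it by one step after each failed recognition, as the paragraph describes. The gain table and the class and function names are hypothetical; the patent does not list the available gain steps.

```python
GAIN_STEPS_DB = (0, 6, 12, 18, 24)   # hypothetical selectable gains of amplifier 2B

class GainSwitch:
    """Stand-in for gain switching circuit 2C: selects one entry of the table."""

    def __init__(self, index=0):
        self.index = index

    @property
    def gain_db(self):
        return GAIN_STEPS_DB[self.index]

    def step_up(self):
        """Select the next higher gain; return False if already at the top."""
        if self.index < len(GAIN_STEPS_DB) - 1:
            self.index += 1
            return True
        return False

def on_recognition_result(switch, recognized):
    """DSP-side rule: after a failed recognition, raise the gain by one step."""
    if not recognized:
        switch.step_up()
    return switch.gain_db

switch = GainSwitch()
for recognized in (False, True, False):
    print(on_recognition_result(switch, recognized))
```

A real design would also need a rule for what happens at the top of the table, or for lowering the gain again; the paragraph does not cover those cases, so the sketch simply stays at the highest step.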

[0017] Because the success or failure of recognition is probabilistic, the gain adjustment may instead use a method in which a human feeds back the result, a method in which the DSP learns probabilistically, or the like.

[0018] As described above, frequency components are an important parameter in voice recognition. Characteristic distortion such as over-range can sometimes be predicted to some degree by frequency analysis, so if the algorithm's program ROM has room to spare, adding this information to the decision factors used during adjustment will further improve accuracy.
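A rough sketch of how frequency analysis might feed the adjustment decision in [0018] follows. It keeps the single-tone simplification used earlier in the description, estimates the fundamental and third-harmonic amplitudes with the Goertzel algorithm, and flags the frame as over-range when their ratio exceeds a threshold. The threshold and the rule that lowers the gain when distortion is detected are assumptions made for illustration; the patent only says that this information may be added to the decision factors.

```python
import math

def goertzel_amplitude(samples, k):
    """Amplitude of the component with k cycles per frame (Goertzel algorithm)."""
    n = len(samples)
    coeff = 2.0 * math.cos(2.0 * math.pi * k / n)
    s1 = s2 = 0.0
    for x in samples:
        s0 = x + coeff * s1 - s2
        s2, s1 = s1, s0
    power = s1 * s1 + s2 * s2 - coeff * s1 * s2
    return 2.0 * math.sqrt(max(power, 0.0)) / n

def looks_overdriven(samples, fundamental_bin, threshold=0.1):
    """Flag frames whose third harmonic is strong relative to the fundamental."""
    fundamental = goertzel_amplitude(samples, fundamental_bin)
    third = goertzel_amplitude(samples, 3 * fundamental_bin)
    return fundamental > 0 and third / fundamental > threshold

def gain_action(recognized, overdriven):
    """Hypothetical decision rule combining both pieces of information."""
    if recognized:
        return "hold"
    return "decrease" if overdriven else "increase"

N, K = 240, 5
tone = [math.sin(2 * math.pi * K * i / N) for i in range(N)]
clipped = [max(-1.0, min(1.0, 4.0 * s)) for s in tone]

print(gain_action(False, looks_overdriven(tone, K)))
print(gain_action(False, looks_overdriven(clipped, K)))
```

Real speech has no single known fundamental, so a practical detector would need a broader spectral measure; the sketch only shows where such a measure would enter the decision.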

[0019]

[Effects of the Invention] According to the present invention, waveform distortion in the microphone amplifier section, which is an important factor in the success rate of voice recognition, is reduced by feeding back the recognition result, so the recognition rate can be improved.

[Brief Description of the Drawings]

FIG. 1 is a block diagram of a voice recognition device embodying the present invention.

FIG. 2 is a block diagram of a conventional voice recognition system.

FIG. 3 is a waveform diagram for explaining the operation.

FIG. 4 is a waveform diagram for explaining the operation.

[Explanation of Symbols]

1 Microphone
2A Preamplifier
2B Variable gain amplifier
2C Gain switching circuit
3 AD converter
4 Digital signal processor
5 Interface circuit

Claims (1)

[Claims] 1. A voice recognition device comprising: a microphone; a variable gain amplifier that amplifies a voice signal obtained from the microphone; an analog-to-digital converter that receives the output of the amplifier as its input; and a voice recognition processing circuit that processes the output of the converter, wherein the gain of the variable gain amplifier is adjusted based on the voice recognition rate in the voice recognition processing circuit.
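The claim ties the gain adjustment to the recognition rate rather than to a single result. As one hedged interpretation, the sketch below measures the rate over a sliding window of recent attempts and raises the gain only when the window is full and the rate falls below a threshold. The window length, threshold, step size, and gain ceiling are hypothetical parameters that do not appear in the patent.

```python
from collections import deque

class RateBasedGainControl:
    """Raise the gain when the recent recognition rate drops below a threshold."""

    def __init__(self, window=5, min_rate=0.6, step_db=6.0,
                 gain_db=0.0, max_gain_db=24.0):
        self.results = deque(maxlen=window)   # True for each successful recognition
        self.min_rate = min_rate
        self.step_db = step_db
        self.gain_db = gain_db
        self.max_gain_db = max_gain_db

    def rate(self):
        """Fraction of successful recognitions in the current window."""
        if not self.results:
            return 1.0
        return sum(self.results) / len(self.results)

    def report(self, recognized):
        """Record one recognition result and return the gain to use next."""
        self.results.append(bool(recognized))
        window_full = len(self.results) == self.results.maxlen
        if window_full and self.rate() < self.min_rate:
            self.gain_db = min(self.max_gain_db, self.gain_db + self.step_db)
            self.results.clear()              # measure the new setting afresh
        return self.gain_db

control = RateBasedGainControl(window=4, min_rate=0.5)
for recognized in (True, False, False, False):
    gain_db = control.report(recognized)
print(gain_db)
```

Clearing the window after each change is a design choice in this sketch: it prevents results gathered at the old gain from triggering a second adjustment before the new setting has been evaluated.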

Priority Applications (1)

Application Number Priority Date Filing Date Title
JP5129454A JP2975808B2 (en) 1993-05-31 1993-05-31 Voice recognition device

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
JP5129454A JP2975808B2 (en) 1993-05-31 1993-05-31 Voice recognition device

Publications (2)

Publication Number Publication Date
JPH06337697A true JPH06337697A (en) 1994-12-06
JP2975808B2 JP2975808B2 (en) 1999-11-10

Family

ID=15009891

Family Applications (1)

Application Number Title Priority Date Filing Date
JP5129454A Expired - Fee Related JP2975808B2 (en) 1993-05-31 1993-05-31 Voice recognition device

Country Status (1)

Country Link
JP (1) JP2975808B2 (en)

Cited By (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
JP2002091487A (en) * 2000-07-10 2002-03-27 Matsushita Electric Ind Co Ltd Device, method and program for voice recognition
EP1300832A1 (en) * 2000-07-10 2003-04-09 Matsushita Electric Industrial Co., Ltd. Speech recognizer, method for recognizing speech and speech recognition program
EP1300832A4 (en) * 2000-07-10 2005-07-20 Matsushita Electric Ind Co Ltd Speech recognizer, method for recognizing speech and speech recognition program

Also Published As

Publication number Publication date
JP2975808B2 (en) 1999-11-10

Similar Documents

Publication Publication Date Title
US11107493B2 (en) Sound event detection
JPS6184694A (en) Dictionary learning system for voice recognition
JPH0361959B2 (en)
KR102191736B1 (en) Method and apparatus for speech enhancement with artificial neural network
US11367457B2 (en) Method for detecting ambient noise to change the playing voice frequency and sound playing device thereof
DE60223945D1 (en) LANGUAGE RECOGNITION AND DISCRIMINATION DEVICE AND METHOD
US10964307B2 (en) Method for adjusting voice frequency and sound playing device thereof
KR102217292B1 (en) Method, apparatus and computer-readable recording medium for improving a set of at least one semantic units by using phonetic sound
JPH06337697A (en) Voice recognition device
JP3555490B2 (en) Voice conversion system
JPS6332394B2 (en)
KR100587260B1 (en) speech recognizing system of sound apparatus
WO1994002936A1 (en) Voice recognition apparatus and method
JPH04324499A (en) Speech recognition device
JP2989231B2 (en) Voice recognition device
JPS5914769B2 (en) audio equipment
JP2017068153A (en) Semiconductor device, system, electronic apparatus, and voice recognition method
JPH0556519B2 (en)
KR20000047295A (en) Voice signal processing method and apparatus for processing voice signal
JPS6126678B2 (en)
JP2000155600A (en) Speech recognition system and input voice level alarming method
JPS6367400B2 (en)
JP2599974B2 (en) Voice detection method
JPH06324696A (en) Device and method for speech recognition
JPH01236000A (en) Voice recognizing device

Legal Events

Date Code Title Description
FPAY Renewal fee payment (event date is renewal date of database)

Free format text: PAYMENT UNTIL: 20080903

Year of fee payment: 9

LAPS Cancellation because of no payment of annual fees