JPH04181998A - Device and method for speech recognition

Info

Publication number: JPH04181998A
Application number: JP2311825A
Authority: JP
Inventors: Yoshihiro Akai (赤井 善裕); Kazuo Ishii (石井 和夫)
Original assignee: NEC Engineering Ltd
Current assignee: NEC Engineering Ltd
Priority date: 1990-11-16
Filing date: 1990-11-16
Publication date: 1992-06-29

Abstract

PURPOSE: To allow the user to confirm audibly whether the beginning or end of a spoken word has been cut off, or whether ambient noise has been mixed into it, in a device that recognizes speech by comparing numeric data on an input speech signal with stored numeric data on standard patterns. CONSTITUTION: A voice input unit 11 amplifies the input speech signal and converts it into numeric data with an A/D converter or the like, and a voice output unit 12 converts the standard patterns stored in a storage unit 14 into a speech signal with a D/A converter or the like and outputs it. A control recognition unit 13 controls the respective functions and compares the numeric data from the voice input unit 11 with the standard patterns stored in the storage unit 14 to perform recognition. When a standard pattern is registered, the control recognition unit 13, on a command from an operation command unit 15 and according to a program in the storage unit 14, stores the speech signal digitized by the voice input unit 11 in the storage unit 14 as the standard pattern, and the voice output unit 12 converts that standard pattern into a speech signal. Consequently, truncation of the beginning or end of the registered standard pattern, the mixing-in of ambient noise, and the like can be confirmed by ear.

Description

DETAILED DESCRIPTION OF THE INVENTION

[Field of industrial application]

The present invention relates to a speech recognition device and method, and in particular to a speaker-specific speech recognition device and method.

[Conventional technology]

In a conventional speaker-specific speech recognition device, registration of a standard pattern consists only of the following: in accordance with a command from the operation command unit 24 and the program in the storage unit 23 shown in FIG. 2, the control command unit 22 stores in the storage unit 23 the speech signal that the voice input unit 21 has converted into numerical data.

[Problem to be solved by the invention]

With the standard pattern registration method of the conventional speech recognition device described above, there is no way of knowing how the input speech was actually stored in the device. The quality of a standard pattern, which directly affects the recognition rate, can therefore be judged only from the circumstances in which the word was uttered and from the individual's own experience.

In other words, the conventional method has the drawback that it cannot be confirmed whether the beginning or end of a spoken word has been cut off, or whether ambient noise has been mixed into the spoken word.

[Means to solve the problem]

The speech recognition device of the present invention comprises:
(a) means for converting an input speech signal into numerical data;
(b) means for storing this numerical data as a standard pattern;
(c) means for converting the numerical data of a stored standard pattern into a speech signal;
(d) means for comparing the numerical data of the input speech signal with the numerical data of the stored standard pattern and performing recognition; and
(e) means for outputting the recognition result to the outside of the device.

[Embodiment]

The present invention will be described below with reference to the drawings. FIG. 1 is a block diagram showing one embodiment of the present invention.

The speech recognition device shown in FIG. 1 comprises a voice input unit 11, a voice output unit 12, a control recognition unit 13, a storage unit 14, and an operation command unit 15.

The voice input unit 11 amplifies a speech signal input from a microphone, a wireless input device, or the like (not shown) and converts it into numerical data by means of an A/D converter or the like.
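The amplify-then-digitize step performed by the voice input unit 11 can be sketched roughly as follows. This is only an illustration: the patent does not specify a gain, sampling rate, or bit depth, so `gain=5.0`, `SAMPLE_RATE = 8000`, the 8-bit depth, and the function names `amplify` and `a_d_convert` are all assumptions made for the example.

```python
import math

def amplify(samples, gain):
    """Scale an analog-style signal (a list of floats) by a fixed gain."""
    return [s * gain for s in samples]

def a_d_convert(samples, bits=8):
    """Quantize floats in [-1.0, 1.0] to signed integer codes.

    Values outside the range are clipped, much as a real converter saturates.
    """
    max_code = 2 ** (bits - 1) - 1          # 127 for 8 bits
    codes = []
    for s in samples:
        clipped = max(-1.0, min(1.0, s))
        codes.append(round(clipped * max_code))
    return codes

# A weak 440 Hz tone stands in for the microphone signal (assumed values).
SAMPLE_RATE = 8000
analog = [0.1 * math.sin(2 * math.pi * 440 * n / SAMPLE_RATE) for n in range(80)]

# "Numerical data" in the sense of the description: a list of integer codes.
numeric_data = a_d_convert(amplify(analog, gain=5.0), bits=8)
```

The resulting list of integers is the kind of numerical data that the rest of the device stores, compares, and converts back into sound.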

The voice output unit 12 converts the standard pattern stored in the storage unit 14 into a speech signal by means of a D/A converter or the like and outputs it to an earphone, a speaker, and a wireless output device (not shown).

The control recognition unit 13 controls each function, and performs recognition processing by comparing the numerical data input through the voice input unit 11 with the standard patterns stored in the storage unit 14.
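The description does not say how the comparison is carried out. As one possible illustration, the sketch below picks the stored pattern with the smallest dynamic time warping (DTW) distance to the input. DTW is a technique swapped in here for the example, not the method stated in the patent, and the names `dtw_distance`, `recognize`, and the sample patterns are hypothetical.

```python
def dtw_distance(a, b):
    """Dynamic time warping distance between two numeric sequences."""
    inf = float("inf")
    # cost[i][j] is the best cumulative cost of aligning a[:i] with b[:j].
    cost = [[inf] * (len(b) + 1) for _ in range(len(a) + 1)]
    cost[0][0] = 0.0
    for i in range(1, len(a) + 1):
        for j in range(1, len(b) + 1):
            d = abs(a[i - 1] - b[j - 1])
            cost[i][j] = d + min(cost[i - 1][j],      # stretch b
                                 cost[i][j - 1],      # stretch a
                                 cost[i - 1][j - 1])  # advance both
    return cost[len(a)][len(b)]

def recognize(input_data, standard_patterns):
    """Return the label of the stored standard pattern closest to the input."""
    return min(standard_patterns,
               key=lambda label: dtw_distance(input_data, standard_patterns[label]))

# Hypothetical standard patterns, as they might sit in the storage unit.
standard_patterns = {
    "start": [0, 10, 40, 90, 40, 10, 0],
    "stop": [0, 80, 80, 20, 20, 80, 80, 0],
}
utterance = [0, 12, 38, 38, 88, 41, 9, 0]   # a slightly slower "start"
result = recognize(utterance, standard_patterns)
```

Whatever comparison is actually used, the result depends directly on what was stored at registration time: a pattern with a clipped beginning or with noise mixed in shifts every distance computed against it.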

The storage unit 14 stores programs, data, and standard patterns.

The operation command unit 15 communicates with externally connected equipment such as a keyboard and display or a host computer.

When a standard pattern is registered, in accordance with a command from the operation command unit 15 and the program in the storage unit 14, the control recognition unit 13 stores the speech signal converted into numerical data by the voice input unit 11 in the storage unit 14 as a standard pattern, and the voice output unit 12 then converts that standard pattern back into a speech signal.
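The registration flow can be sketched as a small class whose methods mirror the units of FIG. 1. This is a minimal sketch under stated assumptions: the class name `SpeechRecognizerSketch`, its method names, the 8-bit quantization, and the use of plain Python lists in place of real audio hardware are all invented for illustration.

```python
class SpeechRecognizerSketch:
    """Toy model of the registration path: input -> storage -> playback."""

    def __init__(self, bits=8):
        self.max_code = 2 ** (bits - 1) - 1
        self.storage = {}   # stands in for storage unit 14: label -> numeric data

    def voice_input(self, analog):
        """Stand-in for voice input unit 11: clip and quantize (A/D)."""
        return [round(max(-1.0, min(1.0, s)) * self.max_code) for s in analog]

    def voice_output(self, numeric_data):
        """Stand-in for voice output unit 12: codes back to a signal (D/A)."""
        return [code / self.max_code for code in numeric_data]

    def register(self, label, analog):
        """Stand-in for control recognition unit 13 during registration.

        Stores the digitized utterance, then returns the playback signal
        reconstructed from what was actually stored.
        """
        self.storage[label] = self.voice_input(analog)
        return self.voice_output(self.storage[label])

device = SpeechRecognizerSketch()
spoken = [0.0, 0.3, 0.6, 0.3, 0.0]          # hypothetical utterance
playback = device.register("start", spoken)
```

The point of the design is that `playback` is rebuilt from the stored data rather than from the live input, so what the user hears is exactly what later recognition will be compared against. In a real device the returned signal would drive the earphone or speaker.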

The embodiment described above concerns the function used when registering a standard pattern, but speech input during the recognition operation may also be output as speech using this function.

Furthermore, the speech output function for standard patterns may also serve as a voice response function for confirming recognition results.

[Effect of the invention]

As explained above, the present invention allows the user to confirm by ear whether a registered standard pattern is free of truncation at the beginning or end of the word, mixed-in ambient noise, and the like. This makes it easier to create correct standard patterns and has the effect of improving the recognition rate of a speaker-specific speech recognition device.

[Brief explanation of the drawing]

FIG. 1 is a block diagram showing an embodiment of the present invention, and FIG. 2 is a block diagram showing a conventional example.

11: voice input unit
12: voice output unit
13: control recognition unit
14: storage unit
15: operation command unit
21: voice input unit
22: control recognition unit
23: storage unit
24: operation command unit

Claims

1. A speaker-specific speech recognition device in which words to be recognized (hereinafter referred to as standard patterns) are registered in advance in the voice of the device user, the speech recognition device comprising: means for converting an input speech signal into numerical data; means for storing this numerical data as a standard pattern; means for converting the numerical data of a stored standard pattern into a speech signal; means for comparing the numerical data of the input speech signal with the numerical data of the stored standard pattern and performing recognition; and means for outputting the recognition result to the outside of the device.

2. A speaker-specific speech recognition method in which words to be recognized (hereinafter referred to as standard patterns) are registered in advance in the voice of the device user, the speech recognition method comprising: a step of converting an input speech signal into numerical data; a step of storing this numerical data as a standard pattern; a step of converting the numerical data of a stored standard pattern into a speech signal; a step of comparing the numerical data of the input speech signal with the numerical data of the stored standard pattern and performing recognition; and a step of outputting the recognition result to the outside of the device.