JPH06102894A - Speech recognizing device

Info

Publication number: JPH06102894A
Application number: JP4253194A
Authority: JP
Inventors: Hitoshi Anzai; 仁安斎; Jun Tsunoda; 潤角田
Original assignee: Toshiba Corp; Toshiba Information and Control Systems Corp
Current assignee: Toshiba Corp; Toshiba Information and Control Systems Corp
Priority date: 1992-09-22
Filing date: 1992-09-22
Publication date: 1994-04-15

Abstract

PURPOSE: To provide a speech recognizing device that reliably records only the speech pattern, allows the speech pattern stored as a feature pattern to be confirmed by reproducing the recorded speech pattern, and thereby improves the recognition rate. CONSTITUTION: A speech section detection part 13 detects the speech section of a speech pattern supplied from a speech analytic part 12 on the basis of a low-level threshold value L and a high-level threshold value H, and generates a sound-recording request signal and a sound-recording interruption signal. A feature pattern storage part 14 stores the speech pattern corresponding to the detected speech section. A sound recording control part 16 writes the speech pattern to a speech pattern storage part 17 in response to the sound-recording request signal output from the speech section detection part 13, and stops writing in response to the sound-recording interruption signal.

Description

Detailed Description of the Invention

[0001]

[Field of Industrial Application] The present invention relates to a voice recognition device that includes, for example, a voice recording/reproducing device for recording an input voice signal, and that allows a voice pattern registered as a characteristic pattern to be confirmed by reproducing the voice recorded by that device.

[0002]

[Prior Art] As is well known, a voice recognition device converts an input voice signal into a digital voice pattern, detects a voice section from this voice pattern, and stores the voice pattern corresponding to the voice section in a characteristic pattern storage unit as a characteristic pattern. In such a device, the accuracy of voice section detection greatly affects the recognition rate. Therefore, if the voice pattern stored in the characteristic pattern storage unit can be confirmed, failures in voice pattern registration can be reduced and the recognition rate can be improved.

[0003] Accordingly, devices have been developed that combine a voice recognition device with a voice recording/reproducing device. The voice pattern corresponding to the voice section is stored in the characteristic pattern storage unit of the voice recognition device, a voice pattern corresponding to the stored one is recorded in the voice recording/reproducing device, and the recorded voice pattern is reproduced to confirm the voice pattern stored in the characteristic pattern storage unit. FIG. 4 shows a conventional voice recognition device combined with a voice recording/reproducing device.

[0004] In FIG. 4, a voice input unit 31, formed of a microphone for example, converts voice into an electric signal and outputs it. The output end of the voice input unit 31 is connected to a voice analysis unit 32. The voice analysis unit 32 has an A/D converter (not shown) and converts the voice signal supplied from the voice input unit 31 into a digital signal to create a voice pattern. The output end of the voice analysis unit 32 is connected to a voice section detection unit 33. The voice section detection unit 33 uses, for example, two thresholds to detect, from the voice pattern supplied from the voice analysis unit 32, a voice section from which noise has been excluded. A characteristic pattern storage unit 34 and a recognition calculation unit 35 are connected to the voice section detection unit 33. The characteristic pattern storage unit 34 stores the voice pattern corresponding to the voice section detected by the voice section detection unit 33. The recognition calculation unit 35 calculates the similarity between the voice pattern output from the voice section detection unit 33 and the voice pattern stored in the characteristic pattern storage unit 34.

[0005] Meanwhile, in the voice recording/reproducing device 36, the input end of a voice recording control unit 37 is connected to the output end of the voice input unit 31, and its output end is connected to a voice pattern storage unit 38 and a voice reproduction unit 39. The voice recording control unit 37 has an A/D converter (not shown) and a so-called voice trigger mechanism. It converts the voice signal supplied from the voice input unit 31 into a digital signal with the A/D converter to generate a voice pattern, and it detects the beginning (start end) of the voice signal with the voice trigger mechanism. When the beginning of the voice signal is detected, the subsequent voice pattern is output to the voice pattern storage unit 38. The voice pattern storage unit 38 is formed of, for example, a RAM (Random Access Memory) and records the voice pattern supplied from the voice recording control unit 37. The voice reproduction unit 39 has a D/A converter (not shown); under the control of the voice recording control unit 37, it converts the voice pattern read from the voice pattern storage unit 38 into an analog signal and outputs it as voice.

[0006]

[Problems to be Solved by the Invention] As described above, the voice recording control unit 37 detects the beginning of the voice pattern with the voice trigger mechanism and automatically starts recording the voice pattern. This voice trigger mechanism has a low-level threshold (L) and a high-level threshold (H) and uses them to determine the voice section, but it cannot detect the voice section as accurately as the voice section detection unit 33.

[0007] FIG. 5(a) shows a case in which the initial part of the voice section does not exceed the low-level threshold (L), so no recording request signal is generated for that part. In this case, the beginning of the voice is not recorded. FIG. 5(b) shows a case in which noise precedes the voice section. Because this noise exceeds the low-level threshold (L) and then the high-level threshold (H), a recording request signal is generated for the noise and only the noise is recorded.

[0008] Thus, a voice pattern recorded using the conventional voice recording control unit 37 may lack its beginning or may contain noise, so the voice section cannot be recorded accurately.

[0009] Moreover, conventionally the voice section detection unit 33 and the voice recording control unit 37 detect voice patterns separately and store them separately in the characteristic pattern storage unit 34 and the voice pattern storage unit 38. Consequently, the voice section of the voice pattern stored in the characteristic pattern storage unit 34 does not coincide with that of the voice pattern stored in the voice pattern storage unit 38, and the voice pattern stored in the characteristic pattern storage unit cannot be confirmed accurately.

[0010] The present invention has been made to solve the above problems. Its object is to provide a voice recognition device that can reliably record only the voice pattern, without losing the beginning of the voice pattern or recording only noise; that makes the voice section of the recorded voice pattern coincide with that of the characteristic pattern; and that, by reproducing the recorded voice pattern, allows the voice pattern stored as the characteristic pattern to be reliably confirmed, thereby improving the recognition rate.

[0011]

[Means for Solving the Problems] To solve the above problems, the present invention comprises: a voice analysis unit that generates a voice pattern consisting of a digital signal corresponding to input voice; a voice section detection unit in which a low-level threshold and a high-level threshold are set, which outputs a recording request signal, referenced to the time at which the low-level threshold was exceeded, when the voice pattern generated by the voice analysis unit exceeds the high-level threshold within a first time after exceeding the low-level threshold, which outputs a recording interruption signal when the voice pattern, after exceeding the high-level threshold, falls to or below the low-level threshold within a second time longer than the first time, and which detects as the voice section the section up to the point at which the voice pattern, after exceeding the high-level threshold, falls to or below the low-level threshold after the second time has elapsed; storage means that stores the voice pattern corresponding to the detected voice section; a calculation unit that calculates the similarity between the voice pattern stored in the storage means and the voice pattern output from the voice section detection unit; recording means that records the voice pattern in response to the recording request signal output from the voice section detection unit and stops recording the voice pattern in response to the recording interruption signal; and reproducing means that reproduces the voice pattern recorded by the recording means.

[0012] The recording means comprises: a storage unit that stores the voice pattern; a writing circuit that writes the voice pattern to the storage unit in response to the recording request signal output from the voice section detection unit and stops writing the voice pattern in response to the recording interruption signal; a reading circuit that reads the voice pattern stored in the storage unit; and reproducing means that converts the read voice pattern into voice and outputs it.

[0013]

[Operation] In the present invention, the voice section detection unit outputs a recording request signal, referenced to the time at which the low-level threshold was exceeded, when the voice pattern supplied from the voice analysis unit exceeds the high-level threshold within a first time after exceeding the low-level threshold. It outputs a recording interruption signal when the voice pattern, after exceeding the high-level threshold, falls to or below the low-level threshold within a second time longer than the first time. Further, when the voice pattern exceeds the high-level threshold within the first time after exceeding the low-level threshold and then falls to or below the low-level threshold after the second time has elapsed, the voice section detection unit detects the section up to that point as the voice section, and the storage means stores the voice pattern corresponding to the detected voice section as a characteristic pattern. The recording control means records the voice pattern in response to the recording request signal output from the voice section detection unit; if a recording interruption signal is output from the voice section detection unit during recording, it regards the voice pattern recorded so far as noise and stops recording. Therefore, only the voice pattern can be reliably recorded, without losing the beginning of the voice pattern or recording only noise. Moreover, because the voice section of the voice pattern stored in the storage means coincides with the voice section of the voice pattern recorded by the recording control means, the voice pattern stored in the storage means can be confirmed by reproducing the recorded voice pattern.

[0014]

[Embodiment] An embodiment of the present invention will be described below with reference to the drawings.

[0015] A voice input unit 11, formed of a microphone for example, converts voice into an electric signal and outputs it. The output end of the voice input unit 11 is connected to a voice analysis unit 12. The voice analysis unit 12 has an A/D converter 12a and converts the voice signal supplied from the voice input unit 11 into a digital signal to create a voice pattern. The output end of the voice analysis unit 12 is connected to a voice section detection unit 13. The voice section detection unit 13 is provided with a microprocessor 13a and a memory 13b. The memory 13b stores in advance the program required for the operation of the microprocessor 13a and two thresholds, namely a low-level threshold L and a high-level threshold H, which are set on the basis of the ambient noise picked up by the voice input unit 11. The microprocessor 13a detects a voice section from the voice pattern supplied from the voice analysis unit 12 by the operation described later, and generates a recording request signal and a recording interruption signal.
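The text states only that the thresholds L and H are set from the ambient noise picked up by the voice input unit 11; it gives no rule for doing so. As a purely illustrative sketch, the Python below assumes that each threshold is a fixed multiple of the mean noise level. The function name `thresholds_from_noise` and the factors 2.0 and 4.0 are hypothetical and are not taken from the patent.

```python
from statistics import fmean


def thresholds_from_noise(noise_levels, low_factor=2.0, high_factor=4.0):
    """Return (L, H) derived from a list of ambient-noise levels.

    Assumption (not from the patent): L and H are fixed multiples of the
    mean noise level, with high_factor > low_factor so that H > L.
    """
    if not noise_levels:
        raise ValueError("need at least one noise measurement")
    if high_factor <= low_factor:
        raise ValueError("high_factor must exceed low_factor")
    floor = fmean(noise_levels)  # mean ambient-noise level
    return low_factor * floor, high_factor * floor


# Example: three noise measurements with a mean level of 2.0
low, high = thresholds_from_noise([1.0, 2.0, 3.0])
```

Any rule that keeps H above L and both above the noise floor would serve the same illustrative purpose.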

[0016] A characteristic pattern storage unit 14, a recognition calculation unit 15, and a recording control unit 16 are connected to the voice section detection unit 13. The characteristic pattern storage unit 14 stores the voice pattern corresponding to the voice section detected by the voice section detection unit 13 and is formed of, for example, a RAM. The recognition calculation unit 15 calculates the similarity between the voice pattern output from the voice section detection unit 13 and the voice pattern stored in the characteristic pattern storage unit 14. The input end of the recording control unit 16 is connected to the output end of the voice input unit 11, and its output end is connected to a voice pattern storage unit 17 and a voice reproduction unit 18.

[0017] The recording control unit 16 is provided with an A/D converter 16a, a writing circuit 16b, and a reading circuit 16c. The A/D converter 16a converts the voice signal supplied from the voice input unit 11 into a digital signal to generate a voice pattern. The writing circuit 16b writes the voice pattern output from the A/D converter 16a to the voice pattern storage unit 17 in response to the recording request signal supplied from the voice section detection unit 13, and stops writing the voice pattern in response to the recording interruption signal. The reading circuit 16c sequentially reads the voice pattern stored in the voice pattern storage unit 17 in response to a read signal supplied from outside and supplies it to the voice reproduction unit 18. The voice pattern storage unit 17 is formed of, for example, a RAM and stores the voice pattern supplied from the recording control unit 16. The voice reproduction unit 18 has a D/A converter 18a and the like; under the control of the recording control unit 16, it converts the voice pattern supplied from the voice pattern storage unit 17 into an analog signal and outputs it as voice. Next, the operation of the microprocessor 13a constituting the voice section detection unit 13 will be described further with reference to FIGS. 2 and 3.

[0018] First, the microprocessor 13a sequentially takes the voice pattern output from the voice analysis unit 12 into the characteristic pattern storage unit 14 and determines whether the voice pattern has exceeded the low-level threshold L (ST1, ST2). If it has, the microprocessor determines whether the high-level threshold H is exceeded within a predetermined short time, for example 20 ms (ST3). If H is exceeded within that time, the point at which L was exceeded is regarded as the beginning (start end) of the voice pattern, and a recording request signal is output with that point as the reference (ST4).

[0019] Noise, however, characteristically exceeds the high-level threshold H only for a short time. To remove noise, the microprocessor therefore determines whether the voice pattern falls to or below the low-level threshold L within a predetermined time longer than the short time above, for example 1 s, after the beginning of the voice is detected (ST5, ST6). If it does, the voice pattern during this period is regarded as noise and a recording interruption signal is output (ST7).

[0020] If, on the other hand, it is determined in step ST5 that the predetermined time has elapsed after output of the recording request signal, the microprocessor determines whether the voice pattern is at or below the low-level threshold L (ST8). If it is, the voice section is regarded as ended and the recording request signal is stopped (ST9). That is, the recording request signal is output in correspondence with the voice section, and the characteristic pattern storage unit 14 stores the voice pattern in the range regarded as the voice section by the voice section detection unit 13.
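The flow ST1 to ST9 described in paragraphs [0018] to [0020] can be sketched as a small state machine. The Python below is a minimal illustration under stated assumptions, not the patented implementation: the voice pattern is modeled as one level per frame with a hypothetical 10 ms frame period, the 1 s window is measured from the frame at which H was exceeded, and a rise that fails to reach H within 20 ms simply returns to waiting. The names `detect_events` and `voice_sections` are hypothetical.

```python
def detect_events(levels, low, high, frame_ms=10, t1_ms=20, t2_ms=1000):
    """Return a list of (signal, frame_index) tuples for a level sequence.

    Signals: "request" (ST4), "interrupt" (ST7), "stop" (ST9).
    Frame period and timer reference points are assumptions.
    """
    events = []
    state = "idle"
    start = high_at = 0
    for i, level in enumerate(levels):
        if state == "idle" and level > low:          # ST1, ST2: L exceeded
            state, start = "onset", i
        if state == "onset":                         # ST3: H within first time?
            if (i - start) * frame_ms > t1_ms:
                state = "idle"                       # too slow: wait again
            elif level > high:
                events.append(("request", start))    # ST4: referenced to L crossing
                state, high_at = "verify", i
            continue
        if state == "verify":                        # ST5: second time elapsed?
            if (i - high_at) * frame_ms < t2_ms:
                if level <= low:                     # ST6: dropped too early
                    events.append(("interrupt", i))  # ST7: treated as noise
                    state = "idle"
                continue
            state = "speech"
        if state == "speech" and level <= low:       # ST8: end of voice section
            events.append(("stop", i))               # ST9: request signal stops
            state = "idle"
    return events


def voice_sections(events):
    """Pair each uninterrupted "request" with its "stop" as (start, end)."""
    sections, start = [], None
    for name, frame in events:
        if name == "request":
            start = frame
        elif name == "interrupt":
            start = None
        elif name == "stop" and start is not None:
            sections.append((start, frame))
            start = None
    return sections


# A short burst (noise) followed by silence, and a long utterance.
noise_events = detect_events([0, 0, 2, 6, 6, 0, 0], low=1, high=5)
speech_events = detect_events([0, 2, 6] + [6] * 120 + [0], low=1, high=5)
```

In this sketch the burst yields a request followed by an interrupt and therefore no voice section, while the long utterance yields one section that starts at the frame where L was first exceeded.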

[0021] When a recording request signal is supplied from the voice section detection unit 13, the writing circuit 16b of the recording control unit 16 sequentially writes the voice pattern output from the A/D converter 16a to the voice pattern storage unit 17; that is, it records the voice pattern. If a recording interruption signal is output from the voice section detection unit 13 during writing, the writing circuit 16b stops writing and clears the write address. When the recording interruption signal subsequently stops, the writing circuit 16b again sequentially writes the voice pattern output from the A/D converter 16a to the voice pattern storage unit 17 in response to the recording request signal. When the recording request signal output from the voice section detection unit 13 then stops, the writing circuit 16b stops writing the voice pattern.
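The behavior of the writing circuit in paragraph [0021] can be illustrated with a small model. This is a sketch under assumptions, not the circuit itself: the voice pattern storage unit 17 is modeled as a fixed-size Python list, the write address as an integer index, and samples arriving when the memory is full are silently dropped. The class name `WriteCircuit` and its method names are hypothetical.

```python
class WriteCircuit:
    """Toy model of a writing circuit driven by request/interrupt/stop signals."""

    def __init__(self, capacity=16):
        self.memory = [0] * capacity  # stands in for the voice pattern storage
        self.address = 0              # next write address
        self.recording = False

    def on_request(self):
        """Recording request signal asserted: start (or resume) writing."""
        self.recording = True

    def on_interrupt(self):
        """Recording interruption signal: stop and clear the write address,
        so data written so far (regarded as noise) will be overwritten."""
        self.recording = False
        self.address = 0

    def on_stop(self):
        """Recording request signal stopped: keep what was written."""
        self.recording = False

    def write(self, sample):
        """Write one sample if recording and space remains (assumption)."""
        if self.recording and self.address < len(self.memory):
            self.memory[self.address] = sample
            self.address += 1

    def recorded(self):
        """Return the valid part of the memory, up to the write address."""
        return self.memory[:self.address]


wc = WriteCircuit(capacity=8)
wc.on_request()
wc.write(1)
wc.write(2)
wc.on_interrupt()            # burst judged to be noise
after_noise = wc.recorded()
wc.on_request()
for s in (7, 8, 9):
    wc.write(s)
wc.on_stop()
wc.write(10)                 # ignored: recording has stopped
after_speech = wc.recorded()
```

Clearing the address rather than erasing the memory mirrors the text: the next recording simply overwrites the discarded samples from the start.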

[0022] When a read signal is supplied to the recording control unit 16 from outside, the reading circuit 16c sequentially reads the voice pattern stored in the voice pattern storage unit 17 and supplies it to the voice reproduction unit 18. In the voice reproduction unit 18, the supplied voice pattern is converted into an analog signal by the D/A converter 18a and output as voice.

[0023] According to the above embodiment, the voice section detection unit 13 outputs a recording request signal or a recording interruption signal according to the level of the voice pattern, and the recording control unit 16 stores the voice pattern in the voice pattern storage unit 17, or stops storing it, on the basis of the recording request signal or recording interruption signal output from the voice section detection unit 13. Therefore, the voice pattern storage unit 17 never stores only noise, and never stores a voice pattern whose beginning is missing.

[0024] Moreover, the voice pattern stored in the voice pattern storage unit 17 and the voice pattern stored in the characteristic pattern storage unit 14 have the same voice section. Therefore, the voice pattern stored in the characteristic pattern storage unit 14 can be confirmed by reproducing the voice pattern stored in the voice pattern storage unit 17.

[0025] In the above embodiment, the recording control unit 16 has the A/D converter 16a, converts the voice signal supplied from the voice input unit 11 into a digital signal, and writes it to the voice pattern storage unit 17 with the writing circuit 16b. The invention is not limited to this: the recording control unit may omit the A/D converter, and the voice pattern supplied from the voice analysis unit 12 may instead be written to the voice pattern storage unit 17 by the writing circuit 16b. In addition, the present invention is not limited to the above embodiment, and various modifications can of course be made without departing from the gist of the invention.

[0026]

[Effects of the Invention] As described in detail above, according to the present invention, the voice section detection unit outputs a recording request signal or a recording interruption signal according to the voice pattern, the low-level threshold, and the high-level threshold, and detects the voice section in correspondence with these signals; the recording control unit records the voice pattern, or stops recording, on the basis of the recording request signal or recording interruption signal. Therefore, only the voice pattern can be reliably recorded without losing its beginning, and the recorded voice pattern can be made to coincide with the characteristic pattern of the voice stored in the characteristic pattern storage unit. By reproducing the recorded voice, the voice pattern stored in the characteristic pattern storage unit can be confirmed, and a voice recognition device capable of improving the recognition rate can be provided.

[Brief description of drawings]

FIG. 1 is a configuration diagram showing an embodiment of the device of the present invention.

FIG. 2 is a flowchart for explaining the operation of FIG. 1.

FIG. 3 is a waveform diagram for explaining the operation of FIG. 1.

FIG. 4 is a configuration diagram showing a conventional voice recognition device.

FIG. 5 is a waveform diagram for explaining the operation of FIG. 4.

[Explanation of symbols]

11 ... voice input unit, 12 ... voice analysis unit, 13 ... voice section detection unit, 14 ... characteristic pattern storage unit, 15 ... recognition calculation unit, 16 ... recording control unit, 17 ... voice pattern storage unit, 18 ... voice reproduction unit.

Claims

[Claims]

1. A voice recognition device comprising: a voice analysis unit that generates a voice pattern consisting of a digital signal corresponding to input voice; a voice section detection unit in which a low-level threshold and a high-level threshold are set, which outputs a recording request signal, corresponding to the time at which the low-level threshold was exceeded, when the voice pattern generated by the voice analysis unit exceeds the high-level threshold within a first time after exceeding the low-level threshold, which outputs a recording interruption signal when the voice pattern, after exceeding the high-level threshold, falls to or below the low-level threshold within a second time longer than the first time, and which detects as a voice section the section up to the point at which the voice pattern, after exceeding the high-level threshold, falls to or below the low-level threshold after the second time has elapsed; storage means that stores the voice pattern corresponding to the detected voice section as a characteristic pattern; a calculation unit that calculates the similarity between the voice pattern stored in the storage means and the voice pattern output from the voice section detection unit; recording means that records the voice pattern in response to the recording request signal output from the voice section detection unit and stops recording the voice pattern in response to the recording interruption signal; and reproducing means that reproduces the voice pattern recorded by the recording means.

2. The voice recognition device according to claim 1, wherein the recording means comprises: a storage unit that stores the voice pattern; a writing circuit that writes the voice pattern to the storage unit in response to the recording request signal output from the voice section detection unit and stops writing the voice pattern in response to the recording interruption signal; a reading circuit that reads the voice pattern stored in the storage unit; and reproducing means that converts the read voice pattern into voice and outputs it.