JPH02103599A - Voice recognizing device - Google Patents

Voice recognizing device

Info

Publication number
JPH02103599A
Authority
JP
Japan
Prior art keywords
voice
section
response
unit
recognition
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
JP63258266A
Other languages
Japanese (ja)
Inventor
Shoji Kuriki
章次 栗木
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Ricoh Co Ltd
Original Assignee
Ricoh Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
1988-10-13
Filing date
1988-10-13
Publication date
1990-04-16
Application filed by Ricoh Co Ltd
Priority to JP63258266A
Publication of JPH02103599A
Legal status: Pending

Abstract

PURPOSE: To stabilize the user's utterance and improve the recognition rate by stopping the output of a voice response unit when a voice to be recognized is detected by a voice section detection unit during a voice response. CONSTITUTION: Voice picked up by a microphone 1 is input to a feature extraction unit 2 and a voice section detection unit 3. The feature extraction unit 2 extracts feature quantities from the input voice, and the voice section detection unit 3 detects the voice section in it. A recognition unit 4 compares the feature quantities within the voice section with a voice recognition dictionary 5 and takes the most similar dictionary word as the correct answer. Meanwhile, a voice response unit 8 outputs voice from the voice response data on command from a response control unit 6. The device is configured to stop the output of the voice response unit 8 when a voice to be recognized is detected by the voice section detection unit 3 during the voice response. A high recognition rate is thus obtained even if the user starts speaking while the response voice is being output.

Description

[Detailed Description of the Invention]

Technical Field

The present invention relates to a voice recognition device.

Prior Art

Voice response is generally used as the means by which a user confirms the recognition result of a voice recognition device. Voice response is also used to provide guidance in systems built on a recognition device. It is used as follows. For guidance, the user speaks as instructed after the voice response unit has finished its output. For a voice response that reports a recognition result, the user confirms the output of the voice response unit and then makes the next utterance.

As users grow accustomed to the system, however, they begin speaking without listening to the output of the voice response unit to the end. If they listened to the whole output, they would have to wait for the response voice to finish, which reduces the number of utterances they can make and slows data input by voice. In this case recognition can still begin, because the recognition unit operates independently of the voice response unit.

The user, however, ends up speaking while still hearing the output of the voice response unit. Because the voice response is delivered through a headset, a handset, or the like, it is output close to the user's ear. In general, both the volume and the manner of a speaker's voice differ between a quiet place and a noisy place. Speaking while listening to the output of the voice response unit is much like speaking in noisy surroundings: the manner of speaking becomes unstable and the speaking volume rises, so the recognition rate deteriorates. In addition, users find it unpleasant to have to hear the output of the voice response unit while they themselves are speaking.

Object

The present invention has been made in view of the circumstances described above. In particular, its object is to provide a voice recognition device that achieves a high recognition rate even if the user starts speaking while a response voice is being output.

Configuration

To achieve the above object, the present invention provides a voice recognition device comprising a microphone that picks up voice, a feature extraction unit that extracts feature quantities of the voice, voice section detection means that detects a voice section, a voice recognition dictionary, a recognition unit that compares the input voice with the voice recognition dictionary and outputs the most similar dictionary entry as the correct answer, a voice response unit, a voice response data unit, a voice output unit, a response control unit that controls the operation of the voice response unit, and a timer unit, wherein the output of the voice response unit is stopped when a voice to be recognized is detected by the voice section detection unit during a voice response. The invention is described below on the basis of embodiments.

FIG. 1 is a block diagram for explaining one embodiment of the present invention. In the figure, 1 is a microphone, 2 is a feature extraction unit, 3 is a voice section detection unit, 4 is a recognition unit, 5 is a speech dictionary, 6 is a response control unit, 7 is a voice response data unit, 8 is a voice response unit, 9 is a voice output unit, and 10 is a speaker. Voice input from the microphone 1 is fed to the feature extraction unit 2 and the voice section detection unit 3.

The feature extraction unit 2 extracts feature quantities from the input voice. The voice section detection unit 3 detects the voice section in the input voice. The recognition unit 4 compares the feature quantities within the voice section with the voice recognition dictionary 5 and takes the most similar dictionary word as the correct answer. Meanwhile, the voice response unit 8 outputs voice from the voice response data 7 on command from the response control unit 6.
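The comparison performed by the recognition unit 4 can be illustrated with a minimal sketch. The patent does not specify the feature quantities or the similarity measure, so the fixed-length feature vectors, the Euclidean distance, and the names `distance`, `recognize`, and `dictionary` are all assumptions made for illustration only.

```python
import math


def distance(a, b):
    """Euclidean distance between two equal-length feature vectors (assumed measure)."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def recognize(features, dictionary):
    """Return the dictionary word whose stored template is closest to `features`."""
    return min(dictionary, key=lambda word: distance(features, dictionary[word]))


# Hypothetical two-word dictionary with made-up three-dimensional templates.
dictionary = {
    "yes": [1.0, 0.0, 0.2],
    "no": [0.0, 1.0, 0.8],
}

print(recognize([0.9, 0.1, 0.3], dictionary))
```

A practical recognizer would compare sequences of feature frames rather than a single vector, but the selection rule is the same: the entry with the smallest distance is taken as the answer.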

Next, the relationship between the response output operation and the recognition operation is described, taking recognition with guidance as the example. At the start of operation, the response control unit 6 instructs the voice response unit 8 to output guidance (command signal b to the voice response unit). On this instruction, the voice response unit 8 outputs the guidance as voice. At the same time, the recognition unit 4 side starts voice section detection (voice section detection signal). If the user speaks after the voice response has finished, only the normal recognition operation is needed. If the user speaks while the voice response is in progress, the device operates as follows. Because the voice section detection unit 3 is already able to detect voice sections, the voice section of the uttered voice is detected. On sensing that a voice section has been detected, the response control unit 6 stops the operation of the voice response unit 8. The speaker can therefore speak normally without hearing the voice response output. Once the utterance has been recognized, the recognition result is generally output from the voice response unit 8 to inform the user. The user can sometimes tell whether the recognition result is right or wrong without hearing the entire response output. In that case, the user makes the next utterance as soon as the result is clear, because doing so speeds up data input by voice.

In this state too, a voice section can be detected provided the device becomes ready for recognition immediately after the recognition result is output. When a voice section is detected, the response control unit 6 stops the response output, so the user can speak properly.

To operate as described above, the recognition unit must always be waiting for an utterance. If the microphone picks up noise, however, the response output is stopped by mistake, which is inconvenient. Therefore, as shown in FIG. 2(a), the voice response output is stopped only if the voice section, once detected (time t1 in the figure), continues for a certain time (T in the figure). A value of about 150 ms is appropriate for T, because words are generally not shorter than that, whereas noise often is.
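The effect of the threshold T can be worked through on a frame-by-frame detection signal. The sketch below assumes the voice section detector emits one boolean per 10 ms frame; the frame length and the names `frames_needed` and `stop_frame` are illustrative assumptions, not taken from the patent.

```python
import math


def frames_needed(hold_ms, frame_ms):
    """Number of consecutive active frames that cover at least `hold_ms`."""
    return math.ceil(hold_ms / frame_ms)


def stop_frame(vad_flags, hold_ms=150, frame_ms=10):
    """Return the index of the frame at which response output would be stopped,
    or None if no voice section lasts long enough."""
    needed = frames_needed(hold_ms, frame_ms)
    run = 0
    for i, active in enumerate(vad_flags):
        run = run + 1 if active else 0  # a gap resets the running count
        if run >= needed:
            return i
    return None


noise = [True] * 5 + [False] * 20    # 50 ms burst: shorter than T
speech = [False] * 3 + [True] * 20   # 200 ms of voice starting at frame 3

print(frames_needed(150, 10))
print(stop_frame(noise))
print(stop_frame(speech))
```

With 10 ms frames, T = 150 ms corresponds to 15 consecutive active frames, so the 50 ms burst never triggers a stop while the sustained utterance does.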

FIG. 3 is a block diagram for explaining an embodiment of the voice recognition device according to the present invention that operates as described above. In the figure, 11 is a timer, and parts that function in the same way as in the embodiment of FIG. 1 carry the same reference numerals as in FIG. 1. As in the embodiment of FIG. 1, when a voice section is detected during voice response output, the response control unit uses the signal from the timer to determine whether the voice section has been detected continuously for the fixed time. If the voice section signal lasts less than the fixed time T, as shown in FIG. 2(b), the output of the voice response unit continues. If it lasts longer than T, as in FIG. 2(a), the output of the response unit is stopped (time t2 in the figure). Noise therefore no longer stops the response output.
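A timer-driven version of the response control can be sketched as a small state holder that is fed the timer reading together with the voice section signal. The class name `TimedResponseControl`, the millisecond timestamps, and the polling style are assumptions for illustration; the patent only states that the response control unit uses the timer signal to check for continuous detection over T.

```python
class TimedResponseControl:
    """Hypothetical response control that consults a timer before stopping output."""

    def __init__(self, hold_ms=150):
        self.hold_ms = hold_ms
        self.onset_ms = None          # time the current voice section began (t1)
        self.response_active = True   # a voice response is assumed to be playing

    def update(self, now_ms, voice_detected):
        """Feed the timer reading and the voice section signal.

        Returns True at the moment the response output is stopped (t2).
        """
        if not voice_detected:
            # Section ended before T elapsed: keep the response going.
            self.onset_ms = None
            return False
        if self.onset_ms is None:
            self.onset_ms = now_ms
        if self.response_active and now_ms - self.onset_ms >= self.hold_ms:
            self.response_active = False
            return True
        return False


control = TimedResponseControl()
# Short noise burst from 0 ms to 50 ms: the response continues.
events = [(0, True), (50, True), (60, False)]
# Sustained voice from 200 ms: the response stops once 150 ms have elapsed.
events += [(200, True), (340, True), (350, True)]
results = [control.update(t, v) for t, v in events]
print(results)
```

In this sketch the short burst resets the onset when the signal drops, so only the sustained section reaches the threshold and stops the response.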

Effects

As is clear from the above description, in the voice recognition device of claim 1, the response output stops when the user starts speaking during a voice response, so the utterance is stable and the recognition rate improves. In the voice recognition device of claim 2, erroneous stopping of the voice response caused by ambient noise is eliminated, so the user hears a stable voice response output.

[Brief Description of the Drawings]

FIG. 1 is a block diagram for explaining one embodiment of the voice recognition device according to the present invention, FIG. 2 is a time chart for explaining the operation of the present invention, and FIG. 3 is a block diagram for explaining another embodiment of the present invention.

1: microphone, 2: feature extraction unit, 3: voice section detection unit, 4: recognition unit, 5: speech dictionary, 6: response control unit, 7: voice response data unit, 8: voice response unit, 9: voice output unit, 10: speaker, 11: timer unit.

Claims (1)

[Claims]

1. A voice recognition device comprising a microphone that picks up voice, a feature extraction unit that extracts feature quantities of the voice, voice section detection means that detects a voice section, a voice recognition dictionary, a recognition unit that compares the input voice with the voice recognition dictionary and outputs the most similar dictionary entry as the correct answer, a voice response unit, a voice response data unit, a voice output unit, a response control unit that controls the operation of the voice response unit, and a timer unit, characterized in that the output of the voice response unit is stopped when a voice to be recognized is detected by the voice section detection unit during a voice response.

2. A voice recognition device comprising a microphone that picks up voice, a feature extraction unit that extracts feature quantities of the voice, a voice section detection unit that detects a voice section, a voice recognition dictionary, a recognition unit that compares the input voice with the voice recognition dictionary and outputs the most similar dictionary entry as the correct answer, a voice response unit, a voice response data unit, a voice output unit, a response control unit that controls the operation of the voice response unit, and a timer unit, characterized in that the output of the voice response unit is stopped when a voice to be recognized is detected by the voice section detection unit continuously for a certain time during a voice response.
JP63258266A 1988-10-13 1988-10-13 Voice recognizing device Pending JPH02103599A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
JP63258266A JPH02103599A (en) 1988-10-13 1988-10-13 Voice recognizing device

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
JP63258266A JPH02103599A (en) 1988-10-13 1988-10-13 Voice recognizing device

Publications (1)

Publication Number Publication Date
JPH02103599A true JPH02103599A (en) 1990-04-16

Family

ID=17317851

Family Applications (1)

Application Number Title Priority Date Filing Date
JP63258266A Pending JPH02103599A (en) 1988-10-13 1988-10-13 Voice recognizing device

Country Status (1)

Country Link
JP (1) JPH02103599A (en)

Cited By (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
JPH07175498A (en) * 1993-12-20 1995-07-14 Nec Corp Device for recognizing and responding voice
JPH08146989A (en) * 1994-11-17 1996-06-07 Canon Inc Information processor and its control method
JPH08146991A (en) * 1994-11-17 1996-06-07 Canon Inc Information processor and its control method
US7412382B2 (en) 2002-10-21 2008-08-12 Fujitsu Limited Voice interactive system and method

Similar Documents

Publication Publication Date Title
JP3674990B2 (en) Speech recognition dialogue apparatus and speech recognition dialogue processing method
US7885818B2 (en) Controlling an apparatus based on speech
US9293134B1 (en) Source-specific speech interactions
EP0077194B1 (en) Speech recognition system
JPH09106296A (en) Apparatus and method for speech recognition
JP2009003040A (en) Speech interaction device, speech interaction method and robot device
CN110867197A (en) Method and equipment for interrupting voice robot in real time in voice interaction process
WO2018216180A1 (en) Speech recognition device and speech recognition method
JP2016061888A (en) Speech recognition device, speech recognition subject section setting method, and speech recognition section setting program
JP5375423B2 (en) Speech recognition system, speech recognition method, and speech recognition program
US7177806B2 (en) Sound signal recognition system and sound signal recognition method, and dialog control system and dialog control method using sound signal recognition system
JP3553828B2 (en) Voice storage and playback method and voice storage and playback device
JPH02103599A (en) Voice recognizing device
JPH0635497A (en) Speech input device
KR20120111510A (en) A system of robot controlling of using voice recognition
JPH08263092A (en) Response voice generating method and voice interactive system
JP2019132997A (en) Voice processing device, method and program
JPH02131300A (en) Voice recognizing device
JPS62150295A (en) Voice recognition
JP6748565B2 (en) Voice dialogue system and voice dialogue method
KR20080061901A (en) System and method of effcient speech recognition by input/output device of robot
JP3360978B2 (en) Voice recognition device
JP2005122194A (en) Voice recognition and dialog device and voice recognition and dialog processing method
KR100281582B1 (en) Speech Recognition Method Using the Recognizer Resource Efficiently
JP2870421B2 (en) Hearing aid with speech speed conversion function