JP2003044092A

JP2003044092A - Voice recognizing device

Info

Publication number: JP2003044092A
Application number: JP2001236312A
Authority: JP
Inventors: Shingo Kiuchi; 真吾木内
Original assignee: Alpine Electronics Inc
Current assignee: Alpine Electronics Inc
Priority date: 2001-08-03
Filing date: 2001-08-03
Publication date: 2003-02-14

Abstract

PROBLEM TO BE SOLVED: To provide a voice recognition device that can always pick up the speaker's voice with a satisfactory S/N (signal-to-noise) ratio even when the position of the microphones changes. SOLUTION: An angle sensor 18 detects the angle of a sun visor 19 to which microphones M1 and M2 are attached. Based on the output of the angle sensor 18, a microphone directivity calculation unit 16 finds the direction of the speaker's mouth relative to the microphones M1 and M2 and calculates a suitable delay time. A delay time control unit 15 controls signal delay units 12a and 12b so that they provide the delay time calculated by the microphone directivity calculation unit 16. Thus, even when the sun visor 19 is moved, voice from the direction of the speaker is always extracted with a satisfactory S/N ratio, and the recognition rate of a voice recognition processing unit 14 is prevented from dropping.

Description

Detailed Description of the Invention

[0001]

[Technical Field of the Invention] The present invention relates to a voice recognition device used in an environment where the relative positional relationship between a microphone and a speaker changes.

[0002]

[Prior Art] In recent years, voice recognition technology has advanced remarkably and has come into use in various fields. For example, many in-vehicle navigation devices allow operations such as setting a destination or switching screens to be performed by voice. In this case, a voice input microphone is installed in the vehicle cabin, and it must be arranged so that it does not obstruct the driver's view, picks up the driver's voice well, and is affected as little as possible by noise in the cabin.

[0003] A headset microphone could be used as the voice input microphone of an in-vehicle navigation device with a voice recognition function. However, driving with a headset microphone worn on the head is cumbersome and uncommon. Typical in-vehicle navigation devices therefore have a directional microphone attached to the sun visor in front of the driver's seat. This approach also has the advantage that the microphone is easy to install in the cabin.

[0004]

[Problems to Be Solved by the Invention] However, when a directional microphone is attached to the sun visor, the orientation of the microphone changes if, for example, the driver lowers the sun visor to block glare from sunlight ahead of the vehicle. The driver's voice then becomes harder to pick up, the S/N ratio (signal-to-noise ratio) deteriorates, and the accuracy of voice recognition drops significantly.

[0005] In view of the above, an object of the present invention is to provide a voice recognition device that can always pick up the speaker's voice with a good S/N ratio even when the position of the microphone changes.

[0006]

[Means for Solving the Problems] The voice recognition device of the present invention comprises: a plurality of microphones arranged apart from one another; microphone position change detecting means for detecting the amount of change in the position of the plurality of microphones; voice signal extraction means that receives the amount of change in microphone position detected by the microphone position change detecting means and selectively extracts the voice signal arriving from the direction of the speaker from the outputs of the plurality of microphones, using the signal delay that corresponds to the difference in position between the microphones; and a voice recognition processing unit that performs voice recognition processing on the voice signal extracted by the voice signal extraction means.

[0007] In the present invention, the outputs of a plurality of microphones arranged apart from one another are processed by the voice signal extraction means, which selectively extracts the voice signal from the direction of the speaker either by emphasizing the signal due to voice from the direction of the speaker or by attenuating the signal due to noise from a predetermined direction. When a plurality of microphones arranged apart from one another are used, the difference in distance between each microphone and the speaker produces a time difference between the signals from the microphones. The size of this time difference depends on the direction of the speaker's mouth relative to the microphones. For example, if a signal delay unit is connected between each microphone and the signal synthesizer so that the output signals produced at each microphone by the speaker's voice reach the signal synthesizer at the same time, voice from the direction of the speaker is in phase and is therefore emphasized by addition. Noise arriving from other directions, on the other hand, is not brought into phase by the delays, so it is not emphasized as much as the voice even when added. As a result, the S/N ratio improves.

[0008] Also, if there is a noise source in a predetermined direction, for example, setting the delay time of the signal delay unit so that the signals due to noise from that direction are in opposite phase attenuates the noise component. The voice signal from the direction of the speaker, on the other hand, is not exactly in opposite phase, so it is not attenuated as much as the noise, and the S/N ratio improves as a result. With either method, the appropriate delay time changes when the position of the microphones changes. Accordingly, the following are required: microphone position change detecting means for detecting the amount of change in microphone position; a delay time determination unit that determines the delay time of the signal delay unit according to the amount of change in microphone position; and a delay time control unit that controls the signal delay unit so that it provides the delay time determined by the delay time determination unit.

[0009]

[Embodiments of the Invention] Embodiments of the present invention are described below with reference to the accompanying drawings.

(First Embodiment) FIG. 1 is a diagram showing the principle of the voice recognition device according to the first embodiment of the present invention. The voice recognition device of this embodiment uses two omnidirectional microphones M₁ and M₂ and exploits the difference in the time the speaker's voice takes to reach each of them to selectively pick up voice from the direction of the speaker (the speaker's mouth).

[0010] For example, as shown in FIG. 1, if the spacing between the first microphone M₁ and the second microphone M₂ is L (m) and the speed of sound is C (m/s), the signal output from the first microphone M₁ has a time difference of L/C seconds relative to the signal output from the second microphone M₂. Therefore, if the output of the second microphone M₂ is delayed by L/C seconds and added to the output of the first microphone M₁, the voice outputs of the microphones M₁ and M₂ are brought into phase, so voice from the direction of the speaker is emphasized, the S/N ratio improves, and only the voice from the direction of the speaker can be picked up selectively.
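The delay-and-add step described above can be sketched in a few lines. This is a minimal illustration, not the patent's implementation: the sampling rate, the 500 Hz test tone standing in for speech, the whole-sample lag, and the helper names `delay_samples` and `delay_and_sum` are all assumptions introduced here.

```python
import numpy as np

def delay_samples(x, n):
    """Delay x by n whole samples, padding the start with zeros."""
    if n <= 0:
        return x.copy()
    return np.concatenate([np.zeros(n), x[:-n]])

def delay_and_sum(m1, m2, n_delay):
    """Delay the M2 signal by n_delay samples and add it to the M1 signal."""
    return m1 + delay_samples(m2, n_delay)

fs = 48_000                              # assumed sampling rate in Hz
t = np.arange(fs // 10) / fs             # 0.1 s of samples
speech = np.sin(2 * np.pi * 500 * t)     # stand-in for the speaker's voice
lag = 7                                  # assumed L/C expressed in samples

m2 = speech                              # the voice reaches M2 first
m1 = delay_samples(speech, lag)          # and reaches M1 `lag` samples later
steered = delay_and_sum(m1, m2, lag)     # M2 delayed so both copies line up
unsteered = delay_and_sum(m1, m2, 0)     # no compensation, for comparison
```

With the delay applied, the two copies of the voice coincide and add coherently; without it, they add with a phase offset and the sum carries less power.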

[0011] As shown in FIG. 2, if the spacing between the first microphone M₁ and the second microphone M₂ is d and the direction of the speaker is θ, the difference L between the distances from the speaker to the microphones M₁ and M₂ is d sin θ, as given by equation (1) below.

[0012]

[Equation 1]

L = d sin θ ... (1)

[0013] Therefore, if the output of the second microphone M₂ is delayed by (d sin θ)/C seconds and added to the output of the first microphone M₁, only the voice from the direction of the speaker can be picked up selectively. FIG. 3 is a block diagram showing the configuration of the voice recognition device of the first embodiment. FIG. 3 shows an example in which the present invention is applied to the voice recognition device of an in-vehicle navigation device.
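As a worked instance of the delay (d sin θ)/C, the short sketch below evaluates it for one geometry. The speed of sound of 340 m/s, the 10 cm spacing, the 30 degree direction, and the function name `steering_delay` are illustrative assumptions, not values from the patent.

```python
import math

SPEED_OF_SOUND = 340.0  # m/s, assumed value for C

def steering_delay(d, theta_rad, c=SPEED_OF_SOUND):
    """Delay in seconds for M2: path difference L = d*sin(theta), delay = L/c."""
    return d * math.sin(theta_rad) / c

# Assumed example: microphones 10 cm apart, speaker 30 degrees off the array normal.
delay = steering_delay(0.10, math.radians(30))
```

For these assumed values the path difference is 5 cm, so the delay is on the order of 0.15 ms, which is only a few samples at common audio sampling rates.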

[0014] The voice recognition device of this embodiment comprises two omnidirectional microphones M₁ and M₂, signal delay units 12a and 12b, an adder (signal synthesizer) 13, a voice recognition processing unit 14, a delay time control unit 15, a microphone directivity calculation unit (delay time determination unit) 16, a seat position sensor 17, and an angle sensor 18. The omnidirectional microphones M₁ and M₂ are attached, apart from each other, to the sun visor 19 in front of the driver's seat. The microphone M₁ is connected to the signal delay unit 12a, and the microphone M₂ is connected to the signal delay unit 12b. The delay times of the signal delay units 12a and 12b are controlled individually by the delay time control unit 15. The microphones M₁ and M₂ need not be omnidirectional, but it is preferable that their sensitivity toward the speaker changes little even when their position changes.

[0015] The adder 13 adds the signals output from the signal delay units 12a and 12b. The signal output from the adder 13 is sent to the voice recognition processing unit 14. The voice recognition processing unit 14 processes the signal received from the adder (signal synthesizer) 13 and converts it into character data. The character data output from the voice recognition processing unit 14 is sent to the navigation device main body (not shown), which executes processing according to the character data.

[0016] The seat position sensor 17 detects the change in the position of the speaker (the speaker's mouth) caused by sliding the driver's seat in the front-rear direction of the vehicle. The seat position sensor 17 is not essential, but to improve the accuracy of voice recognition it is preferable to determine the position of the speaker's mouth more precisely from data such as the seat position detected by the seat position sensor 17 and the speaker's height entered from an input device such as a keyboard.

[0017] The angle sensor 18 is arranged, for example as shown in FIG. 4, in the bearing portion 19a of the sun visor 19 to which the microphones M₁ and M₂ are attached, and detects the rotation angle of the sun visor 19 about the rotation shaft 19b. The angle sensor 18 may be, for example, a potentiometer whose resistance changes with the rotation angle, or a liquid-filled sensor that detects the rotation angle from the change in impedance caused by the change in position of the liquid surface inside the sensor.

[0018] The microphone directivity calculation unit 16 calculates the direction of the speaker's mouth relative to the microphones M₁ and M₂ from the output of the seat position sensor 17 and the output of the angle sensor 18, and then calculates the delay time needed to bring the speaker's voice signals output from the microphones M₁ and M₂ into phase. The delay time control unit 15 controls the signal delay units 12a and 12b so that they provide the delay time calculated by the microphone directivity calculation unit 16.

[0019] The operation of the voice recognition device of this embodiment is described below. In the following description, the position of the speaker's mouth is assumed to be constant. FIG. 5 is a schematic diagram for explaining how the delay time is calculated. Let the position of the rotation axis of the sun visor be the origin O(0, 0), the mounting position of the microphones M₁ and M₂ (the microphone array) be m₀(a₀, b₀), the position of the speaker be V(a₂, b₂), and the direction of the speaker's mouth relative to the microphone array be θ₀. Here the sun visor is assumed to be horizontal. Therefore a₀ = 0, and the distance from the rotation axis of the sun visor to the microphone array is b₀.

[0020] In this case, equation (2) below holds from the definition of the inner product of vectors.

[0021]

[Equation 2]

[0022] Since the position V(a₂, b₂) of the speaker's mouth and the position m₀(a₀, b₀) of the microphone array are known, the direction θ₀ of the speaker relative to the microphone array can be calculated from equation (2). Then, from the relationship in equation (1), the difference L between the distances from the speaker to the microphones M₁ and M₂ is calculated, and from it the delay time (L/C) needed to selectively pick up voice from the direction of the speaker. Setting this delay time in the signal delay unit 12b makes it possible to selectively pick up voice from the direction of the speaker.

[0023] FIG. 6 shows the sun visor rotated by an angle θ from the position shown in FIG. 5. If the position of the microphone array is now m₁(a₁, b₁) and the direction of the speaker's mouth relative to the microphone array is θ₁, equation (3) below holds from the definition of the inner product of vectors.

[0024]

[Equation 3]

[0025] The X coordinate of the microphone array is a₁ = b₀ cos θ and its Y coordinate is b₁ = b₀ sin θ, and the position V(a₂, b₂) of the speaker's mouth is known, so the direction θ₁ of the speaker relative to the microphone array can be calculated from equation (3). Then, from the relationship in equation (1), the difference L between the distances from the speaker to the microphones M₁ and M₂ is calculated, and from it the delay time needed to selectively pick up voice from the direction of the speaker. Setting this delay time in the signal delay unit 12b makes it possible to selectively pick up voice from the direction of the speaker.
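The chain from visor angle to delay time can be sketched as follows. Equation (3) itself is not reproduced in this text, so the sketch assumes it is the usual inner-product relation cos θ₁ = (m₁ · V) / (|m₁| |V|) between the vectors m₁ and V; that reading, the function names, and the numeric values are assumptions made for illustration only.

```python
import math

def mic_position(b0, theta):
    """Microphone array position after rotating the visor by theta,
    using a1 = b0*cos(theta), b1 = b0*sin(theta) as stated in the text."""
    return (b0 * math.cos(theta), b0 * math.sin(theta))

def angle_between(m, v):
    """Angle between vectors m and v from the inner-product definition
    (assumed form of equations (2) and (3))."""
    dot = m[0] * v[0] + m[1] * v[1]
    norm = math.hypot(*m) * math.hypot(*v)
    return math.acos(max(-1.0, min(1.0, dot / norm)))  # clamp against rounding

def delay_for_visor_angle(theta, b0, speaker, d, c=340.0):
    """Delay time L/C with L = d*sin(theta1), per equation (1)."""
    m1 = mic_position(b0, theta)
    theta1 = angle_between(m1, speaker)
    return d * math.sin(theta1) / c

# Assumed values: b0 = 0.15 m, microphone spacing d = 0.10 m, C = 340 m/s.
delay_a = delay_for_visor_angle(0.0, 0.15, (0.0, 1.0), 0.10)
delay_b = delay_for_visor_angle(0.0, 0.15, (1.0, 0.0), 0.10)
```

In this sketch the delay is largest when the assumed vectors are perpendicular and zero when they are parallel, which follows directly from the sin θ₁ term in equation (1).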

[0026] In this way, if the delay times of the signal delay units 12a and 12b are adjusted according to the angle of the sun visor 19 to which the microphone array (microphones M₁ and M₂) is attached, voice from the direction of the speaker can always be picked up selectively. The voice recognition processing unit 14 performs voice recognition processing on the voice signal output from the adder 13, converts it into character data, and outputs the data to the navigation device main body (not shown). The navigation device main body receives the character data output from the voice recognition processing unit 14 and executes processing according to the character data.

[0027] As described above, the voice recognition device of this embodiment can selectively pick up the voice signal from the direction of the speaker's mouth even when the sun visor 19 is moved, so a signal with a relatively small noise component is input to the voice recognition processing unit 14. This prevents the voice recognition rate from dropping. In the embodiment above, the microphone directivity calculation unit 16 calculates the direction of the speaker's mouth relative to the microphones from the rotation angle θ of the sun visor 19. However, the microphone directivity calculation unit 16 may be replaced by a data storage unit that stores a table showing the relationship between the rotation angle θ of the sun visor and the direction of the speaker's mouth relative to the microphones. In this case, when the angle θ is input from the angle sensor 18, the data storage unit uses the table to output the direction of the speaker's mouth relative to the microphones that corresponds to the angle θ. Based on this information on the direction of the speaker's mouth relative to the microphones, the delay time control unit 15 controls the delay times of the signal delay units 12a and 12b.
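The table-based alternative can be sketched as a simple lookup. The table entries below are invented placeholder values, and the linear interpolation between entries is an addition made here for illustration; the text only says that a table relating the visor angle to the speaker direction is stored.

```python
import bisect

# Hypothetical table: visor rotation angle (deg) -> speaker direction (deg).
VISOR_TO_DIRECTION = [(0.0, 60.0), (30.0, 45.0), (60.0, 25.0), (90.0, 10.0)]

def lookup_direction(angle_deg, table=VISOR_TO_DIRECTION):
    """Return the speaker direction for a visor angle, interpolating linearly
    between table rows and clamping outside the table range."""
    angles = [a for a, _ in table]
    if angle_deg <= angles[0]:
        return table[0][1]
    if angle_deg >= angles[-1]:
        return table[-1][1]
    i = bisect.bisect_right(angles, angle_deg)
    (a_lo, d_lo), (a_hi, d_hi) = table[i - 1], table[i]
    return d_lo + (d_hi - d_lo) * (angle_deg - a_lo) / (a_hi - a_lo)

direction = lookup_direction(15.0)
```

A table trades the trigonometric calculation for memory, and it could be filled from measurements in the actual cabin rather than from an idealized geometry.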

[0028] The embodiment above describes the case of a delay-and-sum array that takes the delayed sum of the outputs of two microphones, but using a delay-and-sum array of three or more microphones allows voice from the direction of the speaker to be emphasized with even higher accuracy.

(Second Embodiment) FIG. 7 is a block diagram showing the configuration of the voice recognition device according to the second embodiment of the present invention. In this embodiment, a subtraction-type array that forms a blind spot (a direction of low sensitivity) in the direction of the noise source is used, so that voice from the direction of the speaker can be picked up selectively when the direction of the noise source can be identified. This embodiment also shows an example in which the present invention is applied to the voice recognition device of an in-vehicle navigation device.

[0029] The voice recognition device of this embodiment comprises two omnidirectional microphones M₁ and M₂, a signal delay unit 22, a subtractor (signal synthesizer) 23, a voice recognition processing unit 24, a delay time control unit 25, a microphone directivity calculation unit 26, a seat position sensor 27, and an angle sensor 28. As in the first embodiment, the microphones M₁ and M₂ are attached, apart from each other, to the sun visor (not shown) in front of the driver's seat. In this example, the microphone M₁ is placed closer to the speaker than the microphone M₂. The output of the microphone M₁ is input directly to the subtractor 23, and the output of the microphone M₂ is input to the subtractor 23 through the signal delay unit 22. The subtractor 23 outputs the signal obtained by subtracting the output of the signal delay unit 22 from the output of the microphone M₁. The voice recognition processing unit 24 performs voice recognition processing on the signal output from the subtractor 23 and outputs character data. This character data is sent to the navigation device main body (not shown).

[0030] The seat position sensor 27 detects the amount of change in the position of the speaker (the speaker's mouth) caused by sliding the driver's seat in the front-rear direction of the vehicle. The angle sensor 28 outputs a signal corresponding to the angle of the sun visor to which the microphones M₁ and M₂ are attached. As in the first embodiment, the microphone directivity calculation unit 26 finds the direction of the noise source relative to the microphones M₁ and M₂ from the output of the angle sensor 28 and calculates the delay time needed to attenuate the noise component. The delay time control unit 25 controls the signal delay unit 22 so that it provides the delay time calculated by the microphone directivity calculation unit 26.

[0031] The operation of the voice recognition device configured as above is described below. In this embodiment, the microphone M₁ is the reference microphone, and the delay time of the signal delay unit 22 connected to the microphone M₂ is set according to the direction of the noise source. For example, to remove noise from the direction of the engine compartment, the delay time of the signal delay unit 22 is set so that the noise from that direction detected by the microphone M₁ and the noise from that direction detected by the microphone M₂ reach the subtractor 23 at the same time.

[0032] The subtractor 23 subtracts the output of the signal delay unit 22 from the output of the microphone M₁ and thereby outputs a voice signal in which the noise component from the direction of the noise source is suppressed (reduced). This output signal is input to the voice recognition processing unit 24 and undergoes voice recognition processing. The character data output from the voice recognition processing unit 24 is input to the navigation device main body, which executes processing according to the character data received from the voice recognition processing unit 24.
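The subtraction-type array can be sketched in the same style as a delay-and-sum array, with the adder replaced by a subtractor. The sampling rate, the test tones standing in for engine noise and speech, the whole-sample lags, and the helper names are all assumptions made for this illustration.

```python
import numpy as np

def delay_samples(x, n):
    """Delay x by n whole samples, padding the start with zeros."""
    if n <= 0:
        return x.copy()
    return np.concatenate([np.zeros(n), x[:-n]])

def subtractive_array(m1, m2, n_delay):
    """Subtract the delayed M2 signal from the M1 signal (subtractor 23)."""
    return m1 - delay_samples(m2, n_delay)

fs = 16_000                               # assumed sampling rate in Hz
t = np.arange(1600) / fs
noise = np.sin(2 * np.pi * 200 * t)       # stand-in for engine noise
speech = np.sin(2 * np.pi * 700 * t)      # stand-in for the speaker's voice

noise_lag = 3    # assumed: noise reaches M2 first, M1 three samples later
speech_lag = 2   # assumed: speech reaches M1 first, M2 two samples later

m1 = speech + delay_samples(noise, noise_lag)
m2 = delay_samples(speech, speech_lag) + noise
out = subtractive_array(m1, m2, noise_lag)   # noise copies line up and cancel
```

The noise copies coincide at the subtractor and cancel, while the two speech copies arrive with a different relative lag and therefore leave a residual speech signal. That residual is a filtered version of the speech, which is the price of forming the blind spot.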

[0033] Meanwhile, as in the first embodiment, the angle sensor 28 outputs a signal corresponding to the angle of the sun visor. The microphone directivity calculation unit 26 obtains the displacement of the microphones M₁ and M₂ from the output of the angle sensor 28 and calculates the direction of the noise source relative to the microphones M₁ and M₂. The delay time control unit 25 controls the signal delay unit 22 so that it provides the delay time calculated by the microphone directivity calculation unit 26.

[0034] In this embodiment, the angle sensor 28 detects the angle of the sun visor to which the microphones M₁ and M₂ are attached, and the delay time of the signal delay unit 22 is controlled according to the detection result, so a blind spot is always formed in the direction of the noise source even when the sun visor is moved. As a result, this embodiment also avoids a drop in the voice recognition rate of the voice recognition processing unit 24, as in the first embodiment.

[0035] (Third Embodiment) The voice recognition device according to the third embodiment of the present invention is described below. In this embodiment, a Griffiths-Jim type array is used to selectively extract voice from the direction of the speaker. This embodiment also shows an example in which the present invention is applied to the voice recognition device of an in-vehicle navigation device.

[0036] FIG. 8 is a block diagram showing the configuration of the voice recognition device according to the third embodiment of the present invention. The voice recognition device of this embodiment comprises four omnidirectional microphones M₁ to M₄, four signal delay units D1 to D4, four subtractors (signal synthesizers) S0 to S3, three adaptive filters W1 to W3, two adders (signal synthesizers) A1 and A2, a voice recognition processing unit 34, a delay time control unit 35, a microphone directivity calculation unit 36, a seat position sensor 37, and an angle sensor 38.

[0037] The microphones M₁ to M₄ are all attached, apart from one another, to the sun visor (not shown) in front of the driver's seat. The signal delay units D1 to D4 are connected to the corresponding microphones M₁ to M₄ and delay the signals output from the microphones M₁ to M₄ before outputting them. The delay times of the signal delay units D1 to D4 are controlled individually by the delay time control unit 35. The microphone M₁ is the reference microphone, and the output of the signal delay unit D1 connected to the microphone M₁ is passed to the subtractor S1 and also to the subtractor S0.

[0038] The subtractor S1 outputs to the adaptive filter W1 the signal obtained by subtracting the output of the signal delay unit D2 from the output of the signal delay unit D1. The subtractor S2 outputs to the adaptive filter W2 the signal obtained by subtracting the output of the signal delay unit D3 from the output of the signal delay unit D2. The subtractor S3 outputs to the adaptive filter W3 the signal obtained by subtracting the output of the signal delay unit D4 from the output of the signal delay unit D3.

[0039] The adder A2 passes to the adder A1 the signal obtained by adding the output of the adaptive filter W2 and the output of the adaptive filter W3. The adder A1 passes to the subtractor S0 the signal obtained by adding the output of the adaptive filter W1 and the output of the adder A2. The subtractor S0 passes to the voice recognition processing unit 34 the signal obtained by subtracting the output of the adder A1 from the output of the signal delay unit D1.

[0040] The adaptive filters W1 to W3 treat the output of the subtractor S0 as an error signal and successively update their filter coefficients with an adaptive algorithm so that the power of the error is minimized. Meanwhile, as in the first embodiment, the angle sensor 38 detects the angle of the sun visor to which the microphones M₁ to M₄ are attached. The microphone directivity calculation unit 36 obtains the displacement of the microphones M₁ to M₄ from the output of the angle sensor 38 and calculates the direction of the speaker's mouth relative to the microphones M₁ to M₄. The delay time control unit 35 controls the signal delay units D1 to D4 so that they provide the delay times calculated by the microphone directivity calculation unit 36.
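
The adaptive loop described in paragraphs [0039] and [0040] can be sketched as follows. The text only specifies "an adaptive algorithm", so the normalized LMS update used here is an assumption, as are the function name `lms_noise_canceller`, the tap count, the step size, and the synthetic test signals. The input channels are assumed to be already delay-aligned, that is, they stand for the outputs of D1 to D4.

```python
import math
import random

def lms_noise_canceller(d, taps=4, mu=0.05, eps=1e-8):
    """d: equal-length sample lists, the delay-aligned outputs of D1..D4.
    Returns the output of the subtractor S0 (hypothetical NLMS sketch)."""
    n_ref = len(d) - 1
    weights = [[0.0] * taps for _ in range(n_ref)]   # coefficients of W1..W3
    history = [[0.0] * taps for _ in range(n_ref)]   # recent inputs of W1..W3
    out = []
    for k in range(len(d[0])):
        # Subtractors S1..S3: differences of adjacent aligned channels.
        for i in range(n_ref):
            ref = d[i][k] - d[i + 1][k]
            history[i] = [ref] + history[i][:-1]
        # Adders A1 and A2: sum of the adaptive filter outputs.
        y = sum(w * x for ws, xs in zip(weights, history) for w, x in zip(ws, xs))
        # Subtractor S0: error signal, which is also the device output.
        e = d[0][k] - y
        # Normalized LMS step (assumed algorithm): reduce the error power.
        power = eps + sum(x * x for xs in history for x in xs)
        for ws, xs in zip(weights, history):
            for j in range(taps):
                ws[j] += mu * e * xs[j] / power
        out.append(e)
    return out

# Synthetic example: the same voice on every channel, one noise source
# received with a different (hypothetical) gain at each microphone.
rng = random.Random(0)
n_samples = 4000
voice = [math.sin(2 * math.pi * 0.01 * k) for k in range(n_samples)]
noise = [rng.gauss(0.0, 1.0) for _ in range(n_samples)]
gains = [1.0, 0.6, 0.3, 0.1]
channels = [[v + g * n for v, n in zip(voice, noise)] for g in gains]

output = lms_noise_canceller(channels)
tail = slice(n_samples - 1000, n_samples)
residual = sum((o - v) ** 2 for o, v in zip(output[tail], voice[tail])) / 1000
noise_power = sum(n * n for n in noise[tail]) / 1000
```

In this sketch `residual` measures how much noise is left in the S0 output after the filters have had time to adapt, and `noise_power` is the noise level on the D1 channel before cancellation.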

[0041] The operation of the voice recognition device of this embodiment is described below. The microphone M₁ picks up the speaker's voice together with ambient noise and outputs a signal in which a noise component is superimposed on the voice signal. The microphone M₂ likewise picks up the speaker's voice together with ambient noise and outputs a signal in which a noise component is superimposed on the voice signal. Therefore, when the delay times of the signal delay units D1 and D2 are adjusted appropriately, the voice components of the two delayed signals coincide and cancel in the subtractor S1, which then outputs a signal containing only the noise component. This noise-only signal is input to the adaptive filter W1.
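
As a minimal illustration of the cancellation at the subtractor S1, the sketch below uses whole-sample delays and synthetic signals. The three-sample arrival lag, the signal shapes, and the helper name `delay` are assumptions made only for this example.

```python
import math

def delay(signal, n):
    """Delay a sample list by n whole samples, padding the front with zeros."""
    return [0.0] * n + signal[:len(signal) - n]

# Hypothetical scenario: the voice reaches M1 three samples before it
# reaches M2, and each microphone also picks up its own noise.
N = 64
lag = 3
voice = [math.sin(2 * math.pi * 5 * k / N) for k in range(N)]
noise1 = [0.3 * math.cos(2 * math.pi * 11 * k / N) for k in range(N)]
noise2 = [0.2 * math.sin(2 * math.pi * 17 * k / N) for k in range(N)]

m1 = [v + n for v, n in zip(voice, noise1)]
m2 = [v + n for v, n in zip(delay(voice, lag), noise2)]

# D1 delays the M1 signal by `lag` samples so that the voice lines up
# with the M2 signal; D2 adds no delay in this scenario.
d1 = delay(m1, lag)
d2 = delay(m2, 0)

# Subtractor S1: the aligned voice components cancel, leaving only noise.
s1 = [a - b for a, b in zip(d1, d2)]
expected_noise = [a - b for a, b in zip(delay(noise1, lag), noise2)]
```

With the delays matched to the arrival lag, `s1` equals the difference of the two noise components up to rounding error. With a mismatched delay, part of the voice would leak into `s1`.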

[0042] Similarly, when the delay times of the signal delay units D3 and D4 are adjusted appropriately, the subtractors S2 and S3 also output noise-only signals from which the voice signal has been removed. These noise-only signals are input to the adaptive filters W2 and W3, respectively. The subtractor S0 subtracts the sum of the outputs of the adaptive filters W1 to W3 from the output signal of the signal delay unit D1. Initially, the output of the subtractor S0 is a voice signal with a noise component superimposed on it. However, because the adaptive filters W1 to W3 act to minimize the output of the subtractor S0, after sufficient time has elapsed the subtractor S0 outputs a voice signal from which the noise component has been removed.

[0043] Because the voice recognition processing unit 34 receives from the subtractor S0 a voice signal from which the noise component has been removed, its recognition performance does not deteriorate compared with that in a noise-free environment. To keep the voice signal out of the outputs of the subtractors S1 to S3, the delay times of the signal delay units D1 to D4 must be controlled appropriately, as described above. When the angle of the sun visor to which the microphones M₁ to M₄ are attached changes, the appropriate delay times change as well. The microphone directivity calculation unit 36 therefore calculates an appropriate delay time for each of the signal delay units D1 to D4 from the sun visor angle detected by the angle sensor 38, and the delay time control unit 35 controls the signal delay units D1 to D4 according to the result.
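
One way to picture the delay calculation is the geometric sketch below. It is not the formula of the patent. The coordinate system (hinge along the x axis, y toward the driver, z upward), the microphone layout, the mouth position, and the function names are all hypothetical. Each channel is delayed by the extra travel time of the farthest microphone so that the voice lines up on all channels.

```python
import math

SPEED_OF_SOUND = 343.0  # m/s, approximate value in air at about 20 degrees C

def mic_positions(angle_deg, lateral_offsets, radius):
    """Microphone coordinates for a visor rotated angle_deg downward about
    a hinge on the x axis (hypothetical geometry)."""
    a = math.radians(angle_deg)
    return [(x, radius * math.cos(a), -radius * math.sin(a)) for x in lateral_offsets]

def alignment_delays(angle_deg, lateral_offsets, radius, mouth):
    """Delay in seconds for each channel so the voice arrives aligned."""
    dists = [math.dist(p, mouth)
             for p in mic_positions(angle_deg, lateral_offsets, radius)]
    farthest = max(dists)
    # The farthest microphone gets zero delay; nearer ones wait for it.
    return [(farthest - d) / SPEED_OF_SOUND for d in dists]

# Assumed layout: four microphones 10 cm apart, 12 cm below the hinge line,
# with the mouth 35 cm behind and 25 cm below the hinge.
offsets = [-0.15, -0.05, 0.05, 0.15]
mouth = (0.05, 0.35, -0.25)
delays_30 = alignment_delays(30.0, offsets, 0.12, mouth)
delays_80 = alignment_delays(80.0, offsets, 0.12, mouth)
```

In this geometry the two delay sets differ, which illustrates why the delays must be recomputed whenever the angle sensor reports a new visor angle.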

[0044] As a result, even when the sun visor is moved, voice from the direction of the speaker can always be picked up selectively, and a drop in the voice recognition rate of the voice recognition processing unit 34 is avoided. Although the third embodiment has been described using four microphones M₁ to M₄, increasing the number of microphones allows the voice from the direction of the speaker to be extracted with even higher accuracy.

[0045] In each of the first to third embodiments, the voice signal extraction means consists of signal delay units and a delay time control unit, but the voice signal extraction means of the voice recognition device of the present invention is not limited to this configuration. For example, the voice signal extraction means may divide the output signal of each microphone into frequency bands and calculate the time difference and level difference in each band to extract the voice from the direction of the speaker.
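
The frequency-wise alternative is only outlined in the text. A rough sketch of the per-frequency time and level differences between two microphone frames might look like this. The plain DFT, the function names `dft` and `band_differences`, the energy threshold, and the test tone are assumptions. The time difference is derived from the phase difference, so it is only unambiguous while that phase stays within half a cycle.

```python
import cmath
import math

def dft(frame):
    """Plain discrete Fourier transform for bins 0..n/2 (no external library)."""
    n = len(frame)
    return [sum(x * cmath.exp(-2j * math.pi * b * k / n) for k, x in enumerate(frame))
            for b in range(n // 2 + 1)]

def band_differences(frame_a, frame_b, sample_rate):
    """Return (frequency_hz, time_difference_s, level_difference_db) per bin.
    The time difference is positive when frame_b lags frame_a."""
    n = len(frame_a)
    spec_a, spec_b = dft(frame_a), dft(frame_b)
    result = []
    for b in range(1, n // 2):
        a, c = spec_a[b], spec_b[b]
        if abs(a) < 1e-9 or abs(c) < 1e-9:
            continue  # skip bins that carry no energy
        freq = b * sample_rate / n
        phase = cmath.phase(c * a.conjugate())      # phase of B relative to A
        time_diff = -phase / (2 * math.pi * freq)   # a lag gives negative phase
        level_diff = 20 * math.log10(abs(c) / abs(a))
        result.append((freq, time_diff, level_diff))
    return result

# Test tone: 500 Hz at 8 kHz sampling; microphone B receives it two
# samples later and at half the amplitude.
fs, n = 8000, 64
frame_a = [math.sin(2 * math.pi * 500 * k / fs) for k in range(n)]
frame_b = [0.5 * math.sin(2 * math.pi * 500 * (k - 2) / fs) for k in range(n)]
bands = band_differences(frame_a, frame_b, fs)
```

A selection stage, not shown here, could then keep only the bands whose time and level differences match those expected for the speaker's direction.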

[0046] Furthermore, although each of the first to third embodiments has been described for the case where the present invention is applied to the voice recognition device of an on-vehicle navigation device, the present invention is not limited to on-vehicle voice recognition devices. The present invention can be applied to various voice recognition devices used in environments where the relative position between the microphone and the speaker changes.

[0047]

[Effects of the Invention] As described above, according to the present invention, the outputs of a plurality of microphones arranged apart from one another are processed by the voice signal extraction means, which selectively extracts the voice signal from the direction of the speaker by emphasizing the signal due to voice from that direction or by attenuating noise components. Consequently, the recognition performance of the voice recognition processing unit in a noisy environment does not deteriorate compared with that in a noise-free environment. In addition, because the voice signal extraction means performs signal processing that reflects the change in microphone position detected by the microphone position change amount detection means, a drop in the voice recognition rate of the voice recognition processing unit is avoided even when the microphone position changes.

[Brief description of drawings]

FIG. 1 is a diagram (part 1) showing the principle of the voice recognition device according to the first embodiment of the present invention.

FIG. 2 is a diagram (part 2) showing the principle of the voice recognition device according to the first embodiment of the present invention.

FIG. 3 is a block diagram showing the configuration of the voice recognition device according to the first embodiment.

FIG. 4 is a schematic diagram showing an example of the mounting portion of the angle sensor.

FIG. 5 is a schematic diagram (part 1) for explaining the method of calculating the delay time.

FIG. 6 is a schematic diagram (part 2) for explaining the method of calculating the delay time.

FIG. 7 is a block diagram showing the configuration of the voice recognition device according to the second embodiment of the present invention.

FIG. 8 is a block diagram showing the configuration of the voice recognition device according to the third embodiment of the present invention.

[Explanation of symbols]

12a, 12b, 22, D1 to D4 ... signal delay unit; 13, A1, A2 ... adder (signal synthesizer); 14, 24, 34 ... voice recognition processing unit; 15, 25, 35 ... delay time control unit; 16, 26, 36 ... microphone directivity calculation unit; 17, 27, 37 ... seat position sensor; 18, 28, 38 ... angle sensor; 23, S0 to S3 ... subtractor (signal synthesizing unit); M₁ to M₄ ... microphone.

Continuation of front page (51) Int. Cl.⁷ Identification symbol FI Theme code (reference) G10L 15/20 G10L 3/02 301D 21/02 301E

Claims

[Claims]

1. A voice recognition device comprising: a plurality of microphones arranged apart from one another; microphone position change amount detection means for detecting an amount of change in the position of the plurality of microphones; voice signal extraction means that receives the amount of change in microphone position detected by the microphone position change amount detection means and selectively extracts a voice signal from the direction of the speaker from the outputs of the plurality of microphones by utilizing the signal delay that corresponds to the difference in position between the microphones; and a voice recognition processing unit that performs voice recognition processing on the voice signal extracted by the voice signal extraction means.

2. The voice recognition device according to claim 1, wherein each of the plurality of microphones is an omnidirectional microphone.

3. The voice recognition device according to claim 1, wherein the voice signal extraction means selectively extracts the voice signal from the direction of the speaker by using the outputs of the plurality of microphones to emphasize the voice from the direction of the speaker.

4. The voice recognition device according to claim 1, wherein the voice signal extraction means selectively extracts the voice signal from the direction of the speaker by using the outputs of the plurality of microphones to attenuate noise components from a predetermined direction.

5. The voice recognition device according to claim 1, wherein the voice signal extraction means comprises: a signal synthesizer that synthesizes the outputs of the plurality of microphones; a signal delay unit connected between the signal synthesizer and at least one of the plurality of microphones; a delay time determination unit that determines the delay time of the signal delay unit based on the amount of change in microphone position detected by the microphone position change amount detection means; and a delay time control unit that controls the signal delay unit according to the delay time determined by the delay time determination unit.

6. The voice recognition device according to claim 5, wherein the delay time determination unit determines the delay time by calculating, from the detection result of the microphone position change amount detection means, the direction in which the speaker is located with respect to the plurality of microphones.

7. The voice recognition device according to claim 1, wherein at least one of the plurality of microphones is attached to a sun visor in front of a driver's seat of an automobile.

8. The voice recognition device according to claim 7, wherein the microphone position change amount detection means is a sensor that detects an angle of the sun visor.