JP4286637B2

JP4286637B2 - Microphone device and playback device

Info

Publication number: JP4286637B2
Application number: JP2003385375A
Authority: JP
Inventors: 丈郎金森; 岳河村; 智美松岡
Original assignee: Panasonic Corp; Matsushita Electric Industrial Co Ltd
Current assignee: Panasonic Corp; Panasonic Holdings Corp
Priority date: 2002-11-18
Filing date: 2003-11-14
Publication date: 2009-07-01
Anticipated expiration: 2023-11-14
Also published as: JP2004187283A

Abstract

PROBLEM TO BE SOLVED: To provide a microphone unit that operates stably even under multiple noise sources in a practical environment and achieves a high S/N. SOLUTION: A signal generating part generates a main signal and a noise reference signal. A judgement part judges whether or not the level ratio of the main signal to the noise reference signal is greater than a predetermined value. An adaptive filter part generates a signal indicating the target-sound component included in the noise reference signal generated by the signal generating part, and learns its filter coefficients only when the judgement part judges that the level ratio is greater than the predetermined value. A subtraction part subtracts the signal generated by the adaptive filter part from the noise reference signal. A noise suppressing part uses the main signal and the noise reference signal after subtraction by the subtraction part to suppress the noise component contained in the main signal. COPYRIGHT: (C)2004,JPO&NCIPI

Description

The present invention relates to a microphone device and a sound reproduction device, and more specifically to a microphone device and a sound reproduction device that detect sound arriving from a predetermined direction while suppressing noise.

The configurations of conventional microphone devices are described with reference to FIGS. 24 to 26.
FIG. 24 is a diagram showing the configuration of the microphone device of Conventional Example 1. In FIG. 24, the microphone device includes a first microphone unit 1010, a second microphone unit 1020, a signal addition unit 1030, a first signal subtraction unit 1031, a signal amplification unit 1050, an adaptive filter unit 1060, and a second signal subtraction unit 1062. Both microphone units 1010 and 1020 are arranged to face the front direction (the left direction in FIG. 24). The signal addition unit 1030 adds the signal output from the first microphone unit 1010 and the signal output from the second microphone unit 1020. The first signal subtraction unit 1031 subtracts the signal output from the second microphone unit 1020 from the signal output from the first microphone unit 1010. The signal amplification unit 1050 multiplies the signal output from the signal addition unit 1030 by 1/2. The adaptive filter unit 1060 receives the signal output from the first signal subtraction unit 1031 and outputs the result of filtering it with an adaptive filter. The second signal subtraction unit 1062 subtracts the signal output from the adaptive filter unit 1060 from the signal output from the signal amplification unit 1050. The output of the second signal subtraction unit 1062 is the output of the microphone device. The adaptive filter unit 1060 learns its filter coefficients based on the signal output from the second signal subtraction unit 1062 and the signal output from the first signal subtraction unit 1031.

Next, the operation of the microphone device of Conventional Example 1 is described. When sound arriving from the front direction is detected, the microphone units 1010 and 1020 output substantially equal signals. When sound arriving from a direction other than the front is detected, the microphone units 1010 and 1020 output signals that differ in phase. The output signals of the microphone units 1010 and 1020 are added by the signal addition unit 1030. The level of the summed signal is normalized by the signal amplification unit 1050, that is, its amplitude is multiplied by 1/2. In this way, a main signal containing the component of the sound arriving from the front direction is obtained. On the other hand, the output of the first signal subtraction unit 1031 has a directivity characteristic in which the main directivity axis points 90 degrees away from the front direction and the front direction is the blind spot of the directivity (that is, the front direction is the direction of minimum sensitivity). In other words, the signal output from the first signal subtraction unit 1031 is a noise reference signal that does not contain the component of the sound arriving from the front direction. The adaptive filter unit 1060 realizes adaptive directivity by using the main signal output from the signal amplification unit 1050 and the noise reference signal output from the first signal subtraction unit 1031. That is, a directivity blind spot is formed automatically toward a noise source located in one direction other than the front.
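The signal flow of Conventional Example 1 can be sketched in a few lines. The text only says that unit 1060 is an adaptive filter, so the normalized LMS (NLMS) update used here, the function name `conventional_example_1`, and the parameters `taps`, `mu`, and `eps` are assumptions made for illustration, not details taken from the patent.

```python
import math

def conventional_example_1(mic1, mic2, taps=8, mu=0.5, eps=1e-8):
    """Sum/difference front end plus an NLMS noise canceller (illustrative only)."""
    w = [0.0] * taps       # coefficients of adaptive filter unit 1060
    hist = [0.0] * taps    # most recent noise-reference samples, newest first
    out = []
    for a, b in zip(mic1, mic2):
        main = 0.5 * (a + b)   # addition unit 1030, then the 1/2 gain of unit 1050
        ref = a - b            # first subtraction unit 1031: zero for frontal sound
        hist = [ref] + hist[:-1]
        y = sum(wi * xi for wi, xi in zip(w, hist))
        e = main - y           # second subtraction unit 1062: device output
        norm = sum(x * x for x in hist) + eps
        # NLMS update (assumed): driven by the output and the noise reference
        w = [wi + mu * e * xi / norm for wi, xi in zip(w, hist)]
        out.append(e)
    return out

# Frontal target: both units see the same samples, so it passes unchanged.
target = [math.sin(0.3 * t) for t in range(200)]
front_out = conventional_example_1(target, target)

# Off-axis noise: unit 1020 sees it one sample later than unit 1010.
noise = [math.sin(0.2 * math.pi * t) for t in range(2000)]
side_out = conventional_example_1(noise, [0.0] + noise[:-1])
```

With a single off-axis source the filter can cancel it, which matches the "one direction" limitation stated in the text; several simultaneous sources would generally not all be cancelled by this structure.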

FIG. 25 is a diagram showing the configuration of the microphone device of Conventional Example 2. In FIG. 25, the microphone device includes a first microphone unit 1010, a second microphone unit 1020, a first adaptive filter unit 1040, a first signal delay unit 1041, a first signal subtraction unit 1042, a second adaptive filter unit 1060, a second signal delay unit 1061, and a second signal subtraction unit 1062.

The first adaptive filter unit 1040 receives the output signal of the second microphone unit 1020 and outputs the result of filtering it with an adaptive filter. The first signal delay unit 1041 delays the signal output from the first microphone unit 1010. The first signal subtraction unit 1042 subtracts the signal output from the first adaptive filter unit 1040 from the signal output from the first signal delay unit 1041. The first adaptive filter unit 1040 learns its filter coefficients based on the signal output from the first signal subtraction unit 1042 and the signal output from the second microphone unit 1020. The second signal delay unit 1061 delays the signal output from the first signal delay unit 1041. The second adaptive filter unit 1060 receives the signal output from the first signal subtraction unit 1042 and outputs the result of filtering it with an adaptive filter. The second signal subtraction unit 1062 subtracts the signal output from the second adaptive filter unit 1060 from the signal output from the second signal delay unit 1061, and the result is the output of the microphone device. The second adaptive filter unit 1060 learns its filter coefficients based on the signal output from the second signal subtraction unit 1062 and the signal output from the first signal subtraction unit 1042.

The operation of the microphone device of Conventional Example 2 is described below. The first adaptive filter unit 1040, the first signal delay unit 1041, and the first signal subtraction unit 1042 of Conventional Example 2 perform a cancelling operation on the sound waves that arrive at the microphone units 1010 and 1020. That is, the signal output from the first signal subtraction unit 1042 serves as the noise reference signal for the second adaptive filter unit 1060. The signal output from the first signal subtraction unit 1042 thus serves the same purpose as the signal output from the first signal subtraction unit 1031 shown in FIG. 24. The difference is that Conventional Example 1 has fixed directivity, whereas Conventional Example 2 can change its directivity by using an adaptive filter.

FIG. 26 is a diagram showing the configuration of the microphone device of Conventional Example 3. The microphone device shown in FIG. 26 includes a first unidirectional microphone unit 1011, a second unidirectional microphone unit 1012, a first FFT unit 1070, a second FFT unit 1080, a two-input spectral subtraction unit 1090, and a speech recognition unit 2000.

In FIG. 26, the first unidirectional microphone unit 1011 is arranged so that its main directivity axis faces the front direction. The second unidirectional microphone unit 1012 is arranged so that its main directivity axis faces the rear direction. The first FFT unit 1070 receives the signal output from the first unidirectional microphone unit 1011 and obtains its frequency spectrum. The second FFT unit 1080 receives the signal output from the second unidirectional microphone unit 1012 and obtains its frequency spectrum. The two-input spectral subtraction unit 1090 receives the signals output from the FFT units 1070 and 1080 and, in the power spectrum domain, subtracts the signal spectrum derived by the second FFT unit 1080 from the signal spectrum derived by the first FFT unit 1070, thereby outputting the spectrum of the target signal. The speech recognition unit 2000 performs speech recognition on the spectrum of the target signal output from the two-input spectral subtraction unit 1090.

The operation of the microphone device of Conventional Example 3 is described below. In Conventional Example 3, the first unidirectional microphone unit 1011 has a directivity characteristic that picks up the target sound in the front direction. The second unidirectional microphone unit 1012 has a directivity characteristic that mainly picks up noise. As a result, the main signal m1 is obtained from the first unidirectional microphone unit 1011 and the noise reference signal m2 is obtained from the second unidirectional microphone unit 1012. The FFT units 1070 and 1080 obtain the spectra of the main signal m1 and the noise reference signal m2. The two-input spectral subtraction unit 1090 estimates the power spectrum of the signal component by subtracting the power spectrum of the noise reference signal from the power spectrum of the main signal. In the one-input spectral subtraction method, the noise spectrum is estimated during time intervals in which the target sound is absent, on the assumption that the noise is stationary. The one-input spectral subtraction method can therefore suppress only stationary noise. In contrast, with the configuration of Conventional Example 3, which uses the two-input spectral subtraction method, the spectrum of the noise reference signal is always available from the second unidirectional microphone unit 1012, so non-stationary noise can also be suppressed. As described above, the microphone device of Conventional Example 3 suppresses non-stationary noise as well as stationary noise, and thereby improves the recognition rate of the subsequent speech recognition unit 2000. The device shown in FIG. 26 is intended for speech recognition. It can also be made into a microphone device by performing an IFFT in the final stage to return the spectrum to a time signal and forming a waveform signal with overlapping frames.
Japanese Patent No. 3084833; Bernard Widrow, "ADAPTIVE SIGNAL PROCESSING", Prentice Hall, 1985, pp. 414, 419, 423; Nakadai, Sugamura, Nakatsu, "Speech recognition in an automobile using a two-input noise removal method", IEICE Technical Report, 1989, SP89-81, pp. 41-48
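The two-input spectral subtraction of Conventional Example 3 can be illustrated for a single frame as follows. This is a minimal sketch under stated assumptions: the naive `dft` helper, the function name `two_input_spectral_subtraction`, the `floor` parameter, and the 16-sample test frame are all hypothetical, and windowing, frame overlap, and the IFFT stage mentioned in the text are left out.

```python
import cmath
import math

def dft(frame):
    """Naive discrete Fourier transform; fine for a short illustrative frame."""
    n = len(frame)
    return [sum(frame[t] * cmath.exp(-2j * math.pi * k * t / n) for t in range(n))
            for k in range(n)]

def two_input_spectral_subtraction(main_frame, ref_frame, floor=0.0):
    """Return the estimated target magnitude spectrum for one frame."""
    estimate = []
    for m, r in zip(dft(main_frame), dft(ref_frame)):
        power = abs(m) ** 2 - abs(r) ** 2              # subtraction in the power domain
        estimate.append(math.sqrt(max(power, floor)))  # clamp negative estimates
    return estimate

n = 16
target = [math.sin(2 * math.pi * 2 * t / n) for t in range(n)]   # target in bin 2
noise = [math.sin(2 * math.pi * 5 * t / n) for t in range(n)]    # noise in bin 5
m1 = [s + v for s, v in zip(target, noise)]   # main signal m1: target plus noise
m2 = noise                                    # noise reference m2: noise only
spectrum = two_input_spectral_subtraction(m1, m2)
```

In this idealized frame the noise reference contains no target sound, so the noise bin is removed and the target bin is untouched. If part of the target leaked into `m2`, the same subtraction would also remove target energy.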

With the configuration of Conventional Example 1 described above, a large noise suppression effect can be obtained in an environment where noise arrives from a single direction. However, the device of Conventional Example 1 cannot cope with noise arriving from multiple directions. Therefore, in a real noise environment in which noise sources exist simultaneously in various directions, the configuration of Conventional Example 1 achieves only a noise suppression effect equivalent to that of a conventional unidirectional microphone device.

In the configuration of Conventional Example 2, the noise reference signal is obtained by using the first adaptive filter. To operate the first adaptive filter stably in a real environment, it must be trained only when the speaker's voice is sufficiently louder than the surrounding noise. Consequently, the configuration of Conventional Example 2 provides no noise suppression effect until the filter has converged, and convergence is difficult in a noisy environment. Furthermore, like Conventional Example 1, the configuration of Conventional Example 2 cannot cope with multiple noise sources. In addition, because the device of Conventional Example 2 was invented to suppress wind noise, which is uncorrelated between the unit signals, it cannot restrict the direction of the target sound. That is, the loudest of the arriving sounds is treated as the target sound, and sound from a specific direction cannot be picked up with emphasis.

The configuration of Conventional Example 3 converts the main signal and the noise reference signal into spectra and suppresses noise by spectral subtraction in the power spectrum domain. This method can suppress noise even when noise sources exist in multiple directions at the same time. However, if even a small amount of the target sound is mixed into the noise reference signal, the sound quality of the processed speech degrades severely or the target sound itself is cancelled. In an actual sound field, even if the directivity blind spot of the unidirectional microphone unit is pointed toward the target sound, reflected waves can still wrap around and enter the unit. Furthermore, the blind spot of an ordinary microphone unit does not provide infinite attenuation but only about 10 to 15 dB, so the direct wave of the target sound may not be removed completely and may be mixed into the noise reference signal. In addition, spectral subtraction introduces a processing delay due to frame processing, so it cannot be used for applications such as simultaneous two-way calls or public address.
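To get a feel for how much target sound a 10 to 15 dB blind spot lets through, the attenuation can be converted to an amplitude ratio with the standard decibel relation. The helper name `residual_amplitude` is hypothetical and is used only for this worked example.

```python
def residual_amplitude(attenuation_db):
    """Amplitude ratio that remains after an attenuation given in decibels."""
    return 10 ** (-attenuation_db / 20)

leak_10db = residual_amplitude(10)   # about 0.32 of the target amplitude
leak_15db = residual_amplitude(15)   # about 0.18 of the target amplitude
```

Roughly a fifth to a third of the direct target amplitude therefore still reaches the noise reference, which is far from negligible for a subtraction-based method.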

Further, the conventional examples described above focus on suppressing additive noise, that is, noise other than the target sound. They cannot remove multiplicative noise, which arises when the target sound reaches the microphone after being reflected by surfaces such as walls, desks, and floors. Consequently, reflections in the sound field where the microphone device is actually used may distort the frequency characteristics of the target sound. For this reason, particularly in applications such as speech recognition, the conventional examples could not solve the problem of misrecognition caused by a mismatch in the matching performed during recognition.
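The frequency distortion caused by a reflection can be made concrete with a simple model: a direct wave plus one delayed, attenuated copy. The single-reflection model, the function name `direct_plus_reflection_gain`, and the values (1 ms delay, reflection gain 0.5) are assumptions chosen for illustration and do not come from the patent.

```python
import cmath
import math

def direct_plus_reflection_gain(freq_hz, delay_s, reflection=0.5):
    """Magnitude response of a direct wave plus one delayed reflection."""
    return abs(1 + reflection * cmath.exp(-2j * math.pi * freq_hz * delay_s))

peak = direct_plus_reflection_gain(0.0, 0.001)      # waves add in phase
notch = direct_plus_reflection_gain(500.0, 0.001)   # waves arrive half a cycle apart
```

Under these assumptions the gain swings between 1.5 and 0.5 across frequency, a comb-shaped response that multiplies the target spectrum rather than adding to it.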

Therefore, an object of the present invention is to provide a microphone device that operates stably even under multiple noise sources in an actual use environment and that achieves a high S/N.

Another object of the present invention is to provide a microphone device that suppresses both multiplicative noise caused by reflected waves of the target sound and additive noise caused by ambient noise.

Another object of the present invention is to generate, by a simple method, the main signal and the noise reference signal used in the noise suppression processing.

To achieve the above objects, the present invention employs the following configurations. The first invention is a microphone device that detects a target sound arriving from a target sound direction. The microphone device includes a signal generation unit, a determination unit, an adaptive filter unit, a subtraction unit, and a noise suppression unit. The signal generation unit generates a main signal indicating a result detected with sensitivity toward the target sound direction, and a noise reference signal indicating a result detected with the sensitivity blind spot directed toward the target sound direction. The determination unit determines whether or not a level ratio, which indicates the ratio of the signal level of the main signal to the signal level of the noise reference signal generated by the signal generation unit, is greater than a predetermined value. The adaptive filter unit filters the main signal generated by the signal generation unit with an adaptive filter, thereby generating a signal indicating the signal component of the target sound included in the noise reference signal generated by the signal generation unit, and learns its filter coefficients only when the determination unit determines that the level ratio is greater than the predetermined value. The subtraction unit subtracts, from the noise reference signal, the signal generated by the adaptive filter unit that indicates the signal component of the target sound included in the noise reference signal. The noise suppression unit suppresses the noise signal component included in the main signal by using the main signal and the noise reference signal after subtraction by the subtraction unit. The noise suppression unit includes a noise suppression filter coefficient calculation unit and a time-varying coefficient filter unit. The noise suppression filter coefficient calculation unit calculates, based on the main signal and the noise reference signal after subtraction by the subtraction unit, the filter coefficients of a noise suppression filter for suppressing signal components other than the target sound signal in the main signal. The time-varying coefficient filter unit filters the main signal using the filter coefficients calculated by the noise suppression filter coefficient calculation unit.
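The determination unit, adaptive filter unit, and subtraction unit of the first invention can be sketched as a gated adaptive filter. The text does not say how signal levels are measured or which adaptation rule is used, so the smoothed absolute-value level estimate, the NLMS update, the function name `remove_target_leak`, and all parameter values below are assumptions. The noise suppression unit (coefficient calculation and time-varying filter) is not covered by this sketch.

```python
import random

def remove_target_leak(main, ref, taps=4, mu=0.5, threshold=2.0,
                       alpha=0.9, eps=1e-8):
    """Subtract the target component from the noise reference (illustrative only)."""
    w = [0.0] * taps                 # adaptive filter coefficients
    hist = [0.0] * taps              # recent main-signal samples, newest first
    level_main = level_ref = 0.0     # smoothed signal levels (assumed estimator)
    cleaned = []
    for m, r in zip(main, ref):
        hist = [m] + hist[:-1]
        level_main = alpha * level_main + (1 - alpha) * abs(m)
        level_ref = alpha * level_ref + (1 - alpha) * abs(r)
        leak = sum(wi * xi for wi, xi in zip(w, hist))   # adaptive filter output
        e = r - leak                                     # subtraction unit
        # Determination unit: learn only while the level ratio exceeds the threshold.
        if level_main > threshold * (level_ref + eps):
            norm = sum(x * x for x in hist) + eps
            w = [wi + mu * e * xi / norm for wi, xi in zip(w, hist)]
        cleaned.append(e)
    return cleaned, w

rng = random.Random(0)
speech = [rng.uniform(-1.0, 1.0) for _ in range(2000)]   # stand-in for the target
leaky_ref = [0.0] + [0.3 * s for s in speech[:-1]]       # target leaking into the reference
cleaned, w = remove_target_leak(speech, leaky_ref)       # ratio high: filter learns

quiet_main = [0.1 * s for s in speech]                   # reference louder than main
_, w_frozen = remove_target_leak(quiet_main, speech)     # ratio low: no learning
```

Gating the update on the level ratio keeps the filter from adapting to noise when the target sound is not dominant, which is the stability concern raised for Conventional Example 2.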

Note that the "main signal indicating a result detected with sensitivity toward the target sound direction" covers not only the signal output from a microphone unit as it is, but also a signal obtained by applying predetermined processing to the signal detected by a microphone unit. That is, the main signal may be the signal itself output from a microphone unit whose main directivity axis is directed toward the target sound direction, or it may be a signal obtained by processing the signal output from a microphone unit (which may be omnidirectional or may have its main directivity axis directed in a predetermined direction). Similarly, the "noise reference signal indicating a result of detecting sound arriving from other directions with higher sensitivity than the target sound" may be the signal itself output from a microphone unit, or it may be a signal obtained by processing the signal output from a microphone unit.

The second invention is a microphone device that detects a target sound arriving from a target sound direction. The microphone device includes a signal generation unit, a determination unit, an adaptive filter unit, a subtraction unit, a reflection information calculation unit, and a reflection correction unit. The signal generation unit generates a main signal indicating a result detected with sensitivity toward the target sound direction, and a noise reference signal indicating a result detected with the sensitivity blind spot directed toward the target sound direction. The determination unit determines whether or not a level ratio, which indicates the ratio of the signal level of the main signal to the signal level of the noise reference signal generated by the signal generation unit, is greater than a predetermined value. The adaptive filter unit filters the main signal generated by the signal generation unit with an adaptive filter, thereby generating a signal indicating the signal component of the target sound included in the noise reference signal generated by the signal generation unit, and learns its filter coefficients only when the determination unit determines that the level ratio is greater than the predetermined value. The subtraction unit subtracts, from the noise reference signal, the signal generated by the adaptive filter unit that indicates the signal component of the target sound included in the noise reference signal. The reflection information calculation unit calculates information on the arrival time difference between the direct wave and the reflected wave of the target sound based on the filter coefficients of the adaptive filter unit. The reflection correction unit corrects, based on the information calculated by the reflection information calculation unit, the distortion of the frequency characteristics caused in the main signal by the reflected wave of the target sound.

In the third invention, the signal generation unit includes a first microphone unit and a second microphone unit. The first microphone unit is arranged with its main directivity axis directed toward the target sound direction. The second microphone unit is arranged with its directivity blind spot directed toward the target sound direction. The output signal of the first microphone unit is used as the main signal, and the output signal of the second microphone unit is used as the noise reference signal.

In the fourth invention, the microphone device further includes a signal delay unit. The signal delay unit is provided between the noise reference signal output of the signal generation unit and the subtraction unit, and delays the noise reference signal so that the convergence condition of the adaptive filter of the adaptive filter unit is satisfied.

In the fifth invention, the predetermined value can be changed.

In the sixth invention, the signal generation unit includes a first microphone unit, a second microphone unit, a delay unit, an amplification unit, a first subtraction unit, and a second subtraction unit. The second microphone unit has the same characteristics as the first microphone unit. The delay unit delays the signal output from the first microphone unit by a predetermined delay amount and outputs it. The amplification unit amplifies the signal output from the delay unit. The first subtraction unit generates the main signal by subtracting the signal amplified by the amplification unit from the signal output from the second microphone unit. The second subtraction unit generates the noise reference signal by subtracting the signal output from the delay unit from the signal output from the second microphone unit. The predetermined delay amount is set so that the blind-spot direction of the directivity of the noise reference signal output from the second subtraction unit points toward the target sound direction. The amplification factor of the amplification unit is set so that the main signal has higher sensitivity in the target sound direction than the noise reference signal.
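The delay-and-subtract signal generation of the sixth invention can be sketched as below. The whole-sample delay, the function name `generate_signals`, the gain of 0.5, and the test signal are assumptions made for illustration; the text only states how the delay amount and amplification factor are to be chosen.

```python
import math

def generate_signals(mic1, mic2, delay, gain):
    """Form the main and noise reference signals from two identical units."""
    main, ref = [], []
    for t, x2 in enumerate(mic2):
        delayed = mic1[t - delay] if t >= delay else 0.0   # delay unit
        main.append(x2 - gain * delayed)   # first subtraction unit, after the amplifier
        ref.append(x2 - delayed)           # second subtraction unit
    return main, ref

# Target sound reaches the first unit two samples before the second unit.
target = [math.sin(0.3 * t) for t in range(50)]
mic1 = target
mic2 = [0.0, 0.0] + target[:-2]
main, ref = generate_signals(mic1, mic2, delay=2, gain=0.5)
```

When the delay matches the travel time of the target sound between the two units, the target cancels in `ref` (the blind spot points at the target), while `main` keeps a scaled copy of it because the gain differs from 1.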

In the seventh invention, the microphone device further includes a setting unit that changes the predetermined delay amount set in the delay unit.

In the eighth invention, the signal generation unit includes a first microphone unit, a second microphone unit, and a synthesis unit. The second microphone unit has the same characteristics as the first microphone unit. Based on the signals output from the first and second microphone units, the synthesis unit generates the main signal so that it has sensitivity toward the target sound direction, and generates the noise signal component so that the sensitivity in the target sound direction is minimized.

In the ninth invention, the signal generation unit includes a first microphone unit, a second microphone unit, a signal addition unit, and a signal subtraction unit. The second microphone unit is arranged with its main directivity axis directed in a direction different from that of the first microphone unit. The signal addition unit generates the main signal by adding the signal output from the first microphone unit and the signal output from the second microphone unit. The signal subtraction unit generates the noise reference signal by subtracting one of the signal output from the first microphone unit and the signal output from the second microphone unit from the other.

また、第１０の発明では、信号生成部は、第１のマイクロホンユニットと、第２のマイクロホンユニットと、ステレオ信号生成部と、逆合成部と、合成部とを含んでいる。第２のマイクロホンユニットは、第１のマイクロホンユニットと同一の特性を有する。ステレオ信号生成部は、第１および第２のマイクロホンユニットに基づいて、右チャンネル信号と左チャンネル信号とからなるステレオ信号を生成する。逆合成部は、ステレオ信号に基づいて、各マイクロホンユニットから出力される各信号を生成する。合成部は、逆合成部によって生成された各信号に基づいて、目的音方向に対して感度を有して検出した結果を示す主信号と、目的音方向以外の他の方向から到来する音を目的音よりも高い感度で検出した結果を示す雑音参照信号とを生成する。 In the tenth invention, the signal generation unit includes a first microphone unit, a second microphone unit, a stereo signal generation unit, an inverse synthesis unit, and a synthesis unit. The second microphone unit has the same characteristics as the first microphone unit. The stereo signal generation unit generates a stereo signal consisting of a right channel signal and a left channel signal based on the first and second microphone units. The inverse synthesis unit reconstructs, from the stereo signal, the signals output from the respective microphone units. Based on the signals generated by the inverse synthesis unit, the synthesis unit generates a main signal indicating the result of detection with sensitivity in the target sound direction, and a noise reference signal indicating the result of detecting sound arriving from directions other than the target sound direction with higher sensitivity than the target sound.

また、第１１の発明では、信号生成部は、第１のマイクロホンユニットと、第２のマイクロホンユニットと、ステレオ信号生成部と、信号加算部と、信号減算部とを含んでいる。第２のマイクロホンユニットは、第１のマイクロホンユニットと同一の特性を有する。ステレオ信号生成部は、第１および第２のマイクロホンユニットに基づいて、右チャンネル信号と左チャンネル信号とからなるステレオ信号を生成する。信号加算部は、ステレオ信号の右チャンネル信号と左チャンネル信号とを加算することによって主信号を生成する。信号減算部は、ステレオ信号の右チャンネル信号および左チャンネル信号のいずれか一方から他方を減算することによって雑音参照信号を生成する。 In the eleventh aspect of the invention, the signal generation unit includes a first microphone unit, a second microphone unit, a stereo signal generation unit, a signal addition unit, and a signal subtraction unit. The second microphone unit has the same characteristics as the first microphone unit. The stereo signal generation unit generates a stereo signal including a right channel signal and a left channel signal based on the first and second microphone units. The signal adder generates a main signal by adding the right channel signal and the left channel signal of the stereo signal. The signal subtracting unit generates a noise reference signal by subtracting the other from one of the right channel signal and the left channel signal of the stereo signal.

また、第１２の発明では、マイクロホン装置は、反射情報算出部と、反射補正部とをさらに備えている。反射情報算出部は、適応フィルタ部のフィルタ係数に基づいて、目的音の直接波と反射波との到達時間差に関する情報を算出する。反射補正部は、反射情報算出部によって算出された情報に基づいて、目的音の反射波によって主信号に生じる周波数特性の歪を補正する。また、雑音抑圧部は、反射補正部による補正後の主信号と、減算部による減算後の雑音参照信号とを用いて、主信号に含まれる雑音の信号成分を抑圧する。 In the twelfth aspect, the microphone device further includes a reflection information calculation unit and a reflection correction unit. The reflection information calculation unit calculates information related to the arrival time difference between the direct wave and the reflected wave of the target sound based on the filter coefficient of the adaptive filter unit. The reflection correction unit corrects the distortion of the frequency characteristics generated in the main signal due to the reflected wave of the target sound based on the information calculated by the reflection information calculation unit. The noise suppression unit suppresses the signal component of the noise included in the main signal using the main signal after correction by the reflection correction unit and the noise reference signal after subtraction by the subtraction unit.

また、第１３の発明では、雑音抑圧フィルタ係数算出部は、第１の周波数分析部と、第２の周波数分析部と、パワスペクトル比演算部と、乗算部と、する係数算出部とを含んでいる。第１の周波数分析部は、主信号のパワスペクトルを算出する。第２の周波数分析部は、減算部による減算後の雑音参照信号のパワスペクトルを算出する。パワスペクトル比演算部は、判定部によってレベル比が所定の値よりも小さいと判定された場合にのみ、第１の周波数分析部によって算出されたパワスペクトルと、第２の周波数分析部によって算出されたパワスペクトルとのパワスペクトル比の時間平均を算出する。乗算部は、パワスペクトル比演算部によって算出されたパワスペクトル比の時間平均と、第２の周波数分析部によって算出されたパワスペクトルとを乗算する。係数算出部は、第１の周波数分析部によって算出されたパワスペクトルと、乗算部による乗算結果とに基づいて、雑音抑圧フィルタのフィルタ係数を算出する。 In the thirteenth invention, the noise suppression filter coefficient calculation unit includes a first frequency analysis unit, a second frequency analysis unit, a power spectrum ratio calculation unit, a multiplication unit, and a coefficient calculation unit. The first frequency analysis unit calculates the power spectrum of the main signal. The second frequency analysis unit calculates the power spectrum of the noise reference signal after subtraction by the subtraction unit. Only when the determination unit determines that the level ratio is smaller than the predetermined value, the power spectrum ratio calculation unit calculates the time average of the ratio between the power spectrum calculated by the first frequency analysis unit and the power spectrum calculated by the second frequency analysis unit. The multiplication unit multiplies the time average of the power spectrum ratio calculated by the power spectrum ratio calculation unit by the power spectrum calculated by the second frequency analysis unit. The coefficient calculation unit calculates the filter coefficients of the noise suppression filter based on the power spectrum calculated by the first frequency analysis unit and the multiplication result of the multiplication unit.
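
The data flow of the thirteenth invention can be sketched per frequency bin as below. Several details are assumptions made for illustration and are not stated in this passage: the class name `NoiseSuppressionCoefficients`, the exponential moving average used as the "time average", the Wiener-like gain formula `1 - estimated_noise / main_power`, and the parameters `alpha`, `floor`, and `eps`. The power spectra are taken as already computed inputs.

```python
class NoiseSuppressionCoefficients:
    """Sketch of the power-spectrum-ratio based coefficient calculation."""

    def __init__(self, bins, alpha=0.9, floor=0.0, eps=1e-12):
        self.ratio = [1.0] * bins   # time-averaged P_main / P_ref per bin
        self.alpha = alpha          # assumed smoothing constant
        self.floor = floor          # assumed lower limit for the gain
        self.eps = eps              # guards against division by zero

    def update(self, p_main, p_ref, vx):
        # Power spectrum ratio calculation unit: update the average only
        # when the determination result says the target sound is absent.
        if vx == 0:
            for k, (pm, pr) in enumerate(zip(p_main, p_ref)):
                r = pm / (pr + self.eps)
                self.ratio[k] = self.alpha * self.ratio[k] + (1.0 - self.alpha) * r
        # Multiplication unit: estimate of the noise power in the main signal.
        est_noise = [r * pr for r, pr in zip(self.ratio, p_ref)]
        # Coefficient calculation unit (assumed Wiener-like rule).
        return [max(self.floor, 1.0 - n / (pm + self.eps))
                for n, pm in zip(est_noise, p_main)]
```

The returned gains would then be applied to the main signal by the time-varying coefficient filter; freezing the ratio while the target sound is present keeps the target from being counted as noise.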

また、第１４の発明の音声再生装置は、音声記録部と、信号生成部と、判定部と、適応フィルタ部と、減算部と、雑音抑圧部と、再生部とを備えている。音声記録部は、少なくとも２種類のチャンネルの音声信号を記録する。信号生成部は、記録部に記録されている音声信号に基づいて、目的音方向に対して感度を有して検出した結果を示す主信号と、目的音方向に対して感度死角を向けて検出した結果を示す雑音参照信号とを生成する。判定部は、信号生成部によって生成された雑音参照信号の信号レベルに対する主信号の信号レベルの割合を示すレベル比が所定の値よりも大きいか否かを判定する。適応フィルタ部は、信号生成部によって生成された主信号を適応フィルタでフィルタリングすることによって、信号生成部によって生成された雑音参照信号に含まれる目的音の信号成分を示す信号を生成するとともに、判定部によってレベル比が所定の値よりも大きいと判定された場合のみ、フィルタ係数の学習を行う。減算部は、雑音参照信号から、適応フィルタ部によって生成された、雑音参照信号に含まれる目的音の信号成分を示す信号を減算する。雑音抑圧部は、主信号と、減算部による減算後の雑音参照信号とを用いて、主信号に含まれる雑音の信号成分を抑圧する。再生部は、雑音抑圧部によって雑音信号成分が抑圧された主信号を再生する。雑音抑圧部は、雑音抑圧フィルタ係数算出部と、時変係数フィルタ部とを含んでいる。雑音抑圧フィルタ係数算出部は、主信号と減算部による減算後の雑音参照信号とに基づいて、主信号から目的音の信号以外の信号成分を抑圧するための雑音抑圧フィルタのフィルタ係数を算出する。時変係数フィルタ部は、雑音抑圧フィルタ係数算出部によって算出されたフィルタ係数を反映して、主信号に対してフィルタリングを行う。 An audio reproduction device according to the fourteenth invention includes an audio recording unit, a signal generation unit, a determination unit, an adaptive filter unit, a subtraction unit, a noise suppression unit, and a reproduction unit. The audio recording unit records audio signals of at least two channels. Based on the audio signals recorded in the recording unit, the signal generation unit generates a main signal indicating the result of detection with sensitivity in the target sound direction, and a noise reference signal indicating the result of detection with the sensitivity blind spot directed toward the target sound direction. The determination unit determines whether or not a level ratio, which indicates the ratio of the signal level of the main signal to the signal level of the noise reference signal generated by the signal generation unit, is greater than a predetermined value. The adaptive filter unit filters the main signal generated by the signal generation unit with an adaptive filter, thereby generating a signal indicating the signal component of the target sound included in the noise reference signal generated by the signal generation unit, and learns the filter coefficients only when the determination unit determines that the level ratio is greater than the predetermined value. The subtraction unit subtracts, from the noise reference signal, the signal generated by the adaptive filter unit that indicates the signal component of the target sound included in the noise reference signal. The noise suppression unit suppresses the noise signal component included in the main signal using the main signal and the noise reference signal after subtraction by the subtraction unit. The reproduction unit reproduces the main signal in which the noise signal component has been suppressed by the noise suppression unit. The noise suppression unit includes a noise suppression filter coefficient calculation unit and a time-varying coefficient filter unit. The noise suppression filter coefficient calculation unit calculates, based on the main signal and the noise reference signal after subtraction by the subtraction unit, the filter coefficients of a noise suppression filter for suppressing signal components other than the target sound signal in the main signal. The time-varying coefficient filter unit filters the main signal using the filter coefficients calculated by the noise suppression filter coefficient calculation unit.

また、第１５の発明では、音声再生装置は、音声記録部に記録されている音声信号に関連する映像信号を記録する映像記録部と、映像記録部に記録されている映像信号を再生する映像再生部と、音を強調すべき方向の入力をユーザから受け付ける方向受付部とをさらに備えている。このとき、信号生成部は、方向受付部によって受け付けられた方向を目的音方向として主信号および雑音参照信号を生成する。 In the fifteenth invention, the audio reproduction device further includes a video recording unit that records a video signal related to the audio signal recorded in the audio recording unit, a video reproduction unit that reproduces the video signal recorded in the video recording unit, and a direction reception unit that receives from the user an input of the direction in which sound should be emphasized. In this case, the signal generation unit generates the main signal and the noise reference signal with the direction received by the direction reception unit as the target sound direction.

第１の発明によれば、雑音参照信号に含まれる目的音の信号成分が当該雑音参照信号から除去され、その後、主信号と雑音参照信号とに基づいて雑音の抑圧処理が行われる。従って、理想的な雑音参照信号を用いて雑音の抑圧処理を行うことができるので、高Ｓ／Ｎを実現することができる。また、第１の発明によれば、目的音以外の音はすべて雑音として抑圧することができる。従って、ある一方向の騒音だけでなく、全方向の雑音に対応することができる。 According to the first aspect, the signal component of the target sound included in the noise reference signal is removed from the noise reference signal, and then noise suppression processing is performed based on the main signal and the noise reference signal. Therefore, since noise suppression processing can be performed using an ideal noise reference signal, high S / N can be realized. According to the first invention, all sounds other than the target sound can be suppressed as noise. Therefore, not only noise in one direction but also noise in all directions can be handled.

また、第２の発明によれば、主信号に与える反射波の影響を補正することができるので、マイクロホン装置の周囲の音場に左右されずに安定した感度対周波数特性を有するマイクロホン装置を実現することができる。また、反射物による音質の変化がないので、特に音声認識用途では、認識率の改善効果が大きい。 According to the second invention, the influence of reflected waves on the main signal can be corrected, so a microphone device with a stable sensitivity versus frequency characteristic can be realized regardless of the sound field around the microphone device. In addition, since reflectors cause no change in sound quality, the improvement in recognition rate is large, particularly in speech recognition applications.

また、第３の発明によれば、主信号および雑音参照信号を容易に生成することができる。さらに、２つのマイクロホンユニットは互いに接触させるまで近接して配置できるので、マイクロホン装置を小型化することができる。 Further, according to the third invention, the main signal and the noise reference signal can be easily generated. Furthermore, since the two microphone units can be arranged close to each other until they come into contact with each other, the microphone device can be miniaturized.

また、第５の発明によれば、マイクロホン装置の収音範囲を目的音方向を中心として左右何度まで収音可能にするかを制御することができるようになる。従って、目的に応じた収音角度幅の設定を行ったり、ズームマイクの様に収音角度幅を可変にしたりすることができるようになる。 According to the fifth invention, it becomes possible to control up to how many degrees to the left and right of the target sound direction the microphone device collects sound. Therefore, the sound collection angle width can be set according to the purpose, or made variable as in a zoom microphone.

また、第６の発明によれば、主信号と雑音参照信号との感度特性が、目的音方向以外の方向でほぼ一致する指向性パターンが得られる。従って、後段の雑音抑圧処理における整合性が高まり、処理後の音声品質が改善される。 Further, according to the sixth aspect, a directivity pattern in which the sensitivity characteristics of the main signal and the noise reference signal substantially match in directions other than the target sound direction can be obtained. Therefore, consistency in the subsequent noise suppression process is improved, and the voice quality after the process is improved.

また、第７の発明によれば、遅延時間を変化させることによって、収音方向を制御することができる。 According to the seventh aspect, the sound collection direction can be controlled by changing the delay time.

また、第９の発明によれば、例えばワンポイントステレオマイクロホンから出力される信号を利用して、主信号および雑音参照信号を得ることができる。 According to the ninth aspect, the main signal and the noise reference signal can be obtained using, for example, a signal output from a one-point stereo microphone.

また、第１０および第１１の発明によれば、ステレオ信号を用いて主信号および雑音参照信号を得ることができる。 According to the tenth and eleventh aspects, the main signal and the noise reference signal can be obtained using the stereo signal.

また、第１２の発明によれば、加法性雑音である騒音と、乗法性雑音である反射波との双方を同時に抑圧することができる。従って、音場の影響を受けず、高Ｓ／Ｎでかつ常に平坦なマイクロホン周波数特性を実現することができる。 According to the twelfth invention, both ambient noise, which is additive noise, and reflected waves, which are multiplicative noise, can be suppressed simultaneously. Therefore, a microphone frequency characteristic that is unaffected by the sound field, has a high S/N, and is always flat can be realized.

（実施の形態１） (Embodiment 1)
まず、本発明の実施の形態１に係るマイクロホン装置について、図１〜図７を用いて説明する。図１は、実施の形態１に係るマイクロホン装置の構成を示すブロック図である。図１において、マイクロホン装置は、第１のマイクロホンユニット１と、第２のマイクロホンユニット２と、判定部１０と、適応フィルタ部２０と、信号減算部３０と、雑音抑圧フィルタ係数算出部４０と、時変係数フィルタ部５０とを備えている。 First, the microphone device according to Embodiment 1 of the present invention will be described with reference to FIGS. 1 to 7. FIG. 1 is a block diagram showing the configuration of the microphone device according to Embodiment 1. In FIG. 1, the microphone device includes a first microphone unit 1, a second microphone unit 2, a determination unit 10, an adaptive filter unit 20, a signal subtraction unit 30, a noise suppression filter coefficient calculation unit 40, and a time-varying coefficient filter unit 50.

図１において、第１のマイクロホンユニット１は、単一指向性マイクロホンユニットである。第１のマイクロホンユニット１の指向性主軸は、正面方向に向けられている。第２のマイクロホンユニット２は、双指向性マイクロホンユニットである。第２のマイクロホンユニット２の指向性主軸は、正面方向に直角な方向に向けられている。なお、マイクロホン装置は、所望の方向から到来する音を検出するものであり、以下においては、検出すべき音を目的音と呼び、当該所望の方向を目的音方向と呼ぶ。実施の形態１では、正面方向が目的音方向である。 In FIG. 1, the first microphone unit 1 is a unidirectional microphone unit. The main directivity axis of the first microphone unit 1 is directed in the front direction. The second microphone unit 2 is a bidirectional microphone unit. The main directivity axis of the second microphone unit 2 is oriented in a direction perpendicular to the front direction. The microphone device detects sound coming from a desired direction. In the following, the sound to be detected is called a target sound, and the desired direction is called a target sound direction. In the first embodiment, the front direction is the target sound direction.

判定部１０は、第１のマイクロホンユニット１から出力される信号ｍ１と、第２のマイクロホンユニット２から出力される信号ｍ２とを入力信号として、入力信号間のレベル比に従って目的音の到来の有無を判定する。適応フィルタ部２０は、フィルタ係数によって信号ｍ１をフィルタリングした信号を出力する。信号減算部３０は、適応フィルタ部２０から出力される信号を信号ｍ２から減算する。 The determination unit 10 takes the signal m1 output from the first microphone unit 1 and the signal m2 output from the second microphone unit 2 as input signals, and determines whether or not the target sound has arrived according to the level ratio between the input signals. The adaptive filter unit 20 outputs a signal obtained by filtering the signal m1 with its filter coefficients. The signal subtraction unit 30 subtracts the signal output from the adaptive filter unit 20 from the signal m2.

雑音抑圧フィルタ係数算出部４０は、信号ｍ１を主信号として入力し、第１の信号減算部３０から出力される信号ｍ３を雑音参照信号として入力する。雑音抑圧フィルタ係数算出部４０は、当該主信号および当該雑音参照信号を用いて雑音抑圧のためのフィルタ特性を示すフィルタ係数を計算する。計算されたフィルタ係数は、時変係数フィルタ部５０へ出力される。時変係数フィルタ部５０は、信号ｍ１を入力する。そして、入力した信号を、雑音抑圧フィルタ係数算出部４０によって計算されたフィルタ係数に従ってフィルタリングして出力する。 The noise suppression filter coefficient calculation unit 40 receives the signal m1 as a main signal, and inputs the signal m3 output from the first signal subtraction unit 30 as a noise reference signal. The noise suppression filter coefficient calculation unit 40 calculates a filter coefficient indicating filter characteristics for noise suppression using the main signal and the noise reference signal. The calculated filter coefficient is output to the time-varying coefficient filter unit 50. The time varying coefficient filter unit 50 receives the signal m1. The input signal is filtered according to the filter coefficient calculated by the noise suppression filter coefficient calculation unit 40 and output.

以上のように構成されたマイクロホン装置の動作について説明する。なお、以下の説明においては、特に説明がない場合は、目的音が到来する方向は正面方向であるとする。 The operation of the microphone device configured as described above will be described. In the following description, the direction in which the target sound arrives is the front direction unless otherwise specified.

図１において、第１のマイクロホンユニット１は、第２のマイクロホンユニット２に近接して配置される。各マイクロホンユニット１および２を近接して配置することによって、第２のマイクロホンユニット２は、第１のマイクロホンユニット１とほぼ同一位置で、目的音以外の音（すなわち騒音）を収音することができる。実施の形態１に係るマイクロホン装置は、第１のマイクロホンユニット１に混入する騒音を時変係数フィルタ部５０によって抑圧することによって、高Ｓ／Ｎの収音を実現するものである。その際に、信号ｍ２は雑音参照信号として用いられる。従って、各マイクロホンユニット１および２は、同一場所の音場の収音を行うことが理想的である。つまり、各マイクロホンユニット１および２の配置は、各マイクロホンユニット１および２の指向性の形成に互いに影響を与えないことを条件として、各マイクロホンユニット１および２を接触させて配置することが望ましい。そのため、実施の形態１では、各マイクロホンユニット１および２を互いに近接して配置しているのである。 In FIG. 1, the first microphone unit 1 is disposed close to the second microphone unit 2. By arranging the microphone units 1 and 2 close to each other, the second microphone unit 2 can pick up sounds other than the target sound (that is, noise) at substantially the same position as the first microphone unit 1. The microphone device according to Embodiment 1 realizes high S/N sound collection by using the time-varying coefficient filter unit 50 to suppress the noise mixed into the output of the first microphone unit 1. In doing so, the signal m2 is used as the noise reference signal. Ideally, therefore, the microphone units 1 and 2 pick up the sound field at the same location. In other words, provided that they do not affect the formation of each other's directivity, the microphone units 1 and 2 are preferably placed in contact with each other. For this reason, in Embodiment 1 the microphone units 1 and 2 are arranged close to each other.

また、実施の形態１では、マイクロホン装置の後段（雑音抑圧フィルタ係数算出部４０および時変係数フィルタ部５０）において、時変係数フィルタを用いた雑音抑圧処理方式を採用している。第２のマイクロホンユニット２に目的音が混入すると、当該方式の性質上、処理後の音声に歪みやレベル低下等の悪影響が発生する。従って、当該方式を用いる場合には、雑音参照信号への目的音の混入を如何に除去するかが課題となる。そこで、実施の形態１では、雑音参照信号への目的音の混入をできるだけ少なくすることを目的として、第２のマイクロホンユニット２の指向性の死角が正面方向を向くように構成している。なお、第２のマイクロホンユニット２として双指向性マイクロホンユニットを用いている理由は、双指向性マイクロホンユニットは、死角の方向や感度減衰量等の特性に関する製造のばらつきが単一指向性ユニット等他のマイクロホンユニットと比較して少ないという特徴があるからである。 In Embodiment 1, the subsequent stage of the microphone device (the noise suppression filter coefficient calculation unit 40 and the time-varying coefficient filter unit 50) employs a noise suppression method that uses a time-varying coefficient filter. When the target sound leaks into the second microphone unit 2, the nature of this method causes adverse effects such as distortion and level reduction in the processed sound. When this method is used, therefore, the problem is how to remove the target sound that leaks into the noise reference signal. In Embodiment 1, to keep the leakage of the target sound into the noise reference signal as small as possible, the second microphone unit 2 is arranged so that the blind spot of its directivity faces the front direction. A bidirectional microphone unit is used as the second microphone unit 2 because, compared with other microphone units such as unidirectional units, bidirectional units show less manufacturing variation in characteristics such as the direction of the blind spot and the amount of sensitivity attenuation.

なお、第２のマイクロホンユニット２を上記のように構成することによって、雑音参照信号への目的音の混入を抑えることができるが、雑音参照信号への目的音の混入を完全になくすことはできない。なぜなら、実際の使用環境では、マイクロホンユニットが取り付けられる筐体や、マイクロホン装置の周囲にある反射物等の音響的な影響によって、目的音の反射波が第２のマイクロホンユニット２によって検出されてしまうからである。また、目的音の反射波による影響の他、第２のマイクロホンユニット２の指向性の死角を正面方向に向けても、目的音の直接波がわずかながら検出されてしまう（目的音の消し残りがある）からである。以上のような理由で、信号ｍ２には目的音の成分が混入してしまう。そこで、実施の形態１では、判定部１０、適応フィルタ部２０、および第１の信号減算部３０によってキャンセラを構成する。このキャンセラによって、雑音参照信号へ混入する目的音の成分を除去する。これによって、理想的な雑音参照信号、すなわち、目的音が混入していない雑音参照信号を得ることができる。 Configuring the second microphone unit 2 as described above reduces the leakage of the target sound into the noise reference signal, but cannot eliminate it completely. One reason is that, in an actual usage environment, reflected waves of the target sound are detected by the second microphone unit 2 because of acoustic effects from the housing to which the microphone unit is attached, reflectors around the microphone device, and the like. Another reason is that, apart from the reflected waves, a small amount of the direct wave of the target sound is still detected even when the blind spot of the directivity of the second microphone unit 2 faces the front direction (a residue of the target sound remains). For these reasons, a target sound component leaks into the signal m2. In Embodiment 1, therefore, the determination unit 10, the adaptive filter unit 20, and the first signal subtraction unit 30 form a canceller. This canceller removes the target sound component that leaks into the noise reference signal. As a result, an ideal noise reference signal, that is, a noise reference signal free of the target sound, can be obtained.

また、図１において、第１のマイクロホンユニット１から出力される信号は、雑音の成分よりも目的音の成分の割合が高い。実施の形態１に係るマイクロホン装置は、上記キャンセラにおける波形の等化処理において、信号ｍ２に混入する目的音成分の信号に信号ｍ１を適応等化させる。つまり、雑音参照信号に含まれる目的音成分の信号に主信号を等化させる。これによって、キャンセラを精度よく動作させることができる。 In FIG. 1, the signal output from the first microphone unit 1 has a higher proportion of the target sound component than the noise component. The microphone device according to Embodiment 1 adaptively equalizes the signal m1 to the signal of the target sound component mixed in the signal m2 in the waveform equalization processing in the canceller. That is, the main signal is equalized to the signal of the target sound component included in the noise reference signal. Thereby, the canceller can be operated with high accuracy.

さらに、実施の形態１では、上記キャンセラの適応フィルタ部２０は、目的音が十分大きく発生している場合にのみ、適応フィルタの学習動作を行う。具体的には、目的音が騒音よりも大きいか否かが判定部１０によって検出される。適応フィルタ部２０は、判定部１０の検出結果に応じて、適応フィルタの学習動作を行う。これによって、適応フィルタ部２０のフィルタ係数を安定に収束させることができる。なお、判定部１０は、音の到来方向およびレベルの双方を検出する必要がある。判定部１０の詳細な構成については後述する（図２参照）。 Further, in the first embodiment, the adaptive filter unit 20 of the canceller performs the adaptive filter learning operation only when the target sound is sufficiently large. Specifically, the determination unit 10 detects whether the target sound is larger than the noise. The adaptive filter unit 20 performs an adaptive filter learning operation according to the detection result of the determination unit 10. Thereby, the filter coefficient of the adaptive filter unit 20 can be converged stably. In addition, the determination part 10 needs to detect both the arrival direction and level of a sound. The detailed configuration of the determination unit 10 will be described later (see FIG. 2).

次に、マイクロホン装置の各構成要素の詳細な構成とともに動作の詳細について説明する。図２は、図１に示す判定部の構成を示す図である。図２において、判定部１０は、第１の信号レベル算出部１１と、第２の信号レベル算出部１２と、信号除算部１３と、目的音到来判定部１４とを備えている。 Next, the details of the operation together with the detailed configuration of each component of the microphone device will be described. FIG. 2 is a diagram illustrating a configuration of the determination unit illustrated in FIG. 1. In FIG. 2, the determination unit 10 includes a first signal level calculation unit 11, a second signal level calculation unit 12, a signal division unit 13, and a target sound arrival determination unit 14.

図２において、第１の信号レベル算出部１１は、信号ｍ１を入力として、信号ｍ１の信号レベルの短時間平均を算出し、第１の信号レベルｘ１ａを出力する。第２の信号レベル算出部１２は、信号ｍ２を入力として、信号ｍ２の信号レベルの短時間平均を算出し、第２の信号レベルｘ２ａを出力する。信号除算部１３は、第１の信号レベルｘ１ａと第２の信号レベルｘ２ａとの信号比率（レベル比）を求める。具体的には、信号除算部１３は、Ｖａ＝ｘ１ａ／ｘ２ａの除算をすることによって信号比率Ｖａを出力する。目的音到来判定部１４は、信号除算部１３からの出力に基づいて、目的音が十分大きく発生しているか否か、すなわち、目的音が騒音よりも大きいか否かを判定する。具体的には、目的音到来判定部１４は、信号比率Ｖａと所定のしきい値ｔｈ１との大小関係を比べ、当該大小関係を示す判定結果Ｖｘを出力する。より具体的には、Ｖｘは、信号比率Ｖａが所定のしきい値ｔｈ１よりも大きいことを示す値（ここでは、“１”とする）と、信号比率Ｖａが所定のしきい値ｔｈ１以下であることを示す値（ここでは、“０”とする）という２値の値をとる。 In FIG. 2, the first signal level calculation unit 11 receives the signal m1 as an input, calculates the short-time average of the signal level of the signal m1, and outputs the first signal level x1a. The second signal level calculation unit 12 receives the signal m2 as an input, calculates a short-time average of the signal level of the signal m2, and outputs a second signal level x2a. The signal divider 13 obtains a signal ratio (level ratio) between the first signal level x1a and the second signal level x2a. Specifically, the signal dividing unit 13 outputs the signal ratio Va by dividing Va = x1a / x2a. Based on the output from the signal divider 13, the target sound arrival determination unit 14 determines whether the target sound is sufficiently large, that is, whether the target sound is larger than the noise. Specifically, the target sound arrival determination unit 14 compares the magnitude relationship between the signal ratio Va and the predetermined threshold value th1, and outputs a determination result Vx indicating the magnitude relationship. More specifically, Vx is a value indicating that the signal ratio Va is larger than the predetermined threshold th1 (here, “1”), and the signal ratio Va is equal to or less than the predetermined threshold th1. It takes a binary value, which is a value indicating that it is present (here, “0”).
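
The determination described above can be sketched as follows. This is a minimal illustration under assumptions: the text says only "short-time average of the signal level", so the mean absolute value over a window, the window length, and the small constant `eps` that avoids division by zero are choices made here, as are the function names.

```python
def short_time_level(samples, window):
    """Assumed level measure: mean absolute value of the latest samples."""
    recent = samples[-window:]
    return sum(abs(s) for s in recent) / len(recent)


def judge_target_sound(m1, m2, th1, window=256, eps=1e-12):
    """Return (Va, Vx) for the latest frame of signals m1 and m2."""
    x1a = short_time_level(m1, window)   # first signal level
    x2a = short_time_level(m2, window)   # second signal level
    va = x1a / (x2a + eps)               # signal ratio Va = x1a / x2a
    vx = 1 if va > th1 else 0            # 1: Va above th1, 0: otherwise
    return va, vx
```

Raising `th1` makes the decision stricter, so that only sound arriving close to the front direction yields `Vx = 1`.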

図２において、まず、θ０方向（正面方向）から到来する音が支配的である場合を考える。ここで、「θ０方向からの音が支配的である」とは、θ０方向から到来する音が他の方向から到来する音に比べて非常に大きく、他の方向から到来する音が無視できるほど小さいことを意味する。この場合、正面方向に一致するθ０方向は、第１のマイクロホンユニット１の最大感度方向であり、第２のマイクロホンユニットの最小感度の方向である。従って、第１の信号レベルｘ１ａの値は（後述する場合と比べて相対的に）大きく、第２の信号レベルｘ２ａの値は（後述する場合と比べて相対的に）小さくなる。従って、この場合、信号比率Ｖａ（＝ｘ１ａ／ｘ２ａ）は（後述する場合と比べて相対的に）大きな値となる。 In FIG. 2, first consider the case where the sound arriving from the θ0 direction (front direction) is dominant. Here, "the sound from the θ0 direction is dominant" means that the sound arriving from the θ0 direction is much louder than sound arriving from other directions, and that sound arriving from other directions is small enough to be ignored. In this case, the θ0 direction, which coincides with the front direction, is the direction of maximum sensitivity of the first microphone unit 1 and the direction of minimum sensitivity of the second microphone unit. Accordingly, the value of the first signal level x1a is large and the value of the second signal level x2a is small (both relative to the cases described later). In this case, therefore, the signal ratio Va (= x1a / x2a) takes a large value (again relative to the cases described later).

次に、θ１方向から到来する音が支配的である場合を考える。ここで、第１のマイクロホン１の指向特性は、指向性主軸がθ０方向に向けられた単一指向性である。また、第２のマイクロホン２の指向特性は、指向性主軸がθ２方向に向けられた双指向性である。従って、θ１方向から到来する音が支配的である場合、θ０方向から到来する音が支配的である場合に比べて、第１の信号レベルｘ１ａの値は減少し、第２の信号レベルｘ２ａの値は増加する。その結果、信号比率Ｖａは、θ０方向から到来する音が支配的である場合に比べて小さくなる。また、支配的である音の方向がθ１方向からθ２方向へと移った場合、第１の信号レベルｘ１ａの値はさらに減少し、第２の信号レベルｘ２ａの値はさらに増加する。その結果、信号比率Ｖａは、θ０方向から到来する音が支配的である場合に比べて小さくなる。 Next, consider the case where the sound coming from the θ1 direction is dominant. Here, the directivity characteristic of the first microphone 1 is unidirectional with the main directivity axis oriented in the θ0 direction. The directivity characteristic of the second microphone 2 is bi-directional with the directivity main axis directed in the θ2 direction. Therefore, when the sound arriving from the θ1 direction is dominant, the value of the first signal level x1a is decreased compared to the case where the sound arriving from the θ0 direction is dominant, and the second signal level x2a The value increases. As a result, the signal ratio Va is smaller than when the sound arriving from the θ0 direction is dominant. Further, when the direction of the dominant sound moves from the θ1 direction to the θ2 direction, the value of the first signal level x1a further decreases and the value of the second signal level x2a further increases. As a result, the signal ratio Va is smaller than when the sound arriving from the θ0 direction is dominant.

次に、θ３方向から到来する音が支配的である場合を考える。ここで、双方のマイクロホンユニット１および２についてθ３方向は指向性の死角となる方向である。第１の信号レベルｘ１ａおよび第２の信号レベルｘ２ａともに小さくなり、その結果、信号比率Ｖａは大きな値にはならない。 Next, consider the case where the sound coming from the θ3 direction is dominant. Here, with respect to both microphone units 1 and 2, the θ3 direction is a direction that becomes a directivity blind spot. Both the first signal level x1a and the second signal level x2a become small, and as a result, the signal ratio Va does not become a large value.
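
The three cases above can be checked numerically with idealized directivity patterns. The following sketch rests on assumptions that are not given in the text: an ideal cardioid for the first unit, an ideal figure-eight pattern for the second unit, the angles 0°, 45°, 90°, and 180° standing in for θ0 to θ3, and a residual sensitivity of 0.05 in place of a perfect null.

```python
import math


def cardioid(theta_deg):
    """Ideal unidirectional pattern with its main axis at 0 degrees."""
    return 0.5 * (1.0 + math.cos(math.radians(theta_deg)))


def figure_eight(theta_deg):
    """Ideal bidirectional pattern with its main axis at 90 degrees."""
    return abs(math.sin(math.radians(theta_deg)))


def level_ratio(theta_deg, residual=0.05):
    """Va = x1a / x2a for a single dominant source at theta_deg."""
    x1a = max(cardioid(theta_deg), residual)      # nulls are never perfect
    x2a = max(figure_eight(theta_deg), residual)
    return x1a / x2a


for name, angle in [("theta0", 0), ("theta1", 45), ("theta2", 90), ("theta3", 180)]:
    print(name, round(level_ratio(angle), 2))
```

Under these assumptions the ratio is about 20 at θ0, about 1.2 at θ1, 0.5 at θ2, and 1 at θ3, so only sound from the front direction produces a large Va.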

図３は、支配的である音の方向がθ１〜θ３方向である場合における音声検出の状態の例を示す図である。第１の信号レベルｘ１ａ、第２の信号レベルｘ２ａ、および信号比率Ｖａの波形は、図３に示す信号波形となる。ここで、しきい値ｔｈ１を図３に示すレベルに設定することによって、θ０方向の音が支配的であることを判定結果Ｖｘとして検出することができる。すなわち、しきい値ｔｈ１を図３に示すレベルに設定すると、θ０方向の音が支配的である場合のみ、判定結果Ｖｘの値が“１”となる。実施の形態１では、正面方向（θ０方向）から到来する音を目的音とするので、目的音が支配的であることをＶｘの値によって検出することができる。なお、θ０方向から到来する音のみならず、θ１方向から到来する音も目的音とする場合には、しきい値を図３に示すｔｈ２とすればよい。しきい値をｔｈ２とすれば、θ０方向の音だけでなく、θ１方向の音が支配的である場合にも、Ｖｘの値が“１”となる。 FIG. 3 is a diagram illustrating an example of a sound detection state in a case where the dominant sound direction is the θ1 to θ3 direction. The waveforms of the first signal level x1a, the second signal level x2a, and the signal ratio Va are the signal waveforms shown in FIG. Here, by setting the threshold th1 to the level shown in FIG. 3, it is possible to detect that the sound in the θ0 direction is dominant as the determination result Vx. That is, when the threshold value th1 is set to the level shown in FIG. 3, the value of the determination result Vx is “1” only when the sound in the θ0 direction is dominant. In Embodiment 1, since the sound coming from the front direction (θ0 direction) is the target sound, it can be detected from the value of Vx that the target sound is dominant. If the target sound is not only the sound arriving from the θ0 direction but also the sound arriving from the θ1 direction, the threshold value may be set to th2 shown in FIG. When the threshold is set to th2, the value of Vx is “1” not only in the θ0 direction sound but also in the θ1 direction sound.

次に、適応フィルタ部２０および信号減算部３０において、雑音参照信号（信号ｍ２）に混入する目的音を除去する動作について説明する。適応フィルタ部２０は、適応フィルタによって、信号ｍ２に含まれる目的音成分の信号に信号ｍ１を等化させる。つまり、適応フィルタ部２０は、信号ｍ２に含まれる目的音成分の信号を信号ｍ１から生成する。なお、適応フィルタの方式としては、例えばＬＭＳ法（学習同定法）等を用いることができる。信号減算部３０は、適応フィルタ部２０によって生成された信号を信号ｍ２から減算する。その結果、信号ｍ３は、目的音成分が除去された雑音参照信号となる。 Next, an operation of removing the target sound mixed in the noise reference signal (signal m2) in the adaptive filter unit 20 and the signal subtracting unit 30 will be described. The adaptive filter unit 20 equalizes the signal m1 to the signal of the target sound component included in the signal m2 by the adaptive filter. That is, the adaptive filter unit 20 generates a signal of the target sound component included in the signal m2 from the signal m1. As an adaptive filter method, for example, an LMS method (learning identification method) or the like can be used. The signal subtracting unit 30 subtracts the signal generated by the adaptive filter unit 20 from the signal m2. As a result, the signal m3 becomes a noise reference signal from which the target sound component is removed.

ここで、適応フィルタ部２０は、判定部１０による判定結果Ｖｘに応じて、フィルタ係数の学習を行うか否かを決定する。具体的には、判定部１０によって目的音が支配的であると判定された場合、すなわち、判定結果Ｖｘが“１”を示す場合、適応フィルタ部２０は学習を行う。一方、判定部１０によって目的音が支配的でないと判定された場合、すなわち、判定結果Ｖｘが“０”を示す場合、適応フィルタ部２０は学習を行わない。 Here, the adaptive filter unit 20 determines whether or not to learn the filter coefficient according to the determination result Vx by the determination unit 10. Specifically, when the determination unit 10 determines that the target sound is dominant, that is, when the determination result Vx indicates “1”, the adaptive filter unit 20 performs learning. On the other hand, when the determination unit 10 determines that the target sound is not dominant, that is, when the determination result Vx indicates “0”, the adaptive filter unit 20 does not perform learning.

まず、目的音が支配的である場合を考える。この場合、適応フィルタ部２０は学習を行う。ここで、目的音が支配的である場合、雑音は無視することができるので、第２のマイクロホンユニット２は雑音を検出せず、目的音の成分（目的音の反射波や、目的音の直接波の消し残り等の成分）のみを検出するとみなすことができる。つまり、信号ｍ２は、騒音の成分を含まず、目的音の成分のみを含むとみなすことができる。この場合においては、適応フィルタ部２０は、信号ｍ１をフィルタリングした結果として信号ｍ２を出力すればよい。つまり、信号ｍ３が０となるようにフィルタ係数の学習を行えばよい。この学習の結果、適応フィルタ部２０は、信号ｍ１に基づいて信号ｍ２に含まれる目的音成分の信号を生成するためのフィルタ係数を高い精度で得ることができる。 First, consider the case where the target sound is dominant. In this case, the adaptive filter unit 20 performs learning. Here, when the target sound is dominant, the noise can be ignored, so the second microphone unit 2 does not detect the noise, but the target sound component (the reflected wave of the target sound or the direct sound of the target sound). It can be considered that only components (such as unerased waves) are detected. That is, the signal m2 can be regarded as including only the target sound component without including the noise component. In this case, the adaptive filter unit 20 may output the signal m2 as a result of filtering the signal m1. That is, the filter coefficient may be learned so that the signal m3 becomes zero. As a result of this learning, the adaptive filter unit 20 can obtain a filter coefficient for generating a signal of the target sound component included in the signal m2 with high accuracy based on the signal m1.

Next, consider the case where the target sound is not dominant. In this case, the signal m2 contains, in addition to the target sound component, a noise component too large to ignore. Consequently, even if the adaptive filter unit 20 learns the filter coefficients so that the signal m3 becomes zero, it cannot obtain appropriate filter coefficients. That is, it cannot obtain filter coefficients for generating from the signal m1 the target sound component contained in the signal m2. Furthermore, learning in such a case may cause the filter coefficients to diverge. For these reasons, the adaptive filter unit 20 should not learn the filter coefficients, and it therefore does not perform learning when the target sound is not dominant.

As described above, by using the determination result of the determination unit 10, the adaptive filter is trained only when the target sound is loud compared with the ambient noise. This allows the adaptive filter unit 20 to converge its filter coefficients stably.
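The learning control described above can be sketched as follows. This is only an illustration: the text does not name the adaptation algorithm, so a normalized LMS (NLMS) update is assumed, and the function name `gated_nlms`, the tap count, and the step size `mu` are hypothetical.

```python
from typing import List, Sequence, Tuple


def gated_nlms(m1: Sequence[float], m2: Sequence[float], vx: Sequence[int],
               taps: int = 8, mu: float = 0.5,
               eps: float = 1e-8) -> Tuple[List[float], List[float]]:
    """Filter m1, subtract the result from m2, and adapt only while vx == 1.

    Returns (m3, w): the residual m3 = m2 - filter(m1) and the final
    coefficients. NLMS is an assumed algorithm, not one named in the text.
    """
    w = [0.0] * taps      # adaptive filter coefficients
    buf = [0.0] * taps    # most recent m1 samples, newest first
    m3: List[float] = []
    for x, d, learn in zip(m1, m2, vx):
        buf = [x] + buf[:-1]
        y = sum(wi * bi for wi, bi in zip(w, buf))  # estimated target component in m2
        e = d - y                                   # residual, i.e. signal m3
        m3.append(e)
        if learn == 1:                              # learn only when the target sound is dominant
            norm = sum(b * b for b in buf) + eps
            g = mu * e / norm
            w = [wi + g * bi for wi, bi in zip(w, buf)]
    return m3, w
```

While `vx` is 0 the coefficients are frozen, so the filter keeps producing its last learned estimate of the target component and only the subtraction continues.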

As described above, the microphone device according to Embodiment 1 first separates the target sound and the noise to some extent as preprocessing that uses the directivity characteristics of the microphone units 1 and 2. Then, using the canceller described above, it removes the target sound component that has leaked into the noise reference signal and that cannot be fully suppressed by the microphone units 1 and 2 alone. In this way, the microphone device according to Embodiment 1 can obtain an ideal noise reference signal.

If the noise reference signal were obtained with the canceller alone, without the preprocessing that uses the directivity characteristics of the microphone units 1 and 2, there would be the following disadvantages. First, because the target sound is difficult to detect in a noisy environment, the accuracy of the learning control deteriorates. Second, because the target sound is not emphasized by the directivity of the microphone units, the correlation of the learning signal (the target sound) decreases and the filter coefficients become difficult to converge.

Next, the operation of suppressing the noise component in the main signal (signal m1) by means of the noise suppression filter coefficient calculation unit 40 and the time-varying coefficient filter unit 50 will be described. A configuration that performs two-input spectral subtraction can also provide a noise suppression effect similar to that of the noise suppression filter coefficient calculation unit 40 and the time-varying coefficient filter unit 50. However, spectral subtraction requires frame processing to convert the spectrum back into a waveform signal, which introduces a processing delay. Possible ways to reduce the signal delay of the frame processing include shortening the frame length and increasing the frame overlap. However, the former is impractical because it lowers the frequency resolution, and the latter because it increases the amount of processing. Embodiment 1 therefore adopts a configuration using a time-varying coefficient filter, which is a method with little processing delay.

FIG. 4 is a diagram illustrating a configuration example of the noise suppression filter coefficient calculation unit 40. In FIG. 4, the noise suppression filter coefficient calculation unit 40 includes a first frequency analysis unit 41, a second frequency analysis unit 42, a spectrum ratio calculation unit 43, a signal averaging unit 44, a signal multiplication unit 45, a filter transfer characteristic estimation unit 46, and an impulse response design unit 47.

In FIG. 4, the first frequency analysis unit 41 calculates the power spectrum X(ω) of the signal m1, which is the main signal. The second frequency analysis unit 42 calculates the power spectrum N1(ω) of the signal m3, which is the noise reference signal. Each of the frequency analysis units 41 and 42 can be realized with any known technique capable of deriving the power of the frequency components, such as an FFT, a filter bank, a wavelet transform, or a DCT.

The spectrum ratio calculation unit 43 receives the power spectrum X(ω) calculated by the first frequency analysis unit 41 and the power spectrum N1(ω) calculated by the second frequency analysis unit 42, and derives the spectrum ratio H(ω) = X(ω) / N1(ω). The signal averaging unit 44 receives the spectrum ratio H(ω) derived by the spectrum ratio calculation unit 43 and the determination result Vx from the determination unit 10. It then calculates, for each frequency component, the time average Ha(ω) of the spectrum ratio over the periods in which the ambient noise is dominant over the target sound (that is, when the value of Vx is "0"). The signal multiplication unit 45 multiplies, for each frequency component, the power spectrum N1(ω) calculated by the second frequency analysis unit 42 by the time average Ha(ω) calculated by the signal averaging unit 44, and outputs the product as Nx(ω). Because of differences in the directivity patterns, the characteristics of the microphone units, and similar causes, the shape and level of the spectrum of the noise component (the component other than the target sound) contained in the main signal spectrum X(ω) are not necessarily equal to the shape and level of the noise reference signal spectrum N1(ω). The spectrum ratio calculation unit 43, the signal averaging unit 44, and the signal multiplication unit 45 described above form a configuration for matching the noise reference signal spectrum N1(ω) to the spectrum of the noise component contained in the main signal spectrum X(ω). Accordingly, Nx(ω) obtained as the multiplication result of the signal multiplication unit 45 corresponds to the noise component contained in the main signal spectrum X(ω). This Nx(ω) is therefore referred to as the estimated noise spectrum Nx(ω).
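The roles of units 43, 44, and 45 can be sketched as follows. This is a minimal illustration: the text says only that a time average is taken while Vx is "0", so the exponential moving average, its smoothing constant `alpha`, the initial value of Ha(ω), and the class name `NoiseSpectrumEstimator` are all assumptions.

```python
from typing import List, Sequence


class NoiseSpectrumEstimator:
    """Per-bin estimate Nx = Ha * N1, with Ha averaged only while Vx == 0."""

    def __init__(self, bins: int, alpha: float = 0.9, floor: float = 1e-12):
        self.ha = [1.0] * bins   # time-averaged spectrum ratio Ha (initial value assumed)
        self.alpha = alpha       # smoothing constant of the assumed exponential average
        self.floor = floor       # guards against division by zero

    def update(self, x_pow: Sequence[float], n1_pow: Sequence[float],
               vx: int) -> List[float]:
        if vx == 0:  # ambient noise dominant: update the averaged ratio
            for k, (x, n1) in enumerate(zip(x_pow, n1_pow)):
                h = x / max(n1, self.floor)              # spectrum ratio H = X / N1
                self.ha[k] = self.alpha * self.ha[k] + (1.0 - self.alpha) * h
        # estimated noise spectrum Nx = Ha * N1 for the current frame
        return [ha * n1 for ha, n1 in zip(self.ha, n1_pow)]
```

Freezing Ha(ω) while the target sound is dominant keeps the target's own energy out of the ratio, so Nx(ω) continues to track only the noise part of X(ω).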

The filter transfer characteristic estimation unit 46 receives the power spectrum X(ω) calculated by the first frequency analysis unit 41 and the estimated noise spectrum Nx(ω) calculated by the signal multiplication unit 45, and calculates the transfer characteristic Hw(ω) of the noise suppression filter. The transfer characteristic Hw(ω) can be obtained, for example, on the basis of the Wiener filter method as Hw(ω) = (X(ω) − Nx(ω)) / X(ω).
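The Wiener-type expression above can be evaluated per frequency bin as in the following sketch. Clamping the gain to a lower bound when Nx(ω) exceeds X(ω) is an assumption added here for numerical safety and is not stated in the text; `wiener_gain`, `floor`, and `eps` are hypothetical names.

```python
from typing import List, Sequence


def wiener_gain(x_pow: Sequence[float], nx_pow: Sequence[float],
                floor: float = 0.0, eps: float = 1e-12) -> List[float]:
    """Hw = (X - Nx) / X per bin, limited below by `floor` (assumed safeguard)."""
    gains = []
    for x, nx in zip(x_pow, nx_pow):
        if x <= eps:
            gains.append(floor)                     # no signal power in this bin
        else:
            gains.append(max(floor, (x - nx) / x))  # Wiener-type suppression gain
    return gains
```

For example, a bin with X = 4 and Nx = 1 receives a gain of 0.75, while a bin whose estimated noise exceeds its total power is held at the floor.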

The impulse response design unit 47 takes the transfer characteristic Hw(ω) calculated by the filter transfer characteristic estimation unit 46 as a target characteristic, and outputs filter coefficients hw(n) that approach the target characteristic asymptotically, sample by sample.

The time-varying coefficient filter unit 50 filters the signal m1 according to the filter coefficients hw(n) output from the impulse response design unit 47, and generates the output signal y of the microphone device. Specific configuration examples of the time-varying coefficient filter unit 50 are described below with reference to FIGS. 5 and 6.

FIG. 5 is a diagram illustrating a configuration example of the time-varying coefficient filter unit 50. In FIG. 5, the time-varying coefficient filter unit 50 includes n signal delay units, n+1 signal amplification units, and n signal addition units. FIG. 5 shows only the first signal delay unit 501, the second signal delay unit 502, the nth signal delay unit 503, the first signal amplification unit 504, the second signal amplification unit 505, the nth signal amplification unit 506, the first signal addition unit 508, and the nth signal addition unit 509.

In FIG. 5, the signal delay units are connected in cascade, and each delays its input signal by one sample. Each signal amplification unit amplifies its input signal and outputs the result. The first signal amplification unit 504 amplifies the signal m1 input to the time-varying coefficient filter unit 50. The second signal amplification unit 505 amplifies the signal output from the first signal delay unit 501. The signal amplification units after the second signal amplification unit 505 operate in the same way. That is, the (i+1)th signal amplification unit amplifies the signal output from the ith signal delay unit (i is an integer from 1 to n). The first signal addition unit 508 adds the signal output from the first signal amplification unit 504 and the signal output from the second signal amplification unit 505. The second signal addition unit (not shown) adds the signal output from the first signal addition unit 508 and the signal output from the third signal amplification unit (not shown). The signal addition units after the second signal addition unit operate in the same way. That is, the jth signal addition unit adds the signal output from the (j−1)th signal addition unit and the signal output from the (j+1)th signal amplification unit (j is an integer from 2 to n). The signal output from the nth signal addition unit 509 is the output signal y. The configuration shown in FIG. 5 is that of a general FIR filter, and the coefficients of the first to (n+1)th signal amplification units change according to the filter coefficients hw(n) from the impulse response design unit 47.
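The FIR structure of FIG. 5 can be sketched as a direct-form filter whose coefficients are supplied anew at every sample. The class name `TimeVaryingFIR` and its interface are hypothetical and serve only to illustrate the structure.

```python
from typing import List, Sequence


class TimeVaryingFIR:
    """Direct-form FIR filter with n one-sample delays and n+1 gains per sample."""

    def __init__(self, n_delays: int):
        self.state: List[float] = [0.0] * n_delays  # outputs of the cascaded delay units

    def step(self, x: float, hw: Sequence[float]) -> float:
        if len(hw) != len(self.state) + 1:
            raise ValueError("expected n+1 coefficients for n delay units")
        taps = [x] + self.state                     # input followed by the delayed samples
        y = sum(h * t for h, t in zip(hw, taps))    # amplify each tap and sum
        self.state = taps[:-1]                      # shift the delay line by one sample
        return y
```

Because `hw` is passed on every call, the coefficients can follow the impulse response design unit sample by sample without any frame buffering, which is the source of the low delay noted in the text.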

FIG. 6 is a diagram illustrating another configuration example of the time-varying coefficient filter unit 50. In FIG. 6, the time-varying coefficient filter unit 50 includes n band-pass filters, n signal amplification units, and a signal addition unit 517. FIG. 6 shows only the first band-pass filter 511, the second band-pass filter 512, the nth band-pass filter 513, the first signal amplification unit 514, the second signal amplification unit 515, the nth signal amplification unit 516, and the signal addition unit 517.

In FIG. 6, the band-pass filters are provided in parallel downstream of the input, and divide the band of the signal m1 input to the time-varying coefficient filter unit 50 into n bands. Each signal amplification unit amplifies the signal output from the corresponding band-pass filter. The signal addition unit 517 adds the signals output from the signal amplification units and outputs the sum as the output signal y. The gain of each signal amplification unit can be determined from the transfer characteristic Hw(ω) output from the filter transfer characteristic estimation unit 46. This configuration also provides the same effect as that of FIG. 5.

FIG. 7 is a diagram showing specific examples of the signals shown in FIG. 1. Specifically, it shows examples of the signal m1 output from the first microphone unit 1, the signal m2 output from the second microphone unit 2, the signal m3 output from the first signal subtraction unit 30, and the output signal y output from the time-varying coefficient filter unit 50. As shown in FIG. 7, the signal m3 is the signal m2 with the influence of reflected sound and the like removed, and contains only components other than the target sound, that is, only noise components. Furthermore, by filtering in the time-varying coefficient filter unit 50 using the main signal m1 and the noise reference signal m3, only the target sound can be extracted as the output signal y. As is apparent from a comparison between m1, which is the output of a conventional directional microphone unit, and the output signal y of the microphone device of Embodiment 1, the microphone device of Embodiment 1 can greatly suppress ambient noise regardless of whether or not the target sound is being produced.

Depending on the positional relationship between the first microphone unit 1 and the second microphone unit 2, and on the circuits provided downstream of the microphone units 1 and 2, a signal delay unit may be provided between the signal subtraction unit 30 and the second microphone unit 2 in order to satisfy causality for convergence of the adaptive filter. As a guideline, the delay of this signal delay unit is set to at least the distance between the microphone units 1 and 2 divided by the speed of sound.
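As a small worked example of this guideline, the delay can be converted to a whole number of samples as sketched below. The sampling rate, the microphone spacing, and the speed of sound of 340 m/s are assumed values, and `min_delay_samples` is a hypothetical helper.

```python
import math


def min_delay_samples(distance_m: float, fs_hz: float, c: float = 340.0) -> int:
    """Smallest whole number of samples covering distance / speed of sound."""
    delay_s = distance_m / c            # acoustic travel time between the two units
    return math.ceil(delay_s * fs_hz)   # round up so the delay is not shorter than the guideline


# Example with assumed values: units 34 mm apart, sampled at 16 kHz
print(min_delay_samples(0.034, 16000.0))
```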

In Embodiment 1, a unidirectional microphone unit is used as the first microphone unit 1, but an omnidirectional microphone or a super-directional microphone may be used instead.

In the above description, the determination unit 10 outputs a binary value as the determination result Vx. Alternatively, the determination unit 10 may output a multi-valued signal ratio Va. In this case, the adaptive filter unit 20 changes its learning speed according to the determination result (signal ratio Va). Specifically, when the signal ratio Va is greater than a threshold, the adaptive filter unit 20 increases the learning speed as the signal ratio Va increases. More specifically, the adaptive filter unit 20 brings the value of the step gain parameter closer to 0.5 as the signal ratio Va increases. On the other hand, when the signal ratio Va is equal to or less than the threshold, the adaptive filter unit 20 does not perform learning. More specifically, it sets the value of the step gain parameter to 0.
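One possible mapping from the signal ratio Va to the step gain parameter is sketched below. The text fixes only the end behavior (0 at or below the threshold, approaching 0.5 as Va grows), so the particular curve used here, the function name `step_gain`, and the requirement of a positive threshold are assumptions.

```python
def step_gain(va: float, threshold: float, mu_max: float = 0.5) -> float:
    """Step gain that is 0 at or below the threshold and rises toward mu_max.

    The curve mu_max * (1 - threshold / va) is an assumed example; any
    monotonically increasing mapping bounded by mu_max would fit the text.
    """
    if threshold <= 0.0:
        raise ValueError("threshold must be positive in this sketch")
    if va <= threshold:
        return 0.0                               # target sound not dominant: no learning
    return mu_max * (1.0 - threshold / va)       # approaches mu_max as va increases
```

With a threshold of 2, a ratio of 4 gives a step gain of 0.25, and larger ratios move the value closer to 0.5 without reaching it.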

As described above, the microphone device according to Embodiment 1 can obtain an ideal noise reference signal even in a noisy environment and in a reflective sound field. Therefore, the noise suppression unit using the main signal and the noise reference signal can greatly improve the S/N of sound pickup compared with a conventional directional microphone. Furthermore, by adopting a method using a time-varying coefficient filter for noise suppression, the microphone device according to Embodiment 1 can reduce the processing delay compared with the case of using spectral subtraction. The microphone device according to Embodiment 1 can therefore also be applied to uses that require low-delay processing, such as sound reinforcement and telephony.

(Embodiment 2)
Next, a microphone device according to Embodiment 2 will be described with reference to FIGS. 8 and 9. The microphone device according to Embodiment 1 is intended to suppress the noise that is mixed in when the target sound is detected. The microphone device according to Embodiment 2 is intended to correct the frequency characteristic distortion of the target sound that is caused by detecting reflected waves of the target sound.

In FIG. 8, the microphone device includes a first microphone unit 1, a second microphone unit 2, a determination unit 10, an adaptive filter unit 20, a signal subtraction unit 30, a reflection information calculation unit 60, and a reflection correction unit 70. In FIG. 8, components that are the same as those in Embodiment 1 are given the same reference numerals as in FIG. 1, and their detailed description is omitted.

In FIG. 8, the filter coefficients of the adaptive filter unit 20 are input to the reflection information calculation unit 60. Using the input filter coefficients, the reflection information calculation unit 60 estimates the presence or absence of a reflecting object, its distance, and its degree of influence. The reflection correction unit 70 receives the signal m1 and, based on the estimation result of the reflection information calculation unit 60, corrects the frequency characteristic distortion that reflection of the target sound has produced in the signal m1.

The operation of the microphone device according to Embodiment 2 is described below.

In the microphone device shown in FIG. 8, the signal m1 is the main signal. When the first microphone unit 1 is unidirectional, its directivity is not sharp enough to remove reflected waves of the target sound. Therefore, when a reflecting object is present near the microphone device, reflected waves are picked up together with the direct wave of the target sound, and the frequency characteristic of the detected sound is disturbed by interference between the direct wave and the reflected waves. The microphone device according to Embodiment 2 exploits the fact that information about the reflected waves appears in the filter coefficients of the adaptive filter unit 20 to correct the frequency characteristic distorted by reflection of the target sound. This makes automatic correction of the frequency characteristic of the detected sound possible.

As described above, the adaptive filter unit 20 generates a signal of the target sound component that leaks into the signal m2, that is, the residual of the target sound left uncancelled because of imperfect directivity, and the reflected wave component of the target sound. In other words, the transfer characteristic (impulse response) from the signal m1, which contains much of the direct wave component of the target sound, to the signal m2, which contains much of the reflected wave component of the target sound, is expressed in the filter coefficients of the adaptive filter unit 20. Therefore, by detecting the peak of these filter coefficients, it is possible to determine the time difference dt (sec) between the arrival of the direct wave and the arrival of the reflected wave of the target sound at the position of the microphone unit, as well as the peak level Lr representing the reflected wave, that is, the strength of the reflection. Furthermore, the time difference dt gives the difference in length between the path along which the reflected wave arrives and the path along which the direct wave arrives, namely dt × c (where c is the speed of sound).
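The peak search described above can be sketched as follows. The sketch assumes that the coefficient index corresponding to the direct wave is known (`direct_index`), that lags shorter than `min_lag` samples are excluded from the search, and that the speed of sound is 340 m/s; none of these values comes from the text, and `reflection_from_coefficients` is a hypothetical name. The search uses absolute values so that a peak of either polarity is found.

```python
from typing import Sequence, Tuple


def reflection_from_coefficients(h: Sequence[float], fs_hz: float,
                                 direct_index: int = 0, min_lag: int = 1,
                                 c: float = 340.0) -> Tuple[float, float, float]:
    """Return (dt, lr, path_difference) from adaptive filter coefficients h.

    dt is the lag of the largest-magnitude coefficient after the direct path,
    lr is that magnitude, and path_difference is dt * c.
    """
    start = direct_index + min_lag
    if start >= len(h):
        raise ValueError("search range is empty")
    k = max(range(start, len(h)), key=lambda i: abs(h[i]))  # peak by absolute value
    dt = (k - direct_index) / fs_hz    # time difference in seconds
    lr = abs(h[k])                     # peak level of the reflected wave
    return dt, lr, dt * c
```

When the geometry of the environment is known in advance, narrowing the index range passed to the search would reduce the chance of picking a spurious peak.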

For sound at the frequency whose wavelength equals this path difference (the wavelength λ satisfies λ = dt × c), the direct wave and the reflected wave add in phase, so the sound pressure level detected by the microphone unit rises. Conversely, for sound at the frequency whose half wavelength equals the path difference (the wavelength λ satisfies λ/2 = dt × c), the direct wave and the reflected wave are in opposite phase, so the sound pressure level detected by the microphone unit falls and a dip appears in the frequency characteristic of the main signal. If complete reflection occurs at the reflecting surface, the signal output from the first microphone unit 1 exhibits a comb-filter-like frequency characteristic in which the harmonics of the fundamental frequency fa (= c/λ = 1/dt) are emphasized.
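The peak and dip frequencies can be checked with a short derivation. It assumes a single reflection whose amplitude relative to the direct wave is r (a symbol introduced here for illustration, with r = 1 corresponding to complete reflection) and a source signal s(t).

```latex
\begin{align}
x(t) &= s(t) + r\, s(t - \mathit{dt}), \\
|H(f)|^2 &= \left| 1 + r\, e^{-j 2\pi f\, \mathit{dt}} \right|^2
          = 1 + r^2 + 2r \cos(2\pi f\, \mathit{dt}), \\
|H(f)|_{\max} &= 1 + r \quad \text{at } f = \frac{k}{\mathit{dt}} = k f_a,
          \qquad k = 0, 1, 2, \dots \\
|H(f)|_{\min} &= 1 - r \quad \text{at } f = \frac{2k + 1}{2\,\mathit{dt}} = (2k + 1) f_b,
          \qquad k = 0, 1, 2, \dots
\end{align}
```

Under this model the peaks fall on multiples of fa = 1/dt and the dips on odd multiples of fb = 1/(2·dt), and for r = 1 the dips reach zero, which gives the comb-filter shape.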

FIG. 9 is a diagram explaining the difference in the internal state of the microphone device between the case with a reflecting object and the case without one. For each case, FIG. 9 shows the positional relationship among the microphone unit, the target sound source (speaker), and the reflecting object, the values of the adaptive filter coefficients hadf(n) in the adaptive filter unit 20, and the frequency characteristic of the signal m1.

In FIG. 9, when there is no reflecting object near the speaker and the microphone unit, as shown in (a1), the influence of reflected waves does not appear in the filter coefficients of the adaptive filter unit 20, as shown in (a2). Furthermore, as shown in (a3), the frequency characteristic of the main signal is relatively flat. On the other hand, when there is a reflecting object near the speaker and the microphone, as shown in (b1), the filter coefficients of the adaptive filter unit 20 take a large value at the position corresponding to the time difference dt, as shown in (b2). Furthermore, as shown in (b3), the frequency characteristic of the main signal is distorted according to the positional relationship among the microphone, the target sound source, and the reflecting object.

From the above, the time difference dt and the degree of influence Lr can be calculated from the coefficient peak of the adaptive filter. Using these, the amount of correction needed for the frequency characteristic distorted by the reflected wave can be estimated. In practice, particularly in the high-frequency range, complete reflection cannot be assumed at the reflecting surface. When complete reflection cannot be assumed, one option is to design a deconvolution filter under an assumed reflection characteristic of the reflecting surface. Alternatively, as a simplification that considers only the low-frequency characteristic, the correction gain can be calculated, for example, by the following expressions for frequencies such as the frequency at which one wavelength equals the path difference (fa = 1/dt) and the frequency at which a half wavelength equals the path difference (fb = 1/(2·dt)).
Center frequency fa: correction gain = −β1 · 20 log(1 + α1 · Lr) (dB)
Center frequency fb: correction gain = +β2 · 20 log(1 − α2 · Lr) (dB)
In this case, the correction characteristic Hr(ω) of the reflection correction unit 70 can be realized by an equalizer whose center frequency, bandwidth, and gain can be adjusted based on the information from the reflection information calculation unit 60.
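The two expressions can be evaluated as written with the following sketch. The constants α1, β1, α2, and β2 are tuning parameters for which the text gives no values, so they are required arguments here. The logarithm is assumed to be base 10, as is usual for decibel values, and `correction_gains_db` is a hypothetical name.

```python
import math
from typing import Dict


def correction_gains_db(dt: float, lr: float, alpha1: float, beta1: float,
                        alpha2: float, beta2: float) -> Dict[str, float]:
    """Center frequencies and correction gains (dB) from dt and Lr, as written."""
    if dt <= 0.0:
        raise ValueError("dt must be positive")
    if 1.0 + alpha1 * lr <= 0.0 or 1.0 - alpha2 * lr <= 0.0:
        raise ValueError("logarithm argument must be positive")
    return {
        "fa_hz": 1.0 / dt,                                             # one wavelength = path difference
        "fb_hz": 1.0 / (2.0 * dt),                                     # half wavelength = path difference
        "gain_fa_db": -beta1 * 20.0 * math.log10(1.0 + alpha1 * lr),
        "gain_fb_db": +beta2 * 20.0 * math.log10(1.0 - alpha2 * lr),
    }
```

Note that log(1 − α2·Lr) is negative whenever 0 < α2·Lr < 1, so the sign of the gain at fb follows the sign chosen for β2.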

When the use environment can be restricted, for example when the microphone device is used for speech recognition in car navigation, the accuracy of peak detection in the filter coefficients of the adaptive filter unit 20 can be improved. Specifically, only the early reflection component is considered, and the search range for the maximum value of the filter coefficients is limited on the basis of the reflected wave delay calculated from the position of the reflecting surface.

In addition, depending on the directivity type of the microphone unit, the polarity of the directivity lobes may cause the peak due to the reflected wave to appear as either a positive or a negative value depending on the arrival direction of the reflected wave. In such a configuration, the maximum must be searched for over the absolute values of the coefficients.

As described above, according to Embodiment 2, the frequency characteristic distorted by reflected waves of the target sound can be corrected. It is therefore possible to realize a microphone device that provides a stable, flat sound pressure sensitivity versus frequency characteristic in any use environment (sound field). Accordingly, Embodiment 2 can improve sound quality in telephony and sound reinforcement. In speech recognition applications in particular, frequency characteristic distortion caused by reflected waves has been one of the causes of misrecognition, but the configuration of Embodiment 2 makes it possible to achieve a stable, high speech recognition rate regardless of whether a reflecting object is nearby.

(Embodiment 3)
Next, a microphone device according to Embodiment 3 will be described with reference to FIGS. 10 and 11. The microphone device according to Embodiment 3 combines the configuration of Embodiment 1 with the configuration of Embodiment 2.

FIG. 10 is a block diagram showing the configuration of the microphone device according to Embodiment 3. In FIG. 10, the microphone device includes a first microphone unit 1, a second microphone unit 2, a determination unit 10, an adaptive filter unit 20, a signal subtraction unit 30, a noise suppression filter coefficient calculation unit 40, a time-varying coefficient filter unit 50, a reflection information calculation unit 60, and a reflection correction unit 70. In FIG. 10, components that are the same as those in Embodiment 1 or 2 are denoted by the same reference numerals as in FIG. 1 or FIG. 8, and their detailed description is omitted.

The configuration shown in FIG. 10 differs from the configuration shown in FIG. 8 in that the noise suppression filter coefficient calculation unit 40 and the time-varying coefficient filter unit 50 shown in FIG. 1 are provided in the stage following the configuration shown in FIG. 8. As a result, the microphone device shown in FIG. 10 can both correct the distortion of the frequency characteristic caused by reflected waves and perform noise suppression.

FIG. 11 is a block diagram showing another configuration of the microphone device according to Embodiment 3. In FIG. 11, the microphone device includes a first microphone unit 1, a second microphone unit 2, a determination unit 10, an adaptive filter unit 20, a signal subtraction unit 30, a time-varying coefficient filter unit 50, a reflection information calculation unit 60, a reflection correction unit 70, and a noise suppression and reflection inverse characteristic filter coefficient estimation unit 80. The configuration shown in FIG. 11 reduces the amount of processing by superimposing the characteristic of the reflection correction unit 70 on the characteristic of the time-varying coefficient filter unit 50.

The operation of the configuration shown in FIG. 11 differs from that of the configuration shown in FIG. 10 in the operation of the noise suppression and reflection inverse characteristic filter coefficient estimation unit 80. This unit receives as inputs the signal m1 (main signal), the signal m3 (noise reference signal), and the signal output from the reflection information calculation unit 60. Based on these signals, it calculates the noise suppression filter characteristic Hw(ω) = (X(ω) − Nx(ω)) / X(ω) and the reflection inverse characteristic Hr(ω). It then outputs, to the time-varying coefficient filter unit 50, filter coefficients whose target characteristic is {Hw(ω)·Hr(ω)}. This makes it possible to perform the correction of the frequency characteristic distortion caused by reflected waves and the noise suppression processing at the same time.
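As an illustration of how the two characteristics might be merged into a single target response, the following Python sketch multiplies a per-bin noise suppression gain by a per-bin reflection inverse characteristic. It is a minimal sketch under assumptions that are not in the original text: the spectra are given as lists of magnitudes per frequency bin, the gain is floored at zero, bins where X(ω) is zero receive zero gain, and the function name `combined_target` is hypothetical. How Hr(ω) is derived from the reflection information is outside the scope of this sketch.

```python
def combined_target(x_mag, nx_mag, hr):
    """Return Hw(w) * Hr(w) for each frequency bin.

    x_mag  : magnitude spectrum X(w) of the main signal m1
    nx_mag : estimated noise magnitude spectrum Nx(w) contained in m1
    hr     : reflection inverse characteristic Hr(w), taken here as given
    """
    target = []
    for x, nx, r in zip(x_mag, nx_mag, hr):
        if x <= 0.0:
            hw = 0.0  # assumption: no gain where the main signal is empty
        else:
            hw = max((x - nx) / x, 0.0)  # assumption: floor the gain at zero
        target.append(hw * r)
    return target
```

Because the product is formed once per bin, a single time-varying filter can apply both corrections, which is the processing saving the FIG. 11 configuration aims at.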

As described above, according to Embodiment 3, an ideal noise reference signal from which the target sound has been removed can be obtained, as in Embodiment 1. In addition, as in Embodiment 2, the two-input noise suppression processing that uses the main signal and the noise reference signal and the correction of the frequency characteristic distortion caused by reflected waves can be performed at the same time. As a result, whether the surroundings are a noisy environment or a reflective sound field, a high S/N and a flat frequency characteristic can be obtained, which improves the voice quality of calls and sound reinforcement and improves the recognition rate of speech recognition.

(Embodiment 4)
Next, a microphone device according to Embodiment 4 will be described with reference to FIGS. 12 and 13. In Embodiment 4, among the sounds arriving at the microphone device from all directions, the directions treated as the target sound are made variable.

FIG. 12 is a block diagram showing the configuration of the microphone device according to Embodiment 4. In FIG. 12, the microphone device further includes a detection threshold setting unit 90 in addition to the configuration shown in FIG. 11. In FIG. 12, components that are the same as those in Embodiment 3 are denoted by the same reference numerals as in FIG. 11, and their detailed description is omitted.

The detection threshold setting unit 90 sets the threshold value used in the determination unit 10. In other words, the configuration of Embodiment 4 differs from that of Embodiment 3 in that the threshold set in the determination unit 10 is controllable.

In FIG. 12, the threshold set in the determination unit 10 can be changed. Changing this threshold changes how far to the left and right of the front direction the arrival direction of a sound may lie and still be regarded as the target sound. In other words, changing this threshold controls the range of angles over which sound can be picked up as the target sound.

For example, consider the case where the detection threshold setting unit 90 sets the threshold to th1 (see FIG. 3). In this case, sound arriving from the θ1 direction (see FIGS. 2 and 3) is not regarded as the target sound. That is, sound arriving from the θ1 direction is regarded as noise, and the signal m3, which is the noise reference signal, contains the component of the sound arriving from the θ1 direction. As a result, sound arriving from the θ1 direction is suppressed in the final output.

On the other hand, when the threshold is set to th2 (see FIG. 3), sound arriving from the θ1 direction is regarded as the target sound. In this case, the signal m3, which is the noise reference signal, does not contain the component of the sound arriving from the θ1 direction. As a result, sound arriving from the θ1 direction is output as the target sound in the final output.

As described above, controlling the threshold of the determination unit 10 makes it possible to control the angular range over which the microphone device can pick up sound. However, this angular range is limited to a certain range around the directivity null direction of the second microphone unit 2, that is, around the front direction.
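The relation between the threshold and the pickup range can be illustrated with a small Python model. This is only a sketch under assumptions that are not taken from the original text: the main signal is modelled as an ideal cardioid facing the front, the noise reference as an ideal figure-eight with its null at the front, the determination is modelled as "main level divided by reference level exceeds the threshold", and the numeric values chosen for th1, th2 and θ1 are hypothetical.

```python
import math

def level_ratio(theta_deg):
    """Assumed main/reference level ratio for a plane wave from theta_deg."""
    theta = math.radians(theta_deg)
    main = (1.0 + math.cos(theta)) / 2.0   # assumed cardioid, axis at 0 deg
    ref = abs(math.sin(theta))             # assumed figure-eight, null at 0 deg
    return math.inf if ref == 0.0 else main / ref

def is_target(theta_deg, threshold):
    """Treat the sound as target sound when the ratio exceeds the threshold."""
    return level_ratio(theta_deg) > threshold

def acceptance_half_angle(threshold):
    """Largest angle from the front still judged as target (this model only)."""
    return math.degrees(2.0 * math.atan(1.0 / (2.0 * threshold)))

TH1, TH2 = 2.0, 0.5   # hypothetical values standing in for th1 and th2
THETA1 = 45.0         # hypothetical value standing in for the angle θ1
```

In this model `is_target(THETA1, TH1)` is false and `is_target(THETA1, TH2)` is true, mirroring the two cases described above: the stricter threshold gives a half-angle of about 28°, the looser one 90°.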

FIG. 13 shows directivity patterns of the microphone device. FIG. 13(a) shows the directivity pattern of the signal m1. FIG. 13(b) shows the directivity pattern of the output signal y of the microphone device when the threshold is set to th2. FIG. 13(c) shows the directivity pattern of the output signal y of the microphone device when the threshold is set to th1. In FIG. 13(b), the angular range over which the microphone device can pick up sound is wider than in FIG. 13(c). For example, when the threshold is th2, sound arriving from the angle θ1 is judged to be the target sound. Outside that range, the sensitivity is strongly attenuated. In FIG. 13(c), by contrast, the pickup range is narrow and a very sharp directivity is realized. In this case, sound arriving from the angle θ1 is not judged to be the target sound.

As described above, according to Embodiment 4, the sharpness of the directivity of the microphone device can be changed by changing the threshold of the determination unit 10. In general, it is harder to form a sharp main beam than a sharp null in a microphone's directivity; Embodiment 4, however, can realize a microphone device with a sharper directivity than has previously been available.

In practical use, the sharpness of the directivity and the ease of use of the microphone device are at odds with each other. A user of a microphone device with sharp directivity must stay keenly aware of the front direction while using it. To achieve both ease of use and noise suppression performance, it is therefore desirable that the microphone device have a directivity with constant sensitivity from the front out to a certain angle and strong sensitivity attenuation in all other directions. It is also desirable that the pickup angular range be freely settable according to the application of the microphone device and the pickup conditions. According to Embodiment 4, the directivity of the microphone device changes as shown in FIG. 13. As FIG. 13 makes clear, the microphone device according to Embodiment 4 achieves both ease of use as a microphone device and high noise removal capability.

(Embodiment 5)
Next, a microphone device according to Embodiment 5 will be described with reference to FIG. 14. The microphone devices according to Embodiments 1 to 4 place a unidirectional microphone unit and a bidirectional microphone unit close together and use the signals output from the respective microphone units as the main signal and the noise reference signal. The advantages of this configuration are that it can be made small and that it can be realized inexpensively, because processing such as directivity synthesis is unnecessary.

On the other hand, camcorders and other devices with a sound pickup function often use, for mounting or performance reasons, a plurality of microphone units that are omnidirectional or that have identical directivity, and form the directivity by combining the signals output from these microphone units. In processing that performs directivity synthesis on the signals from a plurality of microphone units, a certain spacing (usually 1 cm to 5 cm) is required between the microphone units because of problems such as circuit noise. This approach is therefore at a disadvantage in terms of miniaturization compared with Embodiments 1 to 4 described above. However, it has implementation advantages, such as a high degree of freedom in designing the directivity and the ability to use variable characteristics through digital processing.

Therefore, Embodiment 5 adopts a configuration in which a plurality of microphone units having the same directivity (two in Embodiment 5) and a directivity synthesis unit 100 are used to obtain a main signal corresponding to the signal m1 and a noise reference signal corresponding to the signal m2.

FIG. 14 shows part of the configuration of the microphone device according to Embodiment 5. In FIG. 14, the microphone device includes a third microphone unit 3, a fourth microphone unit 4, and a directivity synthesis unit 100. The configuration used after the signals m1 and m2 are obtained is that of any one of Embodiments 1 to 4.

In FIG. 14, the microphone units 3 and 4 are arranged on an axis pointing in the front direction (the dash-dot line in FIG. 14). The distance between the microphone units 3 and 4 is d. Each of the microphone units 3 and 4 is arranged so that its directivity main axis points in the front direction.

The directivity synthesis unit 100 includes a first signal delay unit 101, a first signal subtraction unit 103, a second signal delay unit 102, and a second signal subtraction unit 104. The first signal delay unit 101 delays the signal output from the fourth microphone unit 4. The second signal delay unit 102 delays the signal output from the third microphone unit 3. The first signal subtraction unit 103 subtracts the signal output from the first signal delay unit 101 from the signal output from the third microphone unit 3. This yields the signal m1. The second signal subtraction unit 104 subtracts the signal output from the second signal delay unit 102 from the signal output from the fourth microphone unit 4. This yields the signal m2.

Setting the delay amount τ1 of the first signal delay unit 101 to 0 ≤ τ1 ≤ d/c (where c is the speed of sound) yields, as the signal m1, a second-order sound pressure gradient type superdirective characteristic whose directivity main axis is the front direction. Setting the signal delay amount τ2 of the second signal delay unit 102 to τ2 = d/c yields a signal m2 in which a directivity null is formed in the front direction (a signal equivalent to that obtained from a microphone unit having a directivity null in the front direction).
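The delay-and-subtract structure of FIG. 14 can be checked numerically with a short Python sketch that evaluates the complex response of m1 and m2 to a unit plane wave. The sketch rests on assumptions that are not in the original text: both units are modelled as omnidirectional (so the result is a first-order pattern rather than the second-order pattern obtained with directional units), unit 3 is taken to be the front unit, the speed of sound is taken as 340 m/s, and the function name is hypothetical.

```python
import cmath
import math

C = 340.0  # assumed speed of sound in m/s

def endfire_responses(theta_deg, freq_hz, d, tau1, tau2):
    """Complex responses (m1, m2) to a unit plane wave from theta_deg.

    Unit 3 is assumed to be the front unit and the phase reference; unit 4
    sits a distance d behind it on the front-facing axis. Both units are
    modelled as omnidirectional.
    """
    w = 2.0 * math.pi * freq_hz
    travel = (d / C) * math.cos(math.radians(theta_deg))  # extra delay at unit 4
    s3 = 1.0 + 0.0j
    s4 = cmath.exp(-1j * w * travel)
    m1 = s3 - s4 * cmath.exp(-1j * w * tau1)  # delay unit 101, subtraction unit 103
    m2 = s4 - s3 * cmath.exp(-1j * w * tau2)  # delay unit 102, subtraction unit 104
    return m1, m2
```

With τ1 = τ2 = d/c, the model gives |m2| = 0 for a wave from the front (0°) and |m1| = 0 for a wave from the rear (180°), which matches the roles of noise reference and main signal described above.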

With the above configuration, superdirectivity is built into the characteristic of the signal m1 in advance; combined with the noise suppression processing in the following stage, this achieves sharp directivity and noise suppression performance that greatly exceed those of conventional superdirective microphones.

(Embodiment 6)
Next, a microphone device according to Embodiment 6 will be described with reference to FIG. 15. Like Embodiment 5, Embodiment 6 adopts a configuration in which a main signal and a noise reference signal are obtained using a plurality of microphone units having the same directivity.

FIG. 15 shows part of the configuration of the microphone device according to Embodiment 6. The microphone device includes a third microphone unit 3, a fourth microphone unit 4, and a directivity synthesis unit 100. The microphone units 3 and 4 are arranged on an axis (the dash-dot line in FIG. 15) perpendicular to a straight line pointing in the front direction (the dotted line in FIG. 15). Each of the microphone units 3 and 4 is arranged so that its directivity main axis points in the front direction. The configuration used after the signals m1 and m2 are obtained is that of any one of Embodiments 1 to 4.

In FIG. 15, the directivity synthesis unit 100 includes a first signal addition unit 105 and a second signal subtraction unit 104. The first signal addition unit 105 adds the signals output from the microphone units 3 and 4. This yields the signal m1, which is the main signal. The second signal subtraction unit 104 subtracts the signal output from the third microphone unit 3 from the signal output from the fourth microphone unit 4. This yields the signal m2, which is the noise reference signal.

In FIG. 15, when the spacing between the microphone units 3 and 4 is reasonably small, the directivity of the signal m1 differs little from that of a single microphone unit (Embodiments 1 to 4), except at high frequencies. Consequently, the configuration shown in FIG. 15 cannot obtain directivity as sharp as that of the configuration shown in FIG. 14, but in return it reduces vibration noise and circuit noise. In addition, because sound arriving from the front direction is detected in phase by the microphone units 3 and 4, a signal m2 in which a directivity null is formed in the front direction can be obtained.
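The in-phase argument can be seen in a small Python sketch of the sum and difference outputs for a pair placed side by side. The omnidirectional unit model, the sign convention for the arrival angle, the 340 m/s speed of sound and the function name are assumptions made for illustration only.

```python
import cmath
import math

C = 340.0  # assumed speed of sound in m/s

def broadside_sum_diff(theta_deg, freq_hz, d):
    """Responses (m1, m2) of the sum and difference outputs to a unit plane wave."""
    w = 2.0 * math.pi * freq_hz
    lag = (d / C) * math.sin(math.radians(theta_deg))  # assumed sign convention
    s3 = 1.0 + 0.0j                # unit 3 is the phase reference
    s4 = cmath.exp(-1j * w * lag)  # unit 4 sees the wave `lag` seconds later
    return s3 + s4, s4 - s3        # addition unit 105, subtraction unit 104
```

For a wave from the front the lag is zero, so the difference output m2 vanishes while the sum output m1 is doubled; away from the front m2 grows with the lag.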

(Embodiment 7)
Next, a microphone device according to Embodiment 7 will be described with reference to FIG. 16. Like Embodiment 5, Embodiment 7 adopts a configuration in which a main signal and a noise reference signal are obtained using a plurality of microphone units having the same directivity.

FIG. 16(a) shows part of the configuration of the microphone device according to Embodiment 7. The microphone device includes a third microphone unit 3, a fourth microphone unit 4, and a directivity synthesis unit 100. The arrangement of the microphone units 3 and 4 is the same as that shown in FIG. 15. The configuration used after the signals m1 and m2 are obtained is that of any one of Embodiments 1 to 4.

In FIG. 16(a), the directivity synthesis unit 100 includes a signal delay unit 111, a second signal subtraction unit 104, a signal amplification unit 150, and a first signal subtraction unit 103. The signal delay unit 111 receives the signal output from the third microphone unit 3 and delays it. The second signal subtraction unit 104 subtracts the signal output from the signal delay unit 111 from the signal output from the fourth microphone unit 4. This yields the signal m2, which is the noise reference signal. The signal amplification unit 150 multiplies the signal output from the signal delay unit 111 by a constant. The first signal subtraction unit 103 subtracts the signal output from the signal amplification unit 150 from the signal output from the fourth microphone unit 4. This yields the signal m1, which is the main signal.

In FIG. 16(a), the path that produces the signal m1 differs from the path that produces the signal m2 in that the former contains the signal amplification unit 150. The direction of the directivity null in the signals m1 and m2 is determined by the delay amount τ1 of the signal delay unit 111. For example, when τ1 = 0 the null is in the front direction, and when τ1 = d/c the null is in the direction perpendicular to the front direction. Here, the delay amount τ1 is set so that the null falls in the direction of the target sound. As a result, the signals m1 and m2 contain more of the components of sound arriving from directions other than the target sound direction than of the target sound component.

For the directivity patterns formed by the directivity synthesis unit 100, it is preferable that the sensitivity difference between the signal m1 and the signal m2 be large in the direction of the target sound. In directions other than that of the target sound, on the other hand, it is preferable that there be no difference in sensitivity between the signal m1 and the signal m2. This is because, when noise arrives from several directions at once, suppressing the noise component mixed into the main signal on the basis of the noise reference signal requires the output of the spectrum ratio calculation unit 43 shown in FIG. 4 to be constant regardless of the direction from which the noise arrives. If the output of the spectrum ratio calculation unit 43 varied with the noise arrival direction, the estimated noise spectrum Nx(ω) would be accurate only for one particular direction. It is therefore preferable that the directivity patterns of the signals m1 and m2 differ in shape only at the directivity null and have the same shape elsewhere.

When the signals from the microphone units 3 and 4 are subtracted, upsetting the sensitivity balance between the third microphone unit 3 and the fourth microphone unit 4 raises the sensitivity at the zero, that is, at the directivity null, which is the point that demands the most accuracy. Exploiting this property, the signal amplification unit 150 is provided on the signal m1 side and its gain is set to about 0.85, which yields directivity patterns such as those shown in FIG. 16(b). FIG. 16(b) shows the directivity patterns of the signals m1 and m2 in FIG. 16(a). As FIG. 16(b) shows, Embodiment 7 can obtain directivity patterns that differ in shape only at the directivity null and are nearly identical in shape elsewhere.
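The effect of the gain imbalance can be reproduced with a few lines of Python. The sketch assumes omnidirectional units placed side by side, a 340 m/s speed of sound and a particular sign convention for the arrival angle; the function name is hypothetical, and only the gain of about 0.85 comes from the text.

```python
import cmath
import math

C = 340.0  # assumed speed of sound in m/s

def embodiment7_responses(theta_deg, freq_hz, d, tau1, gain=0.85):
    """Responses (m1, m2) of the FIG. 16(a) structure to a unit plane wave."""
    w = 2.0 * math.pi * freq_hz
    lag = (d / C) * math.sin(math.radians(theta_deg))  # assumed sign convention
    s4 = cmath.exp(-1j * w * lag)            # unit 4 relative to unit 3
    delayed_s3 = cmath.exp(-1j * w * tau1)   # signal delay unit 111
    m2 = s4 - delayed_s3                     # subtraction unit 104
    m1 = s4 - gain * delayed_s3              # amplification unit 150, subtraction unit 103
    return m1, m2
```

In the null direction (0° when τ1 = 0) the model gives |m2| = 0 but |m1| = 1 − 0.85 = 0.15, so the target sound survives in the main signal. In any direction the two magnitudes differ by at most 0.15, which is small once both responses have grown well away from the null.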

As described above, according to Embodiment 7, signals m1 and m2 whose sensitivity characteristics differ only in the target sound direction can be obtained. A good suppression effect can therefore be obtained in the noise suppression processing in the following stage.

(Embodiment 8)
Next, a microphone device according to Embodiment 8 will be described with reference to FIG. 17. Like Embodiment 5, Embodiment 8 adopts a configuration in which a main signal and a noise reference signal are obtained using a plurality of microphone units having the same directivity.

FIG. 17(a) shows part of the configuration of the microphone device according to Embodiment 8. In FIG. 17(a), the directivity synthesis unit 100 further includes an angle setting unit 160 and a second signal delay unit 112 in addition to the configuration shown in FIG. 16(a). The configuration used after the signals m1 and m2 are obtained is that of any one of Embodiments 1 to 4.

The configuration shown in FIG. 17(a) differs from the configuration shown in FIG. 16(a) in that an angle setting unit 160 is additionally provided and a second signal delay unit 112 is provided in the stage following the fourth microphone unit 4. The basic operation in FIG. 17(a) is the same as in FIG. 16(a), so its description is omitted. The operation in FIG. 17(a) differs from that in FIG. 16(a) in that the angle setting unit 160 allows the target sound direction to be changed.

The angle setting unit 160 can vary the signal delay amount τ1 of the first signal delay unit 111 in the range 0 ≤ τ1 ≤ 2d/c (where d is the spacing between the microphone units and c is the speed of sound). Without the second signal delay unit 112, even if the signal delay amount τ1 of the first signal delay unit 111 were varied over this range, the target sound direction could be changed only within the range from 0° to +90° relative to the front direction. The second signal delay unit 112 is therefore provided, and its signal delay amount τ2 is set to τ2 = d/c, so that the target sound direction can be varied within ±90° of the front direction.
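One way to picture the role of the fixed delay τ2 is to compute the τ1 that steers the null to a chosen angle. The Python sketch below assumes a plane-wave model in which the null appears where (d/c)·sin θ = τ1 − τ2; this relation, the sign of the angle, the 340 m/s speed of sound and the function name are assumptions for illustration and are not stated in the original text.

```python
import math

C = 340.0  # assumed speed of sound in m/s

def delay_for_target_angle(theta_deg, d, tau2=None):
    """Delay tau1 that places the null at theta_deg, given the fixed delay tau2."""
    if tau2 is None:
        tau2 = d / C  # value given in the text for delay unit 112
    if not -90.0 <= theta_deg <= 90.0:
        raise ValueError("theta_deg must lie within -90..+90 degrees")
    return tau2 + (d / C) * math.sin(math.radians(theta_deg))
```

Under this model τ1 runs from 0 at −90°, through d/c at the front, to 2d/c at +90°, which is the range 0 ≤ τ1 ≤ 2d/c given above.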

As described above, Embodiment 8 makes the sound pickup direction (target sound direction) of the microphone device variable. For example, the directivity pattern shown in FIG. 17(b) can be realized, and by changing the signal delay amount of the signal delay unit, the directivity pattern shown in FIG. 17(c) can also be realized. The variable delay characteristic can be realized simply by implementing the signal delay unit as an all-pass filter H(ω) = (A + z^(−1)) / (1 + A·z^(−1)) with the coefficient A in the range 0 ≤ A < 1. To change the signal delay amount, the angle setting unit 160 changes this coefficient A. When a large delay or linearity of the delay versus frequency characteristic is required, second-order all-pass filters and/or all-pass filters may be connected in cascade.
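The first-order all-pass filter above can be written as the difference equation y[n] = A·x[n] + x[n−1] − A·y[n−1]. The Python sketch below implements it and also evaluates its frequency response. The mapping from a desired delay D (in samples) to A = (1 − D)/(1 + D) is the standard low-frequency approximation for this filter form; it is an addition for illustration rather than something stated in the text, and the function names are hypothetical.

```python
import cmath

def allpass_coefficient(delay_samples):
    """Coefficient A giving the requested low-frequency delay (0 < delay <= 1)."""
    if not 0.0 < delay_samples <= 1.0:
        raise ValueError("delay_samples must be in (0, 1]")
    return (1.0 - delay_samples) / (1.0 + delay_samples)

def allpass_filter(x, a):
    """Apply y[n] = a*x[n] + x[n-1] - a*y[n-1], i.e. H(z) = (a + z^-1)/(1 + a*z^-1)."""
    y = []
    x_prev = 0.0
    y_prev = 0.0
    for sample in x:
        out = a * sample + x_prev - a * y_prev
        y.append(out)
        x_prev, y_prev = sample, out
    return y

def allpass_response(a, omega):
    """Frequency response at normalized angular frequency omega (rad/sample)."""
    z_inv = cmath.exp(-1j * omega)
    return (a + z_inv) / (1.0 + a * z_inv)
```

With A = 0 the filter is a pure one-sample delay, and as A approaches 1 the low-frequency delay approaches zero, so 0 ≤ A < 1 covers delays between zero and one sample period. The delay is only approximately constant over frequency, which is why the text suggests cascading sections when linearity matters.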

(Embodiment 9)
Next, a microphone device according to Embodiment 9 will be described with reference to FIG. 18. Like Embodiment 5, Embodiment 9 adopts a configuration in which a main signal and a noise reference signal are obtained using a plurality of microphone units having the same directivity.

FIG. 18(a) shows part of the configuration of the microphone device according to Embodiment 9. The microphone device includes a third microphone unit 3, a fourth microphone unit 4, a directivity synthesis unit 100, and an angle setting unit 160. The arrangement of the microphone units 3 and 4 is the same as that shown in FIG. 15. The configuration used after the signals m1 and m2 are obtained is that of any one of Embodiments 1 to 4.

In FIG. 18(a), the directivity synthesis unit 100 includes a third signal delay unit 121, a first signal delay unit 101, a fourth signal delay unit 122, a second signal delay unit 102, a first signal subtraction unit 103, and a second signal subtraction unit 104. The third signal delay unit 121 delays the signal output from the third microphone unit 3. The first signal delay unit 101 delays the signal output from the fourth microphone unit 4. The first signal subtraction unit 103 subtracts the signal output from the first signal delay unit 101 from the signal output from the third signal delay unit 121. This yields the signal m1, which is the main signal. The fourth signal delay unit 122 delays the signal output from the fourth microphone unit 4. The second signal delay unit 102 delays the signal output from the third microphone unit 3. The second signal subtraction unit 104 subtracts the signal output from the second signal delay unit 102 from the signal output from the fourth signal delay unit 122. This yields the signal m2, which is the noise reference signal. The angle setting unit 160 controls the signal delay amount of the first signal delay unit 101 and the signal delay amount of the second signal delay unit 102 independently.

In FIG. 18(a), the m1 side and the m2 side are configured symmetrically. The directivity patterns of m1 and m2 are therefore controlled independently, so both can be designed with emphasis on sensitivity in the target sound direction. Specifically, as shown in FIG. 18(b), the directivity pattern of m1 is made as sensitive as possible in the target sound direction while still providing a noise suppression effect. As shown in FIG. 18(c), the directivity pattern of m2 is formed so that its directivity null points in the target sound direction.

As described above, Embodiment 9 uses the downstream noise suppression processing only as a supplement and suppresses noise primarily through the upstream directivity synthesis. For that reason, Embodiment 9 gives priority to shaping the directivity pattern of m1. Directivity synthesis is a linear process and therefore rarely distorts the speech waveform. Noise suppression, by contrast, is a nonlinear process whose filter coefficients vary over time, so errors in its various estimators (of the noise spectrum, for example) can distort the speech waveform. From this viewpoint, the choice between the directivity patterns of FIGS. 17(b) and 17(c) and those of FIGS. 18(b) and 18(c) is preferably made according to the usage environment (target sound level, ambient noise level, reflections, reverberation, and so on), the application (telephony, speech recognition, recording, and so on), and the amount of noise suppression required.

(Embodiment 10)
Next, a microphone device according to Embodiment 10 is described with reference to FIG. 19. Embodiment 10 aims to obtain the main signal and noise reference signal needed for the noise suppression processing of the present invention in a device whose two microphone units have their main directivity axes pointing in different directions.

FIG. 19 shows part of the configuration of the microphone device according to Embodiment 10. The microphone device includes a third microphone unit 3, a fourth microphone unit 4, and a directivity re-synthesis unit 200. The configuration downstream of the signals m1 and m2 is that of any one of Embodiments 1 to 4.

In FIG. 19, the microphone units 3 and 4 are placed at the same positions as in FIG. 15. In FIG. 19, however, the main directivity axis of the third microphone unit 3 points in a direction rotated by a predetermined angle from the front direction, and that of the fourth microphone unit 4 points in a direction rotated by a predetermined angle in the opposite sense. The signal output from the third microphone unit 3 is called the right channel signal, and the signal output from the fourth microphone unit 4 is called the left channel signal.

Also in FIG. 19, the directivity re-synthesis unit 200 includes a signal addition unit 205 and a signal subtraction unit 204. The signal addition unit 205 adds the right channel signal and the left channel signal, which yields the main signal m1. The signal subtraction unit 204 subtracts the right channel signal from the left channel signal, which yields the noise reference signal m2.

The configuration of FIG. 19 assumes that the present invention is applied to a device that uses a one-point stereo microphone, such as a camcorder. Such a device may, for example, normally record in stereo and perform the directivity re-synthesis described below only when the front direction alone is to be emphasized as the target sound.

In an ordinary one-point stereo microphone, the left and right microphone units are given identical amplitude and phase characteristics so that, with sound image localization during playback in mind, sound arriving from the center (the front direction in FIG. 19) is in phase at both units. As noted above, the directivity axes of the microphone units 3 and 4 are also rotated by equal angles. Adding the right and left channel signals in the signal addition unit 205 therefore gives a signal m1 with directivity toward the front, and subtracting the right channel signal from the left channel signal in the signal subtraction unit 204 gives a signal m2 with a directivity null toward the front. The signals m1 and m2 generated by the directivity re-synthesis unit 200 are thus equivalent to the signals m1 and m2 of Embodiment 1, so they can be used for noise suppression processing and for correcting distortion caused by reflections.
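
The sum and difference operations of the directivity re-synthesis unit 200 can be illustrated as follows. This is a sketch under the assumption that the two channels are sample-aligned lists of equal length; the function name `resynthesize_front` is hypothetical.

```python
from typing import List, Sequence, Tuple


def resynthesize_front(
    right: Sequence[float], left: Sequence[float]
) -> Tuple[List[float], List[float]]:
    """Return (m1, m2) for a front-facing target.

    m1 = R + L  (signal addition unit 205, main signal)
    m2 = L - R  (signal subtraction unit 204, noise reference signal)
    """
    m1 = [r + l for r, l in zip(right, left)]
    m2 = [l - r for r, l in zip(right, left)]
    return m1, m2
```

A sound from the front appears identically in both channels, so it doubles in m1 and cancels in m2; a sound picked up by only one channel survives in m2.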

As described above, Embodiment 10 can emphasize sound from the target sound direction using the signals output from a one-point stereo microphone. A device with a one-point stereo microphone can therefore be made to function as, for example, a zoom microphone. Because Embodiment 10 re-synthesizes directivity from the stereo signal, it is also applicable to multichannel recording in which the stereo signal and a front-direction signal are obtained simultaneously. The same effect is obtained even when the stereo microphone is implemented as an analog circuit.

(Embodiment 11)
Next, a microphone device according to Embodiment 11 is described with reference to FIG. 20. Embodiment 11 aims to obtain the main signal and noise reference signal needed for the noise suppression processing of the present invention in a device that generates a stereo signal.

FIG. 20 shows the configuration of the microphone device according to Embodiment 11. In FIG. 20, the microphone device includes a fifth microphone unit 5, a sixth microphone unit 6, a directivity synthesis unit 500, and a directivity re-synthesis unit 200. The microphone units 5 and 6 are omnidirectional units with identical characteristics and are placed as shown in FIG. 15. The directivity synthesis unit 500 takes the signals output from the microphone units 5 and 6 and outputs a right channel signal Rch and a left channel signal Lch. The directivity re-synthesis unit 200 takes Rch and Lch and outputs the signal m1, a main signal with sensitivity toward the target sound direction, and the signal m2, a noise reference signal with a directivity null toward the target sound direction. The target sound direction can also be set to a direction other than the front.

Also in FIG. 20, the directivity re-synthesis unit 200 includes an inverse directivity synthesis unit 250 and a directivity synthesis unit 100. The inverse directivity synthesis unit 250 takes the signals output from the directivity synthesis unit 500 (Rch and Lch) and generates omnidirectional signals from them. The directivity synthesis unit 100 is the same as that shown in Embodiment 5, except that the angle setting unit 160 is omitted here. Although FIG. 20 shows the directivity synthesis unit 100 with the configuration of FIG. 18(a), it may instead have the configuration of FIG. 15, FIG. 16(a), or FIG. 17(a).

In Embodiment 11, the stereo signal (Rch and Lch) produced by the directivity synthesis unit 500 is converted back by the inverse directivity synthesis unit 250 into the signals that were output from the microphone units 5 and 6; that is, the stereo signal is converted back into two omnidirectional signals. The directivity synthesis unit 100 then converts these omnidirectional signals into a main signal and a noise reference signal for detecting a target sound arriving from a given direction.

The directivity synthesis unit 500 that outputs the stereo signal consists of a first signal delay unit 501, a first signal subtraction unit 521, a second signal delay unit 502, and a second signal subtraction unit 522. The first signal delay unit 501 delays the signal output from the sixth microphone unit 6 and outputs it. The first signal subtraction unit 521 subtracts the output of the first signal delay unit 501 from the signal output from the fifth microphone unit 5 and outputs the result as the signal Rch. The second signal delay unit 502 delays the signal output from the fifth microphone unit 5 and outputs it. The second signal subtraction unit 522 subtracts the output of the second signal delay unit 502 from the signal output from the sixth microphone unit 6 and outputs the result as the signal Lch. The operation of the directivity synthesis unit 500 described above is expressed by Equation (1).

Here, x1 and x2 on the left-hand side are the signals output from the fifth and sixth microphone units 5 and 6, respectively, and Rch and Lch on the right-hand side are the stereo signals output from the directivity synthesis unit 500. Because the directivity synthesis unit 500 is a commonly used directivity synthesis configuration, a detailed description is omitted. In Equation (1), the factor 1/(1 − Hτ4(ω)) is a 6 dB/oct frequency response correction term. An actual microphone device applies this correction, but it can be treated separately from the directivity characteristic and is ignored here. To convert the stereo signals (Rch and Lch) obtained by the directivity synthesis unit 500 back into the signals output from the microphone units (x1 and x2), both sides of Equation (1) are multiplied from the left by the inverse of the matrix that forms the second term of its left-hand side; this can be realized with a so-called inverse filter. Expressed as formulas, this gives Equations (2) and (3).
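
The relation described in the text can be written out as a hedged reconstruction. The following is derived only from the block description of the directivity synthesis unit 500 and is not a copy of the patent's Equations (1) to (3), whose exact layout is not reproduced in this text. The symbols $H_{501}(\omega)$ and $H_{502}(\omega)$ for the transfer functions of the signal delay units 501 and 502 are assumptions introduced for illustration, and the frequency response correction term is ignored.

```latex
% Forward relation implied by the subtraction units 521 and 522
\begin{pmatrix} R_{ch} \\ L_{ch} \end{pmatrix}
=
\begin{pmatrix} 1 & -H_{501}(\omega) \\ -H_{502}(\omega) & 1 \end{pmatrix}
\begin{pmatrix} x_1 \\ x_2 \end{pmatrix}

% Inverse relation (inverse filter), valid where H_{501} H_{502} \neq 1
\begin{pmatrix} x_1 \\ x_2 \end{pmatrix}
=
\frac{1}{1 - H_{501}(\omega)\,H_{502}(\omega)}
\begin{pmatrix} 1 & H_{501}(\omega) \\ H_{502}(\omega) & 1 \end{pmatrix}
\begin{pmatrix} R_{ch} \\ L_{ch} \end{pmatrix}
```

Substituting the first relation into the second gives $R_{ch} + H_{501}L_{ch} = (1 - H_{501}H_{502})\,x_1$ and $H_{502}R_{ch} + L_{ch} = (1 - H_{501}H_{502})\,x_2$, which confirms that the two matrices are inverses of each other up to the scalar factor.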

Inverse directivity synthesis can therefore be realized by applying the processing of Equation (3) to the signals Rch and Lch. The inverse directivity synthesis unit 250 shown in FIG. 20 is a block-diagram representation of Equation (3). From the signals x1 and x2 obtained in this way, the directivity synthesis unit 100 generates the main signal m1, which has sensitivity toward the target sound direction, and the noise reference signal m2, which has a directivity null toward the target sound direction.
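
The round trip from omnidirectional signals to a stereo pair and back can be checked numerically for a single frequency bin. The sketch below assumes that the delay units are pure delays with response exp(-jωτ) and that the frequency response correction term is ignored; the function names and the variables `h_a` and `h_b` (responses of delay units 501 and 502) are hypothetical.

```python
import cmath
from typing import Tuple


def delay_response(tau_s: float, omega: float) -> complex:
    """Frequency response of a pure delay of tau_s seconds (assumed delay model)."""
    return cmath.exp(-1j * omega * tau_s)


def stereo_from_omni(
    x1: complex, x2: complex, h_a: complex, h_b: complex
) -> Tuple[complex, complex]:
    """Directivity synthesis: Rch = x1 - h_a*x2, Lch = x2 - h_b*x1."""
    return x1 - h_a * x2, x2 - h_b * x1


def omni_from_stereo(
    rch: complex, lch: complex, h_a: complex, h_b: complex
) -> Tuple[complex, complex]:
    """Inverse directivity synthesis: invert the 2x2 mixing matrix for one bin."""
    det = 1 - h_a * h_b
    if abs(det) < 1e-12:
        raise ZeroDivisionError("mixing matrix is singular at this frequency")
    return (rch + h_a * lch) / det, (h_b * rch + lch) / det
```

The determinant `1 - h_a*h_b` approaches zero wherever the product of the two delay responses approaches 1 (for pure delays, at very low frequencies), so a practical inverse filter would need some regularization there.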

As described above, Embodiment 11 uses the signals output from a one-point stereo microphone and still achieves the same effects as Embodiment 10: emphasis of the target sound arriving from the front and correction of frequency distortion caused by reflections. In addition, Embodiment 11 can handle a target sound arriving from an arbitrary direction.

Embodiment 11 is particularly useful when the signals output from the microphone units are unavailable and only the stereo signal can be accessed. In other words, Embodiment 11 makes it possible to obtain the main signal of the target sound and an ideal noise reference signal even in a device that generates a stereo signal.

FIG. 21 shows an application example of Embodiment 11: a system consisting of an audio recording device 801 and an audio reproduction device 802. The audio recording device 801 includes the fifth and sixth microphone units 5 and 6 and the directivity synthesis unit 500. The recording unit 803 is a recording medium that can be attached to and detached from both the audio recording device 801 and the audio reproduction device 802. The audio reproduction device 802 includes the directivity re-synthesis unit 200. Although not shown, the audio reproduction device 802 also has the configuration of the microphone device of any one of Embodiments 1 to 4.

In FIG. 21, the signals Rch and Lch are recorded in the recording unit 803 of the audio recording device 801, so that audio information is stored in the recording unit 803. When the recording unit 803 holding the audio information is attached to the audio reproduction device 802, the audio reproduction device 802 reads the recorded information. Specifically, the signals Rch and Lch are read into the directivity re-synthesis unit 200, which generates a main signal and a noise reference signal from them. Using the main signal and the noise reference signal, noise suppression processing can be applied to the target sound.

As described above, the configuration of Embodiment 11 can be realized even when the audio recording device 801 and the audio reproduction device 802 are separate devices. That is, a signal once recorded in the recording unit 803, for example by a camcorder, can also be given noise suppression processing at playback time.

FIG. 22 shows an application example of the audio reproduction device shown in FIG. 21. In FIG. 22, the audio reproduction device 802 includes an image display unit 900 and an angle setting unit 160 in addition to the configuration described for FIG. 21. That is, the audio reproduction device 802 of FIG. 22 has an image display function and is realized by, for example, a digital video camera.

In FIG. 22, the recording unit 803 stores image information to be displayed on the image display unit 900 in addition to the audio information described for FIG. 21. The audio information and image information are related to each other, like the video and audio recorded simultaneously by a digital video camera, and the audio reproduction device 802 plays them back together. During playback, the user specifies an angle with the angle setting unit 160, choosing it while watching the image on the image display unit 900. For example, if the subject is displayed at the center of the screen, the user specifies the angle corresponding to the center of the screen (that is, the front direction). The user can then extract and listen to the sound arriving from the front direction as the target sound.
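
How a user-specified angle could be turned into a delay setting is not spelled out in this section. The sketch below shows one common mapping under explicitly assumed geometry: two microphone units side by side, separated by `spacing_m`, with the front direction perpendicular to the line joining them, and a plane wave arriving at `angle_deg` from the front. The formula, the function names, and the default speed of sound are all assumptions for illustration, not taken from the patent.

```python
import math


def interchannel_delay(
    angle_deg: float, spacing_m: float, speed_of_sound: float = 343.0
) -> float:
    """Arrival-time difference in seconds between the two units for a plane wave.

    Assumes a broadside pair: 0 degrees is the front direction, and the
    path difference is spacing_m * sin(angle).
    """
    return spacing_m * math.sin(math.radians(angle_deg)) / speed_of_sound


def delay_in_samples(
    angle_deg: float, spacing_m: float, sample_rate_hz: float,
    speed_of_sound: float = 343.0,
) -> int:
    """Round the arrival-time difference to the nearest whole sample."""
    tau = interchannel_delay(angle_deg, spacing_m, speed_of_sound)
    return round(tau * sample_rate_hz)
```

Under these assumptions the front direction (0 degrees) needs no relative delay, which matches the in-phase condition stated for sound arriving from the center.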

Other embodiments may use the following configuration. FIG. 23 shows part of the configuration of a microphone device according to another embodiment. In FIG. 23, the fifth microphone unit 5, the sixth microphone unit 6, and the directivity synthesis unit 500 are the same as in FIG. 20, and the directivity re-synthesis unit 200 is the same as in FIG. 19. The configuration of FIG. 23 provides the same effects as described above. The configuration downstream of the signals m1 and m2 is that of any one of Embodiments 1 to 4.

As described above, the present invention suppresses stationary and non-stationary noise arriving from directions other than the target sound direction in the output of a directional microphone aimed at the target sound direction, yielding a microphone that is compact yet super-directive. At the same time, it removes the effect that reflected waves reaching the microphone device have on the frequency response. Both additive noise (ambient noise) and multiplicative noise (reflected waves) can thus be suppressed simultaneously, giving a high S/N and a consistently flat microphone frequency response that is unaffected by the sound field. In the noise suppression processing unit, a low-delay configuration enables use in public address and telephony, where large delays are unacceptable. Furthermore, combining the preprocessing stages (directivity synthesis, inverse directivity synthesis, and directivity re-synthesis) makes it possible to extract sound from various directions and to obtain the same effects on the playback device side.

As described above, the microphone device and playback device of the present invention operate stably even under multiple noise sources in a real usage environment and can be used for purposes such as achieving a high S/N.

FIG. 1: Block diagram showing the configuration of the microphone device according to Embodiment 1
FIG. 2: Diagram showing the configuration of the determination unit shown in FIG. 1
FIG. 3: Diagram showing an example of the sound detection state when the dominant sound directions are θ1 to θ3
FIG. 4: Diagram showing a configuration example of the noise suppression filter coefficient calculation unit 40
FIG. 5: Diagram showing a configuration example of the time-varying coefficient filter unit 50
FIG. 6: Diagram showing another configuration example of the time-varying coefficient filter unit 50
FIG. 7: Diagram showing specific examples of the signals shown in FIG. 1
FIG. 8: Block diagram showing the configuration of the microphone device according to Embodiment 2
FIG. 9: Diagram explaining the difference in the internal state of the microphone device with and without a reflector
FIG. 10: Block diagram showing the configuration of the microphone device according to Embodiment 3
FIG. 11: Block diagram showing another configuration of the microphone device according to Embodiment 3
FIG. 12: Block diagram showing the configuration of the microphone device according to Embodiment 4
FIG. 13: Diagram showing the directivity pattern of the microphone device
FIG. 14: Diagram showing part of the configuration of the microphone device according to Embodiment 5
FIG. 15: Diagram showing part of the configuration of the microphone device according to Embodiment 6
FIG. 16: Diagram showing part of the configuration of the microphone device according to Embodiment 7
FIG. 17: Diagram showing part of the configuration of the microphone device according to Embodiment 8
FIG. 18: Diagram showing part of the configuration of the microphone device according to Embodiment 9
FIG. 19: Diagram showing part of the configuration of the microphone device according to Embodiment 10
FIG. 20: Diagram showing the configuration of the microphone device according to Embodiment 11
FIG. 21: Diagram showing an application example of Embodiment 11
FIG. 22: Diagram showing an application example of the audio reproduction device shown in FIG. 21
FIG. 23: Diagram showing part of the configuration of a microphone device according to another embodiment
FIG. 24: Diagram showing the configuration of the microphone device of Conventional Example 1
FIG. 25: Diagram showing the configuration of the microphone device of Conventional Example 2
FIG. 26: Diagram showing the configuration of the microphone device of Conventional Example 3

Explanation of symbols

1 First microphone unit
2 Second microphone unit
3 Third microphone unit
4 Fourth microphone unit
10 Determination unit
20 Adaptive filter unit
30 Signal subtraction unit
40 Noise suppression filter coefficient calculation unit
50 Time-varying coefficient filter unit
60 Reflection information calculation unit
70 Reflection correction unit
90 Detection threshold setting unit

Claims

A microphone device for detecting a target sound arriving from a target sound direction, comprising:
a signal generation unit that generates a main signal representing the result of detection with sensitivity toward the target sound direction, and a noise reference signal representing the result of detection with a sensitivity null directed toward the target sound direction;
a determination unit that determines whether a level ratio, which is the ratio of the signal level of the main signal to the signal level of the noise reference signal generated by the signal generation unit, is greater than a predetermined value;
an adaptive filter unit that filters the main signal generated by the signal generation unit with an adaptive filter to generate a signal representing the target sound component contained in the noise reference signal generated by the signal generation unit, and that learns its filter coefficients only when the determination unit determines that the level ratio is greater than the predetermined value;
a subtraction unit that subtracts, from the noise reference signal, the signal generated by the adaptive filter unit representing the target sound component contained in the noise reference signal; and
a noise suppression unit that suppresses the noise component contained in the main signal using the main signal and the noise reference signal after subtraction by the subtraction unit,
wherein the noise suppression unit includes:
a noise suppression filter coefficient calculation unit that calculates, based on the main signal and the noise reference signal after subtraction by the subtraction unit, filter coefficients of a noise suppression filter for suppressing signal components other than the target sound in the main signal; and
a time-varying coefficient filter unit that filters the main signal using the filter coefficients calculated by the noise suppression filter coefficient calculation unit.
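
The interaction of the determination unit, the adaptive filter unit, and the subtraction unit can be sketched as follows. The claim does not name an adaptation rule, so this example swaps in a normalized LMS (NLMS) update as an assumption, and it measures "signal level" as exponentially smoothed power, which is also an assumption. The function name, parameters, and default values are hypothetical.

```python
from typing import List, Sequence, Tuple


def gated_leak_canceller(
    main: Sequence[float],
    ref: Sequence[float],
    taps: int = 4,
    mu: float = 0.5,
    ratio_threshold: float = 2.0,
    smooth: float = 0.9,
    eps: float = 1e-12,
) -> Tuple[List[float], List[float]]:
    """Remove target-sound leakage from the noise reference signal.

    The adaptive filter predicts the target component of `ref` from `main`;
    its coefficients are updated only while the smoothed power ratio
    main/ref exceeds `ratio_threshold` (target sound judged dominant).
    Returns the cleaned noise reference and the final filter coefficients.
    """
    w = [0.0] * taps          # adaptive filter coefficients
    buf = [0.0] * taps        # most recent main-signal samples
    p_main = p_ref = 0.0      # smoothed powers used as signal levels
    cleaned: List[float] = []
    for m, r in zip(main, ref):
        buf = [m] + buf[:-1]
        leak = sum(wi * bi for wi, bi in zip(w, buf))
        e = r - leak          # subtraction unit output
        cleaned.append(e)
        p_main = smooth * p_main + (1.0 - smooth) * m * m
        p_ref = smooth * p_ref + (1.0 - smooth) * r * r
        if p_main > ratio_threshold * p_ref:   # determination unit
            norm = sum(b * b for b in buf) + eps
            w = [wi + mu * e * bi / norm for wi, bi in zip(w, buf)]  # NLMS step
    return cleaned, w
```

Gating the update on the level ratio keeps the filter from adapting while noise dominates, in which case it would start cancelling noise out of the reference instead of the leaked target sound.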

A microphone device for detecting a target sound arriving from a target sound direction, comprising:
a signal generation unit that generates a main signal representing the result of detection with sensitivity toward the target sound direction, and a noise reference signal representing the result of detection with a sensitivity null directed toward the target sound direction;
a determination unit that determines whether a level ratio, which is the ratio of the signal level of the main signal to the signal level of the noise reference signal generated by the signal generation unit, is greater than a predetermined value;
an adaptive filter unit that filters the main signal generated by the signal generation unit with an adaptive filter to generate a signal representing the target sound component contained in the noise reference signal generated by the signal generation unit, and that learns its filter coefficients only when the determination unit determines that the level ratio is greater than the predetermined value;
a subtraction unit that subtracts, from the noise reference signal, the signal generated by the adaptive filter unit representing the target sound component contained in the noise reference signal;
a reflection information calculation unit that calculates, based on the filter coefficients of the adaptive filter unit, information on the arrival time difference between the direct wave and the reflected wave of the target sound; and
a reflection correction unit that corrects, based on the information calculated by the reflection information calculation unit, the frequency response distortion that the reflected wave of the target sound causes in the main signal.

The microphone device according to claim 1 or 2, wherein the signal generation unit includes:
a first microphone unit arranged with its main directivity axis directed toward the target sound direction; and
a second microphone unit arranged with its directivity null directed toward the target sound direction,
the output signal of the first microphone unit being the main signal and the output signal of the second microphone unit being the noise reference signal.

The microphone device according to claim 1 or 2, further comprising a signal delay unit that is provided between the noise reference signal output terminal of the signal generation unit and the subtraction unit and that delays the noise reference signal so as to satisfy the convergence condition of the adaptive filter of the adaptive filter unit.

The microphone device according to claim 1, wherein the predetermined value is changeable.

The microphone device according to claim 1 or 2, wherein the signal generation unit includes:
A first microphone unit;
A second microphone unit having the same characteristics as the first microphone unit;
A delay unit that delays a signal output from the first microphone unit by a predetermined delay amount;
An amplification unit that amplifies the signal output from the delay unit;
A first subtraction unit that generates the main signal by subtracting the signal amplified by the amplification unit from the signal output from the second microphone unit; and
A second subtraction unit that generates the noise reference signal by subtracting the signal output from the delay unit from the signal output from the second microphone unit,
wherein the predetermined delay amount is set such that the dead angle of the directional characteristic of the noise reference signal output from the second subtraction unit faces the target sound direction, and
the amplification factor of the amplification unit is set such that the main signal has higher sensitivity in the target sound direction than the noise reference signal.
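
The delay-and-subtract arrangement in this claim can be illustrated with a short sketch. It assumes whole-sample delays and a geometry in which the target sound reaches the first microphone unit exactly `delay_samples` before the second. The function names and the gain of 0.5 are hypothetical choices for this example.

```python
def delay(signal, d):
    """Delay a signal by d whole samples, zero-padding the start."""
    if d <= 0:
        return list(signal)
    return [0.0] * d + list(signal[:len(signal) - d])

def generate_signals(x1, x2, delay_samples, gain):
    """Return (main, noise_reference) from two omnidirectional unit outputs."""
    delayed = delay(x1, delay_samples)                  # delay unit
    main = [b - gain * a for a, b in zip(delayed, x2)]  # first subtraction unit
    ref = [b - a for a, b in zip(delayed, x2)]          # second subtraction unit
    return main, ref

# Assumed target-direction geometry: unit 1 hears the sound d samples before unit 2.
target = [float((7 * n) % 5 - 2) for n in range(40)]
d = 3
main, ref = generate_signals(target, delay(target, d), d, gain=0.5)

# Sound from the opposite direction reaches unit 2 first and survives in the reference.
main_rear, ref_rear = generate_signals(delay(target, d), target, d, gain=0.5)
```

With the delay matched to the inter-unit travel time, the reference cancels the target completely, while a gain below 1 leaves the target in the main signal. Sound from other directions is not cancelled in the reference.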

The microphone device according to claim 6, further comprising a setting unit that changes a predetermined delay amount set in the delay unit.

The microphone device according to claim 1, wherein the signal generation unit includes:
A first microphone unit;
A second microphone unit having the same characteristics as the first microphone unit; and
A synthesis unit that generates, based on the signals output from the first and second microphone units, the main signal so as to have sensitivity in the target sound direction and the noise reference signal so that the sensitivity in the target sound direction is minimized.

The microphone device according to claim 1 or 2, wherein the signal generation unit includes:
A first microphone unit;
A second microphone unit disposed with its directional main axis directed in a direction different from that of the first microphone unit;
A signal addition unit that generates the main signal by adding the signal output from the first microphone unit and the signal output from the second microphone unit; and
A signal subtraction unit that generates the noise reference signal by subtracting one of the signal output from the first microphone unit and the signal output from the second microphone unit from the other.
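
The sum and difference arrangement of this claim can be sketched as follows. The sketch assumes that the target sound lies on the symmetry axis of the two units and therefore appears with equal amplitude in both outputs. The function name and the per-unit noise gains are hypothetical.

```python
def sum_difference(unit1, unit2):
    """Return (main, noise_reference) as the sum and difference of two unit outputs."""
    main = [a + b for a, b in zip(unit1, unit2)]  # signal addition unit
    ref = [a - b for a, b in zip(unit1, unit2)]   # signal subtraction unit
    return main, ref

# Assumed scene: the target is picked up equally, the off-axis noise unequally.
target = [float(n % 4 - 1) for n in range(16)]
noise = [float((3 * n) % 5 - 2) for n in range(16)]
unit1 = [t + 1.0 * v for t, v in zip(target, noise)]
unit2 = [t + 0.25 * v for t, v in zip(target, noise)]
main, ref = sum_difference(unit1, unit2)
```

Under that symmetry assumption the difference signal contains only the off-axis noise, while the sum keeps the target at twice its amplitude.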

The microphone device according to claim 1, wherein the signal generation unit includes:
A first microphone unit;
A second microphone unit having the same characteristics as the first microphone unit;
A stereo signal generation unit that generates a stereo signal composed of a right channel signal and a left channel signal based on the signals output from the first and second microphone units;
An inverse synthesis unit that regenerates, based on the stereo signal, the signal output from each microphone unit; and
A synthesis unit that generates, based on the signals generated by the inverse synthesis unit, the main signal indicating a detection result with sensitivity in the target sound direction, and the noise reference signal indicating a detection result with higher sensitivity to sound arriving from directions other than the target sound direction than to the target sound.

The microphone device according to claim 1, wherein the signal generation unit includes:
A first microphone unit;
A second microphone unit having the same characteristics as the first microphone unit;
A stereo signal generation unit that generates a stereo signal composed of a right channel signal and a left channel signal based on the signals output from the first and second microphone units;
A signal addition unit that generates the main signal by adding the right channel signal and the left channel signal of the stereo signal; and
A signal subtraction unit that generates the noise reference signal by subtracting one of the right channel signal and the left channel signal of the stereo signal from the other.

Based on the filter coefficient of the adaptive filter unit, a reflection information calculation unit that calculates information on the arrival time difference between the direct wave and the reflected wave of the target sound;
Based on the information calculated by the reflection information calculation unit, further comprising a reflection correction unit that corrects distortion of the frequency characteristics generated in the main signal by the reflected wave of the target sound,
The noise suppression unit suppresses a signal component of noise included in the main signal by using the main signal after correction by the reflection correction unit and the noise reference signal after subtraction by the subtraction unit. The microphone device described.
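
The claim does not state how the arrival time difference is derived from the filter coefficients, so the following is one plausible reading rather than the patented method. It assumes that the direct wave appears as the strongest tap and the reflected wave as the strongest later tap, and that the resulting distortion in the main signal is a single echo, which an inverse comb filter can undo. Reading the echo gain from the tap ratio is a further assumption, as are both function names.

```python
def reflection_info(coeffs):
    """Return (delay in samples, relative gain) of the strongest tap after the main peak."""
    direct = max(range(len(coeffs)), key=lambda k: abs(coeffs[k]))
    later = range(direct + 1, len(coeffs))
    if not later:
        return None  # no taps after the direct-wave peak
    reflected = max(later, key=lambda k: abs(coeffs[k]))
    return reflected - direct, coeffs[reflected] / coeffs[direct]

def correct_reflection(main, delay_samples, gain):
    """Inverse comb filter for main[n] = s[n] + gain * s[n - delay]; needs |gain| < 1."""
    out = []
    for n, v in enumerate(main):
        echo = gain * out[n - delay_samples] if n >= delay_samples else 0.0
        out.append(v - echo)
    return out

# Hypothetical learned coefficients: direct wave at tap 2, reflection at tap 6.
coeffs = [0.0, 0.0, 1.0, 0.0, 0.0, 0.0, 0.4, 0.0]
lag, gain = reflection_info(coeffs)

# Main signal distorted by one echo with the same lag and gain (assumed model).
dry = [float((5 * n) % 7 - 3) for n in range(32)]
wet = [v + (gain * dry[n - lag] if n >= lag else 0.0) for n, v in enumerate(dry)]
restored = correct_reflection(wet, lag, gain)
```

Dividing `lag` by the sampling rate gives the arrival time difference in seconds. The single-echo model gives the main signal a comb-shaped frequency response, which is the kind of frequency-characteristic distortion the reflection correction unit is meant to remove.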

The microphone device according to claim 1, wherein the noise suppression filter coefficient calculation unit includes:
A first frequency analysis unit that calculates a power spectrum of the main signal;
A second frequency analysis unit that calculates a power spectrum of the noise reference signal after subtraction by the subtraction unit;
A power spectrum ratio calculation unit that calculates, only when the determination unit determines that the level ratio is smaller than the predetermined value, a time average of the power spectrum ratio between the power spectrum calculated by the first frequency analysis unit and the power spectrum calculated by the second frequency analysis unit;
A multiplication unit that multiplies the time average of the power spectrum ratio calculated by the power spectrum ratio calculation unit by the power spectrum calculated by the second frequency analysis unit; and
A coefficient calculation unit that calculates the filter coefficient of the noise suppression filter based on the power spectrum calculated by the first frequency analysis unit and the multiplication result of the multiplication unit.
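
The chain of units in this claim can be sketched per frequency bin. Several details below are assumptions, since the claim does not fix them: the ratio is taken as main over reference, the time average is an exponential one with factor `smoothing`, and the coefficient is a Wiener-style gain clipped at zero. The class and function names are hypothetical.

```python
import cmath
import random

def power_spectrum(frame):
    """Power per bin (0..N/2) from a direct DFT; adequate for short frames."""
    n = len(frame)
    spectrum = []
    for k in range(n // 2 + 1):
        acc = sum(v * cmath.exp(-2j * cmath.pi * k * t / n) for t, v in enumerate(frame))
        spectrum.append(abs(acc) ** 2)
    return spectrum

class SuppressionCoefficients:
    def __init__(self, frame_len, smoothing=0.9):
        self.ratio = [1.0] * (frame_len // 2 + 1)  # time-averaged main/ref power ratio
        self.smoothing = smoothing

    def update(self, main_frame, ref_frame, noise_only):
        eps = 1e-12
        p_main = power_spectrum(main_frame)  # first frequency analysis unit
        p_ref = power_spectrum(ref_frame)    # second frequency analysis unit
        if noise_only:                       # level ratio judged below the threshold
            a = self.smoothing
            self.ratio = [a * r + (1.0 - a) * pm / (pr + eps)
                          for r, pm, pr in zip(self.ratio, p_main, p_ref)]
        # Multiplication unit: estimated noise power contained in the main signal.
        noise_est = [r * pr for r, pr in zip(self.ratio, p_ref)]
        # Coefficient calculation unit (assumed Wiener-style gain, clipped at 0).
        return [max(0.0, 1.0 - ne / (pm + eps)) for ne, pm in zip(noise_est, p_main)]

# Hypothetical frames: noise appears in the main signal at twice the reference amplitude.
rng = random.Random(1)
coeffs = SuppressionCoefficients(frame_len=16)
for _ in range(100):
    noise = [rng.gauss(0.0, 1.0) for _ in range(16)]
    gains_noise = coeffs.update([2.0 * v for v in noise], noise, noise_only=True)

# A frame with a strong click as the target; the ratio is not updated here.
noise = [rng.gauss(0.0, 1.0) for _ in range(16)]
click = [1000.0] + [0.0] * 15
gains_target = coeffs.update([2.0 * v + c for v, c in zip(noise, click)], noise,
                             noise_only=False)
```

Updating the ratio only in noise-only frames keeps the target sound from biasing the noise estimate. Once the ratio has been learned, the gains stay near zero for noise-only frames and near one where the target dominates.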

An audio reproduction device comprising:
An audio recording unit that records audio signals of at least two channels;
A signal generation unit that generates, based on the audio signals recorded in the audio recording unit, a main signal indicating a detection result with sensitivity in the target sound direction and a noise reference signal indicating a detection result with the sensitivity dead angle directed toward the target sound direction;
A determination unit that determines whether a level ratio indicating a ratio of the signal level of the main signal to the signal level of the noise reference signal generated by the signal generation unit is greater than a predetermined value;
An adaptive filter unit that filters the main signal generated by the signal generation unit with an adaptive filter to generate a signal indicating the signal component of the target sound included in the noise reference signal generated by the signal generation unit, and that learns filter coefficients only when the determination unit determines that the level ratio is greater than the predetermined value;
A subtraction unit that subtracts, from the noise reference signal, the signal generated by the adaptive filter unit that indicates the signal component of the target sound included in the noise reference signal;
A noise suppression unit that suppresses a signal component of noise included in the main signal by using the main signal and the noise reference signal after subtraction by the subtraction unit; and
A reproduction unit that reproduces the main signal in which the noise signal component has been suppressed by the noise suppression unit,
wherein the noise suppression unit includes:
A noise suppression filter coefficient calculation unit that calculates, based on the main signal and the noise reference signal after subtraction by the subtraction unit, a filter coefficient of a noise suppression filter for suppressing signal components other than the target sound signal in the main signal; and
A time-varying coefficient filter unit that filters the main signal using the filter coefficient calculated by the noise suppression filter coefficient calculation unit.

The audio reproduction device according to claim 14, further comprising:
A video recording unit that records a video signal related to the audio signals recorded in the audio recording unit;
A video reproduction unit that reproduces the video signal recorded in the video recording unit; and
A direction receiving unit that receives, from a user, an input of a direction in which sound should be emphasized,
wherein the signal generation unit generates the main signal and the noise reference signal with the direction received by the direction receiving unit as the target sound direction.