JP2005260743A

JP2005260743A - Microphone array

Info

Publication number: JP2005260743A
Application number: JP2004071550A
Authority: JP
Inventors: Shigeki Sagayama; 茂樹嵯峨山; Masaru Kamamoto; 優鎌本; Takuya Nishimoto; 卓也西本; Toshiharu Horiuchi; 俊治堀内; Mitsunori Mizumachi; 光徳水町; Satoru Nakamura; 哲中村
Original assignee: ATR Advanced Telecommunications Research Institute International
Current assignee: ATR Advanced Telecommunications Research Institute International
Priority date: 2004-03-12
Filing date: 2004-03-12
Publication date: 2005-09-22
Anticipated expiration: 2024-03-12
Also published as: JP4156545B2

Abstract

PROBLEM TO BE SOLVED: To provide a microphone array that can improve the SNR (signal-to-noise ratio).

SOLUTION: In a microphone array with three or more microphones, the spacings between the microphones are set proportional to the mark spacings of the shortest Golomb ruler. The three or more microphones may be arranged in a straight line or in a circular shape. With four microphones, for example, the spacings between adjacent microphones are set in the ratio 1:3:2.

COPYRIGHT: (C)2005,JPO&NCIPI

Description

The present invention relates to a microphone array.

In recent years, automatic speech recognition (ASR) has been applied to anthropomorphic agents, car navigation systems, and similar products. Because noise and reverberation greatly reduce the recognition rate in real environments, research has aimed at ASR systems that are robust to both (see Reference [1]). A microphone array exploits the spatial phase difference between the target sound source and the noise source to suppress noise and reverberation, which improves recognition of distant speech.

Reference [1]: Satoshi Nakamura, “Toward robust speech recognition in a real acoustic environment,” IEICE Technical Report, SP 2002-12, pp. 31-36, 2002.

Various microphone array techniques exist. However, adaptive microphone arrays such as Griffiths-Jim and AMNOR must be trained in advance on a speech-free interval (see Reference [2]).

Reference [2]: Juro Ohga et al., Acoustic Systems and Digital Processing, IEICE, 1995.

In actual speech recognition, however, it is not always easy to detect a speech-free interval for training in a noisy environment. In addition, although such arrays remove stationary noise robustly, their performance degrades for non-stationary noise. Recognition performance therefore drops in environments where noise and reverberation change from moment to moment.

The inventors of the present application therefore aimed to achieve both ASR performance and convenience. As one example, they focused on the delay-and-sum (DS) microphone array, which requires no training and is effective at suppressing noise and reverberation, and attempted to improve its microphone spacing and layout.

The following description covers the case where a delay-and-sum (DS) microphone array is used. As is well known, a delay-and-sum microphone array adds a delay to the signal received by each microphone and then sums the delayed signals (hereinafter, delay-and-sum processing). FIG. 1 shows a configuration example of a delay-and-sum microphone array. In FIG. 1, $M_i$ ($i = 1, 2, \ldots, m$) are microphones arranged in a straight line. $D_i$ is a delay unit that adds a delay $d_i$ to the signal $x_{si}(t)$ received by microphone $M_i$. $S$ is an adder that sums the delay-unit outputs $x_{si}(t - d_i)$ and outputs the signal $y(t)$.
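The delay-and-sum processing described above can be sketched in a few lines. This is only an illustration under stated assumptions: delays are non-negative whole numbers of samples, samples before the start of a recording are treated as zero, and the function name `delay_and_sum` is hypothetical rather than taken from the patent.

```python
def delay_and_sum(signals, delays):
    """Delay each channel by an integer number of samples, then sum.

    signals: list of m equal-length sample lists, one per microphone M_i.
    delays:  list of m non-negative integer delays d_i, in samples (assumed).
    Returns y with y[t] = sum over i of x_i[t - d_i].
    """
    n = len(signals[0])
    y = [0.0] * n
    for x, d in zip(signals, delays):
        # Samples that would come from before t = 0 are treated as zero.
        for t in range(d, n):
            y[t] += x[t - d]
    return y


# Two channels receive the same pulse at different times; after
# compensating the 2-sample lead of the second channel the pulses align.
x1 = [0, 0, 0, 1, 0, 0, 0, 0]
x2 = [0, 1, 0, 0, 0, 0, 0, 0]
print(delay_and_sum([x1, x2], [0, 2]))
# [0.0, 0.0, 0.0, 2.0, 0.0, 0.0, 0.0, 0.0]
```

A signal from the steered direction adds coherently, as the pulse does here, while sound arriving with other inter-microphone delays does not line up and is attenuated in the sum.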

An object of the present invention is to provide a microphone array that can improve the SNR.

The invention described in claim 1 is a microphone array having three or more microphones, characterized in that the spacings between the microphones are set proportional to the mark spacings of the shortest Golomb ruler.

The invention described in claim 2 is the invention described in claim 1, characterized in that the three or more microphones are arranged in a straight line.

The invention described in claim 3 is the invention described in claim 1, characterized in that the three or more microphones are arranged in an arc.

The invention described in claim 4 is the invention described in any one of claims 1 to 3, characterized in that four microphones are provided and the spacings between adjacent microphones are set in the ratio 1:3:2.

According to the present invention, a microphone array that can improve the SNR is realized.

An embodiment in which the present invention is applied to a delay-and-sum microphone array is described below with reference to the drawings.

[1] Preliminary study

As a preliminary study, speech recognition experiments were run in simulation to compare the performance of delay-and-sum microphone arrays under various conditions. The study focused on the number of microphones, the microphone spacing, the angle of the noise relative to the microphone array, and the SNR. The microphones of the delay-and-sum array used in the preliminary study were arranged at equal intervals on a straight line.

ATR's BTEC test set 01 was used as the speech data for evaluation. This data consists of read-aloud travel conversations, 510 sentences in total, recorded at a 16 kHz sampling rate.

The noise was assumed to arrive from directly in front of the microphone array, and each microphone's received signal was formed by adding the same noise to speech carrying the appropriate time difference. The noise was random band-limited noise from 125 Hz to 6 kHz, matching the frequency band of speech.

For the SNR, the signal energy was obtained from the average amplitude of the speech data excluding non-speech intervals, and the noise amplitude was varied so that the SNR of the microphone received signal (the input SNR) matched the target SNR. The noise-suppressed speech produced by delay-and-sum processing was then recognized.
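The noise scaling step can be illustrated as follows. This is a rough sketch, not the procedure used in the experiments: it assumes power is measured as the mean square of the samples, that `speech` already has its non-speech intervals removed, and that the helper names `noise_gain_for_snr` and `snr_db` are hypothetical.

```python
import math


def mean_power(samples):
    """Mean-square power of a sample sequence (assumed power measure)."""
    return sum(s * s for s in samples) / len(samples)


def snr_db(speech, noise):
    """SNR in dB between a speech segment and a noise segment."""
    return 10 * math.log10(mean_power(speech) / mean_power(noise))


def noise_gain_for_snr(speech, noise, target_snr_db):
    """Amplitude gain for `noise` so that the SNR equals target_snr_db."""
    ratio = 10 ** (target_snr_db / 10)  # desired power ratio
    return math.sqrt(mean_power(speech) / (mean_power(noise) * ratio))


speech = [1.0, -1.0, 1.0, -1.0]   # toy speech samples, speech-only portion
noise = [0.5, -0.5, 0.5, -0.5]    # toy noise samples
gain = noise_gain_for_snr(speech, noise, 20)
scaled_noise = [gain * v for v in noise]
print(round(snr_db(speech, scaled_noise), 6))  # 20.0
```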

As a result, the recognition rate (word accuracy) improved as the number of microphones increased. In addition, the SNR after delay-and-sum processing varied with the microphone spacing and with the angle of the speech relative to the microphone array, and the recognition rate improved as the input SNR increased.

The number of microphones in the delay-and-sum array was then set to two, and their spacing was varied over 5 cm, 10 cm, and 15 cm. FIG. 2 shows, for each spacing, how the SNR after delay-and-sum processing changes with the angle between the sound source and the noise source. In FIG. 2, the input SNR is 20 dB and the sound source direction is varied from −90 degrees to +90 degrees in 5-degree steps.

The preliminary study shows that, if the sound source direction and the noise direction are known, the SNR after delay-and-sum processing can be improved by adjusting the microphone spacing according to FIG. 1. Therefore, if many microphones are prepared in advance, the same effect can be obtained by selecting a microphone pair with an appropriate spacing.

If an arrangement yields many different spacings with as few microphones as possible, the optimum spacing can be selected to suit the sound source and noise directions.

[2] Introducing the shortest Golomb ruler

To meet this requirement, this embodiment uses the mark spacings of the shortest Golomb ruler (optimal Golomb ruler; OGR) as the microphone spacings of the delay-and-sum microphone array.

The shortest Golomb ruler (OGR) is used for the placement of X-ray sensors and radio telescopes. Its spacing increases the number of distinct distances that can be measured even with a small number of sensors. For example, with 4 elements the positions are {0-1-4-6}, and with 10 elements the positions are {0-1-6-10-23-26-34-41-53-55}.

The shortest Golomb ruler yields more distinct spacings than an equally spaced arrangement. As shown in FIG. 3(a), when four elements are placed at equal intervals, their positions are {0-2-4-6} and there are only three distinct spacings, {2, 4, 6}. In contrast, as shown in FIG. 3(b), when four elements are placed according to the shortest Golomb ruler, their positions are {0-1-4-6} and there are six distinct spacings, {1, 2, 3, 4, 5, 6}.
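The spacing counts quoted for FIG. 3 can be checked directly. The helper below is an illustrative sketch (the name `pairwise_spacings` is hypothetical); it lists every distinct distance between two marks of an arrangement.

```python
from itertools import combinations


def pairwise_spacings(positions):
    """Sorted list of distinct distances between all pairs of positions."""
    return sorted({abs(b - a) for a, b in combinations(positions, 2)})


print(pairwise_spacings([0, 2, 4, 6]))  # [2, 4, 6]
print(pairwise_spacings([0, 1, 4, 6]))  # [1, 2, 3, 4, 5, 6]
```

Four elements form six pairs, so six distinct spacings is the most that four elements can provide; the equally spaced layout repeats spacings and reaches only three.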

The marks of a Golomb ruler form a set of non-negative integers in which no two pairs of marks have the same difference. For $M$ elements, a ruler whose marks are a sequence $a_k$ ($k = 1, 2, \ldots, M$) satisfying $0 = a_1 < a_2 < \cdots < a_M$, with all differences $\delta_{ij} = a_j - a_i$ ($1 \le i < j \le M$) distinct, is a Golomb ruler. The Golomb ruler with the smallest $a_M$ is called the shortest Golomb ruler.
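The definition translates into a short check, and for small numbers of marks a shortest ruler can be found by exhaustive search. The sketch below is illustrative only: the function names are hypothetical, and the brute-force search is practical only for a handful of marks.

```python
from itertools import combinations


def is_golomb_ruler(marks):
    """True if all pairwise differences a_j - a_i (i < j) are distinct."""
    diffs = [b - a for a, b in combinations(sorted(marks), 2)]
    return len(diffs) == len(set(diffs))


def shortest_golomb_ruler(m):
    """Brute-force search for a shortest Golomb ruler with m >= 2 marks.

    Tries total lengths in increasing order and returns the first ruler
    (0 = a_1 < ... < a_m) whose pairwise differences are all distinct.
    """
    length = m - 1
    while True:
        for inner in combinations(range(1, length), m - 2):
            marks = (0, *inner, length)
            if is_golomb_ruler(marks):
                return marks
        length += 1


print(is_golomb_ruler([0, 2, 4, 6]))  # False
print(is_golomb_ruler([0, 1, 6, 10, 23, 26, 34, 41, 53, 55]))  # True
print(shortest_golomb_ruler(4))  # (0, 1, 4, 6)
```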

Using the mark spacings of the shortest Golomb ruler as the microphone spacings of a delay-and-sum microphone array improves the speech recognition rate over a conventional delay-and-sum array whose microphones are equally spaced.

In this embodiment, the microphone spacings of the delay-and-sum microphone array are set proportional to the mark spacings of the shortest Golomb ruler. For example, the marks of the shortest Golomb ruler for $m = 4$ are {0-1-4-6}, so the mark spacings are 1, 3, 2. With four microphones, the microphones are therefore placed so that the spacings between adjacent microphones are in the ratio 1:3:2.

To further emphasize the optimum spacing, it is preferable to weight the signals before summing according to the estimated angle between the sound source and the noise: a large weight is given to the delayed signals of microphone pairs whose spacing yields a high SNR after delay-and-sum processing at that angle, and a small weight to the delayed signals of pairs whose spacing yields a low SNR.

Here, with the weight for each microphone denoted $k_i$ ($i = 1, 2, \ldots, m$), the weights are set so that their sum is 1. Under this condition, the amplitude of the sound source does not change.

FIG. 4 shows the delay-and-sum microphone array of this embodiment.

In FIG. 4, $M_i$ ($i = 1, 2, 3, 4$) are microphones; that is, this example has four microphones. $D_i$ is a delay unit that adds a delay $d_i$ to the signal $x_{si}(t)$ received by microphone $M_i$. $P_i$ is a multiplier that multiplies the delay-unit output $x_{si}(t - d_i)$ by the weight $k_i$. $S$ is an adder that sums the multiplier outputs $k_i \cdot x_{si}(t - d_i)$ and outputs the signal $y(t)$.
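The weighted structure of FIG. 4 (delay, multiply by a weight, then add) can be sketched as below. The assumptions are the same as for any simple illustration: integer-sample delays, zero signal before the start of the recording, and a hypothetical function name `weighted_delay_and_sum`. The check that the weights sum to 1 mirrors the condition stated in the description.

```python
def weighted_delay_and_sum(signals, delays, weights):
    """Compute y[t] = sum over i of k_i * x_i[t - d_i].

    signals: list of m equal-length sample lists, one per microphone.
    delays:  list of m non-negative integer delays d_i, in samples (assumed).
    weights: list of m weights k_i, which must sum to 1.
    """
    if abs(sum(weights) - 1.0) > 1e-9:
        raise ValueError("weights k_i must sum to 1")
    n = len(signals[0])
    y = [0.0] * n
    for x, d, k in zip(signals, delays, weights):
        for t in range(d, n):
            y[t] += k * x[t - d]
    return y


# A unit pulse reaches four microphones at samples 3, 2, 1, 0.
# After alignment, the weighted sum restores the unit amplitude.
signals = [[1.0 if t == arrival else 0.0 for t in range(6)]
           for arrival in (3, 2, 1, 0)]
y = weighted_delay_and_sum(signals, [0, 1, 2, 3], [0.3, 0.3, 0.2, 0.2])
print(round(y[3], 6))  # 1.0
```

Because the weights sum to 1, the aligned source keeps its original amplitude, which matches the remark that the amplitude of the sound source does not change.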

As shown in FIG. 5, the four microphones $M_1$ to $M_4$ are arranged in a straight line. Their positions are set to {0 cm-3 cm-12 cm-18 cm} so that the microphone spacings are proportional to the mark spacings of the shortest Golomb ruler. Accordingly, the spacing $W_{12}$ between $M_1$ and $M_2$ is 3 cm, the spacing $W_{23}$ between $M_2$ and $M_3$ is 9 cm, and the spacing $W_{34}$ between $M_3$ and $M_4$ is 6 cm.
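The positions in FIG. 5 follow from scaling the ruler {0-1-4-6} by a unit length of 3 cm. The short sketch below reproduces that arithmetic; the helper names are hypothetical.

```python
def scaled_positions(ruler, unit_cm):
    """Microphone positions in cm: each ruler mark times the unit length."""
    return [mark * unit_cm for mark in ruler]


def adjacent_gaps(positions):
    """Spacings between neighbouring microphones."""
    return [b - a for a, b in zip(positions, positions[1:])]


positions = scaled_positions([0, 1, 4, 6], 3)
print(positions)                 # [0, 3, 12, 18]
print(adjacent_gaps(positions))  # [3, 9, 6]
```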

[3] Evaluation experiment

[3.1] Experimental conditions

To confirm the effect of the method of the above embodiment (the proposed method), a performance evaluation experiment based on the speech recognition rate was carried out.

Speech signals in a noisy environment, as received by a microphone array, were generated by computer simulation, and speech recognition experiments were run on that data.

As microphone array parameters, speech was input from directly in front of the microphone row, and the same noise as in the preliminary study was input from a direction inclined by 30 degrees. With four microphones, the conventional equally spaced delay-and-sum microphone array (hereinafter, DS array) was compared with the proposed delay-and-sum microphone array using shortest Golomb ruler spacing (OGR-DS array).

To keep the two arrays the same overall size, the microphones were placed at {0 cm-6 cm-12 cm-18 cm} in the DS array and, as shown in FIG. 5, at {0 cm-3 cm-12 cm-18 cm} in the OGR-DS array. As a control, the recognition rate without a microphone array, that is, with a single microphone, was also measured.

For each utterance, the weight of each microphone ($k_i$, $i = 1, 2, 3, 4$) was varied in steps of 0.1, the speech and non-speech intervals were detected after delay-and-sum processing, and the SNR was obtained by comparing them. Of the 84 weight combinations, the one giving the highest SNR was used as the input to speech recognition.
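The figure of 84 combinations is consistent with one reading of the search grid: four weights in multiples of 0.1, each at least 0.1, summing to 1. That reading is an inference rather than something the text states. The sketch below enumerates the grid under that assumption; `weight_candidates` is a hypothetical name.

```python
from itertools import product


def weight_candidates(m=4, steps=10):
    """Weight vectors in multiples of 1/steps, each >= 1/steps, summing to 1.

    Assumption: every microphone keeps a non-zero weight.
    """
    return [tuple(c / steps for c in counts)
            for counts in product(range(1, steps), repeat=m)
            if sum(counts) == steps]


candidates = weight_candidates()
print(len(candidates))                      # 84
print((0.3, 0.3, 0.2, 0.2) in candidates)   # True
```

Selecting the best combination then amounts to evaluating the post-processing SNR for each candidate and keeping the maximum, for example with `max(candidates, key=snr_for_weights)`, where `snr_for_weights` stands for whatever SNR estimate is available.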

Julius3.lp2 was used as the speech recognition engine, and 200 sentences of newspaper-reading speech from the IPA-testset were used as evaluation data (see Reference [3]).

Reference [3]: Kiyohiro Shikano et al., Speech Recognition System, Ohmsha, 2001.

The acoustic features were 12th-order MFCCs, their ΔMFCCs, and ΔPower, 25 dimensions in total, analyzed with a frame length of 25 ms and a frame shift of 10 ms.

[3.2] Results and discussion

Table 1 shows the results of the speech recognition experiment.

Although the OGR-DS array is a simple method that only changes the microphone placement and applies weights, it improved the recognition rate.

In a 10 dB noise environment, the recognition rate was also measured for a DS array with five microphones placed at {0 cm-6 cm-12 cm-18 cm-24 cm}; it was 46.9%. In contrast, as shown in Table 1, the OGR-DS array achieved a recognition rate of 51.1% with only four microphones. In this comparison, the OGR-DS array was 6 cm shorter.

Thus the proposed method makes it possible to reduce both the number of microphones and the size of the microphone array.

Under these experimental conditions, the selected weights were 0.3 for the microphones at 0 cm and 3 cm and 0.2 for the microphones at 12 cm and 18 cm in 193 of the 200 sentences. This suggests that, if the weights can be expressed by a formula, the processing speed can be improved.

[4] Others

In the above embodiment, the microphones in the array are arranged in a straight line, but they may instead be arranged in an arc as shown in FIG. 6. In this case, since there are four microphones $M_1$ to $M_4$, the ratio of the spacing between $M_1$ and $M_2$, the spacing between $M_2$ and $M_3$, and the spacing between $M_3$ and $M_4$ (each measured along the arc) is set to 1:3:2.

As shown in FIG. 7, the present invention can also be applied when three or more microphones are placed along each of two or more edges of a virtual cube within the microphone array; it is applied to the microphone placement on each edge. In FIG. 7, four microphones $M_1$ to $M_4$ are placed on each edge, so the spacings between adjacent microphones on each edge are set in the ratio 1:3:2. Although not illustrated, the present invention can likewise be applied to the microphone placement on each slant edge when three or more microphones are placed along each of two or more slant edges of a virtual quadrangular pyramid.

FIG. 1 is a block diagram showing the general configuration of a delay-and-sum microphone array.
FIG. 2 is a graph showing, for each microphone spacing (5 cm, 10 cm, 15 cm), the change in SNR after delay-and-sum processing with the angle between the sound source and the noise source.
FIG. 3(a) is a schematic diagram showing the positions and the distinct spacings when four elements are placed at equal intervals, and FIG. 3(b) is a schematic diagram showing the positions and the distinct spacings when four elements are placed according to the shortest Golomb ruler.
FIG. 4 is a block diagram showing the configuration of the delay-and-sum microphone array of this embodiment.
FIG. 5 is a schematic diagram showing the arrangement of the microphones of FIG. 4.
FIG. 6 is a schematic diagram showing an example in which the microphones in the microphone array are arranged in an arc.
FIG. 7 is a schematic diagram showing an example in which three or more microphones are placed along each of two or more edges of a virtual cube within the microphone array.

Explanation of symbols

$M_i$: microphone
$D_i$: delay unit
$P_i$: multiplier
$S$: adder

Claims

1. A microphone array comprising three or more microphones, wherein the spacings between the microphones are set proportional to the mark spacings of the shortest Golomb ruler.

2. The microphone array according to claim 1, wherein the three or more microphones are arranged in a straight line.

3. The microphone array according to claim 1, wherein the three or more microphones are arranged in an arc.

4. The microphone array according to any one of claims 1, 2, and 3, wherein four microphones are provided, and the spacings between adjacent microphones are set in the ratio 1:3:2.