JP2011234213A - Sound source direction estimation apparatus

Info

Publication number: JP2011234213A
Application number: JP2010103979A
Authority: JP
Inventors: Toshiharu Kamijo; 利治上條
Original assignee: Nidec Copal Corp
Current assignee: Nidec Copal Corp
Priority date: 2010-04-28
Filing date: 2010-04-28
Publication date: 2011-11-17

Abstract

PROBLEM TO BE SOLVED: To provide a sound source direction estimation apparatus capable of precisely estimating the direction of a sound source even when the sensitivity characteristics of the microphones fluctuate. SOLUTION: In a sound source direction estimation apparatus 20, the electric signals output by microphones 1 to 4 are amplified by amplifiers 5 to 8, frequency components in a prescribed range within the audible range of each amplified signal are passed through BPFs 21 to 24, and a sound source direction operation part 27 estimates the direction of the sound source based on the frequency components that have passed through the BPFs 21 to 24. An amplification adjustment part 26 mutually adjusts, for each of the microphones 1 to 4, the amplification degree applied when generating the frequency components from the electric signals, so that the amplification degree can be adjusted to reduce differences in sensitivity characteristics at frequencies in the prescribed range. Frequency components with reduced fluctuation in sensitivity characteristics among the microphones 1 to 4 are thereby obtained, and the sound source direction is estimated precisely.

Description

The present invention relates to a sound source direction estimating apparatus.

Conventionally, as shown in Patent Document 1 below, there is known a control system applied to video conferences and the like, in which an imaging device and a plurality of microphones are provided in a housing, each microphone collects a speaker's voice and converts it into an electric signal, and the direction of the speaker (sound source) is calculated using the converted electric signals. In this system, the speaker can be imaged by controlling the angle of the imaging device so that it is directed in the calculated direction.

Patent Document 1: JP 2007-214753 A

In a system that calculates the direction of a speaker using a plurality of microphones as described above, the electric signal output from each microphone is usually amplified by an amplifier, and the direction of the sound source is calculated using the output values of the amplified electric signals. However, the sensitivity characteristics of the microphones vary somewhat because of individual differences. Therefore, to reduce the variation in sensitivity characteristics among individual microphones, it is common to adjust the amplification degree of each amplifier, for example before the system is shipped from the factory, so that the output values for a reference sound source match.

In the conventional technique, however, adjusting the amplification degree of the amplifiers could match the output values only at one particular frequency. Because the sensitivity characteristics of individual microphones differ from frequency to frequency, an error arose in the output value of each microphone whenever the frequency of the sound input to the microphones in actual use differed from the frequency of the reference sound source used for the adjustment. For this reason, it has been difficult to calculate the direction of the sound source with high accuracy.

An object of the present invention is to provide a sound source direction estimating apparatus that can accurately estimate the direction of a sound source even when the sensitivity characteristics of the microphones vary.

A sound source direction estimating apparatus according to the present invention includes: a plurality of microphones that receive sound generated by a sound source and convert the sound into electric signals; amplifying means that amplifies the electric signal output from each of the plurality of microphones; a band-pass filter that passes frequency components in a predetermined range within the audible range of each electric signal amplified by the amplifying means; and sound source direction estimating means that estimates the direction of the sound source based on the frequency components that have passed through the band-pass filter, wherein the amplification degrees applied when generating the frequency components from the electric signals corresponding to the plurality of microphones are adjustable relative to one another.

According to the sound source direction estimating apparatus of the present invention, the electric signal output from each of the plurality of microphones is amplified by the amplifying means, and the frequency components in a predetermined range within the audible range of each amplified electric signal pass through the band-pass filter. The sound source direction estimating means then estimates the direction of the sound source based on the frequency components that have passed through the band-pass filter. Because the amplification degrees applied when generating the frequency components from the electric signals corresponding to the plurality of microphones are adjustable relative to one another, the amplification degree can be adjusted so as to reduce the difference in sensitivity characteristics at frequencies in the predetermined range, even when the sensitivity characteristics of the individual microphones differ with frequency. Frequency components with reduced variation in sensitivity characteristics among the microphones are thus obtained. Since the sound source direction estimating means estimates the direction of the sound source using these frequency components, the direction of the sound source can be estimated with high accuracy.

Here, it is preferable to provide amplification degree adjusting means that adjusts the amplification degree of the frequency component corresponding to at least one microphone among the frequency components that have passed through the band-pass filter.

With this configuration, the amplification degree adjusting means adjusts the amplification degree of the frequency component corresponding to at least one microphone among the frequency components that have passed through the band-pass filter. The amplification degree can therefore be adjusted so as to reduce the difference in sensitivity characteristics in the frequency components extracted by the band-pass filter, and the improvement in direction estimation accuracy described above is suitably obtained.

It is also preferable that the amplifying means includes amplification degree adjusting means that adjusts the amplification degree of the electric signal output from at least one microphone among the electric signals output from the plurality of microphones.

With this configuration, the amplifying means can use the amplification degree adjusting means to adjust the amplification degree of the electric signal output from at least one microphone among the electric signals output from the plurality of microphones, so that the amplification degree can be adjusted to reduce the difference in sensitivity characteristics at frequencies in the predetermined range. By extracting the frequency components in the predetermined range with the band-pass filter, frequency components with reduced variation in sensitivity characteristics among the microphones are obtained, and the improvement in direction estimation accuracy described above is suitably obtained.

Here, it is preferable that the plurality of microphones are embedded inside a housing in which openings for taking in sound are formed, and that the microphones receive sound that propagates into the housing through the openings.

With this configuration, sound generated at a position offset by some angle from the direction line connecting an opening and its microphone is diffracted near the opening before reaching the microphone. The sound is attenuated by this diffraction, and the angle at which the sound source is located and the amount of attenuation are correlated in a manner that depends on the frequency of the sound. That is, for low-frequency sound the attenuation remains small even when the angle is large, whereas for high-frequency sound the attenuation increases as the angle increases. By passing the frequency components in the predetermined range of each electric signal amplified by the amplifying means through the band-pass filter, frequency components in a band that correlates strongly with the angle of the sound source can be extracted. Since the direction of the sound source is estimated using the frequency components that have passed through the band-pass filter, the direction of the sound source can be estimated with high accuracy even when omnidirectional microphones, which have no directivity of their own, are used.

According to the present invention, the direction of a sound source can be accurately estimated even when the sensitivity characteristics of the microphones vary.

FIG. 1 is a perspective view of a video conference camera to which a sound source direction estimating device according to an embodiment of the present invention is applied.

FIG. 2(a) is a cross-sectional view of the video conference camera of FIG. 1 at a position including an opening, and FIG. 2(b) is a side view seen from the axial direction of the cylindrical body in FIG. 2(a).

FIG. 3 is a block diagram showing the functional configuration of the video conference camera of FIG. 1.

FIG. 4 is a diagram for explaining the adjustment of the amplification degree in the amplification degree adjustment unit in FIG. 3.

FIG. 5 is a schematic diagram showing the angle that the direction of the sound source makes with the axis of the cylindrical body.

FIG. 6(a) is a diagram showing the position of the sound source relative to the video conference camera, and FIG. 6(b) is a diagram showing the sensitivity characteristics of the microphone in the case of FIG. 6(a).

Hereinafter, a sound source direction estimating device according to an embodiment of the present invention will be described with reference to the drawings. The following description covers the case where the sound source direction estimating device is applied to a video conference camera.

As shown in FIGS. 1 to 3, the video conference camera A is installed, for example, at a position surrounded by a plurality of participants in a video conference. It estimates the direction of a participant who is speaking (hereinafter referred to as the "speaker") and directs the camera 35 in that direction to acquire video of the speaker. The video conference camera A can communicate with the other party's video conference system via a network.

The video conference camera A has a resin first housing 10a, which has a substantially cylindrical shape with a diameter of about 10 cm and is placed on a table or the like, and a resin second housing 10b, which has a substantially rectangular parallelepiped shape. The second housing 10b is attached to the upper side of the first housing 10a via a shaft (not shown) arranged vertically along the axis of the first housing 10a, and is rotatable about that shaft relative to the first housing 10a. The camera 35 is housed in the second housing 10b.

Four circular openings 11 to 14 are formed in the side surface of the first housing 10a at the same height, spaced 90° apart in the circumferential direction. Each of the openings 11 to 14 is a hole for taking the voice of a speaker uttered around the video conference camera A into the first housing 10a, and has an inner diameter of about 8 mm.

Four cylindrical bodies 16 to 19, each extending in the radial direction of the first housing 10a from one of the openings 11 to 14 toward the inside of the first housing 10a, are fixed to the inner surface side of the first housing 10a. The inner diameter of each of the cylindrical bodies 16 to 19 is about 8 mm, and the inner peripheral surface of each of the openings 11 to 14 is substantially flush with the inner wall surface of the corresponding cylindrical body 16 to 19. The cylindrical bodies 16 to 19 form, in their interiors, cylindrical propagation paths B1 to B4 with a diameter of about 8 mm. That is, the propagation paths B1 to B4 are formed so as to extend horizontally in the radial direction of the first housing 10a from the openings 11 to 14 toward the inside of the first housing 10a. Note that the openings 11 to 14 refer to the holes formed in the outer wall of the first housing 10a; they do not include the cylindrical bodies 16 to 19 or the propagation paths B1 to B4.

Furthermore, four cylindrical microphones 1 to 4 are embedded at the ends of the cylindrical bodies 16 to 19 located on the center side of the first housing 10a, with their front faces parallel to the openings 11 to 14. Each of the microphones 1 to 4 is, for example, an omnidirectional electret condenser microphone about 7 mm in diameter and about 4 mm thick that can collect sound only on its front side. The microphones 1 to 4 incorporate diaphragms 1a to 4a, respectively, each of which is a circular diaphragm about 5 mm in diameter. The shortest distance from each of the openings 11 to 14 to the front face of the corresponding microphone 1 to 4 is about 10 mm, which is longer than the inner diameter of the openings 11 to 14. In addition, as described above, the inner diameter (cross section) of the propagation paths B1 to B4 is larger than the diaphragms 1a to 4a (see FIG. 2). As a result, sound passing through the propagation paths B1 to B4 is reliably transmitted to the diaphragms 1a to 4a. Spacers (not shown) are disposed between the microphones 1 to 4 and the inner wall surfaces of the cylindrical bodies 16 to 19.

With this configuration, in the video conference camera A, the voice of a speaker in the conference is input to the microphones 1 to 4 through the openings 11 to 14 and the propagation paths B1 to B4. The sound input to the microphones 1 to 4 vibrates the diaphragms 1a to 4a, the sound is converted into electric signals according to this vibration, and the electric signals are output from the microphones 1 to 4 to the amplifiers 5 to 8 (see FIG. 3). Note that the device is configured so that sound from outside the first housing 10a does not propagate to the internal space S on the back side of the microphones 1 to 4.

As shown in FIG. 3, the video conference camera A has a function of estimating the direction of the speaker by applying predetermined processing to the sound input to the microphones 1 to 4. Specifically, the video conference camera A includes: four amplifiers (amplifying means) 5 to 8 that receive and amplify the electric signals output from the microphones 1 to 4; band-pass filters (hereinafter referred to as "BPFs") 21 to 24 that pass frequency components in a predetermined range of the electric signals amplified by the amplifiers 5 to 8; an AD conversion unit 25 that performs analog-to-digital conversion on the frequency components that have passed through the BPFs 21 to 24; an amplification degree adjustment unit 26 that adjusts the amplification degree of the frequency component corresponding to at least one of the microphones 1 to 4 among the frequency components digitized by the AD conversion unit 25; and a sound source direction calculation unit (sound source direction estimating means) 27, a histogram registration unit 28, and a sound source direction recalculation unit 29 that estimate the direction of the sound source using the frequency components output from the amplification degree adjustment unit 26.

The amplifiers 5 to 8 are connected to the microphones 1 to 4, respectively, and amplify the electric signals output from the microphones 1 to 4 with amplifier circuits. The amplifiers 5 to 8 output the amplified electric signals to the BPFs 21 to 24, respectively.

Each of the BPFs 21 to 24 receives the electric signal output from the corresponding amplifier 5 to 8 and blocks, for example, frequency components of 2.5 kHz or less and of 6.5 kHz or more, thereby passing frequency components in the range of 2.5 kHz to 6.5 kHz. The range of frequencies passed by the BPFs 21 to 24 is a predetermined range within the audible range. Each of the BPFs 21 to 24 is configured by combining a high-pass filter that blocks frequency components of 2.5 kHz or less with a low-pass filter that blocks frequency components of 6.5 kHz or more.
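The high-pass plus low-pass combination can be illustrated with a short sketch. In the embodiment the BPFs 21 to 24 sit before the AD conversion unit 25, so they would be analog circuits; the digital cascade below is only a stand-in for illustration. The second-order (biquad) sections, the Butterworth Q, the 48 kHz sample rate, and all function names are assumptions, not details from this document.

```python
import math

def biquad_coeffs(kind, cutoff_hz, sample_rate_hz, q=math.sqrt(0.5)):
    """Normalized (b0, b1, b2, a1, a2) of a 2nd-order low-pass or high-pass section."""
    w0 = 2.0 * math.pi * cutoff_hz / sample_rate_hz
    alpha = math.sin(w0) / (2.0 * q)
    cos_w0 = math.cos(w0)
    if kind == "lowpass":
        b0, b1, b2 = (1.0 - cos_w0) / 2.0, 1.0 - cos_w0, (1.0 - cos_w0) / 2.0
    elif kind == "highpass":
        b0, b1, b2 = (1.0 + cos_w0) / 2.0, -(1.0 + cos_w0), (1.0 + cos_w0) / 2.0
    else:
        raise ValueError("kind must be 'lowpass' or 'highpass'")
    a0, a1, a2 = 1.0 + alpha, -2.0 * cos_w0, 1.0 - alpha
    return (b0 / a0, b1 / a0, b2 / a0, a1 / a0, a2 / a0)

def biquad_filter(samples, coeffs):
    """Run one biquad section over a list of samples (direct form I)."""
    b0, b1, b2, a1, a2 = coeffs
    x1 = x2 = y1 = y2 = 0.0
    out = []
    for x in samples:
        y = b0 * x + b1 * x1 + b2 * x2 - a1 * y1 - a2 * y2
        x2, x1 = x1, x
        y2, y1 = y1, y
        out.append(y)
    return out

def band_pass(samples, sample_rate_hz, low_hz=2500.0, high_hz=6500.0):
    """High-pass at low_hz followed by low-pass at high_hz (2.5 kHz to 6.5 kHz by default)."""
    high_passed = biquad_filter(samples, biquad_coeffs("highpass", low_hz, sample_rate_hz))
    return biquad_filter(high_passed, biquad_coeffs("lowpass", high_hz, sample_rate_hz))

def rms(samples):
    """Root-mean-square level of a list of samples."""
    return math.sqrt(sum(s * s for s in samples) / len(samples))

def tone(freq_hz, sample_rate_hz, n):
    """Unit-amplitude sine test tone of n samples."""
    return [math.sin(2.0 * math.pi * freq_hz * i / sample_rate_hz) for i in range(n)]

if __name__ == "__main__":
    fs = 48000.0  # assumed sample rate
    for f in (500.0, 4000.0, 15000.0):
        print(f, round(rms(band_pass(tone(f, fs, 4800), fs)), 3))
```

A tone inside the pass band (4 kHz) should keep most of its level, while tones well below 2.5 kHz or well above 6.5 kHz come out strongly attenuated.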

The AD conversion unit 25 converts the frequency components that have passed through the BPFs 21 to 24 from analog signals to digital signals. The AD conversion unit 25 may have a function of identifying which of the BPFs 21 to 24 output a given frequency component, or a plurality of AD conversion units 25 may be provided, one for each of the amplifiers 5 to 8. The AD conversion unit 25 generates the frequency components converted into digital signals and outputs each frequency component to the amplification degree adjustment unit 26 together with a signal indicating which of the microphones 1 to 4 the frequency component corresponds to.

The amplification degree adjustment unit 26 is configured by, for example, a CPU (Central Processing Unit), a ROM (Read Only Memory), a RAM (Random Access Memory), and the like. The amplification degree adjustment unit 26 receives the frequency components output from the AD conversion unit 25 and adjusts the amplification degree of the frequency component corresponding to any microphone whose sensitivity characteristics deviate from those of the other microphones. In other words, the amplification degree adjustment unit 26 can adjust, relative to one another, the amplification degrees applied when generating the frequency components from the electric signals corresponding to the microphones 1 to 4.

More specifically, the sensitivity characteristics of the microphones 1 to 4 vary because of individual differences. FIG. 4 is a diagram for explaining the adjustment of the amplification degree in the amplification degree adjustment unit 26. In FIG. 4, the horizontal axis indicates frequency, and the vertical axis indicates the output value of a microphone relative to a reference value (that is, its sensitivity) when a reference sound is generated at a fixed distance directly in front of the microphone. In the example shown in FIG. 4, the sensitivity characteristics of the microphone 1 and the microphone 2 differ with frequency. That is, the relative value of the microphone 1 decreases as the frequency increases, whereas the relative value of the microphone 2 increases as the frequency increases. In other words, the microphone 1 is more sensitive at low frequencies, and the microphone 2 is more sensitive at high frequencies.

In this case, the amplification degree adjustment unit 26 adjusts the amplification degree for the microphone 2 so that the sensitivities of the microphone 1 and the microphone 2 are substantially equal in the range of 2.5 kHz to 6.5 kHz passed by the BPFs 21 to 24. The amplification degree adjustment unit 26 performs this adjustment by multiplying the frequency component output from the AD conversion unit 25 by a constant coefficient that reflects the difference in sensitivity characteristics. More specifically, a coefficient of 1 or more is used to increase the amplification degree, and a coefficient of less than 1 is used to decrease it. The coefficient is set for each of the microphones 1 to 4, for example before shipment from the factory, through a test using a reference sound, and is stored in the amplification degree adjustment unit 26. Because the sensitivity characteristics of the microphones 1 to 4 can be aligned by this simple calculation in the amplification degree adjustment unit 26, the accuracy of estimating the direction of the sound source is improved, and the video conference camera A can reliably capture the speaker.
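The coefficient-based adjustment can be sketched as follows. This is a minimal illustration, assuming the coefficients are derived from in-band levels measured for each microphone with a common reference sound and that one microphone is chosen as the target. The function names and the example levels are hypothetical and do not come from this document.

```python
def calibration_coefficients(reference_levels, target_mic=0):
    """Per-microphone coefficients that equalize measured in-band levels.

    reference_levels: in-band output level of each microphone for the same
    reference sound (hypothetical factory measurement).
    target_mic: index of the microphone whose level the others are matched to
    (an assumption; the source only says the levels are made roughly equal).
    """
    target = reference_levels[target_mic]
    return [target / level for level in reference_levels]

def apply_gain(samples, coefficient):
    """Multiply one channel's band-passed, digitized samples by its stored coefficient."""
    return [coefficient * s for s in samples]

if __name__ == "__main__":
    # Hypothetical levels: microphone 2 reads low, microphone 3 reads high.
    levels = [1.0, 0.8, 1.25, 1.0]
    coeffs = calibration_coefficients(levels)
    print(coeffs)  # a coefficient above 1 raises a channel, below 1 lowers it
    print(apply_gain([0.4, -0.4], coeffs[1]))
```

Storing one coefficient per microphone keeps the run-time cost to a single multiplication per sample, which is in line with the "simple calculation" the description refers to.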

The adjustment of the amplification degree by the amplification degree adjustment unit 26 is not limited to multiplication by a coefficient; any technique may be used as long as it can reduce the variation in sensitivity among the microphones 1 to 4 at frequencies in the range passed by the BPFs 21 to 24. Although FIG. 4 shows, as an example, a case where the sensitivity characteristics of the microphone 1 and the microphone 2 each change with a steady trend according to frequency, in practice the sensitivity characteristics of a microphone generally vary irregularly with frequency. Even in such a case, the variation in sensitivity characteristics among the microphones 1 to 4 can be reduced by extracting the frequency components in the predetermined range with the BPFs 21 to 24 and adjusting the amplification degree in the amplification degree adjustment unit 26.

The amplification degree adjustment unit 26 outputs the frequency components corresponding to microphones whose amplification degree has been adjusted and the frequency components corresponding to microphones whose amplification degree has not been adjusted to the sound source direction calculation unit 27, together with a signal indicating which of the microphones 1 to 4 each frequency component corresponds to.

The sound source direction calculation unit 27, the histogram registration unit 28, and the sound source direction recalculation unit 29 are configured by, for example, a CPU (Central Processing Unit), a ROM (Read Only Memory), a RAM (Random Access Memory), and the like. These units estimate the direction of the speaker by calculating it from the frequency components output from the amplification degree adjustment unit 26.

Specifically, the sound source direction calculation unit 27 calculates the direction of the sound source that emitted the sound by comparing the output values (sound levels) of the frequency components corresponding to the microphones 1 to 4, based on the frequency components output from the amplification degree adjustment unit 26. A known technique can be used for this calculation. The direction of the sound source relative to the video conference camera A is expressed as an angle (0° to 360°) measured from a predetermined position in the circumferential direction of the first housing 10a (for example, the center position of the opening 11), with clockwise taken as positive when viewed from above the first housing 10a. The sound source direction calculation unit 27 calculates the direction of the sound source at predetermined intervals, for example every 40 milliseconds.
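The description leaves the level-comparison method to known techniques. As one hypothetical illustration only, and not the method of this document, the four levels can be treated as weights on the facing directions of the openings and combined as a vector sum. The assignment of openings 11 to 14 to 0°, 90°, 180°, and 270° clockwise is an assumption, as are the function name and the threshold.

```python
import math

# Assumed facing directions of openings 11 to 14, in degrees clockwise from
# opening 11 when viewed from above (the ordering is an assumption).
OPENING_ANGLES_DEG = (0.0, 90.0, 180.0, 270.0)

def estimate_direction_deg(levels):
    """Level-weighted vector sum of the opening directions.

    levels: non-negative in-band levels of microphones 1 to 4 after gain
    adjustment. Returns an angle in [0, 360), or None when the levels are
    balanced and no direction stands out.
    """
    x = sum(l * math.cos(math.radians(a)) for l, a in zip(levels, OPENING_ANGLES_DEG))
    y = sum(l * math.sin(math.radians(a)) for l, a in zip(levels, OPENING_ANGLES_DEG))
    if math.hypot(x, y) < 1e-9:
        return None
    return math.degrees(math.atan2(y, x)) % 360.0

if __name__ == "__main__":
    print(estimate_direction_deg((1.0, 1.0, 0.0, 0.0)))  # between openings 11 and 12
    print(estimate_direction_deg((0.5, 0.5, 0.5, 0.5)))  # balanced levels
```

In a scheme like this the per-microphone gain adjustment matters directly: a microphone that reads systematically high pulls every estimate toward its own opening.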

As described above, the microphones 1 to 4 are embedded at positions a predetermined length (for example, about 10 mm) inside the first housing 10a from the openings 11 to 14, so the higher the frequency of a sound, the more the sound reaching the microphones 1 to 4 is attenuated by diffraction. For example, as shown in FIG. 5, when the direction of the sound source is offset by an angle θ from the direction line connecting the center of the microphone 1 and the center of the opening 11 (that is, the axis of the cylindrical body 16), high-frequency sound reaches the microphone 1 attenuated by an amount corresponding to the angle θ. Because the BPFs 21 to 24 pass frequency components of 2.5 kHz or more, output values that reflect this angle dependence are input to the sound source direction calculation unit 27, and comparing the output values corresponding to the microphones 1 to 4 allows the sound source direction calculation unit 27 to calculate the direction of the sound source accurately. The sound source direction calculation unit 27 outputs the direction of the sound source calculated at each predetermined interval to the histogram registration unit 28.

The histogram registration unit 28 receives the sound source directions output from the sound source direction calculation unit 27 and stores them sequentially. When storing sound source directions, the histogram registration unit 28 always overwrites the oldest data with the latest data. The histogram registration unit 28 then classifies the stored sound source directions into 18 angular ranges in 20° increments and generates a histogram from the classification result. The histogram registration unit 28 outputs the generated histogram to the sound source direction recalculation unit 29.
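The overwrite-oldest storage and the 20° classification can be sketched as below. The buffer capacity is not given in this document, so the default of 50 entries (about 2 seconds at one estimate every 40 milliseconds) is an assumption, as are the class and method names.

```python
from collections import deque

BIN_WIDTH_DEG = 20
NUM_BINS = 360 // BIN_WIDTH_DEG  # 18 angular ranges

class DirectionHistogram:
    """Fixed-size store of recent direction estimates with 20-degree binning."""

    def __init__(self, capacity=50):  # capacity is an assumed value
        # A deque with maxlen discards the oldest entry when a new one is
        # appended to a full buffer, matching the overwrite-oldest behavior.
        self._directions = deque(maxlen=capacity)

    def register(self, direction_deg):
        """Store one direction estimate, normalized to [0, 360)."""
        self._directions.append(direction_deg % 360.0)

    def counts(self):
        """Return the number of stored estimates in each of the 18 ranges."""
        counts = [0] * NUM_BINS
        for d in self._directions:
            counts[int(d // BIN_WIDTH_DEG) % NUM_BINS] += 1
        return counts

if __name__ == "__main__":
    hist = DirectionHistogram(capacity=3)
    for d in (10.0, 30.0, 50.0, 350.0):  # the first value is overwritten
        hist.register(d)
    print(hist.counts())
```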

The sound source direction recalculation unit 29 receives the histogram output from the histogram registration unit 28. It recalculates the direction of the sound source by obtaining an average value of the angular range based on the count in the angular range with the highest count in the histogram and the counts in the angular ranges adjacent to it. The sound source direction recalculation unit 29 outputs the recalculated direction of the sound source to the motor control unit 31 as the direction of the speaker.
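One way to read this recalculation is as a count-weighted average over the peak angular range and its neighbors. The sketch below assumes that reading. The function name `recalculate_direction`, the use of bin centers, and the choice of one neighbor on each side are assumptions, since the source does not state the exact formula.

```python
BIN_WIDTH_DEG = 20
NUM_BINS = 18


def recalculate_direction(counts, neighbors=1):
    """Count-weighted mean angle around the most populated angular range.

    counts: list of 18 counts, one per 20-degree range.
    neighbors: assumed number of adjacent ranges used on each side.
    Returns an angle in [0, 360), or None if the histogram is empty.
    """
    if len(counts) != NUM_BINS:
        raise ValueError("expected 18 angular ranges")
    if sum(counts) == 0:
        return None
    peak = max(range(NUM_BINS), key=lambda i: counts[i])
    total = 0
    weighted_offset = 0.0
    # Work with offsets relative to the peak so that ranges on either side
    # of the 0/360 boundary average correctly.
    for k in range(-neighbors, neighbors + 1):
        c = counts[(peak + k) % NUM_BINS]
        total += c
        weighted_offset += c * k * BIN_WIDTH_DEG
    peak_center = peak * BIN_WIDTH_DEG + BIN_WIDTH_DEG / 2
    return (peak_center + weighted_offset / total) % 360.0
```

If the 40° to 60° range holds 6 counts and the 60° to 80° range holds 2, the result shifts from the bin center of 50° toward the neighbor and becomes 55°.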

In this way, the sound source direction calculation unit 27, the histogram registration unit 28, and the sound source direction recalculation unit 29 carry out the calculation of the sound source direction, and the direction of the speaker is estimated. As shown in FIGS. 1 and 3, in the video conference camera A, the sound source direction estimating apparatus 20 comprises the first housing 10a, the openings 11 to 14, the cylindrical bodies 16 to 19, the microphones 1 to 4, the amplifiers 5 to 8, the BPFs 21 to 24, the AD conversion unit 25, the amplification degree adjustment unit 26, the sound source direction calculation unit 27, the histogram registration unit 28, and the sound source direction recalculation unit 29.

Furthermore, as shown in FIG. 3, the video conference camera A has a function of pointing the camera 35 in the direction of the sound source estimated by the sound source direction estimating apparatus 20 and acquiring video of the speaker with the camera 35. Specifically, the video conference camera A includes: a motor control unit 31 that controls the motor drive unit 32 based on the direction of the sound source output from the sound source direction recalculation unit 29; a motor drive unit 32 that is controlled by the motor control unit 31 and drives the motor 33; a motor 33 that is driven by the motor drive unit 32 and applies a rotational force to the camera rotation mechanism 34; a camera rotation mechanism 34 that is coupled to the motor 33 and, by rotating, turns the camera 35 together with the second housing 10b; a camera 35 that acquires video; a video processing unit 36 that processes the video data of the video acquired by the camera 35; a data transfer unit 37 that transfers the video data processed by the video processing unit 36; and a network communication unit 38 that transmits the video data transferred by the data transfer unit 37 to a network.

Next, the operation of the video conference camera A is described. The following description takes as an example the case shown in FIG. 6(a), in which a voice is emitted by a speaker located at 45°, equidistant from the opening 11 and the opening 12 and on the side opposite the openings 13 and 14. As shown in FIG. 6(b), the output values of the frequency components corresponding to the microphones 1 to 4 differ in peak value between 2.5 kHz and 10 kHz. More specifically, the output values between 2.5 kHz and 10 kHz are equal for the microphones 1 and 2, while those for the microphones 3 and 4 are attenuated relative to the microphones 1 and 2. By contrast, the output values at or below 2.5 kHz are substantially equal across the microphones 1 to 4.

In the sound source direction estimating apparatus 20, each of the BPFs 21 to 24 blocks frequency components at or below 2.5 kHz and at or above 6.5 kHz, so that frequency components in the range of 2.5 kHz to 6.5 kHz are passed. In the example shown in FIG. 6(b), the components in the 2.5 kHz to 6.5 kHz range, which are strongly affected by diffraction, are extracted as output values. In addition, the amplification degree adjustment unit 26 adjusts the amplification degree of the frequency component corresponding to at least one of the microphones 1 to 4 so that the sensitivities of the microphones 1 to 4 are substantially equal in the 2.5 kHz to 6.5 kHz range. The direction of the sound source is then calculated using the frequency components output from the amplification degree adjustment unit 26, and the sound source is estimated to be at the 45° position. The sound source direction estimating apparatus 20 outputs a signal indicating 45° to the motor control unit 31.
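The effect of restricting the signal to the 2.5 kHz to 6.5 kHz band can be illustrated with a small digital sketch. Note the substitution: the source describes band-pass filters but gives no filter design, so the code below instead sums discrete Fourier transform bins inside the band to obtain one level per microphone. The function name `band_level`, the sample rate, and the block length are assumptions made for illustration.

```python
import cmath
import math


def band_level(samples, sample_rate, f_lo=2500.0, f_hi=6500.0):
    """Level of the signal content strictly between f_lo and f_hi.

    Uses a naive DFT over the bins inside the band (an assumed stand-in
    for a band-pass filter). For a pure sine of amplitude A that falls on
    a bin inside the band, the result is A / 2.
    """
    n = len(samples)
    power = 0.0
    for k in range(1, n // 2):
        freq = k * sample_rate / n
        if f_lo < freq < f_hi:  # components at or outside the edges are dropped
            coeff = sum(
                x * cmath.exp(-2j * math.pi * k * i / n)
                for i, x in enumerate(samples)
            )
            power += abs(coeff) ** 2
    return math.sqrt(power) / n


def sine(freq_hz, sample_rate, n, amplitude=1.0):
    """Generate n samples of a test tone (helper for the example)."""
    return [
        amplitude * math.sin(2 * math.pi * freq_hz * i / sample_rate)
        for i in range(n)
    ]
```

For example, `band_level(sine(4000, 32000, 256), 32000)` yields a level close to 0.5, while a 1 kHz tone of the same amplitude yields a level close to zero because it lies outside the band. Comparing such per-microphone levels is one plausible input to the direction calculation, although the source does not spell out that comparison in this section.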

The motor control unit 31 receives the signal indicating 45° output from the sound source direction estimating apparatus 20, controls the motor drive unit 32 to drive the motor 33, and causes the camera rotation mechanism 34 to turn the camera 35 together with the second housing 10b so that the camera 35 faces the 45° direction. The turning control of the camera 35 by the motor control unit 31 may instead be executed only when sound source directions within a certain range have been input continuously for a predetermined time, using the average of those input directions.
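The optional turning rule in the last sentence can be sketched as a small state machine. This is an assumed interpretation: because directions arrive at fixed intervals, the "predetermined time" is represented by a number of consecutive inputs, and the "certain range" by a tolerance around the first input of a run. The class name `TurnDecider` and both default values are hypothetical.

```python
class TurnDecider:
    """Hypothetical helper: emits a turn target only after a stable run."""

    def __init__(self, tolerance_deg=10.0, required_count=5):
        self.tolerance_deg = tolerance_deg    # assumed "certain range"
        self.required_count = required_count  # assumed "predetermined time"
        self._ref = None      # first direction of the current run
        self._offsets = []    # signed offsets of the run relative to _ref

    def update(self, angle_deg):
        """Feed one direction; return a target angle or None."""
        if self._ref is not None:
            # Signed difference in (-180, 180], so 359 and 1 count as close.
            offset = (angle_deg - self._ref + 180.0) % 360.0 - 180.0
            if abs(offset) <= self.tolerance_deg:
                self._offsets.append(offset)
            else:
                self._ref = None  # run broken; start again from this input
        if self._ref is None:
            self._ref = angle_deg % 360.0
            self._offsets = [0.0]
        if len(self._offsets) < self.required_count:
            return None
        mean_offset = sum(self._offsets) / len(self._offsets)
        target = (self._ref + mean_offset) % 360.0
        self._ref = None
        self._offsets = []
        return target
```

With `required_count=3`, the inputs 44°, 46°, 45° produce a target of 45° on the third input, whereas an isolated outlier restarts the run and produces no turn.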

The camera 35 acquires video of the scene in front of it, generates video data representing the acquired video, and outputs the generated video data to the video processing unit 36. The video processing unit 36 receives the video data output from the camera 35, applies video processing to it, and outputs the processed video data to the data transfer unit 37. The data transfer unit 37 receives the video data output from the video processing unit 36 and transfers it to the network communication unit 38. The network communication unit 38 then transmits the video data transferred from the data transfer unit 37 over the network to the other party's video conference system.

According to the sound source direction estimating apparatus 20 of this embodiment, the electrical signals output from the microphones 1 to 4 are amplified by the amplifiers 5 to 8, and the frequency components in a predetermined range within the audible range of each amplified electrical signal pass through the BPFs 21 to 24. The sound source direction calculation unit 27 then estimates the direction of the sound source based on the frequency components that have passed through the BPFs 21 to 24. Here, the amplification degrees used when generating the frequency components from the electrical signals corresponding to the microphones 1 to 4 can be adjusted relative to one another by the amplification degree adjustment unit 26. Therefore, even when the sensitivity characteristics of the microphones 1 to 4 differ depending on frequency, the amplification degrees can be adjusted so as to reduce the differences in sensitivity characteristics at frequencies in the predetermined range. As a result, frequency components with reduced variation in sensitivity characteristics among the microphones 1 to 4 are obtained, and because the sound source direction calculation unit 27 estimates the direction of the sound source using these frequency components, the direction of the sound source can be estimated accurately.
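The mutual adjustment of amplification degrees can be pictured as a simple gain-matching step. The sketch below assumes a calibration procedure that the source does not describe: each microphone's in-band level is measured under identical conditions, and gains are chosen so that every channel matches a reference microphone. The function names and the choice of the first microphone as the reference are hypothetical.

```python
def matching_gains(calibration_levels, reference_index=0):
    """Gain per microphone that equalises in-band levels to a reference.

    calibration_levels: in-band level of each microphone for the same
    calibration sound (assumed measurement, not given in the source).
    The reference microphone keeps a gain of 1.0.
    """
    ref = calibration_levels[reference_index]
    if any(level <= 0 for level in calibration_levels):
        raise ValueError("calibration levels must be positive")
    return [ref / level for level in calibration_levels]


def apply_gains(levels, gains):
    """Scale measured in-band levels by the per-microphone gains."""
    return [level * gain for level, gain in zip(levels, gains)]
```

For calibration levels of 1.0, 0.8, 1.25 and 1.0, the gains are 1.0, 1.25, 0.8 and 1.0, so only the second and third channels are changed. This matches the statement that the amplification degree of at least one microphone's frequency component is adjusted. After correction, remaining level differences between channels reflect the direction of the sound rather than microphone sensitivity.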

Furthermore, because the frequency components in the predetermined range of each electrical signal amplified by the amplifiers 5 to 8 pass through the BPFs 21 to 24, frequency components in a band that correlates strongly with the angle at which the sound source is located can be extracted. Since the direction of the sound source is estimated using the frequency components that have passed through the BPFs 21 to 24, the direction of the sound source can be estimated accurately even when omnidirectional microphones 1 to 4 are used.

A preferred embodiment of the present invention has been described above, but the present invention is not limited to that embodiment.

For example, the above embodiment describes the case in which the amplification degree adjustment unit 26 adjusts the amplification degree of the frequency component corresponding to at least one microphone among the frequency components that have passed through the BPFs 21 to 24. However, the amplification degree adjustment means may instead be provided in each of the amplifiers 5 to 8. That is, each of the amplifiers 5 to 8 may have, as the amplification degree adjustment means, an adjustment mechanism (for example, an adjustment knob) for adjusting the amplification degree of the electrical signal output from the corresponding microphone 1 to 4. In this case, by using the adjustment mechanism to adjust the amplification degree of the electrical signal output from at least one of the microphones 1 to 4, the variation in sensitivity characteristics among the microphones 1 to 4 can be reduced in the same way as in the above embodiment.

The above embodiment also describes the case in which four microphones 1 to 4 are provided, but the number of microphones may be two or three, or five or more. In addition, the range of frequencies passed by the BPFs 21 to 24 can be set as appropriate within the audible range according to the characteristics of the microphones. Setting the range within the audible range allows the sound source direction estimating apparatus to be applied to sound sources of any frequency that humans can hear, which increases its versatility.

Furthermore, the above embodiment describes the case in which the sound source direction estimating apparatus 20 is applied to the video conference camera A, but its application is not limited to this. It can be applied to any apparatus that has a plurality of microphones, estimates the direction of a sound source, and uses the estimated direction. For example, the sound source direction estimating apparatus of the present invention can also be applied to a surveillance camera or the like that detects an abnormal sound and captures the direction of the source of that sound.

DESCRIPTION OF SYMBOLS: 1 to 4 ... microphones; 5 to 8 ... amplifiers (amplification means); 10a ... housing; 11 to 14 ... openings; 20 ... sound source direction estimating apparatus; 21 to 24 ... BPFs; 26 ... amplification degree adjustment unit (amplification degree adjustment means); 27 ... sound source direction calculation unit (sound source direction estimating means).

Claims

A sound source direction estimating apparatus comprising:
a plurality of microphones that receive sound generated by a sound source and convert the sound into electrical signals;
amplification means for amplifying the electrical signal output from each of the plurality of microphones;
a bandpass filter that passes frequency components in a predetermined range within the audible range of each of the electrical signals amplified by the amplification means; and
sound source direction estimating means for estimating the direction of the sound source based on the frequency components that have passed through the bandpass filter,
wherein the amplification degrees used when generating the frequency components from the electrical signals corresponding to the plurality of microphones are adjustable.

The sound source direction estimating apparatus according to claim 1, further comprising amplification degree adjustment means for adjusting the amplification degree of the frequency component corresponding to at least one microphone among the frequency components that have passed through the bandpass filter.

The sound source direction estimating apparatus according to claim 1 or 2, wherein the amplification means includes amplification degree adjustment means for adjusting the amplification degree of the electrical signal output from at least one microphone among the electrical signals output from the plurality of microphones.

The sound source direction estimating apparatus according to any one of claims 1 to 3, wherein the plurality of microphones are embedded in a housing in which openings for taking in the sound are formed, and receive the sound that propagates into the housing through the openings.