JP3599653B2 - Sound pickup device, sound pickup / sound source separation device and sound pickup method, sound pickup / sound source separation method, sound pickup program, recording medium recording sound pickup / sound source separation program

Info

Publication number: JP3599653B2
Application number: JP2000270043A
Authority: JP
Inventors: 真理子青木; 賢一古家; 和彦山森
Original assignee: Nippon Telegraph and Telephone Corp; NTT Inc
Current assignee: Nippon Telegraph and Telephone Corp; NTT Inc
Priority date: 2000-09-06
Filing date: 2000-09-06
Publication date: 2004-12-08
Anticipated expiration: 2020-09-06
Also published as: JP2002084590A

Description

【０００１】
【発明の属する技術分野】
本発明は、空間に複数の音源が異なる位置に配置されている場合に、３本以上のマイクロホンを用いて、空間を複数のゾーンに分割し、各ゾーンにある音源からの音を他ゾーンの音源とは独立に収音する収音装置、収音・音源分離装置及び収音、収音・音源分離方法並びにこの方法を実行するプログラムを記録した記録媒体に関する。
【０００２】
【従来の技術】
従来のゾーン分離収音技術には、例えば、音が持つ次のような特徴を利用したものがある。音はいくつかの周波数成分の和として表現されることが知られている。そこで、複数の音が同時に鳴っている場合、各チャネルのマイクロホンに入力される音源信号を各音源からの周波数成分が周波数軸上で重ならない程度の帯域に分割し、チャネル間の各周波数成分の到達時間差や到達レベル差を基に、各周波数成分それぞれがどのゾーンからのものであるか判定し、同一ゾーンからの成分を集めて合成することにより、各ゾーン毎の音を個別に収音する方法が用いられていた。（参考文献：特開平１０−３１３４９７号公報（特願平０９−２５２３１２号）「音源分離方法、装置および記録媒体」）。
【０００３】
【発明が解決しようとする課題】
ところが、この従来の技術では、各音源信号が各チャネルに到達する時間差およびレベル差を用いているため、マイクロホンの指向特性によりゾーンが固定されていた。よって、複数の音源が同一ゾーン内に入る場合、マイクロホンの向きを物理的に変えることなく、この音源をそれぞれ分離抽出することは不可能であった。（課題１）
また、ゾーンを切り替えるためには音源の方向を検出する必要があるが、上記入力系の構成、すなわち、２本の両指向性マイクロホンと１本の全指向性マイクロホンを同軸上に配置した構成で音源の方向を推定することは困難であった。従来、音源方向を推定するためには、２本以上の全指向性マイクロホンを互いに離して設置し、音源から各マイクロホンへの到達時間の差を検出する方法が知られているが、この構成は、マイクロホンの本数及び規模が大きくなるという欠点があった。（課題２）
【０００４】
【課題を解決するための手段】
本発明は、３本の収音手段で収音された信号の線形和を算出することで入力手段が持つ指向特性を電気的に可変にし、従来、同一ゾーンに含まれるため分離が不可能であった音源に対しても、マイクロホンを物理的に動かすことなくゾーンを切り替え、個々の音源を分離抽出することを特徴とする（課題１の解決）。
さらに、２本の両指向性収音手段と１本の全指向性収音手段を用いて音源方向を推定することで、マイクロホンの本数および規模を変えることなく、音源方向推定を可能とする（課題２の解決）。さらに、推定された音源方向に応じてゾーンを分割するのに適した指向性を選択することで、音源の位置に依らず自律的にゾーン毎の音を分離抽出することを特徴とする。
【０００５】
ここで、音源方向推定の基本的な考え方を説明する。
図１の音源方向推定の説明図を参照して音源１，２が図のように配置された場合を例に挙げて説明する。この場合、主軸方向が９０°の単一指向性の指向性可変手段４−１と２７０°の単一指向性の指向性可変手段４−３の出力信号のレベル差ΔＬを測定すると、音源１，２が二つとも０°〜１８０°の間にあるため、レベル差ΔＬは、単一指向性の元々持つ感度差（２０ｄＢ〜３０ｄＢ）とほぼ同じ値となる。一方、指向性を回転させ、主軸方向が１８０°の単一指向性の指向性可変手段４−１と、０°の単一指向性の指向性可変手段４−３のレベル差を測定すると、レベル差ΔＬはほぼ０ｄＢとなる。よって、ΔＬを観測することで、音源２つの内一つ（音源１）は０°〜９０°の範囲に、もう一つ（音源２）は９０°〜１８０°の範囲にあることがわかる。そこで、主軸方向を少しづつ回転させ、すべての方向に対してΔＬを観測すれば、どの方向に音源があるのかがわかる。本発明はこの方法を用いる。
各手段の具体的な方法については次の発明の実施の形態において説明する。
【０００６】
【発明の実施の形態】
図２に本発明の実施例である収音・音源分離装置の概要構成を示す。
収音手段１は両指向性マイクロホン１、収音手段２は両指向性マイクロホン１と主軸が直交するように配置された両指向性マイクロホン２、収音手段３は全指向性マイクロホン３で構成される。また、収音手段１〜３は同軸上に配置される。
【０００７】
４の指向性可変手段はそれぞれ、上記３本の収音手段の出力信号の線形和を算出して出力する。取り得る線形和とは具体的には、式（１）で表される。ここでＳ１（ｔ），Ｓ２（ｔ），Ｓ３（ｔ）は収音手段１，２，３の出力信号、Ａｉ，Ｂｊ，Ｃｋは重み付け係数、ｉ，ｊ，ｋは各収音手段を特定するインデックス、Ｐは１つの指向性パタンを特定するインデックス、Ｍ，Ｎ，Ｑはそれぞれ任意の正の整数、ｎは指向性可変手段の数とする。
【０００８】
【数１】

【０００９】
次に、６の指向性選択手段について説明する。
指向性選択手段６は、音源方向情報φ_１，・・・，φ_ｎを用い、その方向の音を高いＳ／Ｎで収音できる指向性（指向性可変手段）を選択する。
一つの例として、音源方向に対し、死角を向けるような指向性を選択する方法がある。例えば、音源が２個ある場合、それぞれの音源方向に死角を向けた指向性を二つ選択する。ここで、説明のために、音源１の方向に死角を向けた指向性を指向性１、音源２の方向に死角を向けた指向性を指向性２とする。指向性１を有する指向性可変手段の出力信号は、音源１の音が抑圧され、音源２の音が高いＳ／Ｎで収音されていると期待できる。同様に、指向性２を有する指向性選択手段の出力信号は、音源２の音が抑圧され、音源１の音が高いＳ／Ｎで収音されていると期待できる。さらに、これら出力信号を従来の音源分離装置の入力信号とすることで、より精度の高い音源分離が実現できる。
【００１０】
また、音源方向に指向性の主軸を向ける指向性選択も例として挙げられる。
さらに、隣り合う音源の方向を結ぶ線分の方向に主軸を向けるよう指向性を選択する方法も例として挙げられる。例えば、部屋を２つのゾーンに分割することを考える場合、２本の単一指向性を互いに主軸方向が逆に向くようにして実現する方法が考えられる。この場合、指向性可変手段二つを使って、式（２）、（３）に示す指向性を形成し、それを入力信号として、従来の音源分離装置に供給すればよい。このとき、角度θは０＜θ＜３６０°であり、主軸の回転角度τは音源方向情報φ_１，・・・，φ_ｎで設定でき、なおかつ、マイクロホンを物理的に回転させること無くゾーン別収音を実現できる。
【００１１】
【数２】

【００１２】
回転角度τを生じさせる方法を以下に説明する。
互いに主軸方向が直交するように配置された両指向性マイクロホン１，２の出力（電圧Ｅとする）はそれぞれ式（４），（５）で与えられる。
Ｅ_１＝ｃｏｓθ （４）
Ｅ_２＝ｓｉｎθ （５）
ここで、両指向性マイクロホン１，２のゲインａ，ｂを与え、両出力を加算することで、主軸方向を任意に回転可能な両指向性マイクロホンの出力を合成できる。この原理を式（６），（７）に示す。
【００１３】
【数３】

【００１４】
上記方法で両指向性マイクロホンと全指向性マイクロホンの出力を加算することで、主軸方向を回転させた単一指向性を合成することができる。
指向性選択手段６で選ばれる指向性は、指向性可変手段の説明で述べたとおり、３本のマイクロホン出力の線形和で表される特性すべてであるが、具体例として単一指向性、ＭＳステレオ型指向性、などが挙げられる。
【００１５】
指向性選択手段６で選択した指向性可変手段４の出力は音源分離装置５に入力され各音源信号が分離される。
音源分離装置５の構成を図１０に示す。
この音源分離装置は特開平１０−３１３４９７号公報に記載の音源分離装置と同様な構成を備え、上記公報に記載のものは、帯域分割部にＲ，Ｌチャネルのマイクロホンの出力信号を入力するのに対し、本発明の音源分離装置は指向性選択手段６で選択された指向性可変手段４の２つの出力信号、チャネル１，２信号（２つの異なる指向性を有する指向性可変手段の出力）を入力する点で相違している。
【００１６】
チャネル１，２信号は帯域分割部５１とチャネル間時間差／レベル差検出部５２に入力され、帯域分割部５１ではそれぞれに１つの音源信号成分のみ存在する複数の周波数帯域信号（ｃｈ１（ｆ_１），・・・，ｃｈ１（ｆ_ｎ）、ｃｈ２（ｆ_１），・・・，ｃｈ２（ｆ_ｎ））に分割されて帯域別チャネル間時間差／レベル差検出部５３と音源判定信号判別部５４に入力される。音源判定信号選別部５４において、各検出部５２，５３のチャネル間時間差／レベル差検出出力に基づいて音源信号判定部５５でその帯域の帯域分割された各出力チャネル信号（ｃｈ１（ｆ_１），・・・，ｃｈ１（ｆ_ｎ）、ｃｈ２（ｆ_１），・・・，ｃｈ２（ｆ_ｎ））の何れがどの音源から入力された信号であるかが判定され、この判定結果により帯域分割された各出力チャネル信号から、同一音源から入力された信号を少なくとも一つ選択し、選択された帯域毎の成分信号は音源信号合成部５６Ａ，５６Ｂで合成され出力することで複数の音源信号が分離される。
【００１７】
なお、上記の説明では音源分離装置に指向性可変手段の２つの出力信号を入力しているが、２つ以上の出力信号を入力しても同様に複数の音源信号が分離できる。
次に、音源方向推定について説明する。
図３に本発明の音源方向推定手段を備えた収音・音源分離装置のブロック図を示す。
【００１８】
両指向性マイクロホン１，２、全指向性マイクロホン３、指向性可変手段４，音源方向推定手段７，及び指向性選択手段６から収音装置が構成され、収音装置と音源分離装置５から収音・音源分離装置が構成される。
指向性可変手段を複数個有し、その出力信号を基に音源方向検出手段７において音源方向を推定する。音源方向の推定方法については音源方向推定手段の説明で詳述する。指向性選択手段６では、７の音源方向推定手段において推定された方向の音を高いＳ／Ｎで収音できる指向性を選択する。指向性選択手段６は、図１と例と同じである。音源方向推定手段を備えた構成により、音源の方向が未知の場合でも、それを推定し、自動的に各ゾーンの音を収音できる。
【００１９】
次に、音源方向推定手段の構成例１について説明する。
図４に音源方向推定手段の構成例１のブロック図を示す。
指向性可変手段４においては、例えば、主軸方向をΔτ度毎に変えた単一指向性を（３６０／Δτ）個形成し、音源方向推定手段７へ供給する。
次に、音源方向推定手段７について説明する。
まず、指向性可変手段４の出力信号が供給された組み合わせ決定手段８においては、上記合成された指向性を二つずつの組（２つの異なる指向性を対応付けた指向性可変手段の組）にする。この組をＰｎで表す。例えば、主軸方向が反対向きになる単一指向性同士を組にする。
【００２０】
９のレベル差算出手段においては、上記組み合わせ決定手段８で組み合わされたペアの出力レベル差（組み合わされた指向性可変手段の出力の差分）を算出する。このレベル差をΔＬｎとする。
次に、１０のレベル差方向対応付け手段においては、上記組み合わせ手段で組み合わされた各ペア（Ｐｎ）に対し、上記レベル差算出手段で算出された各ΔＬｎと、ある角度θｎを対応付ける。この対応を、（Ｐｎ、ΔＬｎ、θｎ）で表す。例えば、図５に示すように、回転角度がｎΔτ度の単一指向性を有する指向性可変手段４−１と、主軸方向がそれと逆向きの単一指向性を有する指向性可変手段４−３とのペアに対しては、レベル差ΔＬｎと、回転角度ｎΔτ度と直交する方向（ｎΔτ＋９０）度を対応付ける。
【００２１】
１１のレベル差変動幅検出手段においては、上記レベル差方向対応付け手段１０で対応付けた方向の、降順または昇順にΔＬｎを並べなおし、隣接するΔＬｎ＋１との差分（ΔＬｎ，ｎ＋１）を算出する。この差分をレベル差変動幅と表す。
図６に示した方向に音源がある場合を例に説明する。
ここでは例として、角度θｎと対応付けられるレベル差ΔＬｎ、指向性可変手段はθｎと直交する方向に主軸を持つ、互いに主軸が逆向きの指向性可変手段の組を対応付ける。図６には、昇順に並べなおされた方向θｎの代表例、およびθ１と対応付けられた指向性可変手段４−１、および４−３を図示してある。ΔＬｎは、指向性可変手段４−１のレベルから指向性可変手段４−３のレベルを引いたものと定義する。この場合に、ΔＬｎとθｎの関係をグラフに示したのが図７である。ちょうど、音源方向を境に、ΔＬｎの値が大きく変動していることがわかる。すなわち、レベル差変動幅はΔＬｎ，ｎ＋１のときに最大になることがわかる。
【００２２】
レベル差変動幅Δｎの値が大きく変動するとは、言い換えれば指向性の方向が音源をまたぐ時に生じるレベル差の有意な正負の変化を意味する。図７では、レベル差ΔＬｎを、指向性可変手段４−１の出力から指向性可変手段４−３の出力を減算したものとする。この場合、θ１〜θｎまでは、音源は指向性可変手段４−１の出力が、指向性可変手段４−３の出力に比べ大きい側にあるため、レベル差ΔＬｎは正値を取る。しかし、θｎを越えると、今度は音源は、指向性可変手段４−３の出力が、指向性可変手段４−１の出力に比べて大きい側にあることになり、レベル差ΔＬｎは負の値を取ることになる。
【００２３】
１２のレベル差変動方向対応付け手段では、例えば、ΔＬｎ，ｎ＋１のときにはθｎを対応させる。ここで、対応付けのほかの例として、｛（θｎ＋（θｎ＋１）｝／２を対応付けてもよい。
１３のレベル差変動ピーク検出手段においては、レベル差変動幅が最大になる（ΔＬｎ，ｎ＋１，θｎ）を検出する。ここで、検出されるピークの個数は一つである必要は無く、変動が大きいところから順に複数個とってもよい。例えば、ある閾値ｋを越える変動幅はすべて検出する、という方法を採れば、複数音源が同時に鳴っている場合でも複数音源の方向を検出できる。
【００２４】
１４の方向検出手段においては、上記レベル差変動ピーク検出手段で検出されたΔＬｎ，ｎ＋１に対応する方向θｎを検出する。ここでも、上記に述べたように、θｎは複数個検出してもよい。上記のように検出された音源方向の情報は、指向性選択手段に送られる。
次に、音源方向推定手段の構成例２について説明する。
図８に音源方向推定手段の構成例２のブロック図を示す。
【００２５】
これは、音源方向の推定を、図４の音源方向推定手段に比べて精度良く行うためのものである。図４の音源方向推定手段では、指向性可変手段で合成した指向性により生じるレベル差を算出する。しかし、部屋の反射等が無視できない環境においては、電気的に合成した指向性は鈍くなる。例えば、無響室においては死角と主軸方向とのレベル差が３０ｄＢ以上つくような単一指向性であっても、反射がある通常の部屋では、１０ｄＢから１５ｄＢ程度の差しか生じなくなる。その結果、レベル差変動ピーク検出手段において誤差が生じる可能性が高まる。
【００２６】
そこで、方向によるレベル差をより大きく生じさせ、レベル差変動ピーク検出手段における推定精度を上げるために、以下の処理を行う。
ここで、指向性可変手段および組み合わせ決定手段は、図４の場合と同様とする。すなわち、指向性可変手段においては、主軸方向をΔτ度毎に変えた単一指向性を３６０／Δτ個形成し、音源方向検出手段へ供給する。信号を供給された組み合わせ決定手段においては、上記合成された指向性を二つずつの組にする。ここでは、主軸方向が反対向きになる指向性同士を組にする。
【００２７】
次に、１５の帯域分割手段において、組み合わせ決定手段８でペアにされた指向性可変手段からの出力信号を、それぞれ、複数の周波数帯域に分割する。帯域分割の際、一つの周波数帯域に含まれる成分が単独音源からの周波数のみで生成されていると近似できる程度に細かく分割する。この帯域分割の方法には、例えば、フーリエ変換が用いられる。帯域分割された信号は１６の帯域別差分算出手段へ供給される。
【００２８】
１６の帯域別差分算出手段は、各周波数帯域において、組み合わせ決定手段でペアにされた指向性可変手段からの出力信号レベル差を算出する。これを帯域別レベル差と呼ぶ。ここでは、指向性可変手段４−１のレベルから指向性可変手段４−３のレベルを引いた値を帯域別レベル差と定義する。
１７の信号判定手段においては、あらかじめ決められた基準に基づき、帯域別レベル差の値に応じて周波数帯域をグルーピングする。例えば、帯域別レベル差が正の値をとる周波数帯域と、負の値をとる周波数帯域を分けてグルーピングする。
【００２９】
９のレベル差算出手段では、指向性可変手段４の出力信号のうち、帯域別レベル差が正の値をとる周波数帯域を合わせた信号のレベルを、あらためて、指向性可変手段４−１のレベルとする。同様に、指向性可変手段４−３の出力信号のうち、帯域別レベル差が負の値をとる周波数帯域を合わせた信号のレベルを、あらためて、指向性可変手段４−３のレベルとする。そして、この新たに決められた指向性可変手段４−１のレベルと指向性可変手段４−３のレベルの差分ΔＬｎを算出する。
【００３０】
上記組み合わせ手段で組み合わされた各ペアに対し、上記レベル差算出手段で算出された各ΔＬｎとある角度θｎを対応付ける。例えば、回転角度がｎΔτ度の単一指向性と、主軸方向がそれに直交する単一指向性との組においては、レベル差ΔＬｎと、回転角度ｎΔτと直交する方向（ｎΔτ＋９０）度を対応付ける。
その他の手段については、図４の場合と同じである。
次に音源方向推定手段の構成例３について説明する。
【００３１】
図９に音源方向推定手段の構成例３のブロック図を示す。
指向性可変手段４において、死角方向がΔτ°毎に異なる指向性を合成する。例えば、単一指向性の死角がΔτ°毎に異なるように合成する。
音源方向推定手段７においては、まず１８のレベル算出手段で、上記指向性可変手段４それぞれの出力信号レベルを算出する。
１９のレベル順位付け手段においては、上記レベル算出手段で算出されたレベルを降順または昇順に並べ替える。
【００３２】
２０のレベル小方向検出手段においては、上記レベル順位付け手段において順位付けられたレベルのうち、小さいものから少なくとも一つ選び、そのレベルに対応する指向性可変手段の死角方向を音源方向として検出する。ここで、検出するレベル（及びそれに対応する死角方向）は複数でもよい。ある閾値を設定し、その閾値以下となるレベルを持つ死角方向を検出することで、音源が複数であっても、それらの方向を検出することができる。
【００３３】
なお、本発明の収音装置及び収音・音源分離装置をＣＰＵやメモリ等を有するコンピュータとアクセス主体となるユーザが利用するユーザ端末と記録媒体から構成することができる。
記録媒体はＣＤ−ＲＯＭ、磁気ディスク装置、半導体メモリ等のコンピュータ読み取り可能な記録媒体であり、ここに記録された収音方法あるいは収音・音源分離方法を実行させるプログラムはコンピュータに読み取られコンピュータの動作を制御しコンピュータ上に前述した実施の形態における各構成要素を実現する。
【００３４】
【発明の効果】
本発明は、３本のマイクロホンで収音された信号の線形和を算出することで入力手段が持つ指向特性を電気的に可変し、従来、同一ゾーンに含まれるため分離が不可能であった音源に対しても、マイクロホンを物理的に動かすことなくゾーンを切り替え、個々の音源を分離抽出することを可能とする。（課題１に対する効果）。
【００３５】
さらに、２本の両指向性マイクロホンと１本の全指向性マイクロホンを用いて音源方向を推定することで、マイクロホンの本数および規模を替えることなく、音源方向推定を可能とする（課題２に対する効果）。
さらに、指向性選択手段により、推定された音源方向に応じてゾーンを分割するのに適した指向性を選択することで、音源の位置に拠らず自律的にゾーン毎の音を分離抽出することができる。
【図面の簡単な説明】
【図１】音源方向推定の説明図。
【図２】本発明の収音・音源分離装置の概略構成図。
【図３】本発明の音源方向推定手段を備えた収音・音源分離装置のブロック図。
【図４】本発明の音源方向推定手段の構成例１のブロック図。
【図５】指向性可変手段の組と方向θｎの対応付けの例を説明する図。
【図６】音源方向θ１と対応付けられた指向特性可変手段の例を説明する図。
【図７】レベル差方向対応付け手段で対応付けられたレベル差を方向の昇順に並べなおした例を示す図。
【図８】本発明の音源方向推定手段の構成例２のブロック図。
【図９】本発明の音源方向推定手段の構成例３のブロック図。
【図１０】本発明の音源分離装置のブロック図。
【符号の説明】
１，２両指向性マイクロホン
３全指向性マイクロホン
４指向性可変手段
５音源分離装置
６指向性選択手段
７音源方向推定手段
８組み合せ決定手段
９レベル差算出手段
１０レベル差方向対応付け手段
１１レベル差変動幅検出手段
１２レベル差変動方向対応付け手段
１３レベル差変動ピーク検出手段
１４方向検出手段
１５帯域分割手段
１６帯域別差分算出手段
１７信号判定手段
１８レベル算出手段
１９レベル順位付け手段
２０レベル小方向検出手段
５１帯域分割部
５２チャネル間時間差／レベル差検出部
５３帯域別チャネル間時間差／レベル差検出部
５４音源判定信号選別部
５５音源信号判定部
５６音源信号合成部

[0001]
[Technical Field of the Invention]
The present invention relates to a sound pickup device, a sound pickup / sound source separation device, a sound pickup method, a sound pickup / sound source separation method, and a recording medium storing a program for executing these methods. When a plurality of sound sources are located at different positions in a space, three or more microphones are used to divide the space into a plurality of zones, and the sound from the sound source in each zone is picked up independently of the sound sources in the other zones.
[0002]
[Prior art]
One conventional zone-separated sound pickup technique exploits the following property of sound. It is known that a sound can be expressed as a sum of several frequency components. Accordingly, when a plurality of sounds are produced simultaneously, the sound source signal input to the microphone of each channel is divided into bands narrow enough that the frequency components from the individual sound sources do not overlap on the frequency axis. Based on the arrival time difference and arrival level difference of each frequency component between channels, it is determined which zone each frequency component comes from, and the components from the same zone are collected and synthesized, so that the sound of each zone is picked up individually. (Reference: Japanese Patent Application Laid-Open No. H10-313497 (Japanese Patent Application No. H09-252312), "Sound source separation method, apparatus and recording medium".)
[0003]
[Problems to be solved by the invention]
However, because this conventional technique uses the time difference and level difference with which each sound source signal reaches each channel, the zones are fixed by the directional characteristics of the microphones. Consequently, when a plurality of sound sources fall within the same zone, they cannot be separated and extracted individually without physically changing the orientation of the microphones. (Problem 1)
In addition, switching zones requires detecting the direction of the sound source, but it has been difficult to estimate the sound source direction with the above input configuration, that is, a configuration in which two bidirectional microphones and one omnidirectional microphone are arranged coaxially. A known way to estimate the sound source direction is to install two or more omnidirectional microphones apart from each other and detect the difference in arrival time from the sound source to each microphone, but this configuration has the drawback of increasing the number of microphones and the scale of the system. (Problem 2)
[0004]
[Means for Solving the Problems]
The present invention computes a linear sum of the signals picked up by three sound pickup means so that the directivity of the input means can be varied electrically. Even for sound sources that conventionally could not be separated because they fell within the same zone, the zones are switched without physically moving the microphones, and the individual sound sources are separated and extracted (solution to Problem 1).
Furthermore, the sound source direction is estimated using two bidirectional sound pickup means and one omnidirectional sound pickup means, which makes sound source direction estimation possible without changing the number or scale of the microphones (solution to Problem 2). In addition, a directivity suited to dividing the zones is selected according to the estimated sound source direction, so that the sound of each zone is separated and extracted autonomously regardless of the positions of the sound sources.
[0005]
Here, the basic concept of sound source direction estimation will be described.
The explanation uses the case in which sound sources 1 and 2 are arranged as shown in FIG. 1, the explanatory diagram of sound source direction estimation. In this case, when the level difference ΔL is measured between the output signals of the directivity changing means 4-1, which is unidirectional with its main axis at 90°, and the directivity changing means 4-3, which is unidirectional with its main axis at 270°, ΔL is approximately equal to the inherent sensitivity difference of a unidirectional pattern (20 dB to 30 dB), because both sound sources 1 and 2 lie between 0° and 180°. On the other hand, when the directivity is rotated and the level difference is measured between the directivity changing means 4-1, now unidirectional with its main axis at 180°, and the directivity changing means 4-3, now unidirectional with its main axis at 0°, ΔL is almost 0 dB. Observing ΔL therefore shows that one of the two sound sources (sound source 1) is in the range of 0° to 90° and the other (sound source 2) is in the range of 90° to 180°. Rotating the main axis direction little by little and observing ΔL for every direction thus reveals in which directions the sound sources are located. The present invention uses this method.
The specific method of each means will be described in the following embodiments of the invention.
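The level-difference observation described above can be reproduced numerically. The following is a minimal sketch, not the patented implementation: it assumes ideal cardioid patterns, a free field without reflections, and two incoherent sources of equal power. The source positions (45° and 135°) and all function names are hypothetical.

```python
import math

def cardioid_gain(axis_deg, source_deg):
    """Amplitude gain of an ideal cardioid whose main axis points at axis_deg."""
    return 0.5 * (1.0 + math.cos(math.radians(source_deg - axis_deg)))

def pair_level_difference_db(axis_deg, source_degs, floor=1e-12):
    """Level difference (dB) between a cardioid at axis_deg and the opposite one.

    Assumption: sources are incoherent and of equal power, so powers add.
    """
    front = sum(cardioid_gain(axis_deg, s) ** 2 for s in source_degs)
    back = sum(cardioid_gain(axis_deg + 180.0, s) ** 2 for s in source_degs)
    return 10.0 * math.log10(max(front, floor) / max(back, floor))

sources = [45.0, 135.0]  # hypothetical positions of sound sources 1 and 2
dl_90 = pair_level_difference_db(90.0, sources)    # pair with main axes at 90 and 270 degrees
dl_180 = pair_level_difference_db(180.0, sources)  # pair with main axes at 180 and 0 degrees
```

With both sources on the 90° side, `dl_90` is large and positive, while `dl_180` is close to 0 dB because one source lies on each side of the rotated pair. The idealized value is smaller than the 20 dB to 30 dB quoted in the text, which refers to the sensitivity difference of a real unidirectional pattern.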
[0006]
[Embodiments of the Invention]
FIG. 2 shows a schematic configuration of a sound pickup / sound source separation device according to an embodiment of the present invention.
The sound pickup means 1 is a bidirectional microphone 1, the sound pickup means 2 is a bidirectional microphone 2 arranged so that its main axis is orthogonal to that of the bidirectional microphone 1, and the sound pickup means 3 is an omnidirectional microphone 3. The sound pickup means 1 to 3 are arranged coaxially.
[0007]
Each directivity changing means 4 calculates and outputs a linear sum of the output signals of the three sound pickup means. The possible linear sums are given by Expression (1). Here, S1(t), S2(t), and S3(t) are the output signals of the sound pickup means 1, 2, and 3; Ai, Bj, and Ck are weighting coefficients; i, j, and k are indices identifying the respective sound pickup means; P is an index identifying one directivity pattern; M, N, and Q are arbitrary positive integers; and n is the number of directivity changing means.
[0008]
(Equation 1)
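The image of Expression (1) is not reproduced in this text. One form consistent with the symbol definitions in paragraph [0007] is sketched below; the output name D_P(t) and the index ranges are assumptions and are not taken from the original.

```latex
D_P(t) = A_i\,S_1(t) + B_j\,S_2(t) + C_k\,S_3(t), \qquad
i = 1,\dots,M,\quad j = 1,\dots,N,\quad k = 1,\dots,Q,\quad P = 1,\dots,n \tag{1}
```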

[0009]
Next, the directivity selecting means 6 will be described.
Using the sound source direction information φ_1, ..., φ_n, the directivity selecting means 6 selects a directivity (directivity changing means) that can pick up the sound from each of those directions at a high S/N.
One example is to select directivities whose blind spots (nulls) face the sound source directions. For example, when there are two sound sources, two directivities are selected, each with its blind spot facing one of the sound source directions. For the sake of explanation, the directivity whose blind spot faces sound source 1 is called directivity 1, and the directivity whose blind spot faces sound source 2 is called directivity 2. In the output signal of the directivity changing means having directivity 1, the sound of sound source 1 can be expected to be suppressed and the sound of sound source 2 to be picked up at a high S/N. Similarly, in the output signal of the directivity changing means having directivity 2, the sound of sound source 2 can be expected to be suppressed and the sound of sound source 1 to be picked up at a high S/N. Using these output signals as the input signals of a conventional sound source separation device then yields more accurate sound source separation.
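The blind-spot selection described above can be sketched as follows. This is an illustration under stated assumptions, not the patent's implementation: the unidirectional pattern is modeled as an ideal cardioid, the microphone responses are taken as cos θ, sin θ, and 1, and the function names and source directions are hypothetical.

```python
import math

def null_steered_weights(source_deg):
    """Weights (a, b, c) for S1, S2, S3 giving a cardioid whose null faces source_deg."""
    axis = math.radians(source_deg + 180.0)  # main axis points away from the source
    return 0.5 * math.cos(axis), 0.5 * math.sin(axis), 0.5

def response(weights, theta_deg):
    """Gain toward theta_deg, assuming E1 = cos(theta), E2 = sin(theta), omni = 1."""
    a, b, c = weights
    t = math.radians(theta_deg)
    return a * math.cos(t) + b * math.sin(t) + c

def select_directivities(source_degs):
    """Return one null-steered pattern per estimated source direction."""
    return [null_steered_weights(phi) for phi in source_degs]

phi_1, phi_2 = 30.0, 150.0  # hypothetical directions of sound sources 1 and 2
directivity_1, directivity_2 = select_directivities([phi_1, phi_2])
```

Here `response(directivity_1, phi_1)` is essentially zero while `response(directivity_1, phi_2)` stays well above zero, so the first output carries mainly sound source 2, and the reverse holds for `directivity_2`.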
[0010]
In addition, directivity selection in which the main axis of directivity is directed to the direction of the sound source is also exemplified.
Another example is to select the directivity so that the main axis points along the line segment connecting the directions of adjacent sound sources. For example, to divide a room into two zones, two unidirectional patterns can be formed with their main axes pointing in opposite directions. In this case, two directivity changing means are used to form the directivities shown in Expressions (2) and (3), and their outputs are supplied as the input signals of a conventional sound source separation device. Here the angle θ satisfies 0 < θ < 360°, the rotation angle τ of the main axis can be set from the sound source direction information φ_1, ..., φ_n, and zone-by-zone sound pickup can be achieved without physically rotating the microphones.
[0011]
(Equation 2)
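The image of Expressions (2) and (3) is not reproduced in this text. A pair of opposed unidirectional patterns rotated by τ, consistent with the surrounding description, could be written as below. The cardioid form and the names D_1 and D_2 are assumptions, not the original expressions.

```latex
D_1(\theta) = \tfrac{1}{2}\bigl(1 + \cos(\theta - \tau)\bigr) \tag{2}
```
```latex
D_2(\theta) = \tfrac{1}{2}\bigl(1 - \cos(\theta - \tau)\bigr)
            = \tfrac{1}{2}\bigl(1 + \cos(\theta - \tau - 180^\circ)\bigr) \tag{3}
```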

[0012]
A method for generating the rotation angle τ will be described below.
The outputs (voltage E) of the bidirectional microphones 1 and 2, which are arranged so that their main axis directions are orthogonal to each other, are given by Expressions (4) and (5), respectively.
E_1 = cos θ   (4)
E_2 = sin θ   (5)
Here, by applying gains a and b to the bidirectional microphones 1 and 2 and adding the two outputs, the output of a bidirectional microphone whose main axis direction can be rotated arbitrarily is synthesized. This principle is shown in Expressions (6) and (7).
[0013]
(Equation 3)
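The image of Expressions (6) and (7) is not reproduced in this text. The standard trigonometric identity that follows from Expressions (4) and (5) is given below as a likely reading; the assignment of the equation numbers is an assumption.

```latex
a E_1 + b E_2 = a\cos\theta + b\sin\theta = \sqrt{a^2 + b^2}\,\cos(\theta - \tau) \tag{6}
```
```latex
\tau = \arctan\!\left(\frac{b}{a}\right) \quad \text{(quadrant chosen from the signs of } a \text{ and } b\text{)} \tag{7}
```

Choosing a = cos τ and b = sin τ gives a unit-gain bidirectional pattern cos(θ − τ) with its main axis at τ.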

[0014]
By adding the outputs of the bidirectional microphone and the omnidirectional microphone by the above method, it is possible to synthesize the unidirectionality obtained by rotating the main axis direction.
As stated in the description of the directivity changing means, the directivities that the directivity selecting means 6 can select are all characteristics expressible as a linear sum of the three microphone outputs; specific examples include unidirectional and MS stereo directivities.
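The rotation of a unidirectional pattern by a linear sum of the three microphone signals can be sketched as follows. This is a minimal sketch under assumptions: the pattern is an ideal cardioid, the test signal is a free-field plane wave, and the function names and the 60° source direction are hypothetical.

```python
import math

def steer_unidirectional(s1, s2, s3, tau_deg):
    """Synthesize a cardioid with main axis tau_deg from the three microphone signals.

    s1, s2: samples from the two orthogonal bidirectional microphones
    s3:     samples from the omnidirectional microphone
    """
    a = math.cos(math.radians(tau_deg))  # gain applied to bidirectional microphone 1
    b = math.sin(math.radians(tau_deg))  # gain applied to bidirectional microphone 2
    return [0.5 * (z + a * x + b * y) for x, y, z in zip(s1, s2, s3)]

def simulate_plane_wave(samples, theta_deg):
    """Ideal microphone outputs for a source at theta_deg (E1 = cos, E2 = sin, omni = 1)."""
    t = math.radians(theta_deg)
    s1 = [math.cos(t) * v for v in samples]
    s2 = [math.sin(t) * v for v in samples]
    return s1, s2, list(samples)

tone = [math.sin(2 * math.pi * 0.05 * n) for n in range(64)]
s1, s2, s3 = simulate_plane_wave(tone, theta_deg=60.0)
toward = steer_unidirectional(s1, s2, s3, tau_deg=60.0)   # main axis on the source
away = steer_unidirectional(s1, s2, s3, tau_deg=240.0)    # blind spot on the source
```

In this idealized setting `toward` reproduces the source signal and `away` cancels it, without any physical movement of the microphones.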
[0015]
The output of the directivity changing unit 4 selected by the directivity selecting unit 6 is input to the sound source separation device 5 to separate each sound source signal.
FIG. 10 shows the configuration of the sound source separation device 5.
This sound source separation device has the same configuration as the sound source separation device described in Japanese Patent Application Laid-Open No. H10-313497. The difference is that, whereas the device in that publication inputs the output signals of the R and L channel microphones to the band division unit, the sound source separation device of the present invention inputs the channel 1 and channel 2 signals, that is, the two output signals of the directivity changing means 4 selected by the directivity selecting means 6 (the outputs of directivity changing means having two different directivities).
[0016]
The channel 1 and channel 2 signals are input to the band division unit 51 and the inter-channel time difference / level difference detection unit 52. The band division unit 51 divides them into a plurality of frequency band signals (ch1(f_1), ..., ch1(f_n), ch2(f_1), ..., ch2(f_n)), each containing the component of only one sound source signal, and these are input to the band-specific inter-channel time difference / level difference detection unit 53 and the sound source determination signal selection unit 54. In the sound source determination signal selection unit 54, the sound source signal determination unit 55 determines, based on the inter-channel time difference / level difference outputs of the detection units 52 and 53, which sound source each of the band-divided output channel signals (ch1(f_1), ..., ch1(f_n), ch2(f_1), ..., ch2(f_n)) came from. Based on this determination, at least one signal from the same sound source is selected from the band-divided output channel signals, and the selected component signals of the respective bands are synthesized and output by the sound source signal synthesis units 56A and 56B, whereby the plurality of sound source signals are separated.
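The band-by-band assignment can be illustrated with a deliberately simplified sketch. It uses only the inter-channel level difference (the time difference is ignored), a naive DFT as the band division, and a hard assignment of each bin to the louder channel. These choices and all names are assumptions for illustration; they do not reproduce units 51 to 56 of the referenced device.

```python
import cmath
import math

def dft(x):
    """Naive discrete Fourier transform (adequate for short illustrative signals)."""
    n = len(x)
    return [sum(x[t] * cmath.exp(-2j * math.pi * k * t / n) for t in range(n))
            for k in range(n)]

def idft(spec):
    """Inverse of dft(), returning the real part of each sample."""
    n = len(spec)
    return [sum(spec[k] * cmath.exp(2j * math.pi * k * t / n) for k in range(n)).real / n
            for t in range(n)]

def separate_by_level(ch1, ch2):
    """Assign each frequency bin to the channel in which its magnitude is larger."""
    spec1, spec2 = dft(ch1), dft(ch2)
    out1 = [a if abs(a) >= abs(b) else 0j for a, b in zip(spec1, spec2)]
    out2 = [b if abs(b) > abs(a) else 0j for a, b in zip(spec1, spec2)]
    return idft(out1), idft(out2)

N = 32
src_a = [math.sin(2 * math.pi * 3 * t / N) for t in range(N)]  # source A occupies bin 3
src_b = [math.sin(2 * math.pi * 7 * t / N) for t in range(N)]  # source B occupies bin 7
ch1 = [a + 0.2 * b for a, b in zip(src_a, src_b)]  # directivity favouring source A
ch2 = [0.2 * a + b for a, b in zip(src_a, src_b)]  # directivity favouring source B
est_a, est_b = separate_by_level(ch1, ch2)
```

Because the two sources occupy different bins, `est_a` and `est_b` recover `src_a` and `src_b`. Real signals overlap in frequency and time, which is why the text requires bands fine enough to contain a single source.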
[0017]
In the above description, two output signals of the directivity changing means are input to the sound source separation device, but a plurality of sound source signals can be similarly separated by inputting two or more output signals.
Next, the sound source direction estimation will be described.
FIG. 3 shows a block diagram of a sound collection / sound source separation device provided with the sound source direction estimating means of the present invention.
[0018]
The sound pickup device consists of the bidirectional microphones 1 and 2, the omnidirectional microphone 3, the directivity changing means 4, the sound source direction estimating means 7, and the directivity selecting means 6; the sound pickup / sound source separation device consists of this sound pickup device and the sound source separation device 5.
A plurality of directivity changing means are provided, and the sound source direction estimating means 7 estimates the sound source direction from their output signals. The estimation method is detailed in the description of the sound source direction estimating means. The directivity selecting means 6 selects a directivity that can pick up, at a high S/N, the sound from the direction estimated by the sound source direction estimating means 7. The directivity selecting means 6 is the same as in the example of FIG. 1. With the configuration that includes the sound source direction estimating means, the direction of a sound source can be estimated even when it is unknown, and the sound of each zone can be picked up automatically.
[0019]
Next, a first configuration example of the sound source direction estimating means will be described.
FIG. 4 shows a block diagram of a configuration example 1 of the sound source direction estimating means.
The directivity changing means 4 forms, for example, (360/Δτ) unidirectional patterns whose main axis directions differ in steps of Δτ degrees, and supplies them to the sound source direction estimating means 7.
Next, the sound source direction estimating means 7 will be described.
First, the combination determining means 8, which receives the output signals of the directivity changing means 4, groups the synthesized directivities into pairs (pairs of directivity changing means associated with two different directivities). Each pair is denoted Pn. For example, unidirectional patterns whose main axes point in opposite directions are paired.
[0020]
The level difference calculating means 9 calculates the output level difference of each pair formed by the combination determining means 8 (the difference between the outputs of the paired directivity changing means). This level difference is denoted ΔLn.
Next, the level difference direction associating means 10 associates each pair (Pn) formed by the combination determining means with the ΔLn calculated by the level difference calculating means and with a certain angle θn. This association is written (Pn, ΔLn, θn). For example, as shown in FIG. 5, for the pair consisting of the directivity changing means 4-1, which is unidirectional with a rotation angle of nΔτ degrees, and the directivity changing means 4-3, which is unidirectional with its main axis in the opposite direction, the level difference ΔLn is associated with the direction (nΔτ + 90) degrees, which is orthogonal to the rotation angle nΔτ degrees.
[0021]
The level difference fluctuation width detecting means 11 sorts the ΔLn in descending or ascending order of the directions associated by the level difference direction associating means 10, and calculates the difference (ΔLn,n+1) between each ΔLn and the adjacent ΔLn+1. This difference is called the level difference fluctuation width.
The case where the sound source is in the direction shown in FIG. 6 will be described as an example.
Here, as an example, the angle θn is associated with the level difference ΔLn and with a pair of directivity changing means whose main axes are orthogonal to θn and opposite to each other. FIG. 6 illustrates representative directions θn sorted in ascending order, together with the directivity changing means 4-1 and 4-3 associated with θ1. ΔLn is defined as the level of the directivity changing means 4-1 minus the level of the directivity changing means 4-3. FIG. 7 graphs the relationship between ΔLn and θn in this case. The value of ΔLn changes sharply across the sound source direction. That is, the level difference fluctuation width is largest at ΔLn,n+1.
[0022]
A large change in the value of the level difference fluctuation width Δn means, in other words, a significant change of sign in the level difference that occurs when the directivity direction crosses the sound source. In FIG. 7, the level difference ΔLn is the output of the directivity changing means 4-1 minus the output of the directivity changing means 4-3. In this case, from θ1 to θn the sound source lies on the side where the output of the directivity changing means 4-1 is larger than that of the directivity changing means 4-3, so ΔLn is positive. Beyond θn, however, the sound source lies on the side where the output of the directivity changing means 4-3 is larger than that of the directivity changing means 4-1, so ΔLn becomes negative.
[0023]
The level difference fluctuation direction associating means 12 associates, for example, θn with ΔLn,n+1. As another example of the association, (θn + θn+1) / 2 may be used instead.
The level difference fluctuation peak detecting means 13 detects the (ΔLn,n+1, θn) at which the level difference fluctuation width is largest. The number of detected peaks need not be one; several may be taken in descending order of fluctuation. For example, detecting every fluctuation width that exceeds a threshold k allows the directions of a plurality of sound sources to be detected even when they sound simultaneously.
[0024]
The direction detecting means 14 detects the direction θn corresponding to the ΔLn,n+1 detected by the level difference fluctuation peak detecting means. As noted above, a plurality of θn may be detected here as well. The sound source direction information detected in this way is sent to the directivity selecting means.
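The processing chain of means 11 to 14 (sort by direction, take adjacent differences, pick the peak) can be sketched as follows. This is a sketch under assumptions: the fluctuation width is taken as an absolute difference, the sequence is not wrapped around from the last direction to the first, and the input values and the function name are made up for illustration.

```python
def estimate_directions(level_diffs, threshold_db=None, use_midpoint=False):
    """Estimate source directions from (theta_deg, delta_l_db) tuples, one per pair Pn."""
    ordered = sorted(level_diffs)  # ascending order of the associated direction theta_n
    candidates = []
    for (theta_a, dl_a), (theta_b, dl_b) in zip(ordered, ordered[1:]):
        width = abs(dl_b - dl_a)  # level difference fluctuation width (assumed absolute)
        direction = 0.5 * (theta_a + theta_b) if use_midpoint else theta_a
        candidates.append((width, direction))
    if not candidates:
        return []
    if threshold_db is None:
        return [max(candidates)[1]]  # direction of the single largest fluctuation
    return [d for w, d in candidates if w > threshold_db]  # every peak above threshold k

# Hypothetical measurements: the level difference flips sign between 60 and 90 degrees.
measured = [(0, 12.0), (30, 11.0), (60, 12.0), (90, -11.0), (120, -12.0), (150, -11.0)]
single = estimate_directions(measured)
midpoint = estimate_directions(measured, use_midpoint=True)
above_k = estimate_directions(measured, threshold_db=20.0)
```

With these values `single` is `[60]`, `midpoint` is `[75.0]`, and the threshold variant also returns `[60]` because only one fluctuation exceeds 20 dB.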
Next, configuration example 2 of the sound source direction estimating means will be described.
FIG. 8 is a block diagram showing a configuration example 2 of the sound source direction estimating means.
[0025]
This configuration estimates the sound source direction more accurately than the sound source direction estimating means of FIG. 4. The means of FIG. 4 calculates the level difference produced by the directivities synthesized by the directivity changing means. However, in an environment where room reflections cannot be ignored, the electrically synthesized directivity becomes blunted. For example, even a unidirectional pattern that yields a level difference of 30 dB or more between the blind spot and the main axis direction in an anechoic room yields only about 10 dB to 15 dB in an ordinary room with reflections. As a result, errors become more likely in the level difference fluctuation peak detecting means.
[0026]
Therefore, the following processing is performed in order to generate a larger level difference depending on the direction and to increase the estimation accuracy in the level difference fluctuation peak detecting means.
Here, the directivity changing means and the combination determining means are the same as in FIG. 4. That is, the directivity changing means forms 360/Δτ unidirectional patterns whose main axis directions differ in steps of Δτ degrees and supplies them to the sound source direction estimating means. The combination determining means that receives these signals groups the synthesized directivities into pairs. Here, directivities whose main axes point in opposite directions are paired.
[0027]
Next, the band dividing means 15 divides each of the output signals from the directivity changing means paired by the combination determining means 8 into a plurality of frequency bands. The division is made fine enough that the components contained in one frequency band can be approximated as consisting only of frequencies from a single sound source. The Fourier transform, for example, is used for this band division. The band-divided signals are supplied to the band-specific difference calculating means 16.
[0028]
The band-specific difference calculating means 16 calculates, in each frequency band, the level difference between the output signals of the directivity changing means paired by the combination determining means. This is called the band-specific level difference. Here, the band-specific level difference is defined as the level of the directivity changing means 4-1 minus the level of the directivity changing means 4-3.
The signal determining means 17 groups the frequency bands according to the value of the band-specific level difference, based on a predetermined criterion. For example, frequency bands in which the band-specific level difference is positive and those in which it is negative are placed in separate groups.
[0029]
In the level difference calculating means 9, the level of the signal obtained by combining those frequency bands of the output signal of the directivity changing means 4-1 in which the band-specific level difference is positive is taken anew as the level of the directivity changing means 4-1. Similarly, the level of the signal obtained by combining those frequency bands of the output signal of the directivity changing means 4-3 in which the band-specific level difference is negative is taken anew as the level of the directivity changing means 4-3. The difference ΔLn between these newly determined levels of the directivity changing means 4-1 and 4-3 is then calculated.
[0030]
Each pair formed by the combination determining means is associated with the ΔLn calculated by the level difference calculating means and with a certain angle θn. For example, for a pair consisting of a unidirectional pattern with a rotation angle of nΔτ degrees and a unidirectional pattern whose main axis direction is orthogonal to it, the level difference ΔLn is associated with the direction (nΔτ + 90) degrees, which is orthogonal to the rotation angle nΔτ.
The other means are the same as in FIG. 4.
Next, a configuration example 3 of the sound source direction estimating means will be described.
[0031]
FIG. 9 shows a block diagram of Configuration Example 3 of the sound source direction estimating means.
The directivity varying means 4 synthesizes directivities whose blind spot directions differ in steps of Δτ degrees. For example, unidirectional patterns are synthesized so that their blind spots differ by Δτ degrees from one to the next.
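One way to picture this synthesis is as a weighted sum of the three microphone signals. The sketch below assumes ideal first-order microphones, with the two bidirectional microphones oriented along 0 and 90 degrees and a plane-wave source; these orientations, the weights, and the function names are illustrative assumptions rather than values given in the specification.

```python
import numpy as np

def simulate_mics(source, source_deg):
    """Ideal omni and two orthogonal bidirectional pickups of one source."""
    theta = np.deg2rad(source_deg)
    omni = source
    bi_x = np.cos(theta) * source   # bidirectional, main axis at 0 degrees
    bi_y = np.sin(theta) * source   # bidirectional, main axis at 90 degrees
    return omni, bi_x, bi_y

def null_steered_output(omni, bi_x, bi_y, null_deg):
    """Unidirectional (cardioid) pattern with its blind spot at null_deg.

    Gain toward direction theta is 0.5 * (1 - cos(theta - null_deg)).
    """
    phi = np.deg2rad(null_deg)
    return 0.5 * (omni - np.cos(phi) * bi_x - np.sin(phi) * bi_y)

source = np.sin(2 * np.pi * 5 * np.linspace(0.0, 1.0, 1000))
mics = simulate_mics(source, source_deg=60)
at_null = null_steered_output(*mics, null_deg=60)     # source in the blind spot
on_axis = null_steered_output(*mics, null_deg=240)    # source on the main axis
```

Under these assumptions the source vanishes from the output when the blind spot points at it and passes with unit gain when the blind spot points the opposite way, so stepping `null_deg` in increments of Δτ rotates the pattern without moving any microphone.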
In the sound source direction estimating means 7, the level calculating means 18 first calculates the output signal level of each directivity varying means 4.
The level ranking means 19 rearranges the levels calculated by the level calculating means in descending or ascending order.
[0032]
The small-level direction detecting means 20 selects at least one of the levels ranked by the level ranking means, starting from the smallest, and detects the blind spot direction of the directivity varying means corresponding to that level as the sound source direction. A plurality of levels (and the corresponding blind spot directions) may be detected here. By setting a threshold and detecting the blind spot directions whose level is at or below it, the directions of the sound sources can be detected even when there are several of them.
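The level calculation, ranking, and small-level detection of Configuration Example 3 can be sketched as follows. The function name, the mean-square dB level, the optional threshold argument, and the simulated cardioid outputs are assumptions made for illustration.

```python
import numpy as np

def detect_source_directions(outputs, null_dirs_deg, threshold_db=None, eps=1e-12):
    """Return blind spot directions whose output level is smallest.

    Without a threshold, only the single smallest level is selected;
    with one, every direction at or below the threshold is returned.
    """
    levels_db = np.array([10.0 * np.log10(np.mean(np.square(y)) + eps)
                          for y in outputs])
    order = np.argsort(levels_db)            # ascending: smallest level first
    if threshold_db is None:
        chosen = order[:1]
    else:
        chosen = [i for i in order if levels_db[i] <= threshold_db]
    return [null_dirs_deg[i] for i in chosen]

# Hypothetical scan: one noise source at 60 degrees, blind spots every 30 degrees,
# ideal cardioid gain 0.5 * (1 - cos(source direction - blind spot direction)).
rng = np.random.default_rng(0)
source = rng.standard_normal(4000)
null_dirs = list(range(0, 360, 30))
outputs = [0.5 * (1.0 - np.cos(np.deg2rad(60 - phi))) * source for phi in null_dirs]

single = detect_source_directions(outputs, null_dirs)              # -> [60]
several = detect_source_directions(outputs, null_dirs, threshold_db=-15.0)
```

With the threshold set, neighbouring blind spot directions may also be returned, so the choice of threshold and of Δτ together determines the angular resolution.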
[0033]
Note that the sound collection device and the sound collection / sound source separation device of the present invention can be configured by a computer having a CPU, a memory, and the like, a user terminal used by a user who is an access subject, and a recording medium.
The recording medium is a computer-readable recording medium such as a CD-ROM, a magnetic disk device, or a semiconductor memory. The program for executing the sound collection method or the sound collection / sound source separation method recorded on it is read by the computer and controls the computer's operation so as to realize each component of the above-described embodiment on the computer.
[0034]
【Effect of the Invention】
According to the present invention, the directivity characteristic of the input means is varied electrically by calculating the linear sum of the signals collected by the three microphones. As a result, even for sound sources that conventionally could not be separated because they fell within the same zone, the zone can be switched without physically moving the microphones, and the individual sound sources can be separated and extracted (effect on Problem 1).
[0035]
Furthermore, by estimating the sound source direction using two bidirectional microphones and one omnidirectional microphone, the sound source direction can be estimated without changing the number or scale of the microphones (effect on Problem 2).
Furthermore, because the directivity selecting means selects a directivity suited to dividing the zones according to the estimated sound source direction, the sound of each zone can be separated and extracted autonomously regardless of the position of the sound source.
[Brief description of the drawings]
FIG. 1 is an explanatory diagram of sound source direction estimation.
FIG. 2 is a schematic configuration diagram of a sound collection / sound source separation device of the present invention.
FIG. 3 is a block diagram of a sound collection / sound source separation device provided with a sound source direction estimating means of the present invention.
FIG. 4 is a block diagram of a configuration example 1 of a sound source direction estimating unit of the present invention.
FIG. 5 is a view for explaining an example of correspondence between a set of directivity changing means and a direction θn.
FIG. 6 is a diagram illustrating an example of a directional characteristic varying unit associated with a sound source direction θ1.
FIG. 7 is a diagram showing an example in which the level differences associated by the level difference direction association means are rearranged in ascending order of direction.
FIG. 8 is a block diagram of a configuration example 2 of a sound source direction estimating unit of the present invention.
FIG. 9 is a block diagram of a configuration example 3 of a sound source direction estimating unit of the present invention.
FIG. 10 is a block diagram of a sound source separation device according to the present invention.
[Explanation of symbols]
1, 2 bidirectional microphone
3 omnidirectional microphone
4 directivity varying means
5 sound source separation device
6 directivity selecting means
7 sound source direction estimating means
8 combination determining means
9 level difference calculating means
10 level difference direction associating means
11 level difference fluctuation width detecting means
12 level difference fluctuation direction associating means
13 level difference fluctuation peak detecting means
14 direction detecting means
15 band dividing means
16 band difference calculating means
17 signal determining means
18 level calculating means
19 level ranking means
20 level small direction detecting means
51 band dividing section
52 inter-channel time difference / level difference detecting section
53 inter-channel time difference / level difference detecting section
54 sound source judgment signal selecting section
55 sound source signal judging section
56 sound source signal synthesizing section

Claims

A plurality of directional sound collecting means having directions different from each other, and at least one omni-directional sound collecting means;
Directivity changing means for receiving a sound pickup signal from each sound pickup means and generating a signal weighted and added by a weighted addition coefficient in which a combination of different directivity directions is set,
Sound source direction estimating means for estimating the sound source direction based on the signal weighted and added by the generated weighted addition coefficient,
A sound pickup device comprising directivity selection means for selecting a signal weighted by a weighting addition coefficient based on an estimated sound source direction.

The sound pickup device according to claim 1,
The sound source direction estimating means comprises:
Combination determining means for classifying the plurality of directivity varying means into combinations of two that differ from one another; level difference calculating means for comparing the levels of the signals weighted by the weighted addition coefficients from the two directivity varying means paired by the combination determining means; level difference direction associating means for associating three items with one another, namely the level difference calculated by the level difference calculating means, the pair formed by the combination determining means, and a direction; level difference variation width detecting means for calculating the difference between a level difference associated with a direction by the level difference direction associating means and the level difference corresponding to an adjacent direction; level difference variation direction associating means for associating the difference between level differences calculated by the level difference variation width detecting means with a direction; level difference variation peak detecting means for selecting at least one of the differences calculated by the level difference variation width detecting means in order from the maximum value; and direction detecting means for detecting, as a sound source direction, the direction associated by the level difference variation direction associating means with the at least one difference selected by the level difference variation peak detecting means; whereby the sound pickup device estimates the sound source direction.

The sound pickup device according to claim 2,
The sound source direction estimating means comprises:
Band dividing means for dividing each signal weighted by the weighted addition coefficients from the directivity varying means paired by the combination determining means into frequency bands; band-by-band difference calculating means for calculating, for the band-divided signals, the level difference in each band between the two signals weighted by the weighted addition coefficients that were paired by the combination determining means; and signal determining means for classifying the band components based on the level difference of each band,
the sound pickup device being characterized in that the level difference calculating means calculates the total band level of each class of band components classified by the signal determining means, calculates the difference between the calculated total band levels, and takes that difference as the level difference.

The sound pickup device according to claim 1,
The directivity varying means synthesizes directivities whose blind spot direction or main axis direction differs in steps of Δτ degrees (τ is an arbitrary angle),
The sound source direction estimating means comprises:
Level calculating means for calculating the level of the signal weighted by the weighted addition coefficients from each directivity varying means; level ranking means for rearranging the levels calculated by the level calculating means in descending or ascending order; and small-level direction detecting means for selecting at least one of the levels ranked by the level ranking means, starting from the smallest, and determining the blind spot direction or the directivity main axis direction of the directivity varying means corresponding to that level as the sound source direction; whereby the sound pickup device estimates the sound source direction.

Comprising the sound pickup device according to any one of claims 1 to 4, and receiving as input the signals weighted by at least two weighted addition coefficients from the directivity varying means selected by the directivity selecting means of the sound pickup device,
A band division unit that divides a signal weighted by each weighted addition coefficient into a plurality of frequency bands to such an extent that only a sound signal component of one sound source is included;
A band-based inter-channel parameter difference detecting means for detecting a difference between parameter values of an acoustic signal that changes due to a position of a sound source for each of the same bands of each of the divided signals as a band-based inter-channel parameter value difference,
Sound source signal determination means for determining which of the band-divided signals of the band is a signal input from which sound source, based on a band-based inter-channel parameter value difference of each band,
From each signal that is band-divided based on the output of the sound source signal determination unit, a sound source determination signal selection unit that selects at least one signal input from the same sound source,
A sound pickup / sound source separation device comprising: sound source signal synthesis means for synthesizing a plurality of band signals selected as signals from the same sound source by a sound source determination signal selection means as a sound source signal.

Generating a plurality of directional sound pickup signals in directions different in directivity from each other and at least one omnidirectional sound pickup signal;
Generate a signal weighted and added by a weighted addition coefficient in which a combination of directivity directions different from each other based on the generated collected sound signals is set,
Estimating the sound source direction based on the signal weighted by the weighting addition coefficient,
A sound pickup method, wherein a signal weighted by the weighted addition coefficient is selected based on the estimated sound source direction.

The sound pickup method according to claim 6,
The estimation of the sound source direction comprises:
Classifying the signals weighted by the plurality of weighted addition coefficients into combinations of two that differ from one another; comparing the levels of the two signals in each combination to calculate a level difference; associating three items with one another, namely the calculated level difference, the combined pair, and a direction; calculating the difference between the level difference associated with a direction and the level difference corresponding to an adjacent direction; associating the calculated difference between level differences with a direction; selecting at least one of the calculated differences between level differences in order from the maximum value; and estimating, as a sound source direction, the direction associated with the selected at least one difference.

The sound pickup method according to claim 7,
The estimation of the sound source direction comprises:
Dividing each of the signals weighted by the combined weighted addition coefficients into frequency bands; calculating, for the band-divided signals, the level difference in each band between the two signals weighted by the combined weighted addition coefficients; classifying the band components based on the level difference of each band; calculating the total band level of each class of the classified band components; and calculating the difference between the calculated total band levels and taking it as the level difference.

The sound pickup method according to claim 6,
The combinations of mutually different directivity directions are set by synthesizing directivities whose blind spot direction or main axis direction differs in steps of Δτ degrees (τ is an arbitrary angle),
The estimation of the sound source direction comprises calculating the level of the signal weighted by each weighted addition coefficient; rearranging the calculated levels in descending or ascending order; selecting at least one of the ranked levels, starting from the smallest; and determining the blind spot direction or the directivity main axis direction of the variable directivity corresponding to the selected level as the sound source direction, thereby estimating the sound source direction.

10. A signal weighted by at least two weighted addition coefficients selected by the sound collection method according to claim 6 is divided into a plurality of frequency bands to such an extent that each band contains only the acoustic signal component of one sound source,
For each of the same band of each of these divided signals, the difference between the values of the parameters of the acoustic signal that changes due to the position of the sound source is detected as the inter-channel parameter value difference for each band,
Based on the parameter value difference between channels for each band of each band, it is determined which of the signals divided into the band of the band is a signal input from which sound source,
From each signal divided based on this determination, select at least one signal input from the same sound source,
A sound collection / sound source separation method comprising combining a plurality of band signals selected as signals from the same sound source as a sound source signal.

Directivity varying processing for generating, based on sound pickup signals picked up by a plurality of directional sound collecting means having directivity directions different from each other and by at least one omnidirectional sound collecting means, signals weighted and added by weighted addition coefficients in which combinations of mutually different directivity directions are set,
Sound source direction estimation processing for estimating the sound source direction based on the signal weighted and added by the weighted addition coefficient generated by the directivity variable processing,
A machine-readable recording medium recording a sound collection program for causing a computer to execute a directivity selection process of selecting a signal weighted by a weighted addition coefficient based on an estimated sound source direction.

A machine-readable recording medium recording the sound collection program according to claim 11,
The sound source direction estimation processing comprises:
Combination determination processing of classifying the signals weighted by the plurality of weighted addition coefficients into combinations of two that differ from one another; level difference calculation processing of comparing the levels of the two signals paired by the combination determination processing; level difference direction association processing of associating three items with one another, namely the level difference calculated in the level difference calculation processing, the pair formed in the combination determination processing, and a sound source direction; level difference variation width detection processing of calculating the difference between a level difference associated with a direction in the level difference direction association processing and the level difference corresponding to an adjacent direction; level difference variation direction association processing of associating the difference between level differences calculated in the level difference variation width detection processing with a direction; level difference variation peak detection processing of selecting at least one of the differences calculated in the level difference variation width detection processing in order from the maximum value; and direction detection processing of estimating, as a sound source direction, the direction associated in the level difference variation direction association processing with the at least one difference selected in the level difference variation peak detection processing; the recording medium recording a sound collection program for causing a computer to execute each of these processes.

A machine-readable recording medium recording the sound collection program according to claim 12,
The sound source direction estimation processing comprises:
Band division processing of dividing each signal weighted by the weighted addition coefficients combined in the combination determination processing into frequency bands; band-by-band difference calculation processing of calculating, for the band-divided signals, the level difference in each band between the two signals weighted by the weighted addition coefficients combined in the combination determination processing; and signal determination processing of classifying the band components based on the level difference of each band,
The level difference calculation processing calculates the total band level of each class of band components classified by the signal determination processing, calculates the difference between the calculated total band levels, and takes that difference as the level difference; the recording medium recording a sound collection program for causing a computer to execute each of these processes.

A machine-readable recording medium recording the sound collection program according to claim 11,
The directivity varying processing includes processing of synthesizing directivities whose blind spot direction or main axis direction differs in steps of Δτ degrees (τ is an arbitrary angle),
The sound source direction estimation processing includes level calculation processing of calculating the level of the signal weighted by the weighted addition coefficients from each directivity varying processing; level ranking processing of rearranging the levels calculated in the level calculation processing in descending or ascending order; and small-level direction detection processing of selecting at least one of the levels ranked in the level ranking processing, starting from the smallest, and determining the blind spot direction or the directivity main axis direction of the variable directivity corresponding to that level as the sound source direction, thereby estimating the sound source direction; the recording medium recording a sound collection program for causing a computer to execute each of these processes.

On a machine-readable recording medium on which the sound collection program according to any one of claims 11 to 14 is recorded: band division processing of dividing the signals weighted by at least two weighted addition coefficients selected by the directivity selection processing into a plurality of frequency bands to such an extent that each band contains only the acoustic signal component of one sound source;
A band-based inter-channel parameter difference detection process that detects a difference in parameter values of an audio signal that changes due to the position of the sound source for each of the same bands of these divided signals as a band-based inter-channel parameter value difference,
A sound source signal determination process of determining which of the band-divided signals of the band is a signal input from which sound source, based on a parameter value difference between channels for each band of each band,
From each signal subjected to band division based on the output of the sound source signal determination process, a sound source determination signal selection process of selecting at least one signal input from the same sound source,
A machine-readable recording medium storing a sound collection / sound source separation program for causing a computer to execute a sound source synthesis process for synthesizing a plurality of band signals as a sound source signal, which is selected as a signal from the same sound source in the sound source determination signal selection process.