JP3949074B2

JP3949074B2 - Objective signal extraction method and apparatus, objective signal extraction program and recording medium thereof

Info

Publication number: JP3949074B2
Application number: JP2003094840A
Authority: JP
Inventors: 章子荒木; 宏澤田; 昭二牧野; 良向井
Original assignee: Nippon Telegraph and Telephone Corp
Current assignee: Nippon Telegraph and Telephone Corp
Priority date: 2003-03-31
Filing date: 2003-03-31
Publication date: 2007-07-25
Anticipated expiration: 2023-03-31
Also published as: JP2004302122A

Abstract

PROBLEM TO BE SOLVED: To eliminate the permutation problem that arises when a frequency-domain blind signal separation (BSS) method is used to extract a target signal from observation signals that are received by a plurality of sensors and in which signals from a plurality of directions are mixed.

SOLUTION: An approximate value $H_1(f)$ of the frequency response between the target signal source and the sensors is obtained from a given direction of the target signal source and is used to find an initial vector $t_{1(0)}$ that satisfies a constraint for extracting the target signal without distortion. The vector $t_{1(0)}$ is then updated by independent component analysis so that the non-Gaussianity of the output signal increases, and the norm of the updated vector, used as the separation vector, is changed so that the vector satisfies the constraint.

COPYRIGHT: (C)2005,JPO&NCIPI

Description

【０００１】
【発明の属する技術分野】
この発明は、複数方向からの信号が混合されて受信され、観測したい元の信号（目的信号）のみを直接観測することはできず、目的信号に他のノイズ（雑音）などが重畳されて観測されるという状況において、目的信号を推定する方法、その装置、目的信号抽出プログラム、その記録媒体に関し、例えばオーディオ分野において、音声認識装置の入力マイクロホンと話者とが離れているためそのマイクロホンが目的話者音声以外の音まで拾ってしまうような状況でも、目的音声を抽出することで認識率の高い音声認識系の構築を可能とするものである。
【０００２】
【従来の技術】
独立成分分析（ＩＣＡ）法
目的信号を分離抽出する手法の一つ目として独立成分分析（Independent Component Analysis：ＩＣＡ）による手法がある。これは、複数の線形混合された信号を、元の信号や混合過程についての知識を全く用いることなしに推定する手法であり、ブラインド音源分離（Blind Source Separation：ＢＳＳ）とよばれる。まず、ブラインド音源分離（ＢＳＳ）について説明する。
・実環境での混合信号（観測信号）モデル
ｓ_iを信号源１１_iの信号、ｈ_jiを信号源１１_iからセンサ１２_jまでのインパルス応答（周波数応答）、Ｐをインパルス応答の次数、信号源１１_iの数をＮ個（Ｎ＞１）、センサ１２_jの数をＭ（Ｍ≧Ｎ）個、ｎを離散的時刻とすると、センサ１２_jで観測される信号ｘ_jは
ｘ_j(ｎ）＝Σ_i=1 ^NΣ_p=1 ^Pｈ_ji(p)ｓ_i(ｎ−ｐ＋１）（ｊ＝１，…，Ｍ）（１）
と表現される。ここでＮ個の信号ｓ_iは統計的に互いに独立であると仮定する。観測信号ｘ_j(ｎ）は一定周期で標本化され、ディジタル信号系列とされている。
・分離信号のモデル
ブラインド音源分離では、式（１）の形で得られる観測信号と、長さがＱタップ、インパルス応答がｗ_ijのＮ×Ｍ個の分離フィルタ群１３_ijから成る分離系を用いて分離する。この分離フィルタ群１３_ijを用いて、分離して得られる信号ｙ_i(ｎ）は、
ｙ_i(ｎ）＝Σ_j=1 ^MΣ_q=1 ^Qｗ_ij(ｑ)ｘ_j(ｎ−ｑ＋１）（ｉ＝１，…，Ｎ）（２）
と表される。図６にＮ＝Ｍ＝２の場合について、信号源１１₁，１１₂とセンサ１２₁，１２₂間の混合過程と、センサ１２₁，１２₂の出力信号ｘ₁，ｘ₂から２×２個のフィルタ群１３_ijを用いるＩＣＡ法により分離信号ｙ₁(ｎ），ｙ₂(ｎ）を出力端子１４₁，１４₂に得る分離過程を示す。
【０００３】
分離フィルタ係数（周波数応答）ｗ_ijの推定には、独立成分分析（ＩＣＡ）と呼ばれる技術が広く用いられる。これは、信号同士の統計的独立性に基づいた技術であり、分離フィルタ係数ｗ_ijは出力信号ｙ_i(ｎ）が互いに独立となるよう逐次的学習により決定される。
混合過程が例えば実音場での集音などでは、信号にシステムのインパルス応答が畳み込まれて混合され、式（１）のように非常に複雑な信号が得られる。これを分離するためには、式（２）のような複雑な形で表される分離フィルタ係数ｗ_ijを推定する必要がある。これまでに提案されている手法では、このような複雑な分離フィルタ係数ｗ_ijの推定は推定精度が低く、推定にかかる時間的な費用（コスト）も大きいことが知られている。
このため、信号を周波数領域へ変換し、各周波数において分離行列を求める手法（周波数領域ＢＳＳ）が広く用いられている。
【０００４】
・周波数領域ＢＳＳ法
周波数領域ＢＳＳ法の機能構成を図７に示す。観測信号ｘ₁(ｎ），ｘ₂(ｎ）を周波数領域変換部２１で、例えば短時間離散フーリエ変換（ＤＦＴ）（窓関数を掛け例えば１／２フレームごとにずらしながら１フレームずつ離散フーリエ変換）して次式（３）で示すような関係の周波数領域の信号に変換する。
Ｘ（ｆ，ｍ）＝Ｈ（ｆ）Ｓ（ｆ，ｍ）（３）
ここでＳ（ｆ，ｍ）＝［Ｓ₁(ｆ，ｍ），Ｓ₂(ｆ，ｍ），…，Ｓ_N(ｆ，ｍ）］^T，Ｘ（ｆ，ｍ）＝［Ｘ₁(ｆ，ｍ），Ｘ₂(ｆ，ｍ），…，Ｘ_N(ｆ，ｍ）］^Tであり、[ ]^Tは転置を表わし、Ｈ（ｆ）はＨ_ji（ｆ）を要素とする混合行列であり、ｆは周波数、ｍは観測信号を短時間ごとのフレームに分割した際のフレーム番号である。この式（３）により、式（１）に示した複雑な混合を、各周波数成分での瞬時混合として表現でき、分離問題を簡単化できる。
分離行列推定部２２において、各出力信号の周波数領域の信号Ｙ_i(ｆ，ｍ）が互いに独立となるように、次式（４）を満す分離行列Ｗ（ｆ）を推定する。
Ｙ（ｆ，ｍ）＝Ｗ（ｆ）Ｘ（ｆ，ｍ）（４）
ここでＹ（ｆ，ｍ）＝［Ｙ₁(ｆ，ｍ），Ｙ₂(ｆ，ｍ），…，Ｙ_N(ｆ，ｍ）］^T，Ｗ（ｆ）は要素Ｗ_ij（ｆ）のＮ×Ｍの行列である。
このようにして各周波数成分においての分離が達成される。
時間領域変換部２４において周波数領域で出力される信号Ｙ_i(ｆ，ｍ）を例えば逆フーリエ変換により時間領域の信号に変換する。あるいは時間領域変換部２５で分離行列Ｗ（ｆ）の各要素Ｗ_ij（ｆ）に例えば逆フーリエ変換を施して時間領域表現の分離フィルタ係数ｗ_ij（ｑ）に変換し、分離フィルタ群２６でこの変換した伝達関数ｗ_ij（ｑ）を用いて観測信号ｘ_j(ｎ）に対し式（２）を計算することで、分離された出力信号ｙ_i(ｎ）を得る。こうして得られる分離信号の中から、何らかの手法を用いて目的信号を選ぶことで、目的信号が分離抽出される。
【０００５】
分離行列推定部２２では一般に、事前白色化（Pre-whitening）処理、ＩＣＡによる直交行列推定処理、事後白色化（Post-whitening）処理の３段階の処理が行われる。つまり図８に示すように事前白色化（Pre-whitening）部３１で白色化行列Ｖ（ｆ）を、直交行列推定部３２で直交行列Ｔ（ｆ）をそれぞれ推定し、その後、事後白色化（Post-whitening）部３３でこれらの推定された二つの行列を用い、分離行列Ｗ（ｆ）＝Ｔ ^H(ｆ）Ｖ（ｆ）を求める。
つまり事前白色化部３１では各周波数における観測信号Ｘ（ｆ，ｍ）を、白色化行列Ｖ（ｆ）を用いてＺ（ｆ，ｍ）＝Ｖ（ｆ）Ｘ（ｆ，ｍ）のように事前に白色化（Pre-whitening）する。ここでＶ（ｆ）は、Ｘ（ｆ）の共分散行列Ｒ_xx（ｆ）＝Ｅ［ＸＸ ^T］の固有値を対角要素に並べた行列Λ（ｆ）と、固有ベクトルを並べた行列Ｏ（ｆ）を用いてＶ（ｆ）＝Λ ^-1/2（ｆ）Ｏ（ｆ）で得られる。
【０００６】
直交行列推定部３２では白色化した観測信号Ｚ（ｆ，ｍ）を分離するための行列をＴ（ｆ）と書くと、分離信号Ｙ（ｆ，ｍ）は
Ｙ（ｆ，ｍ）＝Ｔ（ｆ）Ｚ（ｆ，ｍ）（５）
と表される。前段で白色化を行っているため、ここでは行列Ｔ（ｆ）を直交行列に限ることができる。すなわち、Ｔ（ｆ）のｋ行目をベクトルｔ _k(ｆ）と表すとき、ベクトルｔ _i（ｆ）とベクトルｔ _j（ｆ）が直交する性質を持つ行列に限ることができる。この分離のための直交行列Ｔ（ｆ）を求める際にＩＣＡを用いる（例えば非特許文献１および２参照）。
ここではＩＣＡの手法の一つである、出力信号の非ガウス性を高めることで個々の独立成分を取り出す手法を説明する。これは、その分布がガウシアンでは無い（非ガウスの）原信号が混合された信号は、中心極限定理によりガウシアンに近くなるという性質を利用し、ガウシアンに近い信号Ｚ（ｆ，ｍ）を、ベクトルｔ _kを用いてより非ガウス性の高い信号Ｙ_k(ｆ，ｍ）に変換することで原信号の周波数領域信号を抽出できる、という原理に基づいた手法である。
【０００７】
この手法では、出力信号Ｙ_k(ｆ）の分布が最もガウシアンから遠い分布となった際に最大値を取る目的関数Γ（ｆ）を最大化する直交行列Ｔ（ｆ）の成分ベクトルｔ _k(ｆ）を求め、独立成分Ｙ_k(ｆ，ｍ）を一つずつ取り出す。すなわちこの手法では分離のための直交行列Ｔ（ｆ）は一行ずつ求められる。尚、ｋ＞２の場合には、ｔ _k(ｆ）が以前に求めたものと同一にならぬよう、ｋより大きいｒ番目のベクトルｔ _rは必ずベクトルｔ _kと直交するｔ _k(ｆ）を求める。
このように取り出される独立成分Ｙ₁(ｆ，ｍ），…，Ｙ_k(ｆ，ｍ），…，Ｙ_N(ｆ，ｍ）は原信号の周波数領域信号Ｓ₁(ｆ，ｍ），…，Ｓ_i(ｆ，ｍ），…，Ｓ_N(ｆ，ｍ）のいずれかに対応するが、その大きさと順序には任意性がある。これは、ＩＣＡが、信号の独立成分を取り出すという規範にのみ基いてベクトルｔ _k(ｆ）を推定しているためであり、ベクトルｔ _k(ｆ）の長さや求まる順序については規定していないためである。
【０００８】
このベクトルの大きさの任意性を回避するためには、一般に、ベクトルｔ _k(ｆ）のノルムを１とする拘束条件を付加することが行われている。すなわち、従来のＩＣＡでは次式（６）で示すように‖ｔ _k(ｆ）‖＝１であるｔ _k(ｆ）中の目的関数Γ（ｆ）を最大とするものを求める。
arg maxｔ _k(ｆ）Γ（ｆ） subject to ‖ｔ _k(ｆ）‖＝１（６）
周波数領域ＢＳＳでは、目的関数Γ（ｆ）としてＥ｛Ｇ（｜ｔ _k ^H(ｆ）Ｚ（ｆ）｜²）｝が用いられる。ここでＧはある非線型関数であり、Ｇ（ｚ）＝log(ａ＋ｚ）やＧ（ｚ）＝√（ａ＋ｚ）（ａは定数）などがよく用いられる。
しかし、従来のＩＣＡでは、拘束条件を用いてベクトルの大きさの任意性については回避しているが、ベクトルｔ _k(ｆ）の求まる順序には任意性が残ったままである。この順序の任意性が、従来法による周波数領域ＢＳＳの問題点であり置換（パーミュテーション：Permutation）の問題と呼ばれている。
このPermutationの問題を、ここではＮ＝Ｍ＝２の場合について具体的に説明する。
【０００９】
図９において多数の黒の小さい点は白色化された信号Ｚ₁(ｆ，ｍ）を横軸に、Ｚ₂(ｆ，ｍ）を縦軸にプロットしたものであり、太い実線で示した円４１は、拘束条件‖ｔ _k(ｆ）‖＝１を表している。細い実線４２は目的関数Γ（ｆ）＝Ｅ｛Ｇ（｜ｔ _k ^H(ｆ）Ｚ（ｆ）｜²）｝の等高線を表しており、外側ほど値が大きくなる。
式（６）では、拘束条件の円４１の上でΓ（ｆ）を最大にするベクトルｔ _k を求めるものであるから、図９中の円４１の中心を通り互いに直交する軸Ａと軸Ｂ上の、基点を円４１の中心とする２つの白いベクトルα，βのうちのどちらかが解として求まる。すなわち、ｔ ₁(ｆ）＝α，ｔ ₂(ｆ）＝βという解も、ｔ ₂(ｆ）＝α，ｔ ₁(ｆ）＝βという解も求まり得る。これは、どちらの場合でも出力Ｙ₁(ｆ，ｍ）とＹ₂(ｆ，ｍ）の独立性を保つことができるからである。
このことを式で説明する。式（５）を、Ｎ＝Ｍ＝２の場合について書き下すと次式（７）となる。
【数１】

【００１０】
直交行列Ｔ（ｆ）の一行目から一つ目の出力Ｙ₁(ｆ，ｍ）が、Ｔ（ｆ）の二行目から二つ目の出力Ｙ₂(ｆ，ｍ）が得られ、この時Ｙ₁(ｆ，ｍ）とＹ₂(ｆ，ｍ）は独立である。しかし、直交行列Ｔ（ｆ）はその行が入れかわっても、出力Ｙ₁(ｆ，ｍ）とＹ₂(ｆ，ｍ）の独立性は保たれる。すなわち直交行列Ｔ（ｆ）の１行目と２行目を入れかえると、一つ目の出力にＹ₂(ｆ，ｍ）が、二つ目の出力にＹ₁(ｆ，ｍ）が得られるが、ここでもやはり二つの出力信号は独立である。即ち、ＩＣＡは出力信号同士を互いに独立にはするが、その出力順序は拘束しない。
これより、任意の二つの周波数ｆ₁とｆ₂を考えた時、例えば出力信号Ｙ₁(ｆ₁，ｍ）とＹ₁(ｆ₂，ｍ）とが、同じ信号ｓ_iに対する推定信号であるとは限らない。従って、周波数領域ＢＳＳでは、Ｙ_i(ｆ₁，ｍ）とＹ_i(ｆ₂，ｍ）が同じ信号源の信号ｓ_iの推定となるように、直交行列Ｔ（ｆ）の行を正しく並べ替える必要がある。これを置換（Permutation）の問題と呼ぶ。
【００１１】
このPermutationの問題を解決した後、その直交行列Ｔ（ｆ）と事前白色化部３１で用いた白色化行列Ｖ（ｆ）とを用いて事後白色化（Post-Whitening）部３３でＷ（ｆ）＝Ｔ ^H(ｆ）Ｖ（ｆ）を演算して分離行列Ｗ（ｆ）を求める。
なお、Permutationの問題を解決する方法としては、たとえば非特許文献３がある。
【００１２】
適応型ビームフォーマ法
目的信号を分離抽出する手法の二つ目としては、適応型ビームフォーマによる手法がある。この適応型ビームフォーマ法は図１０に示すように、センサアレイ５０で観測された入力信号を目的信号オフ時推定部５１に入力して、妨害信号のみが存在する時間区間を検出する。この検出した時間区間において入力信号をフィルタ群５２へ供給し、そのフィルタ群５２の出力信号の和を誤差信号ｅ（ｔ）とし、フィルタ制御部５３において誤差信号のパワーが最小となるようにフィルタ群５２のフィルタ係数（インパルス応答）ｗ_ijを更新する。次に求まったフィルタ係数ｗ_ijをフィルタ群５４にコピーし、このフィルタ群５４に入力信号を通すことで、妨害信号が抑圧され、目的信号が強調された出力信号ｙ（ｎ）が得られる。
ここでは、目的信号がｓ₁(ｎ）であるとして説明を行う。また、適応型ビームフォーマ法は周波数領域で用いられることが多いのでここでも周波数領域で説明を行う。
フィルタ係数更新時、分離行列Ｗ _1j（ｆ）が全て０となる意味の無い解（目的信号も出力されない）が得られることのないように、以下に述べるような拘束条件のもとで、誤差信号Ｅ（ｆ，ｍ）のパワーが最小となるよう、分離行列Ｗ _1j（ｆ）を推定する。ここでＷ _1j（ｆ）はフィルタ係数ｗ_i(ｋ）を、Ｅ（ｆ，ｍ）は誤差信号ｅ（ｔ）をそれぞれ例えば短時間フーリエ変換により周波数領域に変換したものである。
【００１３】
適応型ビームフォーマ法では、目的信号源からセンサｊまでの周波数応答Ｈ_j1（ｆ）が既知である必要がある。既知である周波数応答をＨ′_j1（ｆ）＝exp(ｊ２πｆτ_j1）とする。もしくは目的信号源の方位θを既知として、目的信号源からセンサｊまでの周波数応答Ｈ_j1（ｆ）を、信号のセンサ間遅延時間τ_j1だけを用いてＨ′_j1（ｆ）＝exp(ｊ２πｆτ_j1）と近似する。ここで図１１に示すようにτ_j1＝（ｄ_j／ｃ）sin θ₁であり、ｄ_jはセンサ１２_jの座標、ｃは音速、θ_１は音源１１_１の方向である。この近似は、目的信号源（スピーカ）１１₁からセンサ（マイクロホン）１２_jに到達する信号は直接音だけであるという近似となっている。
このように、Ｈ′_j1（ｆ）が既知の時、拘束条件として例えば次式（８）で与えられ、
Σ_j=1 ^MＨ′_j1Ｗ_1j（ｆ）＝Ｗ ₁(ｆ）Ｈ ₁(ｆ）＝１（８）
この式（８）の条件を満たしながら誤差信号Ｅ（ｆ，ｍ）のパワーを最小とする係数Ｗ′_1j（ｆ）を求める。ここで、Ｈ ₁(ｆ）＝［Ｈ′₁₁（ｆ），Ｈ′₂₁（ｆ）］^T，Ｗ ₁(ｆ）＝［Ｗ₁₁（ｆ），Ｗ₁₂（ｆ）］である。式（８）は、目的信号から出力までの周波数応答を全ての周波数で１にする、という拘束条件となっている。これは目的信号が歪み無く出力されるための条件である。
【００１４】
適応型ビームフォーマ法における拘束条件を与えるためには、上記のように目的信号源１１₁からセンサ１２_jまでの周波数応答Ｈ_j1（ｆ）が必要である。しかし、Ｈ_j1（ｆ）は信号源１１₁の移動や場の変化（温度変化、扉の開放などによる形状の変化など）などにより変動するため、観測時の周波数応答Ｈ_j1（ｆ）と、適応型ビームフォーマ駆動時の周波数応答Ｈ′_j1（ｆ）とが等しいことは少ない。また、目的信号源１１₁の方位θ₁を用いてＨ′_j1（ｆ）を近似する場合にも、目的信号源１１₁の方位の推定が誤っている場合や、実環境での録音などのように信号の直接音だけでなく反射音も存在する場合には、Ｈ′_j1（ｆ）の近似精度は低くなる。
このように、適応型ビームフォーマ法で用いられる拘束条件は、多くの場合、実際に使用する環境に合わないという意味で不正確なものであり、これが適応型ビームフォーマ法の問題点となっている。このような不正確なＨ′_j1（ｆ）を拘束条件とする場合、適応型ビームフォーマ法による妨害信号除去能力は著しく低下する。
【００１５】
図１２を用いてこれを説明する。この図においてグレー（灰色）で表される点は、白色化された信号Ｚ₁(ｆ，ｍ）を横軸に、Ｚ₂(ｆ，ｍ）を縦軸にプロットしたものである。また、目的信号に関する直交ベクトルｔ ₁および拘束条件の式（８）についても、Ｚ_i(ｆ）と同じ平面に表示することができ、図において、破線４４は適応型ビームフォーマ法により推定されたベクトルｔ ₁を、ｔ_１を横軸、ｔ_２を縦軸として表し、一点鎖線４５は拘束条件を表している。
正しい拘束条件を与えた場合、図１２（ａ）に示すように、まず拘束条件を示す線４５とプロットされたＺ_i(ｆ）の軸の一方（図では軸Ａ）とは平行であることが分かる。また、正しい拘束条件を与えた場合、適応型ビームフォーマ法によって推定された、目的信号に関する直交ベクトルｔ ₁と軸Ａは垂直に交わる。
両者が垂直である時、妨害信号が最も良く抑圧される（例えば非特許文献２参照）。
これに対し、目的信号方向を間違えて拘束条件を与えた場合は、図１２（ｂ）に示すように推定された直交ベクトルｔ ₁と軸Ａは垂直には交わらない。これは、妨害信号除去能力が低いことを示している。
【００１６】
【非特許文献１】
A.Hyvaerinen and J.Karhunen and E.Oja,“Independent Component
Analysis,”John Wiley & Sons,2001,ISBN 0-471-40540
【非特許文献２】
M.Knaak and D.Filbert,“Acoustical semi-blind source separation
for machine monitoring,”in 3rd. International Conference on Blind
Source Separation and Independent Component Analysis,2001,pp.361-366
【非特許文献３】
澤田宏，向井良，荒木章子，牧野昭二，“周波数領域ブラインド音源分離におけるpermutation問題の解法”，日本音響学会秋季研究発表会，
pp.541-542,2002年９月
【００１７】
【発明が解決しようとする課題】
従来の周波数領域でのＢＳＳは、分離問題を各周波数について解くため、各帯域での分離行列は、時間的コストも小さく分離精度も良く求まる。しかし、周波数領域ＢＳＳでは、直交行列Ｔ（ｆ）の大きさを直交行列Ｔ（ｆ）の各行ベクトルのノルムが１という拘束条件で規定するが、直交行列Ｔ（ｆ）の行の順番については拘束が無かった。このため、求めたＹ（ｆ，ｍ）について置換（Permutation）の問題を解く必要があった。
また、適応型ビームフォーマ法では、目的信号源からセンサまでの周波数応答や目的信号源の方向等が正しく入手できないので、誤った拘束条件のもとでフィルタ信号（逆混合行列）の最適化が行われ、妨害信号の除去能力が十分ではなかった。
【００１８】
この発明の目的は、ＩＣＡによる学習中にPermutationの問題が生じないアルゴリズムを提案し、Permutationを解く処理を必要なくすると同時に、与えられる拘束条件の信頼性が低い場合でも妨害信号を十分除去することができる目的信号抽出方法、その装置、目的信号抽出プログラム、その記録媒体を提供することにある。
【００１９】
【課題を解決するための手段】
この発明による装置の基本的な機能構成は図１に示すように、図７に示した従来の独立成分解析（ＩＣＡ）法による周波数領域でのブラインド信号分離（ＢＳＳ）の機能構成と同様であるが、この発明では分離行列推定部に特徴を持つ。事前知識保持部の目的信号源とセンサ間の周波数応答の事前知識Ｈ ₁を用いて目的信号を歪みなく抽出する拘束条件を満す分離ベクトルｔ ₁の初期値ｔ ₁₍₀₎を計算し、分離行列推定部においては、この初期値ｔ ₁₍₀₎を、ＩＣＡ法により出力信号の非ガウス性をより高めるように更新し、この更新したベクトルｔ ₁が前記拘束条件を満たすようにベクトルｔ ₁のノルムを更新する。必要に応じて上記２つの更新を繰り返し、ベクトルｔ ₁が十分収束するまで行う。ここで例えばＨ ₁(ｆ）＝［Ｈ′₁₁（ｆ），Ｈ′₂₁（ｆ）］であり、事前知識としては例えば適応型ビームフォーマ法で利用される程度の精度を持った目的信号方向に関するものであれば良い。
【００２０】
【発明の実施の形態】
図１にこの発明装置の機能構成例を示し、図２にこの発明の方法の処理手順の例を示す。以下では観測信号がｘ₁(ｎ），ｘ₂(ｎ）、分離信号がｙ₁(ｎ），ｙ₂(ｎ）であり、分離した目的信号としてｙ₁(ｎ）を抽出する場合を例とする。
センサからの観測信号を取り込んで記憶部（図１に示していない）に一時格納する（Ｓ１）。図７に示した従来の周波数領域ＢＳＳ法と同様にこれら観測信号を例えば短時間フーリエ変換により周波数領域信号行列Ｘ（ｆ，ｍ）に周波数領域変換部２１で変換する（Ｓ２）。この変換された信号行列Ｘ（ｆ，ｍ）を用いて分離行列推定部６１で推定した分離行列Ｗ（ｆ）を推定する（Ｓ３）。
この推定は図８に示した手法と同様に事前白色化部３１で白色化行列Ｖ（ｆ）を算出し（Ｓ３−１）、白色化行列Ｖ（ｆ）を用いて信号行列Ｘ（ｆ，ｍ）を白色化して白色化観測信号行列Ｚ（ｆ，ｍ）を求める（Ｓ３−２）。
目的信号方向に関する事前知識Ｈ ₁を用いた分離行列推定について詳しく説明を行う。
【００２１】
この発明では直交行列推定部６３に特色がありこの実施形態では、直交行列推定部６３において、目的関数Γ（ｆ）を最大化するベクトルｔ ₁を、式（１０）に示す拘束条件の下に求める。
arg maxｔ ₁Γ（ｆ）（９）
Ｗ ₁ ^H(f)Ｈ ₁(f)＝ｔ ₁ ^H(f)Ｖ(f)Ｈ ₁(f)＝１（10）
これは例えば以下のように実現される。
まず直交ベクトルｔ ₁(ｆ）の初期値ｔ ₁₍₀₎(ｆ）を与える（Ｓ３−３０）。この初期値ｔ ₁₍₀₎(ｆ）は式（１０）の拘束条件を満たす任意のベクトルを用いることができるが、従来技術の項で説明した適応型ビームフォーマ法で求めたベクトルを用いるとよい。つまり、まず事前知識保持部６２に保持されている事前知識としての事前周波数応答Ｈ ₁（ｆ）の読み出しを行う（Ｓ３−３１）。初期値計算部６３ａに事前周波数応答Ｈ ₁（ｆ）、信号行列Ｚ（ｆ，ｍ）、白色化行列Ｖ（ｆ）を入力し、式（８）を満たしながら図１０での誤差信号Ｅ（ｆ，ｍ）のパワーを最小にするＷ ₁（ｆ）を求める。
Ｗ ₁（ｆ）＝ｔ ₁（ｆ）Ｖ（ｆ）の関係よりｔ ₁（ｆ）を求め、これを初期値ベクトルｔ ₁₍₀₎（ｆ）とする。（Ｓ３−３２）。
このベクトルｔ ₁₍₀₎は、拘束条件が正しく与えられる場合には既に分離を達成する直交ベクトルｔ ₁（ｆ）となり、拘束条件の信頼性が低い場合には分離能力は低いが、その向きは正しいベクトル、つまり図１２に示した例では軸Ａに垂直なベクトルに近くなる。従って、このベクトルを初期値ｔ ₁₍₀₎（ｆ）に用いることで良好かつ高速な収束が得られる。なお事前周波数応答情報Ｈ ₁(ｆ）としては、従来の適応型ビームフォーマ法で説明したように目的信号源の方位（目的信号到来方向）θを既知としてＨ_j1（ｆ）＝exp(ｊ２πｆτ_j1），τ_j1＝（ｄ_j／ｃ）sin θ₁を計算したもの、あるいは予め測定したものでよい。
【００２２】
ＩＣＡ処理部６３ｂに信号行列Ｚ（ｆ，ｍ）、初期値ｔ ₁₍₀₎（ｆ）、白色化行列Ｖ（ｆ）を入力してＩＣＡ法を用いて出力信号の非ガウス性をより高めるようにベクトルｔ ₁を更新する（Ｓ３−３３）。これにより、式（１０）の拘束条件に依らず、出力信号の分離が最も良く行われるベクトルｔ ₁を推定することができる。
ノルム更新部６３ｃに更新されたベクトルｔ ₁(ｆ）、白色化行列Ｖ（ｆ）、事前情報Ｈ ₁(ｆ）を入力して、更新したベクトルｔ ₁が拘束条件式（１０）を満たすように、ベクトルの長さ（ノルム）を変更する（Ｓ３−３４）。これは、ＩＣＡ処理部６３ｂで推定されたベクトルｔ ₁の方向は変えず、長さだけを変えて、ベクトルｔ ₁が式（１０）の拘束条件を満たすように変更する操作を行えばよい。式（１０）は、目的信号から出力信号までの間の周波数応答、つまり目的信号源からこの目的信号抽出装置の出力端までの周波数応答が全ての周波数で１であるという条件であり、目的信号が歪み無く出力されるための条件である。よって、式（１０）の拘束条件を満たすベクトルｔ ₁により分離された全ての周波数成分は全て同一の目的信号の成分である。言いかえると、式（７）の直交行列Ｔ（ｆ）の一行目は全ての周波数で目的信号に対応する出力を生成することになり、Permutationの問題が生じない。
【００２３】
ノルム変更が行われた後、収束判定部６３ｄでそのベクトルｔ ₁(ｆ）の収束状態の判定を行う（Ｓ３−３５）。十分に収束している場合、目的信号を分離抽出する為に必要なベクトルｔ ₁の収束結果を出力する。まだ収束していない場合、そのベクトルｔ ₁(ｆ）をスイッチ部６３ｅを通じてＩＣＡ処理部６３ｂに再び入力して、つまりステップＳ３−３３に戻り、ステップＳ３−３３〜Ｓ３−３５を繰り返す。
収束した直交ベクトルｔ ₁(ｆ）と白色化行列Ｖ（ｆ）を事後白色化部３３に入力して、事後白色化した分離ベクトルｗ ₁(ｆ）を計算する（Ｓ３−４）。
目的信号が複数の場合は同様にして各目的信号と対応する分離ベクトルｗ _iを求め、つまり分離行列Ｗ（ｆ）を求める。この分離行列Ｗ（ｆ）と信号行列Ｘ（ｆ，ｍ）を分離演算部２７に入力して式（４）を演算して分離された目的信号行列Ｙ（ｆ，ｍ）を演算し（Ｓ４）、この演算結果を時間領域変換部２４で例えば逆フーリエ変換により時間領域信号に変換して、各分離された目的信号ｙ₁(ｎ），ｙ₂(ｎ）を求める（Ｓ５）。
【００２４】
あるいは事後白色化して得られた分離行列Ｗ（ｆ）を時間領域変換部２５で例えば逆フーリエ変換によりフィルタ係数群ｗ_ijに変換し（Ｓ６）、分離フィルタ群２６で観測信号ｘ_j(ｎ)に対し、対応するフィルタ係数を畳み込んで分離された目的信号ｙ₁(ｎ)，ｙ₂(ｎ)を得るようにしてもよい（Ｓ７）。
この実施形態によれば、この発明の課題を解決できる仕組を以下に説明する。
上述した処理によりこの発明の課題が解決される仕組について図４を用いて説明する。グレーで表される点は、白色化された信号Ｚ₁(ｆ，ｍ)を横軸に、Ｚ₂(ｆ，ｍ)を縦軸にプロットしたものであり、一点鎖線４６は拘束条件を表し破線４７は適応型ビームフォーマ法により推定された分離ベクトルを、この実施形態の初期値ｔ ₁₍₀₎としたものを表わし、実線４８はこの実施形態により求まった直交ベクトルｔ ₁を表わし、図１２（ａ）に示した場合と同様に軸Ａと実線ベクトルｔ ₁とが垂直に交わる時、妨害信号が最も抑圧される。
【００２５】
（１）従来のＩＣＡによるＢＳＳでは直交行列Ｔ（ｆ）のノルムを１とする拘束条件（図４中の円４１）の下に最大化問題を解くので、図４中に示す互いに直角でその一方が軸Ａと垂直な２本のベクトルａ及びｂのうち、どちらがベクトルｔ ₁として求まるかは不定である。この不確定性がPermutationの問題であった。
この実施形態では拘束条件として式（１０）を用いるが、これは目的信号から出力信号までの間の周波数応答が全ての周波数で１であるという条件、すなわち目的信号が歪み無く出力されるための条件である。よって、拘束条件を満たすベクトルｔ ₁は、全ての周波数において目的信号を生成することを可能とする。
言いかえると、式（７）の直交行列Ｔ（ｆ）の一行目が全ての周波数で目的信号に対応することになり、Permutationの問題が生じない。
【００２６】
ベクトルｔ ₁に対する上記繰り返し処理の各回において、ｔ ₁の長さ（ノルム）はベクトルが式（１０）の拘束条件を満たすよう決定されるが、拘束条件が実際と多少ずれている場合でも図４に示した例のように拘束条件は線Ｂよりも軸Ａと平行に近くなるので、ほとんどの場合において軸Ａに垂直なベクトルが最終的な直交ベクトルｔ ₁として求まる。すなわち発明方法により、拘束条件が実際と多少ずれている場合でも、Permutationの問題は生じない。
また、初期値ｔ ₁₍₀₎に適応型ビームフォーマ法により求めたものを用いる場合は、拘束条件が実際と多少ずれていても軸Ａに垂直に近いベクトルから学習を始めることができることもPermutationの問題を解決することに寄与している。
【００２７】
（２）適応型ビームフォーマ法では、目的信号方向を間違えて拘束条件を与えた場合には妨害信号除去能力が低くなる。この時、図１２（ｂ）に示したように、推定された直交ベクトルｔ ₁と軸Ａは垂直には交わらなかった。
ＩＣＡ処理部６３ｂで図２中のステップＳ３−３３におけるＩＣＡ法によるベクトルｔ ₁の更新では、ベクトルｔ ₁は図９に示したベクトルαかβのように軸Ａ又は軸Ｂに垂直な方向へ近づくよう更新される。ここでは、Permutationの問題が解決されているのでベクトルｔ ₁は軸Ａと垂直な方向へ収束する。
この発明では、更新の各回においてＩＣＡ法でｔ ₁を軸Ａと垂直な方向へ近づけた後で、拘束条件を満たすためにｔ ₁の長さを変える操作を行うので、拘束条件の正確さに依らずにベクトルｔ ₁は軸Ａと垂直な方向へ近づいていく。
その結果、拘束条件の信頼性が低い場合でも、軸Ａと垂直な方向のベクトルが最終的な分離ベクトルｔ ₁として求まることになる。
【００２８】
実施例
ここでは、目的関数Γ（ｆ）＝Ｅ｛Ｇ（｜ｔ₁ ^HＺ｜²）｝の場合についての、この発明の実施例について述べる。ここでＧはある非線型関数であり、Ｇ（ｚ）＝log(ａ＋ｚ)やＧ（ｚ）＝√（ａ＋ｚ）（ａは定数）などがよく用いられる。
はじめに初期値計算部６３ａ（ステップＳ３−３２）において、直交ベクトルｔ₁(ｆ)の初期値ｔ₁₍₀₎(ｆ）を選ぶ。初期値ｔ₁₍₀₎(ｆ）は任意の値を用いることができるが、図１２で示した従来の適応型ビームフォーマ法で求まったベクトルは、分離能力は低いが解の近くにあるので、これを初期値に用いることで良好かつ高速な収束が得られる。この初期値ベクトルｔ₁₍₀₎(ｆ）は白色化行列Ｖ（ｆ）と目的信号源とセンサ間の既知の周波数応答Ｈ₁(ｆ）と白色化された信号Ｚ（ｆ）とを用いて次式（１１）の計算により求めることができる。
【数２】

ここでＲ_z(ｆ)はＺ（ｆ）の共分散行列Ｒ_z(ｆ)＝Ｅ［Ｚ（ｆ）Ｚ ^H ( ｆ ) ］であり、Ｅ[ ]は平均を表わす。
このベクトルｔ₁₍₀₎(ｆ）は、従来の適応型ビームフォーマ法で用いた規範（妨害信号のみが存在する時間における誤差信号の最小化）で求まるものであり、拘束条件が正しく与えられる場合には既に分離を達成する直交ベクトルｔ₁(ｆ）と同一のものとなり、拘束条件の信頼性が低い場合には分離能力は低いが解の近くにあるベクトルとなる。
【００２９】
次に、ＩＣＡ処理部６３ｂ（ステップＳ３−３３）においてベクトルｔ ₁の更新を行う。目的関数Γ（ｆ）＝Ｅ｛Ｇ（｜ｔ ₁ ^H Ｚ｜²）｝の最大化は次の更新式（１２）により行われる。
【数３】

であり、ｇ（ｚ）は非線型関数Ｇ（ｚ）のｚに関する微分、下付きの（）内の値は更新回数をそれぞれ表す。
【００３０】
次に、ノルム更新部６３ｃ（ステップＳ３−３４）においてベクトルｔ ₁の長さを変更してベクトルｔ ₁が式（１０）の拘束条件を満たすようにする。これは以下の式（１３）により実現できる。
【数４】

次に、判定部６３ｄ（ステップＳ３−３５）で収束判定を行う。まだ収束していない場合、ベクトルｔ ₁の更新と長さの変更を繰り返す。十分に収束している場合、目的信号を分離抽出する為に必要なｔ ₁の収束結果を出力する。
【００３１】
この発明による目的信号抽出装置は、ＣＰＵやメモリ等を有するコンピュータと、ユーザ端末と、ＣＤ−ＲＯＭやＤＶＤ−ＲＯＭ、磁気ディスク装置、半導体メモリ等の読み取り可能な記録媒体とから構成することができる。記録媒体に記録された目的信号に関する事前情報Ｈ ₁(ｆ)と、記録媒体に記録された目的信号抽出プログラムもしくは回線を通して伝送された目的信号抽出プログラムは、コンピュータに読み取られ、コンピュータ上で前述した各処理を実現する。
この発明は目的音源信号の抽出のみならず、目的電波源の信号の抽出にも適用でき、この場合はセンサとしてはアンテナが用いられ、アンテナよりの観測信号は一般にベースバンドに変換され、サンプリングされたディジタル信号系列として処理される。
【００３２】
【発明の効果】
図４の実線４８は、目的信号方向を間違えて拘束条件を与えた場合に、発明法を用いて推定した直交ベクトルｔ ₁を示している。軸Ａに垂直なベクトルが推定されている。このように、目的信号方向を間違えて拘束条件を与えた場合にも十分な抑圧性能が得られるベクトルが推定されており、この発明の有効性が分かる。
図５は、出力端子１４₁に得られる信号について、各周波数における目的信号対妨害信号比（ＳＩＲ）をｄＢで示しており、値が正ならば、目的信号が出力端子１４₁に正しく得られていることを示し、値が負ならば、Permutationの問題が生じて妨害信号が出力端子１４₁に得られていることを示す。
【００３３】
図５（ａ）は、従来のＩＣＡ法を用いた場合に出力端子１４₁に得られる信号の各周波数におけるＳＩＲである。目的信号に関する拘束を入れていないため、Permutationの問題が著しい。
図５（ｂ）（ｃ）はそれぞれ、この発明方法を用いた場合に出力端子１４₁に得られる信号の各周波数におけるＳＩＲである。図５（ｂ）は無残響で、目的信号の方向が正しい角度と２０度ずれて与えられている場合、図５（ｃ）は目的信号の方向が正しく与えられているが、残響がある場合の結果である。すなわち図５（ｂ）（ｃ）の双方とも、正確な拘束条件を与えることができない状況である。しかし、この発明方法によると、双方ともほとんどの周波数で正のＳＩＲ値が得られており、Permutationの問題はほとんど生じていないことから、発明方法が有効であることが分かる。
【図面の簡単な説明】
【図１】この発明装置の機能構成例を示すブロック図。
【図２】この発明方法の実施形態の処理手順の例を示す流れ図。
【図３】図１中の分離行列推定部の具体的機能構成例を示すブロック図。
【図４】この発明方法が課題を解決する仕組を説明するための図。
【図５】発明の効果を示す図。
【図６】ＩＣＡ法によるブラインド音源分離（ＢＳＳ）のモデルを示す図。
【図７】従来のＩＣＡ法による周波数領域ＢＳＳの機能構成を示すブロック図。
【図８】図７中の従来の分離行列推定部２２の詳細な機能構成を示すブロック図。
【図９】置換（Permutation）の問題を説明するための図。
【図１０】従来の適応型ビームフォーマ法の機能構成を示すブロック図。
【図１１】適応型ビームフォーマ法で使うパラメータを説明するための信号源とセンサとの配置を示す図。
【図１２】適応型ビームフォーマ法で得られる解を示す図。

[0001]
BACKGROUND OF THE INVENTION
The present invention relates to a method and an apparatus for estimating a target signal, a target signal extraction program, and a recording medium for the program, for use in situations where signals from a plurality of directions are received as a mixture, so that the original signal to be observed (the target signal) cannot be observed directly by itself and is instead observed with other noise superimposed on it. In the audio field, for example, when the input microphone of a speech recognition device is far from the speaker and therefore picks up sounds other than the target speaker's voice, extracting the target speech makes it possible to build a speech recognition system with a high recognition rate.
[0002]
[Prior art]
Independent component analysis (ICA) method
The first approach to separating and extracting a target signal is based on independent component analysis (ICA). It estimates the original signals from a plurality of linearly mixed signals without using any knowledge of the original signals or of the mixing process, and is called blind source separation (BSS). Blind source separation is described first.
- Mixed signal (observed signal) model in a real environment
Let $s_i$ be the signal of signal source 11_i, $h_{ji}$ the impulse response (frequency response) from signal source 11_i to sensor 12_j, $P$ the order of the impulse response, $N$ the number of signal sources 11_i ($N > 1$), $M$ the number of sensors 12_j ($M \geq N$), and $n$ the discrete time index. The signal $x_j$ observed at sensor 12_j is then expressed as
$$x_j(n) = \sum_{i=1}^{N} \sum_{p=1}^{P} h_{ji}(p)\, s_i(n-p+1) \quad (j = 1, \ldots, M) \tag{1}$$
Here the $N$ signals $s_i$ are assumed to be statistically independent of one another. The observed signals $x_j(n)$ are sampled at a constant period and form digital signal sequences.
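As an illustration of Eq. (1), the following minimal sketch mixes two short sources into two sensor signals. The function name `convolutive_mix`, the example impulse responses, and the treatment of samples before $n = 0$ as zero are assumptions of this sketch, not part of the patent text.

```python
def convolutive_mix(sources, h):
    """Mix N source sequences into M observations following Eq. (1).

    sources: list of N equally long lists (the signals s_i).
    h: h[j][i] is the impulse response (P taps) from source i to sensor j.
    Samples before n = 0 are treated as zero (an assumption of this sketch).
    """
    num_sensors = len(h)
    length = len(sources[0])
    x = [[0.0] * length for _ in range(num_sensors)]
    for j in range(num_sensors):
        for i, s in enumerate(sources):
            # Zero-based tap index p here corresponds to p + 1 in Eq. (1).
            for p, tap in enumerate(h[j][i]):
                for n in range(p, length):
                    x[j][n] += tap * s[n - p]
    return x


# Two sources, two sensors, impulse responses with P = 2 taps (made-up values).
s = [[1.0, 0.0, 0.0, 0.0], [0.0, 2.0, 0.0, 0.0]]
h = [[[1.0, 0.5], [0.3, 0.0]],
     [[0.2, 0.0], [1.0, 0.4]]]
x = convolutive_mix(s, h)
```

Each sensor signal is a sum of filtered copies of every source, which is why separating the sources in the time domain requires estimating full filters rather than scalar gains.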
- Separated signal model
In blind source separation, the observed signals obtained in the form of Eq. (1) are separated by a separation system consisting of $N \times M$ separation filters 13_ij, each $Q$ taps long with impulse response $w_{ij}$. The signal $y_i(n)$ obtained by separation with the filters 13_ij is expressed as
$$y_i(n) = \sum_{j=1}^{M} \sum_{q=1}^{Q} w_{ij}(q)\, x_j(n-q+1) \quad (i = 1, \ldots, N) \tag{2}$$
For the case $N = M = 2$, FIG. 6 shows the mixing process between the signal sources 11₁, 11₂ and the sensors 12₁, 12₂, and the separation process in which the ICA method, using the $2 \times 2$ filters 13_ij, obtains the separated signals $y_1(n)$, $y_2(n)$ at the output terminals 14₁, 14₂ from the output signals $x_1$, $x_2$ of the sensors 12₁, 12₂.
[0003]
Independent component analysis (ICA) is widely used to estimate the separation filter coefficients (frequency responses) $w_{ij}$. The technique relies on the statistical independence of the signals: the separation filter coefficients $w_{ij}$ are determined by iterative learning so that the output signals $y_i(n)$ become mutually independent.
When the mixing takes place, for example, during sound pickup in a real sound field, the signals are convolved with the impulse responses of the system before being mixed, and a very complicated signal such as Eq. (1) results. Separating it requires estimating separation filter coefficients $w_{ij}$ of the complicated form shown in Eq. (2). With the methods proposed so far, estimating such complicated separation filter coefficients $w_{ij}$ is known to be inaccurate and costly in time.
For this reason, a technique (frequency domain BSS) for converting a signal into the frequency domain and obtaining a separation matrix at each frequency is widely used.
[0004]
- Frequency-domain BSS method
FIG. 7 shows the functional configuration of the frequency-domain BSS method. The frequency-domain transform unit 21 converts the observed signals $x_1(n)$, $x_2(n)$ into frequency-domain signals, for example by a short-time discrete Fourier transform (DFT) in which a window function is applied and the DFT is computed one frame at a time while shifting by, for example, half a frame. The resulting signals satisfy Eq. (3):
$$\mathbf{X}(f, m) = \mathbf{H}(f)\, \mathbf{S}(f, m) \tag{3}$$
Here $\mathbf{S}(f, m) = [S_1(f, m), S_2(f, m), \ldots, S_N(f, m)]^T$ and $\mathbf{X}(f, m) = [X_1(f, m), X_2(f, m), \ldots, X_M(f, m)]^T$, where $[\,]^T$ denotes transposition, $\mathbf{H}(f)$ is the mixing matrix whose elements are $H_{ji}(f)$, $f$ is the frequency, and $m$ is the frame number obtained when the observed signal is divided into short-time frames. Eq. (3) expresses the complicated mixing of Eq. (1) as instantaneous mixing in each frequency component, which simplifies the separation problem.
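The step from Eq. (1) to Eq. (3) rests on the fact that a DFT turns convolution into multiplication. The sketch below checks this for a single source and a single sensor within one frame. It assumes circular convolution, for which the identity is exact; with windowed short-time frames Eq. (3) holds only approximately, when the frame is long compared with the impulse response. The helper names and the sample values are made up for illustration.

```python
import cmath


def dft(x):
    """Naive discrete Fourier transform of a sequence."""
    n_pts = len(x)
    return [sum(x[n] * cmath.exp(-2j * cmath.pi * k * n / n_pts)
                for n in range(n_pts))
            for k in range(n_pts)]


def circular_convolve(h, s):
    """Circular convolution of a short filter h with a frame s."""
    n_pts = len(s)
    return [sum(h[p] * s[(n - p) % n_pts] for p in range(len(h)))
            for n in range(n_pts)]


s = [1.0, -2.0, 0.5, 3.0, 0.0, -1.0, 2.0, 0.25]   # one frame of a source
h = [1.0, 0.5, 0.25]                              # impulse response, P = 3
x = circular_convolve(h, s)                       # time-domain observation

h_padded = h + [0.0] * (len(s) - len(h))
X, H, S = dft(x), dft(h_padded), dft(s)

# In every frequency bin the observation is the product H(f) S(f).
max_error = max(abs(X[k] - H[k] * S[k]) for k in range(len(s)))
```

With several sources and sensors the same relation holds element by element, which gives the matrix form of Eq. (3).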
The separation matrix estimation unit 22 estimates a separation matrix $\mathbf{W}(f)$ that satisfies Eq. (4) such that the frequency-domain output signals $Y_i(f, m)$ are mutually independent:
$$\mathbf{Y}(f, m) = \mathbf{W}(f)\, \mathbf{X}(f, m) \tag{4}$$
Here $\mathbf{Y}(f, m) = [Y_1(f, m), Y_2(f, m), \ldots, Y_N(f, m)]^T$, and $\mathbf{W}(f)$ is an $N \times M$ matrix with elements $W_{ij}(f)$.
Separation is thereby achieved in each frequency component.
The time-domain transform unit 24 converts the frequency-domain output signals $Y_i(f, m)$ into time-domain signals, for example by an inverse Fourier transform. Alternatively, the time-domain transform unit 25 applies, for example, an inverse Fourier transform to each element $W_{ij}(f)$ of the separation matrix $\mathbf{W}(f)$ to obtain the time-domain separation filter coefficients $w_{ij}(q)$, and the separation filter group 26 uses these converted coefficients $w_{ij}(q)$ to evaluate Eq. (2) on the observed signals $x_j(n)$, which yields the separated output signals $y_i(n)$. The target signal is then extracted by selecting it, by some means, from the separated signals obtained in this way.
[0005]
The separation matrix estimation unit 22 generally performs three stages of processing: pre-whitening, estimation of an orthogonal matrix by ICA, and post-whitening. As shown in FIG. 8, the pre-whitening unit 31 estimates a whitening matrix $\mathbf{V}(f)$, the orthogonal matrix estimation unit 32 estimates an orthogonal matrix $\mathbf{T}(f)$, and the post-whitening unit 33 then uses these two estimated matrices to obtain the separation matrix $\mathbf{W}(f) = \mathbf{T}^H(f)\, \mathbf{V}(f)$.
More specifically, the pre-whitening unit 31 whitens the observed signal $\mathbf{X}(f, m)$ at each frequency in advance with the whitening matrix $\mathbf{V}(f)$, as $\mathbf{Z}(f, m) = \mathbf{V}(f)\, \mathbf{X}(f, m)$. Here $\mathbf{V}(f)$ is obtained as $\mathbf{V}(f) = \mathbf{\Lambda}^{-1/2}(f)\, \mathbf{O}(f)$, where $\mathbf{\Lambda}(f)$ is the matrix with the eigenvalues of the covariance matrix $R_{xx}(f) = E[\mathbf{X}\mathbf{X}^T]$ of $\mathbf{X}(f)$ on its diagonal and $\mathbf{O}(f)$ is the matrix formed from the eigenvectors.
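The pre-whitening step can be sketched for two sensors as follows. This is an illustration under stated assumptions, not the patent's implementation: the covariance is taken with the Hermitian transpose because the frequency-domain data are complex, the eigenvectors are taken as the columns of $\mathbf{O}$ so that the whitening matrix becomes $\mathbf{\Lambda}^{-1/2}\mathbf{O}^H$, and the function names and the synthetic test data are made up.

```python
import math
import random


def covariance(frames):
    """Sample covariance E[X X^H] of a list of 2-element complex vectors."""
    n = len(frames)
    r = [[0j, 0j], [0j, 0j]]
    for x in frames:
        for a in range(2):
            for b in range(2):
                r[a][b] += x[a] * x[b].conjugate() / n
    return r


def whitening_matrix(r):
    """Return V = Lambda^(-1/2) O^H for a 2x2 Hermitian covariance r."""
    a, d, b = r[0][0].real, r[1][1].real, r[0][1]
    if abs(b) < 1e-12:  # covariance already diagonal
        return [[complex(1 / math.sqrt(a)), 0j],
                [0j, complex(1 / math.sqrt(d))]]
    mean = (a + d) / 2
    half_gap = math.hypot((a - d) / 2, abs(b))
    v = []
    for lam in (mean + half_gap, mean - half_gap):
        vec = [b, lam - a]  # unnormalised eigenvector for eigenvalue lam
        norm = math.hypot(abs(vec[0]), abs(vec[1]))
        scale = 1.0 / (norm * math.sqrt(lam))
        # Row of V: conjugated unit eigenvector scaled by lam^(-1/2).
        v.append([complex(c).conjugate() * scale for c in vec])
    return v


# Synthetic frames of one frequency bin: two sources mixed by a made-up matrix.
random.seed(0)
frames = []
for _ in range(500):
    s1 = complex(random.gauss(0, 1), random.gauss(0, 1))
    s2 = complex(random.gauss(0, 1), random.gauss(0, 1))
    frames.append([s1 + 0.6j * s2, (0.4 - 0.2j) * s1 + 2.0 * s2])

V = whitening_matrix(covariance(frames))
Z = [[V[0][0] * x[0] + V[0][1] * x[1], V[1][0] * x[0] + V[1][1] * x[1]]
     for x in frames]
R_z = covariance(Z)  # close to the identity matrix after whitening
```

After whitening, the covariance of $\mathbf{Z}$ is the identity, which is what allows the remaining separation matrix to be restricted to an orthogonal matrix in the next stage.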
[0006]
In the orthogonal matrix estimation unit 32, writing $\mathbf{T}(f)$ for the matrix that separates the whitened observed signal $\mathbf{Z}(f, m)$, the separated signal $\mathbf{Y}(f, m)$ is expressed as
$$\mathbf{Y}(f, m) = \mathbf{T}(f)\, \mathbf{Z}(f, m) \tag{5}$$
Because whitening has been performed in the preceding stage, the matrix $\mathbf{T}(f)$ can be restricted to an orthogonal matrix. That is, writing the $k$-th row of $\mathbf{T}(f)$ as the vector $\mathbf{t}_k(f)$, $\mathbf{T}(f)$ can be restricted to matrices in which the vectors $\mathbf{t}_i(f)$ and $\mathbf{t}_j(f)$ are orthogonal. ICA is used to find this orthogonal separation matrix $\mathbf{T}(f)$ (see, for example, Non-Patent Documents 1 and 2).
One ICA method, which extracts the individual independent components by increasing the non-Gaussianity of the output signal, is described here. It exploits the property that, by the central limit theorem, a mixture of original signals whose distributions are not Gaussian (non-Gaussian) is closer to Gaussian. The frequency-domain signal of an original signal can therefore be extracted by using a vector $\mathbf{t}_k$ to convert the nearly Gaussian signal $\mathbf{Z}(f, m)$ into a signal $Y_k(f, m)$ of higher non-Gaussianity.
[0007]
In this method, the component vector $\mathbf{t}_k(f)$ of the orthogonal matrix $\mathbf{T}(f)$ is found that maximizes an objective function $\Gamma(f)$, which takes its maximum when the distribution of the output signal $Y_k(f)$ is farthest from Gaussian, and the independent components $Y_k(f, m)$ are extracted one at a time. In other words, the orthogonal separation matrix $\mathbf{T}(f)$ is obtained row by row. When $k > 2$, to prevent $\mathbf{t}_k(f)$ from coinciding with a vector obtained earlier, $\mathbf{t}_k(f)$ is determined such that the $r$-th vector $\mathbf{t}_r$, with $r$ greater than $k$, is always orthogonal to $\mathbf{t}_k$.
Each of the independent components $Y_1(f, m), \ldots, Y_k(f, m), \ldots, Y_N(f, m)$ extracted in this way corresponds to one of the frequency-domain original signals $S_1(f, m), \ldots, S_i(f, m), \ldots, S_N(f, m)$, but its magnitude and order are arbitrary. This is because ICA estimates the vector $\mathbf{t}_k(f)$ solely on the criterion of extracting independent components of the signal and prescribes neither the length of $\mathbf{t}_k(f)$ nor the order in which the vectors are found.
[0008]
To remove this arbitrariness in vector magnitude, a constraint that sets the norm of the vector $\mathbf{t}_k(f)$ to 1 is generally added. That is, conventional ICA finds, among the vectors $\mathbf{t}_k(f)$ with $\|\mathbf{t}_k(f)\| = 1$, the one that maximizes the objective function $\Gamma(f)$, as expressed by Eq. (6):
$$\arg\max_{\mathbf{t}_k(f)} \Gamma(f) \quad \text{subject to} \quad \|\mathbf{t}_k(f)\| = 1 \tag{6}$$
In frequency-domain BSS, $E\{G(|\mathbf{t}_k^H(f)\, \mathbf{Z}(f)|^2)\}$ is used as the objective function $\Gamma(f)$. Here $G$ is a nonlinear function; $G(z) = \log(a + z)$ and $G(z) = \sqrt{a + z}$ (with $a$ a constant) are common choices.
Although conventional ICA removes the arbitrariness in vector magnitude through the constraint, the order in which the vectors $\mathbf{t}_k(f)$ are found remains arbitrary. This arbitrariness of order is the drawback of conventional frequency-domain BSS and is called the permutation problem.
The permutation problem is illustrated here concretely for the case $N = M = 2$.
[0009]
In FIG. 9, the many small black dots plot the whitened signals, with $Z_1(f, m)$ on the horizontal axis and $Z_2(f, m)$ on the vertical axis. The circle 41, drawn with a thick solid line, represents the constraint $\|\mathbf{t}_k(f)\| = 1$. The thin solid lines 42 are contour lines of the objective function $\Gamma(f) = E\{G(|\mathbf{t}_k^H(f)\, \mathbf{Z}(f)|^2)\}$, whose value increases toward the outside.
Eq. (6) seeks the vector $\mathbf{t}_k$ that maximizes $\Gamma(f)$ on the constraint circle 41. The solution is therefore one of the two white vectors $\alpha$ and $\beta$ in FIG. 9, which start at the center of the circle 41 and lie on the mutually orthogonal axes A and B passing through that center. That is, both the solution $\mathbf{t}_1(f) = \alpha$, $\mathbf{t}_2(f) = \beta$ and the solution $\mathbf{t}_2(f) = \alpha$, $\mathbf{t}_1(f) = \beta$ can be obtained, because the independence of the outputs $Y_1(f, m)$ and $Y_2(f, m)$ is preserved in either case.
This can be explained with equations. Writing out Eq. (5) for the case $N = M = 2$ gives the following Eq. (7).
[Expression 1]
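The image of Eq. (7) is not reproduced in this text. Since the equation is described as Eq. (5) written out for $N = M = 2$, it plausibly has the form sketched below. The element names $T_{11}, \ldots, T_{22}$ are an assumption of this sketch and may differ from the notation in the original drawing.

```latex
\begin{bmatrix} Y_1(f, m) \\ Y_2(f, m) \end{bmatrix}
=
\begin{bmatrix} T_{11}(f) & T_{12}(f) \\ T_{21}(f) & T_{22}(f) \end{bmatrix}
\begin{bmatrix} Z_1(f, m) \\ Z_2(f, m) \end{bmatrix}
\tag{7}
```

In this form the first row of $\mathbf{T}(f)$ produces $Y_1(f, m)$ and the second row produces $Y_2(f, m)$, so exchanging the two rows only exchanges the two outputs.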

[0010]
The first output $Y_1(f, m)$ is obtained from the first row of the orthogonal matrix $\mathbf{T}(f)$ and the second output $Y_2(f, m)$ from its second row, and $Y_1(f, m)$ and $Y_2(f, m)$ are then independent. However, the independence of $Y_1(f, m)$ and $Y_2(f, m)$ is preserved even if the rows of $\mathbf{T}(f)$ are interchanged. That is, swapping the first and second rows of $\mathbf{T}(f)$ yields $Y_2(f, m)$ at the first output and $Y_1(f, m)$ at the second output, and the two output signals are still independent. ICA thus makes the output signals mutually independent but does not constrain their order.
Consequently, for any two frequencies $f_1$ and $f_2$, the output signals $Y_1(f_1, m)$ and $Y_1(f_2, m)$, for example, are not necessarily estimates of the same signal $s_i$. In frequency-domain BSS, the rows of the orthogonal matrix $\mathbf{T}(f)$ must therefore be reordered correctly so that $Y_i(f_1, m)$ and $Y_i(f_2, m)$ are estimates of the signal $s_i$ of the same source. This is called the permutation problem.
[0011]
After this permutation problem has been solved, the post-whitening unit 33 computes $\mathbf{W}(f) = \mathbf{T}^H(f)\, \mathbf{V}(f)$ from the orthogonal matrix $\mathbf{T}(f)$ and the whitening matrix $\mathbf{V}(f)$ used in the pre-whitening unit 31 to obtain the separation matrix $\mathbf{W}(f)$.
One method of solving the permutation problem is described, for example, in Non-Patent Document 3.
[0012]
Adaptive beamformer method
The second approach to separating and extracting a target signal is based on an adaptive beamformer. As shown in FIG. 10, the input signals observed by the sensor array 50 are fed to the target-signal-off period estimation unit 51, which detects time intervals in which only interference signals are present. During the detected intervals, the input signals are supplied to the filter group 52, the sum of the output signals of the filter group 52 is taken as the error signal $e(t)$, and the filter control unit 53 updates the filter coefficients (impulse responses) $w_{ij}$ of the filter group 52 so as to minimize the power of the error signal. The resulting filter coefficients $w_{ij}$ are then copied to the filter group 54, and passing the input signals through the filter group 54 yields an output signal $y(n)$ in which the interference signals are suppressed and the target signal is emphasized.
The description here assumes that the target signal is $s_1(n)$. Because the adaptive beamformer method is often used in the frequency domain, it is also described in the frequency domain here.
When the filter coefficients are updated, the separation matrix $W_{1j}(f)$ is estimated so as to minimize the power of the error signal $E(f, m)$ under the constraint described below, which prevents the meaningless solution in which every $W_{1j}(f)$ is 0 (so that the target signal is not output either). Here $W_{1j}(f)$ and $E(f, m)$ are the filter coefficients $w_i(k)$ and the error signal $e(t)$, respectively, converted to the frequency domain by, for example, a short-time Fourier transform.
[0013]
The adaptive beamformer method requires the frequency response $H_{j1}(f)$ from the target signal source to sensor $j$ to be known. Let the known frequency response be $H'_{j1}(f) = \exp(j 2\pi f \tau_{j1})$. Alternatively, with the direction $\theta$ of the target signal source known, the frequency response $H_{j1}(f)$ from the target signal source to sensor $j$ is approximated, using only the inter-sensor delay $\tau_{j1}$ of the signal, as $H'_{j1}(f) = \exp(j 2\pi f \tau_{j1})$. Here, as shown in FIG. 11, $\tau_{j1} = (d_j / c) \sin\theta_1$, where $d_j$ is the coordinate of sensor 12_j, $c$ is the speed of sound, and $\theta_1$ is the direction of sound source 11₁. This approximation assumes that only the direct sound reaches the sensor (microphone) 12_j from the target signal source (loudspeaker) 11₁.
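The direct-sound approximation above can be evaluated as in the following minimal sketch. The function name `steering_vector`, the speed of sound of 340 m/s, the 4 cm sensor spacing, and the 30 degree source direction are illustrative assumptions, not values from the patent.

```python
import cmath
import math


def steering_vector(freq_hz, sensor_positions_m, theta_rad, speed_of_sound=340.0):
    """Approximate H'_j1(f) = exp(j 2 pi f tau_j1), tau_j1 = (d_j / c) sin(theta_1)."""
    response = []
    for d_j in sensor_positions_m:
        tau = d_j / speed_of_sound * math.sin(theta_rad)  # inter-sensor delay
        response.append(cmath.exp(2j * math.pi * freq_hz * tau))
    return response


# Two sensors 4 cm apart, source at 30 degrees, evaluated at 1 kHz.
H1 = steering_vector(1000.0, [0.0, 0.04], math.radians(30.0))
```

Every element has unit magnitude, so the approximation models only the phase difference between sensors and ignores attenuation and reflections.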
When $H'_{j1}(f)$ is known in this way, the constraint is given, for example, by Eq. (8):
$$\sum_{j=1}^{M} H'_{j1}(f)\, W_{1j}(f) = \mathbf{W}_1(f)\, \mathbf{H}_1(f) = 1 \tag{8}$$
The coefficients $W'_{1j}(f)$ that minimize the power of the error signal $E(f, m)$ while satisfying Eq. (8) are then found. Here $\mathbf{H}_1(f) = [H'_{11}(f), H'_{21}(f)]^T$ and $\mathbf{W}_1(f) = [W_{11}(f), W_{12}(f)]$. Eq. (8) is the constraint that the frequency response from the target signal to the output be 1 at all frequencies, which is the condition for the target signal to be output without distortion.
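Minimizing the output power under the linear constraint of Eq. (8) has a well-known closed-form solution obtained with a Lagrange multiplier, $\mathbf{W}_1^H = \mathbf{R}^{-1}\mathbf{H}_1 / (\mathbf{H}_1^H \mathbf{R}^{-1} \mathbf{H}_1)$, where $\mathbf{R}$ is the covariance of the sensor signals in the interference-only intervals. The sketch below applies that form to two sensors. It is an illustration only: the patent text here describes iterative filter updates and does not state this formula, and the function names and the numerical values of `R` and `h` are made up.

```python
import cmath


def inverse_2x2(r):
    """Inverse of a 2x2 complex matrix."""
    det = r[0][0] * r[1][1] - r[0][1] * r[1][0]
    return [[r[1][1] / det, -r[0][1] / det],
            [-r[1][0] / det, r[0][0] / det]]


def constrained_min_power_weights(r, h):
    """Row vector W minimising W R W^H subject to W h = 1 (two sensors).

    Uses the closed form W^H = R^{-1} h / (h^H R^{-1} h).
    """
    r_inv = inverse_2x2(r)
    r_inv_h = [r_inv[0][0] * h[0] + r_inv[0][1] * h[1],
               r_inv[1][0] * h[0] + r_inv[1][1] * h[1]]
    # h^H R^{-1} h is real for a Hermitian R; .real drops rounding residue.
    denom = (h[0].conjugate() * r_inv_h[0] + h[1].conjugate() * r_inv_h[1]).real
    w_col = [v / denom for v in r_inv_h]      # column vector W^H
    return [v.conjugate() for v in w_col]     # row vector W


def output_power(w, r):
    """Output power W R W^H for a row vector W."""
    return sum(w[a] * r[a][b] * w[b].conjugate()
               for a in range(2) for b in range(2)).real


# Made-up interference covariance (Hermitian, positive definite) and response.
R = [[2.0 + 0j, 0.5 + 0.5j], [0.5 - 0.5j, 1.0 + 0j]]
h = [1.0 + 0j, cmath.exp(0.37j)]

W1 = constrained_min_power_weights(R, h)
gain = W1[0] * h[0] + W1[1] * h[1]            # Eq. (8): should equal 1
```

Any other weight vector that also satisfies Eq. (8), for example one that uses only the first sensor, passes at least as much interference power as `W1`.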
[0014]
Providing the constraint in the adaptive beamformer method requires, as described above, the frequency response $H_{j1}(f)$ from the target signal source 11₁ to the sensors 12_j. However, $H_{j1}(f)$ varies with movement of the signal source 11₁ and with changes in the environment (temperature changes, changes in room geometry such as a door being opened, and so on), so the frequency response $H_{j1}(f)$ at observation time rarely equals the frequency response $H'_{j1}(f)$ used when the adaptive beamformer is run. Likewise, when $H'_{j1}(f)$ is approximated from the direction $\theta_1$ of the target signal source 11₁, the approximation is poor if the estimated direction of the target signal source 11₁ is wrong or if reflected sound is present in addition to the direct sound, as in recordings made in a real environment.
The constraint used in the adaptive beamformer method is thus in many cases inaccurate, in the sense that it does not match the environment of actual use, and this is the problem with the adaptive beamformer method. When such an inaccurate $H'_{j1}(f)$ is used as the constraint, the interference suppression capability of the adaptive beamformer method drops markedly.
[0015]
This will be described with reference to FIG. 12. The points shown in gray in the figure are the whitened signals, plotted with Z_1(f, m) on the horizontal axis and Z_2(f, m) on the vertical axis. The orthogonal vector t_1 for the target signal and the constraint of equation (8) can be displayed in the same plane as Z_i(f). In the figure, the broken line 44 represents the vector t_1 estimated by the adaptive beamformer method, drawn with t_1 on the horizontal axis and t_2 on the vertical axis, and the alternate long and short dash line 45 represents the constraint condition.
When the correct constraint condition is given, first, as shown in FIG. 12A, the line 45 indicating the constraint condition is parallel to one of the axes of the plotted Z_i(f) (axis A in the figure). In addition, when the correct constraint condition is given, the orthogonal vector t_1 for the target signal estimated by the adaptive beamformer method intersects axis A perpendicularly.
When the two are perpendicular, the interference signal is suppressed best (see, for example, Non-Patent Document 2).
On the other hand, when the constraint condition is given with a wrong target signal direction, the estimated orthogonal vector t_1 and axis A do not intersect perpendicularly, as shown in FIG. 12. This indicates that the interference-removal capability is low.
[0016]
[Non-Patent Document 1]
A. Hyvärinen, J. Karhunen, and E. Oja, "Independent Component Analysis," John Wiley & Sons, 2001, ISBN 0-471-40540
[Non-Patent Document 2]
M. Knaak and D. Filbert, "Acoustical semi-blind source separation for machine monitoring," in 3rd International Conference on Blind Source Separation and Independent Component Analysis, 2001, pp. 361-366
[Non-Patent Document 3]
Hiroshi Sawada, Ryo Mukai, Shoko Araki, and Shoji Makino, "Solution of permutation problem in frequency domain blind source separation," Acoustical Society of Japan Autumn Meeting, pp. 541-542, September 2002
[0017]
[Problems to be solved by the invention]
Because conventional frequency-domain BSS solves the separation problem frequency by frequency, the separation matrix in each band can be obtained in a short computation time and with good separation accuracy. However, in frequency-domain BSS, although the scale of the orthogonal matrix T(f) is fixed by the constraint that each row vector of T(f) has norm 1, there was no constraint on the order of the rows of T(f). For this reason, the permutation problem had to be solved for the obtained Y(f, m).
In the adaptive beamformer method, on the other hand, the frequency response from the target signal source to the sensors and the direction of the target signal source cannot be obtained accurately, so the filter (inverse mixing matrix) is optimized under an erroneous constraint, and the interference-removal capability was insufficient.
[0018]
An object of the present invention is to provide a target signal extraction method, an apparatus therefor, a target signal extraction program, and a recording medium therefor, based on an algorithm in which no permutation problem arises during learning by ICA, so that the permutation need not be solved, and which at the same time sufficiently removes interference signals even when the reliability of the given constraint condition is low.
[0019]
[Means for Solving the Problems]
As shown in FIG. 1, the basic functional configuration of the apparatus according to the present invention is the same as that of conventional frequency-domain blind signal separation (BSS) by independent component analysis (ICA) shown in FIG. 7; the present invention is characterized by its separation matrix estimation unit. Using the prior knowledge H_1 of the frequency response between the target signal source and the sensors, held in a prior knowledge holding unit, an initial value t_{1(0)} of the separation vector t_1 that satisfies the constraint of extracting the target signal without distortion is obtained. The separation matrix estimation unit updates this initial value t_{1(0)} by the ICA method so as to further increase the non-Gaussianity of the output signal, and then updates the norm of the updated vector t_1 so that t_1 satisfies the constraint. These two updates are repeated as necessary until the vector t_1 has converged sufficiently. Here, for example, H_1(f) = [H'_11(f), H'_21(f)], and the prior knowledge may be, for example, information on the target signal direction with the degree of accuracy used in the adaptive beamformer method.
[0020]
[Detailed Description of the Invention]
FIG. 1 shows an example of the functional configuration of the apparatus of the present invention, and FIG. 2 shows an example of the processing procedure of the method of the present invention. In the following, the case where the observed signals are x_1(n) and x_2(n), the separated signals are y_1(n) and y_2(n), and y_1(n) is extracted as the separated target signal is taken as an example.
The observation signals from the sensors are captured and temporarily stored in a storage unit (not shown in FIG. 1) (S1). As in the conventional frequency-domain BSS method shown in FIG. 7, the frequency domain conversion unit 21 converts these observation signals into a frequency-domain signal matrix X(f, m) by, for example, a short-time Fourier transform (S2). Using this transformed signal matrix X(f, m), the separation matrix estimation unit 61 estimates the separation matrix W(f) (S3).
In this estimation, as in the conventional method, the pre-whitening unit 31 calculates the whitening matrix V(f) (S3-1), and the signal matrix X(f, m) is whitened using the whitening matrix V(f) to obtain the whitened observation signal matrix Z(f, m) (S3-2).
The separation matrix estimation using the prior knowledge H_1 on the target signal direction is described in detail below.
[0021]
The present invention is characterized by the orthogonal matrix estimation unit 63. In this embodiment, the orthogonal matrix estimation unit 63 obtains the vector t_1 that maximizes the objective function Γ(f) under the constraint condition of equation (10).
arg max_{t_1} Γ(f)   (9)
W_1^H(f) H_1(f) = t_1^H(f) V(f) H_1(f) = 1   (10)
This is realized, for example, as follows.
First, an initial value t_{1(0)}(f) of the orthogonal vector t_1(f) is given (S3-30). Any vector that satisfies the constraint condition of equation (10) can be used as this initial value t_{1(0)}(f); for example, the vector obtained by the adaptive beamformer method described in the prior-art section may be used. That is, the prior frequency response H_1(f) held as prior knowledge in the prior knowledge holding unit 62 is first read out (S3-31). The prior frequency response H_1(f), the signal matrix Z(f, m), and the whitening matrix V(f) are input to the initial value calculation unit 63a, and the W_1(f) that minimizes the power of the error signal E(f, m) in FIG. 10 while satisfying equation (8) is obtained.
From the relationship W_1(f) = t_1(f) V(f), t_1(f) is obtained and used as the initial value vector t_{1(0)}(f) (S3-32).
If the constraint condition is given correctly, this vector t_{1(0)} is an orthogonal vector t_1 that already achieves separation. When the reliability of the constraint condition is low, its separation capability is low, but its direction is close to that of the correct vector, that is, the vector perpendicular to axis A in the example shown in FIG. 12. Using this vector as the initial value t_{1(0)}(f) therefore gives good and fast convergence. As described for the conventional adaptive beamformer method, the prior frequency response information H_1(f) may be calculated, with the direction θ_1 of the target signal source (the arrival direction of the target signal) known, as H_j1(f) = exp(j2πfτ_j1), τ_j1 = (d_j/c) sin θ_1, or it may be measured in advance.
[0022]
The signal matrix Z(f, m), the initial value t_{1(0)}(f), and the whitening matrix V(f) are input to the ICA processing unit 63b, which updates t_1 by the ICA method so as to increase the non-Gaussianity of the output signal (S3-33). This allows the vector t_1 that best separates the output signal to be estimated, independently of the constraint condition of equation (10).
The updated vector t_1(f), the whitening matrix V(f), and the prior information H_1(f) are input to the norm update unit 63c, which changes the length (norm) of the updated vector t_1 so that it satisfies the constraint condition (10) (S3-34). It suffices to change the length of the vector t_1 estimated by the ICA processing unit 63b, without changing its direction, so that the constraint condition of equation (10) is satisfied. Equation (10) is the condition that the frequency response from the target signal to the output signal, that is, from the target signal source to the output terminal of the target signal extraction device, is 1 at all frequencies; it is the condition for output without distortion. Therefore, all frequency components separated by a vector t_1 that satisfies equation (10) are components of the same target signal. In other words, the first row of the orthogonal matrix T(f) of equation (7) generates the output corresponding to the target signal at every frequency, so no permutation problem arises.
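The norm change of step S3-34 can be sketched as follows. Scaling t_1 by a single complex factor keeps its direction and makes t_1^H(f) V(f) H_1(f) equal to 1, which is what equation (10) requires. This is a minimal sketch derived from equation (10) only; the function names and the numerical values of t, V, and H are hypothetical, and the expression may differ in form from equation (13) of this document.

```python
import cmath

def matvec(M, v):
    """Matrix-vector product for nested lists of complex numbers."""
    return [sum(M[i][j] * v[j] for j in range(len(v))) for i in range(len(M))]

def inner(u, v):
    """Hermitian inner product u^H v."""
    return sum(x.conjugate() * y for x, y in zip(u, v))

def rescale_to_constraint(t, V, H):
    """Scale t (direction unchanged) so that t^H V H = 1."""
    a = matvec(V, H)   # a = V(f) H_1(f)
    s = inner(t, a)    # current response t^H V H
    if abs(s) < 1e-12:
        raise ValueError("t is orthogonal to V H; the constraint cannot be met by scaling")
    return [ti / s.conjugate() for ti in t]

# Hypothetical values for one frequency bin.
V = [[1.2 + 0j, 0.1 - 0.3j], [0.1 + 0.3j, 0.8 + 0j]]
H = [1 + 0j, cmath.exp(0.37j)]
t = [0.6 - 0.2j, -0.5 + 0.4j]
t_new = rescale_to_constraint(t, V, H)
```

Dividing by the complex conjugate of s, rather than by s itself, is what makes the Hermitian product come out as exactly 1 instead of a unit-magnitude phase factor.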
[0023]
After the norm change, the convergence determination unit 63d determines the convergence state of the vector t_1(f) (S3-35). If it has converged sufficiently, the converged vector t_1 needed to separate and extract the target signal is output. If it has not yet converged, the vector t_1(f) is input again to the ICA processing unit 63b through the switch unit 63e; that is, the process returns to step S3-33, and steps S3-33 to S3-35 are repeated.
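The control flow of steps S3-33 to S3-35 can be sketched as a loop. Because the ICA update expression itself is not reproduced in this text, the update is passed in as a placeholder callable `ica_step`; the convergence test (largest element-wise change below `tol`), the names, and the toy step used in the usage lines are all assumptions made for illustration, not the method's actual update or criterion.

```python
def rescale(t, a):
    """Scale t so that t^H a = 1, where a stands for V(f) H_1(f)."""
    s = sum(ti.conjugate() * ai for ti, ai in zip(t, a))
    return [ti / s.conjugate() for ti in t]

def estimate_t1(t_init, a, ica_step, tol=1e-6, max_iter=100):
    """Alternate an update step (S3-33) with norm rescaling (S3-34)
    until the vector stops changing (S3-35)."""
    t = rescale(t_init, a)
    for k in range(1, max_iter + 1):
        t_new = rescale(ica_step(t), a)                      # S3-33, then S3-34
        change = max(abs(x - y) for x, y in zip(t_new, t))   # assumed criterion
        t = t_new
        if change < tol:                                     # S3-35
            return t, k, True
    return t, max_iter, False

# Toy stand-in for the ICA update: move halfway toward a fixed vector.
# It only exercises the loop; it is NOT an ICA update.
target = [1 + 0j, 2 + 0j]
toy_step = lambda t: [0.5 * (x + y) for x, y in zip(t, target)]
a = [1 + 0j, 0j]
t_final, n_iter, converged = estimate_t1([1 + 0j, 0j], a, toy_step)
```

Rescaling inside the loop means every intermediate vector, not only the final one, satisfies the distortionless constraint.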
The converged orthogonal vector t_1(f) and the whitening matrix V(f) are input to the post-whitening unit 33, and the post-whitening separation vector w_1(f) is calculated (S3-4).
Similarly, when there are multiple target signals, the separation vector w_i corresponding to each target signal, that is, the separation matrix W(f), is obtained. The separation matrix W(f) and the signal matrix X(f, m) are input to the separation calculation unit 27, which calculates the separated target signal matrix Y(f, m) by equation (4) (S4). The time domain conversion unit 24 converts the result into time-domain signals by, for example, an inverse Fourier transform, yielding the separated target signals y_1(n) and y_2(n) (S5).
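Steps S3-4 and S4 can be sketched for a single frequency bin as follows, assuming the relation W_1^H(f) = t_1^H(f) V(f) that appears in equation (10). The function names and all numerical values are hypothetical.

```python
def separation_row(t, V):
    """Post-whitening: row vector t^H V that is applied to the observed X(f, m)."""
    n_cols = len(V[0])
    return [sum(t[i].conjugate() * V[i][j] for i in range(len(t)))
            for j in range(n_cols)]

def separate(w_row, X_frames):
    """Apply the separation row to each frame m of one frequency bin."""
    return [sum(w * x for w, x in zip(w_row, frame)) for frame in X_frames]

# Hypothetical values for one frequency bin and two frames.
V = [[1.2 + 0j, 0.1 - 0.3j], [0.1 + 0.3j, 0.8 + 0j]]
t = [0.6 - 0.2j, -0.5 + 0.4j]
X_frames = [[1 + 1j, 2 - 1j], [0.5j, -1 + 0j]]

w_row = separation_row(t, V)
Y1 = separate(w_row, X_frames)
```

Applying t^H V to X(f, m) gives the same output as applying t^H to the whitened Z(f, m) = V(f) X(f, m), so post-whitening simply folds the two steps into one vector.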
[0024]
Alternatively, the separation matrix W(f) obtained by post-whitening may be converted into a group of filter coefficients w_ij by, for example, an inverse Fourier transform in the time domain conversion unit 25 (S6), and the separation filter group 26 may convolve the observation signals x_j(n) with the corresponding filter coefficients to obtain the separated target signals y_1(n) and y_2(n) (S7).
The following explains how this embodiment solves the problems addressed by the present invention.
The mechanism by which the processing described above solves these problems will be described with reference to FIG. 4. The points shown in gray are the whitened signals, plotted with Z_1(f, m) on the horizontal axis and Z_2(f, m) on the vertical axis. The alternate long and short dash line 46 represents the constraint condition; the broken line 47 represents the separation vector estimated by the adaptive beamformer method, which is the initial value t_{1(0)} of this embodiment; and the solid line 48 represents the orthogonal vector t_1 obtained by this embodiment. As in the case shown in FIG. 12A, the interference signal is suppressed most when axis A and the solid-line vector t_1 intersect perpendicularly.
[0025]
(1) In conventional BSS by ICA, the maximization problem is solved under the constraint that the norm of the orthogonal matrix T(f) is 1 (circle 41 in FIG. 4). It is therefore indeterminate which of two mutually perpendicular vectors a and b, one of which is perpendicular to axis A shown in the figure, is obtained as the vector t_1. This indeterminacy was the permutation problem.
In this embodiment, equation (10) is used as the constraint condition. This is the condition that the frequency response from the target signal to the output signal is 1 at all frequencies, that is, that the target signal is output without distortion. A vector t_1 that satisfies this constraint condition therefore yields the target signal at every frequency.
In other words, the first row of the orthogonal matrix T(f) of equation (7) corresponds to the target signal at all frequencies, and no permutation problem arises.
[0026]
In each iteration of the iterative process described above, the length (norm) of the vector t_1 is determined so that the vector satisfies the constraint condition of equation (10). Even if the constraint condition differs slightly from the actual condition, the constraint line is closer to parallel with axis A than with axis B, as in the example shown in the figure, so in most cases the vector perpendicular to axis A is obtained as the final orthogonal vector t_1. That is, even if the constraint condition differs slightly from the actual condition, the permutation problem does not arise.
When the vector obtained by the adaptive beamformer method is used as the initial value t_{1(0)}, learning can start from a vector that is nearly perpendicular to axis A even if the constraint condition differs slightly from the actual condition, which also helps to avoid the permutation problem.
[0027]
(2) In the adaptive beamformer method, the interference-removal capability drops when the constraint condition is given with a wrong target signal direction. In that case, as shown in FIG. 12, the vector t_1 and axis A did not intersect perpendicularly.
In step S3-33, the ICA processing unit 63b updates the vector t_1 by the ICA method, so that t_1 approaches a direction perpendicular to axis A or axis B, like the vectors α and β shown in the figure. Because the permutation problem has been solved, t_1 converges to the direction perpendicular to axis A.
In this invention, each update first moves t_1 toward the direction perpendicular to axis A by the ICA method and then changes the length of t_1 so that the constraint condition is satisfied; t_1 therefore approaches the direction perpendicular to axis A.
As a result, even when the reliability of the constraint condition is low, a vector perpendicular to axis A is obtained as the final separation vector t_1.
[0028]
Example
Here, an embodiment of the present invention is described for the objective function Γ(f) = E{G(|t_1^H Z|^2)}. G is a nonlinear function; G(z) = log(a + z) and G(z) = √(a + z) (a is a constant) are often used.
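For concreteness, the objective Γ(f) = E{G(|t_1^H Z|^2)} can be evaluated as a sample average over frames, as in the sketch below. The function name, the value a = 1, and the two toy frames are assumptions made for the example; the expectation E{} is approximated by the mean over the available frames.

```python
import math

def objective(t, Z_frames, G):
    """Sample estimate of E{ G(|t^H Z|^2) } over the frames of one frequency bin."""
    total = 0.0
    for z in Z_frames:
        y = sum(ti.conjugate() * zi for ti, zi in zip(t, z))  # y = t^H z
        total += G(abs(y) ** 2)
    return total / len(Z_frames)

a = 1.0  # hypothetical constant
G_log = lambda z: math.log(a + z)
G_sqrt = lambda z: math.sqrt(a + z)

# Two toy whitened frames.
Z_frames = [[1 + 0j, 0j], [0j, 1 + 0j]]
t = [1 + 0j, 0j]
gamma_log = objective(t, Z_frames, G_log)
gamma_sqrt = objective(t, Z_frames, G_sqrt)
```

With these toy frames the outputs are y = 1 and y = 0, so the two averages are (log 2 + log 1)/2 and (√2 + 1)/2 respectively.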
First, in the initial value calculation unit 63a (step S3-32), the initial value t_{1(0)}(f) of the orthogonal vector t_1(f) is selected. Any value can be used as the initial value t_{1(0)}(f), but the vector obtained by the conventional adaptive beamformer method shown in FIG. 12, although its separation ability is low, is close to the solution, so using it as the initial value gives good and fast convergence. This initial value vector t_{1(0)}(f) can be obtained from the whitening matrix V(f), the known frequency response H_1(f) between the target signal source and the sensors, and the whitened signal Z(f) by the following equation (11).
[Expression 2]

Here, R_z(f) is the covariance matrix of Z(f), R_z(f) = E[Z(f) Z^H(f)], and E[] denotes the average.
This vector t_{1(0)}(f) is obtained by the criterion used in the conventional adaptive beamformer method (minimization of the error signal during periods in which only the interference signal is present). When the constraint condition is given correctly, it coincides with the orthogonal vector t_1(f) that already achieves separation; when the reliability of the constraint condition is low, its separation ability is low, but it is a vector near the solution.
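Since equation (11) is not reproduced in this text, the sketch below uses the standard closed form for a minimum-power vector under a single linear constraint, t = R^{-1} a / (a^H R^{-1} a) with a = V(f) H_1(f). This is offered as one textbook way to obtain such an initial vector and may differ in form from equation (11). The 2x2 restriction, the function names, and the numerical values are assumptions made for illustration.

```python
def inner(u, v):
    """Hermitian inner product u^H v."""
    return sum(x.conjugate() * y for x, y in zip(u, v))

def min_power_vector(R, a):
    """Minimize t^H R t subject to t^H a = 1 for a 2x2 Hermitian positive-definite R."""
    det = R[0][0] * R[1][1] - R[0][1] * R[1][0]
    R_inv = [[R[1][1] / det, -R[0][1] / det],
             [-R[1][0] / det, R[0][0] / det]]
    u = [R_inv[i][0] * a[0] + R_inv[i][1] * a[1] for i in range(2)]  # R^-1 a
    denom = inner(a, u)                                              # a^H R^-1 a
    return [x / denom for x in u]

def output_power(t, R):
    """Output power t^H R t."""
    Rt = [R[i][0] * t[0] + R[i][1] * t[1] for i in range(2)]
    return inner(t, Rt).real

# Hypothetical covariance (for example from interference-only frames) and a = V H.
R = [[2 + 0j, 0.5 - 0.5j], [0.5 + 0.5j, 1 + 0j]]
a = [1 + 0j, 0.8 + 0.6j]
t0 = min_power_vector(R, a)

# Another vector that also satisfies the constraint, for comparison.
alt = [x / inner(a, a) for x in a]
```

Both `t0` and `alt` pass the target with unit gain, but `t0` has the smaller output power for the covariance R, which is the sense in which it suppresses the interference.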
[0029]
Next, in the ICA processing unit 63b (step S3-33), the vector t_1 is updated. The objective function Γ(f) = E{G(|t_1^H Z|^2)} is maximized by the following update equation (12).
[Equation 3]

Here, g(z) is the derivative of the nonlinear function G(z) with respect to z, and the value in parentheses in the subscript represents the number of updates.
[0030]
Next, in the norm update unit 63c (step S3-34), the length of the vector t_1 is changed so that t_1 satisfies the constraint condition of equation (10). This can be realized by the following equation (13).
[Expression 4]

Next, the convergence determination unit 63d performs the convergence determination (step S3-35). If the vector has not yet converged, the update of t_1 and the change of its length are repeated. If it has converged sufficiently, the converged t_1 needed to separate and extract the target signal is output.
[0031]
The target signal extraction device according to the present invention can be implemented with a computer having a CPU, memory, and the like, a user terminal, and a computer-readable recording medium such as a CD-ROM, DVD-ROM, magnetic disk device, or semiconductor memory. The prior information H_1(f) on the target signal recorded on the recording medium, together with the target signal extraction program recorded on the recording medium or transmitted over a communication line, is read into the computer, and each of the processes described above is carried out on the computer.
The present invention can be applied not only to the extraction of a target sound source signal but also to the extraction of the signal of a target radio wave source. In that case, antennas are used as the sensors, and the observation signals from the antennas are generally converted to baseband, sampled, and processed as digital signal sequences.
[0032]
[Effect of the Invention]
The solid line 48 in FIG. 4 shows the orthogonal vector t_1 estimated by the method of the invention when the constraint condition is given with a wrong target signal direction. A vector perpendicular to axis A is estimated. Thus a vector that provides sufficient suppression performance is estimated even when the constraint condition is given with a wrong target signal direction, which demonstrates the effectiveness of the present invention.
FIG. 5 shows, in dB, the target-signal-to-interference-signal ratio (SIR) at each frequency at the output terminal 14_1. A positive value indicates that the target signal is obtained at the output terminal 14_1; a negative value indicates that a permutation problem has occurred and the interference signal is obtained at the output terminal 14_1.
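The per-frequency SIR reading described above can be sketched as follows. The sketch assumes that the powers of the target and interference components at the output are available separately for each frequency bin, as they would be in a simulation; the function names and the power values are hypothetical.

```python
import math

def sir_db(target_power, interference_power):
    """Signal-to-interference ratio in dB for one frequency bin."""
    return 10.0 * math.log10(target_power / interference_power)

def permuted_bins(target_powers, interference_powers):
    """Indices of bins with negative SIR, i.e. bins where the interference
    dominates the output and a permutation is suspected."""
    return [k for k, (p_t, p_i) in enumerate(zip(target_powers, interference_powers))
            if sir_db(p_t, p_i) < 0.0]

# Hypothetical per-bin powers at output terminal 14_1.
target_powers = [10.0, 4.0, 0.5, 8.0]
interference_powers = [1.0, 1.0, 2.0, 0.8]
bad = permuted_bins(target_powers, interference_powers)
```

In this toy data only the third bin has more interference power than target power, so it is the only bin flagged.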
[0033]
FIG. 5A shows the SIR at each frequency of the signal obtained at the output terminal 14_1 when the conventional ICA method is used. Because there is no constraint on the target signal, the permutation problem occurs conspicuously.
FIGS. 5B and 5C show the SIR at each frequency of the signal obtained at the output terminal 14_1 when the method of the present invention is used. FIG. 5B is the result for the case without reverberation in which the given direction of the target signal is off by 20 degrees from the correct angle; FIG. 5C is the result for the case in which the direction of the target signal is given correctly but reverberation is present. That is, in both FIGS. 5B and 5C an accurate constraint condition cannot be given. Nevertheless, with the method of the present invention, positive SIR values are obtained at almost all frequencies and the permutation problem hardly occurs, which shows that the method of the present invention is effective.
[Brief description of the drawings]
FIG. 1 is a block diagram showing a functional configuration example of an apparatus according to the present invention.
FIG. 2 is a flowchart showing an example of a processing procedure according to an embodiment of the method of the present invention.
FIG. 3 is a block diagram illustrating a specific functional configuration example of a separation matrix estimation unit in FIG. 1;
FIG. 4 is a view for explaining a mechanism by which the method of the present invention solves the problem.
FIG. 5 is a diagram showing the effect of the invention.
FIG. 6 is a diagram showing a model of blind sound source separation (BSS) by the ICA method.
FIG. 7 is a block diagram showing a functional configuration of a frequency domain BSS by a conventional ICA method.
FIG. 8 is a block diagram showing a detailed functional configuration of the conventional separation matrix estimation unit 22 in FIG. 7.
FIG. 9 is a diagram for explaining a problem of permutation.
FIG. 10 is a block diagram showing a functional configuration of a conventional adaptive beamformer method.
FIG. 11 is a diagram showing the arrangement of signal sources and sensors for explaining parameters used in the adaptive beamformer method.
FIG. 12 is a diagram showing a solution obtained by an adaptive beamformer method.

Claims

A target signal extraction method in which signals arriving from a plurality of directions are observed with a plurality of sensors and a target signal is extracted, based on the observation signals from the plurality of sensors, using blind signal separation in the frequency domain, the method comprising:
a procedure for converting the observation signals from the sensors into frequency-domain signals;
a procedure for calculating a separation matrix at each frequency from the frequency-domain signals by independent component analysis; and
a procedure for obtaining the target signal either by multiplying the separation matrix by the frequency-domain signals and converting the product into a time-domain signal, or by converting the separation matrix into a time-domain response and convolving that response with the observation signals,
wherein the procedure for calculating the separation matrix comprises:
a procedure for calculating, using prior knowledge of the frequency response in the frequency domain between the target signal source and the sensors, the initial value t_{1(0)}(f) of a separation vector that satisfies the constraint that the target signal is extracted without distortion, by the expression

where f represents frequency, R_z(f) represents the covariance matrix R_z(f) = E[Z(f) Z^H(f)] of Z(f), Z(f) represents the whitened signal, E[] represents the average, Z^H(f) represents the Hermitian conjugate of Z(f), V(f) represents the whitening matrix, H_1(f) represents the known frequency response between the target signal source and the sensors, and A^T represents the transpose of a matrix A;
a procedure for updating the initial value t_{1(0)}(f) of the separation vector by independent component analysis so as to further increase the non-Gaussianity of the output signal, by the expression

where t_{1(k+1)} represents the initial value of the separation vector after the update, t_{1(k)} represents the initial value of the separation vector before the update, g(z) represents the derivative of the nonlinear function G(z) with respect to z, and k represents the number of updates; and
a procedure for changing the norm of the updated vector so that it satisfies the constraint condition, obtaining the changed initial value t_{1(k+1)new} of the separation vector by the expression

and taking it as one component of the separation matrix.

The target signal extraction method according to claim 1, wherein the prior knowledge is obtained by determining the arrival delay times of the observation signals between the plurality of sensors based on a given direction of the target signal source and determining the frequency response from those delay times.

The target signal extraction method according to claim 1, wherein the prior knowledge is obtained by measuring the frequency response between the target signal source and the sensors in advance.

The target signal extraction method according to any one of claims 1 to 3, further comprising a procedure for determining, before the vector whose norm has been changed is made one component of the separation matrix, whether the changed vector has converged sufficiently; if the convergence is not sufficient, returning to the procedure for further increasing the non-Gaussianity with that vector as the initial value; and if the convergence is sufficient, making the vector one component of the separation matrix.

A target signal extraction device comprising:
a frequency domain conversion unit that receives observation signals from a plurality of sensors and converts these observation signals into frequency-domain signals;
a separation matrix estimation unit that receives the frequency-domain signals and calculates a separation matrix at each frequency from them by independent component analysis; and
either a separation calculation unit that receives the frequency-domain signals and the separation matrix and calculates, for each frequency, a separated signal matrix in which the target signal is separated, together with a time domain conversion unit that converts the separated signal matrix into time-domain signals to obtain the extracted target signal, or a time domain conversion unit that receives the separation matrix and converts it into a group of time-domain separation filter signals, together with a separation filter unit that receives the separation filter signal group and the observation signals and outputs the extracted target signal by filtering,
wherein the separation matrix estimation unit comprises:
a prior knowledge holding unit that holds, as prior knowledge, the frequency response in the frequency domain between the target signal source and the sensors;
an initial value calculation unit that calculates, from the prior knowledge and the frequency-domain signals, the initial value t_{1(0)}(f) of a separation vector that satisfies the constraint that the target signal is extracted without distortion, by the expression

where f represents frequency, R_z(f) represents the covariance matrix R_z(f) = E[Z(f) Z^H(f)] of Z(f), Z(f) represents the whitened signal, E[] represents the average, Z^H(f) represents the Hermitian conjugate of Z(f), V(f) represents the whitening matrix, H_1(f) represents the known frequency response between the target signal source and the sensors, and A^T represents the transpose of a matrix A;
an independent component analysis processing unit that receives the initial value t_{1(0)}(f) and the frequency-domain signals and obtains an updated vector so as to maximize an objective function having these as variables, by the expression

where t_{1(k+1)} represents the initial value of the separation vector after the update, t_{1(k)} represents the initial value of the separation vector before the update, g(z) represents the derivative of the nonlinear function G(z) with respect to z, and k represents the number of updates; and
a norm update unit that changes the norm of the updated vector so that it satisfies the constraint condition, obtaining the changed initial value t_{1(k+1)new} of the separation vector by the expression

and outputs it as a separation vector that is one component of the separation matrix.

A target signal extraction program for causing a computer to execute each procedure of the target signal extraction method according to claim 1.

A computer-readable recording medium on which the target signal extraction program according to claim 6 is recorded.