JP2008060635A

JP2008060635A - Blind signal extracting device, method thereof, program thereof, and recording medium stored with this program

Info

Publication number: JP2008060635A
Application number: JP2006231648A
Authority: JP
Inventors: Akiko Araki; 章子荒木; Hiroshi Sawada; 宏澤田; Shoji Makino; 昭二牧野; Cermak Jan; チェルマークヤン
Original assignee: Nippon Telegraph and Telephone Corp
Current assignee: Nippon Telegraph and Telephone Corp
Priority date: 2006-08-29
Filing date: 2006-08-29
Publication date: 2008-03-13
Anticipated expiration: 2026-08-29
Also published as: JP4738284B2

Abstract

PROBLEM TO BE SOLVED: To extract a target signal from a mixed signal using a beamformer even when the user has no prior knowledge of the impulse response vector of the target signal.

SOLUTION: Signals emitted from N signal sources are observed with M sensors 4_m, and one or more of the signals are extracted (N and M are each an integer of 2 or more). The observation signal $x_m(t)$ observed at sensor 4_m is converted into a frequency-domain signal $x_m(f, \tau)$ (5); the frequency-domain signal is normalized (22); the normalized signal is clustered into N clusters $C_n$ (24); a correlation matrix $R_J^n(f)$ of the observation signal containing only the unwanted signals is estimated from $C_n$ (25); a beamformer $w_n(f)$ is calculated from the cluster information $C_n$ and $R_J^n(f)$ (28); a target signal $y_n(f, \tau)$ is extracted from the frequency-domain signal using $w_n(f)$ (30); and $y_n(f, \tau)$ is converted into a time-domain signal (32).

COPYRIGHT: (C)2008, JPO&INPIT

Description

The present invention relates to a blind signal extraction device, method, program, and recording medium that estimate and extract a target signal in situations where the required source signal (target signal) cannot be observed directly on its own and is instead observed with other noise, interference signals, and the like superimposed on it.

First, the observation signal is modeled and the frequency-domain representation of the signals is defined; the prior art is then briefly described.
[Observation signal]
All signals are assumed to be sampled at a sampling frequency $f_s$ and represented in discrete time. Suppose that N source signals (N is an integer of 2 or more) are mixed and observed by M sensors (M is an integer of 2 or more). The present invention deals with situations in which a signal is attenuated and delayed according to the distance from its source to a sensor, and in which transmission-path distortion can arise because the signal is reflected by walls and the like. In such a situation, the source signals $s_n(t)$ ($n = 1, \ldots, N$) from the signal sources are observed at the sensors as observation signals $x_m(t)$ ($m = 1, \ldots, M$). Let $h_{mn}(u)$ be the impulse response from signal source n to sensor m, where u denotes time. The observation signal $x_m(t)$ at sensor m is the convolutive mixture of each source signal $s_n(t)$ with the corresponding impulse response $h_{mn}(u)$:
$$x_m(t) = \sum_{n=1}^{N} \sum_{u=0}^{\infty} h_{mn}(u)\, s_n(t-u) \tag{1}$$
Here we consider the situation in which no information about the source signals $s_1(t), \ldots, s_N(t)$ or the impulse responses $h_{11}(u), \ldots, h_{1N}(u), \ldots, h_{M1}(u), \ldots, h_{MN}(u)$ is available in advance. The broad object of the present invention is to separate and extract the source signals $s_1(t), \ldots, s_N(t)$ in this situation using only the observation signals $x_1(t), \ldots, x_M(t)$.
[Frequency domain expression]
In the present invention, each operation is performed in the frequency domain. To this end, a known technique such as the L-point short-time Fourier transform (L is an arbitrary integer) is applied to the observation signal $x_m(t)$ at each sensor to obtain a time series for each frequency:
$$x_m(f, \tau) = \sum_{u=-L/2}^{L/2-1} x_m(\tau+u)\, g(u)\, e^{-i 2\pi f u} \tag{2}$$
Here f is the frequency, discretized as $f = 0, f_s/L, \ldots, f_s(L-1)/L$; $\tau$ is an arbitrary time; and, as stated above, $f_s$ is the sampling frequency. $g(u)$ is a window function such as a Hanning window.
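Equation (2) can be illustrated with a short sketch. The following is a minimal, unoptimized example, assuming an even frame length L, a Hann window centered on $u = 0$, and zero padding outside the recorded samples; the function names `hann` and `stft_frame` are hypothetical and do not come from the patent.

```python
import cmath
import math


def hann(L):
    """Hann window g(u) for u = -L/2 .. L/2-1 (assumes L is even)."""
    return [0.5 * (1.0 + math.cos(2.0 * math.pi * u / L))
            for u in range(-L // 2, L // 2)]


def stft_frame(x, tau, L):
    """Return [x(f_k, tau) for k = 0..L-1], where f_k = k * fs / L.

    Direct evaluation of Equation (2). With f_k in Hz and u in samples,
    the phase 2*pi*f_k*u/fs reduces to 2*pi*k*u/L.
    """
    g = hann(L)
    spectrum = []
    for k in range(L):
        acc = 0j
        for idx, u in enumerate(range(-L // 2, L // 2)):
            t = tau + u
            # Samples outside the recording are treated as zero (assumption).
            sample = x[t] if 0 <= t < len(x) else 0.0
            acc += sample * g[idx] * cmath.exp(-2j * math.pi * k * u / L)
        spectrum.append(acc)
    return spectrum
```

A practical implementation would normally use a fast Fourier transform; the double loop is kept here only because it mirrors the summation in Equation (2) term by term.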

In the frequency domain, the time-domain convolutive mixture of Equation (1) is approximated by an instantaneous mixture at each frequency:
$$x_m(f, \tau) = \sum_{n=1}^{N} h_{mn}(f)\, s_n(f, \tau) \tag{3}$$
Here $h_{mn}(f)$ is the frequency response (impulse response) from signal source n to sensor m at frequency f, and $s_n(f, \tau)$ is the short-time Fourier transform of the source signal $s_n(t)$, obtained in the same way as in Equation (2); the same applies hereafter. Writing the observation signals $x_1(f, \tau), \ldots, x_M(f, \tau)$ of sensors 1 to M as a vector using Equation (3) gives
$$\mathbf{x}(f, \tau) = \sum_{n=1}^{N} \mathbf{h}_n(f)\, s_n(f, \tau) \tag{4}$$
Here $\mathbf{x}(f, \tau) = [x_1(f, \tau), \ldots, x_m(f, \tau), \ldots, x_M(f, \tau)]^T$ is the observation signal vector, and $\mathbf{h}_n(f) = [h_{1n}(f), \ldots, h_{mn}(f), \ldots, h_{Mn}(f)]^T$ is the vector collecting the frequency responses from signal source n to each sensor. $[A]^T$ denotes the transpose of the vector A. The same applies to the following description.
[Representative conventional technology]
The adaptive beamformer (ABF), described in Non-Patent Document 1 and elsewhere, is widely known as a typical method for extracting a target signal from a mixed signal.

FIG. 1 shows an example of the functional configuration of a conventional adaptive beamformer (hereinafter, the conventional beamformer). The signals $x_m(t)$ observed by the sensors 4_m are input to the frequency domain transform unit 5, which converts them into frequency-domain signals $x_m(f, \tau)$. All of the $x_m(f, \tau)$ ($m = 1, \ldots, M$) are input to the conventional beamformer 6.

In a system using multiple sensors, the conventional beamformer 6 is realized by estimating a filter $\mathbf{w}_n(f) = [w_{1n}(f), \ldots, w_{mn}(f), \ldots, w_{Mn}(f)]^T$ that enhances the target signal $s_n(t)$ while suppressing the unwanted signals $s_1(t), \ldots, s_{n-1}(t), s_{n+1}(t), \ldots, s_N(t)$ as far as possible.

Designing the conventional beamformer 6 assumes that the impulse response vector $\mathbf{h}_n(f)$ from the target signal source to each sensor, or the steering vector that approximates it,
$$\mathbf{a}_n(f) = [\exp(-i 2\pi f \tau_{1n}), \ldots, \exp(-i 2\pi f \tau_{Mn})]^T \tag{5}$$
is known. Here $\tau_{mn}$ is the difference between the time at which the signal from source n reaches sensor m and the time at which it reaches the origin 0. Conventionally, a sensor system arranged in a straight line as shown in FIG. 2 is often used; letting $\theta_n$ be the direction of signal source n and $d_m$ be the coordinate of sensor 4_m relative to sensor 4_1, $\tau_{mn}$ is given by $\tau_{mn} = d_m \cos\theta_n / c$, where c is the propagation speed of the signal.
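The steering vector of Equation (5) for a linear array can be sketched as follows. This is only an illustration: the function name `steering_vector`, the default sound speed of 340 m/s, and the example geometry are assumptions, not values taken from the patent.

```python
import cmath
import math


def steering_vector(f, positions, theta, c=340.0):
    """Steering vector a_n(f) of Equation (5) for a linear array.

    f         : frequency in Hz
    positions : coordinates d_m of each sensor along the array axis (metres),
                measured from the reference sensor
    theta     : direction theta_n of the source in radians
    c         : propagation speed in m/s (340 m/s is an assumed value for sound)
    """
    # tau_mn = d_m * cos(theta_n) / c, as given after Equation (5)
    return [cmath.exp(-2j * math.pi * f * d * math.cos(theta) / c)
            for d in positions]
```

For a source at broadside ($\theta_n = \pi/2$) every delay is zero, so all elements are 1; for a source on the array axis ($\theta_n = 0$) the phase of element m is $-2\pi f d_m / c$.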

Returning to FIG. 1, the conventional beamformer 6, which suppresses the unwanted signals, estimates the filter (vector) $\mathbf{w}_n(f)$ that minimizes the output power $A'(\mathbf{w}_n(f))$ given by
$$\begin{aligned}
A'(\mathbf{w}_n(f)) &= E\{|y_n(f, \tau)|^2\} \\
&= E\{y_n(f, \tau)\, y_n^{*}(f, \tau)\} \\
&= E\{\mathbf{w}_n^{H}(f)\, \mathbf{x}(f, \tau)\, \mathbf{x}^{H}(f, \tau)\, \mathbf{w}_n(f)\} \\
&= \mathbf{w}_n^{H}(f)\, \mathbf{R}_x(f)\, \mathbf{w}_n(f)
\end{aligned} \tag{6}$$
Here $E\{\cdot\}$ denotes averaging over time $\tau$, $A^{*}$ is the complex conjugate of A, $\mathbf{R}_x(f) = E\{\mathbf{x}(f, \tau)\, \mathbf{x}^{H}(f, \tau)\}$ is the correlation matrix of the observation signal, and $[A]^{H}$ is the conjugate transpose of the matrix (vector) A. $y_n(f, \tau)$ is the output of the conventional beamformer 6 and is given by
$$y_n(f, \tau) = \mathbf{w}_n^{H}(f)\, \mathbf{x}(f, \tau) \tag{7}$$
To avoid the trivial solution $\mathbf{w}_n(f) = \mathbf{0} = [0, \ldots, 0]^T$, the following constraint, which requires the target signal to be obtained without distortion, is imposed:
$$\mathbf{w}_n^{H}(f)\, \mathbf{h}_n(f) = 1 \tag{8}$$
The problem of finding the $\mathbf{w}_n(f)$ that satisfies Equation (8) and minimizes $A'(\mathbf{w}_n(f))$ in Equation (6) can then be written, using a Lagrange multiplier p, as
$$A(\mathbf{w}_n(f)) = A'(\mathbf{w}_n(f)) + p\,(\mathbf{w}_n^{H}(f)\, \mathbf{h}_n(f) - 1) \tag{9}$$
Solving Equation (9) gives the conventional beamformer 6 as
$$\mathbf{w}_n(f) = \frac{\mathbf{R}_x^{-1}(f)\, \mathbf{h}_n(f)}{\mathbf{h}_n^{H}(f)\, \mathbf{R}_x^{-1}(f)\, \mathbf{h}_n(f)} \tag{10}$$
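The step from Equation (9) to the beamformer can be spelled out as a short derivation. The following is a sketch of the standard Lagrange-multiplier argument, assuming that $\mathbf{R}_x(f)$ is Hermitian and invertible and differentiating with respect to $\mathbf{w}_n^{*}$; the frequency argument f is dropped for brevity.

```latex
\begin{aligned}
\frac{\partial A}{\partial \mathbf{w}_n^{*}}
  &= \mathbf{R}_x \mathbf{w}_n + p\,\mathbf{h}_n = \mathbf{0}
  \;\Rightarrow\; \mathbf{w}_n = -p\,\mathbf{R}_x^{-1}\mathbf{h}_n \\
\mathbf{w}_n^{H}\mathbf{h}_n = 1
  &\;\Rightarrow\; -p^{*}\,\mathbf{h}_n^{H}\mathbf{R}_x^{-1}\mathbf{h}_n = 1
  \;\Rightarrow\; p = -\frac{1}{\mathbf{h}_n^{H}\mathbf{R}_x^{-1}\mathbf{h}_n} \\
\mathbf{w}_n
  &= \frac{\mathbf{R}_x^{-1}\mathbf{h}_n}{\mathbf{h}_n^{H}\mathbf{R}_x^{-1}\mathbf{h}_n}
\end{aligned}
```

The multiplier p is real because $\mathbf{h}_n^{H}\mathbf{R}_x^{-1}\mathbf{h}_n$ is real for a Hermitian $\mathbf{R}_x$. The last line is the familiar minimum-variance distortionless-response form, which satisfies the constraint of Equation (8) by construction.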

In the conventional beamformer 6 (the conventional adaptive beamformer), the impulse response vector $\mathbf{h}_n(f)$ in Equation (10) is ideally one that has been measured in advance, stored in the impulse response storage unit 10, and read out for use. In practice, however, it is common to store the steering vector $\mathbf{a}_n(f)$ of Equation (5) in the steering vector storage unit 12 and to use the read-out steering vector in place of the impulse response vector $\mathbf{h}_n(f)$ in Equation (10).

In real environments, however, the impulse response vector $\mathbf{h}_n(f)$ or the steering vector $\mathbf{a}_n(f)$ is rarely given correctly, and minimizing $A(\mathbf{w}_n(f))$ in Equation (9) then often fails to minimize only the unwanted signals. For this reason, as described in Non-Patent Document 2 and elsewhere, it is common to replace the correlation matrix $\mathbf{R}_x(f)$ of the mixed (observation) signal with the correlation matrix $\mathbf{R}_J(f) = E\{\boldsymbol{\xi}(f, \tau)\, \boldsymbol{\xi}^{H}(f, \tau)\}$ of the signal $\boldsymbol{\xi}(f, \tau)$ in time intervals containing only the unwanted signals. This is known to give higher performance than using $\mathbf{R}_x(f)$. In other words, for the conventional beamformer it is desirable that the correlation matrix over intervals containing only the unwanted signals (intervals in which the target sound is absent) can be estimated accurately.

The conventional beamformer 6 outputs the output signal $y_n(f, \tau)$ according to Equation (7), using $\mathbf{w}_n(f)$ from Equation (10) and the observation signal vector $\mathbf{x}(f, \tau)$. The output signal $y_n(f, \tau)$ is input to the time domain transform unit 8, where it is converted from the frequency domain to the time domain to generate $y_n(t)$.
Non-Patent Document 1: Haykin, S., Adaptive Filter Theory, Kagaku Gijutsu Shuppan, 2001, pp. 690-693
Non-Patent Document 2: Juro Ohga, Yoshio Yamasaki, and Yutaka Kaneda, Acoustic Systems and Digital Processing, edited by the Institute of Electronics, Information and Communication Engineers, Corona Publishing, pp. 190-191

As described above, the conventional beamformer requires the impulse response vector $\mathbf{h}_n(f)$ from the target signal source to each sensor, or the steering vector $\mathbf{a}_n(f)$ that approximates it. That is, it has the drawback of requiring prior knowledge about the target signal. Moreover, these vectors are difficult to obtain correctly in a real environment, and the performance of the conventional beamformer degrades significantly when the prior knowledge deviates from the impulse response vector $\mathbf{h}_n(f)$ in the actual usage environment.
In addition, achieving high performance requires estimating the correlation matrix $\mathbf{R}_J(f)$ of the signal over time intervals containing only the unwanted signals, but this is very difficult when the unwanted signals are non-stationary.

In a signal extraction device that observes signals emitted from N signal sources with M sensors and extracts one or more target signals from the observed signals (N and M are integers of 2 or more), the observation signals observed by the M sensors are converted into frequency-domain signals; the frequency-domain signals are normalized to calculate normalized observation signal vectors; the normalized observation signal vectors are clustered into N clusters; an unwanted-signal correlation matrix, which is the correlation matrix of the observation signal containing only the unwanted signals, is estimated from the cluster information; a beamformer is calculated from the cluster information and the unwanted-signal correlation matrix; the target signal is extracted from the frequency-domain signals using the beamformer; and the extracted target signal is converted into a time-domain signal.

With the above configuration, the target signal can be separated and extracted accurately without measuring the impulse response vector or the steering vector in advance, even when the unwanted signals are non-stationary.

The best mode for carrying out the invention is described below.

FIG. 3 shows an example of the functional configuration of the present invention, and FIG. 4 shows its main processing flow. Components with the same function as in FIG. 1 are given the same reference numerals, and duplicate description is omitted. The same applies hereafter.

The present invention also assumes that the signals are sparse. A sparse signal is one that is zero at most times $\tau$. Sparsity is observed, for example, in speech signals. Under this assumption, even when several source signals are present, they can be assumed not to overlap at any time-frequency point $(f, \tau)$, so that at most one source is active at each point. Equation (4) can then be written as

$$\mathbf{x}(f, \tau) = \mathbf{h}_n(f)\, s_n(f, \tau) \tag{11}$$
where $\mathbf{h}_n(f)$ is the impulse response vector and $s_n(f, \tau)$ is the source signal present at $(f, \tau)$.

Each observation signal $x_m(t)$ ($m = 1, \ldots, M$) picked up by sensor 4_m is input to the frequency domain transform unit 5. There, each observation signal $x_m(t)$ is converted from the time domain to the frequency domain, for example by the short-time Fourier transform described above, to give $x_m(f, \tau)$ (step S2). The $x_m(f, \tau)$ are then output as the observation signal vector $\mathbf{x}(f, \tau)$. The observation signal vector $\mathbf{x}(f, \tau)$ is input to the normalization unit 22, which calculates the normalized observation signal vector (step S4). Specifically, for the observation signal vector $\mathbf{x}(f, \tau) = [x_1(f, \tau), \ldots, x_M(f, \tau)]^T$, the argument (phase) is first normalized according to Equation (12).

Norm normalization is then performed as follows:

$$\bar{\mathbf{x}}(f, \tau) \leftarrow \bar{\mathbf{x}}(f, \tau) / \|\bar{\mathbf{x}}(f, \tau)\| \tag{13}$$
Here $\bar{\mathbf{x}}(f, \tau)$ denotes the normalized observation signal vector, $\arg(r)$ the argument of r, i the imaginary unit, $|r|$ the absolute value of r, $\|r\|$ the norm of r, Q the index of the reference sensor ($Q \in \{1, \ldots, M\}$), c the propagation speed of the signal, and $\alpha$ an arbitrary positive constant. The most preferable choice is $\alpha = 4 d_{\max}$, where $d_{\max}$ is the maximum distance between the arbitrarily chosen reference sensor Q and any other sensor, although other values of $\alpha$ may also be used.
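The effect of the two normalization steps can be illustrated with a small sketch. Equation (12) is not reproduced in this text, so the phase normalization below is an assumed form, chosen to be consistent with the symbols defined above (reference sensor Q, speed c, constant $\alpha$) and with the centroid expression of Equation (16): the phase of each element relative to the reference sensor is divided by $\alpha f / c$. The function name `normalize` and the default values are likewise hypothetical.

```python
import cmath
import math


def normalize(x, f, Q=0, alpha=0.16, c=340.0):
    """Normalize one observation vector x(f, tau).

    x     : list of M complex values x_m(f, tau); x[Q] must be non-zero
    f     : frequency in Hz (f > 0)
    Q     : index of the reference sensor (0-based here)
    alpha : positive constant; 0.16 corresponds to 4 * d_max with d_max = 4 cm
    c     : propagation speed in m/s (assumed value)
    """
    ref = x[Q]
    # Phase normalization (ASSUMED form of Equation (12)): keep |x_m| and
    # divide the phase relative to the reference sensor by alpha * f / c.
    xb = [abs(xm) * cmath.exp(1j * cmath.phase(xm / ref) / (alpha * f / c))
          for xm in x]
    # Norm normalization, Equation (13).
    norm = math.sqrt(sum(abs(v) ** 2 for v in xb))
    return [v / norm for v in xb]
```

Under the sparse model of Equation (11), multiplying the observation vector by any non-zero complex source value $s_n(f, \tau)$ leaves the output of this sketch unchanged, which is the property that allows vectors from the same source to form a cluster.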

From Equations (11) to (13), the normalized observation signal vector $\bar{\mathbf{x}}(f, \tau)$ can be expressed by Equation (14).

Here $A_n = \left(\sum_{m=1}^{M} |h_{mn}|^2\right)^{1/2}$, and it can be seen that $\bar{\mathbf{x}}(f, \tau)$ depends only on the impulse response information for the signal $s_n(f, \tau)$.

The normalized observation signal vectors $\bar{\mathbf{x}}(f, \tau)$ for all time-frequency points are input to the clustering unit 24 and clustered into N clusters (step S6). This clustering can be performed effectively using, for example, the k-means method; for details, see R. O. Duda, P. E. Hart, and D. G. Stork, Pattern Classification, 2nd edition, Wiley Interscience, 2000. The clustering method is described concretely below.

The storage unit 26 stores in advance the normalized observation signal vectors $\bar{\mathbf{x}}(f, \tau)$ given by Equation (14) and the initial centroid values $\mathbf{c}_n^{j}$ ($j = 0$, $n = 1, \ldots, N$).
The clustering unit 24 reads the normalized observation signal vectors $\bar{\mathbf{x}}(f, \tau)$ from the storage unit 26 and clusters them to generate N clusters $C_1, \ldots, C_N$. That is, the normalized observation signal vectors $\bar{\mathbf{x}}(f, \tau)$, which are M-dimensional complex vectors, are clustered directly in M-dimensional complex space by the following procedure.

1. Read the initial centroid values $\mathbf{c}_n^{j}$ ($j = 0$) of the clusters from the storage unit 26. Each initial centroid is a vector of the same dimension as the normalized observation signal vector $\bar{\mathbf{x}}(f, \tau)$ (an M-dimensional complex vector). How to choose the initial values $\mathbf{c}_n^{0}$ is described later.
2. Let j + 1 be the new j.
3. Assign the normalized observation signal vector $\bar{\mathbf{x}}(f, \tau)$ at every time-frequency point $(f, \tau)$ to the cluster $C_n$ represented by the nearest centroid $\mathbf{c}_n^{j-1}$. That is, for each normalized vector $\bar{\mathbf{x}}(f, \tau)$, choose the n that minimizes $\|\bar{\mathbf{x}}(f, \tau) - \mathbf{c}_n^{j-1}\|$.
4. Compute the mean of the normalized observation signal vectors $\bar{\mathbf{x}}(f, \tau)$ assigned to each cluster $C_n$ and update the centroid by scaling that mean to unit norm. That is, for the vectors assigned to each cluster $C_n$, compute
$$\mathbf{c}_n^{j} = E\{\bar{\mathbf{x}}(f, \tau)\}_n / \|E\{\bar{\mathbf{x}}(f, \tau)\}_n\| \tag{15}$$
where $E\{\cdot\}_n$ denotes averaging over the members of cluster $C_n$.
5. Repeat steps 2 to 5 until the centroids $\mathbf{c}_n^{j}$ converge. The centroids at convergence are stored in the storage unit 26 as $\mathbf{c}_n$ ($n = 1, \ldots, N$). This completes the clustering procedure.
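The clustering steps above can be sketched as follows. This is a minimal illustration rather than the patented implementation: the function names, the convergence tolerance `tol`, the iteration cap `max_iter`, and the choice to leave the centroid of an empty cluster unchanged are all assumptions of this sketch.

```python
import math


def unit(v):
    """Scale a complex vector to unit norm (assumes v is not the zero vector)."""
    n = math.sqrt(sum(abs(z) ** 2 for z in v))
    return [z / n for z in v]


def dist(a, b):
    """Euclidean distance between two complex vectors."""
    return math.sqrt(sum(abs(p - q) ** 2 for p, q in zip(a, b)))


def kmeans_unit(vectors, centroids, max_iter=100, tol=1e-9):
    """k-means with unit-norm centroids, following steps 1 to 5.

    vectors   : normalized observation vectors (lists of complex numbers)
    centroids : initial centroids c_n^0, one per cluster
    Returns (labels, centroids).
    """
    centroids = [unit(c) for c in centroids]
    labels = []
    for _ in range(max_iter):
        # Step 3: assign every vector to its nearest centroid.
        labels = [min(range(len(centroids)),
                      key=lambda n: dist(v, centroids[n])) for v in vectors]
        # Step 4: mean of each cluster, rescaled to unit norm (Equation (15)).
        updated = []
        for n, old in enumerate(centroids):
            members = [v for v, lab in zip(vectors, labels) if lab == n]
            if not members:
                updated.append(old)  # empty cluster: keep old centroid (assumption)
                continue
            mean = [sum(col) / len(members) for col in zip(*members)]
            updated.append(unit(mean))
        # Step 5: stop once no centroid moves by more than tol.
        shift = max(dist(a, b) for a, b in zip(updated, centroids))
        centroids = updated
        if shift < tol:
            break
    return labels, centroids
```

In use, `vectors` would hold the normalized observation vectors for all time-frequency points and `centroids` the N initial values; the returned labels then indicate which cluster each time-frequency point belongs to.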

Next, examples of how to choose the initial centroid values are described.
<< Initial value setting method 1 >>
N vectors are chosen at random from the normalized observation signal vectors $\bar{\mathbf{x}}(f, \tau)$ and used as the initial centroid values $\mathbf{c}_n^{0}$ ($n = 1, \ldots, N$).
<< Initial value setting method 2 >>
Assuming $|h_{mn}(f)| = 1$ for all m and n in Equations (11) to (15), the m-th element of the centroid can be written as in Equation (16), which is used for initialization:
$$\begin{aligned}
\{\mathbf{c}_n\}_m &= E[\bar{x}_m(f, \tau)]_n \\
&= \exp[i 2\pi (\mathbf{d}_m - \mathbf{d}_Q)^T \mathbf{v}_n / \alpha] / M^{1/2} \\
&= \exp[i 2\pi \|\mathbf{d}_m - \mathbf{d}_Q\| \cos\Theta_n^{mQ} / \alpha] / M^{1/2}
\end{aligned} \tag{16}$$
Here $\mathbf{d}_m$ is the position vector of sensor 4_m, $\mathbf{v}_n$ is the direction-of-arrival vector of the signal $s_n(t)$, shown as the thick vector in FIG. 5, and $\Theta_n^{mQ}$ is its angle relative to the axis connecting sensor 4_m and the reference sensor 4_Q. $\mathbf{v}_n$ is a unit vector, $\|\mathbf{v}_n\| = 1$.

The sensor positions $\mathbf{d}_m$ ($m = 1, \ldots, M$) are the values held in the storage unit 26, and the azimuth $\theta_n$ and elevation angle $\phi_n$ ($n = 1, \ldots, N$) are assigned suitable values. Because the sensor positions $\mathbf{d}_m$, azimuths $\theta_n$, and elevation angles $\phi_n$ only serve to set initial values, approximate values suffice. For example, $\theta_n = 2\pi n / N$ and $\phi_n = 0$ give initial values that are spread out spatially.
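Initial value setting method 2 can be sketched as follows, using the second line of Equation (16) with $\theta_n = 2\pi n / N$ and $\phi_n = 0$. The mapping from azimuth and elevation to the unit vector $\mathbf{v}_n$ (ordinary spherical coordinates) and the function name `initial_centroids` are assumptions of this sketch.

```python
import cmath
import math


def initial_centroids(positions, Q, N, alpha):
    """Initial centroids c_n^0 from Equation (16).

    positions : list of (x, y, z) sensor coordinates d_m in metres
    Q         : index of the reference sensor (0-based here)
    N         : number of sources / clusters
    alpha     : the positive constant used in the normalization
    """
    M = len(positions)
    d_ref = positions[Q]
    centroids = []
    for n in range(1, N + 1):
        theta, phi = 2.0 * math.pi * n / N, 0.0
        # Unit direction vector v_n from azimuth and elevation (assumed convention).
        v = (math.cos(theta) * math.cos(phi),
             math.sin(theta) * math.cos(phi),
             math.sin(phi))
        c = []
        for d in positions:
            # (d_m - d_Q)^T v_n
            proj = sum((dk - qk) * vk for dk, qk, vk in zip(d, d_ref, v))
            c.append(cmath.exp(2j * math.pi * proj / alpha) / math.sqrt(M))
        centroids.append(c)
    return centroids
```

Each centroid produced this way has unit norm, matching the unit-norm centroids maintained by the update of Equation (15).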

Returning to FIG. 3, each cluster $C_n$ found by the clustering unit 24 corresponds to a source signal $s_n(f, \tau)$. As can be understood from Equation (14), its centroid $\bar{\mathbf{c}}_n = E\{\bar{\mathbf{x}}(f, \tau)\}_{\bar{\mathbf{x}}(f, \tau) \in C_n}$ represents the impulse response information for the source signal $s_n(f, \tau)$.

Each cluster $C_n$ is input to the unwanted-signal correlation matrix estimation unit 25. For each cluster $C_n$, this unit uses the cluster information to estimate, by Equations (17) and (18), the correlation matrix of the unwanted-signal intervals for the source signal $s_n(f, \tau)$, that is, the unwanted-signal correlation matrix $\mathbf{R}_J^{n}(f)$, which is the correlation matrix of the observation signal containing only the unwanted signals (step S8).

The estimated unwanted-signal correlation matrix $\mathbf{R}_J^{n}(f)$ and the cluster centroid information $\bar{\mathbf{c}}_n$ from the clustering unit 24 are input to the beamformer calculation unit 28, which calculates the beamformer $\mathbf{w}_n(f)$ from the cluster information $C_n$ and the unwanted-signal correlation matrix $\mathbf{R}_J^{n}(f)$ (step S10). A concrete method for calculating the beamformer $\mathbf{w}_n(f)$ is described in detail in Embodiment 2.
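Equations (17) and (18) are not reproduced in this text, so the following sketch shows only one plausible way to obtain a correlation matrix of the unwanted signals from cluster information. It assumes that, at a fixed frequency f, the observation vectors whose time-frequency points were not assigned to cluster $C_n$ contain only unwanted signals, and it averages their outer products. The function name and the handling of the empty case are hypothetical.

```python
def interference_correlation(frames, labels, n):
    """Estimate an unwanted-signal correlation matrix for target cluster n.

    frames : observation vectors x(f, tau) at ONE frequency f, one per frame tau
    labels : labels[k] is the cluster index assigned to frames[k]
    n      : index of the target cluster
    ASSUMPTION: frames not labelled n are treated as unwanted-signal-only.
    """
    M = len(frames[0])
    R = [[0j] * M for _ in range(M)]
    others = [x for x, lab in zip(frames, labels) if lab != n]
    for x in others:
        for a in range(M):
            for b in range(M):
                R[a][b] += x[a] * x[b].conjugate()  # outer product x x^H
    count = max(len(others), 1)  # avoid division by zero if no frame qualifies
    return [[v / count for v in row] for row in R]
```

Because the selection is made per time-frequency point rather than per time interval, an estimate of this kind does not need an interval in which the target is silent, which is consistent with the stated aim of handling non-stationary unwanted signals.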

The calculated beamformer $\mathbf{w}_n(f)$ is input to the target signal extraction unit 30, which uses it to compute Equation (19) and thereby extracts the target signal $y_n(f, \tau)$ from the frequency-domain observation signal $\mathbf{x}(f, \tau)$ (step S12):
$$y_n(f, \tau) = \mathbf{w}_n^{H}(f)\, \mathbf{x}(f, \tau) \tag{19}$$
Applying Equations (17) to (19) for all n ($n = 1, \ldots, N$) extracts all N signals. All $y_n(f, \tau)$ are input to the time domain transform unit 32, where the target signals $y_n(f, \tau)$ extracted by the target signal extraction unit 30 are converted into time-domain signals $y_n(t)$, for example by the inverse short-time Fourier transform, a known technique (step S14).

Next, Embodiment 2 of the present invention is described. Embodiment 2 is an example in which the beamformer calculation unit 28 described in Embodiment 1 is configured in more detail. FIG. 6 shows an example of the functional configuration of the beamformer calculation unit 28 of Embodiment 2 and the parts related to it. Parts not shown in FIG. 6 perform the same processing as described in Embodiment 1; the same applies to the following embodiments. The beamformer calculation unit 28 consists of an impulse response estimation unit 40 and an adaptive beamformer calculation unit 42.

The impulse response estimation unit 40 estimates the impulse response vector $\mathbf{h}_n(f)$ of the target signal from the centroid information $\bar{\mathbf{c}}_n$ of cluster $C_n$. Specifically, it estimates the impulse response vector for the source signal $s_n(f, \tau)$ by denormalizing the centroid information $\bar{\mathbf{c}}_n$ supplied by the clustering unit 24.

First, the centroid of cluster $C_n$ is given by Equation (15), and $\bar{\mathbf{x}}(f, \tau)$ can be expressed by Equation (14). Assuming $|h_{mn}| = 1$ for all m and n, the m-th element $\bar{c}_{mn}$ of the centroid $\bar{\mathbf{c}}_n$ satisfies Equation (20).

Solving Equation (20) for $h_{mn}(f)$ yields Equation (21).

The right-hand side of Equation (20) is a normalized form of the impulse response $h_{mn}(f)$. Because Equation (21) works backward from Equation (20) to recover the impulse response vector $\mathbf{h}_n(f)$, Equation (21) is referred to as denormalization.

Returning to FIG. 6, the estimated impulse response vector $\mathbf{h}_n(f)$ and the matrix $\mathbf{R}_J^{n}(f)$ from the unwanted-signal correlation matrix estimation unit 25 are input to the adaptive beamformer calculation unit 42.

The adaptive beamformer calculation unit 42 calculates the adaptive beamformer $\mathbf{w}_n(f)$ from the impulse response vector $\mathbf{h}_n(f)$ and the unwanted-signal correlation matrix $\mathbf{R}_J^{n}(f)$. Specifically, the adaptive beamformer $\mathbf{w}_n(f)$ can be calculated by Equation (22):
$$\mathbf{w}_n(f) = \frac{(\mathbf{R}_J^{n}(f))^{-1}\, \mathbf{h}_n(f)}{\mathbf{h}_n^{H}(f)\, (\mathbf{R}_J^{n}(f))^{-1}\, \mathbf{h}_n(f)} \tag{22}$$
Equation (22) is obtained by replacing $\mathbf{R}_x(f)$ in Equation (10) with $\mathbf{R}_J^{n}(f)$.
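The beamformer calculation and the extraction of Equation (19) can be sketched together. The sketch below assumes the minimum-variance distortionless-response form $\mathbf{w} = \mathbf{R}^{-1}\mathbf{h} / (\mathbf{h}^{H}\mathbf{R}^{-1}\mathbf{h})$ that follows from the constraint of Equation (8), and assumes that the correlation matrix is invertible; no regularization is applied. The function names `solve`, `mvdr`, and `apply_beamformer` are hypothetical.

```python
def solve(A, b):
    """Solve A z = b for a complex square matrix A.

    Gaussian elimination with partial pivoting; assumes A is non-singular.
    """
    n = len(b)
    aug = [list(A[r]) + [b[r]] for r in range(n)]
    for col in range(n):
        piv = max(range(col, n), key=lambda r: abs(aug[r][col]))
        aug[col], aug[piv] = aug[piv], aug[col]
        for r in range(col + 1, n):
            factor = aug[r][col] / aug[col][col]
            for k in range(col, n + 1):
                aug[r][k] -= factor * aug[col][k]
    z = [0j] * n
    for r in range(n - 1, -1, -1):
        s = aug[r][n] - sum(aug[r][k] * z[k] for k in range(r + 1, n))
        z[r] = s / aug[r][r]
    return z


def mvdr(R, h):
    """Beamformer w = R^{-1} h / (h^H R^{-1} h) at one frequency."""
    z = solve(R, h)                                            # z = R^{-1} h
    denom = sum(hm.conjugate() * zm for hm, zm in zip(h, z))   # h^H R^{-1} h
    return [zm / denom for zm in z]


def apply_beamformer(w, x):
    """Output y = w^H x for one time-frequency point (Equation (19))."""
    return sum(wm.conjugate() * xm for wm, xm in zip(w, x))
```

Passing the unwanted-signal correlation matrix as `R` corresponds to the beamformer of this embodiment, while passing the correlation matrix of the observation signal corresponds to the conventional beamformer. In either case `apply_beamformer(w, h)` equals 1 up to rounding, which is the distortionless constraint of Equation (8).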

The target signal extraction unit 30 applies the adaptive beamformer $\mathbf{w}_n(f)$ of Equation (22) and the frequency-domain signal $\mathbf{x}(f, \tau)$ to Equation (19) to extract the target signal $y_n(f, \tau)$.

次に実施例２の変形例として実施例２’を示す。インパルス応答推定部４０による、上記式（２２）を用いたインパルス応答ベクトルｈ_ｎ（ｆ）の推定を出力する代わりに、ステアリングベクトルａ_ｎ（ｆ）を推定して、出力させることも考えられる。信号ｓ_ｎ（ｆ、τ）のステアリングベクトルを上記式（５）と同様に
ａ_ｎ（ｆ）＝［ｅｘｐ（−ｉ２πｆτ_１ｎ）、・・・、ｅｘｐ（−ｉ２πｆτ_Ｍｎ）］^Ｔ (２３)
とすると、ステアリングベクトルａ_ｎ（ｆ）はインパルス応答ベクトルｈ_ｎ（ｆ）の推定であるから、上記式（２１）と上記式（２３）の位相項を比較すると、τ_ｍｎ（ｍ＝１、．．．、Ｍ）は以下の式（２４）で推定できる。
τ＾_ｍｎ＝αｃ^−１ａｒｇ［ｃ⁻ _ｍｎｃ⁻ _Ｑｎ］／２π （２４）
この式（２４）の計算をインパルス応答推定部４０’で行う。 Next, Example 2′ is shown as a modification of Example 2. Instead of having the impulse response estimation unit 40 output the estimate of the impulse response vector h_n(f) obtained with the above equation (22), it is also conceivable to estimate and output a steering vector a_n(f). Let the steering vector of the signal s_n(f, τ) be defined, in the same way as the above equation (5), as
$a_n(f) = [\exp(-i 2\pi f \tau_{1n}), \ldots, \exp(-i 2\pi f \tau_{Mn})]^T$ (23)
Then, since the steering vector a_n(f) is an estimate of the impulse response vector h_n(f), comparing the phase terms of equation (21) and equation (23) shows that τ_mn (m = 1, ..., M) can be estimated by the following equation (24).
$\hat{\tau}_{mn} = \alpha c^{-1} \arg[\bar{c}_{mn} \bar{c}_{Qn}] / 2\pi$ (24)
The calculation of equation (24) is performed by the impulse response estimation unit 40′.

τ＾_ｍｎを用いたステアリングベクトルａ_ｎ（ｆ）をインパルス応答ベクトルｈ_ｎ（ｆ）の推定として出力する。すなわち上記式（２３）（２４）からステアリングベクトルａ_ｎ（ｆ）は以下の式（２７）で表すことができ、インパルス応答推定部４０’からインパルス応答ｈ＾_ｎ（ｆ）として出力される。
ａ_ｎ（ｆ）＝［ｅｘｐ（−ｉ２πｆτ＾_１ｎ）、・・・、ｅｘｐ（−ｉ２πｆτ＾_Ｍｎ）］^Ｔ≒ｈ＾_ｎ（ｆ）（２５）
式（２５）のｈ＾_ｎ（ｆ）とＲ^ｎ _Ｊから、上記式（２２）で適応型ビームフォーマを計算する。 The steering vector a_n(f) computed with τ̂_mn is output as the estimate of the impulse response vector h_n(f). That is, from the above equations (23) and (24), the steering vector a_n(f) can be expressed by the following equation (25), and it is output from the impulse response estimation unit 40′ as the impulse response estimate ĥ_n(f).
$a_n(f) = [\exp(-i 2\pi f \hat{\tau}_{1n}), \ldots, \exp(-i 2\pi f \hat{\tau}_{Mn})]^T \approx \hat{h}_n(f)$ (25)
The adaptive beamformer is calculated by the above equation (22) from ĥ_n(f) of equation (25) and R^n_J(f).
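
As an illustration of equations (23) and (25), the steering vector can be built directly from a set of delays. The sketch below is only an example: the function name `steering_vector` and the delay values are hypothetical, and the delays would in practice come from equation (24).

```python
import numpy as np

def steering_vector(f, tau):
    """a(f) = [exp(-i 2 pi f tau_1), ..., exp(-i 2 pi f tau_M)]^T for one source.

    f   : frequency in Hz
    tau : (M,) delays in seconds, one per sensor
    """
    tau = np.asarray(tau, dtype=float)
    return np.exp(-1j * 2 * np.pi * f * tau)

# Hypothetical delays for M = 3 sensors at f = 1 kHz.
a = steering_vector(1000.0, [0.0, 1e-4, 2.5e-4])
```

Every element has unit magnitude, which matches the assumption |h_mn| = 1 made when deriving the centroid relation.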

また、上記式（２２）を使用して不要信号相関行列推定部２５で適応型ビームフォーマｗ_ｎ（ｆ）を推定する際に、不要信号相関行列Ｒ^ｎ _Ｊ（ｆ）を使用して、音響伝達特性つまり、インパルス応答ベクトルｈ_ｎ（ｆ）やステアリングベクトルａ_ｎ（ｆ）が既知の場合はそれらを使用しても良い。また、式（２２）中の不要信号相関行列Ｒ^ｎ _Ｊ（ｆ）の代わりに、観測信号の相関行列Ｒ_ｘ（ｆ）を使用しても良い。これらのことは、以下の実施例においても同様である。 In addition, when the adaptive beamformer w_n(f) is estimated by the above equation (22) using the unnecessary signal correlation matrix R^n_J(f) from the unnecessary signal correlation matrix estimation unit 25, the acoustic transfer characteristics, that is, the impulse response vector h_n(f) or the steering vector a_n(f), may be used directly if they are known. Further, the correlation matrix R_x(f) of the observation signal may be used in place of the unnecessary signal correlation matrix R^n_J(f) in equation (22). The same applies to the following embodiments.

実施例２で示した適応型ビームフォーマでは、Ｎ≦Ｍの場合には、高い性能を得ることが出来るが、Ｎ＞Ｍの場合には、性能が限られることが問題であった。具体的には、適応型ビームフォーマは不要信号の数がＭ−１個以下であれば、効果的に目的信号ｙ_ｎ（ｆ、τ）を抽出できるが、Ｍ個以上であると、その効果が不十分であることが知られている。そこで実施例３では、Ｎ＞Ｍの場合、即ち、Ｎ−１（＞Ｍ−１）個の不要信号がある場合にも目的信号を抽出できることを示す。実施例３の機能構成例を図７に示す。実施例３は、実施例２と比較して、不要信号選択部４９、入力信号推定部５０が追加され、不要信号相関行列推定部２５、インパルス応答推定部４０、適応型ビームフォーマ計算部４２、の処理が変更されている。 In the adaptive beamformer shown in the second embodiment, high performance can be obtained when N ≦ M. However, when N> M, the performance is limited. Specifically, the adaptive beamformer can effectively extract the target signal y _n (f, τ) if the number of unnecessary signals is M−1 or less. Is known to be inadequate. Therefore, the third embodiment shows that the target signal can be extracted even when N> M, that is, when there are N−1 (> M−1) unnecessary signals. A functional configuration example of the third embodiment is shown in FIG. In the third embodiment, an unnecessary signal selection unit 49 and an input signal estimation unit 50 are added as compared with the second embodiment, and an unnecessary signal correlation matrix estimation unit 25, an impulse response estimation unit 40, an adaptive beamformer calculation unit 42, The processing of has been changed.

不要信号選択部４９で、Ｋ個の不要信号についての不要信号相関行列Ｒ^ｎ _Ｊ（ｆ）を推定する。ここで、Ｋは、Ｋ≦Ｍ−１を満たす整数とする。つまり、目的信号ｓ_ｎ（ｆ、τ）に相当するクラスタＣ_ｎ以外の不要信号に相当するクラスタＣ_Ｌ（Ｌ＝１、．．．、ｎ−１、ｎ＋１、．．．、Ｎ）からＫ個のクラスタを選び、これらのクラスタをＣ_Ｊとする。
Ｋ個のクラスタの選び方として、クラスタＣ_Ｌ中で、クラスタメンバが多いものから順にＫ個のクラスタを選ぶ方法や、以下の式（２６）で表されるξ_Ｌ（ｆ、τ）のパワーが大きいものから順にＫ個のクラスタを選ぶ方法等が考えられる。

An unnecessary signal selection unit 49 estimates an unnecessary signal correlation matrix R ⁿ _J (f) for K unnecessary signals. Here, K is an integer that satisfies K ≦ M−1. That is, from clusters C _L (L = 1,..., N−1, n + 1,..., N) corresponding to unnecessary signals other than the cluster C _n corresponding to the target signal s _n (f, τ) to K. Choose clusters and let these clusters be C _J.
The K clusters may be selected from the clusters C_L, for example, by taking the K clusters with the most cluster members, or by taking the K clusters in descending order of the power of ξ_L(f, τ) given by the following equation (26).

不要信号選択部４９で選択されたＫ個のクラスタＣ_Ｊを用いて、Ｋ個の不要信号についての不要信号相関行列Ｒ^ｎ _Ｊ（ｆ）を以下の式（２７）（２８）で計算する。

Using the K clusters C _J selected by the unnecessary signal selection unit 49, the unnecessary signal correlation matrix R ⁿ _J (f) for the K unnecessary signals is calculated by the following equations (27) and (28).
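
The selection by cluster size and the subsequent correlation estimate can be sketched as follows. Equations (27) and (28) are not reproduced in this text, so the sketch assumes the correlation matrix is a plain average of x xᴴ over the frames assigned to the selected clusters. The function names, the per-frame `labels` array, and the data are illustrative assumptions.

```python
import numpy as np

def select_unwanted_clusters(labels, target, K):
    """Return the ids of the K largest clusters other than the target cluster."""
    ids, counts = np.unique(labels[labels != target], return_counts=True)
    order = np.argsort(-counts, kind="stable")   # most members first
    return ids[order[:K]]

def unwanted_correlation(X, labels, selected):
    """Average x x^H over frames whose cluster label is in `selected`.

    X      : (M, T) complex observations at one frequency bin
    labels : (T,) cluster index assigned to each frame
    """
    Xj = X[:, np.isin(labels, selected)]
    return Xj @ Xj.conj().T / max(Xj.shape[1], 1)

# Toy data: N = 4 clusters, M = 3 sensors, so K must satisfy K <= M - 1 = 2.
labels = np.array([0, 1, 1, 2, 2, 2, 3, 0, 1, 2])
rng = np.random.default_rng(1)
X = rng.standard_normal((3, 10)) + 1j * rng.standard_normal((3, 10))
chosen = select_unwanted_clusters(labels, target=0, K=2)
R_J = unwanted_correlation(X, labels, chosen)
```

Selecting by the power of ξ_L(f, τ) instead would only change the ranking criterion inside `select_unwanted_clusters`.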

また、入力信号推定部５０において、不要信号選択部４９で選択したＫ個の不要信号と目的信号Ｃ_ｎとが混合したビームフォーマ入力信号ベクトルｘ（ｆ、τ）を推定する。これは、不要信号クラスタＣ_Ｊと目的信号クラスタＣ_ｎを用いて、以下の式（２９）で得ることができる。

適応型ビームフォーマ計算部４２では、上記式（２７）（２８）で得られた不要信号相関行列Ｒ^ｎ _Ｊ（ｆ）とインパルス応答推定部４０または４０’よりのインパルス応答ベクトルｈ_ｎ（ｆ）を用いて、上記式（２２）を用いて、適応型ビームフォーマｗ_ｎ（ｆ）を計算する。 Further, the input signal estimation unit 50 estimates a beamformer input signal vector x(f, τ) in which the K unnecessary signals selected by the unnecessary signal selection unit 49 and the target signal are mixed. It can be obtained from the unnecessary signal clusters C_J and the target signal cluster C_n by the following equation (29).

The adaptive beamformer calculation unit 42 calculates the adaptive beamformer w_n(f) by the above equation (22), using the unnecessary signal correlation matrix R^n_J(f) obtained by the above equations (27) and (28) and the impulse response vector h_n(f) from the impulse response estimation unit 40 or 40′.

目的信号抽出部３０では、入力信号推定部５０よりのビームフォーマ入力信号ｘ（ｆ、τ）が入力される。目的信号抽出部３０では適応型ビームフォーマ計算部４２よりの適応型ビームフォーマｗ_ｎ（ｆ）と、ビームフォーマ入力信号ベクトルｘ（ｆ、τ）を用いて、上記式（１９）を計算して、目的信号ｙ_ｎ（ｆ、τ）を抽出する。 The target signal extraction unit 30 receives the beamformer input signal x (f, τ) from the input signal estimation unit 50. The target signal extraction unit 30 calculates the above equation (19) using the adaptive beamformer w _n (f) from the adaptive beamformer calculation unit 42 and the beamformer input signal vector x (f, τ). , The target signal y _n (f, τ) is extracted.

実施例３では上述の通り、Ｎ＞Ｍの場合を説明した。しかし、Ｎ≦Ｍの場合でも実施できる。この場合は、実施例２と比較して、入力信号推定部５０と不要信号選択部４９の処理の分だけ余計にコストがかかる。 In the third embodiment, the case where N> M is described as described above. However, it can be implemented even when N ≦ M. In this case, compared with the second embodiment, an extra cost is required for the processing of the input signal estimation unit 50 and the unnecessary signal selection unit 49.

実施例４は、センサの位置情報が既知の場合の実施例である。実施例４の機能構成例の一部を図８に示す。実施例２もしくは３で説明したインパルス応答推定部４０は到来方向推定部６０とインパルス応答計算部６２により構成されている。また、Ｍ個のセンサの位置を表すセンサ位置情報を記憶しているセンサ位置情報記憶部６４を有する。 The fourth embodiment is an embodiment for the case where the position information of the sensors is known. FIG. 8 shows a part of a functional configuration example of the fourth embodiment. The impulse response estimation unit 40 described in the second or third embodiment is composed of an arrival direction estimation unit 60 and an impulse response calculation unit 62. The configuration also has a sensor position information storage unit 64 that stores sensor position information representing the positions of the M sensors.

到来方向推定部６０で、信号の到来方向を推定する。到来方向推定部６０には、クラスタリング部２４よりのセントロイド情報ｃ⁻ _ｎとセンサ位置情報記憶部６４よりの各センサの位置を表す３次元ベクトルｄ_ｍ（ｍ＝１、．．．、Ｍ）が入力される。信号ｓ_ｎの到来方向を表す長さ１の３次元ベクトルをｖ_ｎ（ｎ＝１、．．．、Ｎ）とすると、信号ｓ_ｎの到来方向の推定値はセントロイド情報ｃ⁻ _ｎを用いて、以下の式（３０）で計算することができる。 The arrival direction estimation unit 60 estimates the arrival direction of each signal. It receives the centroid information c̄_n from the clustering unit 24 and, from the sensor position information storage unit 64, the three-dimensional vectors d_m (m = 1, ..., M) representing the position of each sensor. Letting v_n (n = 1, ..., N) be the three-dimensional vector of length 1 representing the arrival direction of the signal s_n, the estimate of the arrival direction of s_n can be calculated from the centroid information c̄_n by the following equation (30).

ｖ_ｎ＝αＤ^＋ａｒｇ［ｃ⁻ _ｎ］／２π （３０）
ここで、Ｄ＝［ｄ_１−ｄ_Ｑ、．．．、ｄ_ｍ−ｄ_Ｑ、．．．、ｄ_Ｍ−ｄ_Ｑ］^Ｔであり、ｄ_Ｑは基準として、任意に選択したセンサ４Ｑの位置を表す３次元ベクトルであり、Ｄ^＋は、Ｄの一般化逆行列を表す。
$v_n = \alpha D^{+} \arg[\bar{c}_n] / 2\pi$ (30)
Here, $D = [d_1 - d_Q, \ldots, d_m - d_Q, \ldots, d_M - d_Q]^T$, where d_Q is the three-dimensional vector representing the position of the sensor 4Q, arbitrarily selected as the reference, and D⁺ denotes the generalized inverse matrix of D.
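
Equation (30) can be evaluated with a pseudo-inverse, as in the sketch below. The value of α, the sensor layout, and the synthetic centroid (including its sign convention and the four-sensor non-coplanar geometry) are assumptions chosen only so that the example is self-checking. They are not taken from the original description.

```python
import numpy as np

def estimate_direction(centroid, sensor_pos, ref, alpha):
    """Evaluate v = alpha * pinv(D) @ arg(centroid) / (2 pi), following eq. (30).

    centroid   : (M,) complex centroid of one cluster
    sensor_pos : (M, 3) sensor coordinates d_m
    ref        : index Q of the reference sensor
    alpha      : scaling constant from the normalization step (value assumed)
    """
    D = sensor_pos - sensor_pos[ref]              # rows are d_m - d_Q
    return alpha * np.linalg.pinv(D) @ np.angle(centroid) / (2 * np.pi)

# Four sensors on a 4 cm grid (hypothetical layout giving D full column rank).
sensors = 0.04 * np.array([[0, 0, 0], [1, 0, 0], [0, 1, 0], [0, 0, 1]], dtype=float)
v_true = np.array([1.0, 2.0, 2.0]) / 3.0          # unit-length direction
alpha = 1.0
D = sensors - sensors[0]
# Noise-free synthetic centroid whose phases encode D @ v_true.
centroid = np.exp(1j * 2 * np.pi * (D @ v_true) / alpha) / 2.0
v_est = estimate_direction(centroid, sensors, ref=0, alpha=alpha)
```

With a planar array such as the three-microphone triangle used in the experiments, D has rank 2 and the pseudo-inverse returns only the component of the direction lying in the array plane.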

次にインパルス応答計算部６２において、信号ｓ_ｎの到来方向とセンサ位置情報とを用いて、インパルス応答を計算する。インパルス応答計算部６２には、到来方向推定部６０よりの信号ｓ_ｎの到来方向の推定値ｑ_ｎと、センサ位置情報記憶部６４よりのセンサ位置情報ｄ_Ｑが入力される。インパルス応答計算部６２は以下の式（３１）で表す信号ｓ_ｎについてのステアリングベクトルａ_ｎ＾（ｆ）の推定値を求める。このステアリングベクトルａ_ｎ＾（ｆ）をインパルス応答ベクトルの推定値ｈ_ｎ（ｆ）として計算する。

そして、ステアリングベクトルａ_ｎ＾（ｆ）（インパルス応答ベクトルの推定値ｈ_ｎ（ｆ））がインパルス応答計算部６２から出力され、適応型ビームフォーマ計算部４２に入力される。 Next, the impulse response calculation unit 62 calculates the impulse response using the arrival direction of the signal s_n and the sensor position information. The impulse response calculation unit 62 receives the arrival direction estimate q_n of the signal s_n from the arrival direction estimation unit 60 and the sensor position information d_Q from the sensor position information storage unit 64. It obtains the estimate â_n(f) of the steering vector for the signal s_n, expressed by the following equation (31), and uses this steering vector â_n(f) as the estimate h_n(f) of the impulse response vector.

The steering vector a _n ^ (f) (impulse response vector estimation value h _n (f)) is output from the impulse response calculation unit 62 and input to the adaptive beamformer calculation unit 42.

本実施例では、適応型ビームフォーマの代わりに、最大利得ビームフォーマを用いる構成を示す。最大利得ビームフォーマとは、センサアレイ出力における目的信号を最大にしつつセンサアレイ出力における不要信号成分を最小にするようなフィルタｗ_ｎ（ｆ）をビームフォーマとする方法である。（Ｄ．Ｈ．ＪｏｈｎｓｏｎａｎｄＤ．Ｅ．Ｄｕｄｇｅｏｎ，“ＡｒｒａｙＳｉｇｎａｌＰｒｏｃｅｓｓｉｎｇＣｏｎｃｅｐｔｓａｎｄＴｅｃｈｎｉｑｕｅｓ”，ＰｒｅｎｔｉｃｅＨａｌｌ，１９９３．）
最大利得ビームフォーマにおいては、センサアレイ出力中の目的信号成分と不要信号成分を推定することが１つのポイントとなるが、不要信号が非定常信号である場合、不要信号を推定することは非常に困難であるという問題があった。実施例５では、この問題をスパース性の仮定を用いることで解決する。つまり（１）目的信号のみの観測信号の相関行列である目的信号相関行列Ｒ_Ｔ ^ｎ（ｆ）と、不要信号のみの観測信号の相関行列である不要信号相関行列Ｒ_Ｊ ^ｎ（ｆ）とを推定すること、（２）最大利得ビームフォーマ計算部で、目的信号相関行列Ｒ_Ｔ ^ｎ（ｆ）と不要信号相関行列Ｒ_Ｊ ^ｎ（ｆ）とから最大利得ビームフォーマｗ_ｎ（ｆ）を推定することにより解決することが出来る。 This embodiment shows a configuration that uses a maximum gain beamformer instead of the adaptive beamformer. The maximum gain beamformer is a method that uses as the beamformer a filter w_n(f) that maximizes the target signal in the sensor array output while minimizing the unnecessary signal components in the sensor array output (D. H. Johnson and D. E. Dudgeon, "Array Signal Processing: Concepts and Techniques", Prentice Hall, 1993).
In the maximum gain beamformer, a key point is to estimate the target signal component and the unnecessary signal component in the sensor array output. However, when the unnecessary signal is non-stationary, estimating it has been very difficult. The fifth embodiment solves this problem by using a sparsity assumption. Specifically, the problem is solved by (1) estimating the target signal correlation matrix R_T^n(f), which is the correlation matrix of the observation signal containing only the target signal, and the unnecessary signal correlation matrix R_J^n(f), which is the correlation matrix of the observation signal containing only the unnecessary signals, and (2) having the maximum gain beamformer calculation unit estimate the maximum gain beamformer w_n(f) from R_T^n(f) and R_J^n(f).

また、最大利得ビームフォーマでは「目的信号の歪みを最小にする」という上記式拘束条件（８）が無いため、各周波数ｆにおいて、様々な、ゲイン特性を持つビームフォーマｗ_ｎ（ｆ）が構成される。これは、例えば、音声信号のような広帯域信号に最大利得ビームフォーマを適用した場合、出力がｗ_ｎ（ｆ）の周波数特性により歪んでしまうことを意味する。このため、従来は最大利得ビームフォーマを広帯域信号に用いることは困難であった。実施例５では、観測信号ベクトルｘ（ｆ、τ）と最大利得ビームフォーマｗ_ｎ（ｆ）の出力信号との誤差が最小となるように、最大利得ビームフォーマｗ_ｎ（ｆ）を補正することでこれを解決する。 Further, since the maximum gain beamformer does not have the above-mentioned constraint condition (8) of “minimizing distortion of the target signal”, beamformers w _n (f) having various gain characteristics are formed at each frequency f. Is done. This means that, for example, when the maximum gain beamformer is applied to a wideband signal such as an audio signal, the output is distorted by the frequency characteristic of w _n (f). For this reason, conventionally, it has been difficult to use the maximum gain beamformer for wideband signals. In the fifth embodiment, the maximum gain beamformer w _n (f) is corrected so that the error between the observation signal vector x (f, τ) and the output signal of the maximum gain beamformer w _n (f) is minimized. To solve this.

まず、最大利得ビームフォーマの原理を簡単に説明する。上述したように「センサアレイ出力における目的信号を最大にしつつセンサアレイ出力における不要信号成分を最小にする」との条件より評価関数は以下の式（３２）となる。

First, the principle of the maximum gain beamformer will be briefly described. As described above, the evaluation function is expressed by the following expression (32) based on the condition that “the target signal at the sensor array output is maximized while the unnecessary signal component at the sensor array output is minimized”.

ここで、分母は不要信号の出力パワー、分子は目的信号の出力パワーであり、Ｒ_Ｔ ^ｎ（ｆ）は目的信号のみの観測信号の相関行列、Ｒ_Ｊ ^ｎ（ｆ）は不要信号のみの観測信号の相関行列である。また、（Ｒ_Ｊ ^ｎ（ｆ））^１／２＝ＥＦ^１／２Ｅ^Ｈで表すことが出来、ここで、Ｅ＝［ｅ_１、．．．ｅ_ｍ］であり、ｅ_ｉはＲ_Ｊ ^ｎ（ｆ）の固有ベクトルであり、Ｆ＝ｄｉａｇ（λ_１、．．．、λ_Ｍ）であり、λ_ｉはｅ_ｉに対応するＲ_Ｊ ^ｎの固有値とし、ｗ^〜＝（Ｒ_Ｊ ^ｎ（ｆ））^１／２ｗ_ｎとすると、上記式（３２）は以下の式（３３）に変えることが出来る。

Here, the denominator is the output power of the unnecessary signals and the numerator is the output power of the target signal; R_T^n(f) is the correlation matrix of the observation signal containing only the target signal, and R_J^n(f) is the correlation matrix of the observation signal containing only the unnecessary signals. Further, $(R_J^n(f))^{1/2} = E F^{1/2} E^H$, where $E = [e_1, \ldots, e_M]$, e_i is an eigenvector of R_J^n(f), $F = \mathrm{diag}(\lambda_1, \ldots, \lambda_M)$, and λ_i is the eigenvalue of R_J^n(f) corresponding to e_i. Setting $\tilde{w} = (R_J^n(f))^{1/2} w_n$, the above equation (32) can be rewritten as the following equation (33).

ここで「児玉、須田、“システム制御のためのマトリクス理論、コロナ社、１９９５」に記載のレイリー商の定理より、ｇ（ｗ^〜）の最大値は、（Ｒ_Ｊ ^ｎ（ｆ））^−１／２（Ｒ_Ｔ ^ｎ（ｆ））（Ｒ_Ｊ ^ｎ（ｆ））^−１／２の最大固有値λで与えられ、対応する固有ベクトルをｅとすると、最大値はｍａｘｇ（ｗ^〜）＝λ＝ｇ（ｅ）となる。すなわち求める最大利得ビームフォーマｗ_ｎは以下の式（３４）（３５）で表すことができる。
ｗ^〜＝ｅ（３４）
ｗ_ｎ＝（Ｒ_Ｊ ^ｎ（ｆ））^−１／２ｅ（３５）
実施例５の機能構成例を図９に示す。実施例１と比較して、観測信号相関行列推定部７２が追加され、ビームフォーマ計算部２８は、目的信号相関行列推定部７０、固有ベクトル計算部７４、最大利得ビームフォーマ計算部７６、補正ベクトル計算部７８、補正部８０、とで構成されている。 Here, by the Rayleigh quotient theorem described in Kodama and Suda, "Matrix Theory for System Control", Corona, 1995, the maximum value of g(w̃) is given by the largest eigenvalue λ of $(R_J^n(f))^{-1/2} R_T^n(f) (R_J^n(f))^{-1/2}$; letting e be the corresponding eigenvector, the maximum is max g(w̃) = λ = g(e). That is, the desired maximum gain beamformer w_n can be expressed by the following equations (34) and (35).
$\tilde{w} = e$ (34)
$w_n = (R_J^n(f))^{-1/2} e$ (35)
FIG. 9 shows a functional configuration example of the fifth embodiment. Compared with the first embodiment, an observation signal correlation matrix estimation unit 72 is added, and the beamformer calculation unit 28 is composed of a target signal correlation matrix estimation unit 70, an eigenvector calculation unit 74, a maximum gain beamformer calculation unit 76, a correction vector calculation unit 78, and a correction unit 80.

目的信号相関行列推定部７０で、クラスタの情報から目的信号ｓ_ｎ（ｆ、τ）のみの時間区間の相関行列を以下の式（３６）（３７）で推定する。

ここで、Ｃ_ｎは目的信号に対応するクラスタである。不要信号相関行列推定部２５よりの不要信号相関行列Ｒ_Ｊ ^ｎ（ｆ）と、目的信号相関行列Ｒ_Ｔ ^ｎ（ｆ）とが、固有ベクトル計算部７４に入力される。固有ベクトル計算部７４で（Ｒ_Ｊ ^ｎ（ｆ））^−１／２（Ｒ_Ｔ ^ｎ（ｆ））（Ｒ_Ｊ ^ｎ（ｆ））^−１／２の最大固有ベクトルｅ_ｎ（ｆ）を上記で説明したレイリー商の定理より、計算する。 The target signal correlation matrix estimation unit 70 estimates the correlation matrix of the time interval of only the target signal s _n (f, τ) from the cluster information using the following equations (36) and (37).

Here, C_n is the cluster corresponding to the target signal. The unnecessary signal correlation matrix R_J^n(f) from the unnecessary signal correlation matrix estimation unit 25 and the target signal correlation matrix R_T^n(f) are input to the eigenvector calculation unit 74. The eigenvector calculation unit 74 calculates, in accordance with the Rayleigh quotient theorem described above, the eigenvector e_n(f) corresponding to the largest eigenvalue of $(R_J^n(f))^{-1/2} R_T^n(f) (R_J^n(f))^{-1/2}$.

不要信号相関行列推定部２５よりのＲ_Ｊ ^ｎ（ｆ）と固有ベクトル計算部７４よりのｅ_ｎ（ｆ）とが最大利得ビームフォーマ計算部７６に入力される。最大利得ビームフォーマ計算部７６では、以下の式（３８）より最大利得ビームフォーマｗ_ｎ(ｆ)を計算する。
ｗ_ｎ(ｆ)＝（Ｒ_Ｊ ^ｎ(ｆ)）^−1/2ｅ_ｎ(ｆ) （３８）
この式（３８）は、上記式（３５）に基づいている。 R _J ⁿ (f) from the unnecessary signal correlation matrix estimation unit 25 and e _n (f) from the eigenvector calculation unit 74 are input to the maximum gain beamformer calculation unit 76. The maximum gain beamformer calculation unit 76 calculates the maximum gain beamformer w _n (f) from the following equation (38).
$w_n(f) = (R_J^n(f))^{-1/2} e_n(f)$ (38)
This equation (38) is based on the above equation (35).
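
The eigenvector step and equation (38) can be sketched as follows for a single frequency bin. The function names and the synthetic correlation matrices are illustrative assumptions, and R_J is assumed to be positive definite so that its inverse square root exists.

```python
import numpy as np

def inv_sqrt_hermitian(R):
    """R^{-1/2} = E diag(lam^{-1/2}) E^H for a Hermitian positive definite R."""
    lam, E = np.linalg.eigh(R)
    return (E / np.sqrt(lam)) @ E.conj().T

def max_gain_beamformer(R_T, R_J):
    """w = R_J^{-1/2} e, where e is the principal eigenvector of
    R_J^{-1/2} R_T R_J^{-1/2}; also returns the largest eigenvalue."""
    R_is = inv_sqrt_hermitian(R_J)
    lam, vecs = np.linalg.eigh(R_is @ R_T @ R_is)
    return R_is @ vecs[:, -1], lam[-1]   # eigh returns eigenvalues in ascending order

def gain(w, R_T, R_J):
    """Ratio of target output power to unnecessary-signal output power."""
    return (w.conj() @ R_T @ w).real / (w.conj() @ R_J @ w).real

# Synthetic example: rank-one target, full-rank interference (made-up data).
rng = np.random.default_rng(2)
h = rng.standard_normal(3) + 1j * rng.standard_normal(3)
R_T = np.outer(h, h.conj())
B = rng.standard_normal((3, 40)) + 1j * rng.standard_normal((3, 40))
R_J = B @ B.conj().T / 40
w, lam_max = max_gain_beamformer(R_T, R_J)
w_other = rng.standard_normal(3) + 1j * rng.standard_normal(3)
```

The power ratio attained by `w` equals the largest eigenvalue, and no other filter, such as the arbitrary `w_other`, exceeds it.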

一方、観測信号相関行列推定部７２で、観測信号ベクトルｘ（ｆ、τ）の相関行列である観測信号相関行列Ｒ_ｘ（ｆ）を以下の式（３９）を用いて推定する。
Ｒ_ｘ（ｆ）＝Ｅ｛ｘ（ｆ、τ）ｘ^Ｈ（ｆ、τ）｝（３９）
補正ベクトル計算部７８に、最大利得ビームフォーマ計算部７６よりの最大利得ビームフォーマｗ_ｎ（ｆ）と、観測信号相関行列推定部７２よりの観測信号相関行列Ｒ_ｘ（ｆ）が入力される。補正ベクトル計算部７８では、最大利得ビームフォーマｗ_ｎ（ｆ）を補正するための補正ベクトルα_ｎ（ｆ）を生成する。この補正は、最大利得ビームフォーマｗ_ｎ（ｆ）が出力に与える歪みが最小になるよう最大利得ビームフォーマｗ_ｎ（ｆ）を変換する。例えば、以下の式（４０）であらわされる観測信号ベクトルｘ（ｆ、τ）と出力信号ベクトルｙ_ｎ（ｆ、τ）との誤差Ａを最小にする補正ベクトルα_ｎ（ｆ）を計算する。
Ａ(α_ｎ(ｆ))＝Ｅ｛‖ｘ(ｆ、τ)−α_ｎ(ｆ)ｙ_ｎ(ｆ、τ)‖^２｝（４０）
ここで、ｙ_ｎ(ｆ、τ)は最大利得ビームフォーマｗ_ｎ（ｆ）の出力ｙ_ｎ（ｆ、τ）＝ｗ_ｎ（ｆ）ｘ（ｆ、τ）である。
上記式（４０）の右辺を展開すると、
Ａ(α_ｎ(ｆ))＝｛Ｅ［‖ｘ(ｆ、τ)‖］｝^２−α_ｎ(ｆ)Ｅ［ｘ^Ｈ（ｆ、τ）ｙ_ｎ(ｆ、τ)］−α_ｎ ^Ｈ(ｆ)Ｅ［ｙ_ｎ(ｆ、τ)^＊ｘ（ｆ、τ）］
＋α_ｎα_ｎ ^ＨＥ［│ｙ_ｎ(ｆ、τ)│^２］（４１）
式（４１）において、両辺をα_ｎ ^Ｈ(ｆ)で偏微分すると、以下の式（４２）になる。
∂Ａ(α_ｎ(ｆ))／∂ α_ｎ ^Ｈ(ｆ)＝
−Ｅ［ｙ_ｎ(ｆ、τ)^＊ｘ（ｆ、τ）］＋α_ｎＥ［│ｙ_ｎ(ｆ、τ)│^２］（４２）
上記式（４２）の左辺を０とおき、α_ｎについて求めると、以下の式（４３）になる。
α_ｎ(ｆ)＝Ｅ［ｙ_ｎ(ｆ、τ)^＊ｘ（ｆ、τ）］／Ｅ［│ｙ_ｎ(ｆ、τ)│^２］
（４３）
ここで、上記式（１９）と上記式（３９）より上記式（４３）は以下の式（４４）になる。 On the other hand, the observation signal correlation matrix estimation unit 72 estimates an observation signal correlation matrix R _x (f) that is a correlation matrix of the observation signal vector x (f, τ) using the following equation (39).
R _x (f) = E {x (f, τ) x ^H (f, τ)} (39)
The maximum gain beamformer w _n (f) from the maximum gain beamformer calculation unit 76 and the observation signal correlation matrix R _x (f) from the observation signal correlation matrix estimation unit 72 are input to the correction vector calculation unit 78. The correction vector calculator 78 generates a correction vector α _n (f) for correcting the maximum gain beamformer w _n (f). This correction converts the maximum gain beamformer w _n (f) so that the distortion that the maximum gain beamformer w _n (f) gives to the output is minimized. For example, a correction vector α _n (f) that minimizes an error A between the observed signal vector x (f, τ) and the output signal vector y _n (f, τ) expressed by the following equation (40) is calculated.
$A(\alpha_n(f)) = E\{\| x(f,\tau) - \alpha_n(f)\, y_n(f,\tau) \|^2\}$ (40)
Here, y_n(f, τ) is the output of the maximum gain beamformer w_n(f), that is, $y_n(f,\tau) = w_n^H(f)\, x(f,\tau)$.
When the right side of the above equation (40) is expanded,
$A(\alpha_n(f)) = E[\|x(f,\tau)\|^2] - E[y_n(f,\tau)\, x^H(f,\tau)]\,\alpha_n(f) - \alpha_n^H(f)\, E[y_n^*(f,\tau)\, x(f,\tau)] + \alpha_n^H(f)\,\alpha_n(f)\, E[|y_n(f,\tau)|^2]$ (41)
In equation (41), when both sides are partially differentiated by α _n ^H (f), the following equation (42) is obtained.
∂A (α _n (f)) / ∂ α _n ^H (f) =
−E [y _n (f, τ) ^* x (f, τ)] + α _n E [| y _n (f, τ) | ² ] (42)
When the left side of the equation (42) is set to 0 and α _n is obtained, the following equation (43) is obtained.
α _n (f) = E [y _n (f, τ) ^* x (f, τ)] / E [| y _n (f, τ) | ² ]
(43)
Here, from the above equation (19) and the above equation (39), the above equation (43) becomes the following equation (44).

ここで、上述したように、Ｒ_ｘ（ｆ）は観測信号ベクトルｘ（ｆ、τ）の相関行列である。上記式（４４）から理解されるように、最大利得ビームフォーマｗ_ｎ（ｆ）と観測信号ベクトルｘ（ｆ、τ）を用いて、補正ベクトル計算部７８では、補正ベクトルα_ｎ(ｆ)が計算される。

Here, as described above, R _x (f) is a correlation matrix of the observed signal vector x (f, τ). As can be understood from the above equation (44), the correction vector α _n (f) is calculated by the correction vector calculator 78 using the maximum gain beamformer w _n (f) and the observed signal vector x (f, τ). Calculated.
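
Equation (44) is not reproduced in this text. The sketch below assumes the closed form obtained by substituting y_n = w_nᴴx into equation (43), namely α_n = R_x w_n / (w_nᴴ R_x w_n), and checks it against a direct sample evaluation of equation (43). The function name and the random data are illustrative assumptions.

```python
import numpy as np

def correction_vector(w, R_x):
    """alpha = R_x w / (w^H R_x w): assumed closed form of the correction vector."""
    return R_x @ w / (w.conj() @ R_x @ w).real

# Made-up observations: M = 3 sensors, T = 200 frames at one frequency bin.
rng = np.random.default_rng(3)
X = rng.standard_normal((3, 200)) + 1j * rng.standard_normal((3, 200))
w = rng.standard_normal(3) + 1j * rng.standard_normal(3)

R_x = X @ X.conj().T / X.shape[1]        # sample version of eq. (39)
alpha = correction_vector(w, R_x)

# Direct sample evaluation of eq. (43): E[y* x] / E[|y|^2] with y = w^H x.
y = w.conj() @ X
alpha_43 = (X * y.conj()).mean(axis=1) / np.mean(np.abs(y) ** 2)
```

Because only R_x and w_n enter the closed form, the correction can be computed once per frequency bin without revisiting individual frames.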

補正部８０は、最大利得ビームフォーマｗ_ｎ（ｆ）に対し、補正ベクトルα_ｎ(ｆ)を用いて、周波数歪みを補正し、補正ビームフォーマを計算する。具体的には以下の式（４５）により補正して補正ビームフォーマｗ_ｎ’(ｆ)を求めることが出来る。
ｗ_ｎ’(ｆ)＝［α_ｎ(ｆ)］_Ｂｗ_ｎ(ｆ) （４５）
ここで、Ｂは任意のセンサの番号であり、Ｂ∈｛１、．．．、Ｍ｝であり、［ｑ］_ＢはベクトルｑのＢ番目の要素であることを示している。 The correction unit 80 corrects the frequency distortion for the maximum gain beamformer w _n (f) using the correction vector α _n (f), and calculates a corrected beamformer. Specifically, the corrected beamformer w _n ′ (f) can be obtained by correction using the following equation (45).
w _n ′ (f) = [α _n (f)] _B w _n (f) (45)
Here, B is the number of an arbitrary sensor, B ∈ {1, ..., M}, and [q]_B denotes the B-th element of the vector q.

目的信号抽出部３０では、補正ビームフォーマｗ_ｎ’(ｆ)を用いて、以下の式（４６）で目的信号ｙ_ｎ（ｆ、τ）を抽出する。
ｙ_ｎ（ｆ、τ）＝ｗ_ｎ’^Ｈ(ｆ)ｘ（ｆ、τ）（４６）
また、実施例５の変形例の機能構成例を図１０に示す。ビームフォーマ計算部２８は目的信号相関行列推定部７０、固有ベクトル計算部７４、最大利得ビームフォーマ計算部７６、とで構成され、目的信号抽出部３０は信号抽出部８１と歪み補正部８２とで構成されている。 The target signal extraction unit 30 extracts the target signal y _n (f, τ) by the following equation (46) using the corrected beam former w _n ′ (f).
y _n (f, τ) = w _n ′ ^H (f) x (f, τ) (46)
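
Applying a beamformer of the form y(f, τ) = wᴴ(f) x(f, τ) to every frequency bin and frame can be written compactly, as sketched below. The array shapes and the function name `apply_beamformer` are assumptions made for illustration.

```python
import numpy as np

def apply_beamformer(W, X):
    """Compute y(f, tau) = w(f)^H x(f, tau) for all bins and frames.

    W : (F, M) beamformer coefficients, one vector per frequency bin
    X : (F, M, T) frequency-domain observations
    returns : (F, T) extracted signal
    """
    return np.einsum("fm,fmt->ft", W.conj(), X)

# Made-up sizes: F = 5 bins, M = 3 sensors, T = 7 frames.
rng = np.random.default_rng(4)
W = rng.standard_normal((5, 3)) + 1j * rng.standard_normal((5, 3))
X = rng.standard_normal((5, 3, 7)) + 1j * rng.standard_normal((5, 3, 7))
Y = apply_beamformer(W, X)
```

The resulting time-frequency signal would then be passed to the time-domain conversion step.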
FIG. 10 shows a functional configuration example of a modification of the fifth embodiment. The beamformer calculation unit 28 is composed of a target signal correlation matrix estimation unit 70, an eigenvector calculation unit 74, and a maximum gain beamformer calculation unit 76, and the target signal extraction unit 30 is composed of a signal extraction unit 81 and a distortion correction unit 82.

最大利得ビームフォーマ計算部７６よりの最大利得ビームフォーマｗ_ｎ（ｆ）と、周波数領域変換部５よりの観測信号ベクトルｘ（ｆ、τ）とは、信号抽出部８１に入力される。信号抽出部８１では、以下の式（４７）を計算して、歪みを含んだ目的信号ｙ_ｎ（ｆ、τ）を抽出する。
ｙ_ｎ（ｆ、τ）＝ｗ_ｎ ^Ｈ(ｆ)ｘ（ｆ、τ）（４７）
歪みを含んだ目的信号ｙ_ｎ（ｆ、τ）は歪み補正部８２に入力される。
また、補正ベクトル計算部７８よりの補正ベクトルα_ｎ(ｆ)も歪み補正部８２に入力される。歪み補正部８２では、以下の式（４８）で出力信号を変換することで、歪みを補正して補正出力信号ｙ_ｎ’（ｆ、τ）を出力する。
ｙ_ｎ’(ｆ、τ)＝［α_ｎ(ｆ)］_Ｂｙ_ｎ(ｆ、τ) （４８）
なお、以上で説明した実施例１〜５では、全てのｎについて信号を抽出するとしてきたが、単独の信号（１つのｎ）についてのみ、ビームフォーマを構成するだけでもよい。目的信号の選択については、例えば、データベース上の目的信号のインパルス応答ベクトルｈ_ｄと発明法により全ての音源ｎについて推定されたインパルス応答ベクトルｈ_ｎを比較して、最もｈ_ｄに近いｈ_ｎを持つ音源ｎを選ぶことで選択できる。例えば、ｍｉｎ_ｎ（ｈ_１・ｈ_ｎ）などのアルゴリズムが考えられる。選ばれたｎについてのみ実施例２〜５で説明したビームフォーマ計算部２８による上記式（２４）などを用いたビームフォーマを構成すれば、目的信号についての適応型ビームフォーマを得ることができる。 The maximum gain beamformer w _n (f) from the maximum gain beamformer calculation unit 76 and the observation signal vector x (f, τ) from the frequency domain conversion unit 5 are input to the signal extraction unit 81. The signal extraction unit 81 calculates the following equation (47) to extract the target signal y _n (f, τ) including distortion.
y _n (f, τ) = w _n ^H (f) x (f, τ) (47)
The target signal y _n (f, τ) including distortion is input to the distortion correction unit 82.
The correction vector α _n (f) from the correction vector calculation unit 78 is also input to the distortion correction unit 82. The distortion correction unit 82 converts the output signal according to the following equation (48) to correct the distortion and output a corrected output signal y _n ′ (f, τ).
$y_n'(f,\tau) = [\alpha_n(f)]_B \, y_n(f,\tau)$ (48)
In the first to fifth embodiments described above, signals are extracted for all n; however, a beamformer may be constructed for only a single signal (one n). The target signal can be selected, for example, by comparing the impulse response vector h_d of the target signal held in a database with the impulse response vectors h_n estimated for all sound sources n by the method of the invention, and choosing the sound source n whose h_n is closest to h_d. For example, an algorithm such as min_n (h_1 · h_n) is conceivable. If a beamformer is constructed by the beamformer calculation unit 28 described in the second to fifth embodiments, using the above equation (24) or the like, only for the selected n, an adaptive beamformer for the target signal can be obtained.

［実験結果］
上記実施例の効果を示すために、実験を行った。図１１に示す部屋で測定したインパルス応答を複数の音声に畳み込み混合することで、混合信号を模した。実験条件は図１１に示す通りである。長辺が８８０ｃｍ、短辺が３７５ｃｍ、高さが２４０ｃｍ、残響は１２０ｍｓの室内において、底面の長辺から２００ｃｍ、短辺から２８２ｃｍの位置に３つのセンサ４１、４２、４３を配置した。長辺と平行軸をｘ、短辺と平行軸をｙとし、図１２に示すように、３つのセンサ４１、４２、４３をｙ軸上に２個、ｘ軸上に１個、辺の長さ４ｃｍの正三角形の頂点につまり２次元に配した場合の実験を行う。またセンサとしてはマイクロホンを用いた。
４通りの音声組み合わせについて、信号対不要信号比（ＳＩＲ）と信号対歪み比（ＳＤＲ）を評価した。なお、単位はｄＢである。 [Experimental result]
Experiments were conducted to demonstrate the effect of the above examples. Mixed signals were simulated by convolving a plurality of speech signals with impulse responses measured in the room shown in FIG. 11 and mixing them. The experimental conditions are as shown in FIG. 11. In a room with a long side of 880 cm, a short side of 375 cm, a height of 240 cm, and a reverberation of 120 ms, three sensors 41, 42, and 43 were placed at a position 200 cm from the long side and 282 cm from the short side of the floor. Taking the axis parallel to the long side as x and the axis parallel to the short side as y, the three sensors 41, 42, and 43 were arranged two-dimensionally, as shown in FIG. 12, at the vertices of an equilateral triangle with 4 cm sides, two on the y-axis and one on the x-axis. Microphones were used as the sensors.
The signal-to-unnecessary signal ratio (SIR) and the signal-to-distortion ratio (SDR) were evaluated for four voice combinations. The unit is dB.

４つの音源をセンサ位置におけるｘ軸とｙ軸の交点を中心とし、ｘ軸の＋方向を０度とし、左回りに３０度、３１５度方向とセンサ位置を中心と半径５０ｃｍの円との各交差点上にそれぞれの音源を、２２５度、３１５度の方向と半径８０ｃｍの円との交差点上に、それぞれ音源を配置させる。実施例２の効果を確かめる実験では、１２０度、２２５度、３１５度方向の音源を用い、Ｎ（源信号の数）＝Ｍ（センサの数）＝３とした。また実施例３の効果を確かめる実験ではＮ＝４、Ｍ＝３とした。 The four sound sources were placed as follows, taking the intersection of the x-axis and y-axis at the sensor position as the center, the + direction of the x-axis as 0 degrees, and measuring angles counterclockwise: one source each at the intersections of the 30-degree and 315-degree directions with a circle of radius 50 cm centered on the sensor position, and one source each at the intersections of the 225-degree and 315-degree directions with a circle of radius 80 cm. In the experiment confirming the effect of the second embodiment, the sound sources in the 120-degree, 225-degree, and 315-degree directions were used, so that N (number of source signals) = M (number of sensors) = 3. In the experiment confirming the effect of the third embodiment, N = 4 and M = 3.

図１３にこの実験の結果を示す。実施例３は従来法、実施例２、実施例２’、実施例４、実施例５において、図７記載の入力信号推定部５０を設けた場合を示す。従来法では、図１記載の適応型ビームフォーマ６を表す上記式（１０）において、ｈ_ｎ（ｆ）に既知のステアリングベクトルａ_ｎ（ｆ）を与えたものを用いた。この場合、Ｎ＝Ｍの場合も、Ｎ＞Ｍの場合も、共に高い性能を得られなかった。これは、残響のある環境での実験であるため、与えられたステアリングベクトルａ_ｎ（ｆ）が残響の影響まで、考慮できなかったことが主な原因として考えられる。また、Ｎ＞Ｍの場合に十分なＳＩＲが得られないのは、適応型ビームフォーマの限界つまりＭ−１個の不要信号しか効果的に抑圧できないことを示している。 FIG. 13 shows the results of this experiment. The third embodiment here denotes the case where the input signal estimation unit 50 shown in FIG. 7 is provided in the conventional method, the second embodiment, the embodiment 2′, the fourth embodiment, and the fifth embodiment. For the conventional method, the above equation (10) representing the adaptive beamformer 6 shown in FIG. 1 was used, with a known steering vector a_n(f) given as h_n(f). In this case, high performance was obtained neither for N = M nor for N > M. The main cause is considered to be that, because the experiment was conducted in a reverberant environment, the given steering vector a_n(f) could not account for the influence of the reverberation. Also, the fact that sufficient SIR cannot be obtained when N > M reflects the limit of the adaptive beamformer, namely that only M−1 unnecessary signals can be effectively suppressed.

従来法に対し、Ｎ＝Ｍの場合、Ｎ＞Ｍの場合であっても、上記実施例はＳＩＲ、ＳＤＲの値を比較すると、従来法よりも高い性能を持つことが理解される。 Compared to the conventional method, even when N = M and N> M, it is understood that the above embodiment has higher performance than the conventional method when comparing the values of SIR and SDR.

以上の各実施形態の他、本発明であるブラインド信号抽出装置は上述の実施形態に限定されるものではなく、本発明の趣旨を逸脱しない範囲で適宜変更が可能である。また、ブラインド信号抽出装置において説明した処理は、記載の順に従って時系列に実行されるのみならず、処理を実行する装置の処理能力あるいは必要に応じて並列的にあるいは個別に実行されるとしてもよい。 In addition to the above embodiments, the blind signal extraction device according to the present invention is not limited to the above-described embodiments, and can be appropriately changed without departing from the spirit of the present invention. Further, the processing described in the blind signal extraction device is not only executed in time series according to the order of description, but may also be executed in parallel or individually as required by the processing capability of the device that executes the processing. Good.

また、この発明のブラインド信号抽出装置における処理をコンピュータによって実現する場合、ブラインド信号抽出装置が有すべき機能の処理内容はプログラムによって記述される。そして、このプログラムをコンピュータで実行することにより、ブラインド信号抽出装置における処理機能がコンピュータ上で実現される。 Further, when the processing in the blind signal extraction device of the present invention is realized by a computer, the processing contents of the functions that the blind signal extraction device should have are described by a program. Then, by executing this program on a computer, the processing function of the blind signal extraction device is realized on the computer.

この処理内容を記述したプログラムは、コンピュータで読み取り可能な記録媒体に記録しておくことができる。コンピュータで読み取り可能な記録媒体としては、例えば、磁気記録装置、光ディスク、光磁気記録媒体、半導体メモリ等どのようなものでもよい。具体的には、例えば、磁気記録装置として、ハードディスク装置、フレキシブルディスク、磁気テープ等を、光ディスクとして、ＤＶＤ（ＤｉｇｉｔａｌＶｅｒｓａｔｉｌｅＤｉｓｃ）、ＤＶＤ−ＲＡＭ（ＲａｎｄｏｍＡｃｃｅｓｓＭｅｍｏｒｙ）、ＣＤ−ＲＯＭ（ＣｏｍｐａｃｔＤｉｓｃＲｅａｄＯｎｌｙＭｅｍｏｒｙ）、ＣＤ−Ｒ（Ｒｅｃｏｒｄａｂｌｅ）／ＲＷ（ＲｅＷｒｉｔａｂｌｅ）等を、光磁気記録媒体として、ＭＯ（Ｍａｇｎｅｔｏ−Ｏｐｔｉｃａｌｄｉｓｃ）等を、半導体メモリとしてＥＥＰ−ＲＯＭ（ＥｌｅｃｔｒｏｎｉｃａｌｌｙＥｒａｓａｂｌｅａｎｄＰｒｏｇｒａｍｍａｂｌｅ−ＲｅａｄＯｎｌｙＭｅｍｏｒｙ）等を用いることができる。 The program describing the processing contents can be recorded on a computer-readable recording medium. As the computer-readable recording medium, for example, any recording medium such as a magnetic recording device, an optical disk, a magneto-optical recording medium, and a semiconductor memory may be used. Specifically, for example, as a magnetic recording device, a hard disk device, a flexible disk, a magnetic tape, and the like, and as an optical disk, a DVD (Digital Versatile Disc), a DVD-RAM (Random Access Memory), a CD-ROM (Compact Disc Read Only). Memory), CD-R (Recordable) / RW (ReWritable), etc., magneto-optical recording medium, MO (Magneto-Optical disc), etc., semiconductor memory, EEP-ROM (Electronically Erasable-Programmable-Ready), etc. Can be used.

また、このプログラムの流通は、例えば、そのプログラムを記録したＤＶＤ、ＣＤ−ＲＯＭ等の可搬型記録媒体を販売、譲渡、貸与等することによって行う。さらに、このプログラムをサーバコンピュータの記憶装置に格納しておき、ネットワークを介して、サーバコンピュータから他のコンピュータにそのプログラムを転送することにより、このプログラムを流通させる構成としてもよい。 The program is distributed by selling, transferring, or lending a portable recording medium such as a DVD or CD-ROM in which the program is recorded. Furthermore, the program may be distributed by storing the program in a storage device of the server computer and transferring the program from the server computer to another computer via a network.

このようなプログラムを実行するコンピュータは、例えば、まず、可搬型記録媒体に記録されたプログラムもしくはサーバコンピュータから転送されたプログラムを、一旦、自己の記憶装置に格納する。そして、処理の実行時、このコンピュータは、自己の記録媒体に格納されたプログラムを読み取り、読み取ったプログラムに従った処理を実行する。また、このプログラムの別の実行形態として、コンピュータが可搬型記録媒体から直接プログラムを読み取り、そのプログラムに従った処理を実行することとしてもよく、さらに、このコンピュータにサーバコンピュータからプログラムが転送されるたびに、逐次、受け取ったプログラムに従った処理を実行することとしてもよい。また、サーバコンピュータから、このコンピュータへのプログラムの転送は行わず、その実行指示と結果取得のみによって処理機能を実現する、いわゆるＡＳＰ（ＡｐｐｌｉｃａｔｉｏｎＳｅｒｖｉｃｅＰｒｏｖｉｄｅｒ）型のサービスによって、上述の処理を実行する構成としてもよい。なお、本形態におけるプログラムには、電子計算機による処理の用に供する情報であってプログラムに準ずるもの（コンピュータに対する直接の指令ではないがコンピュータの処理を規定する性質を有するデータ等）を含むものとする。 A computer that executes such a program first stores, for example, a program recorded on a portable recording medium or a program transferred from a server computer in its own storage device. When executing the process, the computer reads a program stored in its own recording medium and executes a process according to the read program. As another execution form of the program, the computer may directly read the program from a portable recording medium and execute processing according to the program, and the program is transferred from the server computer to the computer. Each time, the processing according to the received program may be executed sequentially. A configuration in which the above-described processing is executed by a so-called ASP (Application Service Provider) type service that realizes a processing function only by an execution instruction and result acquisition without transferring a program from the server computer to the computer. It is good. Note that the program in this embodiment includes information that is used for processing by an electronic computer and that conforms to the program (data that is not a direct command to the computer but has a property that defines the processing of the computer).

In this embodiment, the blind signal extraction apparatus is configured by executing a predetermined program on a computer; however, at least part of this processing may instead be implemented in hardware.

As an application in the audio field, the present invention makes it possible to build a speech recognition system with a high recognition rate: even when the input microphone of the speech recognizer is located away from the speaker, so that the microphone also picks up sounds other than the target speaker's voice, the target speech can be separated and extracted.

FIG. 1 is a block diagram showing an example functional configuration of a prior-art system.
FIG. 2 is a diagram explaining, for a sensor system arranged in a straight line, the time difference τ between the time at which sound source n reaches an arbitrary sensor j and the time at which it reaches the origin 0.
FIG. 3 is a block diagram showing an example functional configuration of the system of Embodiment 1 of the present invention.
FIG. 4 is a flowchart showing the main processing flow of Embodiment 1 of the present invention.
FIG. 5 is a diagram explaining $\cos\Theta_n^{mQ}$, used in Equation (16) above, for two arbitrary sensors m and Q.
FIG. 6 is a block diagram showing part of an example functional configuration of the system of Embodiment 2 of the present invention.
FIG. 7 is a block diagram showing part of an example functional configuration of the system of Embodiment 3 of the present invention.
FIG. 8 is a block diagram showing part of an example functional configuration of the system of Embodiment 4 of the present invention.
FIG. 9 is a block diagram showing part of an example functional configuration of the system of Embodiment 5 of the present invention.
FIG. 10 is a block diagram showing part of an example functional configuration of the system of a modification of Embodiment 5 of the present invention.
FIG. 11 is a top view of the experiment comparing the prior art with the technique of the present invention.
FIG. 12 is a diagram showing details of the positional relationship of the three sensors 41, 42, and 43 of FIG. 11.
FIG. 13 is a diagram showing experimental results comparing the effects of the prior art and the technique of the present invention.

Claims

A blind signal extraction apparatus that observes signals emitted from N signal sources with M sensors and extracts one or more of those signals, where N and M are each integers of 2 or more, the apparatus comprising:
a frequency domain conversion unit that converts the observation signals observed by the M sensors into frequency domain signals;
a normalization unit that normalizes the frequency domain signals and calculates normalized observation signal vectors;
a clustering unit that clusters the normalized observation signal vectors into N clusters;
an unnecessary signal correlation matrix estimation unit that estimates, from the cluster information, an unnecessary signal correlation matrix, which is a correlation matrix of observation signals containing only unnecessary signals;
a beamformer calculation unit that calculates a beamformer from the cluster information and the unnecessary signal correlation matrix;
a target signal extraction unit that extracts a target signal from the frequency domain signals using the beamformer; and
a time domain conversion unit that converts the extracted target signal into a time domain signal.
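
As a rough illustration of the normalization and unnecessary signal correlation matrix estimation recited above, the following Python sketch processes the observation vectors of a single frequency bin. It is only a sketch under stated assumptions: the claim does not fix a normalization formula, so unit-norm scaling with the phase referenced to one sensor is assumed here, the cluster labels are assumed to have been produced already by some clustering step, and the function names `normalize_observation` and `unnecessary_correlation` are hypothetical.

```python
import cmath
import math


def normalize_observation(x, ref=0):
    """Scale x to unit norm and rotate it so sensor `ref` has zero phase.

    Assumed normalization for illustration; the claim only says that the
    frequency domain signal is normalized.
    """
    norm = math.sqrt(sum(abs(v) ** 2 for v in x))
    if norm == 0.0:
        return [0j] * len(x)          # silent frame: nothing to normalize
    rotation = cmath.exp(-1j * cmath.phase(x[ref]))
    return [v * rotation / norm for v in x]


def unnecessary_correlation(frames, labels, target):
    """Average x x^H over the frames that do NOT belong to cluster `target`.

    frames: list of M-element observation vectors at one frequency
    labels: cluster index assigned to each frame (assumed given)
    """
    m = len(frames[0])
    acc = [[0j] * m for _ in range(m)]
    count = 0
    for x, label in zip(frames, labels):
        if label == target:
            continue                  # skip frames dominated by the target
        count += 1
        for i in range(m):
            for j in range(m):
                acc[i][j] += x[i] * x[j].conjugate()
    if count == 0:
        raise ValueError("no frames outside the target cluster")
    return [[v / count for v in row] for row in acc]
```

For example, with three frames labelled `[0, 1, 1]` and `target=0`, only the second and third frames contribute to the estimate.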

The blind signal extraction apparatus according to claim 1, wherein
the beamformer calculation unit comprises:
an impulse response estimation unit that estimates an impulse response of the target signal from centroid information of the clusters; and
an adaptive beamformer calculation unit that calculates an adaptive beamformer using the impulse response and the unnecessary signal correlation matrix, and
the target signal extraction unit extracts the target signal from the frequency domain signals using the adaptive beamformer.
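
One common way to build an adaptive beamformer from an impulse response vector h and an unnecessary signal correlation matrix R_J is the minimum-variance form w = R_J^{-1} h / (h^H R_J^{-1} h). The claim does not state this formula, so the Python sketch below should be read as an assumption about one possible realization; `solve`, `mvdr_weights`, and `apply_beamformer` are hypothetical names.

```python
def solve(a, b):
    """Solve a z = b (a: n x n, b: n) by Gaussian elimination with pivoting."""
    n = len(b)
    aug = [list(row) + [rhs] for row, rhs in zip(a, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        if abs(aug[pivot][col]) == 0:
            raise ValueError("matrix is singular")
        aug[col], aug[pivot] = aug[pivot], aug[col]
        for row in range(col + 1, n):
            factor = aug[row][col] / aug[col][col]
            for k in range(col, n + 1):
                aug[row][k] -= factor * aug[col][k]
    z = [0j] * n
    for row in range(n - 1, -1, -1):
        tail = sum(aug[row][k] * z[k] for k in range(row + 1, n))
        z[row] = (aug[row][n] - tail) / aug[row][row]
    return z


def mvdr_weights(r_j, h):
    """Assumed minimum-variance weights: R_J^{-1} h / (h^H R_J^{-1} h)."""
    z = solve(r_j, h)                                        # R_J^{-1} h
    denom = sum(hi.conjugate() * zi for hi, zi in zip(h, z))  # h^H R_J^{-1} h
    return [zi / denom for zi in z]


def apply_beamformer(w, x):
    """Beamformer output y = w^H x for one time-frequency point."""
    return sum(wi.conjugate() * xi for wi, xi in zip(w, x))
```

With this form, a signal arriving with response h passes with unit gain (w^H h = 1) while the output power due to R_J is minimized. If R_J is estimated from few frames it may be close to singular; adding a small value to its diagonal before solving is a common safeguard, though the source does not mention it.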

The blind signal extraction apparatus according to claim 1, wherein
the unnecessary signal correlation matrix estimation unit estimates, from the cluster information, an unnecessary signal correlation matrix that is a correlation matrix of K selected unnecessary signals, where K is an integer satisfying K ≦ M−1,
the apparatus further comprises an input signal estimation unit that estimates, from the cluster information, a beamformer input signal containing only the target signal and the selected K unnecessary signals,
the beamformer calculation unit comprises:
an impulse response estimation unit that estimates an impulse response of the target signal from centroid information of the clusters; and
an adaptive beamformer calculation unit that calculates an adaptive beamformer using the impulse response and the unnecessary signal correlation matrix, and
the target signal extraction unit extracts the target signal from the beamformer input signal using the adaptive beamformer.

The blind signal extraction apparatus according to claim 2 or 3,
further comprising a sensor position information storage unit that stores sensor position information representing the positions of the M sensors, wherein
the impulse response estimation unit comprises:
a direction-of-arrival estimation unit that estimates the direction of arrival of a signal using the sensor position information and the centroid information of the clusters; and
an impulse response calculation unit that calculates an impulse response from the estimated direction of arrival of the signal and the sensor position information.
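
The second half of this claim, calculating an impulse response from an estimated direction of arrival and the sensor positions, can be illustrated with a far-field (plane-wave) model. The Python sketch below is an assumption rather than the procedure of the specification: it ignores reverberation and attenuation, uses a two-dimensional geometry with the array origin as phase reference, and takes 340 m/s as the speed of sound. The name `steering_vector` and its parameters are hypothetical, and sign conventions for the delay differ between texts.

```python
import cmath
import math


def steering_vector(sensor_positions, azimuth_rad, freq_hz, speed=340.0):
    """Far-field frequency response at each sensor for a given arrival direction.

    sensor_positions: list of (x, y) coordinates in metres
    azimuth_rad:      direction pointing from the origin toward the source
    """
    ux, uy = math.cos(azimuth_rad), math.sin(azimuth_rad)
    h = []
    for px, py in sensor_positions:
        # A sensor displaced toward the source receives the wavefront earlier,
        # so its delay relative to the origin is negative.
        delay = -(px * ux + py * uy) / speed
        h.append(cmath.exp(-2j * math.pi * freq_hz * delay))
    return h
```

For two sensors 4 cm apart on the x axis, a source on the array axis at 4250 Hz yields a phase difference of half a cycle between the sensors, while a broadside source yields no phase difference.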

The blind signal extraction apparatus according to claim 1,
further comprising an observation signal correlation matrix estimation unit that estimates, from the observation signals, an observation signal correlation matrix, which is a correlation matrix of the observation signals, wherein
the beamformer calculation unit comprises:
a target signal correlation matrix estimation unit that estimates, from the cluster information, a target signal correlation matrix, which is a correlation matrix of observation signals containing the target signal;
a maximum gain beamformer calculation unit that calculates a maximum gain beamformer from the unnecessary signal correlation matrix and the target signal correlation matrix; and
a correction unit that corrects frequency distortion of the maximum gain beamformer using the observation signal correlation matrix and calculates a corrected beamformer, and
the target signal extraction unit extracts the target signal from the frequency domain signals using the corrected beamformer.
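
A maximum gain beamformer is commonly obtained as the vector w that maximizes the ratio (w^H R_T w) / (w^H R_J w), that is, the dominant generalized eigenvector of the target and unnecessary signal correlation matrices. Such a vector is only defined up to a complex scale per frequency, which is why a correction step is needed. The Python sketch below finds w by power iteration and then rescales it with a least-squares projection back onto a reference sensor, a = (R_x w)_ref / (w^H R_x w). Both the power iteration and this particular correction are assumptions chosen for illustration; the claim only states that the observation signal correlation matrix is used. All function names are hypothetical.

```python
def matvec(a, v):
    """Matrix-vector product a v."""
    return [sum(aij * vj for aij, vj in zip(row, v)) for row in a]


def inner(u, v):
    """Hermitian inner product u^H v."""
    return sum(ui.conjugate() * vi for ui, vi in zip(u, v))


def solve(a, b):
    """Solve a z = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    aug = [list(row) + [rhs] for row, rhs in zip(a, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        if abs(aug[pivot][col]) == 0:
            raise ValueError("matrix is singular")
        aug[col], aug[pivot] = aug[pivot], aug[col]
        for row in range(col + 1, n):
            factor = aug[row][col] / aug[col][col]
            for k in range(col, n + 1):
                aug[row][k] -= factor * aug[col][k]
    z = [0j] * n
    for row in range(n - 1, -1, -1):
        tail = sum(aug[row][k] * z[k] for k in range(row + 1, n))
        z[row] = (aug[row][n] - tail) / aug[row][row]
    return z


def max_gain_beamformer(r_t, r_j, iterations=200):
    """Dominant eigenvector of R_J^{-1} R_T, found by power iteration."""
    m = len(r_t)
    start = max(range(m), key=lambda i: abs(r_t[i][i]))
    w = [row[start] for row in r_t]        # start from a column of R_T
    for _ in range(iterations):
        w = solve(r_j, matvec(r_t, w))     # w <- R_J^{-1} R_T w
        norm = abs(inner(w, w)) ** 0.5
        if norm == 0:
            raise ValueError("iteration collapsed to the zero vector")
        w = [wi / norm for wi in w]
    return w


def correct_scaling(w, r_x, ref=0):
    """Rescale w so that w^H x best matches the target image at sensor `ref`."""
    rw = matvec(r_x, w)
    a_ref = rw[ref] / inner(w, rw)         # least-squares gain from y to x_ref
    return [a_ref.conjugate() * wi for wi in w]
```

In a two-sensor example with target response [1, 1] and one interferer with response [1, -1], the corrected beamformer passes the target with unit gain and places a null on the interferer.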

The blind signal extraction apparatus according to claim 1,
further comprising an observation signal correlation matrix estimation unit that estimates, from the observation signals, an observation signal correlation matrix, which is a correlation matrix of the observation signals, wherein
the beamformer calculation unit comprises:
a target signal correlation matrix estimation unit that estimates, from the cluster information, a target signal correlation matrix, which is a correlation matrix of observation signals containing the target signal; and
a maximum gain beamformer calculation unit that calculates a maximum gain beamformer from the unnecessary signal correlation matrix and the target signal correlation matrix,
the apparatus further comprises a correction vector calculation unit that calculates a correction vector using the observation signal correlation matrix and the maximum gain beamformer, and
the target signal extraction unit comprises:
a signal extraction unit that extracts a target signal containing distortion from the frequency domain signals using the maximum gain beamformer; and
a distortion correction unit that corrects the distortion of the extracted target signal using the correction vector and outputs the target signal.

A blind signal extraction method for observing signals emitted from N signal sources with M sensors and extracting one or more of those signals, where N and M are each integers of 2 or more, the method comprising:
a frequency domain conversion step in which frequency domain conversion means converts the observation signals observed by the M sensors into frequency domain signals;
a normalization step in which normalization means normalizes the frequency domain signals and calculates normalized observation signal vectors;
a clustering step in which clustering means clusters the normalized observation signal vectors into N clusters;
an unnecessary signal correlation matrix estimation step in which unnecessary signal correlation matrix estimation means estimates, from the cluster information, an unnecessary signal correlation matrix, which is a correlation matrix of observation signals containing only unnecessary signals;
a beamformer calculation step in which beamformer calculation means calculates a beamformer from the cluster information and the unnecessary signal correlation matrix;
a target signal extraction step in which target signal extraction means extracts a target signal from the frequency domain signals using the beamformer; and
a time domain conversion step in which time domain conversion means converts the extracted target signal into a time domain signal.

The blind signal extraction method according to claim 7, wherein
the beamformer calculation step comprises:
an impulse response estimation step in which impulse response estimation means estimates an impulse response of the target signal from centroid information of the clusters; and
an adaptive beamformer calculation step in which adaptive beamformer calculation means calculates an adaptive beamformer using the impulse response and the unnecessary signal correlation matrix, and
the target signal extraction step is a step of extracting the target signal from the frequency domain signals using the adaptive beamformer.

The blind signal extraction method according to claim 7, wherein
the unnecessary signal correlation matrix estimation step is a step of estimating, from the cluster information, an unnecessary signal correlation matrix that is a correlation matrix of K selected unnecessary signals, where K is an integer satisfying K ≦ M−1,
the method further comprises an input signal estimation step in which input signal estimation means estimates, from the cluster information, a beamformer input signal containing only the target signal and the selected K unnecessary signals,
the beamformer calculation step comprises:
an impulse response estimation step in which impulse response estimation means estimates an impulse response of the target signal from centroid information of the clusters; and
an adaptive beamformer calculation step in which adaptive beamformer calculation means calculates an adaptive beamformer using the impulse response and the unnecessary signal correlation matrix, and
the target signal extraction step is a step of extracting the target signal from the beamformer input signal using the adaptive beamformer.

The blind signal extraction method according to claim 8 or 9, wherein
the impulse response estimation step comprises:
an arrival direction estimation step in which arrival direction estimation means estimates the arrival direction of a signal using sensor position information held in a sensor position information storage unit and the centroid information of the clusters, the sensor position information storage unit storing sensor position information representing the positions of the M sensors; and
an impulse response calculation step in which impulse response calculation means calculates an impulse response from the estimated arrival direction of the signal and the sensor position information.

The blind signal extraction method according to claim 7,
further comprising an observation signal correlation matrix estimation step in which observation signal correlation matrix estimation means estimates, from the observation signals, an observation signal correlation matrix, which is a correlation matrix of the observation signals, wherein
the beamformer calculation step comprises:
a target signal correlation matrix estimation step in which target signal correlation matrix estimation means estimates, from the cluster information, a target signal correlation matrix, which is a correlation matrix of observation signals containing the target signal;
a maximum gain beamformer calculation step in which maximum gain beamformer calculation means calculates a maximum gain beamformer from the unnecessary signal correlation matrix and the target signal correlation matrix; and
a correction step in which correction means corrects frequency distortion of the maximum gain beamformer using the observation signal correlation matrix and calculates a corrected beamformer, and
the target signal extraction step is a step of extracting the target signal from the frequency domain signals using the corrected beamformer.

The blind signal extraction method according to claim 7,
further comprising an observation signal correlation matrix estimation step in which observation signal correlation matrix estimation means estimates, from the observation signals, an observation signal correlation matrix, which is a correlation matrix of the observation signals, wherein
the beamformer calculation step comprises:
a target signal correlation matrix estimation step in which target signal correlation matrix estimation means estimates, from the cluster information, a target signal correlation matrix, which is a correlation matrix of observation signals containing the target signal; and
a maximum gain beamformer calculation step in which maximum gain beamformer calculation means calculates a maximum gain beamformer from the unnecessary signal correlation matrix and the target signal correlation matrix,
the method further comprises a correction vector calculation step in which correction vector calculation means calculates a correction vector using the observation signal correlation matrix and the maximum gain beamformer, and
the target signal extraction step comprises:
a signal extraction step in which signal extraction means extracts a target signal containing distortion from the frequency domain signals using the maximum gain beamformer; and
a distortion correction step in which distortion correction means corrects the distortion of the extracted target signal using the correction vector and outputs the target signal.

A blind signal extraction program for causing a computer to execute each step of the blind signal extraction method according to any one of claims 7 to 12.

A computer-readable recording medium on which the blind signal extraction program according to claim 13 is recorded.