JP4676920B2

JP4676920B2 - Signal separation device, signal separation method, signal separation program, and recording medium

Info

Publication number: JP4676920B2
Application number: JP2006133728A
Authority: JP
Inventors: 宏澤田; 章子荒木; 良向井; 昭二牧野
Original assignee: Nippon Telegraph and Telephone Corp
Current assignee: Nippon Telegraph and Telephone Corp
Priority date: 2006-05-12
Filing date: 2006-05-12
Publication date: 2011-04-27
Anticipated expiration: 2026-05-12
Also published as: JP2007306373A

Abstract

PROBLEM TO BE SOLVED: To perform clustering suitably even in a frequency band in which spatial aliasing may occur, and to generate suitable separated signals.
SOLUTION: Mixed signals observed by a plurality of sensors are converted into frequency-domain signals. The phase of each frequency-domain signal is normalized and its frequency-dependent component is removed to produce a frequency-normalized signal. Feature vectors for each time-frequency slot, whose elements are the frequency-normalized signals or the like, are clustered, and model parameters of a frequency response model are computed from the centroid of each cluster. Next, phase/norm-normalized vectors, whose elements are phase/norm-normalized values of the frequency-domain signals, are clustered with reference to the frequency response model into which the model parameters have been substituted, and cluster information corresponding to the cluster to which each phase/norm-normalized vector belongs is generated. Separated signals in the frequency domain are generated from the frequency-domain signals and the cluster information, and are then converted into separated signals in the time domain.
COPYRIGHT: (C)2008,JPO&INPIT

Description

The present invention belongs to the technical field of signal processing and relates in particular to a technique for extracting source signals from mixed signals in which a plurality of source signals have been mixed in space.

[Blind signal separation]
Blind signal separation is a signal separation technique that estimates and separates source signals from mixed signals in which a plurality of source signals are mixed. The problem is first formulated. All signals are assumed to be sampled at a sampling frequency f_s and represented in discrete time. N signals (N ≥ 2) are assumed to be mixed and observed by M sensors (M ≥ 2). The following considers a situation in which a signal is attenuated and delayed according to the distance from its source to a sensor, and in which reflections from walls and the like may distort the transmission path. Signals mixed in such a situation can be expressed as a convolutive mixture using the impulse response h_qk(r) from signal source k to sensor q (q is the sensor index, q = 1, ..., M; k is the signal source index, k = 1, ..., N):

$$x_q(t) = \sum_{k=1}^{N} \sum_{r} h_{qk}(r)\, s_k(t - r) \tag{1}$$

Here t denotes the sampling time, s_k(t) denotes the source signal emitted from signal source k at sampling time t, x_q(t) denotes the signal observed at sensor q at sampling time t, and r is the sweep (summation) variable.
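The convolutive mixture of equation (1) can be illustrated with a short sketch. It assumes finite-length impulse responses and treats samples before t = 0 as zero; the function name `convolutive_mix` and the toy values are hypothetical and chosen only for illustration.

```python
def convolutive_mix(sources, impulse_responses):
    """Compute x_q(t) = sum_k sum_r h_qk(r) * s_k(t - r).

    sources[k] holds s_k(0..T-1); impulse_responses[q][k] holds h_qk(0..R-1).
    Samples before t = 0 are treated as zero (an assumption of this sketch).
    """
    num_sensors = len(impulse_responses)
    num_samples = len(sources[0])
    mixed = [[0.0] * num_samples for _ in range(num_sensors)]
    for q in range(num_sensors):
        for k, s_k in enumerate(sources):
            h_qk = impulse_responses[q][k]
            for t in range(num_samples):
                for r, h in enumerate(h_qk):
                    if t - r >= 0:
                        mixed[q][t] += h * s_k[t - r]
    return mixed


# Toy case: N = 2 sources, M = 2 sensors, made-up two-tap impulse responses.
s = [[1.0, 0.0, 0.0, 0.0], [0.0, 2.0, 0.0, 0.0]]
h = [[[1.0, 0.5], [0.0, 1.0]],   # h_00, h_01
     [[0.0, 1.0], [1.0, 0.0]]]   # h_10, h_11
x = convolutive_mix(s, h)
```

Each sensor signal is the sum of every source filtered by its own source-to-sensor response, which is why the sources cannot be recovered by a simple per-sample rescaling in the time domain.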

A typical impulse response h_qk(r) has a strong, pulse-like response after a certain delay and then decays over time. The goal of blind signal separation is to obtain separated signals y_1(t), ..., y_N(t), corresponding respectively to the source signals s_1(t), ..., s_N(t), from the observed signals x_1(t), ..., x_M(t) alone (hereinafter called "mixed signals"), without knowing the source signals s_1(t), ..., s_N(t) or the impulse responses h_11(r), ..., h_1N(r), ..., h_M1(r), ..., h_MN(r).

[Frequency domain]
Next, a conventional blind signal separation procedure is described.
Here the separation is performed in the frequency domain. To this end, an L-point short-time discrete Fourier transform (STFT: Short-Time Fourier Transform) is applied to the mixed signal x_q(t) at sensor q to obtain the mixed signal for each time-frequency slot (f, τ) (hereinafter called the "frequency-domain signal"):

$$X_q(f,\tau) = \sum_{r=-L/2}^{L/2-1} x_q(\tau + r)\, g(r)\, e^{-j 2\pi f r / f_s} \tag{2}$$

Here f is the frequency, discretized as f = 0, f_s/L, ..., f_s(L-1)/L (f_s is the sampling frequency), τ is the discrete time, j is the imaginary unit, and g(r) is a window function. A window function whose power is centered at g(0) is used, for example the Hanning window

$$g(r) = \frac{1}{2}\left(1 + \cos\frac{2\pi r}{L}\right)$$

In this case the frequency-domain signal X_q(f, τ) represents the frequency characteristics of the mixed signal x_q(t) around time t = τ. Because X_q(f, τ) contains information spanning L samples, it need not be computed for every τ; it is computed only at suitably spaced values of τ.
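The short-time Fourier transform described above can be sketched as follows. This is a minimal illustration, not the patent's implementation: it assumes a frame of L samples centered on τ, a Hanning window of the form 0.5(1 + cos(2πr/L)) with its peak at r = 0, and bin frequencies f_n = n·f_s/L; the names `hann` and `stft_frame` are hypothetical.

```python
import cmath
import math


def hann(r, L):
    """Window with its peak at r = 0 (assumed form 0.5 * (1 + cos(2*pi*r/L)))."""
    return 0.5 * (1.0 + math.cos(2.0 * math.pi * r / L))


def stft_frame(x, tau, L):
    """Return X(f_n, tau) for f_n = n * f_s / L, n = 0..L-1.

    With r counted in samples, exp(-j*2*pi*f_n*r/f_s) equals exp(-j*2*pi*n*r/L).
    """
    frame = []
    for n in range(L):
        acc = 0j
        for r in range(-L // 2, L // 2):
            acc += x[tau + r] * hann(r, L) * cmath.exp(-2j * math.pi * n * r / L)
        frame.append(acc)
    return frame


L = 8
x = [math.cos(2.0 * math.pi * 2 * t / L) for t in range(16)]  # tone centered on bin 2
X = stft_frame(x, tau=8, L=L)
magnitudes = [abs(v) for v in X]
```

In this toy case the magnitude peaks at bin 2, the bin of the test tone, while the window spreads some energy into the neighboring bins 1 and 3.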

When processing is performed in the frequency domain, the convolutive mixture in the time domain given by equation (1) can be approximated by a simple mixture at each frequency,

$$X_q(f,\tau) = \sum_{k=1}^{N} H_{qk}(f)\, S_k(f,\tau) \tag{3}$$

which simplifies the separation operation. Here H_qk(f) is the frequency response from signal source k to sensor q, and S_k(f, τ) is the source signal s_k(t) after a short-time discrete Fourier transform following an equation similar to equation (2) (the frequency-domain source signal). In vector notation, equation (3) becomes

$$X(f,\tau) = \sum_{k=1}^{N} H_k(f)\, S_k(f,\tau) \tag{4}$$

Here X(f, τ) = [X_1(f, τ), ..., X_M(f, τ)]^T is the mixed-signal vector whose elements are the frequency-domain signals corresponding to the respective sensors, and H_k(f) = [H_1k(f), ..., H_Mk(f)]^T is the vector whose elements are the frequency responses from signal source k to the respective sensors. [*]^T denotes the transpose of [*].

[Signal separation using time-frequency masks]
One blind signal separation method uses time-frequency masks. This method separates and extracts signals and is applicable regardless of whether the number of signal sources N is larger or smaller than the number of sensors M. It assumes that the signals are sparse. Sparse means that a signal is zero at most discrete times τ. Signal sparseness is observed, for example, in speech signals in the frequency domain. By assuming sparseness and mutual independence of the signals, it can be assumed that, even when several signals are present simultaneously, the probability that they overlap in any given time-frequency slot (f, τ) is low. Under these assumptions, equation (3) above can be written as
X_q(f, τ) = H_qk(f) S_k(f, τ) ... (5)
and equation (4) above can be written as
X(f, τ) = H_k(f) S_k(f, τ) ... (6)
In other words, each time-frequency slot (f, τ) can be modeled as having exactly one active signal source k, with the signals of all other signal sources equal to zero.

In this case, signal separation can be performed by determining, for each time-frequency slot (f, τ), which signal source is active. Specifically, suitable features are generated from the mixed-signal vectors, and the generated features are clustered to form clusters C_k. A time-frequency mask M_k(f, τ) that extracts the frequency-domain signals corresponding to the members of each cluster C_k is then estimated, and each signal is separated and extracted by
Y_k(f, τ) = M_k(f, τ) X_Q'(f, τ) ... (7)
Here X_Q'(f, τ) is one element of the mixed-signal vector, with Q' ∈ {1, ..., M}.

An example of a feature used for clustering is the estimated direction of arrival (DOA) of the signal. It is computed from the phase difference between the frequency-domain signals at two sensors (sensor q and reference sensor Q; Q is called the reference value, and the sensor corresponding to reference value Q is written as reference sensor Q),

$$\phi(f,\tau) = \arg\left[\frac{X_q(f,\tau)}{X_Q(f,\tau)}\right] \tag{8}$$

as

$$\theta(f,\tau) = \cos^{-1}\frac{\phi(f,\tau)\, c}{2\pi f d} \tag{9}$$

(see, for example, Non-Patent Document 1). Here d is the distance between sensor q and reference sensor Q, and c is the signal propagation speed. The k-means method (see, for example, Non-Patent Document 2) or the like can be used for clustering. As the time-frequency mask M_k(f, τ), a mask can be used that is generated, for example, by computing the mean values θ̃_1, θ̃_2, ..., θ̃_N of the members belonging to the respective clusters C_k and setting

$$M_k(f,\tau) = \begin{cases} 1 & \tilde{\theta}_k - \Delta \le \theta(f,\tau) \le \tilde{\theta}_k + \Delta \\ 0 & \text{otherwise} \end{cases}$$

Here Δ specifies the range over which the signal is extracted. With this method, a smaller Δ gives better separation and extraction performance but larger nonlinear distortion, whereas a larger Δ reduces nonlinear distortion but degrades separation performance.
Alternatively, the phase difference between the frequency-domain signals at the two sensors (sensor q and reference sensor Q; equation (8)) divided by the frequency may be used as the clustering feature.
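The DOA feature and the binary mask described above can be sketched as follows. This is an illustration under stated assumptions: the phase difference is taken as arg[X_q/X_Q], the propagation speed defaults to 340 m/s, the arccos argument is clamped to [-1, 1] to guard against noise, and the names `doa` and `mask` are hypothetical.

```python
import cmath
import math


def doa(X_q, X_Q, f, d, c=340.0):
    """Estimated direction of arrival (radians) from the inter-sensor phase difference."""
    phi = cmath.phase(X_q / X_Q)                 # phase difference
    ratio = phi * c / (2.0 * math.pi * f * d)
    ratio = max(-1.0, min(1.0, ratio))           # clamp (an addition of this sketch)
    return math.acos(ratio)


def mask(theta, theta_k, delta):
    """Binary mask: 1 if theta lies within delta of the cluster mean theta_k."""
    return 1.0 if theta_k - delta <= theta <= theta_k + delta else 0.0


# Synthetic slot: a source at 60 degrees, sensors 4 cm apart, f = 1 kHz.
f, d, c = 1000.0, 0.04, 340.0
X_Q = 1.0 + 0.0j
X_q = cmath.exp(1j * 2.0 * math.pi * f * d * math.cos(math.radians(60.0)) / c)
theta = doa(X_q, X_Q, f, d, c)
keep = mask(theta, math.radians(60.0), 0.1)      # slot kept for this cluster
drop = mask(theta, math.radians(120.0), 0.1)     # slot rejected for another cluster
```

The width `delta` plays the role of Δ in the text: widening it admits more slots per source, at the cost of admitting slots that belong to neighboring sources.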

Non-Patent Document 3 discloses a method that makes efficient use of information from three or more sensors and separates signals with time-frequency masks. In this method, the mixed signal observed at each sensor is subjected to a short-time discrete Fourier transform to generate a frequency-domain signal X_q(f, τ) for each time-frequency slot (f, τ), and these are normalized by

$$X'_q(f,\tau) = |X_q(f,\tau)| \exp\left[ j\, \frac{\arg[X_q(f,\tau)/X_Q(f,\tau)]}{4 f c^{-1} d_{\max}} \right], \qquad X''(f,\tau) = \frac{X'(f,\tau)}{\| X'(f,\tau) \|} \tag{10}$$

Here exp denotes the exponential function whose base is Napier's number, arg[·] denotes the argument (phase angle), j denotes the imaginary unit, c denotes the signal propagation speed, and d_max denotes the maximum distance between reference sensor Q (Q ∈ {1, ..., M}) and any other sensor q (q ∈ {1, ..., M}). Further, X'(f, τ) = [X'_1(f, τ), ..., X'_M(f, τ)]^T, and ‖·‖ denotes the norm.
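The effect of this normalization can be illustrated with a sketch. It assumes the normalization takes the form |X_q| exp[j arg(X_q/X_Q) / (4 f d_max / c)] followed by division by the vector norm, and it generates observations from a direct-wave model with made-up attenuations and distances; the names `normalize` and `direct_wave_observation` are hypothetical.

```python
import cmath
import math


def normalize(X, f, Q, d_max, c=340.0):
    """Frequency normalization of the phase followed by norm normalization."""
    scale = 4.0 * f * d_max / c
    X_prime = [abs(x) * cmath.exp(1j * cmath.phase(x / X[Q]) / scale) for x in X]
    norm = math.sqrt(sum(abs(v) ** 2 for v in X_prime))
    return [v / norm for v in X_prime]


def direct_wave_observation(f, S, lam, dist, c=340.0):
    """X_q = lam_q * exp(-j*2*pi*f*dist_q/c) * S for one active source."""
    return [l * cmath.exp(-2j * math.pi * f * d / c) * S for l, d in zip(lam, dist)]


lam = [1.0, 0.9, 0.8]        # made-up attenuations
dist = [1.00, 1.03, 1.05]    # made-up source-to-sensor distances in metres
d_max = 0.05                 # assumed maximum distance from the reference sensor
a = normalize(direct_wave_observation(500.0, 0.7 * cmath.exp(1.2j), lam, dist),
              500.0, 0, d_max)
b = normalize(direct_wave_observation(2000.0, 2.0 * cmath.exp(-0.4j), lam, dist),
              2000.0, 0, d_max)
```

Both frequencies lie below c/(2·d_max) = 3400 Hz, so `a` and `b` coincide even though the source amplitude, source phase, and frequency all differ. This frequency independence is what lets slots from different frequencies fall into one cluster per source.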

Next, clustering is performed using the vectors X''(f, τ) normalized in this way as features, clusters estimated to correspond to the respective signal sources are generated, and time-frequency masks for extracting the samples of each cluster are generated. The frequency-domain signals are multiplied by these time-frequency masks to obtain separated signals in the frequency domain, which are then converted into separated signals in the time domain by an inverse short-time Fourier transform (ISTFT: Inverse Short-Time Fourier Transform).
Non-Patent Document 1: S. Araki, S. Makino, A. Blin, R. Mukai, and H. Sawada, "Underdetermined blind separation for speech in real environments with sparseness and ICA," in Proc. ICASSP 2004, vol. III, May 2004, pp. 881-884.
Non-Patent Document 2: R. O. Duda, P. E. Hart, and D. G. Stork, Pattern Classification, Wiley Interscience, 2nd edition, 2000.
Non-Patent Document 3: S. Araki, H. Sawada, R. Mukai, and S. Makino, "A novel blind source separation method with observation vector clustering," IWAENC2005, pp. 117-120, 2005.

However, in these conventional techniques, so-called spatial aliasing may occur when the relationship
f < c/(2d_max) (d_max = d in the case of two sensors) ... (11)
is not satisfied. When spatial aliasing occurs, appropriate clustering may not be possible. The problem caused by spatial aliasing is explained below.

Assuming sparseness and mutual independence of the signals, the frequency-domain signal X_q(f, τ) can be approximated by the product of the frequency response H_qk(f) from one of the signal sources k to sensor q and the frequency-domain source signal S_k(f, τ) (see equation (5)). That is,
X_q(f, τ) = H_qk(f) S_k(f, τ)
X_Q(f, τ) = H_Qk(f) S_k(f, τ)
In this case, the relative value (difference) between the argument of the frequency-domain signal X_q(f, τ) and the argument of the frequency-domain signal X_Q(f, τ) can be approximated as depending only on the relative value between the argument of the frequency response H_qk(f) and the argument of the frequency response H_Qk(f).

If the frequency response H_qk(f) is approximated by a direct-wave model without reflection or reverberation, it can be expressed as
H_qk(f) = λ_qk·exp[-j·2π·f·γ_qk] ... (12)
Here γ_qk and λ_qk > 0 are model parameters representing, respectively, the arrival time and the attenuation from signal source k to sensor q. The arrival time γ_qk from signal source k to sensor q can be written as
γ_qk = d_kq/c
where d_kq is the distance between signal source k and sensor q. Equation (12) can therefore be written as
H_qk(f) = λ_qk·exp[-j·2π·f·d_kq/c] ... (13)

When the relationship of equation (11) is satisfied, the relative value 2π·f·(d_kQ - d_kq)/c between the argument of the frequency response H_qk(f) and the argument of the frequency response H_Qk(f) always lies in the range from -π (inclusive) to π (exclusive), because the path difference |d_kQ - d_kq| cannot exceed the distance between the two sensors and hence cannot exceed d_max. The relative value between the argument of the frequency-domain signal X_q(f, τ) and the argument of the frequency-domain signal X_Q(f, τ) therefore also always lies in this range. This relative argument is proportional both to the arrival time difference γ_Qk - γ_qk of the signal from signal source k to sensor q and to reference sensor Q, and to the frequency f.

In each of the conventional examples described above, the argument of the frequency-domain signal X_q(f, τ) for each sensor q is normalized with respect to the argument of the frequency-domain signal X_Q(f, τ) for reference sensor Q, and the normalized argument is divided by a value proportional to the frequency f to generate the feature (see, for example, equations (9) and (10)). The argument of the feature is then proportional to the arrival time difference γ_Qk - γ_qk of the signal from signal source k to sensor q and to reference sensor Q. Because this arrival time difference depends only on the relative positions of signal source k, sensor q, and reference sensor Q, the features form clusters that depend on the position of signal source k.

On the other hand, when the relationship of equation (11) is not satisfied, the relative value between the argument of the frequency-domain signal X_q(f, τ) and the argument of the frequency-domain signal X_Q(f, τ) may extend below -π or to π and above. In the analysis of actual mixed signals, however, a relative argument that has left this range can only be determined as the equivalent argument within the range from -π (inclusive) to π (exclusive). In this case, as the frequency f increases, the relative argument jumps cyclically at a fixed period (from π to -π, or from -π to π) and is no longer proportional to f over the whole frequency range. Consequently, even if the arguments are normalized as in the conventional examples above and divided by a value proportional to the frequency f to generate features, it is difficult for clustering to form clusters from which the position of each signal source can be correctly identified. This is because features corresponding to the same signal source do not form a cluster in the region above the frequency at which the cyclic jump occurs.
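The cyclic jump can be reproduced numerically. The sketch below assumes a propagation speed of 340 m/s and a path difference equal to a 4 cm sensor spacing; it wraps the true phase difference into the principal range, as a measurement would, and then divides by 2πf to obtain a frequency-normalized delay feature. The function names are hypothetical.

```python
import cmath
import math

C = 340.0  # assumed propagation speed in m/s


def observed_phase_difference(f, path_difference):
    """Phase difference as it can be measured: wrapped into the principal range."""
    true_phase = 2.0 * math.pi * f * path_difference / C
    return cmath.phase(cmath.exp(1j * true_phase))


def delay_estimate(f, path_difference):
    """Observed phase difference divided by 2*pi*f (a frequency-normalized feature)."""
    return observed_phase_difference(f, path_difference) / (2.0 * math.pi * f)


path_difference = 0.04                 # metres, equal to the sensor spacing d
limit = C / (2.0 * path_difference)    # about 4250 Hz, from equation (11)
below = delay_estimate(2000.0, path_difference)   # below the limit
above = delay_estimate(6000.0, path_difference)   # above the limit
```

At 2000 Hz the feature equals the true delay 0.04/340 s. At 6000 Hz the phase has wrapped, so the same source yields a negative delay estimate and would land in a different place in feature space.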

The present invention has been made in view of these points. Its object is to extend the conventional techniques described above and to provide a technique that can perform clustering suitably and generate appropriate separated signals even in frequency bands where spatial aliasing can occur. That is, the object of the present invention is to provide a technique that is applicable when spatial aliasing does not occur, and that can also obtain separated signals suitably when spatial aliasing does occur.

In the first aspect of the present invention, to solve the above problem, a frequency-domain conversion unit first converts each of the mixed signals observed by a plurality of sensors into a mixed signal in the frequency domain (frequency-domain signal). Next, a normalization unit normalizes the phase of the frequency-domain signals and removes the frequency-dependent component to generate frequency-normalized signals. The norm of a frequency-normalized signal may or may not be normalized. A first clustering unit then clusters feature vectors for each time-frequency slot, whose elements are the frequency-normalized signals or values obtained by substituting the frequency-normalized signals into a predetermined function, and computes the centroid (the center vector) of each cluster. A model parameter extraction unit uses these centroids to compute the model parameters of a frequency response model. After that, a second clustering unit clusters frequency-dependent phase/norm-normalized vectors, whose elements are phase/norm-normalized values of the frequency-domain signals, with reference to the frequency response model into which the model parameters have been substituted, or into which model parameters obtained by optimizing them have been substituted, and generates cluster information corresponding to the cluster to which each phase/norm-normalized vector belongs. A separated-signal generation unit then generates separated signals in the frequency domain using the frequency-domain signals and the cluster information. Finally, the frequency-domain separated signals are converted into time-domain separated signals.
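The centroid computation of the first clustering stage can be sketched with a plain k-means loop, the clustering method that the background section cites as an example. This is a simplified sketch, not the patent's procedure: it assumes scalar features and hand-picked initial centroids, whereas the feature vectors described above are multi-dimensional; the name `kmeans_1d` is hypothetical.

```python
def kmeans_1d(values, initial_centroids, iterations=20):
    """Alternate nearest-centroid assignment and centroid (mean) update."""
    centroids = list(initial_centroids)
    for _ in range(iterations):
        groups = [[] for _ in centroids]
        for v in values:
            nearest = min(range(len(centroids)), key=lambda i: abs(v - centroids[i]))
            groups[nearest].append(v)
        # Keep the previous centroid if a cluster received no members.
        centroids = [sum(g) / len(g) if g else c for g, c in zip(groups, centroids)]
    return centroids


# Made-up features from a low-frequency band, scattered around two values.
features = [1.0, 1.2, 0.8, 5.0, 5.2, 4.8]
centroids = kmeans_1d(features, initial_centroids=[0.0, 6.0])
```

In the scheme described above, each centroid would then be handed to the model parameter extraction unit to obtain the parameters of the frequency response model for one signal source.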

If most of the feature vectors clustered by the first clustering unit have not undergone the cyclic jump described above, the centroid of each generated cluster approaches the frequency-normalized frequency response model (details are given later). The centroids can therefore be used to estimate the model parameters of the frequency response model. The second clustering unit clusters the frequency-dependent phase/norm-normalized vectors with reference to the frequency response model into which these model parameters, or the optimized model parameters, have been substituted. The model parameters themselves do not depend on frequency, but the frequency response model with the parameters substituted does. Because the second clustering unit clusters frequency-dependent phase/norm-normalized vectors against this frequency-dependent reference, clusters corresponding to the respective signal sources can be generated suitably even when the cyclic jump has occurred. Separated signals can therefore be obtained suitably by using the information on these clusters.
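The idea of clustering against a frequency-dependent reference can be sketched as a nearest-model assignment. Everything specific here is an assumption made for illustration: the phase/norm normalization is taken to be a rotation that zeroes the phase at the reference sensor followed by scaling to unit norm, the reference is the direct-wave model of equation (12) treated the same way, the distance is squared Euclidean, and all names and parameter values are hypothetical.

```python
import cmath
import math


def phase_norm_normalize(X, Q=0):
    """Rotate so that sensor Q has zero phase, then scale to unit norm (assumed form)."""
    rotation = cmath.exp(-1j * cmath.phase(X[Q]))
    norm = math.sqrt(sum(abs(x) ** 2 for x in X))
    return [x * rotation / norm for x in X]


def model_vector(f, lam, gamma, Q=0):
    """Direct-wave model lam_q * exp(-j*2*pi*f*gamma_q), phase-referenced to sensor Q."""
    return [l * cmath.exp(-2j * math.pi * f * (g - gamma[Q]))
            for l, g in zip(lam, gamma)]


def assign_cluster(X, f, models, Q=0):
    """Index of the source model closest to the normalized observation at frequency f."""
    xn = phase_norm_normalize(X, Q)
    distances = [
        sum(abs(a - b) ** 2 for a, b in zip(xn, model_vector(f, lam, gamma, Q)))
        for lam, gamma in models
    ]
    return min(range(len(models)), key=distances.__getitem__)


C = 340.0                        # assumed propagation speed in m/s
LAM = [2 ** -0.5, 2 ** -0.5]     # unit-norm attenuations for two sensors
MODELS = [(LAM, [0.0, 0.04 / C]),     # hypothetical source 0
          (LAM, [0.0, -0.03 / C])]    # hypothetical source 1


def observe(k, f, S):
    """Single-source observation generated from model k with source value S."""
    lam, gamma = MODELS[k]
    return [l * cmath.exp(-2j * math.pi * f * g) * S for l, g in zip(lam, gamma)]


f_aliased = 6000.0  # above c / (2 * 0.04 m) = 4250 Hz
labels = [assign_cluster(observe(k, f_aliased, 1.3 * cmath.exp(0.7j)), f_aliased, MODELS)
          for k in (0, 1)]
```

Because the model is evaluated at the same frequency as the observation, both wrap in the same way, so the comparison remains valid above the aliasing limit. No division by frequency is needed.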

In the first aspect of the present invention, preferably, the first clustering unit clusters the feature vectors in a frequency band that includes frequencies at which spatial aliasing cannot occur, and the second clustering unit clusters the phase/norm-normalized vectors in a frequency band that includes frequencies at which spatial aliasing can occur.

Because the first clustering unit clusters feature vectors in a frequency band that includes frequencies at which spatial aliasing cannot occur, most of the feature vectors being clustered are free of the cyclic jump described above. As a result, valid centroids can be generated. Because the second clustering unit clusters phase/norm-normalized vectors in a frequency band that includes frequencies at which spatial aliasing can occur, it also generates clusters for frequencies that the first clustering unit did not cover. As a result, separated signals can be generated appropriately even at frequencies where spatial aliasing occurs.

The "frequency band that includes frequencies at which spatial aliasing cannot occur" mentioned above may partly include frequencies at which spatial aliasing can occur, but it preferably consists only of frequencies at which spatial aliasing cannot occur. The first clustering unit can then generate centroids using only feature vectors that have not undergone a cyclic jump, which prevents cyclic jumps from degrading the accuracy of the centroids. The "frequency band that includes frequencies at which spatial aliasing can occur" mentioned above is preferably the entire frequency band (the frequency band corresponding to all time-frequency slots).

In the first aspect of the present invention, preferably, a model parameter optimization unit optimizes the model parameters using the cluster information and the phase/norm-normalized vectors corresponding to a frequency band that includes frequencies at which spatial aliasing can occur. Information corresponding to frequencies at which spatial aliasing can occur is thereby also reflected in the model parameters, which improves their accuracy.

More preferably, each model parameter corresponds to a pair consisting of a sensor and the signal source corresponding to a cluster. The model parameter optimization unit uses a cost function formed as the sum of the distances between (a) each element of a phase/norm-normalized vector and (b) the frequency response model into which has been substituted the model parameter for the pair consisting of the sensor corresponding to that element and the signal source corresponding to the cluster to which, according to the cluster information, that phase/norm-normalized vector belongs. It computes the partial derivative of this cost function with respect to a model parameter and optimizes the model parameter by subtracting from it a value proportional to the computed partial derivative. This makes it possible to improve the accuracy of the model parameters by using information from feature vectors that were not used in the clustering by the first clustering unit.
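The update described above, subtracting a value proportional to the partial derivative of the cost function from a model parameter, can be sketched as gradient descent on a single arrival-time parameter. This is an illustration under stated assumptions: the distance is squared Euclidean, only one sensor-source pair is optimized, the attenuation is fixed at 1, and the step size, data, and function names are made up for the example.

```python
import cmath
import math


def cost_gradient_gamma(samples, lam, gamma):
    """Partial derivative with respect to gamma of
    sum over (f, x) of |x - lam * exp(-j*2*pi*f*gamma)|^2."""
    grad = 0.0
    for f, x in samples:
        theta = 2.0 * math.pi * f * gamma
        grad += 2.0 * math.pi * f * 2.0 * lam * (x * cmath.exp(1j * theta)).imag
    return grad


def optimize_gamma(samples, lam, gamma, step, iterations):
    """Repeatedly subtract a value proportional to the partial derivative."""
    for _ in range(iterations):
        gamma -= step * cost_gradient_gamma(samples, lam, gamma)
    return gamma


true_gamma = 1.2e-4                      # seconds, made-up ground truth
freqs = [1000.0, 2000.0, 3000.0]
samples = [(f, cmath.exp(-2j * math.pi * f * true_gamma)) for f in freqs]
estimate = optimize_gamma(samples, lam=1.0, gamma=1.0e-4, step=5e-10, iterations=200)
```

Starting from an initial value of 1.0e-4 s, the estimate converges to the true arrival time. The step size must be small because the derivative scales with the square of the frequency, and a poor initial value can leave the iteration in a different local minimum of the periodic cost.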

Further preferably, the model parameter optimization process and the cluster information generation process are executed alternately until a predetermined termination condition is satisfied. The separated-signal generation unit then generates the separated signals using the cluster information that is the latest at the time the termination condition is satisfied. This improves the accuracy of the model parameters and yields highly accurate separated signals. The model parameter optimization in the model parameter optimization unit is preferably executed using the phase/norm-normalized vectors corresponding to all time-frequency slots (that is, to all frequencies), because this further increases the accuracy of the model parameters.

In the first aspect of the present invention, the first clustering unit may generate initial cluster information corresponding to the clusters to which the feature vectors for a frequency band that includes frequencies at which spatial aliasing cannot occur respectively belong; the second clustering unit may generate cluster information corresponding to the clusters to which the phase/norm-normalized vectors for a frequency band that includes frequencies at which spatial aliasing can occur respectively belong; and the separated-signal generation unit may generate the separated signals using both the initial cluster information generated by the first clustering unit and the cluster information generated by the second clustering unit. When the initial cluster information generated by the first clustering unit is used to generate the separated signals in this way, the second clustering unit no longer needs to generate cluster information for frequencies that overlap with the initial clusters generated by the first clustering unit. As a result, the overall amount of computation can be reduced.

In the second aspect of the present invention, a frequency domain conversion unit first converts each of the mixed signals observed by a plurality of sensors into a frequency domain signal, and a model parameter selection unit selects initial values of the model parameters of a frequency response model. Then, until a predetermined termination condition is satisfied, two processes are repeated. The first clusters frequency-dependent phase/norm normalized vectors, whose elements are the phase/norm normalized values of the frequency domain signals, with reference to the frequency response model into which the model parameters have been substituted, and generates cluster information corresponding to the cluster to which each phase/norm normalized vector belongs. The second optimizes the model parameters using the cluster information and the phase/norm normalized vectors. Finally, a separated signal generation unit generates frequency domain separated signals using the frequency domain signals and the cluster information that is latest at the time the termination condition is satisfied.

Here, the clustering in the second aspect of the present invention is applied only to frequency-dependent phase/norm normalized vectors, so the problem caused by the cyclic jump described above does not arise.

The present invention extends the conventional technique described above: clustering can be performed suitably even in frequency bands where spatial aliasing can occur, and appropriate separated signals can be generated.

The best mode for carrying out the present invention is described below with reference to the drawings. The principle of the embodiments is explained first, and then each embodiment is described.
[Summary of Principle]
In each embodiment, the mixed signals observed by a plurality of sensors are first converted into frequency domain signals, and feature vectors obtained from them by normalization and the like are clustered. In the examples of each embodiment, this clustering (the clustering in the "first clustering unit") is performed only on feature vectors at frequencies where spatial aliasing cannot occur. Analyzing the clustering result gives the model parameters of a frequency response model that expresses how the signals were mixed, namely the normalized arrival time γ_qk and the amplitude gain λ_qk > 0 from signal source k to sensor q (q∈{1,...,M}). The phase/norm normalized vectors described above are then clustered (the clustering in the "second clustering unit") with reference to the frequency response model into which these model parameters γ_qk and λ_qk have been substituted. In the examples of each embodiment, the second clustering unit clusters the phase/norm normalized vectors of the entire frequency band, including frequencies where spatial aliasing can occur. Each cluster generated in this way corresponds to one signal source k, so the separated signals can be estimated from the cluster information.

However, the model parameters γ_qk and λ_qk above are calculated using only feature vectors at frequencies where spatial aliasing cannot occur, so they may not be very accurate. In the second embodiment, therefore, the model parameters γ_qk and λ_qk are optimized by repeatedly adding or subtracting small correction values, using the phase/norm normalized vectors of the entire frequency band, including frequencies where spatial aliasing can occur.

[Details of Principle]
Next, the principle is described in detail.
[Frequency response model and frequency domain signal]
Each embodiment assumes sparseness and mutual independence of the signals. In this case, each frequency domain signal X_q(f,τ) carries information corresponding to the frequency response from a signal source to a sensor (see equations (5) and (6)). Therefore, by analyzing the frequency domain signals X_q(f,τ), the model parameters of a frequency response model expressing the frequency response from the signal sources to the sensors can be calculated. In each embodiment, the frequency response from signal source k to sensor q is approximated by the direct-wave model
H_qk(f) = λ_qk · exp[−j · 2π · f · γ_qk] … (14)
Here, the model parameters γ_qk and λ_qk > 0 represent the arrival time and the attenuation (amplitude gain), respectively, from signal source k to sensor q. In practice, however, it is difficult to distinguish the phase and amplitude of the frequency domain source signal S_k(f,τ) from the phase and amplitude of the frequency response H_qk(f). Each embodiment therefore treats λ_qk and γ_qk as relative quantities and applies a normalization. The arrival time γ_qk is taken to be 0 at a reference sensor Q, that is, γ_Qk = 0. The amplitude gain λ_qk is also normalized, using a positive real number α (for example, α = 1).

This normalization is also reflected in the mixed signal vector X(f,τ) = [X_1(f,τ), …, X_M(f,τ)]^T by normalizing its phase and norm. That is, the arguments of the elements X_1(f,τ), ..., X_M(f,τ) are normalized so that the argument of the element X_Q(f,τ) corresponding to the reference sensor Q becomes 0, and the vector is then normalized so that its norm becomes α (for example, 1). The vector normalized in this way is called the phase/norm normalized vector X'(f,τ) = [X_1'(f,τ), …, X_M'(f,τ)]^T. Such normalization can be achieved, for example, by equation (15), in which arg[·] denotes the argument of ·.
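The phase/norm normalization described above can be sketched as follows. This is a minimal illustration rather than the normalization of equation (15) itself: it assumes the Euclidean norm, rotates every element by the argument of the reference element, and does not guard against an all-zero vector. The function name `phase_norm_normalize`, the zero-based index `ref`, and all numeric values are hypothetical.

```python
import cmath
import math

def phase_norm_normalize(x, ref, alpha=1.0):
    """Rotate x so that x[ref] has argument 0, then scale it to norm alpha."""
    rotation = cmath.exp(-1j * cmath.phase(x[ref]))
    norm = math.sqrt(sum(abs(v) ** 2 for v in x))
    return [alpha * v * rotation / norm for v in x]

# Hypothetical two-sensor example at f = 1 kHz: one active source whose
# direct-wave model has gains (0.6, 0.8) and delays (0 s, 0.2 ms).
f = 1000.0
gains, delays = [0.6, 0.8], [0.0, 0.2e-3]
model = [g * cmath.exp(-2j * math.pi * f * d) for g, d in zip(gains, delays)]
source = 3.0 * cmath.exp(1.1j)            # arbitrary source amplitude and phase
observed = [h * source for h in model]    # X_q = H_qk(f) * S_k
normalized = phase_norm_normalize(observed, ref=0)
```

In this noise-free case the unknown amplitude and phase of the source cancel, so `normalized` coincides with the model vector whose reference delay is 0 and whose gain vector has norm 1.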

When normalization is performed as above, a given phase/norm normalized vector X'(f,τ) approaches equation (14), which models the frequency response from some signal source k to the sensors q (hereinafter called the "frequency response model"). That is, the phase/norm normalized vectors X'(f,τ) that minimize the sum of the distances between their elements and the frequency response model (summed over the sensors and over the time-frequency slots (f,τ)) can be estimated to correspond to signal source k, and the separated signals can be estimated from this result.

Which phase/norm normalized vector X'(f,τ) corresponds to which frequency response model H_qk(f) differs for each time-frequency slot (f,τ). This correspondence between the phase/norm normalized vectors X'(f,τ) and the frequency response models H_qk(f) can be expressed as the one that minimizes the cost function G of equation (16).

Here, C(f,τ) (called "cluster information") is information identifying the cluster obtained by clustering some feature (for example, a phase/norm normalized vector or a feature vector) corresponding to each time-frequency slot (f,τ). That is, C(f,τ) = k for a time-frequency slot (f,τ) means that the feature corresponding to that slot belongs to the k-th cluster. A summation Σ with the condition C(f,τ) = k means summation over all time-frequency slots (f,τ) for which C(f,τ) = k.
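A cost of this kind can be sketched as follows. The sketch assumes that the distance is the squared absolute difference between each element X_q'(f,τ) and the model H_qk(f), summed over sensors and over the slots assigned to each cluster; the exact form of equation (16) may differ. The dictionary layout and the names `model_vector` and `cost_g` are hypothetical.

```python
import cmath
import math

def model_vector(f, gains_k, delays_k):
    """Direct-wave model of equation (14) for every sensor q of one source k."""
    return [g * cmath.exp(-2j * math.pi * f * d) for g, d in zip(gains_k, delays_k)]

def cost_g(slots, cluster, gains, delays):
    """slots maps (f, tau) to X'(f, tau); cluster maps (f, tau) to an index k."""
    total = 0.0
    for (f, tau), x in slots.items():
        k = cluster[(f, tau)]
        h = model_vector(f, gains[k], delays[k])
        total += sum(abs(xq - hq) ** 2 for xq, hq in zip(x, h))
    return total

# Hypothetical parameters for two sources and two sensors.
gains = [[0.6, 0.8], [0.8, 0.6]]
delays = [[0.0, 1.0e-4], [0.0, -1.0e-4]]
slots = {(500.0, 0): model_vector(500.0, gains[0], delays[0]),
         (500.0, 1): model_vector(500.0, gains[1], delays[1])}
matched = cost_g(slots, {(500.0, 0): 0, (500.0, 1): 1}, gains, delays)
swapped = cost_g(slots, {(500.0, 0): 1, (500.0, 1): 0}, gains, delays)
```

With noise-free slots the matching assignment gives a cost of zero, and swapping the two labels gives a strictly larger cost.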

[Solution of cost function G]
In each embodiment, to solve the problem of minimizing the cost function G of equation (16), clusters are first determined for each frequency f in the frequency band F_L ⊆ F in which spatial aliasing cannot occur, and initial values of the model parameters λ_qk and γ_qk of the frequency response model H_qk(f) (equation (14)) from each signal source k to each sensor q are calculated. Here F is the set of all frequencies to be considered, that is, F = {0, f_s/L, …, f_s(L−1)/L}, where f_s is the sampling frequency. The frequency band F_L in which spatial aliasing cannot occur can be calculated as
F_L = {f : 0 < f < c/(2d_max)} … (17)
where d_max is the maximum distance between the reference sensor Q and any other sensor q.
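Equation (17) can be evaluated directly on the frequency grid F. In the sketch below the function name `alias_free_band`, the default propagation speed of 340 m/s, and the example values are assumptions for illustration.

```python
def alias_free_band(fs, frame_len, d_max, c=340.0):
    """Return the frequencies of F = {0, fs/L, ..., fs(L-1)/L} that lie in F_L."""
    limit = c / (2.0 * d_max)                     # upper bound of equation (17)
    freqs = [fs * i / frame_len for i in range(frame_len)]
    return [f for f in freqs if 0.0 < f < limit]

# Hypothetical setup: 8 kHz sampling, L = 8 bins, sensors at most 4 cm apart.
band = alias_free_band(fs=8000.0, frame_len=8, d_max=0.04)
```

Here the bound is 340 / 0.08 = 4250 Hz, so only the bins at 1, 2, 3 and 4 kHz belong to F_L.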

When only the frequency band F_L in which spatial aliasing cannot occur is considered, the cost function G of equation (16) can be minimized easily by the following procedure.
First, for every frequency f belonging to the frequency band F_L, frequency normalization that removes the frequency-dependent component of the phase/norm normalized vector X'(f,τ) is performed, for example according to equation (18). In each embodiment, the vector X''(f,τ) = [X_1''(f,τ), ..., X_M''(f,τ)]^T composed of the elements X_q''(f,τ) normalized in this way is used as the feature vector.
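One way to remove the frequency dependence is to divide the argument of each element by a factor proportional to f. The sketch below assumes the factor 4·f·d_max/c, which keeps the normalized argument within ±π/2 whenever the relative delay is at most d_max/c; this is an illustrative choice and not necessarily the form of equation (18). The function name and all numeric values are hypothetical.

```python
import cmath
import math

def frequency_normalize(x_prime, f, d_max, c=340.0):
    """Keep each magnitude and divide each argument by 4*f*d_max/c (assumed factor)."""
    scale = 4.0 * f * d_max / c
    return [abs(v) * cmath.exp(1j * cmath.phase(v) / scale) for v in x_prime]

def model_vector(f, gains_k, delays_k):
    """Direct-wave model of equation (14) for every sensor q of one source k."""
    return [g * cmath.exp(-2j * math.pi * f * d) for g, d in zip(gains_k, delays_k)]

# Same hypothetical source observed at two alias-free frequencies.
gains, delays, d_max = [0.6, 0.8], [0.0, 1.0e-4], 0.04
feature_1k = frequency_normalize(model_vector(1000.0, gains, delays), 1000.0, d_max)
feature_3k = frequency_normalize(model_vector(3000.0, gains, delays), 3000.0, d_max)
```

Both frequencies give the same feature vector, which is why features from different frequencies in F_L can be clustered together.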

Using the elements X_q''(f,τ) of the feature vector X''(f,τ), the cost function G of equation (16) is rewritten as the cost function G' of equation (19). Here H_qk' is obtained by applying to the frequency response model H_qk(f) of equation (14) the same normalization as in equation (18), which removes the frequency-dependent component. H_qk' is called the normalized frequency response model.
Equation (19) can also be written in vector notation, using the vector H_k' = [H_1k', ..., H_Mk']^T whose elements are the normalized frequency response models H_qk'.

[Determination of clusters for each frequency f in the frequency band F_L ⊆ F]
When only the frequencies f in the frequency band F_L ⊆ F are considered, the feature vectors X''(f,τ) form a cluster for each signal source k (see, for example, Non-Patent Document 3). The vector H_k' = [H_1k', ..., H_Mk']^T, whose elements are the normalized frequency response models H_qk' described above, can be regarded as the centroid (center vector) of each cluster. Therefore, as in the well-known k-means method (see, for example, Non-Patent Document 2), applying the following <Calculation 1> and <Calculation 2> alternately and repeatedly yields the clusters that minimize the cost function G' of equation (19) and their centroids H_k'.

<Calculation 1> Based on the latest cluster information C(f,τ), the centroid H_k' of each cluster is calculated according to equation (21). This calculation finds the optimal centroids H_k' for the latest cluster information C(f,τ). Like the feature vectors X''(f,τ), each centroid H_k' is normalized so that its norm as a vector is α (α = 1 in the example of equation (21)).

<Calculation 2> The cluster information C(f,τ) is updated based on the latest centroids H_k'. That is, according to equation (22), the centroid H_k' closest to each feature vector X''(f,τ) is selected, and the index k of the selected centroid is set as the cluster information C(f,τ) of the time-frequency slot (f,τ) corresponding to that feature vector (C(f,τ) = k). In equation (22), argmin_k · denotes the k that minimizes ·.
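The alternation of <Calculation 1> and <Calculation 2> can be sketched in the manner of k-means. The sketch assumes that a centroid is the sum of its member vectors rescaled to unit norm (α = 1) and that closeness is the squared Euclidean distance; equations (21) and (22) may differ in detail. The stopping rule (assignments no longer change) and every name and value below are illustrative assumptions.

```python
import math

def centroid(vectors):
    """<Calculation 1>: sum of the member vectors, rescaled to unit norm."""
    total = [sum(col) for col in zip(*vectors)]
    norm = math.sqrt(sum(abs(v) ** 2 for v in total))
    return [v / norm for v in total]

def nearest(x, centroids):
    """<Calculation 2>: index k of the centroid closest to x."""
    dists = [sum(abs(a - b) ** 2 for a, b in zip(x, h)) for h in centroids]
    return dists.index(min(dists))

def cluster_features(features, initial_centroids, max_iter=50):
    centroids = list(initial_centroids)
    labels = None
    for _ in range(max_iter):
        new_labels = [nearest(x, centroids) for x in features]
        if new_labels == labels:          # assignments stopped changing
            break
        labels = new_labels
        for k in range(len(centroids)):
            members = [x for x, c in zip(features, labels) if c == k]
            if members:                   # an empty cluster keeps its old centroid
                centroids[k] = centroid(members)
    return labels, centroids

# Hypothetical unit-norm feature vectors from two sources.
a = [0.6 + 0j, 0.8 + 0j]
b = [0.8 + 0j, -0.6j]
labels, centroids = cluster_features([a, b, a, b], [[1 + 0j, 0j], [0j, 1 + 0j]])
```

This version starts with <Calculation 2>, using the supplied initial centroids.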

Either <Calculation 1> or <Calculation 2> may be performed first. There is no particular restriction on how to set the initial value of the cluster information C(f,τ) when <Calculation 1> is performed first, or the initial values of the centroids H_k' when <Calculation 2> is performed first. However, it is not preferable to start from cluster information C(f,τ) in which widely distributed feature vectors X''(f,τ) belong to the same cluster, or from centroids H_k' concentrated in a narrow range, because appropriate clusters may then fail to form. Methods for setting preferable initial values are likewise not restricted; examples include the following.

Example of setting the initial values of the centroids H_k':
(1) Randomly generate multiple patterns of combinations of initial value candidates for the centroids H_k' (k∈{1,...,N}).
(2) For each combination, compute the inner product of the initial value candidates of the centroids H_k', and adopt the combination with the smallest inner product as the initial values of the centroids H_k'.

Example of setting the initial value of the cluster information C(f,τ):
(1) Randomly generate multiple patterns of combinations of initial value candidates for the cluster information C(f,τ).
(2) For each combination, calculate the corresponding combination of initial value candidates for the centroids H_k' according to equation (21).
(3) For each combination of initial value candidates for the centroids H_k', compute the inner product of the candidates, and adopt as the initial value of the cluster information C(f,τ) the candidate corresponding to the combination with the smallest inner product.

[Initial value calculation of model parameters λ_qk and γ_qk]
<Calculation 1> and <Calculation 2> are repeated until a predetermined termination condition is satisfied. Using the centroids H_k' obtained when the condition is satisfied, the model parameters λ_qk and γ_qk of the frequency response model H_qk(f) of equation (14) are calculated according to equation (23).
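Reading the model parameters off a centroid amounts to inverting the frequency normalization. The sketch below assumes that the normalization divides the argument by 4·f·d_max/c, so that H_qk' = λ_qk · exp[−j · (π/2) · (c/d_max) · γ_qk]; under that assumption the gain is the magnitude and the delay follows from the argument. Equation (23) may use a different constant, and the names and values here are hypothetical.

```python
import cmath
import math

def extract_parameters(centroid, d_max, c=340.0):
    """Return (gains, delays) for one source from its centroid H_k'."""
    gains = [abs(h) for h in centroid]
    delays = [-2.0 * d_max * cmath.phase(h) / (math.pi * c) for h in centroid]
    return gains, delays

# Hypothetical centroid built from known parameters, then inverted.
d_max, c = 0.04, 340.0
true_gains, true_delays = [0.6, 0.8], [0.0, 1.0e-4]
centroid = [g * cmath.exp(-1j * (math.pi / 2.0) * (c / d_max) * d)
            for g, d in zip(true_gains, true_delays)]
gains, delays = extract_parameters(centroid, d_max, c)
```

The round trip recovers the gains (0.6, 0.8) and the delays (0 s, 0.1 ms) used to build the centroid.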

[Solution of cost function G]
Next, using the obtained model parameters λ_qk and γ_qk, the phase/norm normalized vectors X'(f,τ) are clustered for all frequencies f∈F so as to minimize the cost function G of equation (16). In the simplest method, the obtained model parameters λ_qk and γ_qk are used as they are: the cluster information C(f,τ) is generated according to equation (24), and each phase/norm normalized vector X'(f,τ) is assigned to a cluster.
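The assignment step can be sketched as picking, for each time-frequency slot, the source whose frequency response model is closest to X'(f,τ). Squared Euclidean distance is assumed here, so equation (24) may differ in detail. Because the model of equation (14) is evaluated at the actual frequency f, the comparison also works above the aliasing limit, where phases wrap. All names and values are hypothetical.

```python
import cmath
import math

def assign_cluster(x_prime, f, gains, delays):
    """Return the index k whose model H_k(f) is closest to x_prime."""
    costs = []
    for gains_k, delays_k in zip(gains, delays):
        h = [g * cmath.exp(-2j * math.pi * f * d) for g, d in zip(gains_k, delays_k)]
        costs.append(sum(abs(x - hq) ** 2 for x, hq in zip(x_prime, h)))
    return costs.index(min(costs))

# Hypothetical parameters for two sources; the slot is at 6 kHz, where the
# phase 2*pi*f*delay exceeds pi in magnitude.
gains = [[0.6, 0.8], [0.8, 0.6]]
delays = [[0.0, 1.0e-4], [0.0, -1.0e-4]]
f = 6000.0
x_prime = [g * cmath.exp(-2j * math.pi * f * d) for g, d in zip(gains[1], delays[1])]
k = assign_cluster(x_prime, f, gains, delays)
```

The slot was generated from the second source, so the assignment returns index 1.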

However, the model parameters λ_qk and γ_qk calculated from the frequency band F_L alone may not be very accurate. A better method is therefore to calculate more accurate model parameters λ_qk and γ_qk using the phase/norm normalized vectors X'(f,τ) of all frequencies f∈F. This can be done by repeating the correction of equation (25), which uses the partial derivatives of the cost function G of equation (16) with respect to each of the model parameters λ_qk and γ_qk.

Here μ is a real number representing the step size parameter, and the two partial derivatives are those of the cost function G of equation (16) with respect to the model parameters λ_qk and γ_qk, respectively. A summation Σ with the condition (f,τ)=C_k^f means summation over all time-frequency slots (f,τ) for which C(f,τ) = k and whose frequency is f. imag[·] and real[·] denote functions that extract the imaginary part and the real part of a complex number ·, respectively.

Applying the calculation of the cluster information C(f,τ) by equation (24) and the correction of the model parameters λ_qk and γ_qk by equation (25) alternately and repeatedly yields better cluster information C(f,τ) and better model parameters λ_qk and γ_qk. The repetition continues until a predetermined termination condition is satisfied.
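A correction step of this kind can be sketched as plain gradient descent. The sketch assumes a squared-distance cost, for which the derivative with respect to λ_qk is the sum of 2·(λ_qk − real[X_q'·exp(j·2π·f·γ_qk)]) and the derivative with respect to γ_qk is the sum of 4π·f·λ_qk·imag[X_q'·exp(j·2π·f·γ_qk)] over the slots of cluster k. It also uses separate step sizes for gains and delays, which is an assumption made here because the two parameters have very different scales. The names and values are hypothetical, and equation (25) may differ.

```python
import cmath
import math

def cluster_cost(members, gains, delays):
    """Squared-distance cost of one cluster; members is a list of (f, X')."""
    return sum(abs(x[q] - gains[q] * cmath.exp(-2j * math.pi * f * delays[q])) ** 2
               for f, x in members for q in range(len(gains)))

def gradient_step(members, gains, delays, mu_gain, mu_delay):
    """One descent step on the gains and delays of a single cluster."""
    new_gains, new_delays = [], []
    for q in range(len(gains)):
        d_gain = d_delay = 0.0
        for f, x in members:
            z = x[q] * cmath.exp(2j * math.pi * f * delays[q])
            d_gain += 2.0 * (gains[q] - z.real)
            d_delay += 4.0 * math.pi * f * gains[q] * z.imag
        new_gains.append(gains[q] - mu_gain * d_gain)
        new_delays.append(delays[q] - mu_delay * d_delay)
    return new_gains, new_delays

# Hypothetical cluster: noise-free slots at 1, 2 and 5 kHz from one source.
true_gains, true_delays = [0.6, 0.8], [0.0, 1.0e-4]
members = [(f, [g * cmath.exp(-2j * math.pi * f * d)
                for g, d in zip(true_gains, true_delays)])
           for f in (1000.0, 2000.0, 5000.0)]
gains, delays = [0.7, 0.7], [0.0, 0.9e-4]      # slightly wrong initial values
cost_before = cluster_cost(members, gains, delays)
for _ in range(50):
    gains, delays = gradient_step(members, gains, delays, mu_gain=0.1, mu_delay=5e-10)
cost_after = cluster_cost(members, gains, delays)
```

In a full procedure this step would alternate with reassigning the slots to clusters; the example keeps the assignment fixed to show only the parameter correction.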

[Separated signal generation]
The frequency domain separated signals Y_k(f,τ) are then generated using the cluster information C(f,τ) obtained as described above, for example according to equation (28). Time domain separated signals y_k(t) are then obtained from the frequency domain separated signals Y_k(f,τ) by a short-time inverse Fourier transform or the like.
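A common way to turn cluster information into separated signals is a binary time-frequency mask. The sketch below assumes that Y_k(f,τ) takes the observation of one chosen sensor in the slots where C(f,τ) = k and is zero elsewhere; which observation equation (28) actually uses is not stated here, so this is an illustrative assumption. The dictionary layout and names are hypothetical.

```python
def separate(x_ref, cluster, num_sources):
    """x_ref and cluster are dicts keyed by (f, tau); returns one dict per source."""
    return [{slot: (x if cluster[slot] == k else 0j) for slot, x in x_ref.items()}
            for k in range(num_sources)]

# Hypothetical observation of one sensor in three slots, with cluster labels.
x_ref = {(1000.0, 0): 1 + 2j, (1000.0, 1): -0.5j, (2000.0, 0): 3 + 0j}
cluster = {(1000.0, 0): 0, (1000.0, 1): 1, (2000.0, 0): 0}
y = separate(x_ref, cluster, num_sources=2)
```

Each slot goes to exactly one output, so the outputs add up to the masked observation; a short-time inverse Fourier transform of each output would then give the time domain signals.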

[First Embodiment]
Next, the first embodiment of the present invention is described. In this embodiment, the cluster information C(f,τ) is generated for all frequencies f∈F according to equation (24), without optimizing the model parameters λ_qk and γ_qk.
<Hardware configuration>
FIG. 1 is a block diagram illustrating the hardware configuration of the signal separation device 1 according to the first embodiment.
As illustrated in FIG. 1, the signal separation device 1 of this example has a CPU (Central Processing Unit) 10, an input unit 20, an output unit 30, an auxiliary storage device 40, a RAM (Random Access Memory) 50, a ROM (Read Only Memory) 60, and a bus 70.
The CPU 10 of this example has a control unit 11, an arithmetic unit 12, and a register 13, and executes various arithmetic operations according to the programs read into the register 13. The input unit 20 of this example is an input port to which data is input, a keyboard, a mouse, or the like, and the output unit 30 is an output port that outputs data, a display, or the like. The auxiliary storage device 40 is, for example, a hard disk, an MO (Magneto-Optical disc), or a semiconductor memory. It has a signal separation program area 41, which stores the signal separation program for executing the signal separation processing of this embodiment, and a data area 42, which stores various data such as the time domain mixed signals observed by the sensors. The RAM 50 is, for example, an SRAM (Static Random Access Memory) or a DRAM (Dynamic Random Access Memory), and has a signal separation program area 51 into which the signal separation program is written and a data area 52 into which various data are written. The bus 70 of this example connects the CPU 10, the input unit 20, the output unit 30, the auxiliary storage device 40, the RAM 50, and the ROM 60 so that they can communicate with one another.

<Cooperation between hardware and software>
The CPU 10 of this example, following the OS (Operating System) program it has read, writes the signal separation program stored in the signal separation program area 41 of the auxiliary storage device 40 into the signal separation program area 51 of the RAM 50. Likewise, the CPU 10 writes the various data stored in the data area 42 of the auxiliary storage device 40, such as the time domain mixed signals, into the data area 52 of the RAM 50. The CPU 10 also stores in the register 13 the addresses on the RAM 50 at which the signal separation program and the data were written. The control unit 11 of the CPU 10 then reads these addresses from the register 13 in order, reads the program and data from the areas on the RAM 50 that the addresses indicate, has the arithmetic unit 12 execute the operations specified by the program in order, and stores the results in the register 13.

FIG. 2 is an example block diagram of the signal separation device 1 that is configured when the signal separation program is read into the CPU 10 in this way. FIG. 3 is a block diagram illustrating the details of the mixed signal classification unit 130 in FIG. 2.
As illustrated in FIG. 2, the signal separation device 1 of this embodiment has a memory 100, a frequency domain conversion unit 120, a mixed signal classification unit 130, a separated signal generation unit 140, a time domain conversion unit 150, a control unit 160, and a temporary memory 170.

Here, the memory 100 has storage areas 101 to 111. As illustrated in FIG. 3, the mixed signal classification unit 130 has a frequency band determination unit 131, a normalization unit 132, a first clustering unit 133, a model parameter extraction unit 134, and a second clustering unit 135. The normalization unit 132 has a phase/norm normalization unit 132a and a frequency normalization unit 132b. The first clustering unit 133 has a cluster determination unit 133a and a centroid calculation unit 133b.

<Processing>
Next, the processing of the signal separation device 1 of this embodiment is described. The following deals with a situation in which N source signals are mixed and observed by M sensors. It is assumed that, as preprocessing, the time domain mixed signals x_q(t) (q∈{1,...,M}) observed by the sensors have been stored in the storage area 101 of the memory 100, the value Q indicating the reference sensor Q and the maximum distance d_max between the reference sensor Q and any other sensor q have been stored in the storage area 105, and the initial values of the centroids H_k' or of the cluster information C(f,τ) have been stored in the storage area 107.

FIG. 4 is a flowchart for explaining the processing of the signal separation device 1 according to the first embodiment. FIG. 5 is a flowchart for explaining the details of step S2 in FIG. 4. FIGS. 6 and 7 are flowcharts for explaining the details of step S3 in FIG. 4. The processing of this embodiment is described below with reference to these figures.
Each of the following processes is executed under the control of the control unit 160. Unless otherwise stated, the data produced in the course of processing are stored in and read from the temporary memory 170 one by one.

[Processing of frequency domain conversion unit 120]
First, the frequency domain conversion unit 120 (FIG. 2) reads the time domain mixed signals x_q(t) from the storage area 101 of the memory 100, converts them by a short-time discrete Fourier transform or the like into signals X_q(f,τ) (q∈{1,...,M}) for each time-frequency slot (f,τ) (frequency domain signals), and stores them in the storage area 102 of the memory 100 (step S1).

[Processing of mixed signal classification unit 130]
Next, the normalization unit 132 of the mixed signal classification unit 130 reads the frequency domain signals X_q(f,τ) from the storage area 102 of the memory 100. For the frequencies at which spatial aliasing cannot occur (f∈F_L), the normalization unit 132 normalizes the phase of the frequency domain signals X_q(f,τ) and further removes the frequency-dependent component, generating frequency-normalized signals X_q''(f,τ) (step S2). Step S2 is described in detail later. The generated frequency-normalized signals X_q''(f,τ) are stored in the storage area 104 of the memory 100.

Next, the first clustering unit 133 of the mixed signal classification unit 130 reads the frequency-normalized signals X_q''(f,τ) from the storage area 104 of the memory 100. The first clustering unit 133 then clusters the feature vectors X''(f,τ) = [X_1''(f,τ), ..., X_M''(f,τ)]^T, one for each time-frequency slot, whose elements are the frequency-normalized signals X_q''(f,τ), and calculates the centroid H_k' of each cluster (step S3). Step S3 is described in detail later. The generated centroids H_k' are stored in the storage area 108 of the memory 100.

Next, the model parameter extraction unit 134 of the mixed signal classification unit 130 reads the centroids H_k' from the storage area 108 of the memory 100. Using the elements of each centroid H_k' = [H_1k', ..., H_Mk']^T, the model parameter extraction unit 134 extracts the model parameters γ_qk and λ_qk of the frequency response model H_qk(f) according to equation (23) above (step S4). The extracted model parameters γ_qk and λ_qk are stored in the storage area 109 of the memory 100.

Next, the second clustering unit 135 of the mixed signal classification unit 130 reads the model parameters γ_qk and λ_qk from the storage area 109 of the memory 100. For all frequencies F, and according to equation (24) above, the second clustering unit 135 then clusters the frequency-dependent phase/norm normalized vectors X'(f,τ) = [X_1'(f,τ), ..., X_M'(f,τ)]^T, whose elements are the normalized values X_q'(f,τ) of the frequency domain signals, with reference to the frequency response model H_k(f) into which the model parameters γ_qk and λ_qk have been substituted. This generates the cluster information C(f,τ), which indicates the time-frequency slots (f,τ) corresponding to the cluster to which each phase/norm normalized vector X'(f,τ) belongs (step S5). The generated cluster information C(f,τ) is stored in the storage area 107 of the memory 100.

［分離信号生成部１４０の処理］
次に、分離信号生成部１４０が、メモリ１００の記憶領域１０２，１０７から、それぞれ、周波数領域信号X_q(f,τ)及びクラスタ情報C(f,τ)を読み込む。そして、分離信号生成部１４０は、当該周波数領域信号X_q(f,τ)とクラスタ情報C(f,τ)とを用い、例えば、前述の式（２８）に従い、周波数領域の分離信号Y_k(f,τ)(k∈{1,...,N})を生成する（ステップＳ６）。生成された分離信号Y_k(f,τ)は、メモリ１００の記憶領域１１０に格納される。 [Processing of Separation Signal Generation Unit 140]
Next, the separated signal generation unit 140 reads the frequency-domain signals X_q(f,τ) and the cluster information C(f,τ) from the storage areas 102 and 107 of the memory 100, respectively. Using the frequency-domain signals X_q(f,τ) and the cluster information C(f,τ), the separated signal generation unit 140 then generates the frequency-domain separated signals Y_k(f,τ) (k∈{1,...,N}), for example according to the above-described equation (28) (step S6). The generated separated signals Y_k(f,τ) are stored in the storage area 110 of the memory 100.
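Equation (28) is not reproduced in this part of the text, so the following is only a sketch of one common way to use cluster information in step S6: a binary time-frequency mask applied to the reference-sensor signal. The function name `separate_by_mask`, the nested-list layout, and the 0-based labels are assumptions made for illustration, and whether the mask matches equation (28) exactly is also an assumption.

```python
def separate_by_mask(x_ref, cluster_info, n_sources):
    """Split one spectrogram into n_sources masked copies (hypothetical helper).

    x_ref[f][t]        complex value X_Q(f, tau) observed at the reference sensor
    cluster_info[f][t] integer label C(f, tau); 0-based here, 1..N in the text
    """
    separated = []
    for k in range(n_sources):
        # Keep a time-frequency slot only if it was assigned to cluster k.
        y_k = [
            [x if c == k else 0j for x, c in zip(x_row, c_row)]
            for x_row, c_row in zip(x_ref, cluster_info)
        ]
        separated.append(y_k)
    return separated
```

With such a mask every slot appears in exactly one output, so the outputs add up to the input spectrogram.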

［時間領域変換部１５０の処理］
次に、時間領域変換部１５０が、メモリ１００の記憶領域１１０から分離信号Y_k(f,τ)を読み込み、それを短時間逆フーリエ変換等によって時間領域の分離信号y_k(t)に変換し、メモリ１００の記憶領域１１１に格納する（ステップＳ７）。
［ステップＳ２の処理の詳細］
次に、ステップＳ２の処理の詳細を説明する。
まず、周波数帯域決定部１３１が、メモリ１００の記憶領域１０５から値ｄ_ｍａｘを読み込み、空間的エイリアシングが起き得ない周波数帯域Ｆ_Ｌを
F_L={f: 0＜f＜c/(2d_max)}
によって算出する（ステップＳ１１）。算出された値Ｆ_Ｌは、メモリ１００の記憶領域１０６に格納される。 [Processing of time domain conversion unit 150]
Next, the time domain conversion unit 150 reads the separated signals Y_k(f,τ) from the storage area 110 of the memory 100, converts them into the time-domain separated signals y_k(t) by an inverse short-time Fourier transform or the like, and stores them in the storage area 111 of the memory 100 (step S7).
[Details of Step S2 Processing]
Next, details of the process of step S2 will be described.
First, the frequency band determination unit 131 reads the value d_max from the storage area 105 of the memory 100 and calculates the frequency band F_L in which spatial aliasing cannot occur as
F_L = {f : 0 < f < c/(2d_max)}
(step S11). The calculated band F_L is stored in the storage area 106 of the memory 100.
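The band F_L of step S11 can be illustrated with a short sketch. The names `aliasing_free_cutoff` and `aliasing_free_band`, the default c = 340 m/s, and the assumption that d_max is given in metres are illustrative choices and do not come from the text.

```python
SPEED_OF_SOUND = 340.0  # m/s; assumed value, the text only calls it c


def aliasing_free_cutoff(d_max, c=SPEED_OF_SOUND):
    """Upper edge c / (2 * d_max) of the band F_L, in Hz."""
    if d_max <= 0:
        raise ValueError("d_max must be positive")
    return c / (2.0 * d_max)


def aliasing_free_band(frequencies, d_max, c=SPEED_OF_SOUND):
    """Return the frequencies f that satisfy 0 < f < c / (2 * d_max)."""
    cutoff = aliasing_free_cutoff(d_max, c)
    return [f for f in frequencies if 0.0 < f < cutoff]
```

For example, under these assumptions d_max = 0.04 m gives a cutoff of about 4250 Hz, so frequency bins above that value would be left to the second clustering stage.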

次に、正規化部１３２の位相・ノルム正規化部１３２ａが、メモリ１００の記憶領域１０５から基準センサＱを示す値Ｑを読み込み、さらに、記憶領域１０２から各周波数領域信号X_q(f,τ)と基準センサＱに対応する各周波数領域信号X_Q(f,τ)とを読み込む。そして、位相・ノルム正規化部１３２ａは、周波数領域信号X_q(f,τ)の偏角を、基準センサQに対応する周波数領域信号X_Q(f,τ)の偏角を基準に正規化し、その正規化値のノルムを所定値に正規化した値を各要素とする位相・ノルム正規化ベクトルX’(f,τ)=[X₁’(f,τ),...,X_M’(f,τ)]^Tを生成する（ステップＳ１２）。なお、この正規化は、例えば、前述の式（１５）に従って行われる。生成された位相・ノルム正規化ベクトルX’(f,τ)は、メモリ１００の記憶領域１０３に格納される。 Next, the phase/norm normalization unit 132a of the normalization unit 132 reads the value Q indicating the reference sensor Q from the storage area 105 of the memory 100, and further reads, from the storage area 102, each frequency-domain signal X_q(f,τ) and the frequency-domain signal X_Q(f,τ) corresponding to the reference sensor Q. The phase/norm normalization unit 132a then normalizes the argument (phase angle) of each frequency-domain signal X_q(f,τ) with reference to the argument of the frequency-domain signal X_Q(f,τ) of the reference sensor Q, normalizes the norm of the resulting values to a predetermined value, and generates the phase/norm normalized vector X'(f,τ) = [X₁'(f,τ), ..., X_M'(f,τ)]^T having these values as its elements (step S12). This normalization is performed, for example, according to the above-described equation (15). The generated phase/norm normalized vector X'(f,τ) is stored in the storage area 103 of the memory 100.
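As a rough illustration of step S12, the sketch below subtracts the reference sensor's argument from every element and then rescales the vector to a fixed norm. Equation (15) is not reproduced in this part of the text, so the exact form is an assumption; the function name `phase_norm_normalize` and the default target norm of 1 are hypothetical.

```python
import cmath
import math


def phase_norm_normalize(x, ref, target_norm=1.0):
    """Phase/norm normalization of one time-frequency slot (illustrative only).

    x   list of complex observations X_q(f, tau), one per sensor
    ref index of the reference sensor Q
    """
    ref_phase = cmath.phase(x[ref])
    # Rotate every element so that the reference sensor ends up with zero phase.
    rotated = [abs(v) * cmath.exp(1j * (cmath.phase(v) - ref_phase)) for v in x]
    norm = math.sqrt(sum(abs(v) ** 2 for v in rotated))
    if norm == 0.0:
        return rotated  # an all-zero slot cannot be rescaled
    return [target_norm * v / norm for v in rotated]
```

After this step the vector no longer depends on the phase or the amplitude of the source itself, only on the relative response between sensors.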

次に、正規化部１３２の周波数正規化部１３２ｂが、メモリ１００の記憶領域１０３，１０６から、それぞれ、位相・ノルム正規化ベクトルX’(f,τ)と値Ｆ_Ｌとを読み込む。そして、周波数正規化部１３２ｂは、例えば、式（１８）に従い、周波数帯域F_Lに属する全ての周波数fについて、位相・ノルム正規化ベクトルX’(f,τ)の各要素X’_q(f,τ)の偏角を周波数fに比例した値で除算する正規化を行い、位相・ノルム正規化ベクトルX’(f,τ)の周波数依存性を排除した周波数正規化信号X_q''(f,τ)(f∈F_L)を生成する（ステップＳ１３）。生成された周波数正規化信号X_q''(f,τ)は、メモリ１００の記憶領域１０４に格納される。 Next, the frequency normalization unit 132b of the normalization unit 132 reads the phase/norm normalized vector X'(f,τ) and the value F_L from the storage areas 103 and 106 of the memory 100, respectively. Then, for every frequency f belonging to the frequency band F_L, the frequency normalization unit 132b divides the argument of each element X'_q(f,τ) of the phase/norm normalized vector X'(f,τ) by a value proportional to the frequency f, for example according to equation (18), and thereby generates the frequency normalized signals X_q''(f,τ) (f∈F_L), from which the frequency dependence of the phase/norm normalized vector X'(f,τ) has been removed (step S13). The generated frequency normalized signals X_q''(f,τ) are stored in the storage area 104 of the memory 100.
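The effect of step S13 can be seen in a small sketch. A pure delay produces a phase that grows linearly with f, so dividing the argument by a value proportional to f leaves a phase that is the same at every frequency, provided the original phase has not wrapped (which is the reason for restricting f to F_L). The function name `frequency_normalize` and the proportionality constant `scale` are assumptions; equation (18) itself is not reproduced here.

```python
import cmath
import math


def frequency_normalize(x_norm, f, scale):
    """Divide the argument of each element by scale * f and keep its magnitude.

    x_norm elements X'_q(f, tau) of a phase/norm normalized vector
    f      frequency of the slot in Hz, assumed to lie in F_L
    scale  hypothetical proportionality constant (seconds)
    """
    if f <= 0 or scale <= 0:
        raise ValueError("f and scale must be positive")
    return [abs(v) * cmath.exp(1j * cmath.phase(v) / (scale * f)) for v in x_norm]


# Illustration: the same 0.1 ms delay observed at 500 Hz and at 2000 Hz.
low = frequency_normalize([cmath.exp(-2j * math.pi * 500.0 * 1e-4)], 500.0, 1e-3)
high = frequency_normalize([cmath.exp(-2j * math.pi * 2000.0 * 1e-4)], 2000.0, 1e-3)
```

Under these assumptions `low` and `high` coincide up to rounding, which is what allows feature vectors from different frequencies to be clustered together.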

［ステップＳ３の処理の詳細（パターンＡ）］
次に、ステップＳ３の処理の詳細（パターンＡ）を説明する（図６）。
まず、第１クラスタリング部１３３が、メモリ１００の記憶領域１０８からセントロイドH_k’の初期値を読み込む（ステップＳ２１）。
次に、第１クラスタリング部１３３のクラスタ決定部１３３ａが、例えば、前述の式（２２）に従い、記憶領域１０４から読み込んだ特徴量ベクトルX''(f,τ)（f∈F_L）と、記憶領域１０８から読み込んだセントロイドH_k’との距離の総和が最小となるように、各特徴量ベクトルX''(f,τ)に対応するクラスタを決定し、各特徴量ベクトルX''(f,τ)の時間周波数スロット(f,τ)に対応するクラスタ情報C(f,τ)に、当該特徴量ベクトルX''(f,τ)が属するクラスタを特定する値ｋを代入する（ステップＳ２２）。このような処理がなされた各クラスタ情報C(f,τ)は、メモリ１００の記憶領域１０７に格納される。 [Details of processing in step S3 (pattern A)]
Next, details (pattern A) of the processing in step S3 will be described (FIG. 6).
First, the first clustering unit 133 reads the initial value of the centroid H _k ′ from the storage area 108 of the memory 100 (step S21).
Next, the cluster determination unit 133a of the first clustering unit 133 determines the cluster corresponding to each feature vector X''(f,τ), for example according to the above-described equation (22), so that the sum of the distances between the feature vectors X''(f,τ) (f∈F_L) read from the storage area 104 and the centroids H_k' read from the storage area 108 is minimized. It then assigns the value k identifying the cluster to which each feature vector X''(f,τ) belongs to the cluster information C(f,τ) for the time-frequency slot (f,τ) of that feature vector (step S22). The cluster information C(f,τ) processed in this way is stored in the storage area 107 of the memory 100.

次に、第１クラスタリング部１３３のセントロイド算出部１３３ｂが、メモリ１００の記憶領域１０４，１０７から、それぞれ、特徴量ベクトルX''(f,τ)及びクラスタ情報C(f,τ)を読み込む。そして、セントロイド算出部１３３ｂは、例えば、前述の式（２１）に従い、C(f,τ)=kである時間周波数スロット(f,τ)に対応する特徴量ベクトルX''(f,τ)を足し合わせ、その加算結果のノルムを所定値に正規化したものを新たなセントロイドH_k’とする（ステップＳ２３）。生成された新たなセントロイドH_k’は、メモリ１００の記憶領域１０８に格納される。 Next, the centroid calculation unit 133b of the first clustering unit 133 reads the feature vector X ″ (f, τ) and the cluster information C (f, τ) from the storage areas 104 and 107 of the memory 100, respectively. . Then, the centroid calculation unit 133b, for example, according to the above-described equation (21), the feature vector X ″ (f, τ) corresponding to the time frequency slot (f, τ) where C (f, τ) = k. ) And the norm of the addition result normalized to a predetermined value is set as a new centroid H _k ′ (step S23). The generated new centroid H _k ′ is stored in the storage area 108 of the memory 100.

次に、制御部１６０が、所定の終了条件を満たした否かを判断する（ステップＳ２４）。この終了条件としては、例えば、「ステップＳ２２，Ｓ２３の処理を交互に所定回数繰り返したこと」や、「ステップＳ２３で更新されたセントロイドH_k’と更新前のセントロイドH_k’との変位が、所定の範囲内であること」等を例示できる。ここで、制御部１６０が、所定の終了条件を満たしていないと判断した場合、制御部１６０は、処理をステップＳ２２に戻す。一方、制御部１６０が、所定の終了条件を満たしたと判断した場合には、制御部１６０は、ステップＳ３の処理を終了させる。なお、ステップＳ３の処理が終了した時点でメモリ１００の記憶領域１０８に格納されているセントロイドH_k’を、正規のセントロイドH_k’として取り扱う（［ステップＳ３の処理の詳細（パターンＡ）］の説明終わり）。 Next, the control unit 160 determines whether or not a predetermined end condition is satisfied (step S24). As the termination condition, for example, “the process of steps S22 and S23 is alternately repeated a predetermined number of times” or “the displacement between the centroid H _k ′ updated in step S23 and the centroid H _k ′ before update Is within a predetermined range. " If the control unit 160 determines that the predetermined end condition is not satisfied, the control unit 160 returns the process to step S22. On the other hand, when the control unit 160 determines that the predetermined end condition is satisfied, the control unit 160 ends the process of step S3. Note that the centroid H _k ′ stored in the storage area 108 of the memory 100 at the time when the processing of step S3 is completed is treated as a regular centroid H _k ′ ([Details of processing of step S3 (pattern A) ] End of explanation).
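The loop of steps S22 to S24 (pattern A) resembles a k-means iteration in which each centroid is the norm-normalized sum of its members. The sketch below is a minimal illustration under assumptions: squared Euclidean distance stands in for the distance of equation (22), the target norm is 1, an empty cluster keeps its previous centroid, and the names `cluster_pattern_a`, `max_iter`, and `tol` are hypothetical.

```python
import math


def _sq_dist(a, b):
    return sum(abs(p - q) ** 2 for p, q in zip(a, b))


def _unit(v):
    n = math.sqrt(sum(abs(p) ** 2 for p in v))
    return [p / n for p in v] if n > 0 else list(v)


def cluster_pattern_a(features, centroids, max_iter=50, tol=1e-9):
    """features: complex vectors X''(f, tau); centroids: initial H_k' (step S21)."""
    centroids = [_unit(c) for c in centroids]
    labels = [0] * len(features)
    for _ in range(max_iter):
        # Step S22: assign every feature vector to its nearest centroid.
        labels = [
            min(range(len(centroids)), key=lambda k: _sq_dist(x, centroids[k]))
            for x in features
        ]
        # Step S23: sum the members of each cluster and renormalize the sum.
        updated = []
        for k, old in enumerate(centroids):
            members = [x for x, c in zip(features, labels) if c == k]
            if not members:
                updated.append(old)  # assumption: keep the old centroid
                continue
            updated.append(_unit([sum(col) for col in zip(*members)]))
        shift = max(_sq_dist(a, b) for a, b in zip(centroids, updated))
        centroids = updated
        # Step S24: stop once the centroids have essentially stopped moving.
        if shift < tol:
            break
    return labels, centroids
```

Pattern B would differ only in starting from initial cluster information and performing the centroid update before the assignment.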

［ステップＳ３の処理の詳細（パターンＢ）］
次に、ステップＳ３の処理の詳細（パターンＢ）を説明する（図７）。
まず、第１クラスタリング部１３３が、メモリ１００の記憶領域１０７からクラスタ情報C(f,τ)の初期値を読み込む（ステップＳ３１）。
次に、第１クラスタリング部１３３のセントロイド算出部１３３ｂが、例えば、前述の式（２１）に従い、C(f,τ)=kである時間周波数スロット(f,τ)に対応する特徴量ベクトルX''(f,τ)を足し合わせ、その加算結果のノルムを所定値に正規化したものを新たなセントロイドH_k’とする。生成された新たなセントロイドH_k’は、メモリ１００の記憶領域１０８に格納される（ステップＳ３２）。 [Details of processing in step S3 (pattern B)]
Next, details (pattern B) of the process of step S3 will be described (FIG. 7).
First, the first clustering unit 133 reads the initial value of the cluster information C (f, τ) from the storage area 107 of the memory 100 (step S31).
Next, the centroid calculation unit 133b of the first clustering unit 133 sums the feature vectors X''(f,τ) corresponding to the time-frequency slots (f,τ) for which C(f,τ) = k, for example according to the above-described equation (21), and takes the result, with its norm normalized to a predetermined value, as the new centroid H_k'. The new centroid H_k' is stored in the storage area 108 of the memory 100 (step S32).

次に、第１クラスタリング部１３３のクラスタ決定部１３３ａが、例えば、前述の式（２２）に従い、記憶領域１０４から読み込んだ特徴量ベクトルX''(f,τ)（f∈F_L）と、記憶領域１０８から読み込んだセントロイドH_k’との距離の総和が最小となるように、各特徴量ベクトルX''(f,τ)に対応するクラスタを決定し、各特徴量ベクトルX''(f,τ)の時間周波数スロット(f,τ)に対応するクラスタ情報C(f,τ)に、当該特徴量ベクトルX''(f,τ)が属するクラスタを特定する値ｋを代入する（ステップＳ３３）。このような処理がなされた各クラスタ情報C(f,τ)は、メモリ１００の記憶領域１０７に格納される。 Next, the cluster determining unit 133a of the first clustering unit 133, for example, according to the above-described equation (22), the feature quantity vector X ″ (f, τ) (f∈F _L ) read from the storage area 104, A cluster corresponding to each feature vector X ″ (f, τ) is determined so that the sum of the distances from the centroid H _k ′ read from the storage area 108 is minimized, and each feature vector X ″ is determined. A value k that identifies the cluster to which the feature vector X ″ (f, τ) belongs is substituted into the cluster information C (f, τ) corresponding to the time frequency slot (f, τ) of (f, τ). (Step S33). Each piece of cluster information C (f, τ) subjected to such processing is stored in the storage area 107 of the memory 100.

次に、制御部１６０が、所定の終了条件を満たした否かを判断する（ステップＳ３４）。この終了条件としては、例えば、「ステップＳ３２，Ｓ３３の処理を交互に所定回数繰り返したこと」や、「ステップＳ３２で更新されたセントロイドH_k’と更新前のセントロイドH_k’との変位が、所定の範囲内であること」等を例示できる。ここで、制御部１６０が、所定の終了条件を満たしていないと判断した場合、制御部１６０は、処理をステップＳ３２に戻す。一方、制御部１６０が、所定の終了条件を満たしたと判断した場合には、制御部１６０は、ステップＳ３の処理を終了させる。なお、ステップＳ３の処理が終了した時点でメモリ１００の記憶領域１０８に格納されているセントロイドH_k’を、正規のセントロイドH_k’として取り扱う（［ステップＳ３の処理の詳細（パターンＢ）］の説明終わり）。 Next, the control unit 160 determines whether or not a predetermined end condition is satisfied (step S34). Examples of the end condition include "steps S32 and S33 have been repeated alternately a predetermined number of times" and "the displacement between the centroid H_k' updated in step S32 and the centroid H_k' before the update is within a predetermined range." If the control unit 160 determines that the end condition is not satisfied, it returns the process to step S32. If it determines that the end condition is satisfied, it ends the process of step S3. The centroid H_k' stored in the storage area 108 of the memory 100 when step S3 ends is treated as the proper centroid H_k' (end of the description of [Details of processing in step S3 (pattern B)]).

〔第２の実施の形態〕
次に、本発明における第２の実施の形態について説明する。本形態は、モデルパラメータλ_ｑｋ，γ_ｑｋの最適化を行って、全周波数ｆ∈Ｆについて、式（２４）に従ってクラスタ情報C(f,τ)を生成する。
＜構成＞
第１の実施の形態との相違点は混合信号分類部のみである。以下では、混合信号分類部の構成のみを示す。
図８は、第２の実施の形態における混合信号分類部２３０の詳細を例示したブロック図である。なお、図８において、第１の実施の形態と共通する部分については、図３と同じ符号を付し、説明を省略する。
図８に例示するように、本形態の混合信号分類部２３０は、周波数帯域決定部１３１、正規化部１３２、第１クラスタリング部１３３、モデルパラメータ抽出部１３４、第２クラスタリング部２３５を有している。ここで、本形態の第２クラスタリング部２３５は、クラスタ決定部２３５ａ及びモデルパラメータ最適化部２３５ｂを有している。 [Second Embodiment]
Next, a second embodiment of the present invention will be described. In this embodiment, the model parameters λ_qk and γ_qk are optimized, and cluster information C(f,τ) is generated for all frequencies f∈F according to equation (24).
<Configuration>
The difference from the first embodiment is only the mixed signal classification unit. In the following, only the configuration of the mixed signal classification unit is shown.
FIG. 8 is a block diagram illustrating details of the mixed signal classifying unit 230 according to the second embodiment. In FIG. 8, portions common to the first embodiment are denoted by the same reference numerals as those in FIG. 3, and description thereof is omitted.
As illustrated in FIG. 8, the mixed signal classification unit 230 of this embodiment includes a frequency band determination unit 131, a normalization unit 132, a first clustering unit 133, a model parameter extraction unit 134, and a second clustering unit 235. The second clustering unit 235 of this embodiment includes a cluster determination unit 235a and a model parameter optimization unit 235b.

＜処理＞
次に、本形態における信号分離装置の処理について説明する。なお、以下でも、Ｎ個の源信号が混合され、Ｍ個のセンサで観測された状況を取り扱う。また、第１の実施の形態と同じ前処理が行なわれているものとする。
図９は、第２の実施の形態における信号分離装置の処理を説明するためのフローチャートである。また、図１０は、図９に示したステップＳ１０５の処理の詳細を説明するためのフローチャートである。 <Processing>
Next, processing of the signal separation device in this embodiment will be described. In the following, the situation where N source signals are mixed and observed by M sensors will be treated. Further, it is assumed that the same preprocessing as in the first embodiment is performed.
FIG. 9 is a flowchart for explaining processing of the signal separation device according to the second embodiment. FIG. 10 is a flowchart for explaining details of the process in step S105 shown in FIG.

以下、これらの図を用いて本形態の処理を説明する。なお、以下の各処理は制御部１６０の制御のもと実行される。また、明記しない限り、処理過程の各データは一時メモリ１７０に逐一格納・抽出される。
［周波数領域変換部１２０、周波数帯域決定部１３１、正規化部１３２、第１クラスタリング部１３３、モデルパラメータ抽出部１３４の処理］
これらの処理は、第１の実施の形態のステップＳ１〜Ｓ４と同じである（ステップＳ１０１〜Ｓ１０４）。 Hereinafter, the processing of this embodiment will be described with reference to these drawings. The following processes are executed under the control of the control unit 160. Unless otherwise specified, each process data is stored and extracted in the temporary memory 170 one by one.
[Processing of Frequency Domain Transformer 120, Frequency Band Determining Unit 131, Normalizing Unit 132, First Clustering Unit 133, and Model Parameter Extracting Unit 134]
These processes are the same as steps S1 to S4 in the first embodiment (steps S101 to S104).

［第２クラスタリング部２３５の処理］
本形態の第２クラスタリング部２３５は、全周波数ｆ∈Ｆについて、モデルパラメータγ_qk, λ_qkが代入された周波数応答モデルH_qk(f)を基準とし、位相・ノルム正規化ベクトルX’(f,τ)をクラスタリングし、クラスタ情報C(f,τ)を生成する処理と、クラスタ情報C(f,τ)と、位相・ノルム正規化ベクトルX’(f,τ)とを用い、モデルパラメータγ_qk, λ_qkを最適化する処理と、を交互に繰り返す（ステップＳ１０５）。これにより、最適化されたモデルパラメータγ_qk, λ_qkと、最適化されたモデルパラメータγ_qk, λ_qkが代入された周波数応答モデルH_qk(f)を基準に生成されたクラスタ情報C(f,τ)とを得ることができる。以下、図１０を用い、ステップＳ１０５の処理の詳細を説明する。 [Processing of Second Clustering Unit 235]
The second clustering unit 235 of this embodiment alternately repeats two processes for all frequencies f∈F: clustering the phase/norm normalized vectors X'(f,τ) with reference to the frequency response model H_qk(f) into which the model parameters γ_qk and λ_qk have been substituted, thereby generating the cluster information C(f,τ); and optimizing the model parameters γ_qk and λ_qk using the cluster information C(f,τ) and the phase/norm normalized vectors X'(f,τ) (step S105). This yields the optimized model parameters γ_qk and λ_qk together with cluster information C(f,τ) generated with reference to the frequency response model H_qk(f) into which the optimized parameters have been substituted. The details of step S105 are described below with reference to FIG. 10.

まず、第２クラスタリング部２３５が、メモリ１００の記憶領域１０３，１０９から、それぞれ、位相・ノルム正規化ベクトルX’(f,τ)とモデルパラメータγ_qk, λ_qkとを読み込む。そして、第２クラスタリング部２３５のクラスタ決定部２３５ａが、前述の式（２４）に従い、位相・ノルム正規化ベクトルX'(f,τ)の各要素X_q'(f,τ)と、モデルパラメータγ_qk, λ_qkが代入された周波数応答モデルλ_qk・exp(-j・2π・f・γ_qk)と、の距離の総和を最小にするｋを、位相・ノルム正規化ベクトルX'(f,τ)に対応するクラスタ情報C(f,τ)の値とする（ステップＳ１１１）。このような処理がなされたクラスタ情報C(f,τ)は、メモリ１００の記憶領域１０７に格納される。 First, the second clustering unit 235 reads the phase / norm normalized vector X ′ (f, τ) and the model parameters γ _qk and λ _qk from the storage areas 103 and 109 of the memory 100, respectively. Then, the cluster determining unit 235a of the second clustering unit 235, in accordance with the aforementioned equation (24), the phase norm normalized vector X '(f, tau) each element of X _q' (f, tau) and model parameters The frequency response model λ _qk · exp (−j · 2π · f · γ _qk ) to which γ _qk and λ _qk are substituted, and k that minimizes the sum of the distances are set to the phase / norm normalized vector X ′ (f , τ) as the value of the cluster information C (f, τ) (step S111). The cluster information C (f, τ) subjected to such processing is stored in the storage area 107 of the memory 100.
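The assignment of step S111 can be sketched as follows, using the frequency response model λ_qk·exp(-j·2π·f·γ_qk) stated in the text. The squared absolute difference is used as the per-element distance, which is an assumption because equation (24) is not reproduced here; the function names and the nested-list layout `lam[q][k]`, `gamma[q][k]` are likewise hypothetical.

```python
import cmath
import math


def model_response(lam, gamma, f):
    """Frequency response model lam * exp(-j * 2 * pi * f * gamma)."""
    return lam * cmath.exp(-2j * math.pi * f * gamma)


def assign_cluster(x_norm, f, lam, gamma):
    """Return the 0-based k minimizing sum_q |X'_q - model_qk(f)|^2.

    x_norm          elements X'_q(f, tau) of one phase/norm normalized vector
    lam[q][k]       model parameter lambda_qk
    gamma[q][k]     model parameter gamma_qk
    """
    n_sources = len(lam[0])

    def cost(k):
        return sum(
            abs(x_norm[q] - model_response(lam[q][k], gamma[q][k], f)) ** 2
            for q in range(len(x_norm))
        )

    return min(range(n_sources), key=cost)
```

Because the model is evaluated at the actual frequency f, the comparison remains meaningful even where the phase has wrapped, which is how the approach copes with bands affected by spatial aliasing.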

次に、第２クラスタリング部２３５のモデルパラメータ最適化部２３５ｂが、前述の式（２６）（２７）に従い、位相・ノルム正規化ベクトルX'(f,τ)の各要素X_q'(f,τ)に対応するセンサqと、クラスタ情報C(f,τ)が示す位相・ノルム正規化ベクトルX'(f,τ)が属するクラスタに対応する信号源kと、の組に対応するモデルパラメータγ_qk, λ_qkが代入された周波数応答モデルλ_qk・exp(-j・2π・f・γ_qk)と、当該位相・ノルム正規化ベクトルX'(f,τ)の各要素X_q'(f,τ)との距離の総和からなるコスト関数（式（１６））に対し、モデルパラメータγ_qk, λ_qkに関する偏微分を行う（ステップＳ１１２）。そして、モデルパラメータ最適化部２３５ｂは、前述の式（２５）に従い、各偏微分値に比例する値を、モデルパラメータγ_qk, λ_qkから、それぞれ、減算して当該モデルパラメータγ_qk, λ_qkを最適化する（ステップＳ１１３）。 Next, the model parameter optimization unit 235b of the second clustering unit 235 takes the partial derivatives, with respect to the model parameters γ_qk and λ_qk, of a cost function (equation (16)) in accordance with the above-described equations (26) and (27) (step S112). The cost function is the sum of the distances between each element X_q'(f,τ) of the phase/norm normalized vector X'(f,τ) and the frequency response model λ_qk·exp(-j·2π·f·γ_qk), where the substituted model parameters γ_qk and λ_qk are those for the pair of the sensor q corresponding to that element and the signal source k corresponding to the cluster to which the cluster information C(f,τ) assigns X'(f,τ). Then, in accordance with the above-described equation (25), the model parameter optimization unit 235b subtracts a value proportional to each partial derivative from the model parameters γ_qk and λ_qk, respectively, thereby optimizing them (step S113).
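Steps S112 and S113 amount to one gradient-descent update. The sketch below handles a single sensor/source pair (q, k) and assumes a squared-error cost J = Σ |X'_q(f,τ) - λ·exp(-j·2π·f·γ)|² over the slots assigned to source k; equations (25) to (27) are not reproduced in this part of the text, so the cost, the separate step sizes, and all function names are assumptions. For this cost, ∂J/∂λ = Σ (2λ - 2·Re z) and ∂J/∂γ = Σ 4π·f·λ·Im z, with z = X'_q·exp(j·2π·f·γ).

```python
import cmath
import math


def model_cost(samples, lam, gamma):
    """Squared-error cost over samples [(f, X'_q(f, tau)), ...] (assumed form)."""
    return sum(
        abs(x - lam * cmath.exp(-2j * math.pi * f * gamma)) ** 2 for f, x in samples
    )


def cost_gradients(samples, lam, gamma):
    """Partial derivatives of model_cost with respect to lam and gamma (step S112)."""
    d_lam = 0.0
    d_gamma = 0.0
    for f, x in samples:
        z = x * cmath.exp(2j * math.pi * f * gamma)
        d_lam += 2.0 * lam - 2.0 * z.real
        d_gamma += 4.0 * math.pi * f * lam * z.imag
    return d_lam, d_gamma


def gradient_step(samples, lam, gamma, step_lam, step_gamma):
    """Subtract a value proportional to each partial derivative (step S113)."""
    d_lam, d_gamma = cost_gradients(samples, lam, gamma)
    return lam - step_lam * d_lam, gamma - step_gamma * d_gamma
```

The two step sizes differ here because γ is a time-like quantity whose gradient scales with f; alternating this update with the cluster assignment of step S111 gives the loop of step S105.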

次に、制御部１６０が、所定の終了条件を満たした否かを判断する（ステップＳ１１４）。この終了条件としては、例えば、「ステップＳ１１１，Ｓ１１２の処理を交互に所定回数繰り返したこと」や、「ステップＳ１１２で算出された各偏微分値が所定の閾値以下であること」などを例示できる。ここで、制御部１６０が、所定の終了条件を満たしていないと判断した場合、制御部１６０は、処理をステップＳ１１１に戻す。一方、制御部１６０が、所定の終了条件を満たしたと判断した場合には、制御部１６０は、ステップＳ１０５の処理を終了させる。 Next, the control unit 160 determines whether or not a predetermined end condition is satisfied (step S114). Examples of the end condition include "steps S111 and S112 have been repeated alternately a predetermined number of times" and "each partial derivative calculated in step S112 is equal to or less than a predetermined threshold." If the control unit 160 determines that the end condition is not satisfied, it returns the process to step S111. If it determines that the end condition is satisfied, it ends the process of step S105.

［分離信号生成部１４０及び時間領域変換部１５０の処理］
これらの処理は、第１の実施の形態のステップＳ６、Ｓ７と同じである（ステップＳ１０６，１０７）。
〔第３の実施の形態〕
次に、本発明における第３の実施の形態について説明する。本形態は、第２の実施の形態の変形例である。本形態では、任意にモデルパラメータλ_ｑｋ，γ_ｑｋの初期値を選択し、第２の実施の形態と同様にモデルパラメータλ_ｑｋ，γ_ｑｋの最適化を行ってモデルパラメータλ_ｑｋ，γ_ｑｋを決定する。そして、全周波数ｆ∈Ｆについて、式（２４）に従ってクラスタ情報C(f,τ)を生成する。以下では、第１，２の実施の形態との相違点のみを示す。 [Processing of Separation Signal Generation Unit 140 and Time Domain Conversion Unit 150]
These processes are the same as steps S6 and S7 of the first embodiment (steps S106 and 107).
[Third Embodiment]
Next, a third embodiment of the present invention will be described. This embodiment is a modification of the second embodiment. In this embodiment, initial values of the model parameters λ_qk and γ_qk are selected arbitrarily, and the model parameters λ_qk and γ_qk are then determined by optimizing them in the same way as in the second embodiment. Cluster information C(f,τ) is then generated for all frequencies f∈F according to equation (24). Only the differences from the first and second embodiments are described below.

＜構成＞
第１の実施の形態との相違点は混合信号分類部のみである。以下では、混合信号分類部の構成のみを示す。
図１１は、第３の実施の形態における混合信号分類部３３０の詳細を例示したブロック図である。なお、図１１において、第１，２の実施の形態と共通する部分については、図３，８と同じ符号を付し、説明を省略する。
図１１に例示するように、本形態の混合信号分類部３３０は、位相・ノルム正規化部１３２ａ、第２クラスタリング部２３５及びモデルパラメータ選択部３３３を有している。 <Configuration>
The difference from the first embodiment is only the mixed signal classification unit. In the following, only the configuration of the mixed signal classification unit is shown.
FIG. 11 is a block diagram illustrating details of the mixed signal classification unit 330 according to the third embodiment. In FIG. 11, portions common to the first and second embodiments are denoted by the same reference numerals as in FIGS. 3 and 8, and description thereof is omitted.
As illustrated in FIG. 11, the mixed signal classification unit 330 of this embodiment includes a phase/norm normalization unit 132a, a second clustering unit 235, and a model parameter selection unit 333.

＜処理＞
次に、本形態における信号分離装置の処理について説明する。なお、以下でも、Ｎ個の源信号が混合され、Ｍ個のセンサで観測された状況を取り扱う。また、第１の実施の形態と同じ前処理が行なわれているものとする。
図１２は、第３の実施の形態における信号分離装置の処理を説明するためのフローチャートである。以下、これらの図を用いて本形態の処理を説明する。
［周波数領域変換部１２０の処理］
この処理は、第１の実施の形態のステップＳ１と同じである（ステップＳ２０１）。 <Processing>
Next, processing of the signal separation device in this embodiment will be described. In the following, the situation where N source signals are mixed and observed by M sensors will be treated. Further, it is assumed that the same preprocessing as in the first embodiment is performed.
FIG. 12 is a flowchart for explaining processing of the signal separation device according to the third embodiment. Hereinafter, the processing of this embodiment will be described with reference to these drawings.
[Processing of Frequency Domain Transformer 120]
This process is the same as step S1 of the first embodiment (step S201).

［位相・ノルム正規化部１３２ａの処理］
この処理は、第１の実施の形態のステップＳ１２と同じである（ステップＳ２０２）。
［モデルパラメータ選択部３３３の処理］
モデルパラメータ選択部３３３は、任意にモデルパラメータγ_qk, λ_qkの初期値を選択する。選択されたモデルパラメータγ_qk, λ_qkの初期値は、メモリ１００の記憶領域１０９に格納される（ステップＳ２０３）。
［第２クラスタリング部２３５の処理］
この処理は、第２の実施の形態のステップＳ１０５と同じである（ステップＳ２０４）。 [Processing of phase / norm normalization unit 132a]
This process is the same as step S12 of the first embodiment (step S202).
[Process of Model Parameter Selection Unit 333]
The model parameter selection unit 333 arbitrarily selects initial values of the model parameters γ _qk and λ _qk . The initial values of the selected model parameters γ _qk and λ _qk are stored in the storage area 109 of the memory 100 (step S203).
[Processing of Second Clustering Unit 235]
This process is the same as step S105 of the second embodiment (step S204).

［分離信号生成部１４０及び時間領域変換部１５０の処理］
これらの処理は、第１の実施の形態のステップＳ６、Ｓ７と同じである（ステップＳ２０５，２０６）。
〔実験結果〕
次に、上述した各形態の効果を示すために、図１３（ａ）に示した実験条件Ａと、さらに図１３（ｂ）に示した実験条件Ｂとについて複数の音声を分離する実験を行った結果を示す。これらの条件は、空間的エイリアシングが起き得る条件である。なお、比較のため、以下の３種類の手順で実験を行った。 [Processing of Separation Signal Generation Unit 140 and Time Domain Conversion Unit 150]
These processes are the same as steps S6 and S7 in the first embodiment (steps S205 and 206).
〔Experimental result〕
Next, in order to show the effects of the above-described embodiments, an experiment is performed to separate a plurality of sounds under the experimental condition A shown in FIG. 13 (a) and the experimental condition B shown in FIG. 13 (b). The results are shown. These conditions are conditions under which spatial aliasing can occur. For comparison, the experiment was performed by the following three types of procedures.

手順I：空間的エイリアシングを考慮せず、すべての周波数ｆ∈Ｆの混合信号ベクトルに対して、第１クラスタリング部によるクラスタリング、即ち、式（２１）（２２）を繰り返し適用した（従来構成）。そのため、空間的エイリアシングが起こっている高域では、生成されるクラスタが大きく間違っている可能性が高い。
手順II：空間的エイリアシングが起きない低域の周波数ｆ∈Ｆ_Ｌの混合信号ベクトルのみに対して、第１クラスタリング部によるクラスタリング、即ち、式（２１）（２２）を繰り返し適用し、クラスタのセントロイドを用いてモデルパラメータを算出した（式（２３））。そして、算出したモデルパラメータをそのまま用い、全ての周波数ｆ∈Ｆについて、式（２４）によってクラスタリングを行った（第１の実施の形態）。
手順III：モデルパラメータ最適化部によって、式（２４）と（２５）とを繰り返し適用して最適化したモデルパラメータλ_ｑｋ，γ_ｑｋを用い、全ての周波数ｆ∈Ｆについて、式（２４）によってクラスタリングを行った（第２，３の実施の形態）。 Procedure I: Spatial aliasing was not taken into account, and clustering by the first clustering unit, that is, repeated application of equations (21) and (22), was applied to the mixed signal vectors at all frequencies f∈F (conventional configuration). As a result, the generated clusters are likely to be badly wrong in the high-frequency band where spatial aliasing occurs.
Procedure II: Clustering by the first clustering unit, that is, repeated application of equations (21) and (22), was applied only to the mixed signal vectors at the low frequencies f∈F_L where spatial aliasing does not occur, and the model parameters were calculated from the cluster centroids (equation (23)). The calculated model parameters were then used as they were to perform clustering for all frequencies f∈F according to equation (24) (first embodiment).
Procedure III: Clustering was performed for all frequencies f∈F according to equation (24), using the model parameters λ_qk and γ_qk optimized by the model parameter optimization unit through repeated application of equations (24) and (25) (second and third embodiments).

以下は、１６通りの音声の組み合わせに対して行われた各条件に対応する分離性能の平均値を示した表である。

ここで、分離性能は、SIR（Signal-to-Interference Ratio，分離信号における目的信号成分と干渉信号成分のパワー比）で測定した。値が大きいほど良い結果を示す。双方の実験条件において、手順II、手順IIIと分離性能が向上していくことが見受けられ、各形態の効果が確認できた。なお、入力SIRとは、センサでの観測信号のSIRである。 The following is a table showing the average value of the separation performance corresponding to each condition performed for 16 combinations of sounds.

Here, the separation performance was measured by the SIR (signal-to-interference ratio: the power ratio between the target signal component and the interference signal components in a separated signal). Larger values indicate better results. Under both experimental conditions, the separation performance improved progressively with Procedure II and then Procedure III, confirming the effect of each embodiment. The input SIR is the SIR of the signals observed at the sensors.
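As a small illustration of the SIR defined above, the sketch below computes the power ratio between a target component and an interference component. Expressing the result in decibels is an assumption (the unit is not stated in this part of the text), and the function name `sir_db` is hypothetical.

```python
import math


def sir_db(target, interference):
    """Signal-to-interference ratio of two sample sequences, in dB (assumed unit)."""
    p_target = sum(abs(v) ** 2 for v in target)
    p_interf = sum(abs(v) ** 2 for v in interference)
    if p_interf == 0:
        return math.inf  # no interference at all
    if p_target == 0:
        return -math.inf  # no target component at all
    return 10.0 * math.log10(p_target / p_interf)
```

In an evaluation of this kind the two components are normally obtained by passing each source through the separation system on its own, which requires access to the individual source signals.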

〔各実施の形態の効果〕
以上、上述した各形態の効果をまとめる。各形態の手法により、空間的エイリアシングが起こる高城でも、混合信号ベクトルの分類が正しく行え、結果として分離性能が向上する。さらに、第２の実施の形態のように、全ての帯域での位相・ノルム正規化ベクトルを用いモデルパラメータλ_ｑｋ，γ_ｑｋを改善することにより、その正確さが高まる。さらに、第３の実施の形態のように、任意にモデルパラメータλ_ｑｋ，γ_ｑｋを選択し、これらを周波数依存の位相・ノルム正規化ベクトルを用いて改善していく手法によっても、空間的エイリアシングが起こる高城において精度の高い信号分離を行うことができる。 [Effect of each embodiment]
The effects of the embodiments described above are summarized as follows. With the method of each embodiment, mixed signal vectors can be classified correctly even in the high-frequency band where spatial aliasing occurs, and separation performance improves as a result. Further, as in the second embodiment, refining the model parameters λ_qk and γ_qk using the phase/norm normalized vectors in all bands increases their accuracy. Further, as in the third embodiment, highly accurate signal separation in the high-frequency band where spatial aliasing occurs can also be achieved by arbitrarily selecting the model parameters λ_qk and γ_qk and refining them using the frequency-dependent phase/norm normalized vectors.

〔変形例など〕
なお、本発明は上述の実施の形態に限定されるものではない。例えば、第１，２の実施の形態では、第１クラスタリング部が、空間的エイリアシングが起き得ない周波数ｆ∈Ｆ_Ｌの特徴量ベクトルのみをクラスタリングすることとした。しかし、第１クラスタリング部が、空間的エイリアシングが起き得ない周波数ｆ∈Ｆ_Ｌの特徴量ベクトルに、多少、空間的エイリアシングが起きうる周波数の特徴量ベクトルが混在したものをクラスタリングすることとしてもよい。前述の巡回的ジャンプが生じた特徴量ベクトルが多少混在していても、その割合が特徴量ベクトル全体からみてわずかであれば、ある程度の精度でクラスタリングを行い、モデルパラメータを生成することが可能だからである。すなわち、第１クラスタリング部は、空間的エイリアシングが起き得ない周波数を含む周波数帯域の特徴量ベクトルをクラスタリングするものであればよい。 [Modifications, etc.]
The present invention is not limited to the embodiments described above. For example, in the first and second embodiments, the first clustering unit clusters only the feature vectors at frequencies f∈F_L at which spatial aliasing cannot occur. However, the first clustering unit may instead cluster a set in which the feature vectors at frequencies f∈F_L are mixed with a small number of feature vectors at frequencies at which spatial aliasing can occur. Even if some feature vectors affected by the above-mentioned cyclic jump are mixed in, clustering can still be performed with reasonable accuracy and model parameters can be generated, provided that their proportion of the whole set of feature vectors is small. That is, the first clustering unit need only cluster the feature vectors of a frequency band that includes frequencies at which spatial aliasing cannot occur.

また、例えば、第１，２の実施の形態では、第２クラスタリング部が、全周波数ｆ∈Ｆの位相・ノルム正規化ベクトルをクラスタリングすることとした。しかし、一部の周波数での分離信号が不要なのであれば、第２クラスタリング部が、全周波数帯域の一部であって、空間的エイリアシングが起き得る周波数を含む周波数帯域の正規化混合信号ベクトルをクラスタリングする構成であってもよい。 Further, for example, in the first and second embodiments, the second clustering unit clusters the phase / norm normalized vectors of all frequencies fεF. However, if separation signals at some frequencies are not necessary, the second clustering unit may calculate a normalized mixed signal vector of a frequency band that includes a frequency that is a part of the entire frequency band and that may cause spatial aliasing. A configuration for clustering may be used.

また、全ての周波数での分離信号が必要な場合であっても、第１クラスタリング部で生成されたクラスタ情報（「初期クラスタ情報」に相当）をも分離信号の生成に流用し、第２クラスタリング部が、第１クラスタリング部でクラスタ情報が生成されていない周波数のクラスタ情報のみを生成する構成であってもよい。これによって、クラスタリングの演算量を低減させることができる。また、第１クラスタリング部で生成されたクラスタ情報と、第２クラスタリング部で生成されたクラスタ情報が一部の周波数で重複していてもよい。さらに、第１クラスタリング部で生成されたクラスタ情報と、第２クラスタリング部で生成されたクラスタ情報が全周波数帯域Ｆを網羅していなくてもよい。すなわち、第１クラスタリング部が、空間的エイリアシングが起き得ない周波数を含む周波数帯域に対応する特徴量ベクトルがそれぞれ属するクラスタに対応する初期クラスタ情報を生成し、第２クラスタリング部が、空間的エイリアシングが起き得る周波数を含む周波数帯域に対応する正規化混合信号ベクトルがそれぞれ属するクラスタに対応するクラスタ情報を生成し、分離信号生成部が、第１クラスタリング部が生成した初期クラスタ情報と、第２クラスタリング部が生成したクラスタ情報とを用い、分離信号を生成する構成であってもよい。 Further, even when separated signals at all frequencies are required, the cluster information generated by the first clustering unit (corresponding to "initial cluster information") may also be used to generate the separated signals, with the second clustering unit generating cluster information only for the frequencies for which the first clustering unit has not generated cluster information. This reduces the amount of computation required for clustering. The cluster information generated by the first clustering unit and that generated by the second clustering unit may also overlap at some frequencies. Furthermore, the two together need not cover the entire frequency band F. That is, the configuration may be such that the first clustering unit generates initial cluster information corresponding to the clusters to which the feature vectors for a frequency band including frequencies at which spatial aliasing cannot occur belong, the second clustering unit generates cluster information corresponding to the clusters to which the normalized mixed signal vectors for a frequency band including frequencies at which spatial aliasing can occur belong, and the separated signal generation unit generates the separated signals using the initial cluster information generated by the first clustering unit and the cluster information generated by the second clustering unit.

In the first and second embodiments described above, the method of generating each element of the phase/norm normalized vector by equation (15) was given as an example. However, the method is not limited to this, as long as the argument of the frequency-domain signal can be normalized with respect to the argument of the frequency-domain signal corresponding to the reference sensor, and the norm of the normalized value can be normalized to a predetermined value. For example, each element of the phase/norm normalized vector may be generated by the following expression.

Here, ·^* denotes the complex conjugate of ·. ψ{·} is a function, preferably a monotonically increasing function from the viewpoint of clustering accuracy.

In the first and second embodiments, the frequency normalized signal is obtained by eliminating the frequency-dependent component from the phase/norm normalized vector. However, the frequency normalized signal may be formed without norm normalization. That is, the argument of the frequency-domain signal may be normalized with respect to the argument of the frequency-domain signal corresponding to the reference sensor, and the result of eliminating the frequency-dependent component from it, without norm normalization, may be used as the frequency normalized signal. In this case, the first clustering unit clusters frequency normalized vectors whose norms are not normalized. The clustering criterion is then not whether the vectors are similar including their norms, but whether only their directions are similar. This is an evaluation based on a similarity measure. One example of such a similarity measure is the cosine distance

$$\cos\theta = \frac{\left| X'^{H}(f,\tau) \cdot H_k' \right|}{\| X'(f,\tau) \| \cdot \| H_k' \|} \quad \cdots (29)$$

Here, θ is the angle between the frequency normalized vector X'(f,τ) and the centroid vector H_k', and ·^H denotes the complex conjugate transpose of the vector ·. When the cosine distance is used, the first clustering unit generates clusters that minimize the sum of the cosine distances.

The centroid H_k' is calculated as the average of the members of each cluster.
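The measure in equation (29) and the centroid update can be sketched in a few lines of plain Python. This is only an illustration under stated assumptions: the function names `cosine_measure` and `centroid` are hypothetical, vectors are represented as lists of complex numbers, and a zero vector is given the value 0.0 to avoid division by zero, which the text does not specify.

```python
import math
from typing import List, Sequence

Vector = Sequence[complex]


def cosine_measure(x: Vector, h: Vector) -> float:
    """cos(theta) = |x^H h| / (||x|| ||h||), following equation (29)."""
    inner = sum(complex(xi).conjugate() * hi for xi, hi in zip(x, h))
    norm_x = math.sqrt(sum(abs(xi) ** 2 for xi in x))
    norm_h = math.sqrt(sum(abs(hi) ** 2 for hi in h))
    if norm_x == 0.0 or norm_h == 0.0:
        return 0.0  # assumption: direction of a zero vector is undefined
    return abs(inner) / (norm_x * norm_h)


def centroid(members: Sequence[Vector]) -> List[complex]:
    """Centroid as the element-wise average of the cluster members."""
    dim = len(members[0])
    return [sum(m[i] for m in members) / len(members) for i in range(dim)]


x = [1, 1j]
same_direction = cosine_measure(x, [2, 2j])   # scaled copy of x
orthogonal = cosine_measure(x, [1, -1j])      # orthogonal to x
mean_vector = centroid([[1, 1j], [2, 2j]])
```

Because the measure divides by both norms, a vector and any scaled copy of it give the same value, which is why it suits frequency normalized vectors whose norms have not been normalized.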

In the first and second embodiments, the feature vector is formed with the frequency normalized signals as its elements. However, the feature vector may instead be formed with values obtained by substituting the frequency normalized signals into a predetermined function as its elements. For example, frequency normalized signals that have not been norm-normalized may be substituted into a function that performs norm normalization, and the resulting values may be used as the elements of the feature vector. When the number of sensors is two (N = 2), a feature vector whose elements are the values obtained by equation (9) described above may be used, or a feature vector whose elements are the phase difference between the frequency-domain signals at the two sensors (sensor q and reference sensor Q) (equation (8)) divided by the frequency. When the feature vector whose elements are the values obtained by equation (9) is used, the first clustering unit performs clustering using, for example, equations (29) and (30).
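For the two-sensor case, the idea of dividing the inter-sensor phase difference by the frequency can be illustrated as follows. Equation (8) is not reproduced in this part of the text, so the sketch assumes the phase difference is the argument of X_q(f,τ) multiplied by the complex conjugate of X_Q(f,τ); the function name `phase_difference_feature` and the use of frequency in hertz are likewise assumptions.

```python
import cmath
import math


def phase_difference_feature(x_q: complex, x_ref: complex, freq_hz: float) -> float:
    """Phase difference between sensor q and the reference sensor, divided by frequency.

    Assumed form: arg(x_q * conj(x_ref)) / f. The argument is taken in
    (-pi, pi], so the result is only delay-proportional while the true
    phase difference stays inside that range (no spatial aliasing).
    """
    if freq_hz <= 0.0:
        raise ValueError("frequency must be positive")
    phase_diff = cmath.phase(x_q * x_ref.conjugate())
    return phase_diff / freq_hz


# A source reaching sensor q 0.1 ms later than the reference sensor.
delay_s = 1e-4
feature_1k = phase_difference_feature(
    cmath.exp(-2j * math.pi * 1000.0 * delay_s), 1 + 0j, 1000.0)
feature_2k = phase_difference_feature(
    cmath.exp(-2j * math.pi * 2000.0 * delay_s), 1 + 0j, 2000.0)
```

In this example both frequencies yield the same feature value, which is the property that lets slots from different frequencies be clustered together. Once the phase difference wraps around, the division no longer removes the frequency dependence.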

In each of the embodiments described above, the method of generating the frequency-domain separated signals using equation (28) was given as an example. According to equation (28), when C(f,τ) = k, the frequency-domain signal X_Q(f,τ) (corresponding to the reference sensor Q) of the corresponding time-frequency slot (f,τ) is used as the frequency-domain separated signal Y_k(f,τ). However, when C(f,τ) = k, a frequency-domain signal corresponding to a sensor other than the reference sensor Q may be used as Y_k(f,τ). For example, when C(f,τ) = k in a certain time-frequency slot (f,τ), the largest of the frequency-domain signals in that slot may be used as the separated signal Y_k(f,τ). Even in this case, however, the sensor whose frequency-domain signal is selected as the separated signal Y_k(f,τ) must be the same in every time-frequency slot (f,τ). That is, once the sensor whose frequency-domain signal is selected as Y_k(f,τ) has been determined in one time-frequency slot, the frequency-domain signal of the same sensor is selected as Y_k(f,τ) in the other time-frequency slots as well.
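The selection rule described for equation (28) can be sketched as a per-slot assignment driven by the cluster information C(f,τ). Equation (28) itself is not reproduced in this part of the text, so this is a hedged illustration: the function name `separate`, the dictionary layout, and in particular setting Y_k(f,τ) to zero in slots where C(f,τ) differs from k are assumptions.

```python
from typing import Dict, List, Tuple

Slot = Tuple[int, int]  # (frequency bin, frame index)


def separate(x_ref: Dict[Slot, complex],
             cluster_info: Dict[Slot, int],
             n_sources: int) -> List[Dict[Slot, complex]]:
    """Build frequency-domain separated signals Y_k from one sensor's signal.

    For each slot, Y_k takes the sensor's value when C(f, tau) == k.
    Assumption for this sketch: Y_k is zero in every other slot.
    The same sensor signal (x_ref) is used in all slots.
    """
    separated: List[Dict[Slot, complex]] = [dict() for _ in range(n_sources)]
    for slot, value in x_ref.items():
        for k in range(n_sources):
            separated[k][slot] = value if cluster_info.get(slot) == k else 0j
    return separated


x_q_signal = {(0, 0): 1 + 1j, (0, 1): 2 + 0j}
labels = {(0, 0): 0, (0, 1): 1}
y = separate(x_q_signal, labels, n_sources=2)
```

Passing the signal of a different sensor as `x_ref` corresponds to the variation described above, provided that the same sensor is used for every slot.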

The various processes described above need not be executed in time series in the order described; they may be executed in parallel or individually, depending on the processing capability of the apparatus that executes them or as required. Needless to say, other modifications may be made as appropriate without departing from the spirit of the present invention.
A program describing the processing described above can be recorded on a computer-readable recording medium. The computer-readable recording medium may be of any kind, such as a magnetic recording device, an optical disc, a magneto-optical recording medium, or a semiconductor memory. Specifically, for example, a hard disk device, a flexible disk, or a magnetic tape can be used as the magnetic recording device; a DVD (Digital Versatile Disc), a DVD-RAM (Random Access Memory), a CD-ROM (Compact Disc Read Only Memory), or a CD-R (Recordable)/RW (ReWritable) as the optical disc; an MO (Magneto-Optical disc) as the magneto-optical recording medium; and an EEP-ROM (Electronically Erasable and Programmable-Read Only Memory) as the semiconductor memory.

The program is distributed, for example, by selling, transferring, or lending a portable recording medium such as a DVD or CD-ROM on which the program is recorded. Alternatively, the program may be stored in a storage device of a server computer and distributed by transferring it from the server computer to other computers via a network.
As another form of execution, a computer may read the program directly from the portable recording medium and execute processing according to the program, or the computer may execute processing according to the received program each time the program is transferred to it from the server computer. The processing described above may also be executed by a so-called ASP (Application Service Provider) type service, in which the program is not transferred from the server computer to the computer and the processing functions are realized only through execution instructions and the acquisition of results. The program in this embodiment includes information that is provided for processing by an electronic computer and that is equivalent to a program (such as data that is not a direct command to the computer but has the property of defining the processing of the computer).

In this embodiment, the apparatus is configured by executing a predetermined program on a computer. However, at least part of the processing may be implemented in hardware.

Examples of industrial applications of the present invention include systems used in environments where spatial aliasing is likely to occur, such as a video conference system in which microphones are arranged at intervals of several tens of centimeters, and a system that separates audio signals sampled at a high sampling frequency. According to the present invention, signal separation can be performed correctly at all frequencies even in such environments.

FIG. 1 is a block diagram illustrating the hardware configuration of the signal separation device according to the first embodiment.
FIG. 2 is an example of a block diagram of the signal separation device according to the first embodiment.
FIG. 3 is a block diagram illustrating details of the mixed signal classification unit in FIG. 2.
FIG. 4 is a flowchart for explaining the processing of the signal separation device according to the first embodiment.
FIG. 5 is a flowchart for explaining details of the process in step S2 of FIG. 4.
FIG. 6 is a flowchart for explaining details of the process in step S3 of FIG. 4.
FIG. 7 is a flowchart for explaining details of the process in step S3 of FIG. 4.
FIG. 8 is a block diagram illustrating details of the mixed signal classification unit in the second embodiment.
FIG. 9 is a flowchart for explaining the processing of the signal separation device according to the second embodiment.
FIG. 10 is a flowchart for explaining details of the process in step S105 shown in FIG. 9.
FIG. 11 is a block diagram illustrating details of the mixed signal classification unit in the third embodiment.
FIG. 12 is a flowchart for explaining the processing of the signal separation device according to the third embodiment.
FIG. 13(a) is a table showing experimental condition A, and FIG. 13(b) is a diagram showing experimental condition B.

Explanation of symbols

1 Signal separation device
130, 230, 330 Mixed signal classification unit

Claims

A signal separation device that separates a mixed signal, in which source signals emitted from a plurality of signal sources are mixed, into separated signals that are estimates of the source signals, the device comprising:
a frequency domain converter that converts the mixed signals observed by a plurality of sensors into frequency-domain signals;
a normalization unit that normalizes the phase of the frequency-domain signals and further generates frequency normalized signals from which frequency-dependent components are eliminated;
a first clustering unit that clusters feature vectors, one for each time-frequency slot, whose elements are the frequency normalized signals or values obtained by substituting the frequency normalized signals into a predetermined function, and calculates a centroid for each cluster;
a model parameter extraction unit that calculates model parameters of a frequency response model using the centroids;
a second clustering unit that clusters frequency-dependent phase/norm normalized vectors, whose elements are phase/norm normalized values of the frequency-domain signals, with reference to the frequency response model into which the model parameters are substituted or the frequency response model into which optimized values of the model parameters are substituted, and generates cluster information corresponding to the cluster to which each phase/norm normalized vector belongs; and
a separated signal generation unit that generates frequency-domain separated signals using the frequency-domain signals and the cluster information.

The signal separation device according to claim 1, wherein
the first clustering unit clusters the feature vectors in a frequency band that includes frequencies at which spatial aliasing cannot occur, and
the second clustering unit clusters the phase/norm normalized vectors in a frequency band that includes frequencies at which spatial aliasing can occur.

The signal separation device according to claim 2, wherein
the frequency band that includes frequencies at which spatial aliasing cannot occur is a frequency band consisting only of frequencies at which spatial aliasing cannot occur.

The signal separation device according to claim 2 or 3, wherein
the frequency band that includes frequencies at which spatial aliasing can occur is the entire frequency band.

The signal separation device according to any one of claims 2 to 4, wherein
the second clustering unit includes a model parameter optimization unit that optimizes the model parameters using the cluster information and the phase/norm normalized vectors in the frequency band that includes frequencies at which spatial aliasing can occur.

The signal separation device according to claim 5, wherein
each of the model parameters is a parameter corresponding to a pair consisting of one of the sensors and the signal source corresponding to one of the clusters, and
the model parameter optimization unit optimizes the model parameters by partially differentiating a cost function with respect to a model parameter and subtracting a value proportional to the calculated partial derivative from that model parameter, the cost function consisting of the sum of the distances between each element of the phase/norm normalized vectors and the frequency response model into which is substituted the model parameter corresponding to the pair consisting of the sensor corresponding to that element and the signal source corresponding to the cluster to which, according to the cluster information, the phase/norm normalized vector belongs.

The signal separation device according to claim 6, wherein
the second clustering unit alternately executes the process of generating the cluster information and the process of optimizing the model parameters until a predetermined termination condition is satisfied, and
the separated signal generation unit generates the separated signals using the latest cluster information at the time the predetermined termination condition is satisfied.

The signal separation device according to claim 2, wherein
the first clustering unit generates initial cluster information corresponding to the clusters to which the feature vectors belong, for the frequency band that includes frequencies at which spatial aliasing cannot occur,
the second clustering unit generates the cluster information corresponding to the clusters to which the phase/norm normalized vectors belong, for the frequency band that includes frequencies at which spatial aliasing can occur, and
the separated signal generation unit generates the separated signals using the initial cluster information generated by the first clustering unit and the cluster information generated by the second clustering unit.

A signal separation device that separates a mixed signal, in which source signals emitted from a plurality of signal sources are mixed, into separated signals that are estimates of the source signals, the device comprising:
a frequency domain converter that converts the mixed signals observed by a plurality of sensors into frequency-domain signals;
a model parameter selection unit that selects initial values of model parameters of a frequency response model;
a cluster determination unit that clusters frequency-dependent phase/norm normalized vectors, whose elements are phase/norm normalized values of the frequency-domain signals, with reference to the frequency response model into which the model parameters are substituted, and generates cluster information corresponding to the cluster to which each phase/norm normalized vector belongs;
a model parameter optimization unit that optimizes the model parameters using the cluster information and the phase/norm normalized vectors;
a control unit that causes the processing of the cluster determination unit and the processing of the model parameter optimization unit to be executed alternately until a predetermined termination condition is satisfied; and
a separated signal generation unit that generates frequency-domain separated signals using the frequency-domain signals and the latest cluster information at the time the predetermined termination condition is satisfied.

A signal separation method for separating a mixed signal, in which source signals emitted from a plurality of signal sources are mixed, into separated signals that are estimates of the source signals, the method comprising:
a frequency domain conversion process of converting the mixed signals observed by a plurality of sensors into frequency-domain signals;
a normalization process of normalizing the phase of the frequency-domain signals and further generating frequency normalized signals from which frequency-dependent components are eliminated;
a first clustering process of clustering feature vectors, one for each time-frequency slot, whose elements are the frequency normalized signals or values obtained by substituting the frequency normalized signals into a predetermined function, and calculating a centroid for each cluster;
a model parameter calculation process of calculating model parameters of a frequency response model using the centroids;
a second clustering process of clustering frequency-dependent phase/norm normalized vectors, whose elements are phase/norm normalized values of the frequency-domain signals, with reference to the frequency response model into which the model parameters are substituted or the frequency response model into which optimized values of the model parameters are substituted, and generating cluster information corresponding to the cluster to which each phase/norm normalized vector belongs; and
a separated signal generation process of generating frequency-domain separated signals using the frequency-domain signals and the cluster information.

A signal separation method for separating a mixed signal, in which source signals emitted from a plurality of signal sources are mixed, into separated signals that are estimates of the source signals, the method comprising:
a frequency domain conversion process of converting the mixed signals observed by a plurality of sensors into frequency-domain signals;
a model parameter selection process of selecting initial values of model parameters of a frequency response model;
the following processes (A) and (B), repeated alternately until a predetermined termination condition is satisfied:
(A) a clustering process of clustering frequency-dependent phase/norm normalized vectors, whose elements are phase/norm normalized values of the frequency-domain signals, with reference to the frequency response model into which the model parameters are substituted, and generating cluster information corresponding to the cluster to which each phase/norm normalized vector belongs;
(B) a model parameter optimization process of optimizing the model parameters using the cluster information and the phase/norm normalized vectors; and
a separated signal generation process of generating frequency-domain separated signals using the frequency-domain signals and the latest cluster information at the time the predetermined termination condition is satisfied.

A signal separation program for causing a computer to function as the signal separation device according to claim 1.

A computer-readable recording medium storing the signal separation program according to claim 12.