JP2007178590A - Object signal extracting device and method therefor, and program

Info

Publication number: JP2007178590A
Application number: JP2005375284A
Authority: JP
Inventors: Shi Cho; 志鵬張; Minoru Eito; 稔栄藤
Original assignee: NTT Docomo Inc
Current assignee: NTT Docomo Inc
Priority date: 2005-12-27
Filing date: 2005-12-27
Publication date: 2007-07-12

Abstract

PROBLEM TO BE SOLVED: To provide a target signal extraction device, a method therefor, and a program that can extract a target signal from mixed signals at high speed and with high accuracy. SOLUTION: The separation matrix calculation unit 11 of the target signal extraction device 10 obtains a separation matrix by independent component analysis based on the maximum posterior probability criterion, and the target signal extraction unit 12 extracts the target signal from the mixed signals by using the separation matrix. COPYRIGHT: (C)2007, JPO&INPIT

Description

The present invention relates to a target signal extraction device, a target signal extraction method, and a program for extracting a target signal from a mixed signal.

One technique for separating and extracting a target signal from a mixed signal is based on independent component analysis (hereinafter "ICA"). It estimates a plurality of linearly mixed signals without using any knowledge of the original signals or the mixing process, and is called blind source separation (BSS). Blind source separation is described first.
Mixed signal (observed signal) model in a real environment
FIG. 8 shows a model of blind source separation by the ICA method. Let s_i be the signal of signal source 11_i, h_ji the impulse response (frequency response) from signal source 11_i to sensor 12_j, P the order of the impulse response, N the number of signal sources 11_i (N > 1), M the number of sensors 12_j (M ≥ N), and n the discrete time. The signal y_j(n) observed by sensor 12_j is then expressed by Equation (1): the sum, over all N signal sources, of each source signal s_i convolved with the impulse response h_ji.
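The equation itself is not reproduced in this text. From the definitions above, a convolutive mixture of this kind is conventionally written as follows; the summation limits (p = 0 to P − 1) are an assumption, since the source only states that P is the order of the impulse response.

```latex
y_j(n) = \sum_{i=1}^{N} \sum_{p=0}^{P-1} h_{ji}(p)\, s_i(n-p),
\qquad j = 1, \dots, M \tag{1}
```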

Here, the N signals s_i are assumed to be statistically independent of one another. The observation signal y_j(n) is sampled at a constant period and forms a digital signal sequence.
Separated signal model
In blind source separation, the observation signals obtained in the form of Equation (1) are separated by a separation system consisting of N × M separation filters 13_ij, each of length Q taps and with impulse response w_ij. The signal y_i(n) obtained by separation with the separation filter group 13_ij is the sum, over the M sensors, of each observation signal filtered by w_ij.
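The mixing and separation models described above share the same structure (a bank of FIR filters whose outputs are summed), which can be sketched in NumPy as follows. This is a minimal illustration, not the source's implementation: the function name `fir_bank` and the array layouts are assumptions made for the example.

```python
import numpy as np

def fir_bank(x, g):
    """Apply a bank of FIR filters and sum over inputs.

    x: (K, T) input signals.
    g: (L, K, taps) impulse responses, g[l, k] filters input k for output l.
    Returns (L, T) outputs: out[l, n] = sum_k sum_p g[l, k, p] * x[k, n - p].
    """
    n_out, n_in, _ = g.shape
    T = x.shape[1]
    out = np.zeros((n_out, T))
    for l in range(n_out):
        for k in range(n_in):
            # Full convolution truncated to the original length T.
            out[l] += np.convolve(x[k], g[l, k])[:T]
    return out

# Mixing: sources s (N, T) and room responses h (M, N, P) give observations.
# Separation: observations and filters w (N, M, Q) give separated signals.
s = np.arange(20, dtype=float).reshape(2, 10)
h = np.zeros((2, 2, 3))
h[0, 0, 0] = 1.0   # source 0 reaches sensor 0 directly
h[1, 1, 0] = 1.0   # source 1 reaches sensor 1 directly
h[0, 1, 2] = 0.5   # source 1 leaks into sensor 0, delayed by 2 samples
y = fir_bank(s, h)
```

The same function serves for the separation stage by passing the observations and the separation filters w_ij instead of the sources and h_ji.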

FIG. 8 shows, for the case N = M = 2, the mixing process between signal sources 11₁, 11₂ and sensors 12₁, 12₂, and the separation process in which the separated signals y₁(n), y₂(n) are obtained at output terminals 14₁, 14₂ from the output signals y₁, y₂ of sensors 12₁, 12₂ by the ICA method using the 2 × 2 filter group 13_ij.
Independent component analysis (ICA) is widely used to estimate the separation filter coefficients (frequency responses) w_ij. A method that transforms the time-domain signals into the frequency domain and obtains a separation matrix at each frequency (frequency-domain BSS) is widely used (see, for example, Non-Patent Document 1).

FIG. 9 shows the functional configuration of the conventional frequency-domain BSS method using ICA based on the maximum likelihood criterion. A frequency domain transform unit converts the observed signals y₁(n), y₂(n) into frequency-domain signals Y(ω), for example by a short-time discrete Fourier transform (DFT), in which a window function is applied and a discrete Fourier transform is taken one frame at a time while shifting by, for example, half a frame.
A separation matrix estimation unit based on the likelihood maximization criterion estimates the separation matrix W(f) so that the frequency-domain signals Y_i(f, m) of the respective output signals become mutually independent; the outputs at each frequency are obtained by applying W(f) to the observed frequency-domain signals.
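The frequency-domain processing described above can be sketched as follows: a short-time DFT with a window and a half-frame shift, followed by a separate matrix multiplication at every frequency bin. This is an illustrative sketch only; the Hann window, the frame length of 256, and the function names are assumptions, and the estimation of W(f) itself is not shown.

```python
import numpy as np

def stft(x, frame_len=256):
    """Short-time DFT of a 1-D signal with a Hann window and half-frame hop.

    Returns an array of shape (F, n_frames) with F = frame_len // 2 + 1.
    """
    hop = frame_len // 2
    window = np.hanning(frame_len)
    n_frames = 1 + (len(x) - frame_len) // hop
    frames = np.stack(
        [x[m * hop : m * hop + frame_len] * window for m in range(n_frames)]
    )
    return np.fft.rfft(frames, axis=1).T

def separate_per_frequency(W, Y):
    """Apply a separation matrix at each frequency bin.

    W: (F, N, M) separation matrices, one per frequency.
    Y: (M, F, n_frames) observed frequency-domain signals.
    Returns Z: (N, F, n_frames) with Z[:, f, m] = W[f] @ Y[:, f, m].
    """
    return np.einsum("fnm,mft->nft", W, Y)

rng = np.random.default_rng(0)
x = rng.standard_normal((2, 1024))           # two-channel observation
Y = np.stack([stft(ch) for ch in x])         # (2, 129, 7)
W = np.tile(np.eye(2), (Y.shape[1], 1, 1))   # identity placeholder for W(f)
Z = separate_per_frequency(W, Y)
```

With the identity placeholder the outputs equal the inputs; in practice W(f) would be the matrix estimated by ICA at each frequency.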

In this way, separation is achieved at each frequency component.
A time domain transform unit converts the signals Y_i(f, m) output in the frequency domain into time-domain signals, for example by an inverse Fourier transform. The target signal is then separated and extracted by selecting it, by some method, from the separated signals thus obtained.

One problem with performing ICA in the frequency domain is that the independence of the outputs Z₁(f, m) and Z₂(f, m) is preserved even if the rows of the separation matrix are interchanged. If the first and second rows of the separation matrix are swapped, Z₂(f, m) is obtained at the first output and Z₁(f, m) at the second, yet the two output signals are still independent. In other words, ICA makes the output signals mutually independent but does not constrain the order in which they are output.
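The effect of swapping rows can be checked numerically at a single frequency bin. The sketch below uses an arbitrary, hypothetical 2 × 2 matrix in place of an estimated W(f) and random data in place of real observations.

```python
import numpy as np

rng = np.random.default_rng(0)
# Observed frequency-domain frames at one frequency: 2 sensors, 100 frames.
Y = rng.standard_normal((2, 100)) + 1j * rng.standard_normal((2, 100))

# Hypothetical separation matrix at this frequency (values are arbitrary).
W = np.array([[1.0, -0.5],
              [-0.3, 1.0]])

Z = W @ Y                      # outputs Z1, Z2
W_swapped = W[[1, 0], :]       # interchange the first and second rows
Z_swapped = W_swapped @ Y      # same signals, delivered in the other order
```

The swapped matrix produces exactly the same pair of signals in the opposite order, so a criterion based only on independence cannot prefer one ordering over the other.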

Consequently, for any two frequencies f₁ and f₂, the output signals Z₁(f₁, m) and Z₁(f₂, m), for example, are not necessarily estimates of the same signal s_i. In frequency-domain ICA, the rows (channels) of the separation matrix must therefore be ordered correctly so that Z_i(f₁, m) and Z_i(f₂, m) are estimates of the signal s_i of the same source. This is called the permutation problem.
In addition, because conventional ICA calculates the separation matrix based on the maximum likelihood criterion, the calculation must be repeated until the separation matrix converges to an optimum, which takes time. Moreover, the optimum obtained is a local optimum and is not guaranteed to be the global optimum.

Patent Document 1 describes a method that uses prior knowledge to solve the permutation problem of ICA. In Patent Document 1, one component of the separation matrix is calculated by ICA based on the maximum likelihood criterion, using an initial value of the separation vector (one component of the separation matrix) calculated from prior knowledge (see FIG. 10). The prior frequency response information H₁(f) used as prior knowledge is either calculated as H_j1(f) = exp(j2πfτ_j1), τ_j1 = (d_j/c) sin θ₁, with the direction of the target signal source (direction of arrival of the target signal) θ known, or measured in advance. Let the known frequency response be H′_j1(f) = exp(j2πfτ_j1). The constraint is given, for example, by
Σ_{j=1}^{M} H′_j1 W_1j(f) = W₁(f) H₁(f) = 1

A coefficient W′_1j(f) that minimizes the power of the error signal E(f, m) while satisfying this constraint is then obtained. Here, H₁(f) = [H′₁₁(f), H′₂₁(f)]^T and W₁(f) = [W₁₁(f), W₁₂(f)]. The constraint requires the frequency response from the target signal to the output to be 1 at all frequencies, which is the condition for the target signal to be output without distortion.
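The frequency response formula and the distortionless constraint above can be illustrated with a short sketch. It computes H_j1(f) = exp(j2πfτ_j1) with τ_j1 = (d_j/c) sin θ₁ and then builds one row vector that satisfies W₁(f) H₁(f) = 1. Note the assumptions: the speed of sound c = 340 m/s, the sensor positions, and the function names are illustrative, and the weight shown is the minimum-norm solution of the constraint, not the power-minimizing solution that Patent Document 1 is described as computing.

```python
import numpy as np

def steering_vector(f, theta, d, c=340.0):
    """H_j1(f) = exp(j*2*pi*f*tau_j1), tau_j1 = (d_j / c) * sin(theta).

    f: frequency in Hz; theta: direction of arrival in radians;
    d: sensor positions in metres; c: assumed speed of sound in m/s.
    """
    tau = (np.asarray(d, dtype=float) / c) * np.sin(theta)
    return np.exp(1j * 2 * np.pi * f * tau)

def min_norm_distortionless(H):
    """Row vector w with w @ H == 1 (minimum-norm choice, for illustration)."""
    return np.conj(H) / np.vdot(H, H)

H1 = steering_vector(1000.0, np.deg2rad(30.0), d=[0.0, 0.05])
W1 = min_norm_distortionless(H1)
response = W1 @ H1   # frequency response from target to output at this f
```

Any vector satisfying the constraint passes the target undistorted at that frequency; the remaining degrees of freedom are what a power-minimization step would use to suppress interference.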

Patent Document 2 addresses the permutation problem that occurs markedly when the number of signal sources is smaller than the number of sensors. In Patent Document 2, independent component analysis is performed sequentially in the frequency domain (FD) and the time domain (TD); in particular, the signal identification process in FDICA is divided into a plurality of sub-blocks, and the number of signal sources estimated in this process is used to substantially match the number of active sensors to the number of signal sources.
Patent Document 3 discloses a technique for estimating the mean vector of a phoneme model (HMM) by the maximum a posteriori (MAP) estimation method.

Patent Document 1: JP 2004-302122 A
Patent Document 2: JP 2005-91560 A
Patent Document 3: JP-A-8-22296
Non-Patent Document 1: A. Hyvaerinen, J. Karhunen, and E. Oja, "Independent Component Analysis," John Wiley & Sons, 2001, ISBN 0-471-40540

However, because Patent Documents 1 and 2 obtain the separation matrix by ICA based on the maximum likelihood criterion, they do not solve the conventional problems that the calculation takes time and that the accuracy of the obtained separation matrix and of the extracted target signal is low.
When the MAP estimation method is applied to an HMM as in Patent Document 3, the speech of unspecified speakers can be used as known prior information, but no such known prior information exists for ICA. To apply MAP estimation to ICA, the prior information must therefore be estimated. Furthermore, to reduce the amount of computation and perform the calculation at high speed, the prior information must be estimated with high accuracy.
The present invention has been made in view of the above problems, and its object is to provide a target signal extraction device, a target signal extraction method, and a program that make it possible to extract a target signal from a mixed signal at high speed and with high accuracy.

To solve the above problems, the invention according to claim 1 provides a target signal extraction device comprising: separation matrix calculation means for observing, with a plurality of sensors, a mixed signal that arrives from a plurality of directions and includes a target signal, obtaining a time series for each frequency based on the observation signals from the plurality of sensors, and performing independent component analysis on the time series to obtain a separation matrix for each frequency; and target signal extraction means for extracting the target signal from the mixed signal using the separation matrix for each frequency, wherein the separation matrix calculation means obtains the separation matrix by independent component analysis based on the maximum posterior probability criterion.
According to the present invention, because the target signal extraction device obtains the separation matrix by independent component analysis based on the maximum posterior probability criterion, it can obtain the separation matrix faster and more accurately than conventional independent component analysis based on the maximum likelihood criterion, and can use the obtained separation matrix to extract the target signal from the mixed signal at high speed and with high accuracy.

The invention according to claim 2 is the target signal extraction device according to claim 1, wherein the separation matrix calculation means obtains the separation matrix by independent component analysis based on the maximum posterior probability criterion, using as prior information a separation matrix estimated from the frequency response from the target signal source to the sensors.
According to the present invention, because the prior information is a separation matrix estimated from the frequency response from the target signal source to the sensors, the prior information can be calculated with high accuracy, and the problem of channel interchange (permutation) in the separation matrix no longer arises. In addition, the separation matrix can be obtained at high speed and with high accuracy, and the obtained separation matrix can be used to extract the target signal from the mixed signal at high speed and with high accuracy.

The invention according to claim 3 is the target signal extraction device according to claim 1 or 2, wherein the separation matrix calculation means obtains the separation matrix by independent component analysis based on the maximum posterior probability criterion, using as prior information a separation matrix estimated from the signal of an interval in which only the interference signal is present.
According to the present invention, because the prior information is a separation matrix estimated from the signal of an interval in which only the interference signal is present, the prior information can be estimated with high accuracy and the separation matrix can be obtained at high speed and with high accuracy. The obtained separation matrix can therefore be used to extract the target signal from the mixed signal at high speed and with high accuracy.

The invention according to claim 4 is the target signal extraction device according to any one of claims 1 to 3, wherein the separation matrix calculation means obtains the separation matrix by independent component analysis based on the maximum posterior probability criterion, using as prior information a separation matrix estimated, under a minimum-energy constraint, from the signal of an interval in which only the interference signal is present.
According to the present invention, because the prior information is a separation matrix estimated under a minimum-energy constraint from the signal of an interval in which only the interference signal is present, the prior information can be estimated with high accuracy and the separation matrix can be obtained at high speed and with high accuracy. The obtained separation matrix can therefore be used to extract the target signal from the mixed signal at high speed and with high accuracy.

The invention according to claim 5 is the target signal extraction device according to any one of claims 1 to 4, wherein the separation matrix calculation means obtains the separation matrix by independent component analysis based on the maximum posterior probability criterion, using as prior information a separation matrix estimated from the correlation between the observation signals of the channels.
According to the present invention, because the prior information is a separation matrix estimated from the correlation between the observation signals of the channels, the prior information can be estimated with high accuracy and the separation matrix can be obtained at high speed and with high accuracy. The obtained separation matrix can therefore be used to extract the target signal from the mixed signal at high speed and with high accuracy.

The invention according to claim 6 is the target signal extraction device according to any one of claims 1 to 5, wherein the separation matrix calculation means obtains the separation matrix by independent component analysis based on the maximum posterior probability criterion, using as prior information a separation matrix estimated under the constraint that the correlation between the observation signals of the channels is zero.
According to the present invention, because the prior information is a separation matrix estimated under the constraint that the correlation between the observation signals of the channels is zero, independent features can be extracted from the mixed signal with high accuracy, the prior information can be estimated with high accuracy, and the separation matrix can be obtained at high speed and with high accuracy. The obtained separation matrix can therefore be used to extract the target signal from the mixed signal at high speed and with high accuracy.

The invention according to claim 7 provides a target signal extraction method comprising: a signal observation step of observing, with a plurality of sensors, a mixed signal that arrives from a plurality of directions and includes a target signal; a frequency domain transform step of obtaining a time series for each frequency based on the observation signals from the plurality of sensors; a separation matrix calculation step of obtaining a separation matrix for each frequency by performing, on the time series obtained in the frequency domain transform step, independent component analysis based on the maximum posterior probability criterion; and a target signal extraction step of extracting the target signal from the mixed signal using the separation matrix obtained in the separation matrix calculation step.
According to the present invention, obtaining the separation matrix by independent component analysis based on the maximum posterior probability criterion makes it possible to obtain the separation matrix faster and more accurately than conventional independent component analysis based on the maximum likelihood criterion. The obtained separation matrix can therefore be used to extract the target signal from the mixed signal at high speed and with high accuracy.

The invention according to claim 8 provides a program for causing a computer to execute: a prior information estimation step of estimating, as prior information, an initial value of a separation matrix for extracting a target signal from a mixed signal; a separation matrix calculation step of obtaining the separation matrix by sequentially updating the prior information estimated in the prior information estimation step, using independent component analysis based on the maximum posterior probability criterion; and a target signal extraction step of extracting the target signal from the mixed signal using the separation matrix obtained in the separation matrix calculation step.
According to the present invention, performing independent component analysis based on the maximum posterior probability criterion with the estimated prior information makes it possible to obtain the separation matrix faster and more accurately than conventional independent component analysis based on the maximum likelihood criterion. The obtained separation matrix can therefore be used to extract the target signal from the mixed signal at high speed and with high accuracy.

According to the present invention, because the target signal extraction device obtains the separation matrix by independent component analysis based on the maximum posterior probability criterion, it can obtain the separation matrix faster and more accurately than conventional independent component analysis based on the maximum likelihood criterion, and can use the obtained separation matrix to extract the target signal from the mixed signal at high speed and with high accuracy.

Embodiments of the present invention are described below with reference to the drawings.
(First embodiment)
FIG. 1 is a block diagram showing the functional configuration of the target signal extraction device 10 according to the first embodiment of the present invention. As shown in the figure, the target signal extraction device 10 comprises a separation matrix calculation unit 11 and a target signal extraction unit 12.

The separation matrix calculation unit 11 obtains a time series for each frequency based on the observation signals from a plurality of sensors, and obtains a separation matrix for each frequency by performing independent component analysis on the time series based on the maximum posterior probability criterion.
The separation matrix calculation unit 11 includes a prior information estimation unit 111. The prior information estimation unit 111 estimates (the initial value of) the separation matrix as prior information for the independent component analysis based on the maximum posterior probability criterion.

The prior knowledge holding unit 112 holds prior knowledge. In this embodiment, it holds a pre-measured value of the frequency response from the target signal source to the sensors as prior knowledge, and the initial value of the separation matrix is estimated from this pre-measured frequency response.
These functional components are realized by hardware of the target signal extraction device 10, such as a CPU and memory, and by software, such as programs and data stored in the memory.

Next, the processing procedure performed by the target signal extraction device 10 is described with reference to FIG. 2. The case described here has a single noise source, the target sound source is the user's voice, and these sounds are input to a mobile phone serving as the target signal extraction device 10. The separation matrix is assumed to be a 2 × 2 matrix (2 rows, 2 columns). The observation signals are Y₁(n), Y₂(n), the separated signals are Z₁(n), Z₂(n), and Z₁(n) is extracted as the separated target signal. The frequency response from the user's mouth (target signal source) to the microphones (sensors) of the mobile phone is measured in advance and stored in the prior knowledge holding unit 112 as a prior measurement value (prior knowledge).
First, in Step 1, the prior measurement value stored in the prior knowledge holding unit 112 is acquired.
In Step 2, this value is set as one column of the separation matrix (that is, a₁ and a₂ in the second column of the separation matrix of Equation (1) below).

In Step 3, the uttered speech data is measured and converted to the frequency domain.
In Step 4, variation in W is absorbed by MAP-estimation ICA (independent component analysis based on the maximum posterior probability criterion). Specifically, prior knowledge about W is introduced in place of the conventional W = argmax P(W/data), and W is estimated by MAP estimation according to Equation (2).
W = argmax P(W/data) P(W)   (2)
A probability distribution of W is assumed whose mean is given by a₁, a₂ obtained in the above steps, and W is estimated by updating it sequentially according to Equation (3) based on the maximum posterior probability criterion.

For example, when a normal distribution is assumed, the following equation is obtained.
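The equations referred to in this passage are not reproduced in this text. As a sketch only, and not necessarily the exact expressions of the source, MAP estimation with a normal prior is conventionally written as the likelihood P(data | W) multiplied by the prior P(W), where the prior is centred on the prior estimate μ with variance σ². The isotropic form of the prior and the use of the Frobenius norm are assumptions.

```latex
P(W) \propto \exp\!\left(-\frac{\lVert W - \mu \rVert_F^{2}}{2\sigma^{2}}\right),
\qquad
\hat{W} = \arg\max_{W}\,\Bigl[\,\log P(\mathrm{data}\mid W)
          - \frac{\lVert W - \mu \rVert_F^{2}}{2\sigma^{2}}\Bigr]
```

A small σ² keeps the estimate close to the prior estimate μ, while a large σ² lets the data term dominate, in which case the estimate approaches the maximum likelihood solution.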

Here, μ is the prior estimate, and σ² is the variance representing the variation of the prior estimate of the separation matrix.
In Step 5, separation is performed using the separation matrix estimated in the above steps, and the target signal is extracted.
In Step 6, the target signal is restored to a time-domain signal.
(Second embodiment)
Next, a second embodiment is described. FIG. 3 is a block diagram showing the functional configuration of the target signal extraction device 10 according to the second embodiment of the present invention. As shown in the figure, the separation matrix calculation unit 11 according to the second embodiment includes a silent section detection unit 113 in addition to the prior information estimation unit 111 and the prior knowledge holding unit 112 of the separation matrix calculation unit 11 according to the first embodiment.

In addition to the functions of the prior information estimation unit 111 according to the first embodiment, the prior information estimation unit 111 according to the second embodiment has a function of estimating one component of the separation matrix from the signal of an interval in which only the interference signal is present (a silent section).
Next, the processing procedure performed by the target signal extraction device 10 according to this embodiment is described with reference to FIG. 4. The preconditions are the same as those described with reference to FIG. 2 for the first embodiment.
First, in Step 1, the prior measurement value is acquired from the prior knowledge holding unit 112.
Next, in Step 2, the value acquired from the prior knowledge holding unit 112 is set as one column of the separation matrix (that is, a₁ and a₂ in the second column of the separation matrix shown in Equation (1) below).

In Step 3, the uttered speech data is measured and converted to the frequency domain.
In Step 4, a section in which the user is not speaking is detected.
In Step 5, w₁ is obtained based on the criterion of minimizing the energy of the output Z₁, using the observation data (Y₁ and Y₂) of the section in which the user is not speaking and the following equation (2).
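Steps 3 and 4 can be sketched as follows. This is a minimal illustration, not the patent's implementation: the text does not specify the transform or the detector, so the short-time Fourier transform parameters (`frame_len`, `hop`), the relative energy threshold (`threshold_db`), and the function names `stft` and `silent_frames` are all assumptions introduced here.

```python
import numpy as np

def stft(x, frame_len=256, hop=128):
    """Short-time Fourier transform of a 1-D signal.

    Returns a complex array of shape (n_freq, n_frames).
    """
    window = np.hanning(frame_len)
    n_frames = 1 + (len(x) - frame_len) // hop
    # Cut the signal into overlapping, windowed frames (one frame per column).
    frames = np.stack(
        [x[i * hop : i * hop + frame_len] * window for i in range(n_frames)],
        axis=1,
    )
    return np.fft.rfft(frames, axis=0)

def silent_frames(Y, threshold_db=-30.0):
    """Flag frames in which the user is assumed not to be speaking.

    Y has shape (n_channels, n_freq, n_frames). A frame is flagged when its
    total energy is more than |threshold_db| below the loudest frame
    (an assumed, very simple detector).
    """
    energy = np.sum(np.abs(Y) ** 2, axis=(0, 1))
    return energy < energy.max() * 10.0 ** (threshold_db / 10.0)
```

The observation data Y₁ and Y₂ used in Step 5 would then be the columns of each channel's spectrogram selected by the returned mask.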

As a result, w₁ is estimated by equation (4):
w₁ = a₁ × (R₂₂ − R₂₁) / (R₁₁ − R₁₂)   (4)
where R_ij = E[y_i y_j]   (5)
In Step 6, fluctuations in W are absorbed by ICA with MAP estimation. That is, instead of the conventional W = argmax P(W/data), prior knowledge about W is introduced, and W is estimated by MAP estimation according to equation (6).
W = argmax P(W/data) P(W)   (6)
A probability distribution is assumed for W, with w₁ obtained in the above steps and a₁, a₂ as its mean, and W is estimated by updating it sequentially according to equation (7) based on the maximum posterior probability criterion.
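As a sketch of how equations (4) and (5) might be evaluated numerically, the following replaces the expectations with sample means over the frames of the silent section. It assumes real-valued observations; for complex frequency-domain data a conjugate would normally appear in the correlation, which the text does not spell out. The function names `correlation` and `estimate_w1` are hypothetical.

```python
import numpy as np

def correlation(Y):
    """Sample estimate of R_ij = E[y_i y_j] for Y of shape (2, T)."""
    return (Y @ Y.T) / Y.shape[1]

def estimate_w1(Y_silent, a1, eps=1e-12):
    """Evaluate equation (4) literally from silent-section observations.

    Y_silent: (2, T) real-valued observations (assumption)
    a1      : prior measurement value placed in the separation matrix
    """
    R = correlation(Y_silent)
    denom = R[0, 0] - R[0, 1]              # R11 - R12
    if abs(denom) < eps:
        raise ValueError("R11 - R12 is too close to zero")
    return a1 * (R[1, 1] - R[1, 0]) / denom  # a1 * (R22 - R21) / (R11 - R12)
```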

For example, when a normal distribution is assumed, the following equation (8) is obtained.
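For reference, a common way to write a normal prior on W and the resulting MAP update is shown below. This is the standard isotropic Gaussian form for a real-valued W, given as an illustration only; the patent's own equations (7) and (8) may differ in detail. The symbols L(W) (the data log-likelihood term) and η (the step size) are notation introduced here, while μ and σ² follow the text.

```latex
\begin{align}
P(W) &\propto \exp\!\left(-\frac{\lVert W-\mu\rVert_F^{2}}{2\sigma^{2}}\right),\\
\log\bigl[P(\text{data}\mid W)\,P(W)\bigr] &= L(W) - \frac{\lVert W-\mu\rVert_F^{2}}{2\sigma^{2}} + \text{const},\\
W &\leftarrow W + \eta\left[\frac{\partial L(W)}{\partial W} - \frac{W-\mu}{\sigma^{2}}\right].
\end{align}
```

Under this form, a small σ² keeps W close to the prior estimate μ, while a large σ² makes the update approach ordinary maximum-likelihood ICA.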

Here, μ is the prior estimate, and σ² is the variance representing the fluctuation of the prior estimate of the separation matrix.
In Step 7, separation is performed using the separation matrix estimated in the preceding steps, and the target signal is extracted.
In Step 8, the target signal is restored to a time-domain signal.
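The sequential MAP update of Step 6 can be illustrated with a small gradient-ascent sketch. It is not the patent's equation (7): it assumes real-valued data for a single frequency bin, a tanh score function for the sources, a Gaussian prior with mean `mu` and variance `sigma2`, and a likelihood gradient averaged over frames (so `sigma2` acts as an effective per-frame prior variance). The names `map_ica_step` and `map_ica` are hypothetical.

```python
import numpy as np

def map_ica_step(W, Y, mu, sigma2, lr=1e-4):
    """One gradient-ascent step on (mean log-likelihood + Gaussian log-prior).

    W, mu : (n, n) current separation matrix and its prior mean
    Y     : (n, T) real-valued observations (assumption)
    sigma2: variance of the assumed Gaussian prior on the entries of W
    """
    T = Y.shape[1]
    Z = W @ Y  # current separated outputs
    # Maximum-likelihood ICA gradient with a tanh score (assumed source model).
    grad_lik = np.linalg.inv(W).T - (np.tanh(Z) @ Y.T) / T
    # Gradient of the Gaussian log-prior: pulls W back toward mu.
    grad_prior = -(W - mu) / sigma2
    return W + lr * (grad_lik + grad_prior)

def map_ica(W0, Y, mu, sigma2, lr=1e-4, n_iter=300):
    """Repeat the step; lr should be small relative to sigma2 for stability."""
    W = W0.copy()
    for _ in range(n_iter):
        W = map_ica_step(W, Y, mu, sigma2, lr)
    return W
```

With a small `sigma2` the result stays close to `mu`, which is how a reliable prior estimate can suppress drift and channel swapping; with a large `sigma2` the update behaves like ordinary maximum-likelihood ICA.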
(Third Embodiment)
Next, a third embodiment will be described. FIG. 5 is a block diagram showing the functional configuration of the target signal extraction apparatus 10 according to the third embodiment of the present invention.

In addition to the functions of the prior information estimation unit 111 according to the second embodiment, the prior information estimation unit 111 according to the third embodiment has a function of estimating one component of the separation matrix using the correlation between the observation signals of the channels.
Next, the procedure of the processing performed by the target signal extraction apparatus 10 according to the present embodiment will be described with reference to FIG. 6. The preconditions are the same as those described with reference to FIG. 2 in the first embodiment.
First, in Step 1, a prior measurement value is acquired from the prior knowledge holding unit 112.
Next, in Step 2, the prior measurement value acquired in Step 1 is set as one column of the separation matrix (that is, a₁ and a₂ in the second column of the separation matrix of the following equation (1)).

In Step 3, the uttered speech data is measured and converted to the frequency domain.
In Step 4, a section in which the user is not speaking (a section containing only interference waves) is detected.
In Step 5, w₁ is obtained based on the criterion of minimizing the energy of the output Z₁, using the observation data (Y₁ and Y₂) of the section in which the user is not speaking and the following equation (2).

As a result, w₁ is estimated by equation (4):
w₁ = a₁ × (R₂₂ − R₂₁) / (R₁₁ − R₁₂)   (4)
where R_ij = E[y_i y_j].
In Step 6, w₂ is obtained (equation (6)) from the decorrelation constraint of equation (5), using the speech data Y of all utterance sections.
E[Z₁, Z₂] = 0   (5)
w₂ = −(a₂ w₁ E(y₁y₂) + a₁ a₂ E(y₂y₂)) / (w₁ E(y₁y₁) + a₁ E(y₁y₂))   (6)
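Equation (6) can be checked with a short derivation. The form of the separation matrix assumed below (prior values a₁, a₂ in the second column, unknowns w₁, w₂ in the first) is inferred from the description of Step 2 and from equation (6) itself, since equation (1) is not reproduced in the text; the constraint (5) is read as E[Z₁Z₂] = 0, and real-valued signals with E[y₁y₂] = E[y₂y₁] are assumed.

```latex
\begin{align}
\begin{pmatrix} Z_1 \\ Z_2 \end{pmatrix}
 &= \begin{pmatrix} w_1 & a_1 \\ w_2 & a_2 \end{pmatrix}
    \begin{pmatrix} y_1 \\ y_2 \end{pmatrix}
 \;\Rightarrow\;
 Z_1 = w_1 y_1 + a_1 y_2,\quad Z_2 = w_2 y_1 + a_2 y_2,\\
E[Z_1 Z_2]
 &= w_1 w_2\,E[y_1 y_1] + (w_1 a_2 + a_1 w_2)\,E[y_1 y_2] + a_1 a_2\,E[y_2 y_2] = 0,\\
w_2\bigl(w_1 E[y_1 y_1] + a_1 E[y_1 y_2]\bigr)
 &= -\bigl(a_2 w_1 E[y_1 y_2] + a_1 a_2 E[y_2 y_2]\bigr),\\
w_2 &= -\,\frac{a_2 w_1 E[y_1 y_2] + a_1 a_2 E[y_2 y_2]}{w_1 E[y_1 y_1] + a_1 E[y_1 y_2]}.
\end{align}
```

The last line agrees term by term with equation (6).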

In Step 7, fluctuations in W are absorbed by ICA with MAP estimation. That is, instead of the conventional W = argmax P(W/data), prior knowledge about W is introduced, and W is estimated by MAP estimation according to equation (7).
W = argmax P(W/data) P(W)   (7)
A probability distribution is assumed for W, with w₁, w₂ obtained in the above steps and a₁, a₂ as its mean, and W is estimated by updating it sequentially according to equation (8) based on the maximum posterior probability criterion.

For example, when a normal distribution is assumed, the following equation (9) is obtained.

Here, μ is the prior estimate, and σ² is the variance representing the fluctuation of the prior estimate of the separation matrix.
In Step 8, separation is performed using the separation matrix estimated in the preceding steps, and the target signal is extracted.
In Step 9, the target signal is restored to a time-domain signal.
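The separation of Step 8 amounts to applying the estimated matrix to the observations independently at each frequency. A minimal sketch follows; the array layout (frequency-major separation matrices, channel-major spectrograms) and the function name `separate` are assumptions made for illustration.

```python
import numpy as np

def separate(W, Y):
    """Apply a per-frequency separation matrix to multichannel spectrograms.

    W: (n_freq, n_out, n_ch)   one separation matrix per frequency bin
    Y: (n_ch, n_freq, n_frames) observed spectrograms
    Returns Z of shape (n_out, n_freq, n_frames), with Z[:, f, t] = W[f] @ Y[:, f, t].
    """
    return np.einsum("foc,cft->oft", W, Y)
```

The time-domain restoration of Step 9 would then apply an inverse short-time Fourier transform to the row of Z that carries the target signal.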
FIG. 7 shows an example of a speech signal restored using the conventional ICA method based on the maximum likelihood criterion (FIG. 7(a)) and a speech signal restored using the ICA method of the present invention based on the maximum posterior probability criterion (FIG. 7(b)). The horizontal axis of each graph represents time, and the vertical axis represents signal level. The signal level fluctuates less in the graph of FIG. 7(b) than in that of FIG. 7(a), showing that more of the noise signal has been removed from the speech signal. In other words, the speech signal restored using the method of the present invention was confirmed to have a better SNR (signal-to-noise ratio) than the speech signal restored using conventional ICA.
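The text does not state how the SNR was measured. One common definition, usable when a clean reference signal is available (an assumption that holds in simulations but usually not in the field), is sketched below; the function name `snr_db` is hypothetical.

```python
import numpy as np

def snr_db(reference, estimate):
    """SNR in dB of a restored signal against a clean reference.

    Everything in the estimate that differs from the reference is
    counted as noise.
    """
    reference = np.asarray(reference, dtype=float)
    noise = np.asarray(estimate, dtype=float) - reference
    return 10.0 * np.log10(np.sum(reference ** 2) / np.sum(noise ** 2))
```

Comparing this value for the outputs of the two methods on the same recording gives a single-number counterpart to the visual comparison in FIG. 7.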

As described above, because independent component analysis based on the maximum posterior probability criterion is performed using prior information, a global optimum can be obtained for the separation matrix, and the target signal can be extracted from the mixed signal faster and more accurately than with independent component analysis based on the maximum likelihood criterion, which uses no prior information. In addition, because a separation matrix estimated with high accuracy can be used as prior information, the separation matrix can be calculated quickly and without channel permutation.

In the above embodiments, the case where the separation matrix is a 2 × 2 matrix has been described; however, the present invention is not limited to this and is also applicable to matrices with three or more rows and three or more columns. Also, in the flowcharts of the above embodiments, the target signal extracted in the frequency domain is described as being converted to the time domain; however, as shown in the functional configuration diagrams, the target signal may instead be extracted using a separation matrix that has been converted to the time domain.

The present invention is effective for estimating a target signal in situations where signals from a plurality of directions are received mixed together, so that the target signal cannot be observed directly by itself and is instead observed with other noise superimposed on it. The method of the present invention is particularly effective under low-SNR conditions.
For example, by applying the target signal extraction apparatus 10 according to the present invention to a mobile phone, only the user's voice can be extracted even when ambient noise is picked up by the mobile phone along with the user's input speech in a noisy environment. Besides mobile phones, the invention can be applied to any communication terminal, such as a PDA (Personal Digital Assistant), a PHS (Personal Handyphone System) terminal, a fixed-line telephone, or a cordless handset of a fixed-line telephone. It can also be used for interference cancellation in environments where disturbances occur, such as wireless communication.

FIG. 1 is a block diagram showing the functional configuration of the target signal extraction apparatus according to the first embodiment of the present invention.
FIG. 2 is a flowchart showing the procedure of the processing performed by the target signal extraction apparatus according to the first embodiment.
FIG. 3 is a block diagram showing the functional configuration of the target signal extraction apparatus according to the second embodiment of the present invention.
FIG. 4 is a flowchart showing the procedure of the processing performed by the target signal extraction apparatus according to the second embodiment.
FIG. 5 is a block diagram showing the functional configuration of the target signal extraction apparatus according to the third embodiment of the present invention.
FIG. 6 is a flowchart showing the procedure of the processing performed by the target signal extraction apparatus according to the third embodiment.
FIG. 7(a) is a graph of a speech signal restored using the conventional ICA method based on the maximum likelihood criterion, and FIG. 7(b) is a graph of a speech signal restored using the ICA method of the present invention based on the maximum posterior probability criterion.
FIG. 8 is a diagram showing a model of blind source separation (BSS) by the conventional ICA method.
FIG. 9 is a diagram showing the functional configuration of the frequency-domain BSS method using ICA based on the conventional maximum likelihood method.
FIG. 10 is a block diagram showing the functional configuration of the target signal extraction apparatus in conventional Patent Document 1.

Explanation of symbols

10 Target signal extraction apparatus
11 Separation matrix calculation unit
12 Target signal extraction unit
111 Separation matrix initial value calculation unit
112 Prior information holding unit
113 Silent section detection unit

Claims

A target signal extraction device comprising: separation matrix calculation means for obtaining a separation matrix by observing, with a plurality of sensors, mixed signals including a target signal arriving from a plurality of directions, obtaining a time series for each frequency based on the observation signals from the plurality of sensors, and performing independent component analysis on the time series; and target signal extraction means for extracting the target signal from the mixed signals using the separation matrix for each frequency,
wherein the separation matrix calculation means obtains the separation matrix by independent component analysis based on a maximum posterior probability criterion.

The target signal extraction device, wherein the separation matrix calculation means obtains the separation matrix by independent component analysis based on the maximum posterior probability criterion, using as prior information a separation matrix estimated based on the frequency response from the target signal source to the sensors.

The target signal extraction device, wherein the separation matrix calculation means obtains the separation matrix by independent component analysis based on the maximum posterior probability criterion, using as prior information a separation matrix estimated using the signal in a section in which only an interference signal exists.

The target signal extraction device according to any one of claims 1 to 3, wherein the separation matrix calculation means obtains the separation matrix by independent component analysis based on the maximum posterior probability criterion, using as prior information a separation matrix estimated, from the signal in a section in which only an interference signal exists, based on a minimum-energy constraint.

The target signal extraction device according to claim 1, wherein the separation matrix calculation means obtains the separation matrix by independent component analysis based on the maximum posterior probability criterion, using as prior information a separation matrix estimated using the correlation between the observation signals of the channels.

The target signal extraction device according to any one of claims 1 to 5, wherein the separation matrix calculation means obtains the separation matrix by independent component analysis based on the maximum posterior probability criterion, using as prior information a separation matrix estimated based on the constraint that the correlation between the observation signals of the channels is zero.

A target signal extraction method comprising:
a signal observation step of observing, with a plurality of sensors, a mixed signal including a target signal arriving from a plurality of directions;
a frequency domain transformation step of obtaining a time series for each frequency based on the observation signals from the plurality of sensors;
a separation matrix calculation step of obtaining a separation matrix for each frequency by performing independent component analysis based on a maximum posterior probability criterion on the time series obtained in the frequency domain transformation step; and
a target signal extraction step of extracting the target signal from the mixed signal using the separation matrix obtained in the separation matrix calculation step.

A program for causing a computer to execute:
a prior information estimation step of estimating, as prior information, an initial value of a separation matrix for extracting a target signal from a mixed signal;
a separation matrix calculation step of obtaining the separation matrix by sequentially updating the prior information estimated in the prior information estimation step using independent component analysis based on a maximum posterior probability criterion; and
a target signal extraction step of extracting the target signal from the mixed signal using the separation matrix obtained in the separation matrix calculation step.