CN103218524B

CN103218524B - The deficient of density based determines blind source separation method

Info

Publication number: CN103218524B
Application number: CN201310116467.8A
Authority: CN
Inventors: 付卫红; 马丽芬; 曾兴雯; 严新; 李爱丽; 刘乃安; 黑永强; 李晓辉
Original assignee: Xidian University
Current assignee: Xidian University
Priority date: 2013-04-03
Filing date: 2013-04-03
Publication date: 2016-01-20
Anticipated expiration: 2033-04-03
Also published as: CN103218524A

Abstract

The invention discloses the deficient of a kind of density based and determine blind source separation method, mainly solve prior art high, be subject to initial value impact, need the problem of given source signal number.Implementation step is: project on right half hypersphere of unit after removing lowenergy samples data to observation signal; Calculate the density parameter of all subpoints, delete the subpoint that density is less; Utilize the K-means clustering algorithm improved to carry out cluster to residue subpoint, determine best cluster number and cluster centre; Remove and comprise the little cluster of data object number, residue cluster number is the estimated value of source signal number, and corresponding cluster centre is the estimated value of each column vector of hybrid matrix; According to observation signal and the hybrid matrix estimated, adopt linear programming technique Restorer varieties signal.Present invention reduces computation complexity, reduce the impact of initial value on estimated performance, hybrid matrix can be estimated when source signal number is unknown, the estimated accuracy of hybrid matrix and source signal can be improved.

Description

Underdetermined blind source separation method based on density

Technical Field

The invention belongs to the field of signal processing, and particularly relates to an underdetermined blind source separation method which can be used in the fields of mechanical fault detection, voice signal processing, image signal processing, biomedical engineering and the like.

Background

The technology of Blind Source Separation (BSS) is a novel signal processing technology developed in the nineties of the twentieth century, and has wide application in the fields of mechanical failure detection, voice signal processing, image signal processing, biomedical engineering and the like. Blind source separation refers to recovering a source signal from only observed signals, if the source signal and transmission channel parameters are unknown. Underdetermined blind source separation (underdetermined bss) requires the number of observed signals to be smaller than the number of source signals, which is a more practical and challenging problem. The underdetermined blind source separation linear instantaneous hybrid model is as follows:

x(t)=As(t)+v(t),t=1,2,…,T₀1)

wherein x (t) ═ x₁(t)…x_i(t)…x_M(t)]^TAn observation signal vector with dimension M at the time of t sampling, A is an unknown mixing matrix of M × N (M < N), and s (t) [ s ]₁(t)…s_i(t)…s_N(t)]^TAn unknown source signal vector of N dimensions at the time of t sampling; v (t) ═ v₁(t)…v_i(t)…v_M(t)]^TAn M-dimensional additive white Gaussian noise vector is marked for the sampling moment T, and a symbol T represents the transposition of the vector; m is the number of observed signals (number of sensors), N is the number of source signals, T₀Indicating the sample data length.

In response to the underdetermined blind source separation problem, a two-step method based on Sparse Component Analysis (SCA) has become an important means to solve the problem. The sparse component analysis assumes that the source signal is a sparse signal with values at most of the sampling moments being zero or close to zero and values at only a few of the sampling moments being far from zero. When the source signals are all sparse signals, only one source signal plays a leading role at most of sampling moments, and the observation signals received by the receiving end have linear clustering characteristics. According to the linear clustering characteristics of the observed signals, referring to fig. 1, the two-step method firstly adopts a certain clustering algorithm to estimate a mixing matrix, and then adopts a certain algorithm to recover the source signals according to the mixing matrix and the observed signals, such as a linear programming method, a matching pursuit method and the like. Therefore, the estimation of the mixing matrix in the two-step method is very critical, and the accuracy thereof directly influences the recovery effect of the source signal. If the source signal is not a sparse signal, sparse transform tools of the signal, such as short-time fourier transform, wavelet transform and the like, can be adopted to perform sparse representation on the signal, so that the signal has better sparse characteristics in a transform domain. At present, statistical clustering methods such as K-means clustering and the like, potential function methods, robust competitive clustering methods and the like can be used for estimating the mixing matrix.

Statistical clustering methods based on K-means clustering and the like are low in complexity and easy to implement, but performance is easily affected by initial values, the number of source signals needs to be given, and the number of the source signals in practice may be unknown. The potential function method can estimate the mixing matrix when the number of the source signals is unknown, but the method lacks a certain theoretical basis, has strong subjective experience and is only suitable for a two-dimensional space. The method is suitable for a Laplace potential function method of a high-dimensional space, has strong anti-noise capability, but in order to ensure that each local maximum value is estimated, all data objects are selected as initial clustering centers, so that the algorithm complexity is high, and the method has no practicability. The robust competitive clustering method reduces the algorithm complexity, but has poor robustness and lower estimation accuracy of the mixing matrix.

Disclosure of Invention

The present invention aims to provide a density-based underdetermined blind source separation method to reduce the computational complexity, reduce the influence of the initial value on the estimation performance, estimate the mixing matrix when the number of source signals is unknown, and improve the estimation accuracy of the mixing matrix and the source signals.

The technical idea for realizing the invention is as follows: removing low-energy sampling data from the observation signals by using linear clustering characteristics of the observation signals, and projecting the observation signals onto a unit right half hypersphere; calculating density parameters of all projection points; according to the characteristic that the noise and the abnormal value deviate from a straight line corresponding to the array vector of the mixed matrix and the density parameter of the projection point is small, the projection point with the small density parameter is deleted, and the influence of the noise and the abnormal value is reduced; clustering and dividing the residual projection points by using an improved K-means clustering algorithm to determine the optimal clustering number and clustering center; removing clusters containing few data objects and corresponding cluster centers, wherein the number of the remaining cluster centers is an estimated value of the number of the source signals, and the remaining cluster centers are estimated values of column vectors of the mixed matrix, so that an estimated value of the mixed matrix is obtained; and recovering the source signal by adopting a linear programming method according to the observation signal and the estimated mixing matrix. The method comprises the following concrete implementation steps:

1) at all sampling time t, the energy of the observation signal x (t) received by the receiving end is measuredWith low energy threshold₁Make a comparison ifDeleting x (t), otherwise projecting x (t) to the right half hypersphere of unit to obtain projection point

\hat{x} (t) = sign (x_{1} (t)) \times x (t) / {| | x (t) | |}_{2},

Where x (T) is the observed signal at the sampling time T, T is 1,2, …, T₀，T₀For sample data length, x₁(t) is the first component of the observed signal x (t), sign (·) represents the sign of the solved variable, | · calvert |)₂The expression is given in the 2-norm,₁the value range of (a) is 0.1-0.5 times of the average value of the energy of the observation signal;

2) rearranging all the projection points obtained in the step 1) into a form with continuous and increasing subscript:obtaining a set of projection points Computing a set of proxelsOf arbitrary data objectsDensity parameter of

B_{r} ({\tilde{x}}_{i}) = Σu (r - d ({\tilde{x}}_{i}, {\tilde{x}}_{j})) i, j = 1, \cdot \cdot \cdot, n,

Wherein M is the number of observation signals, and n is the projection point setThe number of the medium data objects, and n is less than or equal to T₀，For sets of projection pointsAny two data objects inAndthe euclidean distance between them, r being the neighborhood radius of the data object:

r = \frac{1}{n} \times Σ_{i = 1}^{n} \min (d ({\tilde{x}}_{i}, {\tilde{x}}_{j}) j = 1, \cdot \cdot \cdot, n, i &NotEqual; j,

the function u is specifically defined as:

u (x) = \{\begin{matrix} 1 & x &GreaterEqual; 0 \\ 0 & x < 0 \end{matrix},

wherein

x = r - d ({\tilde{x}}_{i}, {\tilde{x}}_{j});

3) Setting a low densityThreshold η₁And a high density threshold η₂Set the projection pointsDensity parameter of data object in data objectAnd a low density threshold η₁And a high density threshold η₂Comparing, and deleting density parametersTo obtain a set of residual projection pointsSaving density parametersResulting in a high density data object set D, wherein η₁The value of (A) is in the range of 0.01-0.2 times of the maximum value of the density parameter, η₂The value range of (a) is 0.25-0.45 times of the maximum value of the density parameter;

4) using an improved K-means clustering algorithm to collect the residual projection point set obtained in the step 3)The data objects in the system are clustered and divided to obtain the estimated value of the original mixed matrix A

4.1) selecting the initial clustering number K as M and temporary variable F₁0.001, intermediate variable F₂＝0.002；

4.2) selecting K initial clustering centers z from the high-density data object set D obtained in the step 3)₁,z₂,…,z_K；

4.3) clustering with original K-meansThe algorithm sets the remaining proxelsThe data objects in the cluster are divided into K classes to obtain K clusters W₁…W_i…W_KThe cluster centers corresponding to the clusters are respectivelyThe number of the data objects in each cluster is n₁…n_i…n_KWherein i ∈ [1, K]；

4.4) Using the clusters W obtained in step 4.3)₁…W_i…W_KAnd a cluster centerCalculating DBI (Devison bauxid index value)_K：

{DBI}_{K} = \frac{1}{K} Σ_{i = 1}^{K} \max_{j} \frac{\frac{1}{n_{i}} Σ_{p = 1}^{n_{i}} {| | {\tilde{x}}_{p} - {\tilde{z}}_{i} | |}_{2} + \frac{1}{n_{j}} Σ_{q = 1}^{n_{j}} {| | {\tilde{x}}_{q} - {\tilde{z}}_{j} | |}_{2}}{{| | {\tilde{z}}_{i} - {\tilde{z}}_{j} | |}_{2}} j = 1, \cdot \cdot \cdot, K, j &NotEqual; i,

Wherein,for clustering W_iP 1, …, n_i，n_iFor clustering W_iThe number of data objects;for clustering W_jQ is 1, …, n_j，n_jFor clustering W_jThe number of data objects;andare respectively a cluster W_iAnd cluster W_jThe cluster center of (a);

4.5) obtaining the DBI index value of the Dywessonberg D of the iteration_KWith temporary variables F₁And intermediate variable F₂Making a comparison if DBI is not satisfied_K＞F₂And F₂＜F₁Let the cluster number K be K +1 and temporary variable F₁＝F₂Middle variable F₂＝DBI_KAnd returning to the step 4.2); otherwise, the optimal clustering number is K-1, and the corresponding clustering centerFinishing iteration for the optimal clustering center;

4.6) setting the number threshold₂Calculating the number n of data objects_iRatio lambda to the maximum number Q_i：λ_i＝n_iQ, where i ═ 1, …, K-1, Q ═ max (n)₁,n₂,…,n_K-1)，₂Is 0.2; the ratio lambda_iAnd a number threshold₂Making a comparison if_i＜₂Then delete cluster centerOtherwise it will beAs a mixing matrixOne column of (1), the number of cluster centers that are finally retainedI.e. an estimated value of the number N of source signals, a mixing matrixThe estimated value of the original mixed matrix A is obtained;

5) in the observation of signals x (t) and the mixing matrixOn the basis, the linear programming method is adopted to recover each path of source signal to obtain the estimation value of the source signalWhere T is 1,2, …, T₀，T₀The blind source separation under underdetermined conditions is completed for the length of the sampled data.

Compared with the prior art, the invention has the beneficial effects that:

(1) according to the method, the influence of noise and abnormal values is reduced by removing low-density data objects, the robustness is strong, and the estimation accuracy of the mixed matrix and the source signal is improved;

(2) according to the method, the high-density data objects are selected, and the initial clustering centers are selected by considering the distance factor, so that the influence of the selection of the initial values on the estimation performance is reduced;

(3) the method retains the advantage of high speed of the original K-means clustering algorithm, and reduces the calculation complexity;

(4) according to the invention, the Thevenin bauxid index is introduced, so that the mixing matrix can be estimated when the number of source signals is unknown.

Drawings

FIG. 1 is a schematic diagram of an existing two-step approach to underdetermined blind source separation;

FIG. 2 is a general flow diagram of the present invention for implementing underdetermined blind source separation;

FIG. 3 is a sub-flowchart of the present invention for performing mixed matrix estimation using an improved K-means clustering algorithm;

FIG. 4 shows five random signals used in simulation experiments of the present invention;

FIG. 5 is an observed signal obtained by linear instantaneous mixing of the five random signals of FIG. 4;

FIG. 6 is a comparison graph of computational complexity versus signal-to-noise ratio variation obtained by simulating the observation signal of FIG. 5 using the prior art method and the method of the present invention;

FIG. 7 is a comparison graph of the number accuracy of source signals as a function of the signal-to-noise ratio, obtained by simulating the observation signals in FIG. 5 using the prior art method and the method of the present invention;

FIG. 8 is a comparison graph of the estimated error of the mixing matrix obtained by simulating the observation signal of FIG. 5 by the prior art method and the method of the present invention as a function of the signal-to-noise ratio;

FIG. 9 is a five-way random signal separated from the observed signal of FIG. 5 by the method of the present invention;

FIG. 10 is a diagram of four voice signals used in a simulation experiment of the present invention;

FIG. 11 is a scattergram of observed signals obtained by linear temporal mixing of the four voice signals of FIG. 10;

FIG. 12 is a four-way speech signal separated from the observed signal of FIG. 11 by the method of the present invention.

Detailed Description

The accompanying drawings, which are included to provide a further understanding of the invention and are incorporated in and constitute a part of this application, illustrate embodiment(s) of the invention and together with the description serve to explain the invention without limiting the invention.

Referring to fig. 2, the implementation steps of the invention are as follows:

step 1, removing low energy points from the received observation signals and projecting the observation signals onto a unit semi-hypersphere.

(1.1) setting Low energy threshold₁Energy of observation signal x (t) received by receiving endWith low energy threshold₁Make a comparison ifDeleting x (t), otherwise projecting x (t) to the right half hypersphere of unit to obtain projection point

\hat{x} (t) = sign (x_{1} (t)) \times x (t) / {| | x (t) | |}_{2},

Where x (T) is the observed signal at the sampling time T, T is 1,2, …, T₀，T₀For sample data length, x₁(t) is the first component of the observed signal x (t), sign (·) represents the sign of the solved variable, | · calvert |)₂The expression is given in the 2-norm,₁the value range of (a) is 0.1-0.5 times of the average value of the energy of the observed signal, but₁Is not limited to this range, and in the embodiment of the present invention,₁is 0.2 times the average value of the observed signal energy.

(1.2) rearranging all proxels to a form with continuous and increasing subscripts:obtaining a set of projection points

\tilde{x} = {{\tilde{x}}_{i} | {\tilde{x}}_{i} &Element; R^{M}, i &Element; [1, n]},

Wherein M is the number of observation signals, and n is the projection point setThe number of the medium data objects, and n is less than or equal to T₀。

Step 2, calculating a projection point setOf arbitrary data objectsDensity parameter of

B_{r} ({\tilde{x}}_{i}) = Σu (r - d ({\tilde{x}}_{i}, {\tilde{x}}_{j})) i, j = 1, \cdot \cdot \cdot, n,

Wherein,for sets of projection pointsAny two data objects inAndthe euclidean distance between them, r being the neighborhood radius of the data object:

r = \frac{1}{n} \times Σ_{i = 1}^{n} \min (d ({\tilde{x}}_{i}, {\tilde{x}}_{j}) j = 1, \cdot \cdot \cdot, n, i &NotEqual; j,

the function u is specifically defined as:

u (x) = \{\begin{matrix} 1 & x &GreaterEqual; 0 \\ 0 & x < 0 \end{matrix},

wherein

x = r - d ({\tilde{x}}_{i}, {\tilde{x}}_{j}) .

Step 3, the projection point set obtained in the step 1And removing the low-density data objects and selecting the high-density data objects.

Setting a low density threshold η₁And a high density threshold η₂Set the projection pointsDensity parameter of data object in data objectAnd a low density threshold η₁And a high density threshold η₂Comparing, and deleting density parametersObtaining a set of remaining data objectsSaving density parametersResulting in a high density data object set D, wherein η₁And η₂Is determined according to the actually calculated density parameter.

And 4, estimating the mixing matrix by using an improved K-means clustering algorithm.

Referring to fig. 3, the specific implementation of this step is as follows:

(4.1) selecting the initial clustering number K as M and temporary variable F₁0.001, intermediate variable F₂＝0.002；

(4.2) selecting K initial clustering centers z from the high-density data object set D obtained in the step 3₁,z₂,…,z_KThe specific selection method comprises the following steps:

(4.2.1) selecting the data object with the maximum density parameter from the high-density set D as the first initial clustering center z₁Taking the distance z₁The farthest data object as the 2 nd initial cluster center z₂；

(4.2.2) for l 3,4, …, K, the remaining data objects in the high density set D are calculatedTo selected l-1 initial clustering centers z_jEuropean distance ofWherein j is 1,2, …, l-1, i ∈ [1, c]C is the number of data objects in the high-density set D; l-1 initial clustering centers z are selected_jThe data object with the largest minimum Euclidean distance is taken as the ith initial clustering center z_l：

z_{l} = \underset{{\tilde{x}}_{i} &NotEqual; z_{j}}{\underset{{\tilde{x}}_{i} &Element; D}{\arg \max}} {\min_{j = 1,2, . . ., l - 1} d ({\tilde{x}}_{i}, z_{j})} i = 1,2, \cdot \cdot \cdot, c,

Then z is₁,z₂,…,z_KSelecting K initial clustering centers;

(4.3) adopting an original K-means clustering algorithm to collect the residual data objects obtained in the step (3)The data objects in the cluster are divided into K classes to obtain K clusters W₁…W_i…W_KThe cluster centers corresponding to the clusters are respectivelyEach polyThe number of the data objects in the class is n₁…n_i…n_KWherein i ∈ [1, K]；

(4.4) Using the cluster W obtained in step 4.3)₁…W_i…W_KAnd a cluster centerCalculating DBI (Devison bauxid index value)_K：

{DBI}_{K} = \frac{1}{K} Σ_{i = 1}^{K} \max_{j} \frac{\frac{1}{n_{i}} Σ_{p = 1}^{n_{i}} {| | {\tilde{x}}_{p} - {\tilde{z}}_{i} | |}_{2} + \frac{1}{n_{j}} Σ_{q = 1}^{n_{j}} {| | {\tilde{x}}_{q} - {\tilde{z}}_{j} | |}_{2}}{{| | {\tilde{z}}_{i} - {\tilde{z}}_{j} | |}_{2}} j = 1, \cdot \cdot \cdot, K, j &NotEqual; i,

Wherein,for clustering W_iP 1, …, n_i，n_iFor clustering W_iThe number of data objects;for clustering W_j… when q is 1,n_j，n_jFor clustering W_jThe number of data objects;andare respectively a cluster W_iAnd cluster W_jThe cluster center of (a);

4.6) setting the number threshold₂Calculating the number n of data objects_iRatio lambda to the maximum number Q_i：λ_i＝n_iQ, where i ═ 1, …, K-1, Q ═ max (n)₁,n₂,…,n_K-1)，₂Is 0.2; the ratio lambda_iAnd a number threshold₂Making a comparison if_i＜₂Then delete cluster centerOtherwise it will beAs a mixing matrixOne column of (1), the number of cluster centers that are finally retainedI.e. an estimated value of the number N of source signals, a mixing matrixI.e. an estimate of the original mixing matrix a.

And 5, estimating the source signal by adopting a linear programming method.

(5.1) sequentially increasing the number of columns according to the principle that the column labels are different from each otherM column vectors are selected to constructM × M dimension submatrixWhereini∈[1,L]；

(5.2) at all sampling instants t, based on the observed signals x (t) and the full submatrixCalculating source signal estimates

{\tilde{s}}_{i} (t) = {\tilde{A}}_{i}^{- 1} x (t) i = 1,2, \cdot \cdot \cdot, L, t = 1,2, \cdot \cdot \cdot, T_{0},

Wherein,is a sub-matrixX (T) is the observed signal at the sampling instant T, T₀Is the length of the sampled data;

(5.3) calculating source signal estimation valuesModulus value of (a)_i(t) obtaining an optimal estimate of the source signal at the sampling time t

\tilde{s} (t) = \underset{{\tilde{s}}_{i} (t)}{\arg \min} {a_{i} (t)}, i = 1,2, \cdot \cdot \cdot, L, t = 1,2, \cdot \cdot \cdot, T_{0},

Wherein

a_{i} (t) = {| | {\tilde{s}}_{i} (t) | |}_{1},

||·||₁Representing a 1 norm.

The advantages of the present invention can be further illustrated by the following two simulation experiments:

simulation experiment 1:

1.1) simulation conditions

Setting the number of source signals N to 5, the number of observation signals M to 3, and the length of sampling data T₀3000; five random signals S were generated by computer simulation:

S＝-log(rand(N,T₀)).*max(0,sign(rand(N,T₀)-q))，

the value range of the parameter q is [0,1], the larger the value of the parameter q is, the higher the sparsity of the source signal is, and the smaller the parameter q is otherwise; selecting q as 0.9 in the simulation;

randomly selecting a mixing matrix A and normalizing the column vectors thereof:

A = [\begin{matrix} 0.5525 & - 0.1904 & 0.5707 & 0.7240 & - 0.0759 \\ 0.6863 & 0.5148 & - 0.5166 & 0.4088 & - 0.8293 \\ 0.4730 & 0.8459 & 0.6383 & - 0.5556 & 0.5536 \end{matrix}],

setting a low energy thresholdWhereinSetting a low density threshold η for the average value of the observed signal energy₁＝0.2*B_maxHigh density threshold η₂＝0.33*B_maxWhereinB_maxis the maximum value of the density parameter; setting a number threshold₂0.2; gaussian white noise is added into the observation signal, and the range of the signal to noise ratio is set to be 10-20 dB.

Estimated mixing matrixThe difference of the sequence and the sign exists between the original mixing matrix A and the original mixing matrix A, and the estimation error of the mixing matrix is definedWherein P is a permutation matrix, | · |. non-woven phosphor_FIs the F-norm of the matrix.

In order to verify the effectiveness of the algorithm, only the source signal is changed in the experimental process, other simulation parameters are kept unchanged, and the experiment is repeated 500 times under each signal-to-noise ratio.

1.2) simulation method

Existing robust competitive clustering method, existing Laplace potential function method and method of the invention

1.3) simulation content

Simulation 1

Under the simulation conditions, five paths of random signals generated by simulation are shown in fig. 4, an observation signal received by a receiving end is shown in fig. 5, a mixed matrix is estimated according to the observation signal in fig. 5 by using the existing robust competitive clustering method, the laplacian potential function method and the method of the invention, the computational complexity of the three methods is compared, the computational complexity is represented by using running time in the simulation experiment of the invention, and the result is shown in fig. 6.

In fig. 6, a curve 61 is a variation of the computational complexity with the signal-to-noise ratio obtained by the robust competitive clustering method; the curve 62 is the variation of the computation complexity with the signal-to-noise ratio obtained by simulation using the laplace potential function method; curve 63 is the case of the computational complexity as a function of the signal to noise ratio, obtained by simulation using the method of the present invention.

Comparing the curves 61, 62 and 63, it can be seen that the method of the present invention has the shortest operation time and the lowest calculation complexity;

comparing the curves 61 and 63, it can be seen that the running time of the method of the present invention is about 10% -50% of the robust competitive clustering method;

comparing the curves 62 and 63, it can be seen that the running time of the method of the present invention is only about 1% -3% of that of the Laplace potential function method; the method of the invention runs time and sample data length T₀Approximately proportional, Laplace potential function method running time and sampling data length T₀ ²In direct proportion, when T₀When the operation time is increased, the operation time of the Laplace potential function method is longer, and the complexity is higher.

Simulation 2

Under the simulation conditions, the observation signals in the fig. 5 are simulated by using the existing robust competitive clustering method, the laplace potential function method and the method of the invention, the number of source signals of the three methods is compared to estimate the correct rate, and the result is shown in fig. 7.

In fig. 7, a curve 71 shows the variation of the estimated accuracy of the number of source signals obtained by the robust competitive clustering method along with the signal-to-noise ratio; curve 72 is the variation of the estimated accuracy of the number of source signals with the signal-to-noise ratio, which is obtained by simulation using the laplace potential function method; curve 73 shows the variation of the estimated accuracy of the number of source signals with the signal-to-noise ratio, which is obtained by the simulation according to the present invention.

Comparing the curves 71 and 73, it can be seen that the source signal number estimation accuracy of the method of the present invention is much higher than that of the robust competitive clustering method;

comparing the curves 72 and 73, the source signal number estimation accuracy of the method is rapidly improved between 10 dB and 13 dB; when the signal-to-noise ratio is 13dB, the accuracy of the method reaches 96.6%, and only about 3% of loss is caused compared with a Laplace potential function method; when the signal-to-noise ratio is larger than 13dB, the source signal number accuracy of the method is the same as that of the Laplace potential function method, and the source signal number can be almost completely and correctly estimated.

Simulation 3

Under the simulation conditions, the existing robust competitive clustering method, the laplace potential function method and the method of the invention are used for estimating the mixing matrix according to the observation signals in the figure 5, the error of the mixed matrix estimation of the three methods is compared, and the result is shown in figure 8.

In fig. 8, a curve 81 is a variation of the estimated error of the mixing matrix obtained by the robust competitive clustering method along with the signal-to-noise ratio; the curve 82 is the variation of the hybrid matrix estimation error with the signal-to-noise ratio, which is obtained by simulation using the laplace potential function method; curve 83 is the variation of the hybrid matrix estimation error with the signal to noise ratio, which is obtained by the simulation of the method of the present invention.

Compared with curves 81, 82 and 83, the hybrid matrix estimation error of the method is rapidly reduced between 10 dB and 13dB, and when the signal-to-noise ratio is higher than 13dB, the hybrid estimation error of the method is minimum and the accuracy is highest; when the signal-to-noise ratio is 15dB, the mixed matrix estimation error is minimum, the Laplace potential function method is inferior, and the robust competitive clustering method is maximum.

Simulation 4

Under the simulation conditions, in order to explain the influence of the mixed matrix estimation precision on the source signal recovery effect, the linear programming method used in the method is adopted to recover the source signal on the basis of the existing robust competitive clustering method, the Laplace potential function method and the mixed matrix estimated by the method. The recovery effect of each path of Signal is measured by Signal-to-interference ratio (SIR), which is calculated as follows:

{SIR}_{i} = {10 \log}_{10} (\frac{Σ_{t = 1}^{T_{0}} {| s_{i} (t) |}^{2}}{Σ_{t = 1}^{T_{0}} {| s_{i} (t) - {\tilde{s}}_{i} (t) |}^{2}}) i = 1,2, \cdot \cdot \cdot, N,

wherein s is_i(t) is the ith original signal,obtaining an estimated value of the ith original signal obtained by blind separation; t is₀The length of the sampled data is N, and the number of the source signals is N; to avoid the effect of amplitude uncertainty, s is calculated before SIR calculation_i(t) andare subjected to normalization processing.

The simulation results are shown in fig. 9 and table 1, where:

at a signal-to-noise ratio of 15dB, the five random signals recovered from the observed signal in fig. 5 by the method of the present invention are shown in fig. 9, and a comparison of fig. 4 and 9 shows that the separation effect is satisfactory.

Table 1 shows the signal-to-interference ratios SIR of the signals of the respective channels, which are recovered by the linear programming method on the basis of the existing robust competitive clustering method, the laplace potential function method and the hybrid matrix estimated by the method of the present invention when the signal-to-noise ratio is 15 dB:

TABLE 1

It can be seen from table 1 that the signal-to-interference ratio of each source signal recovered by the method of the present invention is the highest, and the separation effect is the best. Therefore, the estimation of the mixing matrix is important, and the estimation accuracy directly affects the recovery effect of the source signal.

Simulation experiment 2

2.1) simulation conditions

Setting the number of source signals N to 4, the number of observation signals M to 3, and the length of sampling data T₀50000; the source signal is four actual voice signals, as shown in fig. 10; randomly selecting a mixing matrix A and normalizing the column vectors thereof:

A = [\begin{matrix} 0.4787 & - 0.5115 & 0.9863 & 0.1595 \\ 0.4569 & 0.4144 & - 0.1204 & 0.8399 \\ 0.7498 & 0.7528 & - 0.1124 & - 0.5188 \end{matrix}],

transforming each obtained observation signal into a time-frequency domain through short-time Fourier transform (STFT), and setting a transformation length N_fft2048, adding a rectangular window, wherein the window length is the same as the conversion length, and the moving step length is 512; setting a low energy thresholdWherein,setting a low density threshold η for the average value of the observed signal energy₁＝0.01*B_maxHigh density threshold η₂＝0.25*B_maxWherein B is_maxIs the maximum value of the density parameter; setting a number threshold₂＝0.2。

2.2) simulation content

Simulation 1

Under the above simulation conditions, the observation signal received by the receiving end is as shown in fig. 11, where fig. 11(a) is an observation signal time domain scattergram, and fig. 11(b) is an observation signal time-frequency domain real part scattergram. The observation signal in FIG. 11(b) is simulated by the method of the present invention to obtain the estimated value of the original mixing matrix A

\tilde{A} = [\begin{matrix} - 0.9862 & 0.4781 & - 0.1594 & - 0.5114 \\ 0.1206 & 0.4585 & - 0.8401 & 0.4131 \\ 0.1136 & 0.7491 & 0.5184 & 0.7535 \end{matrix}],

The mixing matrixThe estimation error of (2) is: error is 0.0027;

contrast mixing matrixAnd the original mixing matrix A, it can be seen that the method of the invention can estimate the mixing matrix when the number of the source signals is unknown, and the estimation accuracy is very high and is close to the true value.

Simulation 2

Under the above conditions, according to the mixing matrixAnd observing the signal and adopting the method of the invention to recover each path of voice signal, the result is shown in fig. 12 and table 2, wherein:

the four paths of voice signals separated from the observation signal in fig. 11 by the method of the present invention are shown in fig. 12, and comparing fig. 10 and 12, it can be seen that the waveforms of the recovered paths of signals are approximately the same as the waveforms of the original signals, and the separation effect is satisfactory.

The signal-to-interference ratio SIR of each signal is shown in table 2.

TABLE 2

It can be seen from table 2 that the error of the hybrid matrix estimated by the method of the present invention is very small, and the signal-to-interference ratio of each path of signals recovered on the basis of the hybrid matrix is high, and the separation effect is good.

The foregoing is only a preferred embodiment of the present invention, and it should be noted that, for those skilled in the art, various modifications and decorations can be made without departing from the principle of the present invention, and these modifications and decorations should also be regarded as the protection scope of the present invention.

Claims

1. An underdetermined blind source separation method based on density is characterized by comprising the following steps:

\hat{x} (t) = s i g n (x_{1} (t)) \times x (t) / | | x (t) | |_{2},

B_{r} ({\tilde{x}}_{i}) = Σ u (r - d ({\tilde{x}}_{i}, {\tilde{x}}_{j})), i, j = 1, ..., n,

Wherein M is the number of observation signals, and n is the projection point setThe number of the medium data objects, and n is less than or equal to T₀，For sets of projection pointsAny two ofData objectAndthe euclidean distance between them, r being the neighborhood radius of the data object:

r = \frac{1}{n} \times Σ_{i = 1}^{n} m i n (d ({\tilde{x}}_{i}, {\tilde{x}}_{j})), j = 1, ..., n, i &NotEqual; j

the function u is specifically defined as:

u (x) = \{\begin{matrix} 1 & x &GreaterEqual; 0 \\ 0 & x < 0 \end{matrix},

wherein

x = r - d ({\tilde{x}}_{i}, {\tilde{x}}_{j});

3) Setting a low density threshold η₁And a high density threshold η₂Set the projection pointsDensity parameter of data object in data objectAnd a low density threshold η₁And a high density threshold η₂Comparing, and deleting density parametersTo obtain a set of residual projection pointsSaving density parametersResulting in a high density data object set D, wherein η₁The value of (A) is in the range of 0.01-0.2 times of the maximum value of the density parameter, η₂The value range of (a) is 0.25-0.45 of the maximum value of the density parameterDoubling;

4.3) using the original K-means clustering algorithm to collect the residual projection pointsThe data objects in the cluster are divided into K classes to obtain K clusters W₁…W_i…W_KThe cluster centers corresponding to the clusters are respectivelyThe number of the data objects in each cluster is n₁…n_i…n_KWherein i ∈ [1, K]；

{DBI}_{K} = \frac{1}{K} Σ_{i = 1}^{K} \underset{j}{m a x} \frac{\frac{1}{n_{i}} Σ_{p = 1}^{n_{i}} | | {\tilde{x}}_{p} - {\tilde{z}}_{i} | |_{2} + \frac{1}{n_{j}} Σ_{q = 1}^{n_{j}} | | {\tilde{x}}_{q} - {\tilde{z}}_{j} | |_{2}}{| | {\tilde{z}}_{i} - {\tilde{z}}_{j} | |_{2}}, j = 1, ..., K, j &NotEqual; i,

4.5) obtaining the DBI index value of the Dywessonberg D of the iteration_KWith temporary variables F₁And intermediate variable F₂Making a comparison if DBI is not satisfied_K>F₂And F₂<F₁Let the cluster number K be K +1 and temporary variable F₁＝F₂Middle variable F₂＝DBI_KAnd returning to the step 4.2); otherwise, the optimal clustering number is K-1, and the corresponding clustering centerFinishing iteration for the optimal clustering center;

4.6) setting the number threshold₂Calculating the number n of data objects_iRatio lambda to the maximum number Q_i：λ_i＝n_iQ, where i ═ 1, …, K-1, Q ═ max (n)₁,n₂,…,n_K-1)，₂Is 0.2; the ratio lambda_iAnd a number threshold₂Making a comparison if_i<₂Then delete cluster centerOtherwise it will beAs a mixing matrixOne column of (1), the number of cluster centers that are finally retainedI.e. an estimated value of the number N of source signals, a mixing matrixThe estimated value of the original mixed matrix A is obtained;

2. The method according to claim 1, wherein the step 4.2) of selecting K initial cluster centers is performed according to the following steps:

4.2.1) selecting the data object with the maximum density parameter from the high-density set D as the first initial clustering center z₁Taking the distance z₁The farthest data object as the 2 nd initial cluster center z₂；

4.2.2) for l 3,4, …, K, the remaining data objects within the high density set D are calculatedTo selected l-1 initial clustering centers z_jEuropean distance ofWherein j is 1,2, …, l-1, i ∈ [1, c]C is the number of data objects in the high-density set D; selecting the data object with the maximum minimum Euclidean distance from the l-1 initial clustering centers as the l-th initial clustering center z_l：

z_{l} = \underset{{\tilde{x}}_{i} &NotEqual; z_{j}}{\underset{{\tilde{x}}_{i} &Element; D}{\arg \max}} {\underset{j = 1, 2, ..., l - 1}{m i n} d ({\tilde{x}}_{i}, z_{j})}, i = 1, 2, ..., c,

Then z is₁,z₂,…,z_KSelecting K initial clustering centers, wherein K is the number of the initial clustering centers.

3. The method of claim 1, wherein step 5) is based on observing signals x (t) and a mixing matrixBased on the received signal, the linear programming method is adopted to recover the source signalThe method comprises the following steps:

5.1) sequentially increasing the number of columns according to the principle that the column marks are different from each otherM column vectors are selected to constructDimension sub-matrixWhereini∈[1,L]；

5.2) at all sampling instants t, based on the observed signals x (t) and the overall sub-matrixCalculating source signal estimates

{\tilde{s}}_{i} (t) = {\tilde{A}}_{i}^{- 1} x (t), i = 1, 2, ..., L, t = 1, 2, ..., T_{0},

5.3) calculating the source signal estimation valueModulus value of (a)_i(t) obtaining the source pair at the sampling time tOptimal estimation of a signal

\tilde{s} (t) = \underset{{\tilde{s}}_{i} (t)}{\arg \min} {a_{i} (t)}, i = 1, 2, ..., L, t = 1, 2, ..., T_{0},

Wherein||·||₁Representing a 1 norm.