CN105550716A

CN105550716A - Underdetermined blind source separation method applying multiple constraints

Info

Publication number: CN105550716A
Application number: CN201610046103.0A
Authority: CN
Inventors: 王敏; 王艳芳
Original assignee: Jiangsu University of Science and Technology
Current assignee: Jiangsu University of Science and Technology
Priority date: 2016-01-22
Filing date: 2016-01-22
Publication date: 2016-05-04

Abstract

The invention discloses an underdetermined blind source separation method applying multiple constraints, which improves on traditional blind source separation methods to achieve a better signal separation effect. To separate mixed signals, the method first performs mean removal (centering) and whitening to improve the robustness of the initial conditions, then imposes multiple constraints on a non-negative matrix factorization algorithm and optimizes the objective function, and finally further improves separation performance through a feedback mechanism. The method therefore has the advantages of good factor interpretability and high purity of the separated signals.

Description

Underdetermined blind source separation method applying multiple constraints

Technical Field

The invention relates to an underdetermined blind source separation method applying multiple constraints, and belongs to the technical field of signal processing.

Background

BSS (blind source separation) is a technique for recovering a source signal only from a signal received by a sensor when the source signal and a mixing method are unknown. The blind source separation when the number of the sensors is less than that of the source signals is called underdetermined blind source separation, is a hot research problem in the field of blind signal processing in recent years, and has wide application prospects in the aspects of voice signal processing, wireless communication, digital image processing and the like.

Among conventional blind signal separation methods, the independent component analysis (ICA) algorithm was developed from principal component analysis (PCA); it applies only to overdetermined or determined mixing models and requires certain assumptions and constraints. Non-negative matrix factorization (NMF), first proposed by Lee and Seung, is an effective method for the underdetermined mixing model. One of the most useful properties of NMF is that the decomposition yields a low-rank matrix that expresses the significant data with a small number of elements, which facilitates physical interpretation and does not require statistical independence between signals. A blind source separation method combines an objective function with an optimization algorithm, i.e. the source signal is determined by optimizing the objective function. If constraints are added to NMF and the objective function is optimized, the decomposition quality is better than that of the original NMF.

At present, most research solves the underdetermined blind separation problem by exploiting signal sparsity through sparse component analysis. M. Zibulevsky et al. proposed a two-step method that uses sparse component analysis to estimate the mixing matrix and then the source signals, so the quality of the mixing matrix estimate directly affects the subsequent separation. Based on sparse representation, Bofill estimated the mixing matrix and the source signals with a clustering method and a shortest-path method respectively, successfully separating six source signals from two observed mixtures, but the approach places strong requirements on signal sparsity. To address the inability of the basic NMF algorithm to solve blind source separation under underdetermined conditions, Cichocki et al. proposed a multilayer NMF algorithm in which the signals decomposed at each layer are extremely sparse, but the algorithm is relatively complex to implement. Huang used a non-negative matrix factorization algorithm based on the Itakura-Saito divergence (IS-NMF) to study the separation of single-channel music signals. However, the sparsity of signals after time-frequency transformation is still not ideal, and a large estimation error in the mixing matrix degrades the performance of the whole algorithm. Reducing the sparsity requirement on the source signals and the algorithm complexity while improving separation precision is therefore a research direction of both theoretical significance and economic value.

Disclosure of Invention

The purpose of the invention is as follows: aiming at the problems and the defects in the prior art, the invention provides an underdetermined blind source separation method applying multiple constraints.

The technical scheme is as follows: an underdetermined blind source separation method applying multiple constraints comprises the following steps:

step 1, sampling and quantizing a source signal to be separated to obtain an initial observation signal, and preprocessing the initial observation signal by mean removal (centering) and whitening to obtain an initial mixing matrix;

step 2, first applying sparsity constraint, minimum correlation constraint and determinant constraint processing to the initial mixing matrix obtained in step 1, and then optimizing an objective function based on the reconstruction error to obtain an estimate of the mixing matrix and an initial separated signal;

step 3, performing feedback processing on the initial separated signal to obtain a final separated signal.

The step 1 specifically comprises:

step 1-1, sampling and quantizing an original observation signal to obtain a discrete signal S(k); the source signal S(k) = [s₁(k), s₂(k), …, s_M(k)]^T passes through a nonsingular mixing matrix A to give the initial observation signal X(k) = [x₁(k), x₂(k), …, x_M(k)]^T: X(k) = AS(k); wherein s_M(k) is the Mth component of the source signal S(k), x_M(k) is the Mth component of the observed signal X(k), k is the time index, the superscript T represents the conjugate transpose, M is a positive integer, and A is an M × M matrix;

step 1-2, sending the observation signal X(k) obtained in step 1-1 into a preprocessing filter for short-time Fourier transform (STFT) processing to obtain a frequency spectrum matrix, then computing the power spectrum of the frequency spectrum matrix, and outputting a power spectrum signal matrix Z(k). Specifically: the observation signal X(k) is first centered, i.e. X̃(k) = X(k) − E[X(k)], where E represents the mathematical expectation; the centered result X̃(k) is then whitened to obtain the output power spectrum signal matrix Z(k) of the preprocessing filter: Z(k) = VX̃(k), where V is the whitening matrix, which makes the initialization conditions more robust.
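The centering and whitening in step 1-2 can be illustrated with a minimal NumPy sketch. It assumes an eigendecomposition-based whitening matrix V = D^(-1/2) E^T, which is one common choice and is not stated in the text; the STFT and power-spectrum stage is left out, and the function name `center_and_whiten` and the test data are hypothetical.

```python
import numpy as np

def center_and_whiten(X):
    """Center the rows of X (channels x samples), then whiten them.

    Returns Z = V @ (X - E[X]) and the whitening matrix V.
    """
    X = np.asarray(X, dtype=float)
    X_centered = X - X.mean(axis=1, keepdims=True)      # X~ = X - E[X]
    cov = X_centered @ X_centered.T / X_centered.shape[1]
    eigvals, eigvecs = np.linalg.eigh(cov)               # cov = E diag(d) E^T
    # Assumed whitening matrix V = D^(-1/2) E^T; the floor avoids division by ~0
    V = np.diag(1.0 / np.sqrt(np.maximum(eigvals, 1e-12))) @ eigvecs.T
    return V @ X_centered, V

# Hypothetical two-channel mixture used only to exercise the function
rng = np.random.default_rng(0)
X = rng.normal(size=(2, 2)) @ rng.laplace(size=(2, 5000))
Z, V = center_and_whiten(X)
cov_Z = Z @ Z.T / Z.shape[1]   # close to the identity after whitening
```

After whitening, the channels of Z have zero mean and an identity sample covariance, which is the sense in which the initial conditions become better conditioned.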

The step 2 specifically comprises the following steps:

the basic idea of NMF is to decompose, under non-negativity constraints, the matrix V = [v₁, v₂, …, v_N] ∈ R^(m×N) into two matrices W = [w₁, w₂, …, w_n] ∈ R^(m×n) and H = [h₁, h₂, …, h_N] ∈ R^(n×N) such that the decomposition satisfies V ≈ WH as closely as possible, wherein the elements of W and H are non-negative. The mathematical model is represented as:

V = WH

in general, the matrix dimensions described above need to satisfy the relation (m + N)n < mN.

Step 2, firstly, performing sparse constraint, minimum correlation constraint and determinant constraint processing on the initial mixing matrix, specifically as follows:

step 2-1, when non-negative matrix factorization is used directly to solve underdetermined blind signal separation, the decomposition result is not unique and the source signals cannot be separated correctly. To achieve good separation performance and a unique result under underdetermined conditions, triple constraints are applied in the non-negative matrix factorization process and, combined with the objective function of minimum reconstruction error, the optimization function F(V||WH) is selected as follows:

F(V \| WH) = \alpha_D D(V \| WH) + \alpha_\varphi \mathrm{vol}(\varphi(W)) + \alpha_J J(H) + \alpha_R C(H)

wherein,

W_{ik}, H_{kj} > 0, \quad \sum_i W_{ik} = 1, \quad 1 \le k \le n, \; 1 \le i \le m, \; 1 \le j \le N

D(V \| WH) = \sum_{i,j} \left( V_{ij} \log \frac{V_{ij}}{(WH)_{ij}} - V_{ij} + (WH)_{ij} \right)

in the formula

D(V||WH) is a distance function between V and WH, namely the reconstruction error;

vol(φ(W)) is the determinant criterion constraint on the mixing matrix W and represents the volume of φ(W): when W is a square matrix, vol(φ(W)) = |det(W)|; when W is not a square matrix, vol(φ(W)) = |det(WW^T)|;

J(H) is the sparsity constraint on the separated signal H, taken as the l¹ norm, J(H) = Σ_{i,j} H_ij;

C(H) is the minimum correlation constraint on the separated signal H;

α_D, α_φ, α_J and α_R are constraint parameters chosen to guarantee algorithm convergence.

The NMF problem can then be converted into an optimization problem, namely

\min_{W, H} D(V \| WH)

\text{s.t.} \; W, H \ge 0

the divergence D(V||WH) is non-increasing under the following iterations:

H_{a\mu} \leftarrow H_{a\mu} \frac{\sum_i W_{ia} V_{i\mu} / (WH)_{i\mu}}{\sum_k W_{ka}}, \qquad W_{ia} \leftarrow W_{ia} \frac{\sum_\mu H_{a\mu} V_{i\mu} / (WH)_{i\mu}}{\sum_\nu H_{a\nu}}

the divergence no longer changes if and only if W and H are at a stationary point.
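The multiplicative iterations above can be sketched in NumPy as follows. This is an illustrative sketch of the unconstrained Kullback-Leibler updates only, not the full constrained method; the random initialization, the iteration count and the small constant `eps` guarding divisions are assumptions, and the function names are hypothetical.

```python
import numpy as np

def kl_divergence(V, WH, eps=1e-12):
    """D(V||WH) = sum_ij (V log(V / WH) - V + WH); eps guards log and division."""
    return float(np.sum(V * np.log((V + eps) / (WH + eps)) - V + WH))

def kl_nmf(V, n, iters=200, seed=0, eps=1e-12):
    """Alternate the multiplicative updates for H and W; return W, H and the divergence history."""
    rng = np.random.default_rng(seed)
    m, N = V.shape
    W = rng.random((m, n)) + 0.1          # strictly positive random start (assumption)
    H = rng.random((n, N)) + 0.1
    history = []
    for _ in range(iters):
        WH = W @ H + eps
        # H_au <- H_au * sum_i W_ia V_iu / (WH)_iu / sum_k W_ka
        H *= (W.T @ (V / WH)) / (W.sum(axis=0)[:, None] + eps)
        WH = W @ H + eps
        # W_ia <- W_ia * sum_u H_au V_iu / (WH)_iu / sum_v H_av
        W *= ((V / WH) @ H.T) / (H.sum(axis=1)[None, :] + eps)
        history.append(kl_divergence(V, W @ H))
    return W, H, history

# Hypothetical non-negative data matrix
V = np.random.default_rng(1).random((6, 40)) + 0.1
W, H, history = kl_nmf(V, n=3)
```

Because both updates multiply by non-negative ratios, W and H stay non-negative without any explicit projection, and the recorded divergence history does not increase from one iteration to the next.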

We first derive the iterative formula for H_aμ: for the objective function D(V||WH), descent is performed along the negative gradient direction with an undetermined learning rate η_aμ. By selecting a suitable η_aμ, an iterative formula for H can be obtained, and since W and H are symmetric, an iterative formula for W can be found similarly.

Step 2-2, the objective function based on the reconstruction error in step 2 is optimized, specifically as follows:

non-negative matrix factorization with a sparsity constraint adds constraint conditions to the original objective function in order to obtain a decomposition that is as sparse as possible. On the basis of step 2-1 (W and H here refer to the algebraic expressions given by the iterative formulas for W and H obtained in step 2-1), a 1-norm constraint is added to the non-negative vector h, i.e. ||h||₁ is minimized; in matrix form, Σ_{i,j} H_ij should be minimized so that the matrix H is sufficiently sparse, and a new iteration rule is obtained.

The iteration rule can be proved convergent using the relevant mathematical definitions and lemmas: when W and H are solved with this rule, the objective function G(V||WH) is non-increasing, and it stops changing if and only if W and H are at a stationary point of the objective function.

When iteration under the new rules for W and H brings F(V||WH) to no more than a very small threshold, the algorithm is judged to have converged and the correct W and H are obtained.

Provided that the signals and the mixing matrix are non-negative, the observation signal matrix X is decomposed according to X = WH, wherein W is the estimate of the mixing matrix A and H is the estimate of the source signal matrix S, thereby realizing blind source separation of the mixed signals.

The steps of the NMF underdetermined blind source separation algorithm proposed above are as follows:

a. randomly selecting a group of W and H obtained in the step 1 for initialization;

b. optimizing an objective function based on the reconstruction error;

c. updating the learning process: a small threshold is set and iteration proceeds according to the update rule; updating stops when the optimization function F(V||WH) is smaller than the threshold. Inverse STFT processing is then performed on W and H to obtain a unique mixing matrix W and an estimation matrix H of the source signal.

The step 3 specifically comprises:

in the signal separation process, the idea of the feedback mechanism is to remove the most thoroughly separated signal, identified from the estimated source signals, so as to form a new mixed signal for further separation; the separation process is repeated until all signals are separated. Because the source signals are unknown, the purity of an estimated signal is judged as follows: for each source signal component estimated by the feedback system, the correlation coefficients between that component and all the mixed signals are summed, and the component with the minimum sum is taken as the purest signal, i.e. the most thoroughly separated signal. The correlation coefficient between two signals h₁ and h₂ is defined as follows:

r_{h_1 h_2} = \frac{\sum_{i=1}^{N} (h_{1i} - \bar{h}_1)(h_{2i} - \bar{h}_2)}{\sqrt{\sum_{i=1}^{N} (h_{1i} - \bar{h}_1)^2} \sqrt{\sum_{i=1}^{N} (h_{2i} - \bar{h}_2)^2}}

wherein h_1i and h_2i are the ith values of the signals h₁ and h₂ respectively, and h̄₁ and h̄₂ are the corresponding average values of h₁ and h₂. If r_{h₁h₂} = 0, the two signals are uncorrelated. The feedback mechanism makes its decision by comparing the absolute values of the correlation coefficients of the signals.
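As a quick check of the correlation coefficient defined above, the following NumPy sketch evaluates it directly from the formula. The function name `correlation_coefficient` and the sample vectors are hypothetical.

```python
import numpy as np

def correlation_coefficient(h1, h2):
    """Sample correlation coefficient r between two equal-length signals."""
    h1 = np.asarray(h1, dtype=float)
    h2 = np.asarray(h2, dtype=float)
    d1 = h1 - h1.mean()                      # h_1i minus the mean of h1
    d2 = h2 - h2.mean()                      # h_2i minus the mean of h2
    return float(np.sum(d1 * d2) / (np.sqrt(np.sum(d1 ** 2)) * np.sqrt(np.sum(d2 ** 2))))

t = np.arange(8)
r_same = correlation_coefficient(t, 2 * t + 1)                       # linearly related
r_opposite = correlation_coefficient(t, -t)                          # inversely related
r_uncorrelated = correlation_coefficient([1, -1, 1, -1], [1, 1, -1, -1])
```

The three cases give a coefficient of 1, -1 and 0 respectively, which is why the feedback mechanism compares absolute values: strong negative correlation indicates dependence just as strong positive correlation does.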

As described above, multiple constraints are imposed on the signal mixing matrix to optimize the reconstruction error objective function and estimate the mixing matrix, thereby enhancing the interpretability and physical significance of the decomposition factors.

The method comprises the following specific steps:

step 3-1, updating according to the new iteration rule for W and H obtained in step 2 to obtain W = [w₁, w₂, …, w_n] and H = [h₁, h₂, …, h_n]^T, and letting p = n;

step 3-2, judging the number of source signals: if the number p of source signals is less than 2, the final mixing matrix W and the final source signal H are obtained and the algorithm ends; otherwise, going to the next step and continuing to separate signals;

step 3-3, selecting the source signal with the best separation effect: for each separated source signal h_i (i = 1, 2, …, p), computing the sum of the absolute values of its correlation coefficients with the original observed signal V, i.e. the sum Σ_j |r_{h_i v_j}|, wherein v_j is the jth component of the observed signal V; the h_t corresponding to the minimum value, i.e. min_i Σ_j |r_{h_i v_j}|, is the signal with the best separation effect;

step 3-4, forming a new mixed signal: removing the source signal h_t from the mixed signal used in this separation to form a new mixed signal V', i.e. V' = V − w_t h_t, and letting p = p − 1;

step 3-5, according to the new mixed signal V', updating in turn by the iteration rule formula of step 3-1 to obtain W = [w₁, w₂, …, w_n] and H = [h₁, h₂, …, h_n]^T; and returning to step 3-2.
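Steps 3-3 and 3-4 above can be sketched in NumPy as follows. The sketch covers only the selection of the purest component and the formation of the new mixture; the re-factorization of step 3-5 is omitted. The function names are hypothetical, and clipping V' at zero is an added assumption to keep the new mixture non-negative.

```python
import numpy as np

def purest_component(H, V):
    """Step 3-3: index t of the row h_t of H with the smallest sum_j |r(h_t, v_j)|."""
    p = H.shape[0]
    # corrcoef treats each row as one signal; keep the p x m block r(h_i, v_j)
    R = np.corrcoef(np.vstack([H, V]))[:p, p:]
    return int(np.argmin(np.abs(R).sum(axis=1)))

def deflate(V, W, H, t):
    """Step 3-4: V' = V - w_t h_t, clipped at zero (the clipping is an assumption)."""
    return np.maximum(V - np.outer(W[:, t], H[t]), 0.0)

# Hypothetical factors and mixture used only to exercise the two steps
rng = np.random.default_rng(0)
W = rng.random((2, 3)) + 0.1
H = rng.random((3, 200)) + 0.1
V = W @ H
t = purest_component(H, V)
V_new = deflate(V, W, H, t)     # mixture of the remaining p - 1 components
```

In a full loop, `V_new` would be factorized again with p − 1 components and the two functions applied repeatedly until fewer than two sources remain.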

As described above, a feedback mechanism is applied to the initial separation result so that the signals are separated one by one, which improves the separation accuracy of the blind source signals.

Advantageous effects: compared with the prior art, the invention has the following advantages:

1. The observation signals are preprocessed by mean removal and whitening, which enhances the robustness of the initial conditions and lays a foundation for improving the separation precision in subsequent steps.

2. Multiple constraints are imposed on the signal mixing matrix to optimize the minimum reconstruction error objective function based on the Kullback-Leibler divergence and to estimate the mixing matrix, which ensures the uniqueness of the blind source separation result and enhances the sparsity, interpretability and physical significance of the decomposition factors.

3. A feedback mechanism is applied to the initial separation result so that the signals are separated one by one, which improves the separation precision of the blind source signals.

Drawings

FIG. 1 is a flow chart of blind source signal separation;

fig. 2 is a feedback mechanism flow diagram.

Detailed Description

The present invention is further illustrated by the following examples, which are intended to be purely exemplary and are not intended to limit the scope of the invention, as various equivalent modifications of the invention will occur to those skilled in the art upon reading the present disclosure and fall within the scope of the appended claims.

As shown in fig. 1, the underdetermined blind source separation method applying multiple constraints includes the following steps.

Example: the method is applied here to speech signals and their mixing model, but is not limited to them.

The linear instantaneous mixture model of the underdetermined blind speech signal separation problem can be expressed as:

x(t)＝As(t)+n(t)(1)

wherein x(t) = [x₁(t), x₂(t), …, x_M(t)]^T denotes the observed signal vector of size M × 1 at time t, A = [a₁, a₂, …, a_N] ∈ R^(M×N) (M < N) is the mixing matrix, a_i is the ith column vector of the mixing matrix, s(t) = [s₁(t), s₂(t), …, s_N(t)]^T is the source signal vector of size N × 1 at time t, and n(t) represents additive Gaussian noise. Formula (1) can therefore also be expressed as:

x(t) = A s(t) + n(t) = \sum_{i=1}^{N} a_i s_i(t) + n(t)    (2)

for convenience of analysis, the influence of noise is not considered for the moment, so the linear instantaneous mixing model of noise-free underdetermined blind speech signal separation can be expressed as:

x(t) = A s(t) = \sum_{i=1}^{N} a_i s_i(t)    (3)

in sparse component analysis, the sparsity of the signal is mainly used to estimate the mixing matrix. Taking time-domain signal analysis as an example, if only the ith source signal is present at time t, equation (3) degenerates to:

x(t) = a_i s_i(t)    (4)
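Equations (3) and (4) can be illustrated with a small NumPy sketch. The dimensions, the random mixing matrix and the sparse random sources are hypothetical test data; the point is only that at instants where a single source is active, the observation is a scaled copy of one column of A.

```python
import numpy as np

rng = np.random.default_rng(1)
M, N, T = 2, 3, 1000                      # sensors, sources, samples; M < N is underdetermined
A = rng.random((M, N))                    # hypothetical non-negative mixing matrix
# Hypothetical sparse non-negative sources: each sample is active with probability 0.2
S = rng.random((N, T)) * (rng.random((N, T)) < 0.2)
X = A @ S                                 # equation (3): x(t) = A s(t)

# Instants where only source 0 is active: x(t) = a_0 s_0(t), equation (4)
only_0 = (S[0] > 0) & (S[1] == 0) & (S[2] == 0)
ratios = X[:, only_0] / S[0, only_0]      # every column equals a_0
```

This is the property that sparse component analysis exploits: the observations taken at single-source instants line up along the columns of the mixing matrix.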

step 1-1, sampling and quantizing an original observation time-domain signal to obtain a discrete signal S(k); the source signal S(k) = [s₁(k), s₂(k), …, s_M(k)]^T passes through a nonsingular mixing matrix A to give the observation signal X(k) = [x₁(k), x₂(k), …, x_M(k)]^T: X(k) = AS(k); wherein s_M(k) is the Mth component of the source signal S(k), x_M(k) is the Mth component of the observed signal X(k), k is the time index, the superscript T represents the conjugate transpose, M is a positive integer, and A is an M × M matrix;

step 1-2, sending the observed speech signal X(k) obtained in step 1-1 into a preprocessing filter for short-time Fourier transform (STFT) processing to obtain a frequency spectrum matrix, then computing the power spectrum of the frequency spectrum matrix, and outputting a power spectrum signal matrix Z(k). Specifically: the observation signal X(k) is first centered, i.e. X̃(k) = X(k) − E[X(k)], where E represents the mathematical expectation; the centered result X̃(k) is then whitened to obtain the output power spectrum signal matrix Z(k) of the preprocessing filter: Z(k) = VX̃(k), where V is the whitening matrix, which makes the initialization conditions more robust.
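The STFT and power-spectrum stage of step 1-2 can be sketched with NumPy alone. The frame length, hop size and Hann window are assumptions (the text does not specify them), and the function name `power_spectrogram` and the test tone are hypothetical.

```python
import numpy as np

def power_spectrogram(x, frame_len=256, hop=128):
    """Short-time Fourier transform of x followed by squared magnitude.

    Returns a non-negative (frame_len // 2 + 1) x n_frames matrix.
    """
    x = np.asarray(x, dtype=float)
    window = np.hanning(frame_len)                       # assumed analysis window
    n_frames = 1 + (len(x) - frame_len) // hop
    frames = np.stack(
        [x[i * hop: i * hop + frame_len] * window for i in range(n_frames)],
        axis=1,
    )
    spectrum = np.fft.rfft(frames, axis=0)               # one column per frame
    return np.abs(spectrum) ** 2

# Hypothetical test signal: a 1 kHz tone sampled at 8 kHz for one second
fs = 8000
t = np.arange(fs) / fs
P = power_spectrogram(np.sin(2 * np.pi * 1000 * t))
peak_bin = int(np.argmax(P.mean(axis=1)))                # bin spacing is fs / frame_len = 31.25 Hz
```

The power spectrum is non-negative by construction, which is what makes it a suitable input for the non-negative factorization in step 2.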

The step 2 specifically comprises the following steps:

the basic idea of NMF is to decompose, under non-negativity constraints, the matrix V = [v₁, v₂, …, v_N] ∈ R^(m×N) into two matrices W = [w₁, w₂, …, w_n] ∈ R^(m×n) and H = [h₁, h₂, …, h_N] ∈ R^(n×N) whose product approximates V as closely as possible:

V = WH    (5)

In general, the matrix dimensions described above need to satisfy the relation (m + N)n < mN. Specifically, in the present invention, W denotes the speech signal mixing matrix, and H denotes the speech source signal matrix.

Step 2-1, because W and H are constrained to be non-negative, the decomposition V = WH generally cannot be satisfied exactly, and only an approximate decomposition of V with some error exists. Since the error cannot be eliminated, the decomposition error is made as small as possible: an appropriate objective function is defined, and the decomposition that approximates V most accurately is found by minimizing it. D. D. Lee and H. S. Seung gave two objective functions, one of which is based on the Kullback-Leibler divergence and is defined as follows:

D(V \| WH) = \sum_{i,j} \left( V_{ij} \log \frac{V_{ij}}{(WH)_{ij}} - V_{ij} + (WH)_{ij} \right)    (6)

when non-negative matrix factorization is used directly to solve underdetermined blind signal separation, the decomposition result is not unique and the source signals cannot be separated correctly. To achieve good separation performance and a unique result under underdetermined conditions, the invention adopts triple constraints: the determinant criterion as the uniqueness constraint on W, the l¹ norm as the sparsity constraint on H, and the minimum correlation constraint on H. Combined with the objective function of minimum reconstruction error based on the Kullback-Leibler divergence, the optimization function F(V||WH) is selected as follows:

F(V \| WH) = \alpha_D D(V \| WH) + \alpha_\varphi \mathrm{vol}(\varphi(W)) + \alpha_J J(H) + \alpha_R C(H)    (7)

wherein,

W_{ik}, H_{kj} > 0, \quad \sum_i W_{ik} = 1, \quad 1 \le k \le n, \; 1 \le i \le m, \; 1 \le j \le N

D(V \| WH) = \sum_{i,j} \left( V_{ij} \log \frac{V_{ij}}{(WH)_{ij}} - V_{ij} + (WH)_{ij} \right)    (8)

in the formula

D(V||WH) is a distance function between V and WH, namely the reconstruction error;

vol(φ(W)) is the determinant criterion constraint on the mixing matrix W and represents the volume of φ(W): when W is a square matrix, vol(φ(W)) = |det(W)|; when W is not a square matrix, vol(φ(W)) = |det(WW^T)|, the latter corresponding to the underdetermined case (i.e. the number of speech signals is greater than the number of sensors);

J(H) is the sparsity constraint on the separated signal H, taken as the l¹ norm, J(H) = Σ_{i,j} H_ij;

C(H) is the minimum correlation constraint on the separated signal H, so that each component of the separated speech signal contains as little information from the other signal components as possible,

C(H) = \frac{1}{2} \left[ \sum_{i=1}^{n} \log (HH^T)_{ii} - \log |HH^T| \right];

α_D, α_φ, α_J and α_R are constraint parameters chosen to guarantee convergence of the NMF algorithm.
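The three penalty terms of the optimization function can be evaluated with a short NumPy sketch. The expressions for vol(φ(W)) and C(H) follow the definitions above; the l¹ form of J(H) is an assumption based on the statement that the l¹ norm is used as the sparsity constraint. The function names are hypothetical.

```python
import numpy as np

def volume(W):
    """vol(phi(W)): |det(W)| if W is square, otherwise |det(W W^T)|."""
    W = np.asarray(W, dtype=float)
    if W.shape[0] == W.shape[1]:
        return float(abs(np.linalg.det(W)))
    return float(abs(np.linalg.det(W @ W.T)))

def sparsity(H):
    """J(H), assumed here to be the l1 norm (sum of absolute entries)."""
    return float(np.sum(np.abs(H)))

def min_correlation(H):
    """C(H) = 0.5 * [sum_i log (H H^T)_ii - log |det(H H^T)|]."""
    H = np.asarray(H, dtype=float)
    G = H @ H.T
    _, logdet = np.linalg.slogdet(G)       # log |det(G)|, numerically stable
    return float(0.5 * (np.sum(np.log(np.diag(G))) - logdet))

H_orthogonal = np.array([[1.0, 0, 2, 0], [0, 3, 0, 1]])   # rows do not overlap
H_overlapping = np.array([[1.0, 1, 2, 0], [1, 3, 0, 1]])  # rows share active samples
c_orth = min_correlation(H_orthogonal)
c_over = min_correlation(H_overlapping)
```

C(H) is zero when the rows of H are mutually orthogonal and grows as they overlap, so minimizing it pushes each separated component to carry as little of the others as possible.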

The NMF problem can then be converted into an optimization problem, namely

\min_{W, H} D(V \| WH)

\text{s.t.} \; W, H \ge 0

the divergence D(V||WH) is non-increasing under alternating iterations of the multiplicative update rule:

H_{a\mu} \leftarrow H_{a\mu} \frac{\sum_i W_{ia} V_{i\mu} / (WH)_{i\mu}}{\sum_k W_{ka}}, \qquad W_{ia} \leftarrow W_{ia} \frac{\sum_\mu H_{a\mu} V_{i\mu} / (WH)_{i\mu}}{\sum_\nu H_{a\nu}}    (9)

the divergence no longer changes if and only if W and H are at a stationary point.

We first derive the iterative formula for H_aμ: for the objective function D(V||WH), descent is performed along the negative gradient direction with an undetermined learning rate η_aμ. Using the steepest descent method, we calculate

\frac{\partial D(V \| WH)}{\partial H_{a\mu}} = \frac{\partial \sum_{i,j} \left( V_{ij} \log \frac{V_{ij}}{(WH)_{ij}} - V_{ij} + (WH)_{ij} \right)}{\partial H_{a\mu}} = \sum_i W_{ia} - \sum_i \frac{V_{i\mu}}{(WH)_{i\mu}} W_{ia}    (10)

Thus

H_{a\mu} \leftarrow H_{a\mu} + \eta_{a\mu} \left( \sum_i \frac{V_{i\mu}}{(WH)_{i\mu}} W_{ia} - \sum_i W_{ia} \right)    (11)

The learning rate is selected as η_aμ = H_aμ / Σ_i W_ia, where the balance parameter takes a very small value.

Substituting this learning rate into formula (11) gives the iterative formula for H, and since W and H are symmetric, the iterative formula for W can be found similarly.
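The step from the additive rule (11) to the multiplicative rule for H in (9) can be written out explicitly. The derivation below assumes the learning rate η_aμ = H_aμ / Σ_i W_ia, which is the choice that reproduces (9); any additional small balance parameter is ignored here.

```latex
\begin{aligned}
H_{a\mu} &\leftarrow H_{a\mu} + \frac{H_{a\mu}}{\sum_i W_{ia}}
   \left( \sum_i \frac{V_{i\mu}}{(WH)_{i\mu}} W_{ia} - \sum_i W_{ia} \right) \\
 &= H_{a\mu} + H_{a\mu}\,
   \frac{\sum_i W_{ia} V_{i\mu} / (WH)_{i\mu}}{\sum_i W_{ia}} - H_{a\mu} \\
 &= H_{a\mu}\,
   \frac{\sum_i W_{ia} V_{i\mu} / (WH)_{i\mu}}{\sum_k W_{ka}}
\end{aligned}
```

The additive term −H_aμ cancels the leading H_aμ, so the update becomes a multiplication by a non-negative ratio and H stays non-negative whenever it starts non-negative.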

non-negative matrix factorization with a sparsity constraint adds constraint conditions to the original objective function in order to obtain a decomposition that is as sparse as possible. On the basis of step 2-1 (W and H here refer to the algebraic expressions given by the iterative formulas for W and H obtained in step 2-1), a 1-norm constraint is added to the non-negative vector h, i.e. ||h||₁ is minimized; in matrix form, Σ_{i,j} H_ij should be minimized so that the matrix H is sufficiently sparse. Adding this constraint, the objective function becomes:

G(V \| WH) = \frac{1}{2} \sum_{i,j} (V_{ij} - (WH)_{ij})^2 - \frac{1}{2} \beta \sum_{i,j} W_{ij}^2 + \lambda \sum_{i,j} H_{ij}    (12)

where λ ≥ 0. This is converted into the following optimization problem:

\min G(V \| WH) \quad \text{s.t.} \quad W, H \ge 0, \; \sum_i W_{ij} = 1, \; 1 \le j \le r    (13)

obviously, the iteration rule for W is not changed, while the iteration rule of G(V||WH) with respect to H can be solved through the formula

H_{aj}^{t} \leftarrow H_{aj}^{t-1} - \alpha_{aj} \frac{\partial G(V \| WH)}{\partial H_{aj}}    (14)

where

\frac{\partial G(V \| WH)}{\partial H_{aj}} = \frac{\partial \left( \frac{1}{2} \sum_{i,j} (V_{ij} - (WH)_{ij})^2 - \frac{1}{2} \beta \sum_{i,j} W_{ij}^2 + \lambda \sum_{i,j} H_{ij} \right)}{\partial H_{aj}} = (W^T W H)_{aj} - (W^T V)_{aj} + \lambda    (15)

α_aj is the step size in the direction of the negative gradient, taken as

\alpha_{aj} = \frac{H_{aj}}{(W^T W H)_{aj} + \lambda}    (16)

Substituting equations (15) and (16) into (14) yields the iterative equation for H:

H_{aj} \leftarrow H_{aj} \frac{(W^T V)_{aj}}{(W^T W H)_{aj} + \lambda}    (17)

finally, the new iteration rule is obtained as follows:

W_{ia} \leftarrow W_{ia} \frac{(V H^T)_{ia}}{(W H H^T)_{ia} - \beta W_{ia}}    (18)

H_{aj} \leftarrow H_{aj} \frac{(W^T V)_{aj}}{(W^T W H)_{aj} + \lambda}    (19)

the iteration rules (18) and (19) can be proved convergent using the relevant mathematical definitions and lemmas [1]: when W and H are solved using the iteration rules (18) and (19), the objective function G(V||WH) is non-increasing, and it stops changing if and only if W and H are at a stationary point.

However, because the iterative formula contains a minus sign, the new iteration rule cannot guarantee a non-negative result, and it loses meaning when the denominator is zero. For this reason a small adjusting parameter ε = 1.0 × 10^-9 is introduced and each column of the base matrix W is normalized, giving

W_{ia} \leftarrow W_{ia} \frac{\max\{(V H^T)_{ia}, \epsilon\}}{(W H H^T)_{ia} - \beta W_{ia} + \epsilon}    (20)

H_{aj} \leftarrow H_{aj} \frac{\max\{(W^T V)_{aj}, \epsilon\}}{(W^T W H)_{aj} + \lambda + \epsilon}    (21)

The base matrix W is normalized as follows:

W_{ia} \leftarrow \frac{W_{ia}}{\sum_i W_{ia}}    (22)
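One pass of the update rules (20) to (22) can be sketched in NumPy as follows. The default values of `beta` and `lam`, the order of the W and H updates, the random initialization and the fixed iteration count are assumptions for illustration; the function name `constrained_nmf_step` is hypothetical, and the determinant and minimum-correlation terms are not included.

```python
import numpy as np

def constrained_nmf_step(V, W, H, beta=0.0, lam=0.1, eps=1e-9):
    """Apply update (20) to W, update (21) to H, then normalization (22) to W."""
    # (20): W_ia <- W_ia * max{(V H^T)_ia, eps} / ((W H H^T)_ia - beta W_ia + eps)
    W = W * np.maximum(V @ H.T, eps) / (W @ (H @ H.T) - beta * W + eps)
    W = np.maximum(W, 0.0)                      # zero any negative entries
    # (21): H_aj <- H_aj * max{(W^T V)_aj, eps} / ((W^T W H)_aj + lam + eps)
    H = H * np.maximum(W.T @ V, eps) / (W.T @ W @ H + lam + eps)
    H = np.maximum(H, 0.0)
    # (22): each column of W is scaled to sum to one
    W = W / W.sum(axis=0, keepdims=True)
    return W, H

# Hypothetical non-negative data and strictly positive random starting factors
rng = np.random.default_rng(0)
V = rng.random((4, 30)) + 0.1
W = rng.random((4, 3)) + 0.1
H = rng.random((3, 30)) + 0.1
for _ in range(50):
    W, H = constrained_nmf_step(V, W, H)
```

With β = 0 the denominator of (20) stays positive; a large β can make it negative, which is the situation the ε safeguard and the zeroing of negative entries are meant to handle.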

when formulas (20) and (21) are iterated until F(V||WH) is no larger than a very small threshold, the algorithm is judged to have converged and the correct W and H are obtained.

Provided that the signals and the mixing matrix are non-negative, the observation signal matrix X of formula (3), X = AS, is decomposed according to formula (5), i.e. X = WH, so that W is the estimate of the mixing matrix A and H is the estimate of the source signal matrix S, thereby realizing blind source separation of the mixed signals.

The steps of the NMF underdetermined blind speech signal separation algorithm based on the K-L divergence are as follows:

a. randomly selecting a group of W and H obtained in step 1 for initialization, and applying the sparsity constraint, minimum correlation constraint and determinant constraint processing;

b. further optimizing the objective function based on the reconstruction error to reconstruct the spectrum matrix of the source speech signal and obtain the spectrum matrices W and H of the reconstructed source speech signal (wherein W and H both refer to the algebraic expressions given by the iterative formulas for W and H obtained in step 2-1);

c. updating the learning process: a small threshold is set and iteration proceeds according to the update rules of formulas (20) and (21); after each iteration, the negative elements of W and H are set to zero and the column vectors of W are normalized as in formula (22). When the optimization function F(V||WH) is less than the threshold, the algorithm terminates and updating stops. Inverse STFT processing is then performed on W and H to obtain a unique speech signal mixing matrix W and a speech source signal estimation matrix H.

The step 3 specifically comprises:

in the signal separation process, the idea of the feedback mechanism is to remove the most thoroughly separated signal, identified from the estimated source signals, so as to form a new mixed signal for further separation; the separation process is repeated until all signals are separated. A flow chart of the feedback mechanism is shown in fig. 2. Because the source signals are unknown, the purity of an estimated signal is judged as follows: for each source signal component estimated by the feedback system, the correlation coefficients between that component and all the mixed signals are summed, and the component with the minimum sum is taken as the purest signal, i.e. the most thoroughly separated signal. The correlation coefficient between two signals h₁ and h₂ is defined as follows:

r_{h_1 h_2} = \frac{\sum_{i=1}^{N} (h_{1i} - \bar{h}_1)(h_{2i} - \bar{h}_2)}{\sqrt{\sum_{i=1}^{N} (h_{1i} - \bar{h}_1)^2} \sqrt{\sum_{i=1}^{N} (h_{2i} - \bar{h}_2)^2}}    (23)

The method comprises the following specific steps:

step 3-1, updating according to the iteration rule formulas (20) and (21) for W and H to obtain W = [w₁, w₂, …, w_n] and H = [h₁, h₂, …, h_n]^T, and letting p = n;

step 3-2, judging the number of source speech signals: if the number p of source speech signals is less than 2, the final speech signal mixing matrix W and the estimation matrix H of the speech source signals are obtained and the algorithm ends; otherwise, going to the next step and continuing to separate the speech signals;

step 3-3, selecting the source signal with the best separation effect: for each separated source signal h_i (i = 1, 2, …, p), computing the sum of the absolute values of its correlation coefficients with the original observed speech signal V, i.e. the sum Σ_j |r_{h_i v_j}|, wherein v_j is the jth component of the observed speech signal V; the h_t corresponding to the minimum value, i.e. min_i Σ_j |r_{h_i v_j}|, is the speech signal with the best separation effect;

step 3-4, forming a new mixed speech signal: removing the source signal h_t from the mixed speech signal used in this separation to form a new mixed speech signal V', i.e. V' = V − w_t h_t, and letting p = p − 1;

step 3-5, according to the new mixed speech signal V', updating in turn by the iteration rule formulas (20) and (21) to obtain W = [w₁, w₂, …, w_n] and H = [h₁, h₂, …, h_n]^T; and returning to step 3-2.

As described above, a feedback mechanism is applied to the initial separation result so that the speech signals are separated one by one, which improves the separation accuracy of the blind speech signals.

Reference documents:

[1] Lee D D, Seung H S. Algorithms for non-negative matrix factorization[J]. Advances in Neural Information Processing Systems, 2001, 13: 556-562.

Claims

1. An underdetermined blind source separation method applying multiple constraints, characterized by comprising the following steps:

2. The method for multiple-constraint underdetermined blind source separation as defined in claim 1, wherein the step 1 specifically comprises:

3. The method for multiple-constraint underdetermined blind source separation according to claim 1, wherein the initial mixture matrix in step 2 is first subjected to sparsity constraint, minimum correlation constraint and determinant constraint, specifically as follows:

F(V \| WH) = \alpha_D D(V \| WH) + \alpha_\varphi \mathrm{vol}(\varphi(W)) + \alpha_J J(H) + \alpha_R C(H)

wherein,

W_{ik}, H_{kj} > 0, \quad \sum_i W_{ik} = 1, \quad 1 \le k \le n, \; 1 \le i \le m, \; 1 \le j \le N

D(V \| WH) = \sum_{i,j} \left( V_{ij} \log \frac{V_{ij}}{(WH)_{ij}} - V_{ij} + (WH)_{ij} \right)

in the formula

D(V||WH) is a distance function between V and WH, namely the reconstruction error;

vol(φ(W)) is the determinant criterion constraint on the mixing matrix W and represents the volume of φ(W): when W is a square matrix, vol(φ(W)) = |det(W)|; when W is not a square matrix, vol(φ(W)) = |det(WW^T)|;

J(H) is the sparsity constraint on the separated signal H;

C(H) is the minimum correlation constraint on the separated signal H;

α_D, α_φ, α_J and α_R are constraint parameters chosen to guarantee algorithm convergence;

the NMF problem can then be converted into an optimization problem, namely

\min_{W, H} D(V \| WH)

\text{s.t.} \; W, H \ge 0

the divergence D(V||WH) is non-increasing under the following iterations:

H_{a\mu} \leftarrow H_{a\mu} \frac{\sum_i W_{ia} V_{i\mu} / (WH)_{i\mu}}{\sum_k W_{ka}}, \qquad W_{ia} \leftarrow W_{ia} \frac{\sum_\mu H_{a\mu} V_{i\mu} / (WH)_{i\mu}}{\sum_\nu H_{a\nu}}

the divergence no longer changes if and only if W and H are at a stationary point;

the iterative formula for H_aμ is derived first: for the objective function D(V||WH), descent is performed along the negative gradient direction with an undetermined learning rate η_aμ; an iterative formula for H can then be obtained, and since W and H are symmetric, an iterative formula for W can be found similarly;

on the basis of step 2-1, a 1-norm constraint is added to the non-negative vector h, i.e. ||h||₁ is minimized; in matrix form, Σ_{i,j} H_ij should be minimized so that the matrix H is sufficiently sparse, and a new iteration rule is obtained; when W and H are solved with this iteration rule, the objective function G(V||WH) is non-increasing, and it stops changing if and only if W and H are at a stationary point of the objective function;

when iteration under the new rules for W and H brings F(V||WH) to no more than a very small threshold, the algorithm is judged to have converged and the correct W and H are obtained;

4. The method for multiple-constraint underdetermined blind source separation as defined in claim 1, wherein the step 3 specifically comprises the steps of:

5. A multi-constrained underdetermined blind source separation method as defined in claim 3,

the new iteration rule is as follows:

W_{ia} \leftarrow W_{ia} \frac{(V H^T)_{ia}}{(W H H^T)_{ia} - \beta W_{ia}}

H_{aj} \leftarrow H_{aj} \frac{(W^T V)_{aj}}{(W^T W H)_{aj} + \lambda}

when W and H are solved using the new iteration rule, the objective function G(V||WH) is non-increasing, and it stops changing if and only if W and H are at a stationary point.