CN111429936A

CN111429936A - Voice signal separation method

Info

Publication number: CN111429936A
Application number: CN202010195601.8A
Authority: CN
Inventors: 李一兵; 吴静; 孙骞; 吕威; 田园
Original assignee: Harbin Engineering University
Current assignee: Harbin Engineering University
Priority date: 2020-03-19
Filing date: 2020-03-19
Publication date: 2020-07-17
Anticipated expiration: 2040-03-19
Also published as: CN111429936B

Abstract

The invention provides a voice signal separation method. A linear instantaneous mixing model of the observation signals is first established, and, to address the marked loss of separation accuracy as the number of source signals increases, an improved l₁-norm minimization algorithm is proposed. The algorithm first preprocesses the observation signals and the mixing matrix, then finds the vector closest to the observation signal according to vector length and direction, modifies the form of the mixing matrix on that basis, estimates the source signals at a given time instant using the modified mixing matrix, and thereby estimates the source signals at all time instants. The proposed method addresses the marked loss of separation accuracy that occurs as the number of source signals increases, while separating the source signals effectively.

Description

Voice signal separation method

Technical Field

The invention relates to a voice signal separation method, in particular to a voice signal separation method under an underdetermined model, and belongs to the field of signal processing.

Background

In recent years, speech signal separation has become a research hotspot in the field of signal processing, with many applications in teleconferencing, hearing aids and machine speech recognition. Because received sound is usually noisy, identifying the sound of interest and recovering it clearly in such an environment is a significant problem, known as the blind source separation problem.

Blind source separation is generally classified by the relative numbers of source signals and observation signals into overdetermined, determined and underdetermined blind source separation. Of these, underdetermined blind source separation best matches practical conditions, is the most widely applicable, and is the most challenging. Underdetermined blind source separation refers to the case where the number of sensors or microphones is less than the number of source signals. In general, methods that solve the underdetermined problem are also suitable for the overdetermined and determined cases, so research on underdetermined blind source separation is necessary. The usual approach to underdetermined blind source separation is sparse component analysis, commonly referred to as the "two-step" approach: the first step estimates the mixing matrix from the observation signals, and the second step separates the source signals using the estimated mixing matrix. Current research shows that existing source signal separation algorithms generally suffer a marked loss of separation accuracy as the number of source signals increases.

Disclosure of Invention

In view of the above prior art, the technical problem to be solved by the present invention is to provide a speech signal separation method based on improved l₁-norm minimization that alleviates the marked loss of separation accuracy when the number of source signals increases.

In order to solve the above technical problem, the present invention provides a method for separating a voice signal, comprising the following steps:

step 1: establishing a linear instantaneous mixing model of the observation signals, specifically:

$$x(t) = A\,s(t) = \sum_{i=1}^{M} a_i s_i(t)$$

where $x(t) = [x_1(t), x_2(t), \ldots, x_N(t)]^T$ is the N-dimensional observation signal vector, $A = [a_1, a_2, \ldots, a_M]$ is the N × M mixing matrix, $s(t) = [s_1(t), s_2(t), \ldots, s_M(t)]^T$ is the M-dimensional source signal vector, t is the time sample index, and $a_i$ denotes the i-th column vector of the mixing matrix;

step 2: removing all zero column vectors from the observation signals, and then mapping the observation signals symmetrically into the upper half-plane;

step 3: separating the source signals using improved l₁-norm minimization:

the l₁-norm minimization criterion is:

$$\min_{s(t)} \sum_{i=1}^{M} |s_i(t)| \quad \text{subject to} \quad x(t) = A\,s(t)$$

the specific steps are as follows:

(3a) calculating the direction angle θ(t) of the observation signal at time t and the direction angles α_i of the column vectors of the mixing matrix:

The calculation formulas are as follows:

$\theta(t) = \arctan\big(x_2(t)/x_1(t)\big)$

$\alpha_i = \arctan(a_{i2}/a_{i1}), \quad i = 1, 2, \ldots, M$

where $x_1(t)$ and $x_2(t)$ are the two observation signals, and $a_{in}$ denotes the n-th element of the i-th column vector of the mixing matrix.

(3b) calculating the direction of the resultant of any two column vectors of the mixing matrix using the law of sines and the law of cosines:

the specific process is as follows:

$\angle AOB = \angle AOx - \angle BOx$

$AB^2 = OA^2 + OB^2 - 2\,OA \cdot OB \cos\angle AOB$

$OC^2 = OA^2 + AC^2 - 2\,OA \cdot AC \cos\angle OAC$

$\angle COx = \angle AOx - \angle AOC$

where the vectors OA and OB are any two column vectors of the estimated mixing matrix, the angles ∠AOx and ∠BOx are the directions of the corresponding column vectors of the mixing matrix, and the lengths of OA and OB equal the lengths of those column vectors.

(3c) calculating the angle Δθ between θ(t) and α_i:

if Δθ = 0, use:

$x(t) = a_i s_i(t)$

where x(t) is the observation signal vector at time t, a_i is the i-th column vector of the mixing matrix, and s_i(t) is the i-th source signal estimated at time t.

If Δθ ≠ 0, use:

$s_r(t) = W_r\,x(t)$

where $W_r = A_r^{-1}$, $A_r = [a^c, a^d]$, and $a^c$ and $a^d$ are the two vectors closest to the observation signal vector at time t.

(3d) traversing all time instants to obtain the source signal estimate s(t) at every instant.

The beneficial effects of the invention are as follows. The invention addresses the second step of the sparse component analysis method: the source signals are separated by a method based on improved l₁-norm minimization.

(1) The proposed source signal separation algorithm is applicable to two channels of observation signals;

(2) as the number of source signals increases, the separation accuracy of the proposed algorithm declines more gradually.

Drawings

FIG. 1 is a flow chart of the algorithm of the present invention;

FIG. 2 shows the three original source signals;

FIG. 3 shows the two mixed observation signals;

FIG. 4 is a schematic diagram of the resultant of any two column vectors;

FIG. 5 shows the three separated source signals.

Detailed Description

The method first finds the vector closest to the observation signal according to vector length and angle, then modifies the form of the mixing matrix, estimates the source signals at a given time instant using the modified mixing matrix, and thereby estimates the source signals at all time instants.

The invention is described in detail below with reference to the accompanying drawings and specific embodiments.

Referring to FIG. 1, the speech signal separation method based on improved l₁-norm minimization of the present invention comprises the following specific steps:

Step 1: establishing a linear instantaneous mixing model of the observation signals. FIG. 2 shows the three original source signals, and FIG. 3 shows the two mixed observation signals.

In step 1, the mathematical model established is a linear instantaneous mixing model. Speech signals are chosen as the source signals, the noise considered is additive noise, and the signal-to-noise ratio is 30 dB.

The linear instantaneous mixing model of the observation signals is expressed as follows:

$$x(t) = A\,s(t) = \sum_{i=1}^{M} a_i s_i(t)$$

where $x(t) = [x_1(t), x_2(t), \ldots, x_N(t)]^T$ is the N-dimensional observation signal vector, $A = [a_1, a_2, \ldots, a_M]$ is the N × M mixing matrix, $s(t) = [s_1(t), s_2(t), \ldots, s_M(t)]^T$ is the M-dimensional source signal vector, t is the time sample index, and $a_i$ denotes the i-th column vector of the mixing matrix.
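By way of non-limiting illustration, the mixing model of step 1 can be sketched in Python with NumPy. The source signals below are synthetic sparse stand-ins rather than recorded speech, and the mixing matrix `A`, the sample count and the random seed are assumptions chosen for the example; only the layout of three sources and two observations and the 30 dB signal-to-noise ratio are taken from the embodiment.

```python
import numpy as np

rng = np.random.default_rng(0)

M, N, T = 3, 2, 1000           # sources, observations, samples (illustrative)

# Stand-in sparse sources: Laplacian samples, most of them zeroed out.
S = rng.laplace(size=(M, T)) * (rng.random((M, T)) < 0.2)

# Hypothetical mixing matrix with unit-norm columns at 20, 75 and 130 degrees.
angles = np.deg2rad([20.0, 75.0, 130.0])
A = np.vstack([np.cos(angles), np.sin(angles)])    # shape (N, M)

X_clean = A @ S                                     # x(t) = A s(t) for every t

# Additive white Gaussian noise scaled to a 30 dB signal-to-noise ratio.
snr_db = 30.0
signal_power = np.mean(X_clean ** 2)
noise_power = signal_power / 10 ** (snr_db / 10)
X = X_clean + rng.normal(scale=np.sqrt(noise_power), size=X_clean.shape)
```

Each column of `X` is one two-dimensional observation vector x(t); the later steps operate on these columns one at a time.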

Step 2: removing all zero column vectors from the observation signals, and then mapping the observation signals symmetrically into the upper half-plane.

In step 2, because all-zero column vectors in the observation signals contribute nothing to separating the source signals, they are removed. To simplify subsequent processing, the observation signals are then mapped symmetrically into the upper half-plane.
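A minimal sketch of this preprocessing is given below. The function name `preprocess` and its return values are hypothetical, and the sketch assumes that mapping "symmetrically into the upper half-plane" means a point reflection through the origin, x(t) → -x(t) whenever x₂(t) < 0; the text does not state the exact form of the symmetry.

```python
import numpy as np

def preprocess(X):
    """Drop all-zero samples of a 2 x T observation matrix and map the
    remaining samples into the upper half-plane (x2 >= 0)."""
    X = np.asarray(X, dtype=float)
    keep = np.any(X != 0.0, axis=0)      # samples that are not identically zero
    X_pre = X[:, keep]                   # boolean indexing returns a copy
    # Assumption: the symmetry is a reflection through the origin, which
    # leaves the direction line of each sample unchanged.
    flip = X_pre[1] < 0.0
    X_pre[:, flip] *= -1.0
    sign = np.where(flip, -1.0, 1.0)     # remembers which samples were reflected
    return X_pre, keep, sign
```

Under the linear model, reflecting x(t) through the origin negates s(t), so under this assumption multiplying each estimated source sample by the corresponding entry of `sign` would undo the reflection.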

Step 3: separating the source signals using improved l₁-norm minimization.

To separate the source signals, the invention uses the l₁-norm minimization criterion.

The specific steps are as follows:

(3a) Calculating the direction angle θ(t) of the observation signal at time t and the direction angles α_i of the column vectors of the mixing matrix.

The source signals at each sampling instant can be separated from the observation signal x(t) at that instant, so the source separation problem reduces to a separation problem at a single sampling instant. The direction θ(t) of the observation signal at time t and the directions α_i of the column vectors of the mixing matrix are calculated first.

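The calculation formulas are as follows:

$\theta(t) = \arctan\big(x_2(t)/x_1(t)\big)$

$\alpha_i = \arctan(a_{i2}/a_{i1}), \quad i = 1, 2, \ldots, M$

where $x_1(t)$ and $x_2(t)$ are the two observation signals, and $a_{in}$ denotes the n-th element of the i-th column vector of the mixing matrix.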
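The angle computation of (3a) might be sketched as follows. The helper name `direction_angles` is hypothetical. As a deliberate substitution, the sketch uses `numpy.arctan2` reduced modulo π instead of the plain arctangent of a ratio, so that a sample with x₁(t) = 0 does not cause a division by zero and every direction falls in [0, π); for samples in the first quadrant the two agree.

```python
import numpy as np

def direction_angles(X, A):
    """Return theta(t) for every column of X (2 x T) and alpha_i for every
    column of A (2 x M), both as line directions in [0, pi)."""
    X = np.asarray(X, dtype=float)
    A = np.asarray(A, dtype=float)
    # arctan2 handles x1 = 0; the modulo treats v and -v as the same direction.
    theta = np.mod(np.arctan2(X[1], X[0]), np.pi)
    alpha = np.mod(np.arctan2(A[1], A[0]), np.pi)
    return theta, alpha
```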

(3b) Calculating the direction of the resultant of any two column vectors of the mixing matrix using the law of sines and the law of cosines. FIG. 4 is a schematic diagram of the resultant of any two column vectors.

Because both the length and the direction of the column vectors are taken into account when minimizing the sum of the moduli of the source signals, the law of sines and the law of cosines are applied, given the known lengths and directions of the column vectors of the mixing matrix, to obtain the direction of the resultant of any two column vectors.

The specific process is as follows:

$\angle AOB = \angle AOx - \angle BOx$

$AB^2 = OA^2 + OB^2 - 2\,OA \cdot OB \cos\angle AOB$

$OC^2 = OA^2 + AC^2 - 2\,OA \cdot AC \cos\angle OAC$

$\angle COx = \angle AOx - \angle AOC$

where the vectors OA and OB are any two column vectors of the mixing matrix, the angles ∠AOx and ∠BOx are the directions of the corresponding column vectors of the mixing matrix, and the lengths of OA and OB equal the lengths of those column vectors.
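The geometry of (3b) can be illustrated with the following sketch, which assumes that C is the fourth vertex of the parallelogram OACB, so that OC = OA + OB, AC has the length of OB, and ∠OAC = π - ∠AOB. The function name `resultant_direction` is hypothetical. The sketch obtains ∠AOC from the law of cosines rather than the law of sines, because the arcsine cannot distinguish an acute from an obtuse angle; this is a substitution made for the example, not a statement of the patented procedure.

```python
import numpy as np

def resultant_direction(len_a, ang_a, len_b, ang_b):
    """Direction angle and length of OC = OA + OB, computed from the lengths
    and direction angles of OA and OB.

    Assumes 0 < ang_a - ang_b < pi (OA is the more counter-clockwise vector).
    """
    aob = ang_a - ang_b                      # angle AOB = angle AOx - angle BOx
    oac = np.pi - aob                        # interior angle of parallelogram OACB
    ac = len_b                               # AC is parallel and equal to OB
    # Law of cosines in triangle OAC gives the length of the resultant.
    oc = np.sqrt(len_a ** 2 + ac ** 2 - 2.0 * len_a * ac * np.cos(oac))
    # Law of cosines again for angle AOC (valid for acute and obtuse angles).
    cos_aoc = (len_a ** 2 + oc ** 2 - ac ** 2) / (2.0 * len_a * oc)
    aoc = np.arccos(np.clip(cos_aoc, -1.0, 1.0))
    return ang_a - aoc, oc                   # angle COx = angle AOx - angle AOC
```

The result can be cross-checked against a direct vector sum, for example by comparing the returned angle with `np.arctan2` applied to the components of OA + OB.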

(3c) Calculating the angle Δθ between θ(t) and α_i:

If Δθ = 0, the direction of the observation signal sample coincides with the direction of one column vector of the mixing matrix, and the source signal is obtained from:

$x(t) = a_i s_i(t)$

If Δθ ≠ 0, the direction of the observation signal sample does not coincide with the direction of any column vector of the mixing matrix. In this case, the resultant directions of pairs of column vectors obtained in (3b) are used to find the two column vectors $a^c$ and $a^d$ that minimize the sum of the moduli of the source signals. The source signals at that instant are then obtained from:

$s_r(t) = W_r\,x(t)$

where $W_r = A_r^{-1}$, $A_r = [a^c, a^d]$, and $a^c$ and $a^d$ are the two vectors closest to x(t) at time t.
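The two cases of (3c) might be sketched for a single sample as follows. The function name `separate_sample` and the tolerance `tol` are hypothetical. Note the substitution: instead of the geometric selection of a^c and a^d from the resultant directions of (3b), this sketch simply tries every pair of columns and keeps the pair with the smallest sum of moduli, which is a brute-force way of minimizing the l₁ norm when there are two observation signals.

```python
import numpy as np
from itertools import combinations

def separate_sample(x, A, tol=1e-8):
    """Estimate s(t) from one 2-D observation x and a 2 x M mixing matrix A."""
    x = np.asarray(x, dtype=float)
    A = np.asarray(A, dtype=float)
    s = np.zeros(A.shape[1])
    if not np.any(x):
        return s                                   # all-zero sample
    theta = np.mod(np.arctan2(x[1], x[0]), np.pi)
    alpha = np.mod(np.arctan2(A[1], A[0]), np.pi)
    delta = np.abs(theta - alpha)
    delta = np.minimum(delta, np.pi - delta)       # directions are modulo pi
    i = int(np.argmin(delta))
    if delta[i] < tol:
        # Delta-theta = 0: x(t) = a_i s_i(t), so project x onto a_i.
        a_i = A[:, i]
        s[i] = (a_i @ x) / (a_i @ a_i)
        return s
    # Delta-theta != 0: brute-force search over column pairs (assumption,
    # stands in for the selection rule based on step (3b)).
    best_cost = np.inf
    for c, d in combinations(range(A.shape[1]), 2):
        A_r = A[:, [c, d]]
        if abs(np.linalg.det(A_r)) < 1e-12:
            continue                               # parallel columns
        s_r = np.linalg.solve(A_r, x)              # equals W_r x, W_r = A_r^-1
        cost = np.abs(s_r).sum()
        if cost < best_cost:
            best_cost = cost
            s[:] = 0.0
            s[[c, d]] = s_r
    return s
```

All sources other than the selected one or two are set to zero, which reflects the sparsity assumption underlying sparse component analysis.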

(3d) Traversing all time instants to obtain the source signal estimates s(t) at all instants. FIG. 5 shows the three separated source signals.
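The traversal of (3d) can be illustrated by processing every column of the observation matrix. In the sketch below, the function name `separate_all` is hypothetical, and the pair of columns used at each instant is again chosen by exhaustive search for the smallest sum of moduli rather than by the geometric rule of (3b). A sample that is collinear with a single column is handled by the same search, since one of the two solved coefficients is then zero up to rounding.

```python
import numpy as np
from itertools import combinations

def separate_all(X, A):
    """Estimate S (M x T) from X (2 x T) and A (2 x M), one sample at a time,
    vectorised over the time axis."""
    X = np.asarray(X, dtype=float)
    A = np.asarray(A, dtype=float)
    M, T = A.shape[1], X.shape[1]
    S_hat = np.zeros((M, T))
    best = np.full(T, np.inf)                  # smallest l1 cost found so far
    for c, d in combinations(range(M), 2):
        A_r = A[:, [c, d]]
        if abs(np.linalg.det(A_r)) < 1e-12:
            continue                           # parallel columns, skip pair
        S_r = np.linalg.solve(A_r, X)          # W_r x(t) for every t at once
        cost = np.abs(S_r).sum(axis=0)
        better = cost < best
        S_hat[:, better] = 0.0                 # discard the previous best pair
        S_hat[c, better] = S_r[0, better]
        S_hat[d, better] = S_r[1, better]
        best[better] = cost[better]
    return S_hat
```

By construction `A @ S_hat` reproduces `X`; how closely `S_hat` matches the true sources depends on how sparse those sources are at each instant.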

The speech signal separation method based on improved l₁-norm minimization of the invention has the advantage that its separation accuracy declines only gradually as the number of source signals increases.

The speech signal separation method based on improved l₁-norm minimization of the invention is applicable only to two channels of observation signals.

In summary, the invention provides a speech signal separation method based on improved l₁-norm minimization. A linear instantaneous mixing model of the observation signals is first established, and, to address the marked loss of separation accuracy as the number of source signals increases, an improved l₁-norm minimization algorithm is proposed. The algorithm first preprocesses the observation signals and the mixing matrix, then finds the vector closest to the observation signal according to vector length and direction, modifies the form of the mixing matrix on that basis, estimates the source signals at a given time instant using the modified mixing matrix, and thereby estimates the source signals at all time instants. The proposed method addresses the marked loss of separation accuracy that occurs as the number of source signals increases, while separating the source signals effectively.

It should be noted that the above embodiments do not limit the present invention in any way, and all technical solutions obtained by using equivalent alternatives or equivalent variations fall within the scope of the present invention.

Claims

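1. A method for separating a speech signal, comprising the steps of:

step 1: establishing a linear instantaneous mixing model of the observation signals, specifically:

$$x(t) = A\,s(t) = \sum_{i=1}^{M} a_i s_i(t)$$

where $x(t) = [x_1(t), x_2(t), \ldots, x_N(t)]^T$ is the N-dimensional observation signal vector, $A = [a_1, a_2, \ldots, a_M]$ is the N × M mixing matrix, $s(t) = [s_1(t), s_2(t), \ldots, s_M(t)]^T$ is the M-dimensional source signal vector, t is the time sample index, and $a_i$ denotes the i-th column vector of the mixing matrix;

step 2: removing all zero column vectors from the observation signals, and then mapping the observation signals symmetrically into the upper half-plane;

step 3: separating the source signals using improved l₁-norm minimization: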

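the l₁-norm minimization criterion is:

$$\min_{s(t)} \sum_{i=1}^{M} |s_i(t)| \quad \text{subject to} \quad x(t) = A\,s(t)$$

the specific steps are as follows:

(3a) calculating the direction angle θ(t) of the observation signal at time t and the direction angles α_i of the column vectors of the mixing matrix:

The calculation formulas are as follows:

$\theta(t) = \arctan\big(x_2(t)/x_1(t)\big)$

$\alpha_i = \arctan(a_{i2}/a_{i1}), \quad i = 1, 2, \ldots, M$

where $x_1(t)$ and $x_2(t)$ are the two observation signals, and $a_{in}$ denotes the n-th element of the i-th column vector of the mixing matrix;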

(3b) calculating the direction of the resultant of any two column vectors of the mixing matrix using the law of sines and the law of cosines:

the specific process is as follows:

$\angle AOB = \angle AOx - \angle BOx$

$AB^2 = OA^2 + OB^2 - 2\,OA \cdot OB \cos\angle AOB$

$OC^2 = OA^2 + AC^2 - 2\,OA \cdot AC \cos\angle OAC$

$\angle COx = \angle AOx - \angle AOC$

where the vectors OA and OB are any two column vectors of the estimated mixing matrix, the angles ∠AOx and ∠BOx are the directions of the corresponding column vectors of the mixing matrix, and the lengths of OA and OB equal the lengths of those column vectors;

(3c) calculating the angle Δθ between θ(t) and α_i:

if Δθ = 0, use:

$x(t) = a_i s_i(t)$

where x(t) is the observation signal vector at time t, a_i is the i-th column vector of the mixing matrix, and s_i(t) is the i-th source signal estimated at time t;

if Δθ ≠ 0, use:

$s_r(t) = W_r\,x(t)$

where $W_r = A_r^{-1}$, $A_r = [a^c, a^d]$, and $a^c$ and $a^d$ are the two vectors closest to the observation signal vector at time t;