CN109243482A - Miniature-array speech noise-reduction method based on improved ACRANC and beamforming
- Publication number: CN109243482A (application CN201811275824.4A)
- Authority: CN (China)
- Prior art keywords: voice, signal, filter, ACRANC, microphone
- Legal status: Granted (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Classifications
- G—PHYSICS
- G10—MUSICAL INSTRUMENTS; ACOUSTICS
- G10L—SPEECH ANALYSIS OR SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING; SPEECH OR AUDIO CODING OR DECODING
- G10L21/00—Processing of the speech or voice signal to produce another audible or non-audible signal, e.g. visual or tactile, in order to modify its quality or its intelligibility
- G10L21/02—Speech enhancement, e.g. noise reduction or echo cancellation
- G10L21/0208—Noise filtering
- G10L21/0216—Noise filtering characterised by the method used for estimating noise
- G10L2021/02161—Number of inputs available containing the signal or the noise to be suppressed
- G10L2021/02166—Microphone arrays; Beamforming
Abstract
The invention discloses a miniature-array speech noise-reduction method that improves ACRANC and combines it with beamforming. It relates to speech signal processing, and the technical problem addressed is how to further improve the noise-suppression performance of the ACRANC method for speech noise reduction. The method comprises the following steps: (I) improving the ACRANC method, with substeps (1) obtaining multichannel noise-reduced, distorted speech signals by multichannel adaptive noise cancellation, and (2) feeding the multichannel distorted speech signals to the recovery filter of the ACRANC system to obtain noise-reduced speech; (II) beamforming, with substeps (1) building multiple improved ACRANC subsystems and an adaptive mode control (AMC) subsystem to obtain multichannel noise-reduced speech, and (2) applying beamforming to the multichannel noise-reduced speech to obtain better noise-reduced speech. The invention improves the quality of the output speech and further improves the speech noise-reduction effect.
Description
Technical field
The invention relates to speech signal processing, and in particular to a miniature-array speech noise-reduction method that improves ACRANC and combines it with beamforming.
Background art
Speech noise reduction can effectively improve speech quality and the recognition rate of speech recognition systems, and miniature-array speech noise reduction is one effective approach. A miniature microphone array is an array with a very small aperture, usually within 5 centimeters, and with few elements. Because miniature arrays are easy to embed in many kinds of devices, they have wide application value. The generalized sidelobe cancellation (Generalized Sidelobe Cancellation) method based on a VAD (Voice Activity Detector), abbreviated VAD-GSC, is a common and effective noise-reduction method for miniature microphone arrays. Array crosstalk-resistant adaptive noise cancellation (Array Crosstalk Resistant Adaptive Noise Cancellation, abbreviated ACRANC) is also an effective noise-reduction method for miniature microphone arrays. On many occasions, especially when the speech source is close to the array, ACRANC achieves better noise reduction than VAD-GSC and many of its improved variants.
In ACRANC, the second-stage filter has only one input signal: a single channel of distorted speech, namely the output of the first-stage filter. The function of the second-stage filter is to restore the distorted speech signal to a clean speech signal, i.e. to make its output approach the clean speech signal at the main microphone. In real environments, because of the complexity of audio propagation and the distortion that the ACRANC first-stage filter imposes on the speech signal, the speech recovered by the second-stage filter still falls short.
Summary of the invention
In view of the deficiencies of the prior art, the technical problem solved by the invention is how to further improve the noise-suppression performance of the ACRANC method.
To solve this problem, the invention adopts a miniature-array speech noise-reduction method that improves ACRANC and combines it with beamforming: multichannel distorted speech is input to the recovery filter, and the result is combined with DAS (Delay And Sum) beamforming for speech noise reduction. The method comprises the following steps:
(I) Improve the ACRANC method, with the following substeps:
(1) Obtain multichannel noise-reduced, distorted speech signals by multichannel adaptive noise cancellation, as follows:
Assume the speech signal is $s(k)$ and the noise signal is $n(k)$. Each reaches microphone $M_i$ over multiple paths and becomes $s_i(k)$ and $n_i(k)$, respectively. The propagation impulse responses from the speech source and the noise source to microphone $M_i$ are assumed to be $h_{si}(k)$ and $h_{ni}(k)$. The signal actually picked up by microphone $M_i$ is $x_i(k) = s_i(k) + n_i(k)$, where $i = 1, 2, \dots, N$ and $k = 0, 1, 2, \dots$; here $N$ is the number of microphones in the array and $k$ is the discrete time index. This gives:
$x_i(k) = s_i(k) + n_i(k)$ (1)
$s_i(k) = h_{si}(k) * s(k)$ (2)
$n_i(k) = h_{ni}(k) * n(k), \quad i = 1, 2, \dots, N$ (3)
where $*$ denotes convolution.
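The signal model of equations (1) to (3) can be illustrated with a minimal Python sketch. The function names `convolve` and `microphone_signal` are hypothetical, the impulse responses are assumed to be short finite lists, and the output is truncated to the length of the input, which is a simplification made here and not something the text specifies.

```python
def convolve(h, x):
    """Full linear convolution of impulse response h with signal x."""
    y = [0.0] * (len(h) + len(x) - 1)
    for m, hm in enumerate(h):
        for k, xk in enumerate(x):
            y[m + k] += hm * xk
    return y


def microphone_signal(h_s, h_n, s, n):
    """Return x_i(k) = h_si(k)*s(k) + h_ni(k)*n(k) for equal-length s and n.

    Assumption: the result is truncated to len(s) samples.
    """
    s_i = convolve(h_s, s)[:len(s)]   # equation (2)
    n_i = convolve(h_n, n)[:len(n)]   # equation (3)
    return [a + b for a, b in zip(s_i, n_i)]  # equation (1)
```

For instance, a direct speech path `h_s = [1.0]` together with a noise path `h_n = [0.0, 0.5]` delays the noise by one sample and halves it before it is added to the speech.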
Let the impulse response of the intermediate propagation from speech signal $s_i$ to speech signal $s_j$, and that from noise signal $n_i$ to noise signal $n_j$, be given (the two symbols are missing from the source text); then:
[Equations (4) and (5) are missing from the source text.]
In this substep, for each microphone $M_i$, the signal $x_i(k)$ it acquires serves as the main-channel signal, and the signals $x_j(k)$ ($j = 1, \dots, i-1, i+1, \dots, N$) acquired by the other $N-1$ microphones serve as reference signals. During the global silent period, i.e. when every channel is in its silent period, filter $A_i$ uses the noise in the reference signals to adaptively cancel the noise in the main channel. Outside the global silent period, the coefficients of filter $A_i$ are held fixed and the filter only produces its filtered output. This yields the multichannel distorted speech signals, for the following reason.
During the global silent period the speech signals satisfy $s_i(k) = 0$, $i = 1, 2, \dots, N$, so that
$x_i(k) = y_{i1}(k) + e_{i1}(k)$ (6)
$n_i(k) = \mathbf{w}_i \mathbf{n}_i(k) + err_i(k)$ (7)
where $x_i(k) = n_i(k)$, $e_{i1}(k) = err_i(k)$ is the prediction error, $y_{i1}(k) = \mathbf{w}_i \mathbf{n}_i(k)$ is the output of filter $A_i$, and $\mathbf{w}_i$ is the $1 \times (N-1)(L+1)$ coefficient row vector of filter $A_i$, that is:
$\mathbf{w}_i = (\mathbf{w}_{i1}, \dots, \mathbf{w}_{i(i-1)}, \mathbf{w}_{i(i+1)}, \dots, \mathbf{w}_{iN})$ (8)
where $\mathbf{w}_{ij} = (w_{ij0}, w_{ij1}, \dots, w_{ijL})$, and $\mathbf{n}_i(k)$ is the $(N-1)(L+1) \times 1$ noise-signal column vector:
$\mathbf{n}_i(k) = [\mathbf{n}_{i1}(k), \dots, \mathbf{n}_{i(i-1)}(k), \mathbf{n}_{i(i+1)}(k), \dots, \mathbf{n}_{iN}(k)]^T$ (9)
where $\mathbf{n}_{ij}(k) = [n_{ij}(k), n_{ij}(k-1), \dots, n_{ij}(k-L)]^T$ and $L$ is the number of delay samples applied to the reference-channel noise signal.
Let the minimum error power be $P[err_i^0(k)]$; the corresponding optimal coefficient vector is:
[Equation (10) is missing from the source text.]
To obtain the optimal coefficient vector and $P[err_i^0(k)]$, it suffices to adjust the coefficients of filter $A_i$ so that the sum of squares of $e_{i1}$ is minimized.
In the stage immediately following the global silent period, assuming the noise environment is constant or slowly varying, the optimal coefficients of filter $A_i$ are held fixed and the filter only produces its filtered output, so that:
[Equation (11) is missing from the source text.]
where $\mathbf{x}_i(k)$ and $\mathbf{s}_i(k)$ denote the picked-up noisy-speech vector and the clean-speech vector, respectively. From equations (6) and (11):
[Equation (12) is missing from the source text.]
where:
[Equation (13) is missing from the source text.]
Here $e_{i1}(k)$ is one channel of distorted speech containing residual noise, and $p_i(k)$ is the distorted speech within it. Equation (13) shows that $p_i(k)$ is in fact formed by distorting the $N$ channels of clean speech.
$e_{i1}(k)$ is obtained by taking the $i$-th channel as the main signal and the other channels as reference signals. Letting $i$ run from 1 to $N$, i.e. taking each channel in turn as the main signal and the rest as reference signals, yields $N$ channels of distorted speech signals containing residual noise, $e_{j1}(k)$ ($j = 1, 2, \dots, N$).
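Equations (11) to (13) do not survive in this text. The following is a hedged reconstruction only, assuming the usual derivation in which the first-stage coefficients are frozen at their optimum and the noisy-speech vector splits into a clean-speech vector plus a noise vector. The symbol $\mathbf{w}_i^0$ for the optimal coefficient vector is an assumption, as is the exact form of each line; only the roles of $e_{i1}(k)$, $p_i(k)$ and $err_i^0(k)$ are taken from the surrounding text.

```latex
% Assumed notation: \mathbf{w}_i^0 is the frozen optimal coefficient vector of A_i.
\begin{align*}
y_{i1}(k) &= \mathbf{w}_i^0\,\mathbf{x}_i(k) \\
e_{i1}(k) &= x_i(k) - \mathbf{w}_i^0\,\mathbf{x}_i(k) \\
          &= \bigl[s_i(k) - \mathbf{w}_i^0\,\mathbf{s}_i(k)\bigr]
           + \bigl[n_i(k) - \mathbf{w}_i^0\,\mathbf{n}_i(k)\bigr]
           = p_i(k) + err_i^0(k) \\
p_i(k)    &= s_i(k) - \mathbf{w}_i^0\,\mathbf{s}_i(k)
\end{align*}
```

Under this reading, $p_i(k)$ mixes the main-channel speech with filtered speech from the other $N-1$ channels, which agrees with the statement that it is a distortion of all $N$ channels of clean speech.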
(2) Feed the multichannel distorted speech signals to the recovery filter of the ACRANC system to obtain the noise-reduced speech signal, as follows:
The multichannel distorted speech signals $e_{j1}(k)$ ($j = 1, 2, \dots, N$) are input to the second-stage filter $B_i$ of the ACRANC system. In the stages outside the global silent period, the coefficients of filter $B_i$ are adjusted so that the sum of squares of its error output $e_{i2}(k)$ is minimized, where:
$\|e_{i2}(k)\|^2 = \|x_i(k) - y_{i2}(k)\|^2$
$= \|s_i(k) + n_i(k) - y_{i2}(k)\|^2$
$= \|n_i(k)\|^2 + \|s_i(k) - y_{i2}(k)\|^2 + 2 n_i(k)[s_i(k) - y_{i2}(k)]$ (14)
[Equation (15) is missing from the source text.]
Equation (15) shows that minimizing the mean square of $e_{i2}(k)$ is equivalent to minimizing $E[(s_i(k) - y_{i2}(k))^2]$, which in turn minimizes the error between $y_{i2}(k)$ and the speech $s_i(k)$. The output $y_{i2}(k)$ of filter $B_i$ can therefore approach the clean speech signal $s_i(k)$. Because the input of filter $B_i$ is not a single channel but multiple channels of distorted speech, a better noise-reduction result than with ACRANC can be obtained. This improved noise-reduced speech signal is given its own symbol, which is missing from the source text.
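Equation (15) does not survive in this text. A plausible form, offered as a sketch only, follows from taking the expectation of equation (14) under the assumption that the noise $n_i(k)$ is uncorrelated with $s_i(k) - y_{i2}(k)$, so that the cross term vanishes. That uncorrelatedness is an assumption made here, not a statement from the source.

```latex
\begin{align*}
E\bigl[e_{i2}^2(k)\bigr]
  &= E\bigl[n_i^2(k)\bigr]
   + E\bigl[(s_i(k) - y_{i2}(k))^2\bigr]
   + 2\,E\bigl[n_i(k)\,(s_i(k) - y_{i2}(k))\bigr] \\
  &\approx E\bigl[n_i^2(k)\bigr] + E\bigl[(s_i(k) - y_{i2}(k))^2\bigr]
\end{align*}
```

Since $E[n_i^2(k)]$ does not depend on the coefficients of filter $B_i$, minimizing the left-hand side over those coefficients amounts to minimizing the second term, which is the equivalence the text relies on.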
(II) Beamforming: combine the improved ACRANC with beamforming to further improve the noise-reduction effect, with the following substeps:
(1) Build multiple improved ACRANC subsystems and an adaptive mode control (AMC) subsystem to obtain multichannel noise-reduced speech, as follows:
Each channel in turn is taken as the main signal, with the remaining channels as reference signals, and one improved ACRANC is built for it, giving $N$ such subsystems.
In each improved ACRANC, the input of filter $B_i$ consists of the outputs of all filters $A_j$ ($j = 1, 2, \dots, N$), not the output of the single filter $A_i$. The adaptive mode control (AMC) decides when the filters in these subsystems update their coefficients and when they hold them fixed.
During a silent period with no speech, i.e. an NVP stage, the optimal coefficients of filter $A_i$ can be adjusted to compensate for errors caused by changes in the environment. To this end a global silent period, the ONVP stage, is defined, and the first-stage filter $A_i$ of each subsystem adjusts its optimal coefficients only during ONVP.
Let NVP($i$) denote the silent period of the $i$-th noisy speech signal $x_i(k)$ picked up by microphone $M_i$. NVP($i$) consists of a series of discrete intervals, that is:
[Equation missing from the source text.]
where the discrete interval
$[k'_{ij}, k''_{ij}] = \{k'_{ij}, k'_{ij}+1, \dots, k''_{ij}\}$
is the $j$-th NVP of $x_i(k)$. Clearly NVP($i_1$) need not equal NVP($i_2$) for $i_1 \ne i_2$, $i_1, i_2 \in \{1, 2, \dots, N\}$, but NVP($i_1$) is a translation of NVP($i_2$) along the time axis.
ONVP is defined as:
[Equation missing from the source text.]
It is then easy to show that:
[Equation (18) is missing from the source text.]
where:
[Equation missing from the source text.]
If $k''_j < k'_j$, then $[k'_j, k''_j] = \varnothing$ in definition (18).
When the optimal coefficients of filter $A_i$ are adjusted, no channel should contain any speech; otherwise the speech would be treated as noise and cancelled along with it. The coefficients of filter $A_i$ are therefore adjusted only during the following L-ONVP stage:
[Equation missing from the source text.]
where $L$ is the number of delay samples of the reference signals input to filter $A_i$, and:
$[k'_j + L, k''_j] = \{k'_j + L, k'_j + L + 1, \dots, k''_j\}$ (20)
If $k''_j < k'_j + L$, then likewise $[k'_j + L, k''_j] = \varnothing$ in definition (20).
During the L-ONVP stage, all signals in use and their delayed versions belong to the silent period and contain no speech, so the optimal coefficients of filter $A_i$ can be adjusted during L-ONVP. The NVP stage referred to above is exactly L-ONVP or a part of it.
The optimal coefficients of filter $A_i$ are adjusted during the $(\Delta, \Delta')$-ONVP stage:
[Equation (21) is missing from the source text.]
where the interval in the formula (its symbol is missing from the source text) is a discrete time interval making up NVP($i_0$) of the $i_0$-th channel. $\Delta'$ is a positive integer that can be chosen freely according to the accuracy of the VD decision; its purpose is to guarantee that the time interval used is a purely silent (speech-free) segment. $\Delta$ is likewise a freely chosen positive integer, but it must satisfy
$\Delta \ge L + \delta + \Delta'$ (22)
where $\delta$ is the time delay for noise to travel from the other microphones of the array to the $i_0$-th microphone, measured in delay samples. The maximum number of delay samples is:
[Equation (23) is missing from the source text.]
where $d_i$ is the distance between the $i_0$-th microphone and microphone $M_i$, $f$ is the sampling frequency of the array, and $c$ is the propagation speed of sound in air.
Outside the $(\Delta, \Delta')$-ONVP stage, the optimal coefficients of filter $A_i$ in each subsystem are held unchanged and filter $A_i$ is used only for filtering.
In the remaining stages outside the global silent period, the optimal coefficients of all filters $B_i$ are adjusted adaptively; for simplicity, $B_i$ may also be adapted continuously from start to finish.
(2) Obtain the final noise-reduced speech by delay-and-sum (DAS) beamforming, as follows:
The output of each subsystem is one channel of noise-reduced speech. All $N$ outputs can be fed to a beamformer to obtain a better noise-reduction result. With a common DAS beamformer, the input-output relation is:
[Equation (24) is missing from the source text.]
where $\tau_i$ is the delay with which speech reaches microphone $M_i$ relative to the reference microphone selected in the array. Any microphone in the array may serve as the reference microphone; usually the microphone at or near the center of the array is chosen.
The delay $\tau_i$ can be computed by the cross-correlation method, by the generalized cross-correlation method, or by the following procedure:
1) Choose a (δ, T)_OVP discrete time interval $[k', k'']$ such that $k \ge k'' + \delta$ and $k - (k'' + \delta)$ is as small as possible;
2) Find the $\tau_i$ that satisfies:
[Equation (25) is missing from the source text.]
If the array aperture is very small and the sampling frequency of the array signals is not very high, all $\tau_i$ can be treated as 0.
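The DAS input-output relation itself is not recoverable from this text. As a minimal sketch, assuming the textbook delay-and-sum form in which each channel is advanced by its integer delay $\tau_i$ and the aligned channels are averaged, the beamformer could look as follows. The function name `delay_and_sum`, the restriction to integer sample delays, and the zero padding at the end of each channel are all assumptions made for illustration.

```python
def delay_and_sum(channels, delays):
    """Average N equal-length channels after advancing channel i by delays[i] samples.

    channels: list of N lists of samples (the noise-reduced subsystem outputs)
    delays:   list of N non-negative integer delays tau_i, in samples
    Samples that would be read past the end of a channel count as 0 (assumption).
    """
    n_ch = len(channels)
    length = len(channels[0])
    out = []
    for k in range(length):
        acc = 0.0
        for ch, tau in zip(channels, delays):
            idx = k + tau  # advance the later-arriving channel to align it
            if 0 <= idx < length:
                acc += ch[idx]
        out.append(acc / n_ch)
    return out
```

With all delays set to 0, which is the small-aperture case described above, the sketch reduces to a plain average of the $N$ subsystem outputs.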
Compared with the prior art, the beneficial effects of the invention are as follows:
Inputting multichannel distorted speech to the recovery filter works better than the original approach of inputting only a single channel of distorted speech, so the improved ACRANC method achieves better speech noise reduction than the ordinary ACRANC method. Combining the improved ACRANC method with beamforming further improves the noise-reduction effect.
Brief description of the drawings
Fig. 1 is a flow diagram of the invention;
Fig. 2 is a schematic diagram of speech and noise propagation and crosstalk;
Fig. 3 is a schematic diagram of the improved ACRANC system;
Fig. 4 is a schematic diagram of the improved ACRANC combined with beamforming.
Specific embodiments
Specific embodiments of the invention are further described below with reference to the drawings, but they do not limit the invention.
Fig. 1 shows a miniature-array speech noise-reduction method that improves ACRANC and combines it with beamforming: multichannel distorted speech is input to the ACRANC recovery filter, and the result is combined with beamforming for speech noise reduction. The method comprises the following steps:
(I) Improve the ACRANC method, with the following substeps:
(1) Obtain multichannel noise-reduced, distorted speech signals by multichannel adaptive noise cancellation, as follows:
Assume the speech signal is $s(k)$ and the noise signal is $n(k)$. As shown in Fig. 2, each reaches microphone $M_i$ over multiple paths and becomes $s_i(k)$ and $n_i(k)$, respectively. The propagation impulse responses from the speech source and the noise source to microphone $M_i$ are assumed to be $h_{si}(k)$ and $h_{ni}(k)$. The signal actually picked up by microphone $M_i$ is $x_i(k) = s_i(k) + n_i(k)$, where $i = 1, 2, \dots, N$ and $k = 0, 1, 2, \dots$; here $N$ is the number of microphones in the array and $k$ is the discrete time index. This gives:
$x_i(k) = s_i(k) + n_i(k)$ (1)
$s_i(k) = h_{si}(k) * s(k)$ (2)
$n_i(k) = h_{ni}(k) * n(k), \quad i = 1, 2, \dots, N$ (3)
where $*$ denotes convolution.
Let the impulse response of the intermediate propagation from speech signal $s_i$ to speech signal $s_j$, and that from noise signal $n_i$ to noise signal $n_j$, be given (the two symbols are missing from the source text); then:
[Equations (4) and (5) are missing from the source text.]
In this substep, for each microphone $M_i$, the signal $x_i(k)$ it acquires serves as the main-channel signal, and the signals $x_j(k)$ ($j = 1, \dots, i-1, i+1, \dots, N$) acquired by the other $N-1$ microphones serve as reference signals. During the global silent period, i.e. when every channel is in its silent period, filter $A_i$ uses the noise in the reference signals to adaptively cancel the noise in the main channel, as shown in Fig. 3. Outside the global silent period, the coefficients of filter $A_i$ are held fixed and the filter only produces its filtered output. This yields the multichannel distorted speech signals, for the following reason.
During the global silent period the speech signals satisfy $s_i(k) = 0$, $i = 1, 2, \dots, N$, so that
$x_i(k) = y_{i1}(k) + e_{i1}(k)$ (6)
$n_i(k) = \mathbf{w}_i \mathbf{n}_i(k) + err_i(k)$ (7)
where $x_i(k) = n_i(k)$, $e_{i1}(k) = err_i(k)$ is the prediction error, $y_{i1}(k) = \mathbf{w}_i \mathbf{n}_i(k)$ is the output of filter $A_i$, and $\mathbf{w}_i$ is the $1 \times (N-1)(L+1)$ coefficient row vector of filter $A_i$, that is:
$\mathbf{w}_i = (\mathbf{w}_{i1}, \dots, \mathbf{w}_{i(i-1)}, \mathbf{w}_{i(i+1)}, \dots, \mathbf{w}_{iN})$ (8)
where $\mathbf{w}_{ij} = (w_{ij0}, w_{ij1}, \dots, w_{ijL})$, and $\mathbf{n}_i(k)$ is the $(N-1)(L+1) \times 1$ noise-signal column vector:
$\mathbf{n}_i(k) = [\mathbf{n}_{i1}(k), \dots, \mathbf{n}_{i(i-1)}(k), \mathbf{n}_{i(i+1)}(k), \dots, \mathbf{n}_{iN}(k)]^T$ (9)
where $\mathbf{n}_{ij}(k) = [n_{ij}(k), n_{ij}(k-1), \dots, n_{ij}(k-L)]^T$ and $L$ is the number of delay samples applied to the reference-channel noise signal.
Let the minimum error power be $P[err_i^0(k)]$; the corresponding optimal coefficient vector is:
[Equation (10) is missing from the source text.]
To obtain the optimal coefficient vector and $P[err_i^0(k)]$, it suffices to adjust the coefficients of filter $A_i$ so that the sum of squares of $e_{i1}$ is minimized.
In the stage immediately following the global silent period, assuming the noise environment is constant or slowly varying, the optimal coefficients of filter $A_i$ are held fixed and the filter only produces its filtered output, so that:
[Equation (11) is missing from the source text.]
where $\mathbf{x}_i(k)$ and $\mathbf{s}_i(k)$ denote the picked-up noisy-speech vector and the clean-speech vector, respectively. From equations (6) and (11):
[Equation (12) is missing from the source text.]
where:
[Equation (13) is missing from the source text.]
Here $e_{i1}(k)$ is one channel of distorted speech containing residual noise, and $p_i(k)$ is the distorted speech within it. Equation (13) shows that $p_i(k)$ is in fact formed by distorting the $N$ channels of clean speech.
$e_{i1}(k)$ is obtained by taking the $i$-th channel as the main signal and the other channels as reference signals. Letting $i$ run from 1 to $N$, i.e. taking each channel in turn as the main signal and the rest as reference signals, yields $N$ channels of distorted speech signals containing residual noise, $e_{j1}(k)$ ($j = 1, 2, \dots, N$).
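The behaviour of the first-stage filter $A_i$ described in this substep, adapting during the global silent period and only filtering otherwise, can be sketched in Python. This is an illustration under stated assumptions and not the patented implementation: it uses a plain LMS update (the description only says that LMS may be used), the step size `mu` and the function name `lms_stage1` are hypothetical, and the per-sample flag `silent` stands in for the global-silent-period decision.

```python
def lms_stage1(main, refs, silent, L, mu):
    """First-stage noise canceller for one main channel.

    main:   samples x_i(k) of the main channel
    refs:   list of the N-1 reference channels x_j(k)
    silent: list of bools, True where k lies in the global silent period
    L:      number of delay samples kept per reference channel
    mu:     LMS step size (hypothetical tuning parameter)
    Returns the error output e_i1(k) and the final coefficient list.
    """
    n_taps = len(refs) * (L + 1)
    w = [0.0] * n_taps
    e_out = []
    for k in range(len(main)):
        # Stack the current and L delayed samples of every reference channel.
        u = [ref[k - d] if k - d >= 0 else 0.0
             for ref in refs for d in range(L + 1)]
        y = sum(wi * ui for wi, ui in zip(w, u))   # filter output y_i1(k)
        e = main[k] - y                            # e_i1(k) = x_i(k) - y_i1(k)
        if silent[k]:
            # Adapt only while no channel contains speech.
            w = [wi + 2.0 * mu * e * ui for wi, ui in zip(w, u)]
        e_out.append(e)
    return e_out, w
```

Outside the silent period the coefficients stay frozen, so any speech present in the main channel passes into `e_out` as the distorted speech plus residual noise rather than being cancelled.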
(2) Feed the multichannel distorted speech signals to the recovery filter of the ACRANC system to obtain the noise-reduced speech signal, as follows:
The multichannel distorted speech signals $e_{j1}(k)$ ($j = 1, 2, \dots, N$) are input to the second-stage filter $B_i$ of the ACRANC system. In the stages outside the global silent period, the coefficients of filter $B_i$ are adjusted so that the sum of squares of its error output $e_{i2}(k)$ is minimized, where:
$\|e_{i2}(k)\|^2 = \|x_i(k) - y_{i2}(k)\|^2$
$= \|s_i(k) + n_i(k) - y_{i2}(k)\|^2$
$= \|n_i(k)\|^2 + \|s_i(k) - y_{i2}(k)\|^2 + 2 n_i(k)[s_i(k) - y_{i2}(k)]$ (14)
[Equation (15) is missing from the source text.]
Equation (15) shows that minimizing the mean square of $e_{i2}(k)$ is equivalent to minimizing $E[(s_i(k) - y_{i2}(k))^2]$, which in turn minimizes the error between $y_{i2}(k)$ and the speech $s_i(k)$. The output $y_{i2}(k)$ of filter $B_i$ can therefore approach the clean speech signal $s_i(k)$.
The input of filter $B_i$ is the $N$ channels $e_{j1}(k)$ ($j = 1, 2, \dots, N$), all of which are distorted speech signals derived from the $N$ channels of speech as in equation (13). The output approximation produced by this multichannel input is better than that produced by the single-channel input $e_{i1}(k)$. In theory, when all coefficients of filter $B_i$ applied to the other inputs $e_{j1}(k)$ ($j = 1, \dots, i-1, i+1, \dots, N$) are set to 0, the $N$-channel input degenerates to the single-channel input $e_{i1}(k)$. The improved ACRANC method can therefore do no worse than the existing ACRANC method and is expected to do better. This improved noise-reduced speech signal is given its own symbol, which is missing from the source text.
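The degenerate case mentioned above can be checked with a small sketch. It assumes filter $B_i$ is a multi-input FIR filter with one coefficient list per input channel; that structure and the function name `stage2_output` are assumptions for illustration, since the text does not give the internal form of $B_i$.

```python
def stage2_output(e_channels, b, k):
    """Output y_i2(k) of a multi-input FIR filter B_i (assumed structure).

    e_channels: N distorted-speech channels e_j1
    b:          N coefficient lists, one per input channel
    k:          time index at which the output is evaluated
    """
    y = 0.0
    for e_j, b_j in zip(e_channels, b):
        for d, coeff in enumerate(b_j):
            if k - d >= 0:
                y += coeff * e_j[k - d]
    return y
```

Setting the coefficient lists of all channels other than the $i$-th to zero makes the result identical to that of a single-input filter fed only with $e_{i1}(k)$, so the single-input case lies inside the search space of the multi-input filter.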
(II) Beamforming: combine the improved ACRANC with beamforming to further improve the noise-reduction effect, with the following substeps:
(1) Build multiple improved ACRANC subsystems and an adaptive mode control (AMC) subsystem to obtain multichannel noise-reduced speech, as follows:
Each channel in turn is taken as the main signal, with the remaining channels as reference signals, and one improved ACRANC is built for it, giving $N$ such subsystems.
In each improved ACRANC, the input of filter $B_i$ consists of the outputs of all filters $A_j$ ($j = 1, 2, \dots, N$), not the output of the single filter $A_i$. As shown in Fig. 4, the adaptive mode control (AMC) decides when the filters in these subsystems update their coefficients and when they hold them fixed.
During a silent period with no speech, i.e. an NVP stage, the optimal coefficients of filter $A_i$ can be adjusted to compensate for errors caused by changes in the environment. To this end a global silent period, the ONVP stage, is defined, and the first-stage filter $A_i$ of each subsystem adjusts its optimal coefficients only during ONVP.
Let NVP($i$) denote the silent period of the $i$-th noisy speech signal $x_i(k)$ picked up by microphone $M_i$. NVP($i$) consists of a series of discrete intervals, that is:
[Equation missing from the source text.]
where the discrete interval
$[k'_{ij}, k''_{ij}] = \{k'_{ij}, k'_{ij}+1, \dots, k''_{ij}\}$
is the $j$-th NVP of $x_i(k)$. Clearly NVP($i_1$) need not equal NVP($i_2$) for $i_1 \ne i_2$, $i_1, i_2 \in \{1, 2, \dots, N\}$, but NVP($i_1$) is a translation of NVP($i_2$) along the time axis.
ONVP is defined as:
[Equation missing from the source text.]
It is then easy to show that:
[Equation (18) is missing from the source text.]
where:
[Equation missing from the source text.]
If $k''_j < k'_j$, then $[k'_j, k''_j] = \varnothing$ in definition (18).
When the optimal coefficients of filter $A_i$ are adjusted, no channel should contain any speech; otherwise the speech would be treated as noise and cancelled along with it. The coefficients of filter $A_i$ are therefore adjusted only during the following L-ONVP stage:
[Equation missing from the source text.]
where $L$ is the number of delay samples of the reference signals input to filter $A_i$, and:
$[k'_j + L, k''_j] = \{k'_j + L, k'_j + L + 1, \dots, k''_j\}$ (20)
If $k''_j < k'_j + L$, then likewise $[k'_j + L, k''_j] = \varnothing$ in definition (20).
During the L-ONVP stage, all signals in use and their delayed versions belong to the silent period and contain no speech, so the optimal coefficients of filter $A_i$ can be adjusted during L-ONVP. The NVP stage referred to above is exactly L-ONVP or a part of it.
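The defining equations for ONVP and L-ONVP do not survive in this text, so the following Python sketch rests on two assumptions drawn from the prose: that ONVP is the set of time indices that are silent in every channel (the intersection of all NVP($i$)), and that L-ONVP shortens each ONVP interval by moving its start forward by $L$ samples as in equation (20). The function names `intersect` and `l_onvp` are hypothetical.

```python
def intersect(a, b):
    """Intersect two sorted lists of closed integer intervals [(start, end), ...]."""
    out, i, j = [], 0, 0
    while i < len(a) and j < len(b):
        lo = max(a[i][0], b[j][0])
        hi = min(a[i][1], b[j][1])
        if lo <= hi:
            out.append((lo, hi))
        # Advance whichever interval ends first.
        if a[i][1] < b[j][1]:
            i += 1
        else:
            j += 1
    return out


def l_onvp(nvp_per_channel, L):
    """Intervals in which A_i may adapt: global silent intervals with start shifted by L."""
    onvp = nvp_per_channel[0]
    for nvp in nvp_per_channel[1:]:
        onvp = intersect(onvp, nvp)
    # An interval whose shifted start passes its end is empty and is dropped.
    return [(lo + L, hi) for lo, hi in onvp if lo + L <= hi]
```

For example, two channels with silent intervals `[(0, 10), (20, 30)]` and `[(2, 12), (22, 32)]` and $L = 3$ give the adaptation intervals `[(5, 10), (25, 30)]`; with $L = 9$ both intervals become empty.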
The optimal coefficients of filter $A_i$ are adjusted during the $(\Delta, \Delta')$-ONVP stage:
[Equation (21) is missing from the source text.]
where the interval in the formula (its symbol is missing from the source text) is a discrete time interval making up NVP($i_0$) of the $i_0$-th channel. $\Delta'$ is a positive integer that can be chosen freely according to the accuracy of the VD decision; its purpose is to guarantee that the time interval used is a purely silent (speech-free) segment. $\Delta$ is likewise a freely chosen positive integer, but it must satisfy:
$\Delta \ge L + \delta + \Delta'$ (22)
where $\delta$ is the time delay for noise to travel from the other microphones of the array to the $i_0$-th microphone, measured in delay samples. The maximum number of delay samples is:
[Equation (23) is missing from the source text.]
where $d_i$ is the distance between the $i_0$-th microphone and microphone $M_i$, $f$ is the sampling frequency of the array, and $c$ is the propagation speed of sound in air.
Outside the $(\Delta, \Delta')$-ONVP stage, the optimal coefficients of filter $A_i$ in each subsystem are held unchanged and filter $A_i$ is used only for filtering.
In the remaining stages outside the global silent period, the optimal coefficients of all filters $B_i$ are adjusted adaptively; for simplicity, $B_i$ may also be adapted continuously from start to finish.
(2) Obtain the final noise-reduced speech by delay-and-sum (DAS) beamforming, as follows:
The output of each subsystem is one channel of noise-reduced speech. All $N$ outputs can be fed to a beamformer to obtain a better noise-reduction result. With a common DAS beamformer, the input-output relation is:
[Equation (24) is missing from the source text.]
where $\tau_i$ is the delay with which speech reaches microphone $M_i$ relative to the reference microphone selected in the array. Any microphone in the array may serve as the reference microphone; usually the microphone at or near the center of the array is chosen.
The delay $\tau_i$ can be computed by the cross-correlation method, by the generalized cross-correlation method, or by the following procedure:
1) Choose a (δ, T)_OVP discrete time interval $[k', k'']$ such that $k \ge k'' + \delta$ and $k - (k'' + \delta)$ is as small as possible;
2) Find the $\tau_i$ that satisfies:
[Equation (25) is missing from the source text.]
If the array aperture is very small and the sampling frequency of the array signals is not very high, all $\tau_i$ can be treated as 0.
For example, if the distance from every microphone $M_i$ in the array to the reference microphone is less than 2 centimeters and the snapshot sampling frequency of the array is 8000 Hz, the maximum delay is less than half a sampling interval, so all $\tau_i = 0$ may be taken.
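The half-sample claim in this example can be checked numerically. The sketch below assumes a speed of sound of 340 m/s, a value the text does not state, and expresses the propagation delay over a distance $d$ as $d \cdot f / c$ samples; the function name `max_delay_samples` is hypothetical.

```python
def max_delay_samples(distance_m, fs_hz, c_m_per_s=340.0):
    """Propagation delay over distance_m, expressed in sampling intervals.

    c_m_per_s defaults to 340 m/s, an assumed speed of sound in air.
    """
    return distance_m * fs_hz / c_m_per_s
```

For a 2-centimeter spacing at 8000 Hz, `max_delay_samples(0.02, 8000)` evaluates to 160/340, roughly 0.47 of a sampling interval, which is below one half and so rounds to a delay of zero samples.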
(III) Computational complexity
Fig. 4 shows the noise-reduction process of the improved ACRANC combined with DAS beamforming. The AMC and the DAS beamformer both require little computation, and the AMC can in fact be realized by a VAD (Voice Activity Detector). The computational complexity of the method therefore depends mainly on the improved ACRANC algorithm in the $N$ subsystems, and for each improved ACRANC it depends in turn on the adaptive algorithm used by the filters $A_i$ and $B_i$. If the LMS adaptive algorithm is used, it is not hard to show that the computation of the improved ACRANC algorithm over the $N$ subsystems does not exceed
$(2_A + 3_M)[(L+1)(N-1) + (L_B+1)N] N f$ (26)
where $2_A$ denotes 2 additions, $3_M$ denotes 3 multiplications, $L$ is the number of reference-signal delay samples in formula (10) that determines the order of filter $A_i$, $N$ is the number of microphones in the array, $L_B$ is the order of filter $B_i$, and $f$ is the sampling rate of the microphone array. Because many chips can perform one addition and one multiplication in a single operation, the actual computation time is much less than that indicated by formula (26).
For example, with $L = 24$ determining the length of filter $A_i$, $L_B = 20$ determining the length of filter $B_i$, sampling frequency $f = 8000$ and an array of $N = 5$ microphones, formula (26) gives a computational load of no more than 41 MFLOPS.
Compared with the prior art, the beneficial effects of the invention are as follows:
Inputting multichannel distorted speech to the recovery filter works better than the original approach of inputting only a single channel of distorted speech, so the improved ACRANC method achieves better speech noise reduction than the ordinary ACRANC method. Combining the improved ACRANC method with beamforming further improves the noise-reduction effect.
Embodiments of the invention have been described in detail above with reference to the drawings, but the invention is not limited to the described embodiments. For those skilled in the art, changes, modifications, substitutions and variants made to these embodiments without departing from the principle and spirit of the invention still fall within the scope of protection of the invention.
Claims (6)
1. A miniature-array speech noise-reduction method that improves ACRANC and combines it with beamforming, characterized in that multichannel distorted speech is input to the recovery filter and the result is combined with beamforming for speech noise reduction, comprising the following steps:
(I) improving the ACRANC method, with the following substeps:
(1) obtaining multichannel noise-reduced, distorted speech signals by multichannel adaptive noise cancellation;
(2) feeding the multichannel distorted speech signals to the recovery filter of the ACRANC system to obtain noise-reduced speech;
(II) beamforming: combining the improved ACRANC with beamforming to further improve the noise-reduction effect, with the following substeps:
(1) building multiple improved ACRANC subsystems and an adaptive mode control (AMC) subsystem to obtain multichannel noise-reduced speech;
(2) obtaining the final noise-reduced speech by delay-and-sum (DAS) beamforming.
2. improving the miniature array voice de-noising method of ACRANC and Wave beam forming as described in claim 1, which is characterized in that
(1) detailed process is as follows step by step in step (1):
Assuming that voice signal is s (k), noise signal is n (k), they arrive separately at microphone M by mulitpathiAnd it is converted to
Signal si(k) and ni(k);Microphone M is reached from speech source and noise sourceiPropagation impulse response be assumed to be hsi(k) and hni
(k);Microphone MiThe signal actually picked up is expressed as xi(k)=si(k)+ni(k), wherein i=1,2 ... N, k=0,1,2 ...,
N indicates the number of microphone in array in formula, and k is discrete time serial number, obtains:
xi(k)=si(k)+ni(k) (1)
si(k)=hsi(k)*s(k) (2)
ni(k)=hni(k) * n (k) i=1,2 ..., N (3)
* is convolution algorithm symbol in formula;
If voice signal siTo voice signal sjThe intermediate shock response propagated beAnd noise signal niTo noise signal
njIntermediate propagate shock response and beThen:
This step by step in, to each microphone Mi, with microphone MiThe signal x of acquisitioni(k) main path signal, and other N- are used as
The signal x that 1 microphone obtainsj(k) (j=1 ..., i-1, i+1 ..., N) it is used as reference signal;In global silent period, i.e.,
Each road signal is all the noiseless stage, passes through filter AiIt goes adaptively to offset in main road with the noise in multichannel reference signal
Noise;And in non-global silent period, it keeps the coefficient of filter Ai constant, only makees filtering output;Then, it can get multichannel
Distort voice signal;The reason is as follows that:
Due to the voice signal s in global silent periodi(k)=0, i=1,2 ..., N, so that
xi(k)=yi1(k)+ei1(k) (6)
ni(k)=wini(k)+erri(k) (7)
X in formulai(k)=ni(k), ei1(k)=erriIt (k) is prediction error, yi1=winiIt (k) is filter AiOutput, wiIt is 1
The filter A of × (N-1) (L+1) dimensioniCoefficient row vector, that is:
wi=(wi1,…,wi(i-1),wi(i+1)…,wN) (8)
W in formulaij=(wij0,wij1,…,wijL),niIt (k) is the noise signal column vector of (N-1) (L+1) × 1 dimension;
ni(k)=[ni1(k),…,ni(i-1)(k),ni(i+1)(k),…,niN(k)]T (9)
N in formulaij(k)=[nij(k),nij(k-1),…,nij(k-L)]T, L is the number of samples of reference channel noise signal delay;
Let the minimum error power be P[err_i^0(k)], with the corresponding optimal coefficient vector:
To obtain this optimal coefficient vector and P[err_i^0(k)], it suffices to adjust the coefficients of filter A_i so that the sum of squares of e_i1 is minimized;
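As an illustration of the first-stage filter A_i, the sketch below fits the coefficients on a speech-free stretch of data and then applies them to produce e_i1(k) = x_i(k) - y_i1(k). It uses a batch least-squares solve in place of a sample-by-sample adaptive update, which is an assumption; the text does not name the adaptation algorithm. The function names are hypothetical.

```python
import numpy as np

def reference_matrix(x, i, L):
    """Stack delayed samples of every channel except i.

    Row for time k (k = L .. K-1) holds x_j(k), x_j(k-1), ..., x_j(k-L)
    for each j != i, so the matrix has (N-1)(L+1) columns.
    """
    N, K = x.shape
    cols = []
    for j in range(N):
        if j == i:
            continue
        for d in range(L + 1):
            cols.append(x[j, L - d : K - d])
    return np.stack(cols, axis=1)

def fit_filter_a(x_silent, i, L):
    """Least-squares coefficients w_i that minimize the sum of squares of e_i1."""
    R = reference_matrix(x_silent, i, L)
    w, *_ = np.linalg.lstsq(R, x_silent[i, L:], rcond=None)
    return w

def apply_filter_a(x, i, L, w):
    """Return e_i1(k) = x_i(k) - y_i1(k) for k = L .. K-1 with fixed coefficients."""
    return x[i, L:] - reference_matrix(x, i, L) @ w
```

In use, `fit_filter_a` would be called on samples from the global silent period, and `apply_filter_a` on the samples that follow, with `w` held fixed.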
In the stage immediately following the global silent period, assuming the noise environment is constant or slowly varying, the optimal coefficients of filter A_i are held constant and the filter only produces filtered output, which gives:
where x_i(k) and s_i(k) denote the picked-up noisy speech vector and the clean speech vector, respectively. From formula (6) and formula (11):
where:
Here e_i1(k) is one channel of distorted speech containing residual noise, and p_i(k) is the distorted speech within it; formula (13) shows that p_i(k) in fact arises from distortion of the N channels of clean speech signals;
e_i1(k) is obtained by taking the i-th channel as the main signal and the other channels as reference signals. Letting i run from 1 to N, i.e. taking each channel in turn as the main signal and the remaining channels as reference signals, yields N channels of distorted speech signals containing residual noise, e_j1(k) (j = 1, 2, …, N).
3. The micro-array speech noise reduction method with improved ACRANC and beamforming according to claim 1, characterized in that sub-step (2) of step (1) proceeds in detail as follows:
The multichannel distorted speech signals e_j1(k) (j = 1, 2, …, N) are input to the second-stage filter B_i of the ACRANC system. In the stages outside the global silent period, the coefficients of filter B_i are adjusted so that the sum of squares of its output error e_i2(k) is minimized, where:
||e_i2(k)||^2 = ||x_i(k) - y_i2(k)||^2
= ||s_i(k) + n_i(k) - y_i2(k)||^2
= ||n_i(k)||^2 + ||s_i(k) - y_i2(k)||^2 + 2 n_i(k)[s_i(k) - y_i2(k)] (14)
Formula (15) shows that minimizing the mean square of e_i2(k) is equivalent to minimizing E[(s_i(k) - y_i2(k))^2], which in turn amounts to minimizing the error between y_i2(k) and the speech s_i(k); the output y_i2(k) of filter B_i can therefore approximate the clean speech signal s_i(k). Because the input to filter B_i is not a single channel but multichannel distorted speech, a better speech noise-reduction effect than ACRANC can be obtained. The improved denoised speech signal is denoted
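The role of the second-stage filter B_i can be sketched as a regression of the noisy main channel x_i(k) onto the N distorted speech channels e_j1(k). Because the noise in x_i(k) is assumed uncorrelated with the distorted speech, the fitted output y_i2(k) tends toward the speech component. The batch least-squares solve, the tap count `P` and the function name are assumptions made for illustration; the text does not specify the adaptation algorithm or the filter length.

```python
import numpy as np

def fit_filter_b(e1, x_i, P):
    """Fit B_i on data containing speech and return (coefficients, y_i2).

    e1  : array of shape (N, K), the distorted speech channels e_j1(k)
    x_i : array of shape (K,), the noisy main-channel signal
    P   : assumed number of delay taps per input channel
    The returned y_i2 covers k = P .. K-1.
    """
    e1 = np.asarray(e1, dtype=float)
    x_i = np.asarray(x_i, dtype=float)
    N, K = e1.shape
    cols = [e1[j, P - d : K - d] for j in range(N) for d in range(P + 1)]
    E = np.stack(cols, axis=1)
    # Minimize the sum of squares of e_i2(k) = x_i(k) - y_i2(k).
    w, *_ = np.linalg.lstsq(E, x_i[P:], rcond=None)
    return w, E @ w
```

A sample-by-sample adaptive update could replace the batch solve; the batch form is used here only because it is short and easy to check.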
4. The micro-array speech noise reduction method with improved ACRANC and beamforming according to claim 1, characterized in that sub-step (1) of step (2) proceeds in detail as follows:
Each channel in turn is taken as the main signal with the remaining channels as reference signals, and an improved ACRANC is established for each such arrangement, so that N such subsystems are built;
In each improved ACRANC, the input to filter B_i is the output of all filters A_i (i = 1, 2, …, N), rather than the output of a single filter A_i. Adaptive mode control (AMC) determines when the filters in these subsystems update their coefficients and when they hold their coefficients fixed;
During a silent period without speech, i.e. an NVP stage, the optimal coefficients of filter A_i can be adjusted to compensate for errors caused by changes in environmental factors. To this end, a global silent period, i.e. the ONVP stage, is defined, and the first-stage filter A_i of each subsystem adjusts its optimal coefficients only during ONVP;
Let NVP(i) denote the silent periods of the i-th noisy speech signal x_i(k) picked up by microphone M_i. NVP(i) consists of a series of discrete intervals, i.e.:
where the discrete interval:
[k'_ij, k''_ij] = {k'_ij, k'_ij + 1, …, k''_ij}
is the j-th NVP of x_i(k). Clearly NVP(i_1) need not equal NVP(i_2) for i_1 ≠ i_2, i_1, i_2 ∈ {1, 2, …, N}, but NVP(i_1) is a translation of NVP(i_2) along the time axis;
ONVP is defined as:
It is then easy to show that:
where:
If k''_j < k'_j, then [k'_j, k''_j] = ∅ in definition (18);
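The global silent period can be sketched in code. The defining equation for ONVP is not shown in this text, so the sketch assumes that ONVP is the set of time indices that are silent in every channel, as the description of the global silent period suggests. The per-channel silence masks are assumed to come from a voice detector, and the function names are hypothetical.

```python
import numpy as np

def mask_to_intervals(mask):
    """Return inclusive (k', k'') index pairs for each run of True in mask."""
    m = np.concatenate(([False], np.asarray(mask, dtype=bool), [False]))
    d = np.diff(m.astype(int))
    starts = np.flatnonzero(d == 1)
    ends = np.flatnonzero(d == -1) - 1
    return list(zip(starts.tolist(), ends.tolist()))

def global_silent_intervals(nvp_masks):
    """Intervals in which every channel is silent (assumed meaning of ONVP).

    nvp_masks : array-like of shape (N, K), True where channel i is in NVP(i)
    """
    onvp = np.all(np.asarray(nvp_masks, dtype=bool), axis=0)
    return mask_to_intervals(onvp)
```

For two channels whose silent periods are shifted by one sample, only the overlapping samples survive in the result.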
When the optimal coefficients of filter A_i are adjusted, no channel may contain speech; otherwise the speech would be cancelled along with the noise. The coefficients of filter A_i are therefore adjusted only in the following L-ONVP stage:
where L is the number of delay samples of the reference signal input to filter A_i, and:
[k'_j + L, k''_j] = {k'_j + L, k'_j + L + 1, …, k''_j} (20)
If k''_j < k'_j + L, then likewise [k'_j + L, k''_j] = ∅ in definition (26);
In the L-ONVP stage, all signals used, including their delayed samples, belong to the silent period and contain no speech, so the optimal coefficients of filter A_i can be adjusted in the L-ONVP stage. The NVP stage referred to above is precisely L-ONVP or a part of L-ONVP;
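Equation (20) and the empty-interval rule that follows it translate directly into code. The sketch below takes the global silent intervals as inclusive (k'_j, k''_j) pairs and drops the first L samples of each one, so that every delayed reference sample used by filter A_i also lies in the silent period. The function name `l_onvp` is hypothetical.

```python
def l_onvp(onvp_intervals, L):
    """Shrink each inclusive interval (k', k'') to (k' + L, k'') per equation (20).

    Intervals with k'' < k' + L are empty and are dropped.
    """
    result = []
    for k_start, k_end in onvp_intervals:
        new_start = k_start + L
        if k_end >= new_start:
            result.append((new_start, k_end))
    return result
```

With L = 4, the interval (10, 20) becomes (14, 20), while a two-sample interval such as (30, 31) disappears.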
The optimal coefficients of filter A_i are adjusted in the (Δ, Δ')-ONVP stage:
where the discrete time intervals are those that make up NVP(i_0) of the i_0-th channel signal; Δ' is a positive integer that can be chosen freely according to the accuracy of the VD decision, its purpose being to ensure that the time intervals used are genuinely speech-free; Δ is likewise a freely chosen positive integer, but must satisfy:
Δ ≥ L + δ + Δ' (22)
where δ is the time delay with which noise propagates from the other microphones of the array to the i_0-th microphone, counted in delay samples; the maximum number of delay samples is:
where d_i is the distance between microphone M_i0 and microphone M_i, f is the sampling frequency of the array, and c is the propagation speed of the audio signal in air;
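The constraint in formula (22) can be checked numerically. The formula for the maximum delay is not reproduced in this text; the sketch assumes it is the largest value of d_i · f / c over the microphones, rounded up to a whole sample. The rounding rule, the default speed of sound of 343 m/s and the function names are assumptions.

```python
import math

def max_delay_samples(distances_m, fs_hz, c_mps=343.0):
    """Largest inter-microphone propagation delay, in whole samples (assumed form)."""
    return max(math.ceil(d * fs_hz / c_mps) for d in distances_m)

def min_delta(L, delta, delta_prime):
    """Smallest Δ that satisfies formula (22): Δ >= L + δ + Δ'."""
    return L + delta + delta_prime
```

For microphones 2 cm and 5 cm from M_i0 at a 16 kHz sampling rate, the longer path gives 0.05 × 16000 / 343 ≈ 2.33, i.e. 3 samples after rounding up; with L = 8 and Δ' = 2 the smallest admissible Δ is then 13.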
In the stages outside (Δ, Δ')-ONVP, the optimal coefficients of filter A_i in each subsystem are held unchanged and filter A_i is used only for filtering;
In the remaining stages outside the global silent period, the optimal coefficients of all filters B_i are adjusted adaptively.
5. The micro-array speech noise reduction method with improved ACRANC and beamforming according to claim 1, characterized in that sub-step (2) of step (2) proceeds in detail as follows:
The output of each subsystem is one channel of denoised speech, and all N outputs can be fed into a beamformer to obtain a better speech noise-reduction effect. If the common DAS (delay-and-sum) beamformer is used, its input-output relationship can be described as:
where τ_i is the delay with which speech reaches microphone M_i relative to the reference microphone selected in the array. The reference microphone may be any microphone in the array; the microphone located at or near the centre of the array is generally chosen.
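A delay-and-sum beamformer over the N subsystem outputs can be sketched as follows. The input-output formula is not reproduced in this text; the sketch assumes the standard form, in which each channel is advanced by its delay τ_i and the aligned channels are averaged. Delays are assumed to be whole samples, and the function name is hypothetical.

```python
import numpy as np

def delay_and_sum(channels, delays):
    """Average the channels after advancing channel i by delays[i] samples.

    channels : array of shape (N, K), the denoised output of each subsystem
    delays   : N integers, τ_i in samples relative to the reference microphone
    Samples shifted in from outside the record are treated as zero.
    """
    channels = np.asarray(channels, dtype=float)
    n_ch, n = channels.shape
    out = np.zeros(n)
    for y, tau in zip(channels, delays):
        aligned = np.zeros(n)
        if tau >= 0:
            aligned[: n - tau] = y[tau:]    # advance a late channel
        else:
            aligned[-tau:] = y[: n + tau]   # hold back an early channel
        out += aligned
    return out / n_ch
```

When all τ_i are zero, as suggested for very small apertures, the function reduces to a plain average of the N channels.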
6. The micro-array speech noise reduction method with improved ACRANC and beamforming according to claim 5, characterized in that the delay τ_i can be calculated with the cross-correlation method, the generalized cross-correlation method, or the following method:
1) choose a (δ, T) _O VP discrete time interval [k', k''] that satisfies k ≥ k'' + δ, with k - (k'' + δ) as small as possible;
2) find the τ_i that satisfies:
If the aperture of the microphone array is very small and the sampling frequency of the array signals is not very high, all τ_i can be treated as 0.
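Of the methods named in this claim, plain cross-correlation is the simplest to illustrate. The sketch below returns the integer lag at which a channel best matches the reference channel. The function name and the `max_lag` search bound are assumptions, and the result has whole-sample resolution only.

```python
import numpy as np

def estimate_delay(ref, y, max_lag):
    """Return the lag in samples at which y best matches ref.

    A positive result means y arrives later than ref.
    """
    ref = np.asarray(ref, dtype=float)
    y = np.asarray(y, dtype=float)
    # corr[m] corresponds to lag lags[m]; c(lag) = sum over n of y[n + lag] * ref[n]
    corr = np.correlate(y, ref, mode="full")
    lags = np.arange(-(len(ref) - 1), len(y))
    keep = np.abs(lags) <= max_lag
    return int(lags[keep][np.argmax(corr[keep])])
```

Bounding the search with `max_lag` reflects the fact that the delay between microphones of a small array cannot exceed the propagation time across the array.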
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201811275824.4A CN109243482B (en) | 2018-10-30 | 2018-10-30 | Micro-array voice noise reduction method for improving ACROC and beam forming |
Publications (2)
Publication Number | Publication Date |
---|---|
CN109243482A true CN109243482A (en) | 2019-01-18 |
CN109243482B CN109243482B (en) | 2022-03-18 |
Family
ID=65079322
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN201811275824.4A (Active, CN109243482B (en)) | Micro-array voice noise reduction method for improving ACROC and beam forming | 2018-10-30 | 2018-10-30 |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN109243482B (en) |
Cited By (2)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN112951260A (en) * | 2021-03-02 | 2021-06-11 | 桂林电子科技大学 | Method for enhancing voice of double microphones |
CN117278896A (en) * | 2023-11-23 | 2023-12-22 | 深圳市昂思科技有限公司 | Voice enhancement method and device based on double microphones and hearing aid equipment |
Citations (5)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN1529528A (en) * | 2003-09-28 | 2004-09-15 | 曾庆宁 | Multi sampling rate array signal noise-removing method |
US20060222184A1 (en) * | 2004-09-23 | 2006-10-05 | Markus Buck | Multi-channel adaptive speech signal processing system with noise reduction |
CN105575397A (en) * | 2014-10-08 | 2016-05-11 | 展讯通信(上海)有限公司 | Voice noise reduction method and voice collection device |
CN105814627A (en) * | 2013-12-16 | 2016-07-27 | 哈曼贝克自动系统股份有限公司 | Active noise control system |
CN106024001A (en) * | 2016-05-03 | 2016-10-12 | 电子科技大学 | Method used for improving speech enhancement performance of microphone array |
Non-Patent Citations (2)
Title |
---|
QINGNING ZENG ET AL.: "Speech Enhancement by Multi-Channel Crosstalk Resistant Adaptive Noise Cancellation", 《2006 IEEE INTERNATIONAL CONFERENCE ON ACOUSTICS SPEECH AND SIGNAL PROCESSING PROCEEDINGS》 * |
曾庆宁等: "基于阵列抗串扰自适应噪声抵消的语音增强", 《电子学报》 * |
Also Published As
Publication number | Publication date |
---|---|
CN109243482B (en) | 2022-03-18 |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | ||
SE01 | Entry into force of request for substantive examination | ||
GR01 | Patent grant | ||