CN109243482A - Miniature-array speech denoising method with improved ACRANC and beamforming - Google Patents

Info

Publication number
CN109243482A
Authority
CN
China
Prior art keywords
voice
signal
filter
acranc
microphone
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Granted
Application number
CN201811275824.4A
Other languages
Chinese (zh)
Other versions
CN109243482B (en)
Inventor
曾庆宁
罗瀛
方韶劻
林凤梅
谢先明
龙超
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Shenzhen Aangsi Science & Technology Co Ltd
Original Assignee
Shenzhen Aangsi Science & Technology Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Shenzhen Aangsi Science & Technology Co Ltd
Priority to CN201811275824.4A
Publication of CN109243482A
Application granted
Publication of CN109243482B
Legal status: Active

Classifications

    • G: PHYSICS
    • G10: MUSICAL INSTRUMENTS; ACOUSTICS
    • G10L: SPEECH ANALYSIS OR SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING; SPEECH OR AUDIO CODING OR DECODING
    • G10L21/00: Processing of the speech or voice signal to produce another audible or non-audible signal, e.g. visual or tactile, in order to modify its quality or its intelligibility
    • G10L21/02: Speech enhancement, e.g. noise reduction or echo cancellation
    • G10L21/0208: Noise filtering
    • G10L21/0216: Noise filtering characterised by the method used for estimating noise
    • G10L2021/02161: Number of inputs available containing the signal or the noise to be suppressed
    • G10L2021/02166: Microphone arrays; Beamforming

Abstract

The invention discloses a miniature-array speech denoising method that improves ACRANC and combines it with beamforming. It relates to speech signal processing and addresses the technical problem of how to further improve the noise-suppression performance of the ACRANC method in speech denoising. The method comprises the following steps. (I) Improve the ACRANC method, with the sub-steps: (1) obtain multi-channel noise-reduced distorted speech signals by multi-channel adaptive noise cancellation; (2) use the multi-channel distorted speech signals as the input of the restoration filter in the ACRANC system to obtain denoised speech. (II) Beamforming, with the sub-steps: (1) build multiple improved-ACRANC subsystems and an adaptive mode control (AMC) subsystem to obtain multi-channel denoised speech; (2) apply beamforming to the multi-channel denoised speech to obtain better denoised speech. The invention improves the quality of the output speech and further improves the speech denoising effect.

Description

Miniature-array speech denoising method with improved ACRANC and beamforming
Technical field
The present invention relates to speech signal processing, and in particular to a miniature-array speech denoising method that improves ACRANC and combines it with beamforming.
Background
Speech denoising can effectively improve speech quality and the recognition rate of speech recognition systems, and miniature-array speech denoising is one effective approach. A miniature microphone array is an array with a very small aperture, usually within 5 centimetres, and with few elements. Because miniature arrays are easy to embed in many kinds of devices, they have broad application value. The generalized sidelobe canceller (GSC) based on a voice activity detector (VAD), abbreviated VAD-GSC, is a common and effective speech denoising method for miniature microphone arrays. Array crosstalk-resistant adaptive noise cancellation (Array Crosstalk Resistant Adaptive Noise Cancellation, abbreviated ACRANC) is another effective method for miniature microphone arrays; in many situations, especially when the speech source is close to the array, ACRANC gives better noise reduction than VAD-GSC and many of its improved variants.
In ACRANC, the second-stage filter has only a single input, namely the distorted speech signal output by the first-stage filter. The function of the second-stage filter is to restore clean speech from this distorted speech, so that its output approaches the clean speech signal in the main microphone. Because audio propagation in real environments is complex and the first-stage ACRANC filter distorts the speech signal, the speech restored by the second-stage filter still falls short.
Summary of the invention
In view of the deficiencies of the prior art, the technical problem addressed by the invention is how to further improve the noise-suppression performance of the ACRANC method.
To solve this problem, the invention provides a miniature-array speech denoising method that improves ACRANC and combines it with beamforming: multi-channel distorted speech is fed to the restoration filter, and the result is combined with DAS (Delay And Sum) beamforming for speech denoising. The method comprises the following steps:
(I) Improve the ACRANC method, with the following sub-steps:
(1) Obtain multi-channel noise-reduced distorted speech signals by multi-channel adaptive noise cancellation, as follows:
Let the speech signal be $s(k)$ and the noise signal be $n(k)$. They reach microphone $M_i$ over multiple paths and become the signals $s_i(k)$ and $n_i(k)$. The propagation impulse responses from the speech source and the noise source to microphone $M_i$ are denoted $h_{si}(k)$ and $h_{ni}(k)$. The signal actually picked up by microphone $M_i$ is $x_i(k)=s_i(k)+n_i(k)$, where $i=1,2,\dots,N$ and $k=0,1,2,\dots$; $N$ is the number of microphones in the array and $k$ is the discrete time index. Then:
$$x_i(k)=s_i(k)+n_i(k) \tag{1}$$
$$s_i(k)=h_{si}(k)*s(k) \tag{2}$$
$$n_i(k)=h_{ni}(k)*n(k),\quad i=1,2,\dots,N \tag{3}$$
where $*$ denotes convolution.
The speech signals $s_i$ and $s_j$ of two channels are likewise related by an intermediate propagation impulse response, as are the noise signals $n_i$ and $n_j$; these two relations are equations (4) and (5).
In this sub-step, for each microphone $M_i$, the signal $x_i(k)$ picked up by $M_i$ is taken as the main-channel signal, and the signals $x_j(k)$ ($j=1,\dots,i-1,i+1,\dots,N$) picked up by the other $N-1$ microphones are taken as reference signals. During the global silent period, that is, when every channel contains no speech, filter $A_i$ uses the noise in the reference signals to adaptively cancel the noise in the main channel. Outside the global silent period, the coefficients of filter $A_i$ are held fixed and the filter only produces output. Multi-channel distorted speech signals are obtained in this way, for the following reason.
During the global silent period the speech signals satisfy $s_i(k)=0$, $i=1,2,\dots,N$, so that
$$x_i(k)=y_{i1}(k)+e_{i1}(k) \tag{6}$$
$$n_i(k)=w_i\mathbf{n}_i(k)+err_i(k) \tag{7}$$
where $x_i(k)=n_i(k)$, $e_{i1}(k)=err_i(k)$ is the prediction error, $y_{i1}(k)=w_i\mathbf{n}_i(k)$ is the output of filter $A_i$, and $w_i$ is the $1\times(N-1)(L+1)$ coefficient row vector of filter $A_i$:
$$w_i=(w_{i1},\dots,w_{i(i-1)},w_{i(i+1)},\dots,w_{iN}) \tag{8}$$
with $w_{ij}=(w_{ij0},w_{ij1},\dots,w_{ijL})$. $\mathbf{n}_i(k)$ is the $(N-1)(L+1)\times 1$ noise-signal column vector
$$\mathbf{n}_i(k)=[\mathbf{n}_{i1}^T(k),\dots,\mathbf{n}_{i(i-1)}^T(k),\mathbf{n}_{i(i+1)}^T(k),\dots,\mathbf{n}_{iN}^T(k)]^T \tag{9}$$
with $\mathbf{n}_{ij}(k)=[n_{ij}(k),n_{ij}(k-1),\dots,n_{ij}(k-L)]^T$, where $L$ is the number of delay samples of the reference-channel noise signal.
Let the minimum error power be $P[err_i^0(k)]$, and denote the corresponding optimal coefficient vector, given by equation (10), by $w_i^0$. To obtain both, it suffices to adjust the coefficients of filter $A_i$ so that the sum of squares of $e_{i1}$ is minimized.
In the stage immediately following the global silent period, assuming the noise environment is constant or slowly varying, the optimal coefficients of filter $A_i$, written $w_i^0$ here, are held fixed and the filter only produces output, so that
$$y_{i1}(k)=w_i^0\mathbf{x}_i(k)=w_i^0[\mathbf{s}_i(k)+\mathbf{n}_i(k)] \tag{11}$$
where $\mathbf{x}_i(k)$ and $\mathbf{s}_i(k)$ denote the picked-up noisy-speech vector and the clean-speech vector, respectively. From equations (6) and (11):
$$e_{i1}(k)=x_i(k)-y_{i1}(k)=p_i(k)+err_i^0(k) \tag{12}$$
where
$$p_i(k)=s_i(k)-w_i^0\mathbf{s}_i(k) \tag{13}$$
and $err_i^0(k)=n_i(k)-w_i^0\mathbf{n}_i(k)$ is the residual noise, by equation (7).
Here $e_{i1}(k)$ is one channel of distorted speech containing residual noise, and $p_i(k)$ is its distorted-speech component; by equation (13), $p_i(k)$ is in fact a distorted combination of the clean speech signals of the $N$ channels.
$e_{i1}(k)$ is obtained by taking the $i$-th channel as the main signal and the other channels as reference signals. Letting $i$ run from 1 to $N$, so that each channel in turn serves as the main signal with the rest as references, gives $N$ channels of distorted speech containing residual noise, $e_{j1}(k)$ ($j=1,2,\dots,N$).
(2) Use the multi-channel distorted speech signals as the input of the restoration filter in the ACRANC system to obtain the denoised speech signal, as follows:
The multi-channel distorted speech signals $e_{j1}(k)$ ($j=1,2,\dots,N$) are fed to the second-stage filter $B_i$ of the ACRANC system. In the stages outside the global silent period, the coefficients of filter $B_i$ are adjusted so that the sum of squares of its error output $e_{i2}(k)$ is minimized, where:
$$\begin{aligned}\|e_{i2}(k)\|^2 &= \|x_i(k)-y_{i2}(k)\|^2\\ &= \|s_i(k)+n_i(k)-y_{i2}(k)\|^2\\ &= \|n_i(k)\|^2+\|s_i(k)-y_{i2}(k)\|^2+2n_i(k)[s_i(k)-y_{i2}(k)]\end{aligned} \tag{14}$$
Taking expectations of (14), with the noise uncorrelated with the speech terms so that the cross term vanishes, gives
$$E[e_{i2}^2(k)]=E[n_i^2(k)]+E[(s_i(k)-y_{i2}(k))^2] \tag{15}$$
By equation (15), minimizing $E[e_{i2}^2(k)]$ is equivalent to minimizing $E[(s_i(k)-y_{i2}(k))^2]$, that is, the error between $y_{i2}(k)$ and the speech $s_i(k)$. The output $y_{i2}(k)$ of filter $B_i$ can therefore approach the clean speech signal $s_i(k)$. Because the input of filter $B_i$ is not a single channel but multi-channel distorted speech, a better denoising result than with the original ACRANC can be obtained; this improved denoised speech signal is the output passed on to the next step.
(II) Beamforming: combine the improved ACRANC with beamforming to further improve speech denoising, with the following sub-steps:
(1) Build multiple improved-ACRANC subsystems and an adaptive mode control (AMC) subsystem to obtain multi-channel denoised speech, as follows:
An improved ACRANC is built for each channel in turn, taking that channel as the main signal and all remaining channels as reference signals, giving $N$ such subsystems.
In each improved ACRANC, the input of filter $B_i$ is the outputs of all filters $A_i$ ($i=1,2,\dots,N$), not the output of a single filter $A_i$. The adaptive mode controller AMC decides when the filters in these subsystems update their coefficients and when the coefficients are held fixed.
During a silent period containing no speech, i.e. an NVP stage, the optimal coefficients of filter $A_i$ can be adjusted to compensate for errors caused by changes in the environment. To this end a global silent period, the ONVP stage, is defined, and the first-stage filter $A_i$ of every subsystem adjusts its optimal coefficients only during the ONVP.
Let NVP($i$) denote the silent periods of the $i$-th noisy speech signal $x_i(k)$ picked up by microphone $M_i$. NVP($i$) consists of a series of discrete intervals:
$$\mathrm{NVP}(i)=\bigcup_j\,[k'_{ij},k''_{ij}] \tag{16}$$
where the discrete interval
$$[k'_{ij},k''_{ij}]=\{k'_{ij},k'_{ij}+1,\dots,k''_{ij}\}$$
is the $j$-th NVP of $x_i(k)$. Clearly NVP($i_1$) need not equal NVP($i_2$) for $i_1\ne i_2$, $i_1,i_2\in\{1,2,\dots,N\}$, but NVP($i_1$) is a translation of NVP($i_2$) along the time axis.
The ONVP is defined as the set of instants at which every channel is silent:
$$\mathrm{ONVP}=\bigcap_{i=1}^{N}\mathrm{NVP}(i) \tag{17}$$
It is then easy to show that
$$\mathrm{ONVP}=\bigcup_j\,[k'_j,k''_j] \tag{18}$$
where
$$k'_j=\max_{1\le i\le N}k'_{ij},\qquad k''_j=\min_{1\le i\le N}k''_{ij}$$
and $[k'_j,k''_j]=\varnothing$ in (18) whenever $k''_j<k'_j$.
When the optimal coefficients of filter $A_i$ are adjusted, no channel may contain speech; otherwise the speech would be treated as noise and cancelled as well. The coefficients of filter $A_i$ are therefore adjusted only during the following L-ONVP stage:
$$L\text{-}\mathrm{ONVP}=\bigcup_j\,[k'_j+L,k''_j] \tag{19}$$
where $L$ is the number of delay samples of the reference signals fed to filter $A_i$, and
$$[k'_j+L,k''_j]=\{k'_j+L,k'_j+L+1,\dots,k''_j\} \tag{20}$$
Likewise, $[k'_j+L,k''_j]=\varnothing$ in (20) whenever $k''_j<k'_j+L$.
During the L-ONVP stage, every signal and every delayed sample used lies within a silent period and contains no speech, so the optimal coefficients of filter $A_i$ can be adjusted in this stage. The NVP stage referred to earlier is exactly the L-ONVP or a part of it.
The optimal coefficients of filter $A_i$ are adjusted during the $(\Delta,\Delta')$-ONVP stage defined by equation (21). There $[k'_{i_0 j},k''_{i_0 j}]$ are the discrete time intervals that make up NVP($i_0$) of the $i_0$-th channel; $\Delta'$ is a positive integer that can be chosen according to the accuracy of the VD decision, so as to guarantee that the time interval used is a pure-noise segment; and $\Delta$ is also a freely chosen positive integer, which must satisfy
$$\Delta\ge L+\delta+\Delta' \tag{22}$$
where $\delta$ is the propagation delay of the noise between the other microphones of the array and the $i_0$-th microphone, counted in samples. The maximum number of delay samples is given by equation (23), in which $d_i$ is the distance between microphone $M_{i_0}$ and microphone $M_i$, $f$ is the sampling frequency of the array, and $c$ is the propagation speed of sound in air.
Outside the $(\Delta,\Delta')$-ONVP stage, the optimal coefficients of filter $A_i$ in every subsystem are held unchanged and filter $A_i$ is used for filtering only.
In the remaining stages outside the global silent period, the optimal coefficients of all filters $B_i$ are adjusted adaptively; for simplicity, $B_i$ may also be adapted continuously from beginning to end.
(2) Obtain the final denoised speech by delay-and-sum (DAS) beamforming, as follows:
The output of each subsystem is one channel of denoised speech, and all $N$ outputs can be fed to a beamformer to obtain a better denoising result. If a common DAS beamformer is used, its input-output relation is described by equation (24), in which $\tau_i$ is the delay with which the speech reaches microphone $M_i$ relative to a reference microphone selected in the array. Any microphone of the array may serve as the reference microphone; usually the one at or near the centre of the array is chosen.
The delay $\tau_i$ can be computed by cross-correlation, by generalized cross-correlation, or by the following method:
1) Choose a $(\delta,T)$-OVP discrete time interval $[k',k'']$ such that $k\ge k''+\delta$ and $k-(k''+\delta)$ is as small as possible;
2) Find the $\tau_i$ that satisfies equation (25).
If the aperture of the microphone array is very small and the sampling frequency of the array signals is not very high, all $\tau_i$ can be treated as 0.
Compared with the prior art, the invention has the following beneficial effects:
Feeding multi-channel distorted speech to the restoration filter works better than feeding it a single channel of distorted speech as in the original method. The improved ACRANC method gives better speech denoising than the ordinary ACRANC method, and combining the improved ACRANC method with beamforming improves the noise reduction further.
Brief description of the drawings
Fig. 1 is a flow diagram of the invention;
Fig. 2 is a schematic diagram of speech and noise propagation and crosstalk;
Fig. 3 is a schematic diagram of the improved ACRANC system;
Fig. 4 is a schematic diagram of the improved ACRANC combined with beamforming.
Detailed description of the embodiments
Specific embodiments of the invention are further described below with reference to the drawings; they do not limit the invention.
Fig. 1 shows a miniature-array speech denoising method that improves ACRANC and combines it with beamforming: multi-channel distorted speech is fed to the ACRANC restoration filter, and the result is combined with beamforming for speech denoising. The method comprises the following steps:
(I) Improve the ACRANC method, with the following sub-steps:
(1) Obtain multi-channel noise-reduced distorted speech signals by multi-channel adaptive noise cancellation, as follows:
Let the speech signal be $s(k)$ and the noise signal be $n(k)$. As shown in Fig. 2, they reach microphone $M_i$ over multiple paths and become the signals $s_i(k)$ and $n_i(k)$. The propagation impulse responses from the speech source and the noise source to microphone $M_i$ are denoted $h_{si}(k)$ and $h_{ni}(k)$. The signal actually picked up by microphone $M_i$ is $x_i(k)=s_i(k)+n_i(k)$, where $i=1,2,\dots,N$ and $k=0,1,2,\dots$; $N$ is the number of microphones in the array and $k$ is the discrete time index. Then:
$$x_i(k)=s_i(k)+n_i(k) \tag{1}$$
$$s_i(k)=h_{si}(k)*s(k) \tag{2}$$
$$n_i(k)=h_{ni}(k)*n(k),\quad i=1,2,\dots,N \tag{3}$$
where $*$ denotes convolution.
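As a minimal sketch of the signal model in equations (1) to (3), the following NumPy snippet synthesises the microphone signals from one speech source and one noise source. The function name `simulate_array`, the short impulse responses and the random stand-in signals are illustrative assumptions, not part of the patent.

```python
import numpy as np

def simulate_array(s, n, h_s, h_n):
    """Return an (N, K) array whose row i is x_i(k) = h_si * s + h_ni * n."""
    K = len(s)
    rows = []
    for h_si, h_ni in zip(h_s, h_n):
        s_i = np.convolve(s, h_si)[:K]   # equation (2)
        n_i = np.convolve(n, h_ni)[:K]   # equation (3)
        rows.append(s_i + n_i)           # equation (1)
    return np.stack(rows)

# Hypothetical 3-microphone example with made-up impulse responses.
rng = np.random.default_rng(0)
s = rng.standard_normal(1000)            # stand-in for the speech s(k)
n = rng.standard_normal(1000)            # stand-in for the noise n(k)
h_s = [np.array([1.0, 0.3]), np.array([0.9, 0.2]), np.array([0.8, 0.1])]
h_n = [np.array([0.5, 0.4]), np.array([0.6, 0.3]), np.array([0.7, 0.2])]
x = simulate_array(s, n, h_s, h_n)       # x[i] plays the role of x_i(k)
```

Truncating each convolution to the first K samples keeps all channels the same length, which is a simplification made here for convenience.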
The speech signals $s_i$ and $s_j$ of two channels are likewise related by an intermediate propagation impulse response, as are the noise signals $n_i$ and $n_j$; these two relations are equations (4) and (5).
In this sub-step, for each microphone $M_i$, the signal $x_i(k)$ picked up by $M_i$ is taken as the main-channel signal, and the signals $x_j(k)$ ($j=1,\dots,i-1,i+1,\dots,N$) picked up by the other $N-1$ microphones are taken as reference signals. During the global silent period, that is, when every channel contains no speech, filter $A_i$ uses the noise in the reference signals to adaptively cancel the noise in the main channel, as shown in Fig. 3. Outside the global silent period, the coefficients of filter $A_i$ are held fixed and the filter only produces output. Multi-channel distorted speech signals are obtained in this way, for the following reason.
During the global silent period the speech signals satisfy $s_i(k)=0$, $i=1,2,\dots,N$, so that
$$x_i(k)=y_{i1}(k)+e_{i1}(k) \tag{6}$$
$$n_i(k)=w_i\mathbf{n}_i(k)+err_i(k) \tag{7}$$
where $x_i(k)=n_i(k)$, $e_{i1}(k)=err_i(k)$ is the prediction error, $y_{i1}(k)=w_i\mathbf{n}_i(k)$ is the output of filter $A_i$, and $w_i$ is the $1\times(N-1)(L+1)$ coefficient row vector of filter $A_i$:
$$w_i=(w_{i1},\dots,w_{i(i-1)},w_{i(i+1)},\dots,w_{iN}) \tag{8}$$
with $w_{ij}=(w_{ij0},w_{ij1},\dots,w_{ijL})$. $\mathbf{n}_i(k)$ is the $(N-1)(L+1)\times 1$ noise-signal column vector
$$\mathbf{n}_i(k)=[\mathbf{n}_{i1}^T(k),\dots,\mathbf{n}_{i(i-1)}^T(k),\mathbf{n}_{i(i+1)}^T(k),\dots,\mathbf{n}_{iN}^T(k)]^T \tag{9}$$
with $\mathbf{n}_{ij}(k)=[n_{ij}(k),n_{ij}(k-1),\dots,n_{ij}(k-L)]^T$, where $L$ is the number of delay samples of the reference-channel noise signal.
Let the minimum error power be $P[err_i^0(k)]$, and denote the corresponding optimal coefficient vector, given by equation (10), by $w_i^0$. To obtain both, it suffices to adjust the coefficients of filter $A_i$ so that the sum of squares of $e_{i1}$ is minimized.
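The first-stage behaviour described above (adapt during the global silent period, otherwise only filter) can be sketched as follows. This is a hedged illustration that assumes a plain LMS update, which the description names only as one possible adaptive algorithm; the function name `stage_a`, the step size `mu` and the boolean mask `silent` are hypothetical.

```python
import numpy as np

def stage_a(x, i, L, silent, mu=0.01):
    """First-stage filter A_i of one subsystem (illustrative LMS version).

    x      : (N, K) array of microphone signals
    i      : index of the main channel
    L      : number of delay samples per reference channel
    silent : boolean array of length K, True inside the global silent period
    Returns the error output e_i1 and the final coefficient vector.
    """
    N, K = x.shape
    refs = [j for j in range(N) if j != i]
    w = np.zeros(len(refs) * (L + 1))
    e = np.zeros(K)
    padded = np.concatenate([np.zeros((N, L)), x], axis=1)  # zeros before k = 0
    for k in range(K):
        # Stack x_j(k), x_j(k-1), ..., x_j(k-L) for every reference channel j.
        u = np.concatenate([padded[j, k:k + L + 1][::-1] for j in refs])
        e[k] = x[i, k] - w @ u          # e_i1(k) = x_i(k) - y_i1(k), equation (6)
        if silent[k]:
            w += mu * e[k] * u          # update only while no speech is present
    return e, w
```

Calling `stage_a` once per main channel `i` yields the $N$ distorted speech signals $e_{j1}(k)$ used by the second stage.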
In the stage immediately following the global silent period, assuming the noise environment is constant or slowly varying, the optimal coefficients of filter $A_i$, written $w_i^0$ here, are held fixed and the filter only produces output, so that
$$y_{i1}(k)=w_i^0\mathbf{x}_i(k)=w_i^0[\mathbf{s}_i(k)+\mathbf{n}_i(k)] \tag{11}$$
where $\mathbf{x}_i(k)$ and $\mathbf{s}_i(k)$ denote the picked-up noisy-speech vector and the clean-speech vector, respectively. From equations (6) and (11):
$$e_{i1}(k)=x_i(k)-y_{i1}(k)=p_i(k)+err_i^0(k) \tag{12}$$
where
$$p_i(k)=s_i(k)-w_i^0\mathbf{s}_i(k) \tag{13}$$
and $err_i^0(k)=n_i(k)-w_i^0\mathbf{n}_i(k)$ is the residual noise, by equation (7).
Here $e_{i1}(k)$ is one channel of distorted speech containing residual noise, and $p_i(k)$ is its distorted-speech component; by equation (13), $p_i(k)$ is in fact a distorted combination of the clean speech signals of the $N$ channels.
$e_{i1}(k)$ is obtained by taking the $i$-th channel as the main signal and the other channels as reference signals. Letting $i$ run from 1 to $N$, so that each channel in turn serves as the main signal with the rest as references, gives $N$ channels of distorted speech containing residual noise, $e_{j1}(k)$ ($j=1,2,\dots,N$).
(2) Use the multi-channel distorted speech signals as the input of the restoration filter in the ACRANC system to obtain the denoised speech signal, as follows:
The multi-channel distorted speech signals $e_{j1}(k)$ ($j=1,2,\dots,N$) are fed to the second-stage filter $B_i$ of the ACRANC system. In the stages outside the global silent period, the coefficients of filter $B_i$ are adjusted so that the sum of squares of its error output $e_{i2}(k)$ is minimized, where:
$$\begin{aligned}\|e_{i2}(k)\|^2 &= \|x_i(k)-y_{i2}(k)\|^2\\ &= \|s_i(k)+n_i(k)-y_{i2}(k)\|^2\\ &= \|n_i(k)\|^2+\|s_i(k)-y_{i2}(k)\|^2+2n_i(k)[s_i(k)-y_{i2}(k)]\end{aligned} \tag{14}$$
Taking expectations of (14), with the noise uncorrelated with the speech terms so that the cross term vanishes, gives
$$E[e_{i2}^2(k)]=E[n_i^2(k)]+E[(s_i(k)-y_{i2}(k))^2] \tag{15}$$
By equation (15), minimizing $E[e_{i2}^2(k)]$ is equivalent to minimizing $E[(s_i(k)-y_{i2}(k))^2]$, that is, the error between $y_{i2}(k)$ and the speech $s_i(k)$. The output $y_{i2}(k)$ of filter $B_i$ can therefore approach the clean speech signal $s_i(k)$.
The input of filter $B_i$ is the $N$ channels $e_{j1}(k)$ ($j=1,2,\dots,N$), each of which is distorted speech derived from the $N$ speech channels according to equation (13). The approximation produced by this multi-channel input is better than that produced by the single input $e_{i1}(k)$. In theory, if all coefficients of filter $B_i$ applied to the other inputs $e_{j1}(k)$ ($j=1,\dots,i-1,i+1,\dots,N$) are set to 0, the $N$-channel input degenerates to the single-input case $e_{i1}(k)$. The improved ACRANC method above can therefore do no worse than the existing ACRANC method and in general does better; its improved denoised speech signal is the output passed on to the next step.
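The restoration stage can be sketched in the same style. The snippet below is an assumption-laden illustration, again using LMS: the function name `stage_b`, the step size `mu` and the mask `adapt` are hypothetical, and the main-channel signal $x_i(k)$ is used as the desired signal so that the update reduces $e_{i2}^2(k)$ as in equation (14).

```python
import numpy as np

def stage_b(e1, x_i, L_B, adapt, mu=0.01):
    """Second-stage (restoration) filter B_i with multi-channel input.

    e1    : (N, K) array, row j is the first-stage output e_j1(k)
    x_i   : length-K main-channel signal x_i(k), used as the desired signal
    L_B   : order of B_i (L_B + 1 taps per input channel)
    adapt : boolean array of length K, True where B_i may update
    Returns y_i2(k), the estimate of the clean speech s_i(k).
    """
    N, K = e1.shape
    w = np.zeros(N * (L_B + 1))
    y = np.zeros(K)
    padded = np.concatenate([np.zeros((N, L_B)), e1], axis=1)
    for k in range(K):
        # Taps e_j1(k), ..., e_j1(k - L_B) for all N input channels.
        u = np.concatenate([padded[j, k:k + L_B + 1][::-1] for j in range(N)])
        y[k] = w @ u
        if adapt[k]:
            w += mu * (x_i[k] - y[k]) * u   # LMS step on e_i2(k) = x_i(k) - y_i2(k)
    return y
```

Setting the taps of all channels other than channel `i` to zero would reduce this to the single-input restoration filter of the original ACRANC, which mirrors the degeneration argument in the text.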
(II) Beamforming: combine the improved ACRANC with beamforming to further improve speech denoising, with the following sub-steps:
(1) Build multiple improved-ACRANC subsystems and an adaptive mode control (AMC) subsystem to obtain multi-channel denoised speech, as follows:
An improved ACRANC is built for each channel in turn, taking that channel as the main signal and all remaining channels as reference signals, giving $N$ such subsystems.
In each improved ACRANC, the input of filter $B_i$ is the outputs of all filters $A_i$ ($i=1,2,\dots,N$), not the output of a single filter $A_i$. As shown in Fig. 4, the adaptive mode controller AMC decides when the filters in these subsystems update their coefficients and when the coefficients are held fixed.
During a silent period containing no speech, i.e. an NVP stage, the optimal coefficients of filter $A_i$ can be adjusted to compensate for errors caused by changes in the environment. To this end a global silent period, the ONVP stage, is defined, and the first-stage filter $A_i$ of every subsystem adjusts its optimal coefficients only during the ONVP.
Let NVP($i$) denote the silent periods of the $i$-th noisy speech signal $x_i(k)$ picked up by microphone $M_i$. NVP($i$) consists of a series of discrete intervals:
$$\mathrm{NVP}(i)=\bigcup_j\,[k'_{ij},k''_{ij}] \tag{16}$$
where the discrete interval
$$[k'_{ij},k''_{ij}]=\{k'_{ij},k'_{ij}+1,\dots,k''_{ij}\}$$
is the $j$-th NVP of $x_i(k)$. Clearly NVP($i_1$) need not equal NVP($i_2$) for $i_1\ne i_2$, $i_1,i_2\in\{1,2,\dots,N\}$, but NVP($i_1$) is a translation of NVP($i_2$) along the time axis.
The ONVP is defined as the set of instants at which every channel is silent:
$$\mathrm{ONVP}=\bigcap_{i=1}^{N}\mathrm{NVP}(i) \tag{17}$$
It is then easy to show that
$$\mathrm{ONVP}=\bigcup_j\,[k'_j,k''_j] \tag{18}$$
where
$$k'_j=\max_{1\le i\le N}k'_{ij},\qquad k''_j=\min_{1\le i\le N}k''_{ij}$$
and $[k'_j,k''_j]=\varnothing$ in (18) whenever $k''_j<k'_j$.
When the optimal coefficients of filter $A_i$ are adjusted, no channel may contain speech; otherwise the speech would be treated as noise and cancelled as well. The coefficients of filter $A_i$ are therefore adjusted only during the following L-ONVP stage:
$$L\text{-}\mathrm{ONVP}=\bigcup_j\,[k'_j+L,k''_j] \tag{19}$$
where $L$ is the number of delay samples of the reference signals fed to filter $A_i$, and
$$[k'_j+L,k''_j]=\{k'_j+L,k'_j+L+1,\dots,k''_j\} \tag{20}$$
Likewise, $[k'_j+L,k''_j]=\varnothing$ in (20) whenever $k''_j<k'_j+L$.
During the L-ONVP stage, every signal and every delayed sample used lies within a silent period and contains no speech, so the optimal coefficients of filter $A_i$ can be adjusted in this stage. The NVP stage referred to earlier is exactly the L-ONVP or a part of it.
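The interval bookkeeping behind the ONVP and L-ONVP stages can be illustrated with a few lines of plain Python. The sketch assumes each channel's silent periods are already available as sorted lists of closed integer intervals, for instance from a voice activity detector; the function names and the example intervals are hypothetical.

```python
def intersect(a, b):
    """Intersect two sorted lists of closed integer intervals (lo, hi)."""
    out, p, q = [], 0, 0
    while p < len(a) and q < len(b):
        lo = max(a[p][0], b[q][0])
        hi = min(a[p][1], b[q][1])
        if lo <= hi:                 # empty intersections are dropped
            out.append((lo, hi))
        if a[p][1] < b[q][1]:        # advance the interval that ends first
            p += 1
        else:
            q += 1
    return out

def onvp(nvps):
    """Global silent period: instants at which every channel is silent."""
    result = nvps[0]
    for nvp in nvps[1:]:
        result = intersect(result, nvp)
    return result

def l_onvp(intervals, L):
    """Skip the first L samples of each interval so delayed taps are silent too."""
    return [(lo + L, hi) for lo, hi in intervals if lo + L <= hi]

# Hypothetical per-channel silent periods for a 3-microphone array.
nvps = [
    [(0, 99), (300, 420), (500, 510)],
    [(2, 101), (302, 422), (502, 512)],
    [(1, 100), (301, 421), (501, 511)],
]
global_silence = onvp(nvps)
adapt_intervals = l_onvp(global_silence, L=24)
```

With these made-up intervals the third global silent interval is shorter than `L` samples, so it is discarded and filter $A_i$ would adapt only inside the first two.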
The optimal coefficients of filter $A_i$ are adjusted during the $(\Delta,\Delta')$-ONVP stage defined by equation (21). There $[k'_{i_0 j},k''_{i_0 j}]$ are the discrete time intervals that make up NVP($i_0$) of the $i_0$-th channel; $\Delta'$ is a positive integer that can be chosen according to the accuracy of the VD decision, so as to guarantee that the time interval used is a pure-noise segment; and $\Delta$ is also a freely chosen positive integer, which must satisfy
$$\Delta\ge L+\delta+\Delta' \tag{22}$$
where $\delta$ is the propagation delay of the noise between the other microphones of the array and the $i_0$-th microphone, counted in samples. The maximum number of delay samples is given by equation (23), in which $d_i$ is the distance between microphone $M_{i_0}$ and microphone $M_i$, $f$ is the sampling frequency of the array, and $c$ is the propagation speed of sound in air.
Outside the $(\Delta,\Delta')$-ONVP stage, the optimal coefficients of filter $A_i$ in every subsystem are held unchanged and filter $A_i$ is used for filtering only.
In the remaining stages outside the global silent period, the optimal coefficients of all filters $B_i$ are adjusted adaptively; for simplicity, $B_i$ may also be adapted continuously from beginning to end.
(2) Obtain the final denoised speech by delay-and-sum (DAS) beamforming, as follows:
The output of each subsystem is one channel of denoised speech, and all $N$ outputs can be fed to a beamformer to obtain a better denoising result. If a common DAS beamformer is used, its input-output relation is described by equation (24), in which $\tau_i$ is the delay with which the speech reaches microphone $M_i$ relative to a reference microphone selected in the array. Any microphone of the array may serve as the reference microphone; usually the one at or near the centre of the array is chosen.
The delay $\tau_i$ can be computed by cross-correlation, by generalized cross-correlation, or by the following method:
1) Choose a $(\delta,T)$-OVP discrete time interval $[k',k'']$ such that $k\ge k''+\delta$ and $k-(k''+\delta)$ is as small as possible;
2) Find the $\tau_i$ that satisfies equation (25).
If the aperture of the microphone array is very small and the sampling frequency of the array signals is not very high, all $\tau_i$ can be treated as 0.
For example, if the distance from every microphone $M_i$ in the array to the reference microphone is less than 2 centimetres and the snapshot sampling frequency of the array is 8000 Hz, the maximum delay is less than half a sampling interval, so all $\tau_i$ may as well be taken as 0.
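A delay-and-sum beamformer and the delay check from the example above can be sketched as follows. The sign convention (channel $i$ lags the reference by $\tau_i$ samples and is advanced by that amount), the integer delays, the helper names and the speed of sound of 340 m/s are assumptions made for this illustration; the patent's own input-output relation is its equation (24).

```python
import numpy as np

def shift(y, tau):
    """Return z with z[k] = y[k + tau], zero-filled at the ends."""
    z = np.zeros_like(y)
    if tau > 0:
        z[:-tau] = y[tau:]
    elif tau < 0:
        z[-tau:] = y[:tau]
    else:
        z[:] = y
    return z

def das(channels, taus):
    """Delay-and-sum: undo each channel's delay tau_i, then average."""
    return np.mean([shift(y, tau) for y, tau in zip(channels, taus)], axis=0)

def max_delay_samples(d, f, c=340.0):
    """Largest inter-microphone delay in samples for spacing d (m) at rate f (Hz)."""
    return d * f / c

# Spacing of 2 cm at 8000 Hz: the delay stays below half a sample.
delay = max_delay_samples(0.02, 8000)
```

Because `delay` is below 0.5 for this geometry, rounding to whole samples gives zero, which is consistent with taking all $\tau_i=0$; `das` then reduces to a plain average of the $N$ subsystem outputs.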
(III) Computational complexity
Fig. 4 shows the speech denoising process that combines the improved ACRANC with DAS beamforming. The computational cost of the AMC and of the DAS beamformer is small, and the AMC can in fact be implemented by a VAD (Voice Activity Detector). The computational complexity of the method therefore depends mainly on the improved-ACRANC algorithms of the $N$ subsystems, and for each improved ACRANC it depends in turn on the adaptive algorithm used by filters $A_i$ and $B_i$. If the LMS adaptive algorithm is used, it is not hard to show that the computational cost of the improved-ACRANC algorithms of the $N$ subsystems does not exceed
$$(2_A+3_M)[(L+1)(N-1)+(L_B+1)N]Nf \tag{26}$$
where $2_A$ denotes 2 additions, $3_M$ denotes 3 multiplications, $L$ is the number of reference-signal delay samples in equation (10) that determines the order of filter $A_i$, $N$ is the number of microphones in the array, $L_B$ is the order of filter $B_i$, and $f$ is the sampling rate of the microphone array. Because many chips can perform one addition and one multiplication in a single operation, the actual computation time is much less than equation (26) suggests.
For example, with $L=24$ determining the length of filter $A_i$, $L_B=20$ determining the length of filter $B_i$, sampling frequency $f=8000$ Hz and an array of $N=5$ microphones, equation (26) gives a computational cost of at most 41 MFLOPS.
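The 41 MFLOPS figure can be checked directly from equation (26). The helper below is a small sketch with a hypothetical function name; it counts 2 additions and 3 multiplications per filter tap and per sample, as the formula does.

```python
def acranc_ops_per_second(L, L_B, N, f, adds=2, mults=3):
    """Upper bound of equation (26) on operations per second for N subsystems."""
    taps = (L + 1) * (N - 1) + (L_B + 1) * N   # taps of A_i plus taps of B_i
    return (adds + mults) * taps * N * f

ops = acranc_ops_per_second(L=24, L_B=20, N=5, f=8000)
mflops = ops / 1e6
```

With the values of the example, each subsystem has $25\cdot 4+21\cdot 5=205$ taps, so the bound is $5\cdot 205\cdot 5\cdot 8000=41{,}000{,}000$ operations per second.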
Compared with the prior art, the invention has the following beneficial effects:
Feeding multi-channel distorted speech to the restoration filter works better than feeding it a single channel of distorted speech as in the original method. The improved ACRANC method gives better speech denoising than the ordinary ACRANC method, and combining the improved ACRANC method with beamforming improves the noise reduction further.
Embodiments of the invention have been described in detail above with reference to the drawings, but the invention is not limited to the described embodiments. For those skilled in the art, changes, modifications, substitutions and variations made to these embodiments without departing from the principle and spirit of the invention still fall within the scope of protection of the invention.

Claims (6)

1. A miniature-array speech denoising method that improves ACRANC and combines it with beamforming, characterized in that multi-channel distorted speech is fed to the restoration filter and the result is combined with beamforming for speech denoising, the method comprising the following steps:
(I) improving the ACRANC method, with the following sub-steps:
(1) obtaining multi-channel noise-reduced distorted speech signals by multi-channel adaptive noise cancellation;
(2) using the multi-channel distorted speech signals as the input of the restoration filter in the ACRANC system to obtain denoised speech;
(II) beamforming: combining the improved ACRANC with beamforming to further improve speech denoising, with the following sub-steps:
(1) building multiple improved-ACRANC subsystems and an adaptive mode control (AMC) subsystem to obtain multi-channel denoised speech;
(2) obtaining the final denoised speech by delay-and-sum (DAS) beamforming.
2. The miniature-array speech denoising method that improves ACRANC and combines it with beamforming according to claim 1, characterized in that sub-step (1) of step (I) proceeds as follows:
Let the speech signal be $s(k)$ and the noise signal be $n(k)$. They reach microphone $M_i$ over multiple paths and become the signals $s_i(k)$ and $n_i(k)$. The propagation impulse responses from the speech source and the noise source to microphone $M_i$ are denoted $h_{si}(k)$ and $h_{ni}(k)$. The signal actually picked up by microphone $M_i$ is $x_i(k)=s_i(k)+n_i(k)$, where $i=1,2,\dots,N$ and $k=0,1,2,\dots$; $N$ is the number of microphones in the array and $k$ is the discrete time index. Then:
$$x_i(k)=s_i(k)+n_i(k) \tag{1}$$
$$s_i(k)=h_{si}(k)*s(k) \tag{2}$$
$$n_i(k)=h_{ni}(k)*n(k),\quad i=1,2,\dots,N \tag{3}$$
where $*$ denotes convolution.
The speech signals $s_i$ and $s_j$ of two channels are likewise related by an intermediate propagation impulse response, as are the noise signals $n_i$ and $n_j$; these two relations are equations (4) and (5).
This step by step in, to each microphone Mi, with microphone MiThe signal x of acquisitioni(k) main path signal, and other N- are used as The signal x that 1 microphone obtainsj(k) (j=1 ..., i-1, i+1 ..., N) it is used as reference signal;In global silent period, i.e., Each road signal is all the noiseless stage, passes through filter AiIt goes adaptively to offset in main road with the noise in multichannel reference signal Noise;And in non-global silent period, it keeps the coefficient of filter Ai constant, only makees filtering output;Then, it can get multichannel Distort voice signal;The reason is as follows that:
In the global silent period the speech signals satisfy $s_i(k) = 0$, $i = 1, 2, \ldots, N$, so that
$$x_i(k) = y_{i1}(k) + e_{i1}(k) \tag{6}$$
$$n_i(k) = w_i \mathbf{n}_i(k) + err_i(k) \tag{7}$$
where $x_i(k) = n_i(k)$, $e_{i1}(k) = err_i(k)$ is the prediction error, $y_{i1}(k) = w_i \mathbf{n}_i(k)$ is the output of filter $A_i$, and $w_i$ is the $1 \times (N-1)(L+1)$ coefficient row vector of filter $A_i$, that is:
$$w_i = (w_{i1}, \ldots, w_{i(i-1)}, w_{i(i+1)}, \ldots, w_{iN}) \tag{8}$$
where $w_{ij} = (w_{ij0}, w_{ij1}, \ldots, w_{ijL})$, and $\mathbf{n}_i(k)$ is the $(N-1)(L+1) \times 1$ noise signal column vector:
$$\mathbf{n}_i(k) = [\mathbf{n}_{i1}(k), \ldots, \mathbf{n}_{i(i-1)}(k), \mathbf{n}_{i(i+1)}(k), \ldots, \mathbf{n}_{iN}(k)]^T \tag{9}$$
where $\mathbf{n}_{ij}(k) = [n_{ij}(k), n_{ij}(k-1), \ldots, n_{ij}(k-L)]^T$, and $L$ is the number of delay samples applied to the reference-channel noise signals;
Let the minimum error power be $P[err_i^0(k)]$, with the corresponding optimal coefficient vector given by: […]
To obtain this optimal coefficient vector and $P[err_i^0(k)]$, it suffices to adjust the coefficients of filter $A_i$ so that the sum of squares of $e_{i1}(k)$ is minimized;
In the stage immediately following the global silent period, assuming that the noise environment is constant or slowly varying, the optimal coefficients of filter $A_i$ are held constant and the filter only produces its filtered output, which gives: […]
where $\mathbf{x}_i(k)$ and $\mathbf{s}_i(k)$ denote the picked-up noisy speech vector and the clean speech vector respectively. From formulas (6) and (11): […]
where: […]
The above $e_{i1}(k)$ is one channel of distorted speech containing residual noise, and $p_i(k)$ is its distorted speech component. Formula (13) shows that $p_i(k)$ in fact arises from the distortion of the $N$ channels of clean speech signals;
$e_{i1}(k)$ is obtained by taking the $i$-th channel as the main signal and the other channels as reference signals. Letting $i$ run from 1 to $N$, i.e., taking each channel in turn as the main signal and the remaining channels as reference signals, yields $N$ channels of distorted speech signals containing residual noise, $e_{j1}(k)$ ($j = 1, 2, \ldots, N$).
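The first-stage processing described above (fit filter $A_i$ on speech-free samples, then hold its coefficients and only filter) can be sketched as follows. This is a minimal illustration, not the claimed implementation: it replaces the sample-by-sample adaptive update with a batch least-squares fit, and the names `delayed_matrix`, `fit_stage_a` and `stage_a_output`, the two-channel toy data and the `silent` mask are all assumptions.

```python
import numpy as np

def delayed_matrix(refs, L):
    """Stack every reference channel with delays 0..L: shape (K, n_refs * (L + 1))."""
    K = refs.shape[1]
    cols = []
    for r in refs:
        for d in range(L + 1):
            col = np.zeros(K)
            col[d:] = r[:K - d]          # r delayed by d samples, zero-padded
            cols.append(col)
    return np.stack(cols, axis=1)

def fit_stage_a(x, i, silent, L):
    """Coefficients w_i minimising the sum of squares of e_i1 over silent samples."""
    refs = np.delete(x, i, axis=0)       # the other N-1 channels are references
    R = delayed_matrix(refs, L)
    w, *_ = np.linalg.lstsq(R[silent], x[i][silent], rcond=None)
    return w

def stage_a_output(x, i, w, L):
    """e_i1(k) = x_i(k) - y_i1(k), with the coefficients w held fixed."""
    refs = np.delete(x, i, axis=0)
    return x[i] - delayed_matrix(refs, L) @ w

# Toy noise-only record: channel 0 is a filtered copy of channel 1.
rng = np.random.default_rng(0)
n1 = rng.standard_normal(300)
n0 = np.convolve([0.5, 0.25], n1)[:300]
x = np.stack([n0, n1])
silent = np.ones(300, dtype=bool)        # hypothetical: the whole record is speech-free
w = fit_stage_a(x, 0, silent, L=1)
e = stage_a_output(x, 0, w, L=1)
```

In this toy case the fitted coefficients recover the filter that relates the two noise channels, so the residual `e` is essentially zero. With real recordings the residual outside the silent period would be the distorted speech plus residual noise discussed in the text.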
3. The micro-array speech noise reduction method with improved ACRANC and beamforming according to claim 1, characterized in that the detailed process of sub-step (2) in step (1) is as follows:
The multi-channel distorted speech signals $e_{j1}(k)$ ($j = 1, 2, \ldots, N$) are input to the second-stage filter $B_i$ of the ACRANC system. In the stages outside the global silent period, the coefficients of filter $B_i$ are adjusted so that the sum of squares of its output $e_{i2}(k)$ is minimized, where:
$$\begin{aligned}
\|e_{i2}(k)\|^2 &= \|x_i(k) - y_{i2}(k)\|^2 \\
&= \|s_i(k) + n_i(k) - y_{i2}(k)\|^2 \\
&= \|n_i(k)\|^2 + \|s_i(k) - y_{i2}(k)\|^2 + 2 n_i(k)[s_i(k) - y_{i2}(k)]
\end{aligned} \tag{14}$$
Formula (15) shows that minimizing […] is equivalent to minimizing $E[(s_i(k) - y_{i2}(k))^2]$, and the latter amounts to minimizing the error between $y_{i2}(k)$ and the speech $s_i(k)$. The output $y_{i2}(k)$ of filter $B_i$ can therefore approximate the clean speech signal $s_i(k)$. Because the input to filter $B_i$ is not a single channel but multiple channels of distorted speech signals, a better noise reduction effect than that of ACRANC can be obtained. This improved noise-reduced speech signal is denoted […]
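The expansion in formula (14) can be checked numerically with a short Python sketch. The random arrays are stand-ins for $s_i(k)$ and $n_i(k)$ and the scaled copy `y` is a hypothetical filter output; none of these values come from the patent.

```python
import numpy as np

rng = np.random.default_rng(1)
s = rng.standard_normal(1000)            # stand-in for clean speech s_i(k)
n = rng.standard_normal(1000)            # stand-in for noise n_i(k), independent of s
y = 0.9 * s                              # hypothetical output y_i2(k) of filter B_i

lhs = np.sum((s + n - y) ** 2)           # sum of squares of e_i2(k) = x_i(k) - y_i2(k)
noise_term = np.sum(n ** 2)
speech_term = np.sum((s - y) ** 2)
cross_term = 2 * np.sum(n * (s - y))
rhs = noise_term + speech_term + cross_term
```

The two sides agree to rounding error. When the noise is independent of the speech, the cross term is small compared with the noise term, which is presumably why adjusting $B_i$ to minimize the output power mainly reduces the speech error term.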
4. The micro-array speech noise reduction method with improved ACRANC and beamforming according to claim 1, characterized in that the detailed process of sub-step (1) in step (2) is as follows:
Each channel in turn is taken as the main signal with the remaining channels as reference signals, and one improved ACRANC is established for it, so that $N$ such subsystems are built;
In each improved ACRANC, the input to filter $B_i$ is the outputs of all filters $A_i$ ($i = 1, 2, \ldots, N$) rather than the output of a single filter $A_i$. The adaptive mode control (AMC) determines when the filters in these subsystems update their coefficients and when they hold them fixed;
In a silent period, i.e., a no-voice period (NVP), the optimal coefficients of filter $A_i$ can be adjusted to compensate for errors caused by changes in environmental factors. To this end, a global silent period, referred to as the ONVP stage, is defined, and the first-stage filter $A_i$ of each subsystem adjusts its optimal coefficients only during the ONVP;
Let the silent period of the $i$-th noisy speech signal $x_i(k)$ picked up by microphone $M_i$ be NVP($i$). NVP($i$) consists of a series of discrete intervals, i.e.: […]
where each discrete interval
$$[k'_{ij}, k''_{ij}] = \{k'_{ij}, k'_{ij}+1, \ldots, k''_{ij}\}$$
is the $j$-th NVP of $x_i(k)$. In general NVP($i_1$) need not equal NVP($i_2$) for $i_1 \neq i_2$, $i_1, i_2 \in \{1, 2, \ldots, N\}$, but NVP($i_1$) is a translation of NVP($i_2$) along the time axis;
The ONVP is defined as: […]
It is then easy to show that: […]
where: […]
If $k''_j < k'_j$, then $[k'_j, k''_j] = \varnothing$ in definition (18);
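The relation between the per-channel silent periods NVP($i$) and the global silent period can be sketched in Python. The sketch assumes, following the description of the global silent period as the stage in which every channel is free of speech, that the ONVP is the set of samples that are silent in all channels. The per-channel boolean masks and the function names are hypothetical.

```python
import numpy as np

def intervals_from_mask(mask):
    """Return inclusive (start, end) index pairs for each run of True values."""
    padded = np.concatenate(([False], mask, [False]))
    change = np.flatnonzero(padded[1:] != padded[:-1])
    # change alternates between run starts and (exclusive) run ends
    return [(int(a), int(b) - 1) for a, b in zip(change[::2], change[1::2])]

def global_silent_mask(nvp_masks):
    """A sample is globally silent only if every channel is silent there."""
    return np.all(nvp_masks, axis=0)

# Hypothetical silence masks for two channels (True = no speech detected).
nvp = np.array([
    [1, 1, 1, 1, 0, 0, 1, 1, 1, 1],
    [0, 1, 1, 1, 1, 0, 0, 1, 1, 1],
], dtype=bool)
onvp = global_silent_mask(nvp)
onvp_intervals = intervals_from_mask(onvp)
```

The second channel's mask is a one-sample translation of the first, which mirrors the remark that NVP($i_1$) is a shifted copy of NVP($i_2$); the global intervals are correspondingly shorter than either channel's own.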
When the optimal coefficients of filter $A_i$ are adjusted, no channel may contain speech; otherwise the speech would be cancelled together with the noise. The coefficients of filter $A_i$ are therefore adjusted only in the following L-ONVP stage: […]
where $L$ is the number of delay samples of the reference signals input to filter $A_i$, and:
$$[k'_j + L, k''_j] = \{k'_j + L, k'_j + L + 1, \ldots, k''_j\} \tag{20}$$
If $k''_j < k'_j + L$, then likewise $[k'_j + L, k''_j] = \varnothing$ in definition (26);
In the L-ONVP stage, all signals used and their delayed versions fall within the silent period and contain no speech, so the optimal coefficients of filter $A_i$ can be adjusted there. The NVP stage referred to above is precisely the L-ONVP or a part of it;
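The construction of the L-ONVP intervals from formula (20) and the accompanying empty-interval rule can be sketched as follows. The function name `l_onvp` and the example intervals are hypothetical.

```python
def l_onvp(onvp_intervals, L):
    """Move each interval start forward by L samples; drop intervals that become empty."""
    out = []
    for k1, k2 in onvp_intervals:
        if k2 >= k1 + L:                 # interval is empty when k2 < k1 + L
            out.append((k1 + L, k2))
    return out

# Hypothetical global silent intervals (inclusive sample indices).
intervals = l_onvp([(1, 3), (7, 20)], L=4)
```

Skipping the first $L$ samples of each interval ensures that the delayed reference samples fed to filter $A_i$ also lie inside the silent period.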
The optimal coefficients of filter $A_i$ are adjusted in the $(\Delta, \Delta')$-ONVP stage: […]
where […] is a discrete time interval making up NVP($i_0$) of the $i_0$-th channel. $\Delta'$ is a positive integer that can be chosen freely according to the accuracy of the VD decision, so as to guarantee that the time interval used is a pure noise segment free of speech. $\Delta$ is also a freely chosen positive integer, but must satisfy:
$$\Delta \geq L + \delta + \Delta' \tag{22}$$
where $\delta$ is the time delay, counted in delay samples, for noise to propagate from the other microphones of the array to the $i_0$-th microphone. The maximum number of delay samples is: […]
where $d_i$ is the distance between microphone […] and microphone $M_i$, $f$ is the sampling frequency of the array, and $c$ is the propagation speed of sound in air;
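The quantities $\delta$ and $\Delta$ can be illustrated with a small calculation. This sketch assumes that the maximum delay is the largest inter-microphone distance converted to samples as $d_i f / c$ and rounded up; the rounding, the default speed of sound of 343 m/s, the example distances and the function names are assumptions, not values from the patent.

```python
import math

def max_delay_samples(distances_m, fs_hz, c_m_per_s=343.0):
    """Largest inter-microphone propagation delay, rounded up to whole samples."""
    return max(math.ceil(d * fs_hz / c_m_per_s) for d in distances_m)

def min_front_guard(L, delta, guard_back):
    """Smallest value of Delta allowed by inequality (22): Delta >= L + delta + Delta'."""
    return L + delta + guard_back

# Hypothetical array: the other microphones are 2 cm and 4 cm from the chosen one.
delta = max_delay_samples([0.02, 0.04], fs_hz=16000)
big_delta = min_front_guard(L=8, delta=delta, guard_back=5)
```

For a small array sampled at 16 kHz the propagation delay is only a couple of samples, so $\Delta$ is dominated by the filter delay $L$ and the safety margin $\Delta'$.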
Outside the $(\Delta, \Delta')$-ONVP stage, the optimal coefficients of filter $A_i$ in each subsystem are held unchanged, and filter $A_i$ is used for filtering only;
In the remaining stages outside the global silent period, the optimal coefficients of all filters $B_i$ are adjusted adaptively.
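The switching rule that the AMC applies can be sketched as a per-sample mode label. This is an illustrative reading of the description above, not the claimed control logic: the mask names, the three labels and the function `amc_modes` are hypothetical.

```python
import numpy as np

def amc_modes(onvp_mask, adapt_a_mask):
    """Label each sample 'A' (adapt A_i), 'B' (adapt B_i) or 'hold' (filter only)."""
    modes = np.full(onvp_mask.shape, "hold", dtype="U4")
    modes[~onvp_mask] = "B"                  # outside the global silent period: A_i fixed, B_i adapts
    modes[onvp_mask & adapt_a_mask] = "A"    # guarded part of the silent period: A_i adapts
    return modes

# Hypothetical masks: the guarded region is the silent period with its first samples trimmed.
onvp_mask = np.array([False, True, True, True, True, False])
adapt_a_mask = np.array([False, False, False, True, True, False])
modes = amc_modes(onvp_mask, adapt_a_mask)
```

Samples that are globally silent but fall outside the guarded region are labelled 'hold': neither filter adapts there and $A_i$ only filters.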
5. The micro-array speech noise reduction method with improved ACRANC and beamforming according to claim 1, characterized in that the detailed process of sub-step (2) in step (2) is as follows:
The output of each subsystem is one channel of noise-reduced speech. All $N$ outputs can be fed to a beamformer to obtain a better noise reduction effect. If a common DAS beamformer is used, its input-output relationship can be described as: […]
where $\tau_i$ is the delay with which speech reaches microphone $M_i$ relative to the reference microphone […] selected in the array. Any microphone in the array may be chosen as the reference microphone; the microphone at or near the center of the array is generally selected.
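A delay-and-sum beamformer over the $N$ subsystem outputs can be sketched as follows. This is a generic DAS illustration under stated assumptions: integer sample delays, zero padding at the edges, a $1/N$ normalization and the sign convention that channel $i$ lags the reference by `taus[i]` samples. The function name and the toy signals are hypothetical.

```python
import numpy as np

def delay_and_sum(channels, taus):
    """Average the channels after advancing channel i by taus[i] samples."""
    N, K = channels.shape
    out = np.zeros(K)
    for y_i, tau in zip(channels, taus):
        aligned = np.zeros(K)
        if tau >= 0:
            aligned[:K - tau] = y_i[tau:]    # remove a lag of tau samples
        else:
            aligned[-tau:] = y_i[:K + tau]   # remove a lead of |tau| samples
        out += aligned
    return out / N

# Toy example: the second channel is the first one delayed by 3 samples.
rng = np.random.default_rng(2)
base = rng.standard_normal(100)
late = np.zeros(100)
late[3:] = base[:-3]
out = delay_and_sum(np.stack([base, late]), taus=[0, 3])
```

After alignment the speech components add coherently, while residual noise that is not aligned across channels is averaged down.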
6. The micro-array speech noise reduction method with improved ACRANC and beamforming according to claim 5, characterized in that the delay time $\tau_i$ can be calculated by the cross-correlation method, the generalized cross-correlation method, or the following method:
1) choose a $(\delta, T)$-ONVP discrete time interval $[k', k'']$ such that $k \geq k'' + \delta$ and $k - (k'' + \delta)$ is as small as possible;
2) find the $\tau_i$ that satisfies: […]
If the aperture of the microphone array is very small and the sampling frequency of the array signals is not very high, all $\tau_i$ can be treated as 0.
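Of the options named in this claim, the plain cross-correlation method is the simplest to illustrate. The sketch below searches integer lags for the maximum cross-correlation; it does not implement the generalized cross-correlation method or the two-step method of this claim. The function name, the lag range and the toy signals are assumptions.

```python
import numpy as np

def estimate_delay(ref, sig, max_lag):
    """Return the lag of sig relative to ref (in samples) that maximises cross-correlation."""
    lags = np.arange(-max_lag, max_lag + 1)
    scores = []
    for lag in lags:
        if lag >= 0:
            scores.append(np.dot(ref[:len(ref) - lag], sig[lag:]))
        else:
            scores.append(np.dot(ref[-lag:], sig[:len(sig) + lag]))
    return int(lags[int(np.argmax(scores))])

# Toy example: sig is ref delayed by 3 samples.
rng = np.random.default_rng(3)
ref = rng.standard_normal(500)
sig = np.zeros(500)
sig[3:] = ref[:-3]
tau = estimate_delay(ref, sig, max_lag=10)
```

For a very small array sampled at a modest rate the true delays are below one sample, which is consistent with the remark that all $\tau_i$ can then be treated as 0.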
CN201811275824.4A 2018-10-30 2018-10-30 Micro-array voice noise reduction method for improving ACROC and beam forming Active CN109243482B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201811275824.4A CN109243482B (en) 2018-10-30 2018-10-30 Micro-array voice noise reduction method for improving ACROC and beam forming

Publications (2)

Publication Number Publication Date
CN109243482A true CN109243482A (en) 2019-01-18
CN109243482B CN109243482B (en) 2022-03-18

Family

ID=65079322

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201811275824.4A Active CN109243482B (en) 2018-10-30 2018-10-30 Micro-array voice noise reduction method for improving ACROC and beam forming

Country Status (1)

Country Link
CN (1) CN109243482B (en)

Citations (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN1529528A (en) * 2003-09-28 2004-09-15 曾庆宁 Multi sampling rate array signal noise-removing method
US20060222184A1 (en) * 2004-09-23 2006-10-05 Markus Buck Multi-channel adaptive speech signal processing system with noise reduction
CN105575397A (en) * 2014-10-08 2016-05-11 展讯通信(上海)有限公司 Voice noise reduction method and voice collection device
CN105814627A (en) * 2013-12-16 2016-07-27 哈曼贝克自动系统股份有限公司 Active noise control system
CN106024001A (en) * 2016-05-03 2016-10-12 电子科技大学 Method used for improving speech enhancement performance of microphone array

Non-Patent Citations (2)

* Cited by examiner, † Cited by third party
Title
QINGNING ZENG ET AL.: "Speech Enhancement by Multi-Channel Crosstalk Resistant Adaptive Noise Cancellation", 《2006 IEEE INTERNATIONAL CONFERENCE ON ACOUSTICS SPEECH AND SIGNAL PROCESSING PROCEEDINGS》 *
曾庆宁等: "基于阵列抗串扰自适应噪声抵消的语音增强", 《电子学报》 *

Cited By (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN112951260A (en) * 2021-03-02 2021-06-11 桂林电子科技大学 Method for enhancing voice of double microphones
CN112951260B (en) * 2021-03-02 2022-07-19 桂林电子科技大学 Method for enhancing speech by double microphones
CN117278896A (en) * 2023-11-23 2023-12-22 深圳市昂思科技有限公司 Voice enhancement method and device based on double microphones and hearing aid equipment
CN117278896B (en) * 2023-11-23 2024-03-19 深圳市昂思科技有限公司 Voice enhancement method and device based on double microphones and hearing aid equipment

Also Published As

Publication number Publication date
CN109243482B (en) 2022-03-18

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant