CN109616137A

CN109616137A - Noise processing method and device

Info

Publication number: CN109616137A
Application number: CN201910080549.9A
Authority: CN
Inventors: 张跃进; 黄德昌; 李波; 展爱云
Original assignee: Zhongxiang Bo Qian Mdt Infotech Ltd
Current assignee: Zhongxiang Bo Qian Mdt Infotech Ltd
Priority date: 2019-01-28
Filing date: 2019-01-28
Publication date: 2019-04-12

Abstract

The present invention relates to a noise processing method and device. The method comprises: performing Fourier transform processing on a first sound signal collected by a first microphone to obtain a first target signal; performing Fourier transform processing on a second sound signal collected by a second microphone to obtain a second target signal; establishing, according to a preset algorithm, a cross-correlation relationship between the first target signal and the second target signal; determining a cross-correlation function according to the cross-correlation relationship; obtaining a first filtering formula and a second filtering formula based on the cross-correlation function; obtaining a target filtering formula based on the first filtering formula and the second filtering formula; processing the first target signal based on the target filtering formula to obtain a speech signal in frequency-domain form; and performing inverse Fourier transform and overlap-add processing on the frequency-domain speech signal to obtain a speech signal in time-domain form as the target speech signal. A purer speech signal is thereby obtained and the noise is processed more effectively.

Description

Noise processing method and device

Technical field

The present invention relates to the technical field of noise processing, and in particular to a noise processing method and device.

Background

Noise is ubiquitous in everyday life and work. Heavy machinery, cars and other vehicles all generate noise of various kinds, making the environments in which people work and live very noisy. To obtain a purer speech signal, people have designed various noise cancellation methods to improve the quality of the speech signal.

Most current noise cancellation methods rely on a single microphone. However, in very noisy places such as busy urban areas and stations, single-microphone noise cancellation cannot remove the noise from the speech signal well.

Summary of the invention

In view of this, the purpose of the present invention is to provide a noise processing method and device that use dual microphones to process noise more effectively.

In order to achieve the above object, the present invention adopts the following technical solution:

A noise processing method, the method comprising:

performing Fourier transform processing on a first sound signal collected by a first microphone to obtain a first target signal;

performing Fourier transform processing on a second sound signal collected by a second microphone to obtain a second target signal;

establishing, according to a preset algorithm, a cross-correlation relationship between the first target signal and the second target signal;

determining a cross-correlation function according to the cross-correlation relationship;

obtaining a first filtering formula and a second filtering formula based on the cross-correlation function;

obtaining a target filtering formula based on the first filtering formula and the second filtering formula;

processing the first target signal based on the target filtering formula to obtain a speech signal in frequency-domain form;

performing inverse Fourier transform and overlap-add processing on the frequency-domain speech signal to obtain the corresponding speech signal in time-domain form as the target speech signal.

Optionally, establishing the cross-correlation relationship between the first target signal and the second target signal according to the preset algorithm comprises:

obtaining the cross-correlation relationship between the first target signal and the second target signal according to the cross-power spectral density.

Optionally, determining the cross-correlation function according to the cross-correlation relationship comprises:

determining the cross-correlation function according to the cross-correlation relationship and the auto-power spectral density.

Optionally, obtaining the first filtering formula and the second filtering formula based on the cross-correlation function comprises:

decomposing, transforming and consolidating the cross-power spectral density of the first sound signal and the second sound signal to obtain an auxiliary filtering formula;

simplifying the auxiliary filtering formula and analysing it at multiple angles, based on the cross-correlation function, to obtain the first filtering formula and the second filtering formula.

Optionally, after determining the cross-correlation function according to the cross-correlation relationship, the method further comprises:

determining the use environment of the first microphone and the second microphone;

detecting whether the use environment is a non-diffuse noise environment;

if so, using the simplified function corresponding to the cross-correlation function as the cross-correlation function.

Optionally, before performing Fourier transform processing on the first sound signal collected by the first microphone, the method further comprises:

capturing, based on a window function, a first specified sound signal collected by the first microphone to obtain a first sound signal to be processed;

passing the first sound signal to be processed through a non-recursive filter with an overlap ratio of 75% to obtain a processed first sound signal as the first sound signal.

Correspondingly, before performing Fourier transform processing on the second sound signal collected by the second microphone, the method further comprises:

capturing, based on a window function, a second specified sound signal collected by the second microphone to obtain a second sound signal to be processed;

passing the second sound signal to be processed through a non-recursive filter with an overlap ratio of 75% to obtain a processed second sound signal as the second sound signal.

A noise processing device, the device comprising:

a transform module, configured to perform Fourier transform processing on a first sound signal collected by a first microphone to obtain a first target signal,

and further configured to perform Fourier transform processing on a second sound signal collected by a second microphone to obtain a second target signal;

a determining module, configured to establish, according to a preset algorithm, a cross-correlation relationship between the first target signal and the second target signal,

and further configured to determine a cross-correlation function according to the cross-correlation relationship;

an obtaining module, configured to obtain a first filtering formula and a second filtering formula based on the cross-correlation function,

further configured to obtain a target filtering formula based on the first filtering formula and the second filtering formula,

and further configured to process the first target signal based on the target filtering formula to obtain a speech signal in frequency-domain form;

an inverse transform module, configured to perform inverse Fourier transform and overlap-add processing on the frequency-domain speech signal to obtain the corresponding speech signal in time-domain form as the target speech signal.

Optionally, the determining module is specifically configured to:

obtain the cross-correlation relationship between the first target signal and the second target signal according to the cross-power spectral density.

Optionally, the obtaining module is specifically configured to:

Optionally, the noise processing device further comprises:

a capture module, configured to capture, based on a window function, the first specified sound signal collected by the first microphone to obtain the first sound signal to be processed.

Correspondingly, the capture module is further configured to:

The noise processing method adopted by the present invention comprises: performing Fourier transform processing on a first sound signal collected by a first microphone to obtain a first target signal; performing Fourier transform processing on a second sound signal collected by a second microphone to obtain a second target signal; establishing, according to a preset algorithm, a cross-correlation relationship between the first target signal and the second target signal; determining a cross-correlation function from the cross-correlation relationship; obtaining a first filtering formula and a second filtering formula based on the cross-correlation function; processing the first filtering formula and the second filtering formula to obtain a target filtering formula; processing the first target signal based on the target filtering formula to obtain a speech signal in frequency-domain form; and performing inverse Fourier transform and overlap-add processing on the frequency-domain speech signal to obtain a speech signal in time-domain form as the target speech signal. By using two microphones and the cross-correlation function to build a better-optimised filter, noise arriving from different directions can be processed fully and the processing effect is ensured. This solves the problem that a single microphone can only handle noise from a single direction, and gives the method greater practicality and a better processing effect.

Brief description of the drawings

In order to explain the embodiments of the present invention or the technical solutions in the prior art more clearly, the drawings needed in the description of the embodiments or the prior art are briefly introduced below. Obviously, the drawings described below show only some embodiments of the present invention; those of ordinary skill in the art can obtain other drawings from them without creative effort.

Fig. 1 is a flow chart of a noise processing method provided by an embodiment of the present invention.

Fig. 2 is a structural schematic diagram of a noise processing device provided by an embodiment of the present invention.

Detailed description of the embodiments

To make the objects, technical solutions and advantages of the present invention clearer, the technical solutions of the present invention are described in detail below. Obviously, the described embodiments are only some of the embodiments of the present invention, not all of them. All other embodiments obtained by those of ordinary skill in the art on the basis of these embodiments without creative effort fall within the scope of protection of the present invention.

As shown in Fig. 1, the noise processing method of this embodiment comprises the following steps:

S11. Perform Fourier transform processing on the first sound signal collected by the first microphone to obtain the first target signal.

S12. Perform Fourier transform processing on the second sound signal collected by the second microphone to obtain the second target signal.

In theory, the signal captured by a microphone consists of a clean speech component and a noise component, and the two are uncorrelated. The noise is regarded as an unwanted or interfering signal. Real-life noise includes various additive and non-additive noises; this embodiment takes additive noise as the example. Let the clean target signal be s(n), the additive noise d(n) and the noisy speech x(n). Speech and noise generally have very different frequency-domain distributions, so the two are uncorrelated, which can be expressed as x(n) = s(n) + d(n) and E[s(n)d(n)] = 0. Applying the Fourier transform (FFT), the frequency-domain expression is X(ω) = S(ω) + D(ω).

The signal-to-noise ratio (SNR) is generally used as a reference for how much noise is mixed into x(n). The SNR is defined as:

Here the two quantities are the energy of the signal and the energy of the noise, respectively. The signal-to-noise ratio generally refers to the ratio of signal to noise, so knowing any two of the clean speech, the noise signal and the total signal is enough to compute it.
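The additive model and the SNR can be illustrated with a short NumPy sketch. The SNR expression used here, the ratio of clean-speech energy to noise energy, is an assumed standard form rather than the patent's own formula, and the test tone, noise level and function name `snr_linear` are illustrative choices.

```python
import numpy as np

def snr_linear(s, d):
    """Assumed SNR definition: energy of the clean signal over energy of the noise."""
    return np.sum(np.square(s)) / np.sum(np.square(d))

rng = np.random.default_rng(0)
n = np.arange(8000)
s = np.sin(2 * np.pi * 200 * n / 8000)   # stand-in for the clean speech s(n)
d = 0.1 * rng.standard_normal(n.size)    # additive noise d(n)
x = s + d                                # noisy speech x(n) = s(n) + d(n)

snr = snr_linear(s, d)                   # linear energy ratio
snr_db = 10 * np.log10(snr)              # the same ratio expressed in decibels

# Knowing any two of s, d and x is enough: here d is recovered as x - s.
snr_from_x = snr_linear(s, x - s)

# The Fourier transform is linear, so X(w) = S(w) + D(w).
X, S, D = np.fft.fft(x), np.fft.fft(s), np.fft.fft(d)
```

Because the transform is linear, the additive model carries over unchanged to the frequency domain, which is what the later filtering steps rely on.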

The speech signal is pre-processed before correlation functions or power spectra are computed, for two purposes. First, polynomial least squares is used to remove trend-term error and the DC component; an amplifier, for example, drifts as the temperature changes. Second, a digital filter is used to remove unwanted noise, in particular the significant background noise in the noisy speech. Either a recursive (IIR) filter or a non-recursive (FIR) filter can be used. For example, when the sound card and microphone of a computer capture a speech signal, the 50 Hz hum of the AC mains supply causes unnecessary interference, so the 50 Hz hum must be filtered out.
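The first pre-processing step can be sketched as follows. This is a minimal illustration that assumes a low-order polynomial trend; the function name `remove_trend`, the polynomial order and the synthetic drift are hypothetical and not taken from the patent.

```python
import numpy as np

def remove_trend(x, order=1):
    """Fit a polynomial of the given order by least squares and subtract it.

    An order of 0 removes only the DC component; order 1 also removes linear drift.
    """
    n = np.arange(x.size)
    coeffs = np.polyfit(n, x, order)      # least-squares polynomial fit
    return x - np.polyval(coeffs, n)

n = np.arange(1000)
speech = np.sin(2 * np.pi * 0.05 * n)     # stand-in for the useful signal
drifting = speech + 0.5 + 0.001 * n       # DC offset plus slow drift
cleaned = remove_trend(drifting, order=1)
```

The 50 Hz hum mentioned above would be handled by the second step, a digital IIR or FIR filter, which is not shown here.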

To provide a point of comparison for the study of the cross-correlation-based dual-microphone speech enhancement method below, the Wiener filtering algorithm is introduced first.

The basic idea of the Wiener filtering algorithm is to analyse the time-domain and frequency-domain information of the target signal component and the noise component, construct a filter empirically, and repeatedly fit the filtered noisy speech signal by least squares until the closest signal is obtained, thereby achieving filtering.

When the microphone receives the signal x(n), a filter h(n) is posited in advance, although its specific form is not yet known. Passing x(n) through the filter gives a first estimate of the target speech:

The mean square error between this estimate and the clean speech s(n) is then driven towards its ideal value by least squares. The target signal and the noise signal are uncorrelated here. Taking the Fourier transform and solving for the filter gives:

Here P_S(k) is the power spectral density of the clean speech and P_d(k) is the power spectral density of the noise signal. The resulting H(k) is the frequency-domain form of the filter. A Wiener filter designed for the speech signal separates out the noise component; its output is compared with the ideal result and corrected repeatedly.
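As a reference point, the classical frequency-domain Wiener gain for uncorrelated speech and noise is H(k) = P_S(k) / (P_S(k) + P_d(k)). The sketch below assumes that standard form; the function name `wiener_gain` and the small constant `eps` guarding against division by zero are illustrative additions.

```python
import numpy as np

def wiener_gain(p_s, p_d, eps=1e-12):
    """Classical Wiener gain per frequency bin (assumed standard form)."""
    return p_s / (p_s + p_d + eps)

p_s = np.array([4.0, 1.0, 0.0])   # clean-speech power spectral density P_S(k)
p_d = np.array([1.0, 1.0, 1.0])   # noise power spectral density P_d(k)
h = wiener_gain(p_s, p_d)         # close to 1 where speech dominates, 0 where it is absent
```

Bins in which speech dominates pass almost unchanged, while bins containing only noise are suppressed.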

Compared with a single microphone, a dual-microphone system is more than one extra receiver. Spatially, two microphones allow a two-dimensional position analysis, so the target speech signal and the noise signal can be clearly distinguished. Dual-microphone speech enhancement uses two microphones to obtain information in two-dimensional space and to detect the position of the speech source. It can also be combined with many single-microphone speech enhancement techniques, which greatly improves its adaptability and programmability. In addition, a dual-microphone setup is flexible and easy to operate, and is more cost-effective than a microphone array.

Therefore, the first sound signal and the second sound signal are collected by the first microphone and the second microphone. To remove the noise more effectively, both are converted by Fourier transform into frequency-domain signals for processing, which gives better results; this yields the first target signal and the second target signal.

S13. Establish, according to a preset algorithm, the cross-correlation relationship between the first target signal and the second target signal.

For example, the cross-correlation relationship between the first target signal and the second target signal can be obtained from the cross-power spectral density.

If the energy of x(n) is summable in the time domain and x(n) satisfies the Dirichlet conditions, its autocorrelation function is defined as:

If the signal is not summable in the time domain, as is the case for some random or periodic discrete signals, it is a power signal, and its autocorrelation function is defined as:

The autocorrelation function measures how well a signal matches, and correlates with, itself in the time domain.

If the energies of x(n) and y(n) are summable in the time domain and both satisfy the Dirichlet conditions, the cross-correlation function of x(n) and y(n) is defined as:

If the signals are not summable in the time domain, they are power signals, and the cross-correlation function is defined as:

The cross-correlation function measures how well two signals match in the time domain. It gives the degree of correlation between two random processes x(n) and y(n) at any two different moments and is used to extract the speech component.

Physically, both correlation functions characterise the correlation of signals, that is, they measure similarity. The autocorrelation function measures the similarity of a signal with itself: if the signal contains a periodic component, it shows up clearly as peaks in the autocorrelation function. The cross-correlation function measures the similarity between two signals; likewise, its peaks reveal periodic components common to both.
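The role of the correlation peak can be shown with a small NumPy sketch. It assumes the finite-sum form r_xy(m) = Σ x(n) y(n + m) for energy signals; the helper name `cross_correlation` and the 5-sample delay are illustrative.

```python
import numpy as np

def cross_correlation(x, y):
    """Return lags m and r_xy(m) = sum_n x(n) * y(n + m) for two equal-length signals."""
    r = np.correlate(y, x, mode="full")        # lag runs from -(N-1) to N-1
    lags = np.arange(-(x.size - 1), y.size)
    return lags, r

rng = np.random.default_rng(1)
x = rng.standard_normal(256)
delay = 5
y = np.concatenate([np.zeros(delay), x[:-delay]])   # y(n) = x(n - delay)

lags, r_xy = cross_correlation(x, y)
best_lag = lags[np.argmax(r_xy)]          # the peak sits at the delay between the signals

lags_a, r_xx = cross_correlation(x, x)    # autocorrelation: signal against itself
auto_peak = lags_a[np.argmax(r_xx)]       # always peaks at lag 0
```

The position of the cross-correlation peak is what lets a two-microphone system relate the same source as seen by both microphones.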

A signal often represents a random process, and by definition the power spectral density is a statistical average taken over all realisations. If a signal is an energy signal, it is absolutely summable, so the Fourier transform can be applied and its spectral density is the Fourier transform of its autocorrelation function, which readily gives the distribution of power over frequency. If a signal is a power signal, its Fourier transform does not exist. Put simply, a power signal is a random process: if a random process is Fourier-transformed and its power spectrum is computed, the result for the same signal differs greatly from one time to the next, so no single transform gives a unified description of the power characteristics of a non-energy signal. In other words, a signal whose parameters change over time cannot be characterised by applying the Fourier transform to it directly.

From the definition of the power of a random signal, the average power can be obtained in two ways: by integrating the power spectral density, or by integrating the mean-square value. The spectrum of a random signal is itself random and changes over time: each computed spectrum is different, in both amplitude and phase. The power spectrum, however, is fixed. The power spectrum here is a statistical average, not simply the squared magnitude of one spectrum, and because of this averaging its value is deterministic. The mean-square value is a time-domain statistical average and the power spectral density is a frequency-domain statistical average. A random signal therefore has a definite power spectrum, and this fixed value characterises the random signal better.

For example, in a closed noisy space in which the noise and the target speech come from different directions, the two can be separated, and the microphone signals are expressed as:

y_i(m) = x_i(m) + n_i(m) (8)

Here i numbers the speech components after framing: i = 1 denotes the first microphone and i = 2 the second microphone. m is the sample index, x(m) is the target speech and n(m) is the noise component. After both signals have been processed by a discrete Fourier transform (DFT), the dual-microphone signals are in the frequency domain:

Y_i(ω, k) = X_i(ω, k) + N_i(ω, k) (9)

where k is the index of the sampled frame and ω = 2πl/L is the angular frequency.

Here P_y1y2(ω, k) denotes the cross-power spectral density of the two microphone signals, P_y1y1(ω, k) the auto-power spectral density of the first microphone and P_y2y2(ω, k) the auto-power spectral density of the second microphone. These quantities give the relationship between the signals, that is, the cross-correlation relationship between the first target signal and the second target signal.

S14. Determine the cross-correlation function according to the cross-correlation relationship.

For example, the cross-correlation function can be determined from the cross-correlation relationship and the auto-power spectral density.

The cross-power spectral density is:

By analogy with (10), the auto-power spectral density can also be defined. Expression (11) covers both microphones:

P_yiyi(ω, k) = E[|Y_i(ω, k)|²] (11)

From (11), the cross-correlation function of the first target signal and the second target signal can then be obtained:
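A minimal numerical sketch of these quantities follows. It assumes the usual definitions, P_y1y2 = E[Y_1 Y_2*] for the cross-power spectral density and Γ_y1y2 = P_y1y2 / sqrt(P_y1y1 P_y2y2) for the normalised cross-correlation function, and it approximates the expectation E[·] by an average over frames. The function name `coherence` and the random test data are hypothetical.

```python
import numpy as np

def coherence(Y1, Y2, eps=1e-12):
    """Normalised cross-correlation function per frequency bin.

    Y1, Y2: complex arrays of shape (frames, bins) holding Y_i(w, k).
    The expectation E[.] is approximated by averaging over frames (an assumption).
    """
    p12 = np.mean(Y1 * np.conj(Y2), axis=0)   # cross-power spectral density
    p11 = np.mean(np.abs(Y1) ** 2, axis=0)    # auto-power spectral density, microphone 1
    p22 = np.mean(np.abs(Y2) ** 2, axis=0)    # auto-power spectral density, microphone 2
    return p12 / np.sqrt(p11 * p22 + eps)

rng = np.random.default_rng(2)
Y1 = np.fft.rfft(rng.standard_normal((200, 64)), axis=1)
Y2_same = Y1.copy()                                            # fully correlated channel
Y2_indep = np.fft.rfft(rng.standard_normal((200, 64)), axis=1) # unrelated channel

g_same = coherence(Y1, Y2_same)     # magnitude close to 1 in every bin
g_indep = coherence(Y1, Y2_indep)   # magnitude close to 0 in every bin
```

Identical channels give a magnitude of one and unrelated channels give a magnitude near zero, which is the property the filtering formulas exploit.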

For example, the use environment of the first microphone and the second microphone can be determined and checked for being a non-diffuse noise environment; if it is, the simplified function corresponding to the cross-correlation function is used as the cross-correlation function.

We assume that the algorithm is applied in smart devices such as mobile phones. The spacing of the two microphones in a phone is measured in millimetres, and the closer they are, the more correlated the noise signals they receive. In an actual non-diffuse noise environment, the cross-correlation function simplifies to:

Here d is the microphone spacing, c is the speed of sound, sinc is the sinc function, ω is the angular frequency and f_s is the sampling frequency. To construct the required mathematical model, the signal source is specified to lie directly in front of the dual microphones, and the horizontal direction of the target speaker is fixed. The direction of the noise is given by the angle θ between the noise and the microphones, and different application environments can be simulated by changing it. The horizontal distance between the microphones and the speaker is 2 m, while the spacing between the two microphones is only 20 mm. The cross-correlation function for this mobile-phone communication model is then:

Γ_v1v2(ω) = e^{jω f_s (d/c) cos θ} (14)

Different cross-correlation functions are thus determined for different use environments, so each case is analysed on its own terms and the method is applicable more widely.
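Formula (14) can be evaluated directly for the 20 mm, 8 kHz example. The sketch assumes ω is the normalised angular frequency in radians per sample, so that ω f_s d / c is dimensionless; the function name `noise_coherence` and its default arguments are illustrative.

```python
import numpy as np

def noise_coherence(omega, theta, d=0.02, c=340.0, fs=8000.0):
    """Formula (14): exp(j * omega * fs * (d / c) * cos(theta)).

    omega is assumed to be the normalised angular frequency (radians per sample).
    """
    tau = fs * d / c                      # inter-microphone delay in samples
    return np.exp(1j * omega * tau * np.cos(theta))

omega = np.linspace(0.0, np.pi, 5)
tau = 8000.0 * 0.02 / 340.0               # about 0.47, a constant less than 1

g_broadside = noise_coherence(omega, np.pi / 2)   # noise at 90 degrees: no phase shift
g_endfire = noise_coherence(omega, np.pi)         # noise at 180 degrees: negative imaginary part
```

At 90 degrees the function is purely real and equal to 1, while for angles beyond 90 degrees the imaginary part turns negative; the sign of the imaginary part is therefore an indicator of the noise direction.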

S15. Obtain the first filtering formula and the second filtering formula based on the cross-correlation function.

Further, obtaining the first filtering formula and the second filtering formula based on the cross-correlation function includes: decomposing, transforming and consolidating the cross-power spectral density of the first sound signal and the second sound signal to obtain an auxiliary filtering formula; and simplifying the auxiliary filtering formula and analysing it at multiple angles, based on the cross-correlation function, to obtain the first filtering formula and the second filtering formula.

The analysis above shows that the noise and the speech signal are uncorrelated with each other, whereas the noise signal is correlated with itself and so is the target speech. The cross-power spectral density of the first sound signal and the second sound signal can therefore be decomposed, as expressed in formula (15):

P_y1y2(ω, k) = P_x1x2(ω, k) + P_n1n2(ω, k) (15)

Dividing both sides of equation (15) by the same normalising term, the left side clearly becomes the cross-correlation function of the received dual-microphone signals, and the right side can be transformed correspondingly:

Substituting the expression for the speech signal x(n) into formula (16) and simplifying further gives:

Substituting the definition of the signal-to-noise ratio (SNR) given above into the formula yields:

Because the two microphones are very close together, their signal-to-noise ratios satisfy SNR1 ≈ SNR2 = SNR. Let τ = f_s d/c, where f_s is the sampling frequency, c is the speed of sound (340 m/s) and d is the microphone spacing. Substituting formula (14) into formula (18) gives the auxiliary filtering formula (19):

This gives the auxiliary filtering formula for the two microphones. Next, the angle is treated as a variable: the value of formula (19) differs for different angles, and a mathematical analysis of these values is the basis for constructing the first filtering formula and the second filtering formula. To make the two filtering formulas more accurate, the auxiliary filtering formula (19) is simplified according to the angle, and the first filtering formula and the second filtering formula are then derived from it.

(1) When θ = π/2, cos θ = 0 and the second term is therefore zero. The auxiliary filtering formula simplifies to:

If there is no target speech signal, the SNR is zero and the auxiliary filtering formula has no imaginary part; if target speech is present, an imaginary part may appear. When the signal-to-noise ratio is low, the auxiliary filtering formula reduces to the 1/(1+SNR) term alone, and the received signal is then entirely noise. The real part of the auxiliary filtering formula can therefore indicate that the noise component dominates. The expression is:

Note that when the speech frequency is low, the cos(ωτ) term in the formula is close to 1, so the speech signal, which then appears only in the imaginary part, is easily mistaken for a noise component. If the signal were filtered out completely in that case, severe distortion would result. The low band must therefore be treated separately and handled in advance so that the low-frequency signal is retained. With a sampling rate of 8 kHz, the calculated angular-frequency threshold is π/4. In short, when constructing the first filtering formula and the second filtering formula, a correction factor must be added in the appropriate place to preserve the low band. This gives the first filtering formula:

G_1(ω, k) = 1 - |R[Γ_y1y2(ω, k)]|^{P(ω)} (22)

Here R[·] denotes the real part and P(ω) is the correction factor: its value is 2 when |ω| is greater than π/4 and 8 when |ω| is less than π/4.
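Formula (22) translates directly into a few lines of NumPy. The function names are illustrative, and the behaviour exactly at |ω| = π/4, which the text leaves open, is assumed here to fall in the low-band case.

```python
import numpy as np

def correction_exponent(omega):
    """P(w): 2 above the pi/4 threshold, 8 below it (pi/4 itself treated as low band, an assumption)."""
    return np.where(np.abs(omega) > np.pi / 4, 2.0, 8.0)

def first_filter(gamma_y1y2, omega):
    """Formula (22): G1 = 1 - |Re(Gamma_y1y2)| ** P(w)."""
    return 1.0 - np.abs(np.real(gamma_y1y2)) ** correction_exponent(omega)

omega = np.array([np.pi / 8, np.pi / 2])      # one low-band bin, one high-band bin
gamma = np.array([0.9 + 0.1j, 0.9 + 0.1j])    # same coherence value in both bins
g1 = first_filter(gamma, omega)               # low band keeps more signal than high band
```

With the same coherence of 0.9, the larger exponent in the low band gives a gain of about 0.57 compared with 0.19 in the high band, which is how the correction factor protects low-frequency speech.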

(2) When π/2 < θ ≤ π, the imaginary part of the auxiliary filtering formula is obtained from formula (18):

When the signal-to-noise ratio is small, the imaginary part of the auxiliary filtering formula is sin(ωτ cos θ). Here the noise angle satisfies π/2 < θ ≤ π, the microphone spacing is 20 mm and the sampling frequency is 8 kHz, so τ is a constant less than 1. The value of sin(ωτ cos θ) is then negative, and since the first term of the imaginary part is approximately zero, the imaginary part as a whole is negative. In this case most of the signal can be regarded as noise.

If the angle is 180 degrees, then by the analysis above the imaginary part of the auxiliary filtering formula is also negative, the signal-to-noise ratio can hardly exceed 1, and most of the signal is regarded as noise. When the angle is π/2, however, the definition of the SNR implies that the SNR must be greater than zero, so the assumption does not hold at 90 degrees.

The very-low-frequency case must also be considered. There sin(ωτ cos θ) ≈ 0, and in a low-SNR environment the imaginary part of the auxiliary filtering formula needs a higher frequency to remain positive, so low frequencies must be handled separately; the angular-frequency boundary is again π/4. It follows that when the imaginary part of the auxiliary filtering formula is less than zero, the signal can be regarded as a noise component and removed with the filtering formula. Taking all of the above into account gives the second filtering formula:

A value u is introduced in the formula. Ideally the noise would be filtered out whenever the imaginary part of the auxiliary filtering formula is zero, which suggests u = 0. However, setting u to zero would also remove part of the target speech. A suitable value must therefore be chosen for u, one that removes the noise in the speech signal without distorting the speech. The first filtering formula and the second filtering formula are thus obtained according to the value of u.

S16. Obtain the target filtering formula based on the first filtering formula and the second filtering formula.

The first filtering formula and the second filtering formula are combined to give the target filtering formula, from which the complete filter can be designed. From the analysis above, the target filtering formula is:

G(ω, k) = G_1(ω, k) G_2(ω, k) (26)

The special position again needs discussion. When the noise is at 90 degrees to the microphones, the value of the second filter is 1, and the noise processing then depends entirely on the other filter. The final expression of the filter is obtained from the target filtering formula, which allows a more effective filter to be designed.

S17. Process the first target signal based on the target filtering formula to obtain a speech signal in frequency-domain form.

S18. Perform inverse Fourier transform and overlap-add processing on the frequency-domain speech signal to obtain the corresponding speech signal in time-domain form as the target speech signal.

After the first target signal has passed through the filter formed from the target filtering formula, it is still in frequency-domain form and must be converted back to time-domain form. It is therefore processed by inverse Fourier transform and overlap-add, which finally yields the target speech signal in time-domain form.
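Steps S17 and S18 can be sketched end to end as follows. This is a minimal illustration under stated assumptions: a periodic Hann window, a 256-sample frame and a 64-sample hop (75% overlap) are chosen for the example and are not taken from the patent, and the gain `G` is a placeholder of ones standing in for the target filtering formula.

```python
import numpy as np

def stft(x, frame_len=256, hop=64):
    """Window each frame (periodic Hann, 75% overlap) and take its FFT."""
    win = 0.5 - 0.5 * np.cos(2 * np.pi * np.arange(frame_len) / frame_len)
    starts = range(0, len(x) - frame_len + 1, hop)
    return np.array([np.fft.rfft(win * x[s:s + frame_len]) for s in starts])

def overlap_add(frames_fd, n_samples, frame_len=256, hop=64):
    """Inverse FFT of every frame followed by overlap-add."""
    out = np.zeros(n_samples)
    for k, spec in enumerate(frames_fd):
        out[k * hop:k * hop + frame_len] += np.fft.irfft(spec, n=frame_len)
    return out / 2.0   # four shifted periodic Hann windows sum to 2 at 75% overlap

rng = np.random.default_rng(3)
x1 = rng.standard_normal(2048)          # stand-in for the first sound signal
X1 = stft(x1)                           # first target signal, shape (frames, bins)
G = np.ones(X1.shape)                   # placeholder for the target gain G(w, k)
s_hat = overlap_add(G * X1, x1.size)    # time-domain estimate of the target speech
```

With the placeholder gain of ones, the interior of the signal is reconstructed exactly, which confirms that any change in the output comes from the gain and not from the framing.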

Further, to improve the noise processing effect, before performing Fourier transform processing on the first sound signal collected by the first microphone, the method further comprises:

capturing, based on a window function, the first specified sound signal collected by the first microphone to obtain a first sound signal to be processed, and passing the first sound signal to be processed through a non-recursive filter with an overlap ratio of 75% to obtain a processed first sound signal as the first sound signal.

Correspondingly, before performing Fourier transform processing on the second sound signal collected by the second microphone, the method further comprises:

capturing, based on a window function, the second specified sound signal collected by the second microphone to obtain a second sound signal to be processed, and passing the second sound signal to be processed through a non-recursive filter with an overlap ratio of 75% to obtain a processed second sound signal as the second sound signal.

The first step in processing a speech signal is to divide it into frames and apply a window. The first specified sound signal and the second specified sound signal received by the first microphone and the second microphone are therefore first processed with a window function with a frame length of 200 ms, which yields the first sound signal and the second sound signal and helps ensure the accuracy of the result.
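The framing step can be sketched as below, using the 200 ms frame length and 75% overlap stated above. The Hann window, the 8 kHz sample rate, and the function names are assumptions made for illustration; the text does not specify the window type or the sampling rate.

```python
import math


def hann(n):
    """Periodic Hann window of length n (the window type is an assumption)."""
    return [0.5 - 0.5 * math.cos(2.0 * math.pi * i / n) for i in range(n)]


def split_into_frames(signal, sample_rate, frame_ms=200, overlap=0.75):
    """Cut `signal` into windowed frames of `frame_ms` with the given overlap."""
    frame_len = int(sample_rate * frame_ms / 1000)
    hop = int(frame_len * (1.0 - overlap))
    window = hann(frame_len)
    frames = []
    for start in range(0, len(signal) - frame_len + 1, hop):
        chunk = signal[start:start + frame_len]
        frames.append([w * s for w, s in zip(window, chunk)])
    return frames, frame_len, hop


# One second of a 440 Hz tone at a hypothetical 8 kHz sample rate.
sample_rate = 8000
signal = [math.sin(2.0 * math.pi * 440.0 * t / sample_rate) for t in range(sample_rate)]
frames, frame_len, hop = split_into_frames(signal, sample_rate)
```

With these values each frame holds 1600 samples and consecutive frames start 400 samples apart, so every sample in the interior of the signal is covered by four frames.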

To make the result more accurate, a smoothing factor can also be applied to the power spectrum and the cross-power spectrum used in processing the speech signal:

λ is the dynamic smoothing factor, with values in the interval [0, 1]. The power spectra introduced here need to be accurate, because they feed into the calculation of the target filtering formula described above.
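The smoothing expression itself is not reproduced in this text. One common form, assumed here purely for illustration, is first-order recursive averaging: P(w, k) = λ · P(w, k-1) + (1 - λ) · |X(w, k)|² for each power spectrum, with X1(w, k) · conj(X2(w, k)) in place of |X(w, k)|² for the cross-power spectrum. The sketch below implements that assumed form; the function and variable names are hypothetical.

```python
def smooth_spectra(prev_p11, prev_p22, prev_p12, x1, x2, lam):
    """One recursive-averaging update of the power and cross-power spectra.

    Assumed form (not taken from the source):
        P(w, k) = lam * P(w, k - 1) + (1 - lam) * instantaneous estimate
    """
    if not 0.0 <= lam <= 1.0:
        raise ValueError("smoothing factor must lie in [0, 1]")
    p11 = [lam * p + (1.0 - lam) * abs(a) ** 2 for p, a in zip(prev_p11, x1)]
    p22 = [lam * p + (1.0 - lam) * abs(b) ** 2 for p, b in zip(prev_p22, x2)]
    p12 = [lam * p + (1.0 - lam) * (a * b.conjugate())
           for p, a, b in zip(prev_p12, x1, x2)]
    return p11, p22, p12


# Hypothetical single-bin example, starting from zero history.
x1 = [1 + 1j]   # first target signal, one bin
x2 = [2 + 0j]   # second target signal, one bin
p11, p22, p12 = smooth_spectra([0.0], [0.0], [0j], x1, x2, lam=0.5)
```

Under this assumed form, a value of λ close to 1 gives heavy smoothing that reacts slowly to change, while a value close to 0 tracks the instantaneous spectra.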

The noise processing method of this embodiment comprises: performing a Fourier transform on the first sound signal collected by the first microphone to obtain a first target signal; performing a Fourier transform on the second sound signal collected by the second microphone to obtain a second target signal; establishing, according to a preset algorithm, the cross-correlation relationship between the first target signal and the second target signal; determining a cross-correlation function from the cross-correlation relationship; obtaining a first filtering formula and a second filtering formula based on the cross-correlation function; obtaining a target filtering formula from the first filtering formula and the second filtering formula; processing the first target signal based on the target filtering formula to obtain a speech signal in frequency-domain form; and applying an inverse Fourier transform and overlap-add processing to the frequency-domain speech signal to obtain a time-domain speech signal, which serves as the target speech signal. By using two microphones and the cross-correlation function to build a better filter, the method can process noise arriving from different directions. This addresses the limitation that a single microphone can only handle noise from a single direction, and it offers greater practicality and a better processing effect.

As shown in Fig. 2, the noise processing apparatus of this embodiment comprises:

Conversion module 10: configured to perform a Fourier transform on the first sound signal collected by the first microphone to obtain the first target signal;

Determining module 20: configured to establish, according to a preset algorithm, the cross-correlation relationship between the first target signal and the second target signal;

Further configured to determine the cross-correlation function according to the cross-correlation relationship;

Obtaining module 30: configured to obtain the first filtering formula and the second filtering formula based on the cross-correlation function;

Further configured to obtain the target filtering formula based on the first filtering formula and the second filtering formula;

Further configured to process the first target signal based on the target filtering formula to obtain a speech signal in frequency-domain form;

Inverse transform module 40: configured to apply an inverse Fourier transform and overlap-add processing to the frequency-domain speech signal to obtain the corresponding time-domain speech signal, which serves as the target speech signal.

Further, the determining module 20 is specifically configured to:

Obtain the cross-correlation relationship between the first target signal and the second target signal according to the cross-spectral density.

Further, the obtaining module 30 is specifically configured to:

Decompose, transform, and integrate the cross-spectral density of the first sound signal and the second sound signal to obtain an auxiliary filtering formula;

Based on the cross-correlation function, simplify the auxiliary filtering formula and analyze it from multiple angles to obtain the first filtering formula and the second filtering formula.

Further, the noise processing apparatus further comprises:

Capture module: configured to intercept, based on a window function, the first specified sound signal collected by the first microphone to obtain a first sound signal to be processed;

Pass the first sound signal to be processed through a non-recursive filter with an overlap ratio of 75% to obtain the processed first sound signal, which serves as the first sound signal;

Correspondingly, the capture module is further configured to:

Intercept, based on a window function, the second specified sound signal collected by the second microphone to obtain a second sound signal to be processed;

Pass the second sound signal to be processed through a non-recursive filter with an overlap ratio of 75% to obtain the processed second sound signal, which serves as the second sound signal.

Regarding the apparatus in the above embodiment, the specific manner in which each module performs its operations has been described in detail in the embodiment of the method and will not be elaborated here.

It can be understood that the same or similar parts of the above embodiments may refer to one another, and that content not detailed in some embodiments may be found in the same or similar content of other embodiments.

It should be noted that, in the description of the present invention, the terms "first", "second", and so on are used for descriptive purposes only and are not to be understood as indicating or implying relative importance. In addition, in the description of the present invention, unless otherwise specified, "multiple" means at least two.

Any process or method description in a flowchart, or otherwise described herein, may be understood as representing a module, segment, or portion of code that includes one or more executable instructions for implementing specific logical functions or steps of the process. The scope of the preferred embodiments of the present invention includes other implementations in which functions may be performed out of the order shown or discussed, including substantially concurrently or in the reverse order depending on the functions involved, as should be understood by those skilled in the art to which the embodiments of the present invention belong.

It should be understood that each part of the present invention can be implemented in hardware, software, firmware, or a combination thereof. In the above embodiments, multiple steps or methods can be implemented with software or firmware that is stored in memory and executed by a suitable instruction execution system. For example, if implemented in hardware, as in another embodiment, any one or a combination of the following technologies well known in the art can be used: discrete logic circuits with logic gates for implementing logic functions on data signals, application-specific integrated circuits with suitable combinational logic gates, programmable gate arrays (PGA), field-programmable gate arrays (FPGA), and so on.

Those of ordinary skill in the art will understand that all or part of the steps of the above method embodiments can be completed by a program instructing the relevant hardware. The program can be stored in a computer-readable storage medium and, when executed, performs one of the steps of the method embodiments or a combination thereof.

In addition, the functional units in the embodiments of the present invention may be integrated into one processing module, each unit may exist physically on its own, or two or more units may be integrated into one module. The integrated module may be implemented in the form of hardware or in the form of a software functional module. If the integrated module is implemented as a software functional module and is sold or used as an independent product, it may also be stored in a computer-readable storage medium.

The storage medium mentioned above may be a read-only memory, a magnetic disk, an optical disc, or the like.

In the description of this specification, reference to the terms "one embodiment", "some embodiments", "example", "specific example", or "some examples" means that a specific feature, structure, material, or characteristic described in connection with that embodiment or example is included in at least one embodiment or example of the present invention. In this specification, such schematic uses of the above terms do not necessarily refer to the same embodiment or example. Moreover, the specific features, structures, materials, or characteristics described may be combined in any suitable manner in any one or more embodiments or examples.

Although embodiments of the present invention have been shown and described above, it is to be understood that the above embodiments are exemplary and are not to be construed as limiting the present invention. Those of ordinary skill in the art may change, modify, replace, and vary the above embodiments within the scope of the present invention.

Claims

1. A noise processing method, characterized in that the method comprises:

Processing the first target signal based on the target filtering formula to obtain a speech signal in frequency-domain form;

Applying an inverse Fourier transform and overlap-add processing to the frequency-domain speech signal to obtain the corresponding time-domain speech signal, which serves as the target speech signal.

2. The noise processing method according to claim 1, characterized in that establishing, according to a preset algorithm, the cross-correlation relationship between the first target signal and the second target signal comprises:

3. The noise processing method according to claim 2, characterized in that determining the cross-correlation function according to the cross-correlation relationship comprises:

4. The noise processing method according to claim 3, characterized in that obtaining the first filtering formula and the second filtering formula based on the cross-correlation function comprises:

Decomposing, transforming, and integrating the cross-spectral density of the first sound signal and the second sound signal to obtain an auxiliary filtering formula;

Based on the cross-correlation function, simplifying the auxiliary filtering formula and analyzing it from multiple angles to obtain the first filtering formula and the second filtering formula.

5. The noise processing method according to claim 1, characterized in that, after determining the cross-correlation function according to the cross-correlation relationship, the method comprises:

Detecting whether the usage environment is a non-diffuse noise environment;

6. The noise processing method according to claim 1, characterized in that, before the Fourier transform is applied to the first sound signal collected by the first microphone, the method further comprises:

Intercepting, based on a window function, the first specified sound signal collected by the first microphone to obtain a first sound signal to be processed;

Passing the first sound signal to be processed through a non-recursive filter with an overlap ratio of 75% to obtain the processed first sound signal, which serves as the first sound signal;

Correspondingly, before the Fourier transform is applied to the second sound signal collected by the second microphone, the method further comprises:

Intercepting, based on a window function, the second specified sound signal collected by the second microphone to obtain a second sound signal to be processed;

Passing the second sound signal to be processed through a non-recursive filter with an overlap ratio of 75% to obtain the processed second sound signal, which serves as the second sound signal.

7. A noise processing apparatus, characterized in that the apparatus comprises:

Conversion module: configured to perform a Fourier transform on the first sound signal collected by the first microphone to obtain a first target signal;

Further configured to perform a Fourier transform on the second sound signal collected by the second microphone to obtain a second target signal;

Determining module: configured to establish, according to a preset algorithm, the cross-correlation relationship between the first target signal and the second target signal;

Further configured to process the first target signal based on the target filtering formula to obtain a speech signal in frequency-domain form;

Inverse transform module: configured to apply an inverse Fourier transform and overlap-add processing to the frequency-domain speech signal to obtain the corresponding time-domain speech signal, which serves as the target speech signal.

8. The noise processing apparatus according to claim 7, characterized in that the determining module is specifically configured to:

9. The noise processing apparatus according to claim 8, characterized in that the obtaining module is specifically configured to:

10. The noise processing apparatus according to claim 7, characterized in that it further comprises:

Capture module: configured to intercept, based on a window function, the first specified sound signal collected by the first microphone to obtain a first sound signal to be processed;

Correspondingly, the capture module is further configured to: