CN109616137A - Method for processing noise and device - Google Patents
Method and device for processing noise
- Publication number
- CN109616137A (application CN201910080549.9A)
- Authority
- CN
- China
- Prior art keywords
- signal
- cross
- voice signal
- formula
- microphone
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Withdrawn
Classifications
-
- G—PHYSICS
- G10—MUSICAL INSTRUMENTS; ACOUSTICS
- G10L—SPEECH ANALYSIS TECHNIQUES OR SPEECH SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING TECHNIQUES; SPEECH OR AUDIO CODING OR DECODING
- G10L21/00—Speech or voice signal processing techniques to produce another audible or non-audible signal, e.g. visual or tactile, in order to modify its quality or its intelligibility
- G10L21/02—Speech enhancement, e.g. noise reduction or echo cancellation
- G10L21/0208—Noise filtering
- G10L21/0216—Noise filtering characterised by the method used for estimating noise
- G10L2021/02161—Number of inputs available containing the signal or the noise to be suppressed
- G10L2021/02165—Two microphones, one receiving mainly the noise signal and the other one mainly the speech signal
Landscapes
- Engineering & Computer Science (AREA)
- Computational Linguistics (AREA)
- Quality & Reliability (AREA)
- Signal Processing (AREA)
- Health & Medical Sciences (AREA)
- Audiology, Speech & Language Pathology (AREA)
- Human Computer Interaction (AREA)
- Physics & Mathematics (AREA)
- Acoustics & Sound (AREA)
- Multimedia (AREA)
- Circuit For Audible Band Transducer (AREA)
Abstract
The present invention relates to a noise processing method and device. The method comprises: performing Fourier transform processing on a first sound signal collected by a first microphone to obtain a first target signal; performing Fourier transform processing on a second sound signal collected by a second microphone to obtain a second target signal; establishing, according to a preset algorithm, a cross-correlation relationship between the first target signal and the second target signal; determining a cross-correlation function according to the cross-correlation relationship; obtaining a first filtering formula and a second filtering formula based on the cross-correlation function; obtaining a target filtering formula based on the first filtering formula and the second filtering formula; processing the first target signal based on the target filtering formula to obtain a speech signal in frequency-domain form; and performing inverse Fourier transform and overlap-add processing on the frequency-domain speech signal to obtain a speech signal in time-domain form, which serves as the target speech signal. A cleaner speech signal is thereby obtained and the noise is handled more effectively.
Description
Technical field
The present invention relates to the technical field of noise processing, and in particular to a noise processing method and device.
Background
Noise is ubiquitous in daily life and work. Heavy machinery, automobiles and other vehicles all generate noise, which makes the environments in which people work and live very noisy. To obtain cleaner speech signals, various noise cancellation methods have been designed to improve speech quality.
Most current noise cancellation methods rely on a single microphone. However, in places with heavy noise, such as busy downtown areas and stations, single-microphone noise cancellation cannot adequately remove the noise from the speech signal.
Summary of the invention
In view of this, the purpose of the present invention is to provide a noise processing method and device that handle noise more effectively by using dual microphones.
To achieve the above purpose, the present invention adopts the following technical solution:
A noise processing method, the method comprising:
performing Fourier transform processing on a first sound signal collected by a first microphone to obtain a first target signal;
performing Fourier transform processing on a second sound signal collected by a second microphone to obtain a second target signal;
establishing, according to a preset algorithm, a cross-correlation relationship between the first target signal and the second target signal;
determining a cross-correlation function according to the cross-correlation relationship;
obtaining a first filtering formula and a second filtering formula based on the cross-correlation function;
obtaining a target filtering formula based on the first filtering formula and the second filtering formula;
processing the first target signal based on the target filtering formula to obtain a speech signal in frequency-domain form;
performing inverse Fourier transform and overlap-add processing on the speech signal in frequency-domain form to obtain the corresponding speech signal in time-domain form, which serves as the target speech signal.
Optionally, establishing the cross-correlation relationship between the first target signal and the second target signal according to the preset algorithm comprises:
obtaining the cross-correlation relationship between the first target signal and the second target signal according to the cross-power spectral density.
Optionally, determining the cross-correlation function according to the cross-correlation relationship comprises:
determining the cross-correlation function according to the cross-correlation relationship and the auto-power spectral density.
Optionally, obtaining the first filtering formula and the second filtering formula based on the cross-correlation function comprises:
decomposing, transforming and combining the cross-power spectral density of the first sound signal and the second sound signal to obtain an auxiliary filtering formula;
simplifying the auxiliary filtering formula and analyzing it from multiple angles, based on the cross-correlation function, to obtain the first filtering formula and the second filtering formula.
Optionally, after determining the cross-correlation function according to the cross-correlation relationship, the method comprises:
determining the use environment corresponding to the first microphone and the second microphone;
detecting whether the use environment is a non-diffuse noise environment;
if so, using the simplified function corresponding to the cross-correlation function as the cross-correlation function.
Optionally, before performing Fourier transform processing on the first sound signal collected by the first microphone, the method further comprises:
intercepting, based on a window function, a specified first sound signal collected by the first microphone to obtain a first sound signal to be processed;
processing the first sound signal to be processed with a non-recursive filter with an overlap ratio of 75% to obtain a processed first sound signal, which serves as the first sound signal.
Correspondingly, before performing Fourier transform processing on the second sound signal collected by the second microphone, the method further comprises:
intercepting, based on a window function, a specified second sound signal collected by the second microphone to obtain a second sound signal to be processed;
processing the second sound signal to be processed with a non-recursive filter with an overlap ratio of 75% to obtain a processed second sound signal, which serves as the second sound signal.
A noise processing device, the device comprising:
a conversion module, configured to perform Fourier transform processing on a first sound signal collected by a first microphone to obtain a first target signal,
and further configured to perform Fourier transform processing on a second sound signal collected by a second microphone to obtain a second target signal;
a determining module, configured to establish, according to a preset algorithm, a cross-correlation relationship between the first target signal and the second target signal,
and further configured to determine a cross-correlation function according to the cross-correlation relationship;
an obtaining module, configured to obtain a first filtering formula and a second filtering formula based on the cross-correlation function,
further configured to obtain a target filtering formula based on the first filtering formula and the second filtering formula,
and further configured to process the first target signal based on the target filtering formula to obtain a speech signal in frequency-domain form;
an inverse transform module, configured to perform inverse Fourier transform and overlap-add processing on the speech signal in frequency-domain form to obtain the corresponding speech signal in time-domain form, which serves as the target speech signal.
Optionally, the determining module is specifically configured to:
obtain the cross-correlation relationship between the first target signal and the second target signal according to the cross-power spectral density.
Optionally, the obtaining module is specifically configured to:
decompose, transform and combine the cross-power spectral density of the first sound signal and the second sound signal to obtain an auxiliary filtering formula;
simplify the auxiliary filtering formula and analyze it from multiple angles, based on the cross-correlation function, to obtain the first filtering formula and the second filtering formula.
Optionally, the noise processing device further comprises:
an intercepting module, configured to intercept, based on a window function, a specified first sound signal collected by the first microphone to obtain a first sound signal to be processed,
and to process the first sound signal to be processed with a non-recursive filter with an overlap ratio of 75% to obtain a processed first sound signal, which serves as the first sound signal.
Correspondingly, the intercepting module is further configured to:
intercept, based on a window function, a specified second sound signal collected by the second microphone to obtain a second sound signal to be processed;
process the second sound signal to be processed with a non-recursive filter with an overlap ratio of 75% to obtain a processed second sound signal, which serves as the second sound signal.
The noise processing method adopted by the present invention comprises: performing Fourier transform processing on a first sound signal collected by a first microphone to obtain a first target signal; performing Fourier transform processing on a second sound signal collected by a second microphone to obtain a second target signal; establishing, according to a preset algorithm, a cross-correlation relationship between the first target signal and the second target signal; determining a cross-correlation function from the cross-correlation relationship; obtaining a first filtering formula and a second filtering formula based on the cross-correlation function; processing the first filtering formula and the second filtering formula to obtain a target filtering formula; processing the first target signal based on the target filtering formula to obtain a speech signal in frequency-domain form; and performing inverse Fourier transform and overlap-add processing on the frequency-domain speech signal to obtain a speech signal in time-domain form, which serves as the target speech signal. By providing two microphones and using the cross-correlation function to build a better-optimized filter, noise arriving from different directions can be handled fully and the processing effect is ensured. This solves the problem that a single microphone can only handle noise from a single direction, giving the method greater practicality and a better processing effect.
Brief description of the drawings
To explain the technical solutions of the embodiments of the present invention or of the prior art more clearly, the drawings needed for describing the embodiments or the prior art are briefly introduced below. Obviously, the drawings described below show only some embodiments of the present invention; those of ordinary skill in the art can obtain other drawings from them without creative effort.
Fig. 1 is a flow chart of a noise processing method provided by an embodiment of the present invention.
Fig. 2 is a schematic structural diagram of a noise processing device provided by an embodiment of the present invention.
Detailed description of the embodiments
To make the purpose, technical solutions and advantages of the present invention clearer, the technical solutions of the present invention are described in detail below. Obviously, the described embodiments are only some of the embodiments of the present invention, not all of them. All other embodiments obtained by those of ordinary skill in the art on the basis of these embodiments without creative effort fall within the scope of protection of the present invention.
Fig. 1 is a flow chart of a noise processing method provided by an embodiment of the present invention.
As shown in Fig. 1, the noise processing method of this embodiment comprises the following steps:
S11: perform Fourier transform processing on the first sound signal collected by the first microphone to obtain the first target signal.
S12: perform Fourier transform processing on the second sound signal collected by the second microphone to obtain the second target signal.
In theory, the signal collected by a microphone can be divided into a clean speech component and a noise component, which are uncorrelated with each other. The noise is regarded as a useless or interfering signal. Both additive and non-additive noise occur in practice; this embodiment takes additive noise as an example. Let the clean target signal be s(n), the additive noise be d(n) and the noisy speech be x(n). The frequency-domain distributions of speech and noise generally differ greatly, so the two are uncorrelated, which can be expressed as x(n) = s(n) + d(n) and E[s(n)d(n)] = 0. Applying the Fourier transform (FFT), the frequency-domain expression is X(ω) = S(ω) + D(ω).
The signal-to-noise ratio (SNR) is generally used as a reference for how much noise is mixed into x(n). The SNR is defined as:
$$\mathrm{SNR} = \frac{\sum_n s^2(n)}{\sum_n d^2(n)}$$
where the numerator is the energy of the signal and the denominator is the energy of the noise. The SNR thus refers to the ratio of signal to noise, so knowing any two of the clean speech, the noise signal and the total signal is enough to calculate it.
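The additive model and the SNR can be illustrated with a short NumPy sketch. This is only an illustration under stated assumptions: the SNR is taken as a linear energy ratio, a 440 Hz sine stands in for clean speech, and the function name `snr_ratio` is hypothetical.

```python
import numpy as np

def snr_ratio(clean, noise):
    """Linear energy ratio of clean signal to noise (assumed form, not dB)."""
    return np.sum(clean ** 2) / np.sum(noise ** 2)

rng = np.random.default_rng(0)
n = np.arange(8000)
s = np.sin(2 * np.pi * 440 * n / 8000)     # stand-in for clean speech s(n)
d = 0.1 * rng.standard_normal(n.size)      # additive noise d(n)
x = s + d                                  # noisy speech x(n) = s(n) + d(n)

snr = snr_ratio(s, d)
snr_from_total = snr_ratio(s, x - s)       # any two of s, d, x are enough
snr_db = 10 * np.log10(snr)                # the same ratio expressed in dB

# The Fourier transform is linear, so X(w) = S(w) + D(w).
fft_is_additive = np.allclose(np.fft.fft(x), np.fft.fft(s) + np.fft.fft(d))
```

Recovering the noise as `x - s` shows why two of the three signals suffice to compute the ratio.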
The speech signal is pre-processed before the correlation function or the power spectrum is calculated. First, polynomial least-squares fitting is used to eliminate trend-term error and the DC component; an amplifier, for example, drifts as the temperature changes. Second, a digital filter is used to remove excess noise; pre-processing generally removes the significant background noise from the noisy speech, and either a recursive (IIR) filter or a non-recursive (FIR) filter can be used. For example, when a computer's sound card and microphone collect a speech signal, the 50 Hz mains hum in the equipment forms unwanted interference with the AC speech signal, so the 50 Hz mains hum must be filtered out.
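The two pre-processing steps can be sketched as follows. The polynomial order, the 401-tap windowed-sinc high-pass design and the 100 Hz cutoff are illustrative assumptions and are not taken from the patent; the function names are hypothetical.

```python
import numpy as np

def detrend_poly(x, order=1):
    """Remove a least-squares polynomial trend (includes the DC component)."""
    n = np.arange(x.size)
    coeffs = np.polyfit(n, x, order)
    return x - np.polyval(coeffs, n)

def fir_highpass(num_taps=401, cutoff_hz=100.0, fs=8000.0):
    """Non-recursive (FIR) high-pass: delta minus a windowed-sinc low-pass."""
    m = np.arange(num_taps) - (num_taps - 1) / 2
    fc = cutoff_hz / fs                          # cutoff in cycles per sample
    lowpass = 2 * fc * np.sinc(2 * fc * m) * np.hamming(num_taps)
    lowpass /= lowpass.sum()                     # unity gain at DC
    highpass = -lowpass
    highpass[(num_taps - 1) // 2] += 1.0         # spectral inversion
    return highpass

def rms(v):
    return np.sqrt(np.mean(v ** 2))

fs = 8000.0
t = np.arange(int(fs)) / fs
hum = np.sin(2 * np.pi * 50 * t)                 # 50 Hz mains hum
tone = np.sin(2 * np.pi * 1000 * t)              # stand-in for speech energy
drift = 0.5 + 0.2 * t                            # DC offset plus slow drift

h = fir_highpass()
mid = slice(400, -400)                           # skip filter edge transients
hum_gain = rms(np.convolve(hum, h, mode="same")[mid]) / rms(hum[mid])
tone_gain = rms(np.convolve(tone, h, mode="same")[mid]) / rms(tone[mid])
residual_mean = np.mean(detrend_poly(tone + drift))
```

A narrow notch at 50 Hz would preserve more of the low band; the high-pass is used here only because it is short to write with NumPy alone.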
To provide a reference for comparison when studying the dual-microphone speech enhancement method based on the cross-correlation function, the Wiener filtering algorithm is introduced below.
The basic idea of the Wiener filtering algorithm is to analyze the time-domain and frequency-domain information of the target signal component and the noise component, construct a filter empirically, and repeatedly fit the processed noisy speech signal to the clean signal by the least-squares method, finally obtaining the closest signal and thereby achieving filtering.
When the microphone receives the signal x(n), a filter h(n) is posited in advance, although its specific form is not yet known. Passing x(n) through the filter gives a first estimate ŝ(n) of the target speech:
$$\hat{s}(n) = h(n) * x(n)$$
where * denotes convolution. The least-squares method is then used to drive the mean-square error between ŝ(n) and the clean speech s(n) toward its ideal value. Here the target signal and the noise signal are uncorrelated, i.e. E[s(n)d(n)] = 0. Taking the Fourier transform and solving for the filter gives:
$$H(k) = \frac{P_s(k)}{P_s(k) + P_d(k)}$$
where P_s(k) is the power spectral density of the clean speech and P_d(k) is the power spectral density of the noise signal. The resulting H(k) is the frequency-domain form of the filter. The Wiener filter designed for the speech signal thus separates out the noise component, and its output is compared with the ideal result and corrected continually.
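As a minimal sketch, the Wiener gain can be computed per frequency bin from the two power spectral densities. The standard form H = P_s / (P_s + P_d) is assumed here, the small `eps` guard is an addition for numerical safety, and the function name is hypothetical.

```python
import numpy as np

def wiener_gain(p_s, p_d, eps=1e-12):
    """Per-bin Wiener gain from clean-speech and noise PSDs (assumed form)."""
    return p_s / (p_s + p_d + eps)

p_s = np.array([3.0, 1.0, 0.0])    # illustrative clean-speech PSD values
p_d = np.array([1.0, 1.0, 1.0])    # illustrative noise PSD values
H = wiener_gain(p_s, p_d)

# Applying the gain to a noisy spectrum X(k) gives the speech estimate.
X = np.array([2.0 + 1.0j, 1.0 - 1.0j, 0.5 + 0.0j])
S_hat = H * X
```

The gain stays between 0 and 1: bins dominated by speech pass almost unchanged, and bins with no speech energy are suppressed.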
Compared with a single microphone, a dual microphone is not simply a matter of adding one more microphone to receive the signal. From a spatial point of view, two microphones allow two-dimensional position analysis, so the target speech signal and the noise signal can be clearly distinguished. Dual-microphone speech enhancement uses two microphones to obtain information in two-dimensional space and to detect the position of the speech signal, and it can be combined with a variety of single-microphone speech enhancement techniques, which greatly improves flexibility and programmability. Dual microphones are also comparatively flexible and easy to operate, and offer better cost performance than a microphone array.
Therefore, the first sound signal and the second sound signal are collected by the first microphone and the second microphone respectively. To remove noise more effectively, they are converted by Fourier transform into frequency-domain signals for processing, which gives a better result; this yields the first target signal and the second target signal.
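The framing, Fourier transform and later overlap-add reconstruction can be sketched as below. The 75% overlap follows the text; the periodic Hann window, the 256-sample frame length and the function names `stft` and `istft` are assumptions made for illustration.

```python
import numpy as np

def stft(x, frame_len=256, hop=64):
    """Windowed frames with 75% overlap (hop = frame_len / 4), then FFT."""
    win = 0.5 - 0.5 * np.cos(2 * np.pi * np.arange(frame_len) / frame_len)
    n_frames = 1 + (len(x) - frame_len) // hop
    frames = np.stack([x[i * hop:i * hop + frame_len] * win
                       for i in range(n_frames)])
    return np.fft.rfft(frames, axis=1), win

def istft(spec, win, hop=64):
    """Inverse FFT of each frame followed by overlap-add."""
    frame_len = len(win)
    n_frames = spec.shape[0]
    out = np.zeros((n_frames - 1) * hop + frame_len)
    wsum = np.zeros_like(out)
    frames = np.fft.irfft(spec, n=frame_len, axis=1)
    for i in range(n_frames):
        out[i * hop:i * hop + frame_len] += frames[i]
        wsum[i * hop:i * hop + frame_len] += win
    # Normalize by the summed window so interior samples are recovered.
    return out / np.maximum(wsum, 1e-12)

rng = np.random.default_rng(4)
x = rng.standard_normal(4096)
spec, win = stft(x)
y = istft(spec, win)
```

With no gain applied between the two transforms, the interior of `y` matches `x`; in the method, the target filtering formula would multiply `spec` before the inverse step.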
S13: establish, according to a preset algorithm, the cross-correlation relationship between the first target signal and the second target signal.
For example, the cross-correlation relationship between the first target signal and the second target signal can be obtained from the cross-power spectral density.
If x(n) has finite energy in the time domain and satisfies the Dirichlet conditions, its autocorrelation function is defined as:
$$r_x(m) = \sum_{n=-\infty}^{\infty} x(n)\,x(n+m)$$
If the signal does not have finite energy in the time domain, as with some random or periodic discrete signals, it is a power signal, and its autocorrelation function is defined as:
$$r_x(m) = \lim_{N\to\infty} \frac{1}{2N+1} \sum_{n=-N}^{N} x(n)\,x(n+m)$$
The autocorrelation function measures how well a signal matches, and correlates with, itself in the time domain.
If x(n) and y(n) have finite energy in the time domain and satisfy the Dirichlet conditions, their cross-correlation function is defined as:
$$r_{xy}(m) = \sum_{n=-\infty}^{\infty} x(n)\,y(n+m)$$
If the signals do not have finite energy in the time domain, they are power signals, and the cross-correlation function is defined as:
$$r_{xy}(m) = \lim_{N\to\infty} \frac{1}{2N+1} \sum_{n=-N}^{N} x(n)\,y(n+m)$$
The cross-correlation function measures how well two signals match in the time domain. It measures the degree of correlation between two random processes x(n) and y(n) at any two different moments, and is used to extract the speech component.
In physical terms, both correlation functions describe the correlation of signals, that is, they measure similarity. The autocorrelation function measures the similarity of a signal to itself; if the signal contains a periodic component, this shows up clearly as peaks in the autocorrelation function. The cross-correlation function measures the similarity between two signals; likewise, its peaks reveal periodic components that the two signals share.
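A short NumPy sketch shows how the peak of the cross-correlation reveals the relationship between two signals. The 5-sample delay and the white-noise test signal are illustrative assumptions; `np.correlate(y, x, mode="full")` evaluates the sum of x(n) y(n+m) over all lags m.

```python
import numpy as np

rng = np.random.default_rng(1)
x = rng.standard_normal(1000)
delay = 5
y = np.concatenate([np.zeros(delay), x[:-delay]])   # y(n) = x(n - delay)

# Cross-correlation r_xy(m) = sum_n x(n) y(n + m) for every lag m.
r_xy = np.correlate(y, x, mode="full")
lags = np.arange(-(x.size - 1), y.size)
best_lag = lags[np.argmax(r_xy)]                    # lag of the strongest match

# Autocorrelation: a signal matches itself best at zero lag.
r_xx = np.correlate(x, x, mode="full")
zero_lag_is_peak = np.argmax(r_xx) == x.size - 1
```

The recovered `best_lag` equals the delay that was inserted, which is the time-domain counterpart of the phase information carried by the cross-power spectral density.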
A signal often represents a random process, and from a statistical point of view the power spectral density is a statistical average taken over all realizations. If a signal is an energy signal, its spectral density is the Fourier transform of its autocorrelation function; such a signal is absolutely integrable, so the Fourier transform can be applied and the distribution of power over frequency is easily obtained. If a signal is a power signal, its Fourier transform does not exist. In short, a power signal is a random process: if the Fourier transform is applied to it and the power spectrum is computed, the power spectrum of the same signal differs greatly from one time to another, so no single, unified description of the power characteristics of a non-energy signal can be found in this way. Put another way, a signal whose parameters vary with time cannot be transformed directly by the Fourier transform.
From the definition of the power of a random signal, the mean power can be obtained in two ways: the first is to find the power spectral density and integrate it; the second is to find the mean-square value and average it over time. The spectrum of a random signal is itself random and varies over time, so each computed spectrum is different; the amplitude spectrum and phase spectrum also differ every time. The power spectrum, however, is fixed. The power spectrum here is defined as a statistical average and is not simply the square of the spectrum; because of the additional statistical averaging, its value is deterministic. The mean-square value is a time-domain statistical average and the power spectral density is a frequency-domain statistical average. A random signal therefore has a definite power spectrum whose value is fixed, and this better reflects the characteristics of the random signal.
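The two routes to the mean power, and the stabilizing effect of statistical averaging, can be checked numerically. The sketch below assumes white Gaussian noise and uses averaging over 64 frames as a stand-in for the statistical average; these choices are illustrative.

```python
import numpy as np

rng = np.random.default_rng(2)
frame_len, n_frames = 256, 64
x = rng.standard_normal(frame_len * n_frames)
frames = x.reshape(n_frames, frame_len)

# Periodogram of each frame: |X(k)|^2 / L.
periodograms = np.abs(np.fft.fft(frames, axis=1)) ** 2 / frame_len
psd_single = periodograms[0]            # one realization: fluctuates strongly
psd_avg = periodograms.mean(axis=0)     # averaged: close to a fixed curve

# Mean power obtained two ways (equal by Parseval's theorem).
power_time = np.mean(x ** 2)            # mean-square value in the time domain
power_freq = psd_avg.mean()             # PSD averaged over frequency bins

# Relative fluctuation across bins, before and after averaging.
spread_single = psd_single.std() / psd_single.mean()
spread_avg = psd_avg.std() / psd_avg.mean()
```

The single-frame estimate varies by roughly its own mean, while the averaged estimate is far flatter, which is the sense in which the power spectrum of a random signal is fixed.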
For example, in an enclosed noisy space where the noise signal and the target speech come from different directions and can therefore be separated, the microphone signals are expressed as:
$$y_i(m) = x_i(m) + n_i(m) \tag{8}$$
where i indexes the framed microphone signals (i = 1 denotes the first microphone and i = 2 the second microphone), m is the sample index, x_i(m) is the target speech and n_i(m) is the noise component. After the discrete Fourier transform (DFT) is applied to both signals, the dual-microphone signals are transformed to the frequency domain:
$$Y_i(\omega, k) = X_i(\omega, k) + N_i(\omega, k) \tag{9}$$
where k is the index of the sampled frame and ω = 2πl/L denotes the angular frequency.
Here P_y1y2(ω, k) denotes the cross-power spectral density of the two microphone signals, P_y1y1(ω, k) the auto-power spectral density of the first microphone and P_y2y2(ω, k) the auto-power spectral density of the second microphone. The relationship between the signals obtained in this way is the cross-correlation relationship between the first target signal and the second target signal.
S14: determine the cross-correlation function according to the cross-correlation relationship.
For example, the cross-correlation function can be determined from the cross-correlation relationship and the auto-power spectral densities.
The cross-power spectral density is:
$$P_{y_1y_2}(\omega, k) = E\left[Y_1(\omega, k)\,Y_2^{*}(\omega, k)\right] \tag{10}$$
In the same way as (10), the auto-power spectral density can be defined; formula (11) covers both microphones:
$$P_{y_iy_i}(\omega, k) = E\left[|Y_i(\omega, k)|^2\right] \tag{11}$$
From (10) and (11), the cross-correlation function of the first target signal and the second target signal is obtained:
$$\Gamma_{y_1y_2}(\omega, k) = \frac{P_{y_1y_2}(\omega, k)}{\sqrt{P_{y_1y_1}(\omega, k)\,P_{y_2y_2}(\omega, k)}} \tag{12}$$
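A sketch of how the normalized cross-correlation function might be estimated from two sets of frame spectra follows. Averaging over frames stands in for the expectation E[·], and the normalization by the two auto-power spectral densities is an assumption consistent with the surrounding text; the function name and the random test spectra are hypothetical.

```python
import numpy as np

def coherence(Y1, Y2, eps=1e-12):
    """Cross-PSD normalized by both auto-PSDs, per frequency bin.

    Y1, Y2: complex arrays of shape (n_frames, n_bins).
    """
    p12 = np.mean(Y1 * np.conj(Y2), axis=0)      # cross-power spectral density
    p11 = np.mean(np.abs(Y1) ** 2, axis=0)       # auto-PSD, first microphone
    p22 = np.mean(np.abs(Y2) ** 2, axis=0)       # auto-PSD, second microphone
    return p12 / np.sqrt(p11 * p22 + eps)

rng = np.random.default_rng(3)
shape = (200, 65)
A = rng.standard_normal(shape) + 1j * rng.standard_normal(shape)
B = rng.standard_normal(shape) + 1j * rng.standard_normal(shape)

# Same source at both microphones with a fixed phase shift: magnitude 1.
gamma_same = coherence(A, A * np.exp(-1j * 0.3))
# Independent signals at the two microphones: magnitude near 0.
gamma_indep = coherence(A, B)
```

In practice a recursive (exponentially smoothed) average over frames is often used instead of a block mean so that the estimate can track changes over time.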
For example, the use environment corresponding to the first microphone and the second microphone can be determined, and it can be detected whether the use environment is a non-diffuse noise environment; if so, the simplified function corresponding to the cross-correlation function is used as the cross-correlation function.
Assume that the algorithm is applied in a smart device such as a mobile phone. The spacing of the dual microphones in a mobile phone is measured in millimetres, and the closer the microphones are, the more correlated the noise signals they receive. In an actual non-diffuse noise environment, the cross-correlation function can be simplified as:
$$\Gamma_{n_1n_2}(\omega) = \operatorname{sinc}\!\left(\frac{\omega f_s d}{c}\right) \tag{13}$$
where d is the distance between the microphones, c is the speed of sound, sinc is the sinc function, ω is the angular frequency and f_s is the sampling frequency. To construct the required mathematical model, the signal source is specified to lie directly in front of the dual microphones, and the horizontal direction is the fixed position of the target speaker. The noise source lies at an angle θ to this direction, and different application environments can be simulated by changing θ. The horizontal distance between the microphones and the speaker is 2 m, while the spacing of the dual microphones is only 20 mm. The cross-correlation function based on this mobile-phone communication model is then:
$$\Gamma_{n_1n_2}(\omega) = e^{\,j\omega f_s (d/c)\cos\theta} \tag{14}$$
Different cross-correlation functions are thus determined for different use environments, so each case is analyzed on its own terms and the method applies more widely.
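The two model functions can be evaluated numerically as below. The sinc form is an assumed reading of the simplified function, the exponential form follows formula (14), ω is taken as normalized angular frequency in radians per sample, and the function names are hypothetical. The 20 mm spacing and 8 kHz sampling frequency are the values used in the text.

```python
import numpy as np

C_SOUND = 340.0   # speed of sound in m/s, as used in the text

def coherence_sinc(omega, fs, d, c=C_SOUND):
    """Sinc-form simplified function, sin(a)/a with a = omega*fs*d/c (assumed)."""
    arg = omega * fs * d / c
    return np.sinc(arg / np.pi)          # np.sinc(x) = sin(pi x) / (pi x)

def coherence_directional(omega, fs, d, theta, c=C_SOUND):
    """Directional form of formula (14) for a source at angle theta."""
    return np.exp(1j * omega * fs * (d / c) * np.cos(theta))

omega = np.linspace(0, np.pi, 129)       # normalized angular frequency
fs, d = 8000.0, 0.020                    # 8 kHz sampling, 20 mm spacing

g_sinc = coherence_sinc(omega, fs, d)
g_dir = coherence_directional(omega, fs, d, theta=np.pi)
g_broadside = coherence_directional(omega, fs, d, theta=np.pi / 2)
```

The directional form always has unit magnitude and only its phase depends on θ, while the sinc form is real and decays with frequency; at θ = π/2 the directional form reduces to 1.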
S15: obtain the first filtering formula and the second filtering formula based on the cross-correlation function.
Further, this step comprises: decomposing, transforming and combining the cross-power spectral density of the first sound signal and the second sound signal to obtain an auxiliary filtering formula; and, based on the cross-correlation function, simplifying the auxiliary filtering formula and analyzing it from multiple angles to obtain the first filtering formula and the second filtering formula.
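From the above analysis, the noise and the speech signal are uncorrelated, whereas the noise signals at the two microphones are correlated with each other, as are the target speech signals. The cross-power spectral density of the first sound signal and the second sound signal can therefore be decomposed, as expressed by formula (15):
$$P_{y_1y_2}(\omega, k) = P_{x_1x_2}(\omega, k) + P_{n_1n_2}(\omega, k) \tag{15}$$
Dividing both sides of (15) by $\sqrt{P_{y_1y_1}P_{y_2y_2}}$, the left side becomes the cross-correlation function of the received dual-microphone signals, and the right side is transformed correspondingly:
$$\Gamma_{y_1y_2} = \Gamma_{x_1x_2}\sqrt{\frac{P_{x_1x_1}P_{x_2x_2}}{P_{y_1y_1}P_{y_2y_2}}} + \Gamma_{n_1n_2}\sqrt{\frac{P_{n_1n_1}P_{n_2n_2}}{P_{y_1y_1}P_{y_2y_2}}} \tag{16}$$
Substituting the expression for the speech signal, under which $P_{y_iy_i} = P_{x_ix_i} + P_{n_in_i}$, into formula (16) simplifies it further:
$$\Gamma_{y_1y_2} = \Gamma_{x_1x_2}\sqrt{\frac{P_{x_1x_1}}{P_{x_1x_1}+P_{n_1n_1}}\cdot\frac{P_{x_2x_2}}{P_{x_2x_2}+P_{n_2n_2}}} + \Gamma_{n_1n_2}\sqrt{\frac{P_{n_1n_1}}{P_{x_1x_1}+P_{n_1n_1}}\cdot\frac{P_{n_2n_2}}{P_{x_2x_2}+P_{n_2n_2}}} \tag{17}$$
Introducing the SNR defined above, with $\mathrm{SNR}_i = P_{x_ix_i}/P_{n_in_i}$, gives:
$$\Gamma_{y_1y_2} = \Gamma_{x_1x_2}\sqrt{\frac{\mathrm{SNR}_1}{1+\mathrm{SNR}_1}\cdot\frac{\mathrm{SNR}_2}{1+\mathrm{SNR}_2}} + \Gamma_{n_1n_2}\sqrt{\frac{1}{1+\mathrm{SNR}_1}\cdot\frac{1}{1+\mathrm{SNR}_2}} \tag{18}$$
Because the two microphones are very close together, the SNRs at the two microphones satisfy SNR1 ≈ SNR2 = SNR. Let τ = f_s d/c, where f_s is the sampling frequency, c is 340 m/s (the speed of sound) and d is the microphone spacing. Substituting formula (14) into formula (18), with the target speech arriving from directly in front (θ = 0) and the noise at angle θ, yields the auxiliary filtering formula (19):
$$\Gamma_{y_1y_2} \approx \left[\cos(\omega\tau) + j\sin(\omega\tau)\right]\frac{\mathrm{SNR}}{1+\mathrm{SNR}} + \left[\cos(\omega\tau\cos\theta) + j\sin(\omega\tau\cos\theta)\right]\frac{1}{1+\mathrm{SNR}} \tag{19}$$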
This gives the auxiliary filtering formula for the two microphones. The angle θ is then treated as a variable: formula (19) takes different values at different angles, and analyzing these values mathematically provides the basis for constructing the first filtering formula and the second filtering formula. To make the values calculated by the two filtering formulas more accurate, auxiliary filtering formula (19) is first simplified according to the angle, and the first filtering formula and the second filtering formula are then derived from it.
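The behaviour of the auxiliary filtering formula at different angles can be explored numerically. The sketch below assumes the model form in which the target speech arrives from θ = 0, the noise arrives from angle θ, both follow formula (14), and the two terms are weighted by SNR/(1+SNR) and 1/(1+SNR); the function name and default parameters are hypothetical.

```python
import numpy as np

def auxiliary_coherence(omega, snr, theta, fs=8000.0, d=0.020, c=340.0):
    """Model coherence of the noisy dual-microphone signal (assumed form).

    omega: normalized angular frequency (rad/sample)
    snr:   linear signal-to-noise ratio, assumed equal at both microphones
    theta: noise direction in radians; the target is assumed at theta = 0
    """
    tau = fs * d / c                                   # about 0.47 here
    speech = np.exp(1j * omega * tau)                  # target term
    noise = np.exp(1j * omega * tau * np.cos(theta))   # noise term
    return speech * snr / (1 + snr) + noise / (1 + snr)

omega = np.pi / 2
noise_only = auxiliary_coherence(omega, snr=0.0, theta=np.pi / 2)
speech_dominant = auxiliary_coherence(omega, snr=1e9, theta=np.pi / 2)
noise_behind = auxiliary_coherence(omega, snr=0.01, theta=np.pi)
```

With noise at θ = π/2 and no speech the value is purely real and equal to 1, while for noise at θ = π and low SNR the imaginary part turns negative, which matches the case analysis that the text goes on to make.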
(1) When θ = π/2, cos θ = 0, so the second term is zero and the auxiliary filter formula simplifies to formula (20).
If the target speech signal is absent, the SNR is zero and the auxiliary filter formula has no imaginary part; an imaginary part can appear only when target speech is present. When the SNR is low, the auxiliary filter formula reduces to approximately 1/(1 + SNR), and the received signal is almost entirely noise. The real part of the auxiliary filter formula can therefore be used to indicate how large a share of the signal the noise component occupies, as expressed in formula (21).
Note that when the speech frequency is low, the cos(ωτ) term in the formula is close to 1, so the speech signal, which shows up only in the imaginary part, is easily mistaken for a noise component. Filtering out the whole signal in that case would cause serious distortion. The low-frequency band must therefore be handled separately and in advance so that low-frequency content is retained. For a sampling rate of 8 kHz, the angular-frequency threshold is calculated to be π/4. In short, when the first filtering formula and the second filtering formula are constructed, a modification factor is added at the appropriate place to preserve the low-frequency band, which gives the first filtering formula:
$$G_1(\omega, k) = 1 - \left|\Re\left[\Gamma_{y_1 y_2}(\omega, k)\right]\right|^{P(\omega)} \tag{22}$$
Here P(ω) is the modification factor: its value is 2 when |ω| is greater than π/4 and 8 when |ω| is less than π/4.
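A minimal sketch of the first filtering formula for one frequency bin is given below. It reads P(ω) as an exponent on the magnitude of the real part of the cross-correlation function, which is an assumption about the flattened formula (22); a multiplicative reading would drive the gain negative for real parts near 1. The text does not say which value applies exactly at |ω| = π/4, so assigning that boundary to the low band is also an assumption, and the function names are hypothetical.

```python
import math

def modification_factor(omega: float) -> float:
    """P(omega): 2 above the pi/4 threshold, 8 below it.

    The boundary |omega| == pi/4 is assigned to the low band (assumption).
    """
    return 2.0 if abs(omega) > math.pi / 4 else 8.0

def first_filter_gain(gamma: complex, omega: float) -> float:
    """G1 = 1 - |Re(gamma)| ** P(omega), with P read as an exponent (assumption)."""
    return 1.0 - abs(gamma.real) ** modification_factor(omega)

gamma = complex(0.9, 0.1)
low_band_gain = first_filter_gain(gamma, 0.1)    # |omega| < pi/4, so P = 8
high_band_gain = first_filter_gain(gamma, 1.0)   # |omega| > pi/4, so P = 2
```

For the same cross-correlation value, the larger exponent in the low band produces a larger gain, which matches the stated goal of retaining low-frequency content.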
(2) When π/2 < θ ≤ π, the imaginary part of the auxiliary filter formula follows from formula (18), as given in formula (23).
When the SNR is small, the imaginary part of the auxiliary filter formula is sin(ωτ cos θ), where the noise lies at an angle π/2 < θ ≤ π. With a microphone spacing of 20 mm and a sampling frequency of 8 kHz, τ is a constant smaller than 1. Since cos θ is negative in this range, sin(ωτ cos θ) is negative, and because the first term of the imaginary part is approximately zero, the imaginary part as a whole is negative. In this case most of the signal can be regarded as noise.
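The numbers in this case can be checked with a short calculation. The sketch assumes the delay in samples is τ = f_s·d/c, which is the reading consistent with the statement that τ is below 1 for a 20 mm spacing at 8 kHz; the helper names are hypothetical.

```python
import math

SPEED_OF_SOUND = 340.0  # m/s, the value given in the text

def delay_in_samples(fs_hz: float, spacing_m: float) -> float:
    """Inter-microphone delay in samples, assuming tau = fs * d / c."""
    return fs_hz * spacing_m / SPEED_OF_SOUND

def noise_imag_part(omega: float, tau: float, theta: float) -> float:
    """Low-SNR imaginary part quoted in the text: sin(omega * tau * cos(theta))."""
    return math.sin(omega * tau * math.cos(theta))

tau = delay_in_samples(8000.0, 0.020)
# Evaluate at omega = pi/2 for three noise angles in (pi/2, pi].
values = [noise_imag_part(math.pi / 2, tau, theta)
          for theta in (0.6 * math.pi, 0.8 * math.pi, math.pi)]
```

Here τ comes out at roughly 0.47 samples, and every entry of `values` is negative, in line with the sign argument above.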
If the angle is 180 degrees, the analysis above shows that the imaginary part of the auxiliary filter formula is likewise negative; the SNR can hardly exceed 1, and most of the signal is regarded as noise. When the angle is π/2, however, the definition of SNR implies that its value must be greater than zero, so the assumption does not hold at 90 degrees.
The very-low-frequency case must also be considered, where sin(ωτ cos θ) ≈ 0. At low frequencies in a very low SNR environment, the imaginary part of the auxiliary filter formula needs a higher frequency to stay positive, so the low-frequency band again has to be handled separately, with the angular-frequency boundary still at π/4. It follows that when the imaginary part of the auxiliary filter formula is less than zero, the signal can be treated as a noise component and removed by the filter. Taking all of the above into account gives the second filtering formula.
A value u is introduced in the second filtering formula. Ideally the noise signal would be filtered out whenever the imaginary part of the auxiliary filter formula is zero, which suggests u = 0. However, a zero imaginary part does not mean that the signal is entirely noise, so setting u to zero would also remove part of the target speech. A suitable value of u must therefore be chosen so that the speech signal is not distorted while the noise in it is still filtered out effectively. The first filtering formula and the second filtering formula are then obtained according to the value of u.
S16: Obtain the target filtering formula based on the first filtering formula and the second filtering formula.
Combining the first filtering formula and the second filtering formula gives the target filtering formula, from which the complete filter can be designed. From the above analysis, the target filtering formula is:
$$G(\omega, k) = G_1(\omega, k)\, G_2(\omega, k) \tag{26}$$
One specific position still needs to be discussed. When the noise signal is at 90 degrees to the microphones, the second filter takes the value 1, so the noise processing depends entirely on the first filter. The final filter expression is therefore obtained from the target filtering formula, which allows a more effective filter to be designed.
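The product in formula (26) and its application to the first microphone's spectrum can be sketched as below. The per-bin gains are passed in as plain lists because the second filtering formula is not reproduced in the text; the function names and example values are hypothetical.

```python
from typing import List

def target_gain(g1: float, g2: float) -> float:
    """Combine the two per-bin gains as in formula (26): G = G1 * G2."""
    return g1 * g2

def apply_gain(spectrum: List[complex], g1: List[float], g2: List[float]) -> List[complex]:
    """Scale each frequency bin of the first microphone's spectrum by G."""
    return [y * target_gain(a, b) for y, a, b in zip(spectrum, g1, g2)]

spectrum = [complex(1.0, 1.0), complex(0.0, 2.0)]
# In the first bin G2 = 1 (the 90-degree case), so only G1 acts on it.
filtered = apply_gain(spectrum, g1=[0.5, 0.25], g2=[1.0, 0.5])
```

Because the gains are real and multiply the complex bins, the phase of the first microphone's signal is left unchanged and only the magnitude is attenuated.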
S17: Process the first target signal based on the target filtering formula to obtain a speech signal in frequency-domain form.
S18: Apply an inverse Fourier transform and overlap-add processing to the speech signal in frequency-domain form to obtain the corresponding speech signal in time-domain form, which serves as the target speech signal.
After the first target signal has passed through the filter built from the target filtering formula, it is still in frequency-domain form and must be converted back to time-domain form. The inverse Fourier transform and overlap-add processing are therefore applied, which finally yields the target speech signal in time-domain form.
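The overlap-add step can be sketched on its own, assuming each frame has already been brought back to the time domain by the inverse Fourier transform. The Hann window and the tiny frame length are assumptions chosen for illustration (the text does not name the window); `hann` and `overlap_add` are hypothetical helpers.

```python
import math
from typing import List

def hann(n: int) -> List[float]:
    """Periodic Hann window of length n."""
    return [0.5 - 0.5 * math.cos(2.0 * math.pi * i / n) for i in range(n)]

def overlap_add(frames: List[List[float]], hop: int) -> List[float]:
    """Sum time-domain frames into one signal, shifting each frame by `hop` samples."""
    frame_len = len(frames[0])
    out = [0.0] * (hop * (len(frames) - 1) + frame_len)
    for m, frame in enumerate(frames):
        for i, sample in enumerate(frame):
            out[m * hop + i] += sample
    return out

# Six Hann-windowed frames of a constant signal with 75% overlap (hop = N / 4).
N, HOP = 8, 2
frames = [hann(N) for _ in range(6)]
rebuilt = overlap_add(frames, HOP)
```

In the fully overlapped interior (samples 6 to 11 here) the shifted windows sum to a constant 2, so dividing the output by 2 recovers the original amplitude; the edges are covered by fewer frames and are not at full scale.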
Further, to improve the noise processing effect, before the Fourier transform is applied to the first sound signal collected by the first microphone, the method further includes:
Capturing, based on a window function, the first specified sound signal collected by the first microphone to obtain a first sound signal to be processed; and passing the first sound signal to be processed through a non-recursive filter with an overlap ratio of 75% to obtain a processed first sound signal, which serves as the first sound signal.
Correspondingly, before the Fourier transform is applied to the second sound signal collected by the second microphone, the method further includes:
Capturing, based on a window function, the second specified sound signal collected by the second microphone to obtain a second sound signal to be processed; and passing the second sound signal to be processed through a non-recursive filter with an overlap ratio of 75% to obtain a processed second sound signal, which serves as the second sound signal.
As the first step of processing, the sound signals are divided into frames and windowed. The first specified sound signal and the second specified sound signal received by the first microphone and the second microphone are therefore first processed with a window function with a frame length of 200 ms, which yields the first sound signal and the second sound signal and helps ensure accurate results.
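The framing and windowing step can be sketched as follows, using the 200 ms frame length and 75% overlap stated in the text at an 8 kHz sampling rate. The Hann window is an assumption, since the text only says "window function", and `frame_and_window` is a hypothetical helper.

```python
import math
from typing import List

def frame_and_window(signal: List[float], frame_len: int, overlap: float = 0.75) -> List[List[float]]:
    """Split a signal into overlapping frames and apply a periodic Hann window to each."""
    hop = int(round(frame_len * (1.0 - overlap)))
    window = [0.5 - 0.5 * math.cos(2.0 * math.pi * i / frame_len) for i in range(frame_len)]
    frames = []
    for start in range(0, len(signal) - frame_len + 1, hop):
        frames.append([signal[start + i] * window[i] for i in range(frame_len)])
    return frames

FS = 8000                       # sampling rate in Hz
FRAME_LEN = FS * 200 // 1000    # 200 ms frame, i.e. 1600 samples
frames = frame_and_window([1.0] * (2 * FRAME_LEN), FRAME_LEN)
```

At these settings the hop is 400 samples, so each new frame shares three quarters of its samples with the previous one; trailing samples that do not fill a whole frame are dropped in this sketch.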
To make the result more accurate, a smoothing factor can also be applied to the power spectrum and the cross power spectrum used in processing the speech signal.
In the smoothing formula, λ is the dynamic smoothing factor, with values in the interval [0, 1]. The power spectra introduced here need to be accurate because they feed directly into the calculation of the target filtering formula described above.
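One common way to apply such a smoothing factor is first-order recursive averaging over frames. The sketch below assumes that form, P(k) = λ·P(k−1) + (1 − λ)·Y_a(k)·conj(Y_b(k)); the smoothing formula itself is not reproduced in the text, so both the update rule and the name `smooth_spectrum` are assumptions.

```python
def smooth_spectrum(previous: complex, y_a: complex, y_b: complex, lam: float) -> complex:
    """One recursive-averaging step for a power or cross power spectrum bin.

    Assumed form: P(k) = lam * P(k-1) + (1 - lam) * Y_a(k) * conj(Y_b(k)).
    Passing the same spectrum for y_a and y_b gives the auto power spectrum.
    """
    if not 0.0 <= lam <= 1.0:
        raise ValueError("lam must lie in [0, 1]")
    return lam * previous + (1.0 - lam) * y_a * y_b.conjugate()

# Cross power spectrum between the two microphones for one bin.
p_cross = smooth_spectrum(complex(1.0, 0.0), complex(2.0, 0.0), complex(0.0, 2.0), lam=0.5)
# Auto power spectrum of the first microphone for the same bin.
p_auto = smooth_spectrum(complex(1.0, 0.0), complex(2.0, 0.0), complex(2.0, 0.0), lam=0.5)
```

A value of λ close to 1 gives a slowly varying, low-variance estimate, while a value close to 0 tracks the current frame almost directly.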
The noise processing method of this embodiment includes: applying a Fourier transform to the first sound signal collected by the first microphone to obtain a first target signal; applying a Fourier transform to the second sound signal collected by the second microphone to obtain a second target signal; establishing, according to a preset algorithm, a cross-correlation relationship between the first target signal and the second target signal; determining a cross-correlation function from the cross-correlation relationship; obtaining a first filtering formula and a second filtering formula based on the cross-correlation function; combining the first filtering formula and the second filtering formula to obtain a target filtering formula; processing the first target signal based on the target filtering formula to obtain a speech signal in frequency-domain form; and applying an inverse Fourier transform and overlap-add processing to the speech signal in frequency-domain form to obtain a speech signal in time-domain form, which serves as the target speech signal. By using two microphones and a cross-correlation function to build a better-optimized filter, noise arriving from different directions can be processed thoroughly and the processing quality is maintained. This addresses the limitation that a single microphone can only handle noise from a single direction, and gives the method greater practicality and a better processing effect.
Fig. 2 is a schematic structural diagram of a noise processing apparatus provided by an embodiment of the present invention.
As shown in Fig. 2, the noise processing apparatus of this embodiment includes:
Transform module 10: configured to apply a Fourier transform to the first sound signal collected by the first microphone to obtain a first target signal;
and further configured to apply a Fourier transform to the second sound signal collected by the second microphone to obtain a second target signal;
Determining module 20: configured to establish, according to a preset algorithm, a cross-correlation relationship between the first target signal and the second target signal;
and further configured to determine a cross-correlation function according to the cross-correlation relationship;
Obtaining module 30: configured to obtain a first filtering formula and a second filtering formula based on the cross-correlation function;
further configured to obtain a target filtering formula based on the first filtering formula and the second filtering formula;
and further configured to process the first target signal based on the target filtering formula to obtain a speech signal in frequency-domain form;
Inverse transform module 40: configured to apply an inverse Fourier transform and overlap-add processing to the speech signal in frequency-domain form to obtain the corresponding speech signal in time-domain form, which serves as the target speech signal.
Further, the determining module 20 is specifically configured to:
obtain the cross-correlation relationship between the first target signal and the second target signal according to the cross power spectral density.
Further, the obtaining module 30 is specifically configured to:
decompose, transform, and integrate the cross power spectral density of the first sound signal and the second sound signal to obtain an auxiliary filter formula;
and, based on the cross-correlation function, simplify the auxiliary filter formula and analyze it at multiple angles to obtain the first filtering formula and the second filtering formula.
Further, the noise processing apparatus also includes:
Capture module: configured to capture, based on a window function, the first specified sound signal collected by the first microphone to obtain a first sound signal to be processed;
and to pass the first sound signal to be processed through a non-recursive filter with an overlap ratio of 75% to obtain a processed first sound signal, which serves as the first sound signal.
Correspondingly, the capture module is further configured to:
capture, based on a window function, the second specified sound signal collected by the second microphone to obtain a second sound signal to be processed;
and pass the second sound signal to be processed through a non-recursive filter with an overlap ratio of 75% to obtain a processed second sound signal, which serves as the second sound signal.
For the apparatus in the above embodiment, the specific way in which each module performs its operations has been described in detail in the embodiments of the method and is not explained again here.
It can be understood that the same or similar parts of the above embodiments may refer to one another, and content not described in detail in one embodiment may be found in the same or similar content of other embodiments.
It should be noted that, in the description of the present invention, the terms "first", "second", and so on are used for descriptive purposes only and are not to be understood as indicating or implying relative importance. In addition, in the description of the present invention, unless otherwise stated, "multiple" means at least two.
Any process or method description in a flowchart or otherwise described herein may be understood as representing a module, segment, or portion of code that includes one or more executable instructions for implementing specific logical functions or steps of the process. The scope of the preferred embodiments of the present invention includes additional implementations in which functions may be performed out of the order shown or discussed, including substantially concurrently or in the reverse order depending on the functions involved, as should be understood by those skilled in the art to which the embodiments of the present invention belong.
It should be understood that each part of the present invention may be implemented in hardware, software, firmware, or a combination thereof. In the above embodiments, multiple steps or methods may be implemented in software or firmware that is stored in a memory and executed by a suitable instruction execution system. For example, if implemented in hardware, as in another embodiment, any one or a combination of the following technologies well known in the art may be used: a discrete logic circuit having logic gates for implementing logic functions on data signals, an application-specific integrated circuit having suitable combinational logic gates, a programmable gate array (PGA), a field-programmable gate array (FPGA), and so on.
Those of ordinary skill in the art can understand that all or part of the steps of the methods in the above embodiments may be completed by a program instructing the relevant hardware. The program may be stored in a computer-readable storage medium and, when executed, performs one of the steps of the method embodiments or a combination of them.
In addition, the functional units in the embodiments of the present invention may be integrated into one processing module, each unit may exist physically on its own, or two or more units may be integrated into one module. The integrated module may be implemented in the form of hardware or in the form of a software functional module. If the integrated module is implemented as a software functional module and sold or used as an independent product, it may also be stored in a computer-readable storage medium.
The storage medium mentioned above may be a read-only memory, a magnetic disk, an optical disc, or the like.
In the description of this specification, reference to the terms "one embodiment", "some embodiments", "an example", "a specific example", or "some examples" means that a specific feature, structure, material, or characteristic described in connection with that embodiment or example is included in at least one embodiment or example of the present invention. In this specification, such expressions do not necessarily refer to the same embodiment or example. Moreover, the specific features, structures, materials, or characteristics described may be combined in any suitable manner in any one or more embodiments or examples.
Although embodiments of the present invention have been shown and described above, it can be understood that the above embodiments are exemplary and are not to be construed as limiting the present invention. Those of ordinary skill in the art may change, modify, replace, and vary the above embodiments within the scope of the present invention.
Claims (10)
1. A noise processing method, characterized in that the method comprises:
applying a Fourier transform to a first sound signal collected by a first microphone to obtain a first target signal;
applying a Fourier transform to a second sound signal collected by a second microphone to obtain a second target signal;
establishing, according to a preset algorithm, a cross-correlation relationship between the first target signal and the second target signal;
determining a cross-correlation function according to the cross-correlation relationship;
obtaining a first filtering formula and a second filtering formula based on the cross-correlation function;
obtaining a target filtering formula based on the first filtering formula and the second filtering formula;
processing the first target signal based on the target filtering formula to obtain a speech signal in frequency-domain form; and
applying an inverse Fourier transform and overlap-add processing to the speech signal in frequency-domain form to obtain the corresponding speech signal in time-domain form, which serves as the target speech signal.
2. The noise processing method according to claim 1, characterized in that establishing, according to the preset algorithm, the cross-correlation relationship between the first target signal and the second target signal comprises:
obtaining the cross-correlation relationship between the first target signal and the second target signal according to the cross power spectral density.
3. The noise processing method according to claim 2, characterized in that determining the cross-correlation function according to the cross-correlation relationship comprises:
determining the cross-correlation function according to the cross-correlation relationship and the auto power spectral density.
4. The noise processing method according to claim 3, characterized in that obtaining the first filtering formula and the second filtering formula based on the cross-correlation function comprises:
decomposing, transforming, and integrating the cross power spectral density of the first sound signal and the second sound signal to obtain an auxiliary filter formula; and
based on the cross-correlation function, simplifying the auxiliary filter formula and analyzing it at multiple angles to obtain the first filtering formula and the second filtering formula.
5. The noise processing method according to claim 1, characterized in that, after determining the cross-correlation function according to the cross-correlation relationship, the method comprises:
determining the use environment of the first microphone and the second microphone;
detecting whether the use environment is a non-diffuse noise environment; and
if so, using the simplified function corresponding to the cross-correlation function as the cross-correlation function.
6. The noise processing method according to claim 1, characterized in that, before applying the Fourier transform to the first sound signal collected by the first microphone, the method further comprises:
capturing, based on a window function, a first specified sound signal collected by the first microphone to obtain a first sound signal to be processed; and
passing the first sound signal to be processed through a non-recursive filter with an overlap ratio of 75% to obtain a processed first sound signal, which serves as the first sound signal;
and correspondingly, before applying the Fourier transform to the second sound signal collected by the second microphone, the method further comprises:
capturing, based on a window function, a second specified sound signal collected by the second microphone to obtain a second sound signal to be processed; and
passing the second sound signal to be processed through a non-recursive filter with an overlap ratio of 75% to obtain a processed second sound signal, which serves as the second sound signal.
7. A noise processing apparatus, characterized in that the apparatus comprises:
a transform module, configured to apply a Fourier transform to a first sound signal collected by a first microphone to obtain a first target signal,
and further configured to apply a Fourier transform to a second sound signal collected by a second microphone to obtain a second target signal;
a determining module, configured to establish, according to a preset algorithm, a cross-correlation relationship between the first target signal and the second target signal,
and further configured to determine a cross-correlation function according to the cross-correlation relationship;
an obtaining module, configured to obtain a first filtering formula and a second filtering formula based on the cross-correlation function,
further configured to obtain a target filtering formula based on the first filtering formula and the second filtering formula,
and further configured to process the first target signal based on the target filtering formula to obtain a speech signal in frequency-domain form; and
an inverse transform module, configured to apply an inverse Fourier transform and overlap-add processing to the speech signal in frequency-domain form to obtain the corresponding speech signal in time-domain form, which serves as the target speech signal.
8. The noise processing apparatus according to claim 7, characterized in that the determining module is specifically configured to:
obtain the cross-correlation relationship between the first target signal and the second target signal according to the cross power spectral density.
9. The noise processing apparatus according to claim 8, characterized in that the obtaining module is specifically configured to:
decompose, transform, and integrate the cross power spectral density of the first sound signal and the second sound signal to obtain an auxiliary filter formula; and
based on the cross-correlation function, simplify the auxiliary filter formula and analyze it at multiple angles to obtain the first filtering formula and the second filtering formula.
10. The noise processing apparatus according to claim 7, characterized by further comprising:
a capture module, configured to capture, based on a window function, a first specified sound signal collected by the first microphone to obtain a first sound signal to be processed,
and to pass the first sound signal to be processed through a non-recursive filter with an overlap ratio of 75% to obtain a processed first sound signal, which serves as the first sound signal;
wherein, correspondingly, the capture module is further configured to:
capture, based on a window function, a second specified sound signal collected by the second microphone to obtain a second sound signal to be processed;
and pass the second sound signal to be processed through a non-recursive filter with an overlap ratio of 75% to obtain a processed second sound signal, which serves as the second sound signal.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201910080549.9A CN109616137A (en) | 2019-01-28 | 2019-01-28 | Method for processing noise and device |
Publications (1)
Publication Number | Publication Date |
---|---|
CN109616137A true CN109616137A (en) | 2019-04-12 |
Family
ID=66018335
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN201910080549.9A Withdrawn CN109616137A (en) | 2019-01-28 | 2019-01-28 | Method for processing noise and device |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN109616137A (en) |
Cited By (10)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN110459236A (en) * | 2019-08-15 | 2019-11-15 | 北京小米移动软件有限公司 | Noise estimation method, device and the storage medium of audio signal |
CN110459236B (en) * | 2019-08-15 | 2021-11-30 | 北京小米移动软件有限公司 | Noise estimation method, apparatus and storage medium for audio signal |
CN113160840A (en) * | 2020-01-07 | 2021-07-23 | 北京地平线机器人技术研发有限公司 | Noise filtering method, device, mobile equipment and computer readable storage medium |
CN113160840B (en) * | 2020-01-07 | 2022-10-25 | 北京地平线机器人技术研发有限公司 | Noise filtering method, device, mobile equipment and computer readable storage medium |
CN111863017A (en) * | 2020-07-20 | 2020-10-30 | 上海汽车集团股份有限公司 | In-vehicle directional pickup method based on double-microphone array and related device |
CN111951818A (en) * | 2020-08-20 | 2020-11-17 | 北京驭声科技有限公司 | Double-microphone speech enhancement method based on improved power difference noise estimation algorithm |
CN111951818B (en) * | 2020-08-20 | 2023-11-03 | 北京驭声科技有限公司 | Dual-microphone voice enhancement method based on improved power difference noise estimation algorithm |
CN112053669A (en) * | 2020-08-27 | 2020-12-08 | 海信视像科技股份有限公司 | Method, device, equipment and medium for eliminating human voice |
CN112053669B (en) * | 2020-08-27 | 2023-10-27 | 海信视像科技股份有限公司 | Method, device, equipment and medium for eliminating human voice |
CN115683632A (en) * | 2023-01-03 | 2023-02-03 | 北京博华信智科技股份有限公司 | Method, device, equipment and medium for acquiring fault signal of gearbox bearing |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | ||
SE01 | Entry into force of request for substantive examination | ||
WW01 | Invention patent application withdrawn after publication | Application publication date: 20190412 |