CN106356071A - Noise detection method and device - Google Patents
- Publication number
- CN106356071A (CN201610769237.5A)
- Authority
- CN
- China
- Prior art keywords
- ecorr
- audio frame
- spectrum
- noise
- corr
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Granted
Classifications
- G—PHYSICS
- G10—MUSICAL INSTRUMENTS; ACOUSTICS
- G10L—SPEECH ANALYSIS OR SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING; SPEECH OR AUDIO CODING OR DECODING
- G10L21/00—Processing of the speech or voice signal to produce another audible or non-audible signal, e.g. visual or tactile, in order to modify its quality or its intelligibility
- G10L21/02—Speech enhancement, e.g. noise reduction or echo cancellation
- G10L21/0208—Noise filtering
Abstract
An embodiment of the invention discloses a noise detection method and device. The method comprises: obtaining an audio signal to be processed and computing the power spectrum spectrum(ω) of an audio frame in the audio signal, where ω is 2π times the frequency of the power spectrum; computing an autocorrelation-like spectrum corr(τ) from the power spectrum of the audio frame, where τ is a time value; computing an enhanced correlation spectrum ecorr(τ) from corr(τ); obtaining the maximum value max(ecorr) of ecorr(τ), and determining that the audio frame is noise if max(ecorr) is below a first threshold for a predetermined number of consecutive audio frames; or obtaining the τ corresponding to max(ecorr), and determining that the audio frame is noise if that τ is not within a preset threshold range. Noise is identified on the basis of the enhanced correlation spectrum ecorr(τ) and can be distinguished from music and speech, which provides a basis for noise reduction.
Description
Technical field
The present invention relates to the field of computer technology, and in particular to a noise detection method and device.
Background
Live streaming from mobile phones is becoming widespread, but the audio signal in a live stream differs considerably from that in a voice call. A phone call transmits speech data only, whereas a live stream does not simply carry speech: the host may sing or perform during the stream, possibly with musical accompaniment or on-site accompaniment.
Web Real-Time Communication (WebRTC) technology uses noise reduction as follows: WebRTC computes the speech/noise probability of each frequency bin from a spectral flatness parameter and the degree of spectral change between adjacent frames, then updates the noise spectrum, and finally removes the noise by Wiener filtering.
However, WebRTC performs noise reduction for speech. When the background contains music, especially passages whose spectrum is essentially unchanged (such as long notes on bowed string instruments), the noise spectrum is updated incorrectly and the passage is suppressed, which damages the music. Ordinary autocorrelation detection can detect the correlation peaks of music, but most ambient noise is pink noise, and the correlation peaks of music do not stand out in the autocorrelation spectrum of pink noise, so the autocorrelation spectrum is rarely used to distinguish music from noise.
There is therefore an urgent need for an accurate noise detection scheme suited to scenarios such as live streaming, in which the audio signal contains several types of audio data such as speech and music, so as to provide a basis for noise reduction.
Summary of the invention
Embodiments of the present invention provide a noise detection method and device for accurately identifying noise.
In a first aspect, an embodiment of the present invention provides a noise detection method, comprising:
obtaining an audio signal to be processed, and computing the power spectrum spectrum(ω) of an audio frame in the audio signal, where ω is 2π times the frequency of the power spectrum;
computing an autocorrelation-like spectrum corr(τ) from the power spectrum of the audio frame, where τ is a time value;
computing an enhanced correlation spectrum ecorr(τ) from the autocorrelation-like spectrum corr(τ);
obtaining the maximum value max(ecorr) of ecorr(τ), and determining that the audio frame is noise if max(ecorr) is below a first threshold for a predetermined number of consecutive audio frames, the first threshold being a threshold on the enhanced correlation spectrum;
or obtaining the τ corresponding to max(ecorr), and determining that the audio frame is noise if that τ is not within a preset threshold range, the preset threshold range being a preset time range.
In an optional implementation, computing the enhanced correlation spectrum ecorr(τ) from the autocorrelation-like spectrum corr(τ) comprises:
setting the values of corr(τ) that are less than 0 to 0, and then computing the enhanced spectrum ecorr(τ);
setting the values of ecorr(τ) that are less than 0 to 0, to obtain the enhanced correlation spectrum ecorr(τ).
In an optional implementation, computing the enhanced spectrum ecorr(τ) comprises:
computing ecorr(τ) according to ecorr(τ) = corr(τ) - corr(τ/2), where, if τ is odd, corr(τ/2) is obtained by interpolating from neighboring values.
In an optional implementation, computing the autocorrelation-like spectrum corr(τ) from the power spectrum of the audio frame comprises:
computing the cube root of each frequency bin of spectrum(ω), applying a fast Fourier transform to the cube roots, and taking the real part, to obtain corr(τ).
In an optional implementation, before determining that the audio frame is noise, the method further comprises:
computing the average distance d between the amplitude spectrum s of the audio frame and the amplitude spectrum of the noise spectrum n, d = 20 * (log10(s) - log10(n)); and determining that the audio frame is noise if d is less than a second threshold and max(ecorr) is below the first threshold for the predetermined number of consecutive audio frames, or if d is less than the second threshold and the τ corresponding to max(ecorr) is not within the preset threshold range, the preset threshold range being a preset time range.
In an optional implementation, the method further comprises:
if the audio frame is determined to be noise, determining a new noise spectrum by window averaging.
In an optional implementation, after determining the new noise spectrum, the method further comprises:
applying Wiener filtering to the audio frames of the audio signal using the new noise spectrum.
In an optional implementation, the method further comprises:
if the audio frame is not determined to be noise, determining that the audio frame is speech or music.
In an optional implementation, before determining that the audio frame is speech or music, the method further comprises:
if d is greater than the second threshold and the audio frame is not determined to be noise, determining that the audio frame is speech or music.
In an optional implementation, the method further comprises:
if the audio frame is not determined to be speech or music, updating the first threshold with the ecorr(τ) of the audio frame by window averaging.
In a second aspect, an embodiment of the present invention further provides a noise detection device, comprising:
a signal acquiring unit, configured to obtain an audio signal to be processed;
a computing unit, configured to compute the power spectrum spectrum(ω) of an audio frame in the audio signal, where ω is 2π times the frequency of the power spectrum; to compute an autocorrelation-like spectrum corr(τ) from the power spectrum of the audio frame, where τ is a time value; and to compute an enhanced correlation spectrum ecorr(τ) from the autocorrelation-like spectrum corr(τ);
a signal determining unit, configured to obtain the maximum value max(ecorr) of ecorr(τ) and determine that the audio frame is noise if max(ecorr) is below a first threshold for a predetermined number of consecutive audio frames, the first threshold being a threshold on the enhanced correlation spectrum; or to obtain the τ corresponding to max(ecorr) and determine that the audio frame is noise if that τ is not within a preset threshold range, the preset threshold range being a preset time range.
In an optional implementation, the computing unit is specifically configured to set the values of corr(τ) that are less than 0 to 0 and then compute the enhanced spectrum ecorr(τ), and to set the values of ecorr(τ) that are less than 0 to 0, to obtain the enhanced correlation spectrum ecorr(τ).
In an optional implementation, the computing unit is specifically configured to compute ecorr(τ) according to ecorr(τ) = corr(τ) - corr(τ/2), where, if τ is odd, corr(τ/2) is obtained by interpolating from neighboring values.
In an optional implementation, the computing unit is specifically configured to compute the cube root of each frequency bin of spectrum(ω), apply a fast Fourier transform to the cube roots, and take the real part, to obtain corr(τ).
In an optional implementation, the computing unit is further configured to, before the signal determining unit determines that the audio frame is noise, compute the average distance d between the amplitude spectrum s of the audio frame and the amplitude spectrum of the noise spectrum n, d = 20 * (log10(s) - log10(n));
the signal determining unit is specifically configured to determine that the audio frame is noise if d is less than a second threshold and max(ecorr) is below the first threshold for the predetermined number of consecutive audio frames, or if d is less than the second threshold and the τ corresponding to max(ecorr) is not within the preset threshold range, the preset threshold range being a preset time range.
In an optional implementation, the device further comprises:
a noise spectrum updating unit, configured to determine a new noise spectrum n by window averaging if the signal determining unit determines that the audio frame is noise.
In an optional implementation, the device further comprises:
a filtering unit, configured to apply Wiener filtering to the audio frames of the audio signal using the new noise spectrum.
In an optional implementation, the signal determining unit is further configured to determine that the audio frame is speech or music if the audio frame is not determined to be noise.
In an optional implementation, the signal determining unit is further configured to, before determining that the audio frame is speech or music, determine that the audio frame is speech or music if d is greater than the second threshold and the audio frame is not determined to be noise.
In an optional implementation, the device further comprises:
a threshold updating unit, configured to update the first threshold with the ecorr(τ) of the audio frame by window averaging if the audio frame is not determined to be speech or music.
As can be seen from the above technical solutions, the embodiments of the present invention have the following advantage: noise is identified accurately on the basis of the enhanced correlation spectrum ecorr(τ) and can be distinguished from music and speech, which provides a basis for noise reduction.
Brief description of the drawings
To illustrate the technical solutions of the embodiments of the present invention more clearly, the drawings needed in the description of the embodiments are briefly introduced below. The drawings described below evidently show only some embodiments of the present invention; a person of ordinary skill in the art can derive other drawings from them without creative effort.
Fig. 1 is a schematic flowchart of a method according to an embodiment of the present invention;
Fig. 2 is a schematic flowchart of a method according to an embodiment of the present invention;
Fig. 3 is a schematic structural diagram of a device according to an embodiment of the present invention;
Fig. 4 is a schematic structural diagram of a device according to an embodiment of the present invention;
Fig. 5 is a schematic structural diagram of a device according to an embodiment of the present invention;
Fig. 6 is a schematic structural diagram of a device according to an embodiment of the present invention;
Fig. 7 is a schematic structural diagram of a terminal device according to an embodiment of the present invention;
Fig. 8 is a schematic structural diagram of a terminal device according to an embodiment of the present invention.
Detailed description
To make the objectives, technical solutions, and advantages of the present invention clearer, the present invention is described in further detail below with reference to the drawings. The described embodiments are evidently only some, not all, of the embodiments of the present invention. All other embodiments obtained by a person of ordinary skill in the art on the basis of these embodiments without creative effort fall within the scope of protection of the present invention.
An embodiment of the present invention provides a noise detection method which, as shown in Fig. 1, comprises:
101: Obtain an audio signal to be processed, and compute the power spectrum spectrum(ω) of an audio frame in the audio signal; ω is 2π times the frequency of the power spectrum;
Here spectrum is the name of the power spectrum function and ω is its independent variable.
102: Compute the autocorrelation-like spectrum corr(τ) from the power spectrum of the audio frame; τ is a time value;
Here corr is the name of the autocorrelation-like spectrum function and τ is its independent variable.
103: Compute the enhanced correlation spectrum ecorr(τ) from the autocorrelation-like spectrum corr(τ);
The embodiments impose no unique limitation on how corr(τ) is enhanced once it has been determined. Optional implementations are given in subsequent embodiments.
104: Obtain the maximum value max(ecorr) of ecorr(τ); if max(ecorr) is below a first threshold for a predetermined number of consecutive audio frames, determine that the audio frame is noise, the first threshold being a threshold on the enhanced correlation spectrum;
or obtain the τ corresponding to max(ecorr); if that τ is not within a preset threshold range, determine that the audio frame is noise, the preset threshold range being a preset time range.
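The decision rule of step 104 can be sketched as follows. This is a minimal illustration under stated assumptions, not the patent's implementation: the class name PeakNoiseDetector, its parameter names, and the choice to measure τ as an index into the ecorr array are all hypothetical, and the two conditions are combined with a logical OR as the text describes.

```python
import numpy as np


class PeakNoiseDetector:
    """Hypothetical helper that applies the two noise conditions of step 104."""

    def __init__(self, first_threshold: float, tau_range: tuple, required_frames: int):
        self.first_threshold = first_threshold  # threshold on max(ecorr)
        self.tau_range = tau_range              # (lowest, highest) accepted lag index
        self.required_frames = required_frames  # "predetermined number" of frames
        self.low_count = 0                      # consecutive frames with a weak peak

    def is_noise(self, ecorr: np.ndarray) -> bool:
        tau_peak = int(np.argmax(ecorr))
        peak = float(ecorr[tau_peak])
        # Condition 1: the peak stayed below the first threshold for enough frames.
        if peak < self.first_threshold:
            self.low_count += 1
        else:
            self.low_count = 0
        weak_for_long = self.low_count >= self.required_frames
        # Condition 2: the lag of the peak falls outside the preset range.
        lowest, highest = self.tau_range
        out_of_range = not (lowest <= tau_peak <= highest)
        return weak_for_long or out_of_range
```

The counter is reset whenever a frame has a strong peak, so only an unbroken run of weak frames triggers the first condition.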
This embodiment identifies noise accurately on the basis of the enhanced correlation spectrum ecorr(τ) and can distinguish noise from music and speech, thereby providing a basis for noise reduction.
Optionally, an embodiment provides, as one optional implementation, a scheme for computing ecorr(τ) from corr(τ). Enhancement by other means does not affect the realization of the embodiments, and the embodiments impose no unique limitation here. Computing the enhanced correlation spectrum ecorr(τ) from the autocorrelation-like spectrum corr(τ) comprises:
setting the values of corr(τ) that are less than 0 to 0, and then computing the enhanced spectrum ecorr(τ);
setting the values of ecorr(τ) that are less than 0 to 0, to obtain the enhanced correlation spectrum ecorr(τ).
This way of enhancing corr(τ) requires little computation and can serve as a preferred implementation.
Optionally, an embodiment also provides a scheme for computing ecorr(τ) that improves the result of the subsequent ecorr(τ) computation. Specifically, computing the enhanced spectrum ecorr(τ) comprises:
computing ecorr(τ) according to ecorr(τ) = corr(τ) - corr(τ/2), where, if τ is odd, corr(τ/2) is obtained by interpolating from neighboring values.
This scheme both improves the accuracy of ecorr(τ) and requires little computation, which suits application scenarios such as live streaming where the data processing load is large.
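The enhancement described above can be sketched in a few lines of NumPy. The function name enhanced_correlation is hypothetical, τ is assumed to be an integer sample index, and linear interpolation between the two neighboring values is one possible reading of the interpolation for odd τ; the text does not fix the interpolation method.

```python
import numpy as np


def enhanced_correlation(corr: np.ndarray) -> np.ndarray:
    """Return ecorr(tau) computed from corr(tau), with tau as an array index."""
    # Set the values of corr(tau) that are less than 0 to 0.
    clipped = np.maximum(corr, 0.0)
    taus = np.arange(len(clipped))
    # corr(tau/2): exact for even tau, linearly interpolated for odd tau (assumption).
    half_lag = np.interp(taus / 2.0, taus, clipped)
    # ecorr(tau) = corr(tau) - corr(tau/2), then set negative results to 0.
    return np.maximum(clipped - half_lag, 0.0)
```

For example, the input [4, -1, 2, 0, 6] is clipped to [4, 0, 2, 0, 6], the half-lag values are [4, 2, 0, 1, 2], and the result is [0, 0, 2, 0, 4].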
Optionally, an embodiment also provides a preferred implementation for computing corr(τ). Specifically, computing the autocorrelation-like spectrum corr(τ) from the power spectrum of the audio frame comprises:
computing the cube root of each frequency bin of spectrum(ω), applying a fast Fourier transform to the cube roots, and taking the real part, to obtain corr(τ).
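A minimal sketch of this computation, assuming NumPy, a real-valued frame, no analysis window, and τ measured in samples; the function name class_autocorrelation is hypothetical, and the power spectrum is taken here as the squared FFT magnitude.

```python
import numpy as np


def class_autocorrelation(frame: np.ndarray) -> np.ndarray:
    """Return the autocorrelation-like spectrum corr(tau) of one audio frame."""
    # Power spectrum of the frame: squared magnitude of every FFT bin.
    power = np.abs(np.fft.fft(frame)) ** 2
    # Cube root of each frequency bin compresses the dynamic range.
    compressed = np.cbrt(power)
    # Second FFT, keeping only the real part.
    return np.real(np.fft.fft(compressed))
```

For a pure tone whose period is 32 samples, corr(τ) computed this way peaks again at τ = 32 with nearly the same height as at τ = 0, which is the kind of correlation peak the later steps look for.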
Further, an embodiment also provides a scheme that uses the amplitude spectrum s of the audio frame and the amplitude spectrum of the noise spectrum n as an additional reference for determining noise, which further improves the accuracy of the noise determination. Specifically, before determining that the audio frame is noise, the method further comprises:
computing the average distance d between the amplitude spectrum s of the audio frame and the amplitude spectrum of the noise spectrum n, d = 20 * (log10(s) - log10(n)); and determining that the audio frame is noise if d is less than a second threshold and max(ecorr) is below the first threshold for the predetermined number of consecutive audio frames, or if d is less than the second threshold and the τ corresponding to max(ecorr) is not within the preset threshold range, the preset threshold range being a preset time range.
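The distance d can be illustrated as below. The text gives only the per-value formula, so averaging over frequency bins is an assumption, as are the function name average_distance_db and the small eps term added to avoid log10(0).

```python
import numpy as np


def average_distance_db(s: np.ndarray, n: np.ndarray, eps: float = 1e-12) -> float:
    """Mean over bins of 20 * (log10(s) - log10(n)), i.e. a level difference in dB."""
    per_bin = 20.0 * (np.log10(s + eps) - np.log10(n + eps))
    return float(np.mean(per_bin))
```

A frame whose amplitude spectrum is ten times the noise amplitude in every bin gives d of about 20, while a frame at the noise level gives d of about 0.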
Further, an embodiment also provides an implementation for updating the noise spectrum. An updated noise spectrum makes the next noise determination more accurate and also provides an accurate basis for subsequent noise reduction. Specifically, the method further comprises:
if the audio frame is determined to be noise, determining a new noise spectrum by window averaging.
Window averaging means computing an average over a reference window. For example, suppose the window size is 8 and the audio frames numbered 1 to 8 in time order have already been determined to be noise. If the current audio frame, numbered 9, is also noise, then the audio frames numbered 2 to 9 form the new window, and the new noise spectrum is the average of the noise spectra of the audio frames numbered 2 to 9.
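The window-averaging example above maps naturally onto a fixed-length queue. The sketch below is illustrative only; the class name WindowAverage is hypothetical, and each frame is represented by its amplitude spectrum.

```python
from collections import deque

import numpy as np


class WindowAverage:
    """Keep the most recent `size` noise spectra and return their mean."""

    def __init__(self, size: int = 8):
        # A deque with maxlen drops the oldest frame automatically.
        self.frames = deque(maxlen=size)

    def update(self, spectrum) -> np.ndarray:
        self.frames.append(np.asarray(spectrum, dtype=float))
        # Average bin by bin over the frames currently in the window.
        return np.mean(list(self.frames), axis=0)
```

With a window of 8, feeding frames 1 to 9 leaves frames 2 to 9 in the window, matching the example in the text.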
Further, an embodiment also provides a specific implementation of noise reduction. After determining the new noise spectrum, the method further comprises:
applying Wiener filtering to the audio frames of the audio signal using the new noise spectrum.
Wiener filtering is a commonly used noise reduction technique. Combined with the accurate new noise spectrum of this embodiment, it achieves better noise reduction without damaging music or speech, improves the quality of the audio signal, and suits complex application scenarios such as live streaming that contain music, speech, and noise.
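The text does not specify the form of the Wiener filter. As a hedged illustration, the sketch below uses a common frequency-domain gain, 1 - |N|^2 / |S|^2 with a lower bound; the function name wiener_filter_frame, the gain_floor parameter, and the use of a one-sided FFT are all assumptions rather than details from the source.

```python
import numpy as np


def wiener_filter_frame(frame: np.ndarray, noise_amp: np.ndarray,
                        gain_floor: float = 0.05) -> np.ndarray:
    """Attenuate each frequency bin of one frame according to the noise estimate."""
    spec = np.fft.rfft(frame)
    power = np.abs(spec) ** 2
    noise_power = np.asarray(noise_amp, dtype=float) ** 2
    # Spectral-subtraction style Wiener gain, bounded below by gain_floor.
    gain = np.maximum(1.0 - noise_power / np.maximum(power, 1e-12), gain_floor)
    # Back to the time domain with the original frame length.
    return np.fft.irfft(gain * spec, n=len(frame))
```

Here noise_amp is the amplitude spectrum of the current noise estimate with one value per rfft bin (len(frame) // 2 + 1 values). The gain floor limits how far any bin can be suppressed, which is one simple way to reduce audible artifacts.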
Further, an embodiment also covers the case in which a frame is determined not to be noise. Specifically, the method further comprises:
if the audio frame is not determined to be noise, determining that the audio frame is speech or music.
In this embodiment, "not determined to be noise" refers to any case in which the conditions for determining noise are not met, that is, any case in which the embodiment does not determine the audio frame to be noise.
Further, to improve the accuracy of determining that an audio frame is speech or music, an embodiment also provides an implementation that takes into account the average distance d between the amplitude spectrum s of the audio frame and the amplitude spectrum of the noise spectrum n. Specifically, before determining that the audio frame is speech or music, the method further comprises:
if d is greater than the second threshold and the audio frame is not determined to be noise, determining that the audio frame is speech or music.
In this embodiment, if d is not greater than the second threshold, it may be considered that the audio frame can be determined neither to be noise nor to be speech or music.
Further, given that speech or music has been accurately determined, an embodiment also provides a scheme for updating the threshold, which further improves the accuracy of determining the type of subsequent audio frames. Specifically, the method further comprises:
if the audio frame is not determined to be speech or music, updating the first threshold with the ecorr(τ) of the audio frame by window averaging.
The embodiments of the present invention mainly apply to real-time high-definition speech processing. With the growth of Internet services such as live video streaming, detecting music and reducing noise without harming it has become a new requirement. Besides enhancing speech, noise reduction also needs to detect music in the environment and minimize the damage done to it during noise reduction. The embodiments use enhanced autocorrelation to remove the influence of pink noise and improve the ability to detect musical sounds. Based on this enhanced autocorrelation they also propose a strategy for updating the noise spectrum, so that after Wiener filtering most of the background noise is removed while the music is protected from damage. Because Wiener filtering itself requires a transform to the frequency domain, the enhanced autocorrelation can reuse the existing spectral data, so the computational cost of filtering does not increase significantly and the method runs smoothly on handheld devices. The details are shown in Fig. 2.
The technical solution of this embodiment has two parts: enhanced autocorrelation and noise spectrum update. The enhanced autocorrelation is computed as follows:
201: Compute the power spectrum of the current frame using a fast Fourier transform (FFT), obtaining spectrum(ω);
The current frame is the audio frame currently extracted from the audio signal.
202: Take the cube root of each frequency bin, obtaining (spectrum(ω))^(1/3);
203: Apply an FFT to (spectrum(ω))^(1/3) and take the real part, obtaining the autocorrelation-like spectrum corr(τ);
204: Set the values of corr(τ) that are less than 0 to 0, then compute the enhanced spectrum ecorr(τ):
ecorr(τ) = corr(τ) - corr(τ/2); when τ is odd, corr(τ/2) is obtained by interpolating from neighboring values.
205: Set the values of ecorr(τ) that are less than 0 to 0, which gives the final enhanced correlation spectrum ecorr(τ);
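Steps 202 to 205 can be written compactly as equations. The symbol corr^+ for the clipped spectrum is introduced here only for readability and does not appear in the source.

```latex
\begin{aligned}
\mathrm{corr}(\tau) &= \operatorname{Re}\Big\{\mathrm{FFT}\big[\mathrm{spectrum}(\omega)^{1/3}\big]\Big\} \\
\mathrm{corr}^{+}(\tau) &= \max\big(\mathrm{corr}(\tau),\,0\big) \\
\mathrm{ecorr}(\tau) &= \max\big(\mathrm{corr}^{+}(\tau) - \mathrm{corr}^{+}(\tau/2),\,0\big)
\end{aligned}
```

For odd τ, the term corr^+(τ/2) is interpolated from neighboring values, as stated in step 204.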
The noise spectrum is updated as follows:
206: Detect the maximum value max(ecorr) of ecorr(τ) and its corresponding τ;
207: If the maximum is below the first threshold over consecutive frames, or τ is not within the set threshold range, judge the frame to be noise; otherwise judge it to be music/speech. The first threshold is a threshold on the enhanced correlation spectrum;
208: Compute the average distance d between the amplitude spectrum s of the current frame and the amplitude spectrum of the noise spectrum n,
where d = 20 * (log10(s) - log10(n));
209: If d is less than the second threshold and step 207 judged the frame to be noise, the current frame is judged to be noise;
210: If d is greater than the second threshold and step 207 judged the frame to be music or speech, the current frame is judged to be music or speech;
211: If neither 209 nor 210 applies, the frame is judged to be of uncertain sound type.
212: If the frame is judged to be noise in 209 or uncertain in 211, use the enhanced correlation spectrum ecorr(τ) of the current frame to update the second threshold used in steps 209 and 210. The update may use window averaging.
213: If the frame is judged to be noise in 209, update the noise spectrum using the result of 209. The update may use window averaging.
214: Apply Wiener filtering to the input audio frame using the updated noise spectrum, obtaining the denoised audio signal.
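The three-way decision of steps 209 to 211 can be sketched as a small function. The names FrameType, classify_frame, and should_update_noise_spectrum are hypothetical; the inputs are the outcome of step 207, the distance d from step 208, and the second threshold.

```python
from enum import Enum


class FrameType(Enum):
    NOISE = "noise"
    MUSIC_OR_SPEECH = "music_or_speech"
    UNCERTAIN = "uncertain"


def classify_frame(step207_is_noise: bool, d: float, second_threshold: float) -> FrameType:
    """Combine the step 207 verdict with the distance d from step 208."""
    if step207_is_noise and d < second_threshold:
        return FrameType.NOISE              # step 209
    if not step207_is_noise and d > second_threshold:
        return FrameType.MUSIC_OR_SPEECH    # step 210
    return FrameType.UNCERTAIN              # step 211


def should_update_noise_spectrum(frame_type: FrameType) -> bool:
    """Step 213: only frames judged to be noise feed the noise spectrum."""
    return frame_type is FrameType.NOISE
```

Requiring both tests to agree before labelling a frame means a disagreement leaves the noise spectrum untouched, which protects sustained musical notes from being absorbed into the noise estimate.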
This embodiment can clearly distinguish music from pink noise without significantly increasing the amount of computation, and reduces the damage that noise reduction does to music mixed with noise.
An embodiment of the present invention further provides a noise detection device which, as shown in Fig. 3, comprises:
a signal acquiring unit 301, configured to obtain an audio signal to be processed;
a computing unit 302, configured to compute the power spectrum spectrum(ω) of an audio frame in the audio signal, where ω is 2π times the frequency of the power spectrum; to compute an autocorrelation-like spectrum corr(τ) from the power spectrum of the audio frame, where τ is a time value; and to compute an enhanced correlation spectrum ecorr(τ) from the autocorrelation-like spectrum corr(τ);
a signal determining unit 303, configured to obtain the maximum value max(ecorr) of ecorr(τ) and determine that the audio frame is noise if max(ecorr) is below a first threshold for a predetermined number of consecutive audio frames, the first threshold being a threshold on the enhanced correlation spectrum; or to obtain the τ corresponding to max(ecorr) and determine that the audio frame is noise if that τ is not within a preset threshold range, the preset threshold range being a preset time range.
How enhancement process is carried out to it after corr (τ) determines, the embodiment of the present invention is not made uniqueness and limited.Follow-up
Optional implementation will be given in embodiment.
The embodiment of the present invention, accurately identifies noise based on strengthening Correlated Spectroscopy ecorr (τ), can by noise and music and
Voice distinguishes, thus providing foundation for noise reduction process.
Optionally, as one optional implementation, the embodiment of the present invention provides a scheme for calculating ecorr(τ) from corr(τ). It should be noted that performing the enhancement in other ways does not affect the implementation of the embodiment, which places no unique limitation on this. The computing unit 302 is specifically configured to set the values less than 0 in corr(τ) to 0 and then calculate the enhanced spectrum ecorr(τ), and to set the values less than 0 in ecorr(τ) to 0, obtaining the enhanced correlation spectrum ecorr(τ).
This way of enhancing corr(τ) requires little computation and can serve as a preferred implementation.
Optionally, the embodiment of the present invention further provides a scheme for calculating ecorr(τ) that improves the result of the subsequent calculation of ecorr(τ), as follows: the computing unit 302 is specifically configured to calculate ecorr(τ) according to ecorr(τ) = corr(τ) - corr(τ/2); if τ is odd, corr(τ/2) is obtained by averaging the neighboring values.
This scheme for calculating ecorr(τ) improves the accuracy of ecorr(τ) while requiring little computation, and is suited to application scenarios such as live streaming in which the volume of data to be processed is large.
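The enhancement steps described above (clamp negative values of corr(τ) to 0, subtract corr(τ/2), clamp again) can be sketched as below. The function name `enhance` is hypothetical, τ is treated as an integer list index, and for odd τ the value corr(τ/2) is taken as the mean of the two neighboring entries; that last choice is an assumption about what "neighboring values" means, not something the text specifies.

```python
def enhance(corr):
    """Return ecorr computed from corr (a list indexed by integer tau)."""
    # Step 1: set values of corr below 0 to 0.
    clipped = [max(c, 0.0) for c in corr]

    ecorr = []
    for tau in range(len(clipped)):
        if tau % 2 == 0:
            half = clipped[tau // 2]
        else:
            # Assumption: for odd tau, average the two neighbors of tau/2.
            lower, upper = tau // 2, tau // 2 + 1
            half = 0.5 * (clipped[lower] + clipped[upper])
        # Step 2: ecorr(tau) = corr(tau) - corr(tau/2), then clamp below 0.
        ecorr.append(max(clipped[tau] - half, 0.0))
    return ecorr
```

Subtracting corr(τ/2) suppresses a peak at τ that merely repeats a peak already present at half that lag, which is one plausible reading of why the scheme sharpens the spectrum.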
Optionally, the embodiment of the present invention further provides a preferred implementation for calculating corr(τ), as follows: the computing unit 302 is specifically configured to calculate the cube root of each frequency bin of spectrum(ω), apply a fast Fourier transform to the cube roots of the frequency bins of spectrum(ω), and take the real part, obtaining corr(τ).
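The calculation of corr(τ) described above can be sketched as follows. The function name `class_autocorrelation` is hypothetical, and a naive discrete Fourier transform is written out so that the sketch needs only the standard library; it yields the same values as a fast Fourier transform, which is what a practical implementation would use.

```python
import math


def class_autocorrelation(power_spectrum):
    """Real part of the DFT of the cube-rooted power spectrum.

    Assumption: power_spectrum is a list of non-negative values, one per
    frequency bin. The O(N^2) loop is for clarity only.
    """
    n = len(power_spectrum)
    # Cube root of each frequency bin compresses the dynamic range.
    compressed = [p ** (1.0 / 3.0) for p in power_spectrum]

    corr = []
    for tau in range(n):
        # For a real input x, Re(DFT(x))[tau] = sum_k x[k]*cos(2*pi*k*tau/n).
        corr.append(sum(x * math.cos(2.0 * math.pi * k * tau / n)
                        for k, x in enumerate(compressed)))
    return corr
```

A flat spectrum, which is what broadband noise tends toward, produces a corr(τ) concentrated at τ = 0, while a harmonic spectrum produces peaks at nonzero τ.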
Further, the embodiment of the present invention also provides a scheme that additionally uses the amplitude spectrum s of the audio frame and the amplitude spectrum of the noise spectrum n as reference values for determining noise, which can further improve the accuracy of the noise determination, as follows: the computing unit 302 is further configured to calculate, before the signal determining unit 303 determines that the audio frame is noise, the average distance d between the amplitude spectrum s of the audio frame and the amplitude spectrum of the noise spectrum n, d = 20(log10(s) - log10(n));
The signal determining unit 303 is specifically configured to determine that the audio frame is noise if d is less than a second threshold and max(ecorr) is less than the first threshold for each of a predetermined number of consecutive audio frames, or if d is less than the second threshold and the τ corresponding to max(ecorr) is not within the preset threshold range, the preset threshold range being a preset time range.
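The average distance d can be sketched as below. The function name `average_distance_db` and the `floor` guard against taking the logarithm of zero are assumptions added here; the sketch also assumes that "average" means the mean of 20(log10(s) - log10(n)) over frequency bins.

```python
import math


def average_distance_db(frame_amp, noise_amp, floor=1e-12):
    """Mean over bins of 20*(log10(s) - log10(n)), in decibels.

    frame_amp: amplitude spectrum s of the audio frame (list of floats).
    noise_amp: amplitude spectrum of the noise spectrum n (same length).
    floor: hypothetical lower bound that keeps log10 defined for zero bins.
    """
    terms = [
        20.0 * (math.log10(max(s, floor)) - math.log10(max(n, floor)))
        for s, n in zip(frame_amp, noise_amp)
    ]
    return sum(terms) / len(terms)
```

A frame whose amplitude spectrum matches the noise estimate gives d close to 0, so a small d supports the noise decision.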
Further, the embodiment of the present invention also provides an implementation for updating the noise spectrum. Updating the noise spectrum makes the next noise determination more accurate and also provides an accurate basis for subsequent noise reduction processing, as follows: as shown in Fig. 4, the apparatus further comprises:
A noise spectrum updating unit 401, configured to determine a new noise spectrum n by window averaging if the signal determining unit 303 determines that the audio frame is noise.
Window averaging means calculating an average with a window as the reference. For example, suppose the window size is 8 and the audio frames numbered 1 to 8 in time order have already been determined to be noise. If the current audio frame, numbered 9, is also noise, then the audio frames numbered 2 to 9 form the new window, and the average of the noise spectra of the audio frames numbered 2 to 9 is calculated.
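The window averaging in the example above can be sketched with a fixed-length queue. The class name `NoiseSpectrumEstimator` and its method names are hypothetical; the default window of 8 is taken from the example in the text.

```python
from collections import deque


class NoiseSpectrumEstimator:
    """Keeps the spectra of the most recent noise frames and averages them."""

    def __init__(self, window=8):
        # deque(maxlen=...) drops the oldest frame automatically, so after
        # frame 9 arrives the window holds frames 2 to 9.
        self._frames = deque(maxlen=window)

    def update(self, noise_frame_spectrum):
        """Add a frame judged to be noise and return the new noise spectrum."""
        self._frames.append(list(noise_frame_spectrum))
        count = len(self._frames)
        # Average each frequency bin across the frames in the window.
        return [sum(bins) / count for bins in zip(*self._frames)]
```

Only frames already determined to be noise should be passed to `update`, so that voice and music do not leak into the noise estimate.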
Further, the embodiment of the present invention also provides a specific implementation of the noise reduction processing, as follows: as shown in Fig. 5, the apparatus further comprises:
A filter unit 501, configured to apply Wiener filtering to the audio frames of the audio signal using the new noise spectrum.
Wiener filtering is a commonly used means of noise reduction. Combined with the accurate new noise spectrum of the embodiment of the present invention, it achieves a better noise reduction effect; the noise reduction processing does not damage music or voice and can improve the quality of the audio signal, which makes it suitable for complex application scenarios, such as live streaming, that contain music, voice and noise.
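The text does not state which form of Wiener filter is used. The sketch below assumes one common frequency-domain form, in which the gain per bin is the estimated clean power divided by the observed power, with the clean power estimated by subtracting the noise power. The function names and the `eps` guard are hypothetical.

```python
def wiener_gains(frame_amp, noise_amp, eps=1e-12):
    """Per-bin Wiener gain max(|S|^2 - |N|^2, 0) / |S|^2.

    frame_amp: amplitude spectrum of the noisy frame.
    noise_amp: amplitude spectrum of the current noise estimate.
    eps: hypothetical guard against division by zero.
    """
    gains = []
    for s, n in zip(frame_amp, noise_amp):
        signal_power = s * s
        clean_power = max(signal_power - n * n, 0.0)
        gains.append(clean_power / (signal_power + eps))
    return gains


def apply_wiener(frame_amp, noise_amp):
    """Scale each bin of the frame's amplitude spectrum by its gain."""
    gains = wiener_gains(frame_amp, noise_amp)
    return [g * s for g, s in zip(gains, frame_amp)]
```

Bins that rise well above the noise estimate keep a gain near 1, while bins at or below the noise level are attenuated toward 0.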
Further, the embodiment of the present invention also provides an application scenario for determining non-noise, as follows: the signal determining unit 303 is further configured to determine that the audio frame is voice or music if the audio frame is not determined to be noise.
In this embodiment, "not determined to be noise" covers every case in which the conditions for determining noise are not satisfied, that is, every case in which the embodiment of the present invention does not determine that the audio frame is noise.
Further, in order to improve the accuracy of determining that an audio frame is voice or music, the embodiment of the present invention also provides an implementation that takes into account the average distance d between the amplitude spectrum s of the audio frame and the amplitude spectrum of the noise spectrum n, as follows: the signal determining unit 303 is further configured to determine that the audio frame is voice or music if d is greater than the second threshold and the audio frame is not determined to be noise.
In this embodiment, if d is not greater than the second threshold, it can be considered that the audio frame can be determined neither to be noise nor to be voice or music.
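The three outcomes described above (noise, voice or music, undetermined) can be sketched as a small decision function. The function name, the string labels, and the boolean `noise_condition` argument (standing for either of the ecorr-based noise conditions) are assumptions for illustration; how frames with d exactly equal to the second threshold are treated follows this sketch's reading of the text.

```python
def classify_frame(d, second_threshold, noise_condition):
    """Combine the distance d with the ecorr-based noise condition.

    d: average distance between the frame's amplitude spectrum and the
       amplitude spectrum of the noise spectrum.
    noise_condition: True if max(ecorr) stayed below the first threshold
       for the required run of frames, or the peak tau is out of range.
    """
    if d < second_threshold and noise_condition:
        return "noise"
    if d > second_threshold and not noise_condition:
        return "voice_or_music"
    # Neither rule applies, so the frame type is left undetermined.
    return "undetermined"
```

Leaving ambiguous frames undetermined keeps them out of both the noise spectrum update and the voice or music decision.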
Further, in view of the voice or music that has been accurately determined, the embodiment of the present invention also provides a scheme for updating the threshold, which can further improve the accuracy of determining the type of subsequent audio frames, as follows: as shown in Fig. 6, the apparatus further comprises:
A threshold updating unit 601, configured to update the first threshold by window averaging using the ecorr(τ) of the audio frame if the audio frame is not determined to be voice or music.
An embodiment of the present invention further provides a terminal device which, as shown in Fig. 7, comprises an input/output device 701, a processor 702 and a memory 703. The memory 703 can be used to store data input through the input/output device 701 or data to be output through the input/output device 701, and can also be used to provide the cache that the processor 702 needs when performing data processing;
The processor 702 is configured to obtain an audio signal to be processed and to calculate the power spectrum spectrum(ω) of an audio frame in the audio signal, where ω is 2π times the frequency of the power spectrum;
To calculate an autocorrelation-like spectrum corr(τ) from the power spectrum of the audio frame, where τ is a time value;
To calculate an enhanced correlation spectrum ecorr(τ) from the autocorrelation spectrum corr(τ);
To obtain the maximum value max(ecorr) of ecorr(τ) and, if max(ecorr) is less than a first threshold for each of a predetermined number of consecutive audio frames, to determine that the audio frame is noise, the first threshold being a threshold for the enhanced correlation spectrum;
Or to obtain the τ corresponding to max(ecorr) and, if that τ is not within a preset threshold range, to determine that the audio frame is noise, the preset threshold range being a preset time range.
The embodiment of the present invention places no unique limitation on how corr(τ) is enhanced once it has been determined. Optional implementations are given in subsequent embodiments.
The embodiment of the present invention identifies noise accurately on the basis of the enhanced correlation spectrum ecorr(τ) and can distinguish noise from music and voice, thereby providing a basis for noise reduction processing.
Optionally, as one optional implementation, the embodiment of the present invention provides a scheme for calculating ecorr(τ) from corr(τ). It should be noted that performing the enhancement in other ways does not affect the implementation of the embodiment, which places no unique limitation on this. The processor 702 being configured to calculate the enhanced correlation spectrum ecorr(τ) from the autocorrelation spectrum corr(τ) includes:
Setting the values less than 0 in corr(τ) to 0 and then calculating the enhanced spectrum ecorr(τ);
Setting the values less than 0 in ecorr(τ) to 0, obtaining the enhanced correlation spectrum ecorr(τ).
This way of enhancing corr(τ) requires little computation and can serve as a preferred implementation.
Optionally, the embodiment of the present invention further provides a scheme for calculating ecorr(τ) that improves the result of the subsequent calculation of ecorr(τ), as follows: the processor 702 being configured to calculate the enhanced spectrum ecorr(τ) includes:
Calculating ecorr(τ) according to ecorr(τ) = corr(τ) - corr(τ/2); if τ is odd, corr(τ/2) is obtained by averaging the neighboring values.
This scheme for calculating ecorr(τ) improves the accuracy of ecorr(τ) while requiring little computation, and is suited to application scenarios such as live streaming in which the volume of data to be processed is large.
Optionally, the embodiment of the present invention further provides a preferred implementation for calculating corr(τ), as follows: the processor 702 being configured to calculate the autocorrelation-like spectrum corr(τ) from the power spectrum of the audio frame includes:
Calculating the cube root of each frequency bin of spectrum(ω), applying a fast Fourier transform to the cube roots of the frequency bins of spectrum(ω), and taking the real part, obtaining corr(τ).
Further, the embodiment of the present invention also provides a scheme that additionally uses the amplitude spectrum s of the audio frame and the amplitude spectrum of the noise spectrum n as reference values for determining noise, which can further improve the accuracy of the noise determination, as follows: the processor 702 is further configured to calculate, before determining that the audio frame is noise, the average distance d between the amplitude spectrum s of the audio frame and the amplitude spectrum of the noise spectrum n, d = 20(log10(s) - log10(n)); and to determine that the audio frame is noise if d is less than a second threshold and max(ecorr) is less than the first threshold for each of a predetermined number of consecutive audio frames, or if d is less than the second threshold and the τ corresponding to max(ecorr) is not within the preset threshold range, the preset threshold range being a preset time range.
Further, the embodiment of the present invention also provides an implementation for updating the noise spectrum. Updating the noise spectrum makes the next noise determination more accurate and also provides an accurate basis for subsequent noise reduction processing, as follows: the processor 702 is further configured to determine a new noise spectrum by window averaging if the audio frame is determined to be noise.
Window averaging means calculating an average with a window as the reference. For example, suppose the window size is 8 and the audio frames numbered 1 to 8 in time order have already been determined to be noise. If the current audio frame, numbered 9, is also noise, then the audio frames numbered 2 to 9 form the new window, and the average of the noise spectra of the audio frames numbered 2 to 9 is calculated.
Further, the embodiment of the present invention also provides a specific implementation of the noise reduction processing, as follows: the processor 702 is further configured to apply Wiener filtering to the audio frames of the audio signal using the new noise spectrum after the new noise spectrum has been determined.
Wiener filtering is a commonly used means of noise reduction. Combined with the accurate new noise spectrum of the embodiment of the present invention, it achieves a better noise reduction effect; the noise reduction processing does not damage music or voice and can improve the quality of the audio signal, which makes it suitable for complex application scenarios, such as live streaming, that contain music, voice and noise.
Further, the embodiment of the present invention also provides an application scenario for determining non-noise, as follows: the processor 702 is further configured to determine that the audio frame is voice or music if the audio frame is not determined to be noise.
In this embodiment, "not determined to be noise" covers every case in which the conditions for determining noise are not satisfied, that is, every case in which the embodiment of the present invention does not determine that the audio frame is noise.
Further, in order to improve the accuracy of determining that an audio frame is voice or music, the embodiment of the present invention also provides an implementation that takes into account the average distance d between the amplitude spectrum s of the audio frame and the amplitude spectrum of the noise spectrum n, as follows: the processor 702 is further configured to determine that the audio frame is voice or music if d is greater than the second threshold and the audio frame is not determined to be noise.
In this embodiment, if d is not greater than the second threshold, it can be considered that the audio frame can be determined neither to be noise nor to be voice or music.
Further, in view of the voice or music that has been accurately determined, the embodiment of the present invention also provides a scheme for updating the threshold, which can further improve the accuracy of determining the type of subsequent audio frames, as follows: the processor 702 is further configured to update the first threshold by window averaging using the ecorr(τ) of the audio frame if the audio frame is not determined to be voice or music.
An embodiment of the present invention further provides another terminal device, as shown in Fig. 8. For ease of description, only the parts related to the embodiment of the present invention are shown; for technical details that are not disclosed here, refer to the method part of the embodiments of the present invention. The terminal device may be any terminal device such as a mobile phone, a tablet computer, a PDA (personal digital assistant), a POS (point of sales) terminal or a vehicle-mounted computer. The following takes a mobile phone as an example:
Fig. 8 is a block diagram of part of the structure of a mobile phone related to the terminal device provided by the embodiment of the present invention. Referring to Fig. 8, the mobile phone includes components such as a radio frequency (RF) circuit 810, a memory 820, an input unit 830, a display unit 840, a sensor 850, an audio circuit 860, a wireless fidelity (WiFi) module 870, a processor 880 and a power supply 890. Those skilled in the art will understand that the structure shown in Fig. 8 does not limit the mobile phone, which may include more or fewer components than illustrated, combine some components, or arrange the components differently.
Each component of the mobile phone is described in detail below with reference to Fig. 8:
The RF circuit 810 can be used to receive and send signals while information is being sent and received or during a call. In particular, it receives downlink information from a base station and passes it to the processor 880 for processing, and it sends uplink data to the base station. The RF circuit 810 generally includes, but is not limited to, an antenna, at least one amplifier, a transceiver, a coupler, a low noise amplifier (LNA) and a duplexer. The RF circuit 810 can also communicate with networks and other devices by wireless communication. The wireless communication may use any communication standard or protocol, including but not limited to Global System for Mobile Communications (GSM), General Packet Radio Service (GPRS), Code Division Multiple Access (CDMA), Wideband Code Division Multiple Access (WCDMA), Long Term Evolution (LTE), e-mail and Short Messaging Service (SMS).
The memory 820 can be used to store software programs and modules. The processor 880 executes the various functional applications and data processing of the mobile phone by running the software programs and modules stored in the memory 820. The memory 820 may mainly include a program storage area and a data storage area. The program storage area can store the operating system and the application programs required by at least one function (such as a sound playback function or an image playback function); the data storage area can store data created through use of the mobile phone (such as audio data or a phone book). In addition, the memory 820 may include high-speed random access memory and may also include non-volatile memory, for example at least one magnetic disk storage device, a flash memory device, or another volatile solid-state storage device.
The input unit 830 can be used to receive input numeric or character information and to generate key signal input related to user settings and function control of the mobile phone. Specifically, the input unit 830 may include a touch panel 831 and other input devices 832. The touch panel 831, also called a touch screen, can collect touch operations by the user on or near it (for example, operations performed by the user on or near the touch panel 831 with a finger, a stylus or any other suitable object or accessory) and drive the corresponding connection device according to a preset program. Optionally, the touch panel 831 may include two parts, a touch detection device and a touch controller. The touch detection device detects the position of the user's touch, detects the signal produced by the touch operation, and passes the signal to the touch controller; the touch controller receives the touch information from the touch detection device, converts it into touch point coordinates, sends these to the processor 880, and can receive and execute commands sent by the processor 880. The touch panel 831 may be implemented in several types, such as resistive, capacitive, infrared and surface acoustic wave. Besides the touch panel 831, the input unit 830 may also include other input devices 832. Specifically, the other input devices 832 may include, but are not limited to, one or more of a physical keyboard, function keys (such as volume control keys and an on/off key), a trackball, a mouse and a joystick.
The display unit 840 can be used to display information input by the user or information provided to the user, as well as the various menus of the mobile phone. The display unit 840 may include a display panel 841, which may optionally be configured in the form of a liquid crystal display (LCD), an organic light-emitting diode (OLED) display, or the like. Further, the touch panel 831 may cover the display panel 841. When the touch panel 831 detects a touch operation on or near it, it passes the operation to the processor 880 to determine the type of the touch event, and the processor 880 then provides the corresponding visual output on the display panel 841 according to the type of the touch event. Although in Fig. 8 the touch panel 831 and the display panel 841 are two separate components that implement the input and output functions of the mobile phone, in some embodiments the touch panel 831 and the display panel 841 may be integrated to implement the input and output functions of the mobile phone.
The mobile phone may also include at least one sensor 850, such as a light sensor, a motion sensor or other sensors. Specifically, the light sensor may include an ambient light sensor and a proximity sensor. The ambient light sensor can adjust the brightness of the display panel 841 according to the ambient light level, and the proximity sensor can turn off the display panel 841 and/or the backlight when the mobile phone is moved to the ear. As one kind of motion sensor, an accelerometer can detect the magnitude of acceleration in each direction (generally along three axes) and, when stationary, the magnitude and direction of gravity. It can be used in applications that recognize the attitude of the mobile phone (such as switching between landscape and portrait, related games and magnetometer attitude calibration) and in vibration recognition functions (such as a pedometer or tap detection). Other sensors that may also be configured on the mobile phone, such as a gyroscope, a barometer, a hygrometer, a thermometer and an infrared sensor, are not described here.
The audio circuit 860, a speaker 861 and a microphone 862 can provide an audio interface between the user and the mobile phone. The audio circuit 860 can convert received audio data into an electrical signal and transmit it to the speaker 861, which converts it into a sound signal for output. Conversely, the microphone 862 converts a collected sound signal into an electrical signal, which the audio circuit 860 receives and converts into audio data; the audio data is then output to the processor 880 for processing and sent through the RF circuit 810 to, for example, another mobile phone, or output to the memory 820 for further processing.
WiFi is a short-range wireless transmission technology. Through the WiFi module 870 the mobile phone can help the user send and receive e-mail, browse web pages, access streaming media and so on, providing the user with wireless broadband Internet access. Although Fig. 8 shows the WiFi module 870, it can be understood that the module is not an essential part of the mobile phone and may be omitted as needed without changing the essence of the invention.
The processor 880 is the control center of the mobile phone. It connects all parts of the mobile phone through various interfaces and lines, and performs the various functions of the mobile phone and processes data by running or executing the software programs and/or modules stored in the memory 820 and calling the data stored in the memory 820, thereby monitoring the mobile phone as a whole. Optionally, the processor 880 may include one or more processing units. Preferably, the processor 880 may integrate an application processor and a modem processor, where the application processor mainly handles the operating system, the user interface, application programs and the like, and the modem processor mainly handles wireless communication. It can be understood that the modem processor need not be integrated into the processor 880.
The mobile phone also includes a power supply 890 (such as a battery) that supplies power to all components. Preferably, the power supply may be logically connected to the processor 880 through a power management system, so that functions such as charging, discharging and power consumption management are handled by the power management system.
Although not shown, the mobile phone may also include a camera, a Bluetooth module and the like, which are not described here.
In the embodiment of the present invention, the functions of the processor 880 included in this terminal device may correspond to the functions of the processor 702 in the preceding embodiment. The audio circuit 860 may serve as the input/output device used to collect the audio signal.
It should be noted that the units included in the apparatus embodiments above are divided only according to functional logic. The division is not limited to the one described, provided that the corresponding functions can be realized. In addition, the specific names of the functional units serve only to distinguish them from one another and do not limit the protection scope of the present invention.
In addition, a person of ordinary skill in the art will understand that all or part of the steps in the method embodiments above can be completed by a program instructing the relevant hardware. The program can be stored in a computer-readable storage medium, and the storage medium may be a read-only memory, a magnetic disk, an optical disc or the like.
The above are only preferred specific embodiments of the present invention, and the protection scope of the present invention is not limited to them. Any change or replacement that a person skilled in the art could readily conceive of within the technical scope disclosed by the embodiments of the present invention shall fall within the protection scope of the present invention. The protection scope of the present invention shall therefore be determined by the protection scope of the claims.
Claims (20)
1. A noise detection method, characterized by comprising:
Obtaining an audio signal to be processed, and calculating the power spectrum spectrum(ω) of an audio frame in the audio signal, where ω is 2π times the frequency of the power spectrum;
Calculating an autocorrelation-like spectrum corr(τ) from the power spectrum of the audio frame, where τ is a time value;
Calculating an enhanced correlation spectrum ecorr(τ) from the autocorrelation spectrum corr(τ);
Obtaining the maximum value max(ecorr) of ecorr(τ) and, if max(ecorr) is less than a first threshold for each of a predetermined number of consecutive audio frames, determining that the audio frame is noise, the first threshold being a threshold for the enhanced correlation spectrum;
Or obtaining the τ corresponding to max(ecorr) and, if that τ is not within a preset threshold range, determining that the audio frame is noise, the preset threshold range being a preset time range.
2. The method according to claim 1, characterized in that calculating the enhanced correlation spectrum ecorr(τ) from the autocorrelation spectrum corr(τ) comprises:
Setting the values less than 0 in corr(τ) to 0 and then calculating the enhanced spectrum ecorr(τ);
Setting the values less than 0 in ecorr(τ) to 0, obtaining the enhanced correlation spectrum ecorr(τ).
3. The method according to claim 2, characterized in that calculating the enhanced spectrum ecorr(τ) comprises:
Calculating ecorr(τ) according to ecorr(τ) = corr(τ) - corr(τ/2); if τ is odd, corr(τ/2) is obtained by averaging the neighboring values.
4. The method according to claim 1, characterized in that calculating the autocorrelation-like spectrum corr(τ) from the power spectrum of the audio frame comprises:
Calculating the cube root of each frequency bin of spectrum(ω), applying a fast Fourier transform to the cube roots of the frequency bins of spectrum(ω), and taking the real part, obtaining corr(τ).
5. The method according to any one of claims 1 to 4, characterized in that, before determining that the audio frame is noise, the method further comprises:
Calculating the average distance d between the amplitude spectrum s of the audio frame and the amplitude spectrum of the noise spectrum n, d = 20(log10(s) - log10(n)); and determining that the audio frame is noise if d is less than a second threshold and max(ecorr) is less than the first threshold for each of a predetermined number of consecutive audio frames, or if d is less than the second threshold and the τ corresponding to max(ecorr) is not within the preset threshold range, the preset threshold range being a preset time range.
6. The method according to claim 5, characterized in that the method further comprises:
If the audio frame is determined to be noise, determining a new noise spectrum by window averaging.
7. The method according to claim 6, characterized in that, after determining the new noise spectrum, the method further comprises:
Applying Wiener filtering to the audio frames of the audio signal using the new noise spectrum.
8. The method according to any one of claims 1 to 4, characterized in that the method further comprises:
If the audio frame is not determined to be noise, determining that the audio frame is voice or music.
9. The method according to claim 8, characterized in that, before determining that the audio frame is voice or music, the method further comprises:
If d is greater than the second threshold and the audio frame is not determined to be noise, determining that the audio frame is voice or music.
10. The method according to claim 9, characterized in that the method further comprises:
If the audio frame is not determined to be voice or music, updating the first threshold by window averaging using the ecorr(τ) of the audio frame.
11. A noise detection apparatus, characterized by comprising:
A signal acquiring unit, configured to obtain an audio signal to be processed;
A computing unit, configured to calculate the power spectrum spectrum(ω) of an audio frame in the audio signal, where ω is 2π times the frequency of the power spectrum; to calculate an autocorrelation-like spectrum corr(τ) from the power spectrum of the audio frame, where τ is a time value; and to calculate an enhanced correlation spectrum ecorr(τ) from the autocorrelation spectrum corr(τ);
A signal determining unit, configured to obtain the maximum value max(ecorr) of ecorr(τ) and, if max(ecorr) is less than a first threshold for each of a predetermined number of consecutive audio frames, to determine that the audio frame is noise, the first threshold being a threshold for the enhanced correlation spectrum; or to obtain the τ corresponding to max(ecorr) and, if that τ is not within a preset threshold range, to determine that the audio frame is noise, the preset threshold range being a preset time range.
12. The device according to claim 11, characterized in that
the computing unit is specifically configured to set the values of corr(τ) that are less than 0 to 0 and then calculate the enhanced spectrum ecorr(τ); and to set the values of ecorr(τ) that are less than 0 to 0, obtaining the enhanced correlation spectrum ecorr(τ).
13. The device according to claim 12, characterized in that
the computing unit is specifically configured to calculate ecorr(τ) according to ecorr(τ) = corr(τ) - corr(τ/2); if τ is odd, corr(τ/2) is obtained by interpolation from the neighboring values.
14. The device according to claim 11, characterized in that
the computing unit is specifically configured to calculate the cube root of the frequency of spectrum(ω), apply a fast Fourier transform to the cube root of the frequency of spectrum(ω), and take the real part, obtaining corr(τ).
15. The device according to any one of claims 11 to 14, characterized in that
the computing unit is further configured to calculate, before the signal determination unit determines that the audio frame is noise, the average distance d between the amplitude spectrum s of the audio frame and the amplitude spectrum of the noise spectrum n, d = 20 (log10(s) - log10(n));
the signal determination unit is specifically configured to determine that the audio frame is noise if d is less than a second threshold and max(ecorr) of a preset number of consecutive audio frames is in each case less than the first threshold, or if d is less than the second threshold and the τ corresponding to max(ecorr) is not within the preset threshold range, the preset threshold range being a preset time range.
16. The device according to claim 15, characterized in that the device further comprises:
a noise spectrum updating unit, configured to determine a new noise spectrum n by window averaging if the signal determination unit determines that the audio frame is noise.
17. The device according to claim 16, characterized in that the device further comprises:
a filter unit, configured to apply Wiener filtering to the audio frames of the audio signal using the new noise spectrum.
18. The device according to any one of claims 11 to 14, characterized in that
the signal determination unit is further configured to determine that the audio frame is speech or music if the audio frame is not determined to be noise.
19. The device according to claim 18, characterized in that
the signal determination unit is further configured to determine, before the determining that the audio frame is speech or music, that the audio frame is speech or music if d is greater than the second threshold and the audio frame is not determined to be noise.
20. The device according to claim 19, characterized in that the device further comprises:
a threshold updating unit, configured to update the first threshold with the ecorr(τ) of the audio frame by window averaging if the audio frame is not determined to be speech or music.
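The computations recited in claims 11 to 15 can be sketched as below. This is a minimal illustration under stated assumptions, not the patented implementation. The cube root is applied to the power spectrum values. The real part of the transform is evaluated as a direct cosine sum so that the sketch needs only the standard library; an FFT routine would give the same values. Linear interpolation between the two neighbors of τ/2 is assumed for odd τ. All function and parameter names (`autocorr_like`, `enhanced_corr`, `spectral_distance_db`, `is_noise`, `eps`, `run_length`, `tau_range`) are hypothetical.

```python
import math


def autocorr_like(power_spectrum):
    """corr(tau): real part of the DFT of the cube-rooted power spectrum."""
    n = len(power_spectrum)
    compressed = [p ** (1.0 / 3.0) for p in power_spectrum]
    return [
        sum(c * math.cos(2.0 * math.pi * k * tau / n)
            for k, c in enumerate(compressed))
        for tau in range(n)
    ]


def enhanced_corr(corr):
    """ecorr(tau) = corr(tau) - corr(tau/2), clipped at 0 before and after."""
    clipped = [max(c, 0.0) for c in corr]
    out = []
    for tau in range(len(clipped)):
        if tau % 2 == 0:
            half = clipped[tau // 2]
        else:
            # Assumption: average the two neighbors of the non-integer tau/2.
            half = 0.5 * (clipped[tau // 2] + clipped[tau // 2 + 1])
        out.append(max(clipped[tau] - half, 0.0))
    return out


def spectral_distance_db(s, n, eps=1e-12):
    """Average of 20*(log10(s) - log10(n)) over all bins; eps avoids log(0)."""
    total = sum(20.0 * (math.log10(a + eps) - math.log10(b + eps))
                for a, b in zip(s, n))
    return total / len(s)


def is_noise(max_history, peak_tau, d, first_thr, second_thr,
             tau_range, run_length):
    """Decision of claim 15: d must be small, and either max(ecorr) stayed
    below the first threshold for run_length consecutive frames or the peak
    position lies outside the preset time range."""
    if d >= second_thr:
        return False
    recent = max_history[-run_length:]
    all_low = len(recent) == run_length and all(m < first_thr for m in recent)
    out_of_range = not (tau_range[0] <= peak_tau <= tau_range[1])
    return all_low or out_of_range
```

Subtracting corr(τ/2) suppresses the peak that a periodic signal produces at twice its true period, which is why only the enhanced spectrum is searched for its maximum.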
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201610769237.5A CN106356071B (en) | 2016-08-30 | 2016-08-30 | Noise detection method and device |
Applications Claiming Priority (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201610769237.5A CN106356071B (en) | 2016-08-30 | 2016-08-30 | Noise detection method and device |
Publications (2)
Publication Number | Publication Date |
---|---|
CN106356071A (en) | 2017-01-25 |
CN106356071B (en) | 2019-10-25 |
Family
ID=57856133
Family Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201610769237.5A Active CN106356071B (en) | 2016-08-30 | 2016-08-30 | Noise detection method and device |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN106356071B (en) |
Citations (9)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN101089952A (en) * | 2006-06-15 | 2007-12-19 | 株式会社东芝 | Method and device for controlling noise, smoothing speech manual, extracting speech characteristic, phonetic recognition and training phonetic mould |
EP2362390A1 (en) * | 2010-02-12 | 2011-08-31 | Nxp B.V. | Noise suppression |
CN102637438A (en) * | 2012-03-23 | 2012-08-15 | 同济大学 | Voice filtering method |
JP2013246418A (en) * | 2012-05-29 | 2013-12-09 | Oki Electric Ind Co Ltd | Noise suppression device, method, and program |
CN105092249A (en) * | 2015-09-22 | 2015-11-25 | 山东理工大学 | Rolling bearing fault diagnosis method based on Gabor filter |
CN105118522A (en) * | 2015-08-27 | 2015-12-02 | 广州市百果园网络科技有限公司 | Noise detection method and device |
CN105575406A (en) * | 2016-01-07 | 2016-05-11 | 深圳市音加密科技有限公司 | Noise robustness detection method based on likelihood ratio test |
CN105721083A (en) * | 2016-05-09 | 2016-06-29 | 南通大学 | Frequency spectrum detection method based on autocorrelation energy |
CN105741849A (en) * | 2016-03-06 | 2016-07-06 | 北京工业大学 | Voice enhancement method for fusing phase estimation and human ear hearing characteristics in digital hearing aid |
Cited By (7)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN108492837A (en) * | 2018-03-23 | 2018-09-04 | 腾讯音乐娱乐科技(深圳)有限公司 | Detection method, device and the storage medium of audio burst white noise |
CN108492837B (en) * | 2018-03-23 | 2020-10-13 | 腾讯音乐娱乐科技(深圳)有限公司 | Method, device and storage medium for detecting audio burst white noise |
CN109979478A (en) * | 2019-04-08 | 2019-07-05 | 网易(杭州)网络有限公司 | Voice de-noising method and device, storage medium and electronic equipment |
CN112013947A (en) * | 2019-05-31 | 2020-12-01 | 北京小米移动软件有限公司 | Motor abnormal sound detection method, device and system |
CN111128243A (en) * | 2019-12-25 | 2020-05-08 | 苏州科达科技股份有限公司 | Noise data acquisition method, device and storage medium |
CN112908352A (en) * | 2021-03-01 | 2021-06-04 | 百果园技术(新加坡)有限公司 | Audio denoising method and device, electronic equipment and storage medium |
CN112908352B (en) * | 2021-03-01 | 2024-04-16 | 百果园技术(新加坡)有限公司 | Audio denoising method and device, electronic equipment and storage medium |
Also Published As
Publication number | Publication date |
---|---|
CN106356071B (en) | 2019-10-25 |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
CN106356070B (en) | A kind of acoustic signal processing method and device | |
CN106356071B (en) | A kind of noise detecting method and device | |
CN105280195A (en) | Method and device for processing speech signal | |
CN106126174B (en) | A kind of control method and electronic equipment of scene audio | |
CN106782613B (en) | Signal detection method and device | |
CN108447472A (en) | Voice awakening method and device | |
CN107993672B (en) | Frequency band expanding method and device | |
CN108470571B (en) | Audio detection method and device and storage medium | |
CN106384597A (en) | Audio frequency data processing method and device | |
CN103813127B (en) | A kind of video call method, terminal and system | |
CN105959482B (en) | A kind of control method and electronic equipment of scene audio | |
CN107770729A (en) | Signal intensity reminding method and Related product | |
CN109817241B (en) | Audio processing method, device and storage medium | |
CN111477243B (en) | Audio signal processing method and electronic equipment | |
CN106921791A (en) | The storage and inspection method of a kind of multimedia file, device and mobile terminal | |
CN106331359A (en) | Speech signal acquisition method and apparatus and terminal | |
CN108492837B (en) | Method, device and storage medium for detecting audio burst white noise | |
CN108512625A (en) | Anti-interference method, mobile terminal and the storage medium of camera | |
CN106095387A (en) | The audio method to set up of a kind of terminal and terminal | |
CN106170034A (en) | A kind of sound effect treatment method and mobile terminal | |
CN104091600B (en) | A kind of song method for detecting position and device | |
CN106506437A (en) | A kind of audio data processing method, and equipment | |
CN106265019A (en) | A kind of vibrations apparatus control method and device | |
CN106527666A (en) | Control method of central processing unit and terminal equipment | |
CN106793010A (en) | A kind of method for network access and equipment |
Legal Events
Code | Title | Description |
---|---|---|
C06 | Publication | |
PB01 | Publication | |
SE01 | Entry into force of request for substantive examination | |
GR01 | Patent grant | |
TR01 | Transfer of patent right | Effective date of registration: 2023-10-10. Address after: 31a, 15 / F, building 30, maple mall, bangrang Road, Brazil, Singapore. Patentee after: Baiguoyuan Technology (Singapore) Co.,Ltd. Address before: Floor 28, Building B1, Wanda Plaza, Wanbo Business District, Nancun Town, Panyu District, Guangzhou City, Guangdong Province, 511442. Patentee before: GUANGZHOU BAIGUOYUAN NETWORK TECHNOLOGY Co.,Ltd. |