CN107910016A - Noise tolerance judgment method for noisy speech - Google Patents

Publication number
CN107910016A
Authority
CN
China
Prior art keywords
noise
noisy speech
speech
signal
sample
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Granted
Application number
CN201711372174.0A
Other languages
Chinese (zh)
Other versions
CN107910016B (en)
Inventor
王亦红
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Hohai University HHU
Original Assignee
Hohai University HHU
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Hohai University HHU
Priority to CN201711372174.0A
Publication of CN107910016A
Application granted
Publication of CN107910016B
Legal status: Active
Anticipated expiration

Classifications

    • G PHYSICS
    • G10 MUSICAL INSTRUMENTS; ACOUSTICS
    • G10L SPEECH ANALYSIS TECHNIQUES OR SPEECH SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING TECHNIQUES; SPEECH OR AUDIO CODING OR DECODING
    • G10L21/00 Speech or voice signal processing techniques to produce another audible or non-audible signal, e.g. visual or tactile, in order to modify its quality or its intelligibility
    • G10L21/02 Speech enhancement, e.g. noise reduction or echo cancellation
    • G10L21/0208 Noise filtering
    • G10L21/0216 Noise filtering characterised by the method used for estimating noise
    • G PHYSICS
    • G10 MUSICAL INSTRUMENTS; ACOUSTICS
    • G10L SPEECH ANALYSIS TECHNIQUES OR SPEECH SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING TECHNIQUES; SPEECH OR AUDIO CODING OR DECODING
    • G10L25/00 Speech or voice analysis techniques not restricted to a single one of groups G10L15/00 - G10L21/00
    • G10L25/48 Speech or voice analysis techniques not restricted to a single one of groups G10L15/00 - G10L21/00 specially adapted for particular use
    • G10L25/51 Speech or voice analysis techniques not restricted to a single one of groups G10L15/00 - G10L21/00 specially adapted for particular use for comparison or discrimination

Landscapes

  • Engineering & Computer Science (AREA)
  • Computational Linguistics (AREA)
  • Signal Processing (AREA)
  • Health & Medical Sciences (AREA)
  • Audiology, Speech & Language Pathology (AREA)
  • Human Computer Interaction (AREA)
  • Physics & Mathematics (AREA)
  • Acoustics & Sound (AREA)
  • Multimedia (AREA)
  • Quality & Reliability (AREA)
  • Noise Elimination (AREA)
  • Measurement Of Mechanical Vibrations Or Ultrasonic Waves (AREA)

Abstract

The invention discloses a noise tolerance judgment method for noisy speech. Noisy speech samples are selected, the a priori signal-to-noise ratio (SNR) of every frequency bin of each noisy speech sample is calculated, and the minimum a priori SNR of each sample is found. The minimum a priori SNRs of the samples are compared to determine the threshold for the noise tolerance of a noisy speech signal. The noisy speech to be judged is framed and pre-emphasized, and endpoint detection is performed; the power spectra of the noise and of the noisy speech are estimated separately; and the a priori SNR of every frequency bin of every frame is calculated. If the a priori SNR of one or more frequency bins of a frame is below the threshold, the noise of that frame is not tolerable and speech enhancement is required. If the a priori SNR of every frequency bin of the frame is above the threshold, the noise of that frame is tolerable and no speech enhancement is needed. The method thus identifies noisy speech whose noise is tolerable. Such speech is left unenhanced, which avoids the loss of useful information.

Description

Noise tolerance judgment method for noisy speech
Technical field
The present invention relates to a method for judging whether the noise in noisy speech is tolerable, and belongs to the technical field of noisy speech signal processing.
Background art
In speech signal processing, speech enhancement raises the signal-to-noise ratio (SNR) of noisy speech but can also discard part of the useful speech information, which distorts the speech signal. According to the auditory masking effect of the human ear, the presence of a speech signal raises the threshold at which the background noise becomes audible. On this principle, the background noise of some noisy speech signals either cannot be heard or, if heard, causes no discomfort. The present invention calls such background noise tolerable noise and gives a method for judging whether noise is tolerable.
Summary of the invention
Object of the invention: In view of the problems in the prior art, the present invention provides a noise tolerance judgment method for noisy speech. If the background noise of a noisy speech signal is judged tolerable, denoising can be skipped during signal processing. This avoids the loss of useful speech information that denoising causes and so preserves that information to the greatest extent.
Technical solution: A noise tolerance judgment method for noisy speech, comprising two parts: threshold determination and threshold-based judgment.
Part I: Steps for determining the threshold
Step 1: Record several clean speech signals.
Step 2: From the Noisex-92 noise database, extract noise samples for several scenarios from each of four classes of noise signal: impulse noise, broadband noise, periodic noise and speech interference.
Step 3: Add each noise sample to each clean speech signal recorded in Step 1 to form the various noisy speech signals. Within the SNR range of 0 dB to 20 dB, apply enhancement processing to each noisy speech signal at different SNRs.
Step 4: Give MOS (mean opinion score) ratings to each noisy speech signal before and after speech enhancement.
Step 5: Treat the same noisy speech at different SNRs as one group, find within the group the signals whose MOS ratings before and after enhancement converge, and take the one with the lowest SNR among them as the sample signal for that noisy speech.
Step 6: Frame and pre-emphasize the noisy speech samples, then perform endpoint detection on them, and estimate the power spectra of the noise and of the noisy speech separately.
Step 7: Calculate the a priori SNR of every frequency bin in every frame of each noisy speech sample according to formula (1):
[Formula (1) is not reproduced in the extracted text.]
where ξ(n, k) is the a priori SNR of the k-th frequency bin of the n-th frame;
[symbol missing] is the noisy speech power of the n-th frame at the k-th frequency bin;
[symbol missing] is the noise power of the n-th frame at the k-th frequency bin;
α takes the value 0.98.
Find the minimum a priori SNR of each noisy speech sample. Compare the minimum a priori SNRs of the samples to determine the threshold for the noise tolerance of noisy speech. Based on the above steps, the threshold determined by the present invention is 0.95 to 1.05.
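The framing, pre-emphasis and power spectrum estimation of Step 6 can be sketched as follows. This is a minimal illustration only: the pre-emphasis coefficient 0.97, the Hamming window, the frame length and hop size, and the use of a plain periodogram are all assumptions that the text does not specify, and the function names are hypothetical.

```python
import cmath
import math


def pre_emphasize(x, coeff=0.97):
    """First-order pre-emphasis y[t] = x[t] - coeff * x[t-1] (coeff is assumed)."""
    if not x:
        return []
    return [x[0]] + [x[t] - coeff * x[t - 1] for t in range(1, len(x))]


def split_frames(x, frame_len, hop):
    """Cut x into overlapping frames and apply a Hamming window (assumed) to each."""
    window = [0.54 - 0.46 * math.cos(2 * math.pi * i / (frame_len - 1))
              for i in range(frame_len)]
    frames = []
    for start in range(0, len(x) - frame_len + 1, hop):
        frames.append([x[start + i] * window[i] for i in range(frame_len)])
    return frames


def power_spectrum(frame):
    """Periodogram of one frame via a direct DFT, for bins 0 .. N/2."""
    n = len(frame)
    spectrum = []
    for k in range(n // 2 + 1):
        acc = sum(frame[t] * cmath.exp(-2j * math.pi * k * t / n)
                  for t in range(n))
        spectrum.append(abs(acc) ** 2 / n)
    return spectrum
```

The direct DFT keeps the sketch dependency-free. For real recordings an FFT routine would be the practical choice.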
Part II: Judging whether the noise of a noisy speech signal is tolerable
Step 1: Frame and pre-emphasize the noisy speech signal.
Step 2: Perform endpoint detection on the noisy speech signal.
Step 3: Estimate the power spectra of the noise and of the noisy speech separately.
Step 4: Calculate the a priori SNR of every frequency bin in every frame of the noisy speech according to formula (1).
Step 5: Select a threshold within the range 0.95 to 1.05. If the a priori SNR of any frequency bin of a noisy speech frame is below the selected threshold, the noise of that frame is considered not tolerable and speech enhancement is required. Otherwise, the noise is considered tolerable and no enhancement is needed.
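The per-frame decision of Step 5 can be sketched as below. It follows the rule as worded in the abstract and claim 1: a frame is not tolerable as soon as one frequency bin falls below the threshold. The function names are hypothetical, and treating a value exactly equal to the threshold as tolerable is an assumption, since the text only covers the "below" and "above" cases.

```python
def frame_noise_tolerable(xi_frame, threshold=0.95):
    """Return True if no frequency bin of the frame has an a priori SNR
    below the threshold (equality is treated as tolerable, by assumption)."""
    return all(xi >= threshold for xi in xi_frame)


def frames_needing_enhancement(xi_frames, threshold=0.95):
    """Indices of the frames whose noise is judged not tolerable."""
    return [n for n, xi_frame in enumerate(xi_frames)
            if not frame_noise_tolerable(xi_frame, threshold)]
```

Here `xi_frames[n][k]` is assumed to hold the a priori SNR of bin k in frame n on a linear scale, computed beforehand.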
Brief description of the drawings
Fig. 1 is a flow chart of the method of an embodiment of the present invention;
Fig. 2 is a waveform of the clean speech signal of the embodiment;
Fig. 3 is a waveform of the noisy speech signal of the embodiment at an SNR of 15 dB;
Fig. 4 is the waveform of the embodiment after filtering with the tolerance judgment applied;
Fig. 5 is the waveform of the embodiment after filtering without the tolerance judgment.
Detailed description of the embodiments
The present invention is further illustrated below with reference to a specific embodiment. It should be understood that the embodiment serves only to illustrate the invention and not to limit its scope. After reading the present disclosure, modifications of equivalent form made by those skilled in the art all fall within the scope defined by the claims appended to this application.
The method for judging whether the noise of noisy speech is tolerable comprises two parts: threshold determination and threshold-based judgment.
Part I: Threshold determination
Step 1: Extract noise samples from the Noisex-92 noise database
Noise samples of four classes of additive noise (impulse noise, broadband noise, periodic noise and speech interference) in different scenarios are extracted from the Noisex-92 noise database. For the impulse noise class, applause and hammering are selected as noise samples. For the broadband noise class, the noise inside a car travelling at 130 km/h, the noise of a noisy workshop and the clamour on a road are selected. For periodic noise, the sounds of an air-conditioner outdoor unit, an electric fan and a hair dryer are selected. For speech interference, several people talking in an office and a single person talking are selected.
Step 2: Formation and enhancement of noisy speech
Each noise sample is added to each of the following five clean speech signals (phrases given here in translation): "The highest good is like water; great virtue carries all things", "Water can carry a boat and can also capsize it", "Sailing against the current, not to advance is to fall back", "one of the speech enhancement algorithms" and "the selected noise samples". This forms the various noisy speech signals, each of which is enhanced at different SNRs within the range of 0 dB to 20 dB. A speech enhancement algorithm based on the short-time energy mean is used for speech with hammering and with applause. Spectral subtraction based on the auditory masking effect is used for speech with the noise inside a car travelling at 130 km/h, with noisy workshop noise and with road clamour. LMS adaptive filtering is used for speech with hair dryer noise, electric fan noise and air-conditioner outdoor unit noise. A comb filter is used for speech disturbed by one person or several people talking.
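Forming a noisy speech signal at a chosen SNR amounts to scaling the noise before adding it to the clean speech. The sketch below is one common way to do this and is not taken from the patent text; the function name and the use of average power over the whole signal are assumptions.

```python
import math


def mix_at_snr(clean, noise, snr_db):
    """Scale `noise` so that clean + scaled noise has the requested SNR in dB.

    Power is measured as the mean squared amplitude over the whole signal
    (an assumption; the text does not say how the SNR is measured).
    """
    if len(noise) < len(clean):
        raise ValueError("noise must be at least as long as the clean signal")
    noise = noise[:len(clean)]
    p_clean = sum(s * s for s in clean) / len(clean)
    p_noise = sum(d * d for d in noise) / len(noise)
    if p_noise == 0:
        raise ValueError("noise has zero power")
    gain = math.sqrt(p_clean / (p_noise * 10 ** (snr_db / 10)))
    return [s + gain * d for s, d in zip(clean, noise)]
```

Calling the function once per SNR value in the 0 dB to 20 dB range, for example in 5 dB steps (the step size is an assumption), yields the group of noisy signals that is later enhanced and scored.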
Step 3: Selection of sample signals
MOS ratings are given to each noisy speech signal before and after speech enhancement. The same noisy speech signal at different SNRs is treated as one group. Within the group, the signals whose MOS ratings before and after enhancement converge are found, and the one with the lowest SNR among them is taken as the sample of that noisy speech signal.
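The selection rule of this step can be sketched as a small lookup. The text does not quantify when two MOS ratings count as converged, so the `tolerance` parameter below is a hypothetical stand-in, as are the function name and the data layout.

```python
def select_sample_snr(mos_by_snr, tolerance=0.2):
    """Pick the sample signal within one group of noisy speech.

    mos_by_snr maps an SNR in dB to a pair (MOS before enhancement,
    MOS after enhancement). The ratings are taken to have converged when
    they differ by at most `tolerance` (an assumed criterion). Returns the
    lowest SNR with converged ratings, or None if there is none.
    """
    converged = [snr for snr, (before, after) in mos_by_snr.items()
                 if abs(after - before) <= tolerance]
    return min(converged) if converged else None
```

The signal at the returned SNR is the one where enhancement stops improving perceived quality, which is why it serves as the sample for the threshold.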
Step 4: Determine the threshold from the minimum a priori SNR of the sample signals
The a priori SNR of every frequency bin in every frame of each noisy speech sample is calculated according to formula (1):
[Formula (1) is not reproduced in the extracted text.]
where ξ(n, k) is the a priori SNR of the k-th frequency bin of the n-th frame;
[symbol missing] is the noisy speech power of the n-th frame at the k-th frequency bin;
[symbol missing] is the noise power of the n-th frame at the k-th frequency bin;
α takes the value 0.98.
The minimum a priori SNR of each noisy speech sample is found. Comparing the minimum a priori SNRs of the samples gives a threshold for the noise tolerance of noisy speech of 0.95 to 1.05.
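Because formula (1) is missing from this text, the sketch below does not claim to reproduce it. It assumes a decision-directed estimator with a Wiener gain, a widely used a priori SNR estimator in which a weighting factor of 0.98 is a common choice. The function names, the handling of the first frame and the requirement that every noise power value be strictly positive are all assumptions.

```python
def a_priori_snr(noisy_power, noise_power, alpha=0.98):
    """Decision-directed a priori SNR estimate (assumed form, linear scale).

    noisy_power[n][k] and noise_power[n][k] are the power spectra of the
    noisy speech and of the noise for frame n, bin k. Noise power must be > 0.
    """
    xi = []
    for n, (p_y, p_d) in enumerate(zip(noisy_power, noise_power)):
        row = []
        for k, (y, d) in enumerate(zip(p_y, p_d)):
            # Instantaneous estimate: a posteriori SNR minus 1, floored at 0.
            inst = max(y / d - 1.0, 0.0)
            if n == 0:
                row.append(inst)  # no previous frame to smooth with
            else:
                prev = xi[n - 1][k]
                gain = prev / (1.0 + prev)  # Wiener gain of the previous frame
                s_prev = gain * gain * noisy_power[n - 1][k]  # previous clean-speech power
                row.append(alpha * s_prev / noise_power[n - 1][k]
                           + (1.0 - alpha) * inst)
        xi.append(row)
    return xi


def minimum_a_priori_snr(xi):
    """Smallest a priori SNR over all frames and bins of one sample."""
    return min(min(row) for row in xi)
```

Running `minimum_a_priori_snr` on each sample signal gives the per-sample minima that the text compares to arrive at the threshold range.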
Part II: Judging whether the noise of a noisy speech signal is tolerable
A clean speech signal of a female speaker saying "one of the speech enhancement algorithms" is recorded; its waveform is shown in Fig. 2. The additive noise is the factory2 factory noise from the Noisex-92 standard noise database. Together they form the noisy speech signal with an SNR of 15 dB whose waveform is shown in Fig. 3. Wiener filtering is applied to the noisy speech signal of Fig. 3 both without and with the noise tolerance judgment. The specific steps are as follows:
First, the noisy speech signal is pre-emphasized, windowed and framed, and endpoint detection is performed.
Second, the power spectra of the noise and of the noisy speech are estimated separately.
Then, the a priori SNR of every frequency bin in every frame of the noisy speech is calculated according to formula (1).
Finally, from the threshold range of 0.95 to 1.05, the value 0.95 is taken as the threshold for the noise tolerance judgment. If the a priori SNR of any frequency bin of a noisy speech frame is below 0.95, the noise of that frame is considered not tolerable and Wiener filtering is required. Otherwise, the noise is considered tolerable and no filtering is needed. The noisy speech signal of Fig. 3 is processed frame by frame in this way, and the resulting waveform is shown in Fig. 4. For comparison, the noisy speech signal of Fig. 3 is also processed directly with Wiener filtering, without the tolerance judgment; the result is shown in Fig. 5.
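The frame-by-frame processing of this embodiment can be sketched as a gain computation in which tolerable frames pass through unchanged. The Wiener gain ξ / (1 + ξ) is the standard form. The function name is hypothetical, and the decision rule follows the abstract and claim 1, with a value equal to the threshold treated as tolerable by assumption.

```python
def wiener_gains_with_tolerance(xi_frames, threshold=0.95):
    """Per-frame spectral gains with the tolerance judgment applied.

    xi_frames[n][k] is the a priori SNR (linear scale) of bin k in frame n.
    Frames whose bins all reach the threshold get unity gain (no filtering);
    all other frames get the Wiener gain xi / (1 + xi) in every bin.
    """
    gains = []
    for xi in xi_frames:
        if all(v >= threshold for v in xi):
            gains.append([1.0] * len(xi))  # noise tolerable: leave the frame untouched
        else:
            gains.append([v / (1.0 + v) for v in xi])  # noise not tolerable: Wiener filter
    return gains
```

Multiplying each frame's spectrum by its gains and resynthesizing corresponds to the processing compared in Fig. 4. Setting `threshold` to infinity forces Wiener filtering of every frame, which corresponds to the Fig. 5 case.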

Claims (6)

  1. A noise tolerance judgment method for noisy speech, characterized in that: noise samples for several scenarios are extracted; the noise samples are added to clean speech signals to form noisy speech signals; the noisy speech is enhanced; MOS ratings are given to each noisy speech signal before and after speech enhancement; the noisy speech signal is framed and pre-emphasized, and endpoint detection is performed on it; next, the power spectra of the noise and of the noisy speech are estimated separately; the a priori signal-to-noise ratio (SNR) of every frequency bin of every frame of the noisy speech signal is calculated; if the a priori SNR of every frequency bin is above the threshold, the noise of that speech frame is tolerable and no speech enhancement is needed; if the a priori SNR of one or more frequency bins is below the threshold, the noise of that speech frame is not tolerable and speech enhancement is required.
  2. The noise tolerance judgment method for noisy speech according to claim 1, characterized in that noise samples for several scenarios are extracted from the Noisex-92 noise database from each of four classes of noise signal: impulse noise, broadband noise, periodic noise and speech interference.
  3. The noise tolerance judgment method for noisy speech according to claim 1, characterized in that each noise sample is added to each clean speech signal recorded in the first step to form the various noisy speech signals; and, within the SNR range of 0 dB to 20 dB, each noisy speech signal is enhanced at different SNRs.
  4. The noise tolerance judgment method for noisy speech according to claim 1, characterized in that the same noisy speech at different SNRs is treated as one group, the signals whose MOS ratings before and after speech enhancement converge are found within the group, and the one with the lowest SNR among them is taken as the sample signal of that noisy speech; the noisy speech samples are framed and pre-emphasized, endpoint detection is then performed on them, and the power spectra of the noise and of the noisy speech are estimated separately.
  5. The noise tolerance judgment method for noisy speech according to claim 1, characterized in that the a priori SNR of every frequency bin in every frame of each noisy speech sample is calculated; the minimum a priori SNRs of the samples are then compared to determine the threshold for the noise tolerance of a noisy speech signal.
  6. The noise tolerance judgment method for noisy speech according to claim 1, characterized in that the threshold range is 0.95-1.05 dB.
CN201711372174.0A 2017-12-19 2017-12-19 Noise tolerance judgment method for noisy speech Active CN107910016B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201711372174.0A CN107910016B (en) 2017-12-19 2017-12-19 Noise tolerance judgment method for noisy speech

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN201711372174.0A CN107910016B (en) 2017-12-19 2017-12-19 Noise tolerance judgment method for noisy speech

Publications (2)

Publication Number Publication Date
CN107910016A true CN107910016A (en) 2018-04-13
CN107910016B CN107910016B (en) 2021-07-27

Family

ID=61870324

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201711372174.0A Active CN107910016B (en) 2017-12-19 2017-12-19 Noise tolerance judgment method for noisy speech

Country Status (1)

Country Link
CN (1) CN107910016B (en)

Cited By (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN108831500A (en) * 2018-05-29 2018-11-16 平安科技(深圳)有限公司 Sound enhancement method, device, computer equipment and storage medium
CN109920434A (en) * 2019-03-11 2019-06-21 南京邮电大学 A kind of noise classification minimizing technology based on conference scenario

Citations (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20130085762A1 (en) * 2011-09-29 2013-04-04 Renesas Electronics Corporation Audio encoding device
CN103730110A (en) * 2012-10-10 2014-04-16 北京百度网讯科技有限公司 Method and device for detecting voice endpoint
CN104869209A (en) * 2015-04-24 2015-08-26 广东小天才科技有限公司 Method and apparatus for adjusting recording of mobile terminal
CN105810201A (en) * 2014-12-31 2016-07-27 展讯通信(上海)有限公司 Voice activity detection method and system

Cited By (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN108831500A (en) * 2018-05-29 2018-11-16 平安科技(深圳)有限公司 Sound enhancement method, device, computer equipment and storage medium
CN109920434A (en) * 2019-03-11 2019-06-21 南京邮电大学 A kind of noise classification minimizing technology based on conference scenario
CN109920434B (en) * 2019-03-11 2020-12-15 南京邮电大学 Noise classification removal method based on conference scene

Also Published As

Publication number Publication date
CN107910016B (en) 2021-07-27

Similar Documents

Publication Publication Date Title
CN108831499B (en) Speech enhancement method using speech existence probability
Esch et al. Efficient musical noise suppression for speech enhancement system
CN107610712B (en) Voice enhancement method combining MMSE and spectral subtraction
EP2381702A2 (en) Systems and methods for own voice recognition with adaptations for noise robustness
CN106653062A (en) Spectrum-entropy improvement based speech endpoint detection method in low signal-to-noise ratio environment
CN101430882A (en) Method and apparatus for restraining wind noise
CN103440869A (en) Audio-reverberation inhibiting device and inhibiting method thereof
CN103109320A (en) Noise suppression device
CN111091833A (en) Endpoint detection method for reducing noise influence
CN105261359A (en) Noise elimination system and method of mobile phone microphones
KR20110068637A (en) Method and apparatus for removing a noise signal from input signal in a noisy environment
CN103544961A (en) Voice signal processing method and device
CN103021405A (en) Voice signal dynamic feature extraction method based on MUSIC and modulation spectrum filter
CN105575406A (en) Noise robustness detection method based on likelihood ratio test
CN103280225B (en) Low-complexity silence detection method
CN112309417A (en) Wind noise suppression audio signal processing method, device, system and readable medium
CN107910016A (en) A kind of noise containment determination methods of noisy speech
CN109087657B (en) Voice enhancement method applied to ultra-short wave radio station
CN107045874B (en) Non-linear voice enhancement method based on correlation
CN112233657A (en) Speech enhancement method based on low-frequency syllable recognition
CN111225317A (en) Echo cancellation method
CN110444222B (en) Voice noise reduction method based on information entropy weighting
CN103201793A (en) Method and system based on voice communication for eliminating interference noise
Ghoreishi et al. A hybrid speech enhancement system based on HMM and spectral subtraction
Shen et al. A priori SNR estimator based on a convex combination of two DD approaches for speech enhancement

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant