CN104867497A - Voice noise-reducing method - Google Patents

Publication number
CN104867497A
Authority
CN
China
Prior art keywords
noise
frame
speech
voice
spectrum
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN201410076957.4A
Other languages
Chinese (zh)
Inventor
陈子华
徐正春
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
BEIJING XINYOUDA VIDEO TECHNOLOGY Co Ltd
Beijing Xinwei Telecom Technology Inc
Original Assignee
BEIJING XINYOUDA VIDEO TECHNOLOGY Co Ltd
Beijing Xinwei Telecom Technology Inc
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by BEIJING XINYOUDA VIDEO TECHNOLOGY Co Ltd, Beijing Xinwei Telecom Technology Inc
Priority to CN201410076957.4A
Publication of CN104867497A

Landscapes

  • Noise Elimination (AREA)

Abstract

The invention provides a voice noise-reducing method, which comprises the steps of: a, dividing a voice frame region into silent frames and voice frames through endpoint detection; b, calculating a power spectral value of the current frame to serve as a noise power spectrum estimated value for the silent frames, and calculating an average noise power spectrum to serve as a noise power spectrum estimated value for the voice frames; c, subtracting the noise power spectrum estimated value from power spectra of the voice frames to obtain voice power spectra after noise reduction; and d, acquiring the voice frames after noise reduction according to the voice power spectra after noise reduction. The voice noise-reducing method reduces the error of the noise power spectrum estimated value by adopting the endpoint detection technology, and basically eliminates musical noise, thereby improving the voice noise-reducing quality and the effect of the subjective sense of hearing.

Description

Voice noise-reducing method
Technical field
The present invention relates to the field of voice calls, and in particular to a voice noise-reducing method.
Background
In voice services, the most common problem is noise during a call, and the technique most often used to handle it at present is spectral subtraction. Exploiting the short-term stationarity of the speech signal, it subtracts the short-time spectrum of the noise from the short-time spectrum of the noisy speech to obtain a cleaner speech spectrum and thereby reduce the noise. Spectral subtraction comes in two forms, amplitude spectral subtraction and power spectral subtraction. Amplitude spectral subtraction subtracts the amplitude spectrum of the noise from the amplitude spectrum of the noisy speech in the frequency domain and takes the result as the amplitude spectrum of the speech signal. Power spectral subtraction subtracts the power spectrum of the noise from the power spectrum of the noisy speech to obtain the power spectrum of the clean speech, and then obtains the amplitude spectrum by taking the square root. Because the human ear is insensitive to the phase of the speech spectral components, both algorithms modify only the amplitude and leave the phase unchanged: after the noise has been processed, the phase of the noisy speech is still used to recover the noise-reduced speech. For the noise spectrum estimate, the noise spectrum measured before any speech occurs is generally used as the estimate for the entire noise-reduction interval.
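The conventional power spectral subtraction described above can be sketched for a single frame as follows. This is a minimal illustration rather than code from the patent: the function name `power_spectral_subtraction` is hypothetical, and clipping negative differences to zero is an assumption, since the text does not say how they are handled.

```python
import numpy as np


def power_spectral_subtraction(noisy_frame, noise_power):
    """Subtract a noise power spectrum from one frame, reusing the noisy phase."""
    spectrum = np.fft.rfft(noisy_frame)
    noisy_power = np.abs(spectrum) ** 2
    # Assumption: negative differences are clipped to zero.
    clean_power = np.maximum(noisy_power - noise_power, 0.0)
    # Square root turns the power spectrum back into an amplitude spectrum.
    clean_magnitude = np.sqrt(clean_power)
    # The phase of the noisy speech is kept unchanged.
    clean_spectrum = clean_magnitude * np.exp(1j * np.angle(spectrum))
    return np.fft.irfft(clean_spectrum, n=len(noisy_frame))
```

With a zero noise estimate the frame passes through unchanged, and with a very large estimate the output is silence, which is a quick way to sanity-check the sketch.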
Spectral subtraction as described above reduces noise by subtracting the short-time noise spectrum from the short-time spectrum of the noisy speech; the algorithm is simple and easy to implement. However, because the noise spectrum measured before any speech occurs serves as the estimate for the entire noise-reduction interval, the estimation error is relatively large. After the noise spectrum is subtracted, some spectral components of relatively high power therefore remain, the spectrum shows randomly occurring peaks, and these are heard as residual noise. This noise has a certain rhythmic, fluctuating quality and is called "musical noise"; it is the combined result of tones appearing at multiple random frequency bins in each frame. Listeners can usually notice musical noise in the processed speech; it is more distinct than the noise in the original speech and also more annoying.
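The residual peaks can be illustrated with synthetic data. The sketch below is an assumption-laden toy, not an experiment from the patent: it uses white Gaussian noise, estimates the noise power spectrum as the average of 30 noise-only frames, and subtracts that fixed estimate from a later noise-only frame. The bins whose power happens to exceed the average survive the subtraction as isolated peaks.

```python
import numpy as np

rng = np.random.default_rng(0)
frame_len = 128

# 31 frames of synthetic white noise: 30 for the estimate, 1 to process.
noise = rng.standard_normal((31, frame_len))
power = np.abs(np.fft.rfft(noise, axis=1)) ** 2

# Fixed noise estimate taken before any "speech" occurs.
avg_noise_power = power[:30].mean(axis=0)

# Subtract the fixed estimate from a later noise-only frame (clipped at zero).
residual = np.maximum(power[30] - avg_noise_power, 0.0)
n_peaks = int(np.count_nonzero(residual))
print(f"{n_peaks} of {residual.size} bins keep residual power")
```

Because the surviving bins change from frame to frame, the residual sounds like short random tones, which matches the description of musical noise above.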
Summary of the invention
To solve the problem of musical noise that arises when noise is processed with current spectral subtraction, the invention proposes an improved voice noise-reducing method based on spectral subtraction. The method comprises the following steps:
a. dividing the speech frames into silent frames and speech frames by endpoint detection;
b. for a silent frame, calculating the power spectrum of the current frame as the noise power spectrum estimate; for a speech frame, calculating an average noise power spectrum as the noise power spectrum estimate;
c. subtracting the noise power spectrum estimate from the power spectrum of the speech frame to obtain the noise-reduced speech power spectrum;
d. obtaining the noise-reduced speech frame from the noise-reduced speech power spectrum.
Preferably, step a specifically comprises: calculating the energy of each speech frame; a frame whose energy is greater than or equal to a threshold is a speech frame, and a frame whose energy is less than the threshold is a silent frame. Further, the average noise energy of the first 30 speech frames is used as the threshold.
Preferably, in step b, the average noise energy of the first 30 speech frames is used as the average noise power spectrum.
Preferably, the noise spectrum estimate in step b is also smoothed.
Preferably, step d uses the phase spectrum of the speech frame before noise reduction together with the noise-reduced speech power spectrum to calculate the noise-reduced speech spectrum, and then obtains the noise-reduced speech frame.
By using endpoint detection, the invention reduces the error of the noise power spectrum estimate and essentially eliminates musical noise, thereby improving the quality of voice noise reduction and the subjective listening experience.
Brief description of the drawings
To illustrate the embodiments of the invention or the technical solutions of the prior art more clearly, the drawings used in describing them are briefly introduced below. The drawings described below evidently show only some embodiments of the invention; a person of ordinary skill in the art can derive other drawings from them without creative work.
Fig. 1 is a schematic flowchart of endpoint detection in an embodiment of the invention.
Embodiments
To make the objects, technical solutions, and advantages of the embodiments of the invention clearer, the technical solutions in the embodiments are described clearly and completely below with reference to the accompanying drawing. The described embodiments are evidently only some, not all, of the embodiments of the invention. It should be noted that, provided they do not conflict, the embodiments in this application and the features in the embodiments may be combined with one another. All other embodiments obtained by a person of ordinary skill in the art on the basis of the embodiments of the invention, without creative work, fall within the scope of protection of the invention.
In a noise-reduction method, the estimate of the noise spectrum is critical: a large estimation bias will undoubtedly degrade the quality of the noise reduction. This embodiment performs noise estimation on the basis of endpoint detection. Endpoint detection means determining the start and end points of speech in a signal that contains speech, so that the speech signal of actual interest can be separated from a continuously recorded noisy speech signal. This embodiment uses endpoint detection to divide the frames to be noise-reduced into silent frames and speech frames. In a silent frame, the current spectrum is itself the noise spectrum; in a speech frame, the average noise power spectrum is used as the noise power spectrum estimate. The resulting estimation error is much smaller than with the conventional approach of using the average noise power spectrum as the noise power spectrum estimate over the entire noise-reduction interval.
The endpoint detection method of this embodiment compares the short-time energy of the speech signal with a threshold: if the energy exceeds the threshold, the current segment contains speech; otherwise, it is a silent segment without speech. The overall endpoint detection flow is shown in Fig. 1. First, an empirical value is set as the threshold; this embodiment uses the average noise energy (EMN) of the first 30 speech frames. The energy of each frame is then calculated in turn (the formula itself is not reproduced in this text), where N is the frame length, n is the frame index with 1≤n≤L, L is the total number of frames, and m indexes the sample points within each frame. If the energy of the current frame is greater than or equal to the threshold, the current frame is a speech frame; if it is less than the threshold, the frame is a noise frame.
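The energy-threshold endpoint detection described above might be sketched as follows. Because the energy formula is missing from this text, the sketch assumes the usual short-time energy, that is, the sum of the squared samples of each frame. The function name `detect_speech_frames` and its parameters are hypothetical; the frame length of 128 and the 30 leading frames are taken from the embodiment.

```python
import numpy as np


def detect_speech_frames(signal, frame_len=128, noise_frames=30):
    """Label each frame as speech (True) or silent (False) by its energy."""
    signal = np.asarray(signal, dtype=float)
    n_frames = len(signal) // frame_len
    # Non-overlapping frames; trailing samples that do not fill a frame are dropped.
    frames = signal[: n_frames * frame_len].reshape(n_frames, frame_len)
    # Assumed short-time energy: sum of squared samples in each frame.
    energy = np.sum(frames ** 2, axis=1)
    # Threshold: average energy of the leading frames, treated as noise.
    threshold = energy[:noise_frames].mean()
    # Energy >= threshold marks a speech frame, otherwise a silent frame.
    return energy >= threshold, energy, threshold
```

Note that the leading frames themselves straddle their own average, so some of them are labeled as speech; the sketch follows the threshold rule literally and does not try to correct for this.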
The specific implementation steps of the noise-reduction method of this embodiment are described in detail below:
1. Pre-filter the input speech signal.
2. Divide the speech signal into frames of 128 sample points each.
3. Apply a Hamming window to each signal frame.
4. Apply an FFT to each windowed signal frame.
5. Compute the power spectrum of each speech frame signal.
6. Compute the average noise power spectrum from the first 30 frames.
7. Use endpoint detection to detect silent frames and perform noise estimation. For a silent frame, the power spectrum of the current frame is used as the noise power spectrum estimate; for a speech frame, the average noise power spectrum calculated in step 6 is used as the estimate (other common methods may also be used to calculate the average noise power spectrum).
8. Apply median smoothing to the noise spectrum estimate to eliminate outliers and make the estimate smoother.
9. Perform spectral subtraction: subtract the noise power spectrum estimate from the power spectrum of the speech to be noise-reduced to obtain the noise-reduced speech power spectrum.
10. Insert the speech phase spectrum from before noise reduction and calculate the speech spectrum.
11. Apply an IFFT to recover the noise-reduced speech frame.
12. Combine the speech frames into the noise-reduced speech signal.
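Steps 2 to 12 can be sketched end to end as follows. This is one possible reading under stated assumptions, not the patent's implementation: the pre-filter of step 1 is omitted because the text does not specify it; frames are non-overlapping and simply concatenated, so the Hamming window is not undone; the frame energy is the sum of squared samples; median smoothing runs along frequency with a 3-bin kernel; negative power differences are clipped to zero; and the subtraction is applied to every frame, silent or not. The names `denoise` and `median_smooth` are hypothetical.

```python
import numpy as np
from numpy.lib.stride_tricks import sliding_window_view

FRAME_LEN = 128     # samples per frame (step 2)
NOISE_FRAMES = 30   # leading frames used for the noise average (step 6)


def median_smooth(spectrum, kernel=3):
    """Median-filter a 1-D spectrum along frequency (assumed smoothing axis)."""
    padded = np.pad(spectrum, kernel // 2, mode="edge")
    return np.median(sliding_window_view(padded, kernel), axis=-1)


def denoise(signal, frame_len=FRAME_LEN, noise_frames=NOISE_FRAMES):
    """Spectral subtraction with a noise estimate chosen by endpoint detection."""
    signal = np.asarray(signal, dtype=float)
    n_frames = len(signal) // frame_len
    # Step 2: non-overlapping frames (trailing samples are dropped).
    frames = signal[: n_frames * frame_len].reshape(n_frames, frame_len)

    # Steps 3-5: Hamming window, FFT, power spectrum; keep the noisy phase.
    spectra = np.fft.rfft(frames * np.hamming(frame_len), axis=1)
    power = np.abs(spectra) ** 2
    phase = np.angle(spectra)

    # Step 6: average noise power spectrum over the leading frames.
    avg_noise_power = power[:noise_frames].mean(axis=0)

    # Endpoint detection: threshold is the mean energy of the leading frames.
    energy = np.sum(frames ** 2, axis=1)
    is_speech = energy >= energy[:noise_frames].mean()

    out = np.empty_like(frames)
    for n in range(n_frames):
        # Step 7: average noise spectrum for speech, own spectrum for silence.
        noise = avg_noise_power if is_speech[n] else power[n]
        # Step 8: median smoothing of the noise estimate.
        noise = median_smooth(noise)
        # Step 9: power spectral subtraction, clipped at zero (assumption).
        clean_power = np.maximum(power[n] - noise, 0.0)
        # Step 10: combine the new amplitude with the original phase.
        clean_spectrum = np.sqrt(clean_power) * np.exp(1j * phase[n])
        # Step 11: inverse FFT back to the time domain.
        out[n] = np.fft.irfft(clean_spectrum, n=frame_len)
    # Step 12: concatenate the frames into the output signal.
    return out.reshape(-1), is_speech
```

A practical implementation would probably use overlapping frames with overlap-add so that the window shape does not modulate the output; the text does not describe this, so the sketch keeps to the literal steps.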
This embodiment was also evaluated in simulation experiments. Three representative scenarios were chosen: speaking in a computer room, speaking beside a fan set to its lowest speed, and speaking beside a fan set to medium speed. In each scenario, an audio capture device (without a noise-reduction function) recorded 2 minutes of original noisy speech as PCM data at a sampling rate of 8 kHz. The recordings were then processed in simulation with both conventional spectral subtraction and the method of this embodiment, yielding one set of processed data for each. Comparing the data shows that in all three scenarios, both in the plots and by ear, the method of this embodiment reduces noise better than conventional spectral subtraction.
A person of ordinary skill in the art will understand that all or some of the steps of the above method embodiment may be carried out by hardware under the control of program instructions. The program may be stored in a computer-readable storage medium and, when executed, performs the steps of the above method embodiment. The storage medium includes any medium capable of storing program code, such as ROM, RAM, a magnetic disk, or an optical disc.
Finally, it should be noted that the above embodiments are intended only to illustrate the technical solution of the invention, not to limit it. Although the invention has been described in detail with reference to the foregoing embodiments, a person of ordinary skill in the art should understand that the technical solutions described in those embodiments may still be modified, or some of their technical features replaced by equivalents, and that such modifications or replacements do not cause the essence of the corresponding technical solutions to depart from the spirit and scope of the technical solutions of the embodiments of the invention.

Claims (6)

1. A voice noise-reducing method, characterized in that the method comprises the following steps:
a. dividing the speech frames into silent frames and speech frames by endpoint detection;
b. for a silent frame, calculating the power spectrum of the current frame as the noise power spectrum estimate; for a speech frame, calculating an average noise power spectrum as the noise power spectrum estimate;
c. subtracting the noise power spectrum estimate from the power spectrum of the speech frame to obtain the noise-reduced speech power spectrum;
d. obtaining the noise-reduced speech frame from the noise-reduced speech power spectrum.
2. The method according to claim 1, characterized in that step a specifically comprises: calculating the energy of each speech frame; if the energy is greater than or equal to a threshold, the frame is a speech frame, and if it is less than the threshold, the frame is a silent frame.
3. The method according to claim 2, characterized in that the average noise energy of the first 30 speech frames is used as the threshold.
4. The method according to claim 1, characterized in that, in step b, the average noise energy of the first 30 speech frames is used as the average noise power spectrum.
5. The method according to claim 1 or 4, characterized in that, in step b, the noise spectrum estimate is also smoothed.
6. The method according to claim 1, characterized in that step d uses the phase spectrum of the speech frame before noise reduction together with the noise-reduced speech power spectrum to calculate the noise-reduced speech spectrum, and then obtains the noise-reduced speech frame.
CN201410076957.4A 2014-02-26 2014-02-26 Voice noise-reducing method Pending CN104867497A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201410076957.4A CN104867497A (en) 2014-02-26 2014-02-26 Voice noise-reducing method

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN201410076957.4A CN104867497A (en) 2014-02-26 2014-02-26 Voice noise-reducing method

Publications (1)

Publication Number Publication Date
CN104867497A true CN104867497A (en) 2015-08-26

Family

ID=53913289

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201410076957.4A Pending CN104867497A (en) 2014-02-26 2014-02-26 Voice noise-reducing method

Country Status (1)

Country Link
CN (1) CN104867497A (en)

Cited By (10)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN105847857A (en) * 2016-03-07 2016-08-10 乐视致新电子科技(天津)有限公司 Method and device for processing audios when video is played in double speed
CN105989848A (en) * 2015-01-30 2016-10-05 上海西门子医疗器械有限公司 Noise reduction device and medical apparatus
CN106486131A (en) * 2016-10-14 2017-03-08 上海谦问万答吧云计算科技有限公司 A kind of method and device of speech de-noising
CN106909686A (en) * 2017-03-06 2017-06-30 吉林省盛创科技有限公司 A kind of man-machine interaction builds user's portrait cluster calculation method
CN107123419A (en) * 2017-05-18 2017-09-01 北京大生在线科技有限公司 The optimization method of background noise reduction in the identification of Sphinx word speeds
WO2019227590A1 (en) * 2018-05-29 2019-12-05 平安科技(深圳)有限公司 Voice enhancement method, apparatus, computer device, and storage medium
CN110689901A (en) * 2019-09-09 2020-01-14 苏州臻迪智能科技有限公司 Voice noise reduction method and device, electronic equipment and readable storage medium
CN110797041A (en) * 2019-10-21 2020-02-14 珠海市杰理科技股份有限公司 Voice noise reduction processing method and device, computer equipment and storage medium
CN112820309A (en) * 2020-12-31 2021-05-18 北京天润融通科技股份有限公司 RNN-based noise reduction processing method and system
CN115966206A (en) * 2022-11-23 2023-04-14 中创科技(广州)有限公司 Intelligent picture generation method, device, equipment and medium for AI voice recognition

Citations (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US6804640B1 (en) * 2000-02-29 2004-10-12 Nuance Communications Signal noise reduction using magnitude-domain spectral subtraction
CN101866652A (en) * 2010-05-11 2010-10-20 天津大学 Voice de-noising method
CN101968957A (en) * 2010-10-28 2011-02-09 哈尔滨工程大学 Voice detection method under noise condition
CN102054482A (en) * 2009-10-27 2011-05-11 中国移动通信集团公司 Method and device for enhancing voice signal
CN102750956A (en) * 2012-06-18 2012-10-24 歌尔声学股份有限公司 Method and device for removing reverberation of single channel voice
CN103531204A (en) * 2013-10-11 2014-01-22 深港产学研基地 Voice enhancing method

Non-Patent Citations (1)

* Cited by examiner, † Cited by third party
Title
彭利华: "《硕士学位论文》", 12 November 2007, 华中科技大学 *

Similar Documents

Publication Publication Date Title
CN104867497A (en) Voice noise-reducing method
CN106373587B (en) Automatic acoustic feedback detection and removing method in a kind of real-time communication system
JP5870476B2 (en) Noise estimation device, noise estimation method, and noise estimation program
JP6793706B2 (en) Methods and devices for detecting audio signals
KR101737824B1 (en) Method and Apparatus for removing a noise signal from input signal in a noisy environment
EP3689002A2 (en) Howl detection in conference systems
JP5752324B2 (en) Single channel suppression of impulsive interference in noisy speech signals.
US11756564B2 (en) Deep neural network based speech enhancement
CN104867499A (en) Frequency-band-divided wiener filtering and de-noising method used for hearing aid and system thereof
US9002030B2 (en) System and method for performing voice activity detection
EP3413310B1 (en) Acoustic meaningful signal detection in wind noise
EP2689419B1 (en) Method and arrangement for damping dominant frequencies in an audio signal
Hu et al. Techniques for estimating the ideal binary mask
US11183172B2 (en) Detection of fricatives in speech signals
Upadhyay et al. Spectral subtractive-type algorithms for enhancement of noisy speech: an integrative review
Erkelens et al. Speech enhancement based on Rayleigh mixture modeling of speech spectral amplitude distributions
US10600432B1 (en) Methods for voice enhancement
Liu et al. An improved spectral subtraction method
Borsky et al. Noise and channel normalized cepstral features for far-speech recognition
Sun et al. A variable momentum factor algorithm for a priori SNR estimation in speech enhancement
Samui et al. Two-Stage Temporal Processing for Single-Channel Speech Enhancement.
CN113409812B (en) Processing method and device of voice noise reduction training data and training method
Guo et al. Research on voice activity detection in burst and partial duration noisy environment
Alam et al. Speech enhancement based on a hybrid a priori signal-to-noise ratio (SNR) estimator and a self-adaptive Lagrange multiplier
CN116913308A (en) Single-channel voice enhancement method for balancing noise reduction amount and voice quality

Legal Events

Date Code Title Description
C06 Publication
PB01 Publication
EXSB Decision made by sipo to initiate substantive examination
SE01 Entry into force of request for substantive examination
WD01 Invention patent application deemed withdrawn after publication

Application publication date: 20150826

WD01 Invention patent application deemed withdrawn after publication