CN110933235B - Noise identification method in intelligent calling system based on machine learning - Google Patents

Noise identification method in intelligent calling system based on machine learning

Info

Publication number
CN110933235B
Authority
CN
China
Prior art keywords
signal
noise
signals
slice
flatness
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN201911077584.1A
Other languages
Chinese (zh)
Other versions
CN110933235A (en)
Inventor
伍林
尹朝阳
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Hangzhou Zhexin Information Technology Co ltd
Original Assignee
Hangzhou Zhexin Information Technology Co ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Hangzhou Zhexin Information Technology Co ltd
Priority to CN201911077584.1A
Publication of CN110933235A
Application granted
Publication of CN110933235B
Legal status: Active
Anticipated expiration

Classifications

    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04M TELEPHONIC COMMUNICATION
    • H04M3/00 Automatic or semi-automatic exchanges
    • H04M3/18 Automatic or semi-automatic exchanges with means for reducing interference or noise; with means for reducing effects due to line faults with means for protecting lines
    • G PHYSICS
    • G10 MUSICAL INSTRUMENTS; ACOUSTICS
    • G10L SPEECH ANALYSIS TECHNIQUES OR SPEECH SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING TECHNIQUES; SPEECH OR AUDIO CODING OR DECODING
    • G10L21/00 Speech or voice signal processing techniques to produce another audible or non-audible signal, e.g. visual or tactile, in order to modify its quality or its intelligibility
    • G10L21/02 Speech enhancement, e.g. noise reduction or echo cancellation
    • G10L21/0208 Noise filtering
    • G PHYSICS
    • G10 MUSICAL INSTRUMENTS; ACOUSTICS
    • G10L SPEECH ANALYSIS TECHNIQUES OR SPEECH SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING TECHNIQUES; SPEECH OR AUDIO CODING OR DECODING
    • G10L21/00 Speech or voice signal processing techniques to produce another audible or non-audible signal, e.g. visual or tactile, in order to modify its quality or its intelligibility
    • G10L21/02 Speech enhancement, e.g. noise reduction or echo cancellation
    • G10L21/0208 Noise filtering
    • G10L21/0216 Noise filtering characterised by the method used for estimating noise
    • G10L21/0232 Processing in the frequency domain
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04M TELEPHONIC COMMUNICATION
    • H04M3/00 Automatic or semi-automatic exchanges
    • H04M3/42 Systems providing special services or facilities to subscribers
    • H04M3/50 Centralised arrangements for answering calls; Centralised arrangements for recording messages for absent or busy subscribers; Centralised arrangements for recording messages

Landscapes

  • Engineering & Computer Science (AREA)
  • Signal Processing (AREA)
  • Computational Linguistics (AREA)
  • Quality & Reliability (AREA)
  • Health & Medical Sciences (AREA)
  • Audiology, Speech & Language Pathology (AREA)
  • Human Computer Interaction (AREA)
  • Physics & Mathematics (AREA)
  • Acoustics & Sound (AREA)
  • Multimedia (AREA)
  • Telephonic Communication Services (AREA)

Abstract

The invention discloses a machine-learning-based noise identification method for an intelligent calling system, comprising the following steps: slicing the telephone signal and preprocessing the slices by normalization and framing; extracting MFCC features from the framed slice signals and averaging them; inputting the averaged MFCC features into a machine learning classifier for model training, and taking the trained classification model as the noise classification model; slicing a newly added telephone signal; normalizing and framing the slice signals; pre-screening the framed slice signals by spectral flatness; extracting MFCC features and averaging them; and inputting the averaged MFCC features of each signal segment into the noise classification model for identification. The beneficial effect of the invention is that, by identifying each signal as human voice or noise with a machine-learned classification model, a large number of noise signals in the telephone signal can be removed, thereby reducing the error rate when the signal is sent to ASR for conversion into text.

Description

Noise identification method in intelligent calling system based on machine learning
Technical Field
The invention relates to the technical field of audio processing, in particular to a noise identification method in an intelligent calling system based on machine learning.
Background
In existing intelligent calling systems, the telephone signal is segmented by voice activity detection (VAD) and the segments are sent to automatic speech recognition (ASR) for conversion into text. Because the acoustic background is complex, many of the segments are noise. The usual remedy is to filter the signal with a noise suppression method before segmentation, estimating the noise mainly from the frequency distribution of the signal; commonly used algorithms include adaptive filtering, spectral subtraction, and Wiener filtering. An adaptive filter uses the filter parameters obtained at the previous moment to automatically adjust the current parameters, so that it adapts to the randomly varying statistics of signal and noise and thereby filters out the noise; spectral subtraction removes the noise spectrum in the frequency domain and then restores the time-domain signal through an inverse Fourier transform; Wiener filtering removes noise by means of a designed digital filter. These noise suppression methods filter out only part of the noise and cannot completely remove the noise segments that are cut out; as the signal-to-noise ratio of the telephone signal decreases, the noise reduction effect deteriorates, and audio distortion caused by excessive attenuation occurs in some time intervals.
Disclosure of Invention
In order to solve the above problems, an object of the present invention is to provide a noise identification method in an intelligent calling system based on machine learning, which uses a machine-learned classification model to identify whether a signal is human voice or noise, so that a large number of noise signals in the telephone signal can be removed, thereby reducing the error rate when the signal is sent to ASR for conversion into text.
The invention provides a noise identification method in an intelligent calling system based on machine learning, which comprises the following steps:
step 1, taking the sampled telephone signals as training data, and establishing a noise classification model based on machine learning:
step 101, slicing the telephone signal, and performing normalization and framing preprocessing on the sliced signal;
step 102, extracting MFCC features from the framed slice signals, and averaging the extracted MFCC features;
step 103, inputting the averaged MFCC features into a machine learning classifier for model training, and taking the trained classification model as the noise classification model;
step 2, inputting a newly added telephone signal into the established noise classification model to obtain a noise identification result:
step 201, slicing the newly added telephone signal;
step 202, performing normalization and framing preprocessing on the slice signals;
step 203, pre-screening the framed slice signals by spectral flatness;
step 204, after the spectral flatness pre-screening, dividing the framed signals into an odd number of segments, extracting the MFCC features of each segment, and averaging them;
step 205, inputting the averaged MFCC features of each segment into the noise classification model for identification, and identifying the noise in the slice signal.
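The control flow of step 2 can be sketched as follows. This is a minimal illustration, not the patented implementation: the function name `identify_slice` and all five callables passed into it are hypothetical placeholders for the stages described in steps 202 to 205.

```python
def identify_slice(samples, preprocess, is_obvious_noise, split,
                   extract_features, classify):
    """Return "noise" or "voice" for one slice (hypothetical orchestration).

    All callables are placeholders supplied by the caller:
      preprocess        -> normalization and framing (step 202)
      is_obvious_noise  -> spectral flatness pre-screening (step 203)
      split             -> division into an odd number of segments (step 204)
      extract_features  -> averaged MFCC features of one segment (step 204)
      classify          -> noise classification model, returns a label (step 205)
    """
    frames = preprocess(samples)
    # Step 203: slices that are obviously noise never reach the classifier.
    if is_obvious_noise(frames):
        return "noise"
    # Steps 204 and 205: classify every segment, then take the majority label.
    labels = [classify(extract_features(segment)) for segment in split(frames)]
    noise_votes = sum(1 for label in labels if label == "noise")
    return "noise" if noise_votes > len(labels) / 2 else "voice"
```

Passing the stages in as callables keeps the sketch independent of any particular feature extractor or classifier library.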
As a further improvement of the invention, during preprocessing, normalization is performed according to formula (1): the slice signals are uniformly quantized with 16 bits, with values ranging between -65535 and 65535, and are normalized to between -1 and 1 by dividing by the maximum absolute value of the signal;
$$\hat{x} = \frac{x}{\max(|x|)} \qquad (1)$$
where $x$ is the slice signal to be processed, $|x|$ is the absolute value of the slice signal, and $\hat{x}$ is the normalized slice signal.
As a further improvement of the invention, when the slice signal is framed, the frame length is 30 ms and the frame shift is 10 ms.
As a further improvement of the present invention, step 203 specifically includes: extracting the spectral flatness of each frame of the slice signal, and averaging the extracted values to obtain the average flatness; setting a flatness threshold for the average flatness; if the average flatness of the slice signal is higher than the flatness threshold, judging the slice signal to be noise and discarding it directly; and if the average flatness of the slice signal is lower than the flatness threshold, passing the slice signal on to the next processing step.
As a further improvement of the present invention, the flatness threshold is 0.13.
As a further improvement of the present invention, when the extracted MFCC features are averaged, each dimension is averaged over all frames according to formula (2);
$$y_m = \frac{1}{N} \sum_{n=1}^{N} c_{n,m}, \quad m = 1, \dots, M \qquad (2)$$
in the formula, $y_m$ is the average value of the MFCC features in dimension $m$, $c_{n,m}$ is the $m$-th MFCC coefficient of frame $n$, $M$ is the dimension of the MFCC features, and $N$ is the number of frames of the slice signal after framing.
As a further improvement of the present invention, in step 205, the mode of the identification results of the segments of each slice signal is taken; if noise accounts for the majority of the identification results, the input slice signal is determined to be noise; otherwise, the input slice signal is determined to be human voice.
As a further improvement of the present invention, in step S1, the duration of each segment is 0.5 s and the segment shift is 0.25 s.
As a further improvement of the present invention, the slice signal is classified as either a human voice signal or a noise signal, a human voice threshold is set to 0.2, and in step 205, when the probability output by the classification model for the slice signal to be identified is greater than the threshold, the slice signal is determined to be a human voice signal.
As a further improvement of the invention, the machine learning classifier is one of a random forest classifier, an SVM classifier and an XGBoost classifier.
The invention has the beneficial effects that:
1. the noise identification method of the invention models a large number of telephone signals from the intelligent outbound calling system and identifies the noise in them, so that a large number of noise signals in the telephone signals are removed, thereby reducing the error rate when the signals are sent to ASR and converted into text;
2. during noise identification, the method screens out obvious noise signals on the basis of spectral flatness, which reduces the workload of the subsequent identification;
3. during noise identification, the signal is divided into an odd number of segments that are classified separately, and the mode of the identification results is taken, which effectively improves the identification accuracy for the slice signal and prevents human voice from being deleted by mistake.
Drawings
Fig. 1 is a flowchart illustrating a noise identification method in an intelligent calling system based on machine learning according to an embodiment of the present invention.
Detailed Description
The present invention will be described in further detail below with reference to specific embodiments and with reference to the attached drawings.
As shown in fig. 1, a noise identification method in an intelligent calling system based on machine learning according to an embodiment of the present invention includes:
step 1, taking the sampled telephone signals as training data, and establishing a noise classification model based on machine learning. The step 1 specifically comprises:
step 101, slicing the telephone signal, namely VAD slicing, and performing normalization and framing preprocessing on the sliced signal.
Because the slice signals differ in volume, some being louder and some quieter, normalizing the telephone signals helps to improve the recognition rate. During preprocessing, normalization is performed according to formula (1): the slice signals are uniformly quantized with 16 bits, with values ranging between -65535 and 65535, and are normalized to between -1 and 1 by dividing by the maximum absolute value of the signal;
$$\hat{x} = \frac{x}{\max(|x|)} \qquad (1)$$
where $x$ is the slice signal to be processed, $|x|$ is the absolute value of the slice signal, and $\hat{x}$ is the normalized slice signal.
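The peak normalization of formula (1) can be sketched in a few lines. This is an illustrative sketch only; the function name `normalize_slice` and the handling of an all-zero slice are assumptions that the source does not specify.

```python
def normalize_slice(samples):
    """Scale a slice into [-1, 1] by dividing by its largest absolute sample."""
    peak = max((abs(s) for s in samples), default=0)
    if peak == 0:
        # Assumption: an all-zero (or empty) slice has no peak to divide by,
        # so it is returned as zeros instead of raising a division error.
        return [0.0 for _ in samples]
    return [s / peak for s in samples]


pcm = [1200, -3000, 1500, 0]
print(normalize_slice(pcm))  # -> [0.4, -1.0, 0.5, 0.0]
```

Because every slice is divided by its own peak, a quiet slice and a loud slice with the same waveform shape yield identical normalized signals.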
After the slice signal is normalized, it needs to be framed: because the frequency content of the slice signal changes over time, only a short frame can be treated as a stationary signal from which frequency-domain features are extracted. When the slice signal is framed, the frame length is 30 ms and the frame shift is 10 ms.
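The framing step can be sketched as below. The 30 ms frame length and 10 ms frame shift come from the text; the 8 kHz sampling rate in the usage example, the function name `frame_signal`, and the choice to drop trailing samples that do not fill a whole frame are assumptions.

```python
def frame_signal(samples, sample_rate, frame_ms=30, shift_ms=10):
    """Cut a slice into overlapping frames of frame_ms, advancing by shift_ms."""
    frame_len = sample_rate * frame_ms // 1000
    shift = sample_rate * shift_ms // 1000
    frames = []
    start = 0
    # Assumption: trailing samples that cannot fill a full frame are dropped.
    while start + frame_len <= len(samples):
        frames.append(samples[start:start + frame_len])
        start += shift
    return frames


# Half a second of audio at an assumed 8 kHz sampling rate (4000 samples).
frames = frame_signal(list(range(4000)), sample_rate=8000)
print(len(frames), len(frames[0]))  # -> 48 240
```

With these values each frame holds 240 samples and consecutive frames overlap by 160 samples.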
Step 102, extracting MFCC features from the framed slice signals, and averaging the extracted MFCC features.
Since MFCC features agree well with the auditory characteristics of the human ear and can serve as representative features for a machine learning classifier, MFCC features need to be extracted from the preprocessed slice signals. However, since the slice signals have different lengths and therefore yield different numbers of frames, the extracted MFCC features also need to be averaged. When the extracted MFCC features are averaged, each dimension is averaged over all frames according to formula (2);
$$y_m = \frac{1}{N} \sum_{n=1}^{N} c_{n,m}, \quad m = 1, \dots, M \qquad (2)$$
in the formula, $y_m$ is the average value of the MFCC features in dimension $m$, $c_{n,m}$ is the $m$-th MFCC coefficient of frame $n$, $M$ is the dimension of the MFCC features, and $N$ is the number of frames of the slice signal after framing.
In the present invention, M is 39.
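The averaging of formula (2) reduces an N-by-M matrix of per-frame MFCC features to a single M-dimensional vector, so slices with different frame counts produce inputs of the same size. A minimal sketch follows; the function name `average_mfcc` is hypothetical, and the MFCC extraction itself is assumed to have been done elsewhere.

```python
def average_mfcc(frames_mfcc):
    """Average per-frame MFCC vectors dimension by dimension.

    frames_mfcc: list of N frames, each a list of M coefficients.
    Returns a list of M means, one per MFCC dimension.
    """
    if not frames_mfcc:
        raise ValueError("at least one frame is required")
    n_frames = len(frames_mfcc)
    dim = len(frames_mfcc[0])
    if any(len(frame) != dim for frame in frames_mfcc):
        raise ValueError("all frames must have the same MFCC dimension")
    return [sum(frame[m] for frame in frames_mfcc) / n_frames
            for m in range(dim)]


# Toy example with M = 3 instead of the 39 dimensions used in the text.
print(average_mfcc([[1.0, 2.0, 3.0], [3.0, 4.0, 5.0]]))  # -> [2.0, 3.0, 4.0]
```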
Of course, besides extracting MFCC features, other acoustic features, such as short-time energy, zero-crossing rate, pitch, etc., may also be extracted, or a series of features may be combined for use by the classification model.
Step 103, inputting the averaged MFCC features into a machine learning classifier for model training, and taking the trained classification model as the noise classification model.
The machine learning classifier is one of a random forest classifier, an SVM classifier and an XGBoost classifier. Of course, the present invention is not limited to these examples, and other classifiers can also be applied.
Step 2, inputting a newly added telephone signal into the established noise classification model to obtain a noise identification result. The step 2 specifically comprises:
step 201, slicing the newly added telephone signal.
Step 202, performing normalization and framing preprocessing on the slice signals.
Because the slice signals differ in volume, some being louder and some quieter, normalizing the telephone signals helps to improve the recognition rate. During preprocessing, normalization is performed according to formula (1): the slice signals are uniformly quantized with 16 bits, with values ranging between -65535 and 65535, and are normalized to between -1 and 1 by dividing by the maximum absolute value of the signal;
$$\hat{x} = \frac{x}{\max(|x|)} \qquad (1)$$
where $x$ is the slice signal to be processed, $|x|$ is the absolute value of the slice signal, and $\hat{x}$ is the normalized slice signal.
After the slice signal is normalized, it needs to be framed: because the frequency content of the slice signal changes over time, only a short frame can be treated as a stationary signal from which frequency-domain features are extracted. When the slice signal is framed, the frame length is 30 ms and the frame shift is 10 ms.
Step 203, pre-screening the framed slice signals by spectral flatness.
After framing, the spectral flatness of the signal can be used to distinguish human voice from noise, because a speech spectrum tends to have peaks at the fundamental frequency and its harmonics, whereas a noise spectrum is relatively flat. Step 203 specifically includes: extracting the spectral flatness of each frame of the slice signal, and averaging the extracted values to obtain the average flatness; setting a threshold for the average flatness; if the average flatness of the slice signal is higher than the threshold, judging the slice signal to be noise and discarding it directly; and if the average flatness of the slice signal is lower than the threshold, passing the preprocessed slice signal on to the next processing step.
The flatness threshold value of the present invention is set to 0.13.
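The pre-screening can be sketched as follows. The source does not give a formula for spectral flatness, so this sketch assumes the common definition (geometric mean of the power spectrum divided by its arithmetic mean); the function names, the naive DFT, the `eps` guard and the treatment of a value exactly equal to the threshold are likewise assumptions. Whether 0.13 is a suitable threshold depends on how the flatness is actually computed.

```python
import cmath
import math


def power_spectrum(frame):
    """Power of the non-negative-frequency DFT bins (naive O(n^2) DFT)."""
    n = len(frame)
    spectrum = []
    for k in range(n // 2 + 1):
        acc = sum(frame[t] * cmath.exp(-2j * math.pi * k * t / n)
                  for t in range(n))
        spectrum.append(abs(acc) ** 2)
    return spectrum


def spectral_flatness(frame, eps=1e-12):
    """Geometric mean over arithmetic mean of the power spectrum (assumed definition)."""
    power = [p + eps for p in power_spectrum(frame)]  # eps avoids log(0)
    log_mean = sum(math.log(p) for p in power) / len(power)
    return math.exp(log_mean) / (sum(power) / len(power))


def passes_flatness_screen(frames, threshold=0.13):
    """True if the slice should go on to classification, False if it is discarded."""
    if not frames:
        raise ValueError("at least one frame is required")
    mean_flatness = sum(spectral_flatness(f) for f in frames) / len(frames)
    # Assumption: a value exactly equal to the threshold is kept, not discarded.
    return mean_flatness <= threshold


tone = [math.sin(2 * math.pi * 4 * t / 64) for t in range(64)]  # peaky spectrum
click = [1.0] + [0.0] * 63                                      # flat spectrum
print(passes_flatness_screen([tone]), passes_flatness_screen([click]))  # -> True False
```

A real implementation would normally use an FFT routine; the naive DFT is used here only to keep the sketch free of third-party dependencies.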
Step 204, after the spectral flatness pre-screening, dividing the framed signals into an odd number of segments, extracting the MFCC features of each segment, and averaging them.
The invention divides a longer slice signal into an odd number of segments; the duration of each segment is 0.5 s and the segment shift is 0.25 s.
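The segmentation can be sketched as below. The 0.5 s duration and 0.25 s shift come from the text, but the source does not say how an odd segment count is enforced; dropping the last segment when the count is even, treating a slice shorter than one segment as a single segment, operating on raw samples rather than on frames, and the function name `split_into_odd_segments` are all assumptions.

```python
def split_into_odd_segments(samples, sample_rate, seg_s=0.5, shift_s=0.25):
    """Split a slice into overlapping segments and keep an odd number of them."""
    seg_len = int(round(seg_s * sample_rate))
    shift = int(round(shift_s * sample_rate))
    segments = []
    start = 0
    while start + seg_len <= len(samples):
        segments.append(samples[start:start + seg_len])
        start += shift
    if not segments:
        # Assumption: a slice shorter than one segment is kept whole.
        return [samples]
    if len(segments) % 2 == 0:
        # Assumption: drop the last segment so a majority vote cannot tie.
        segments.pop()
    return segments


# 1.5 s and 1.25 s of audio at an assumed 8 kHz sampling rate.
print(len(split_into_odd_segments([0] * 12000, 8000)))  # -> 5
print(len(split_into_odd_segments([0] * 10000, 8000)))  # -> 3
```

An odd count matters because the next step takes the mode of two-class labels, which is only guaranteed to be unique when the number of votes is odd.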
Step 205, inputting the averaged MFCC features of each segment into the noise classification model for identification, and identifying the noise in the slice signal.
The mode of the identification results of the segments of the slice signal is taken: if noise accounts for the majority of the identification results, the input slice signal is determined to be noise; otherwise, the input slice signal is determined to be human voice. Because a slice signal may contain both human voice and noise, the processing of step 205 can effectively improve the accuracy of signal identification.
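Taking the mode of the per-segment results amounts to a majority vote. A minimal sketch, in which the label strings and the function name `vote` are hypothetical:

```python
from collections import Counter


def vote(segment_labels):
    """Return the most frequent label among the per-segment results."""
    if not segment_labels:
        raise ValueError("at least one segment label is required")
    return Counter(segment_labels).most_common(1)[0][0]


print(vote(["noise", "voice", "noise"]))                    # -> noise
print(vote(["voice", "voice", "noise", "voice", "noise"]))  # -> voice
```

With two classes and an odd number of segments, the vote can never tie.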
Further, since the slice signal is classified as either a human voice signal or a noise signal, the human voice threshold is set to 0.2: in step 205, when the human voice probability output by the noise classification model for the slice signal to be identified is greater than the threshold, the slice signal is determined to be a human voice signal; otherwise, it is discarded as noise. This raises the human voice recall rate to 99% and avoids deleting human voice by mistake.
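The threshold decision can be sketched as follows, assuming the classifier outputs a probability that the input is human voice; the function name `label_from_probability` and the label strings are hypothetical.

```python
def label_from_probability(p_voice, voice_threshold=0.2):
    """Label an input as voice when its voice probability exceeds the threshold."""
    return "voice" if p_voice > voice_threshold else "noise"


for p in (0.05, 0.25, 0.60):
    print(p, label_from_probability(p))
# -> 0.05 noise
# -> 0.25 voice
# -> 0.6 voice
```

Setting the threshold well below 0.5 biases the decision toward keeping audio: an input with a voice probability of 0.25 would be rejected at 0.5 but is kept at 0.2, which favors voice recall at the cost of letting some noise through.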
The above description is only a preferred embodiment of the present invention and is not intended to limit the present invention, and various modifications and changes may be made by those skilled in the art. Any modification, equivalent replacement, or improvement made within the spirit and principle of the present invention should be included in the protection scope of the present invention.

Claims (9)

1. A noise recognition method in an intelligent calling system based on machine learning, comprising:
step 1, taking the sampled telephone signals as training data, and establishing a noise classification model based on machine learning:
step 101, slicing the telephone signal, and performing normalization and framing preprocessing on the sliced signal;
step 102, extracting MFCC features from the framed slice signals, and averaging the extracted MFCC features;
step 103, inputting the averaged MFCC features into a machine learning classifier for model training, and taking the trained classification model as the noise classification model;
step 2, inputting a newly added telephone signal into the established noise classification model to obtain a noise identification result:
step 201, slicing the newly added telephone signal;
step 202, performing normalization and framing preprocessing on the slice signals;
step 203, pre-screening the framed slice signals by spectral flatness;
step 204, after the spectral flatness pre-screening, dividing the framed signals into an odd number of segments, extracting the MFCC features of each segment, and averaging them;
step 205, inputting the averaged MFCC features of each segment into the noise classification model for identification, and identifying the noise in the slice signal, wherein the mode of the identification results of the segments of the slice signal is taken; if noise accounts for the majority of the identification results, the input slice signal is determined to be noise; otherwise, the input slice signal is determined to be human voice.
2. The noise recognition method in the intelligent calling system based on machine learning of claim 1, wherein during preprocessing, normalization is performed according to formula (1): the slice signals are uniformly quantized with 16 bits, with values ranging between -65535 and 65535, and are normalized to between -1 and 1 by dividing by the maximum absolute value of the signal;
$$\hat{x} = \frac{x}{\max(|x|)} \qquad (1)$$
wherein $x$ is the slice signal to be processed, $|x|$ is the absolute value of the slice signal, and $\hat{x}$ is the normalized slice signal.
3. The method of claim 1, wherein when the slice signal is framed, the frame length is 30 ms and the frame shift is 10 ms.
4. The method of claim 1, wherein step 203 specifically comprises: extracting the spectral flatness of each frame of the slice signal, and averaging the extracted values to obtain the average flatness; setting a flatness threshold for the average flatness; if the average flatness of the slice signal is higher than the flatness threshold, judging the slice signal to be noise and discarding it directly; and if the average flatness of the slice signal is lower than the flatness threshold, passing the slice signal on to the next processing step.
5. The method of claim 4, wherein the flatness threshold is 0.13.
6. The noise recognition method in the intelligent calling system based on machine learning of claim 1, wherein when the extracted MFCC features are averaged, each dimension is averaged over all frames according to formula (2);
$$y_m = \frac{1}{N} \sum_{n=1}^{N} c_{n,m}, \quad m = 1, \dots, M \qquad (2)$$
in the formula, $y_m$ is the average value of the MFCC features in dimension $m$, $c_{n,m}$ is the $m$-th MFCC coefficient of frame $n$, $M$ is the dimension of the MFCC features, and $N$ is the number of frames of the slice signal after framing.
7. The noise recognition method in the intelligent calling system based on machine learning of claim 1, wherein in step S1 the duration of each segment is 0.5 s and the segment shift is 0.25 s.
8. The method as claimed in claim 4, wherein the slice signal is classified as either a human voice signal or a noise signal, a human voice threshold is set to 0.2, and in step 205, when the probability output by the classification model for the slice signal to be identified is greater than the threshold, the slice signal is determined to be a human voice signal.
9. The method of claim 1, wherein the machine learning classifier is one of a random forest classifier, an SVM classifier, and an XGBoost classifier.
CN201911077584.1A 2019-11-06 2019-11-06 Noise identification method in intelligent calling system based on machine learning Active CN110933235B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201911077584.1A CN110933235B (en) 2019-11-06 2019-11-06 Noise identification method in intelligent calling system based on machine learning

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN201911077584.1A CN110933235B (en) 2019-11-06 2019-11-06 Noise identification method in intelligent calling system based on machine learning

Publications (2)

Publication Number Publication Date
CN110933235A (en) 2020-03-27
CN110933235B (en) 2021-07-27

Family

ID=69853364

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201911077584.1A Active CN110933235B (en) 2019-11-06 2019-11-06 Noise identification method in intelligent calling system based on machine learning

Country Status (1)

Country Link
CN (1) CN110933235B (en)

Families Citing this family (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN111028852A (en) * 2019-11-06 2020-04-17 杭州哲信信息技术有限公司 Noise removing method in intelligent calling system based on CNN

Citations (16)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
WO1992012583A1 (en) * 1991-01-04 1992-07-23 Picturetel Corporation Adaptive acoustic echo canceller
US5684921A (en) * 1995-07-13 1997-11-04 U S West Technologies, Inc. Method and system for identifying a corrupted speech message signal
CN104867499A (en) * 2014-12-26 2015-08-26 深圳市微纳集成电路与系统应用研究院 Frequency-band-divided wiener filtering and de-noising method used for hearing aid and system thereof
CN106024002A (en) * 2015-02-11 2016-10-12 恩智浦有限公司 Time zero convergence single microphone noise reduction
CN106548786A (en) * 2015-09-18 2017-03-29 广州酷狗计算机科技有限公司 A kind of detection method and system of voice data
CN107068161A (en) * 2017-04-14 2017-08-18 百度在线网络技术(北京)有限公司 Voice de-noising method, device and computer equipment based on artificial intelligence
CN107609488A (en) * 2017-08-21 2018-01-19 哈尔滨工程大学 A kind of ship noise method for identifying and classifying based on depth convolutional network
CN108172234A (en) * 2017-12-25 2018-06-15 天津天地伟业电子工业制造有限公司 A kind of audio-frequency noise detection method based on SVM
CN109256150A (en) * 2018-10-12 2019-01-22 北京创景咨询有限公司 Speech emotion recognition system and method based on machine learning
CN109448746A (en) * 2018-09-28 2019-03-08 百度在线网络技术(北京)有限公司 Voice de-noising method and device
CN109616135A (en) * 2018-11-14 2019-04-12 腾讯音乐娱乐科技(深圳)有限公司 Audio-frequency processing method, device and storage medium
CN109616142A (en) * 2013-03-26 2019-04-12 杜比实验室特许公司 Device and method for audio classification and processing
CN109671425A (en) * 2018-12-29 2019-04-23 广州酷狗计算机科技有限公司 Audio frequency classification method, device and storage medium
CN109767785A (en) * 2019-03-06 2019-05-17 河北工业大学 Ambient noise method for identifying and classifying based on convolutional neural networks
CN110164472A (en) * 2019-04-19 2019-08-23 天津大学 Noise classification method based on convolutional neural networks
CN110290280A (en) * 2019-05-28 2019-09-27 同盾控股有限公司 A kind of recognition methods of the SOT state of termination, device and storage medium

Family Cites Families (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN109074822B (en) * 2017-10-24 2023-04-21 深圳和而泰智能控制股份有限公司 Specific voice recognition method, apparatus and storage medium
CN109215665A (en) * 2018-07-20 2019-01-15 广东工业大学 A kind of method for recognizing sound-groove based on 3D convolutional neural networks

Also Published As

Publication number Publication date
CN110933235A (en) 2020-03-27

Similar Documents

Publication Publication Date Title
CN110379412B (en) Voice processing method and device, electronic equipment and computer readable storage medium
EP3866165B1 (en) Method for enhancing telephone speech signals based on convolutional neural networks
EP3038106B1 (en) Audio signal enhancement
GB2374265A (en) Speech processing device and speech processing method
CN101354889A (en) Method and apparatus for tonal modification of voice
CN110277087B (en) Pre-judging preprocessing method for broadcast signals
EP3413310B1 (en) Acoustic meaningful signal detection in wind noise
Nuthakki et al. Single channel speech enhancement using a new binary mask in power spectral domain
CN110933235B (en) Noise identification method in intelligent calling system based on machine learning
KR101295727B1 (en) Apparatus and method for adaptive noise estimation
CN111028852A (en) Noise removing method in intelligent calling system based on CNN
CN116312561A (en) Method, system and device for voice print recognition, authentication, noise reduction and voice enhancement of personnel in power dispatching system
CN114333912B (en) Voice activation detection method, device, electronic equipment and storage medium
CN113593599A (en) Method for removing noise signal in voice signal
CN113948088A (en) Voice recognition method and device based on waveform simulation
CN113593590A (en) Method for suppressing transient noise in voice
EP2063420A1 (en) Method and assembly to enhance the intelligibility of speech
Sanam et al. A combination of semisoft and μ-law thresholding functions for enhancing noisy speech in wavelet packet domain
CN110070874A (en) A kind of voice de-noising method and device for Application on Voiceprint Recognition
Liu et al. Speech enhancement based on the integration of fully convolutional network, temporal lowpass filtering and spectrogram masking
Premananda et al. Selective frequency enhancement of speech signal for intelligibility improvement in presence of near-end noise
Lu et al. Reduction of residual noise using directional median filter
Gbadamosi et al. Development of non-parametric noise reduction algorithm for GSM voice signal
Zehtabian et al. Optimized singular vector denoising approach for speech enhancement
CN114664310B (en) Silent attack classification promotion method based on attention enhancement filtering

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant