CN110933235A - Noise removing method in intelligent calling system based on machine learning - Google Patents

Info

Publication number
CN110933235A
Authority
CN
China
Prior art keywords
noise
signal
signals
slice
flatness
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Granted
Application number
CN201911077584.1A
Other languages
Chinese (zh)
Other versions
CN110933235B (en)
Inventor
伍林
尹朝阳
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Hangzhou Zhexin Information Technology Co Ltd
Original Assignee
Hangzhou Zhexin Information Technology Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Hangzhou Zhexin Information Technology Co Ltd
Priority to CN201911077584.1A
Publication of CN110933235A
Application granted
Publication of CN110933235B
Active
Anticipated expiration

Classifications

    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04M TELEPHONIC COMMUNICATION
    • H04M3/00 Automatic or semi-automatic exchanges
    • H04M3/18 Automatic or semi-automatic exchanges with means for reducing interference or noise; with means for reducing effects due to line faults with means for protecting lines
    • G PHYSICS
    • G10 MUSICAL INSTRUMENTS; ACOUSTICS
    • G10L SPEECH ANALYSIS TECHNIQUES OR SPEECH SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING TECHNIQUES; SPEECH OR AUDIO CODING OR DECODING
    • G10L21/00 Speech or voice signal processing techniques to produce another audible or non-audible signal, e.g. visual or tactile, in order to modify its quality or its intelligibility
    • G10L21/02 Speech enhancement, e.g. noise reduction or echo cancellation
    • G10L21/0208 Noise filtering
    • G PHYSICS
    • G10 MUSICAL INSTRUMENTS; ACOUSTICS
    • G10L SPEECH ANALYSIS TECHNIQUES OR SPEECH SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING TECHNIQUES; SPEECH OR AUDIO CODING OR DECODING
    • G10L21/00 Speech or voice signal processing techniques to produce another audible or non-audible signal, e.g. visual or tactile, in order to modify its quality or its intelligibility
    • G10L21/02 Speech enhancement, e.g. noise reduction or echo cancellation
    • G10L21/0208 Noise filtering
    • G10L21/0216 Noise filtering characterised by the method used for estimating noise
    • G10L21/0232 Processing in the frequency domain
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04M TELEPHONIC COMMUNICATION
    • H04M3/00 Automatic or semi-automatic exchanges
    • H04M3/42 Systems providing special services or facilities to subscribers
    • H04M3/50 Centralised arrangements for answering calls; Centralised arrangements for recording messages for absent or busy subscribers; Centralised arrangements for recording messages

Landscapes

  • Engineering & Computer Science (AREA)
  • Signal Processing (AREA)
  • Computational Linguistics (AREA)
  • Quality & Reliability (AREA)
  • Health & Medical Sciences (AREA)
  • Audiology, Speech & Language Pathology (AREA)
  • Human Computer Interaction (AREA)
  • Physics & Mathematics (AREA)
  • Acoustics & Sound (AREA)
  • Multimedia (AREA)
  • Telephonic Communication Services (AREA)

Abstract

The invention discloses a machine-learning-based noise removal method for an intelligent calling system, comprising the following steps: slicing the telephone signal, then normalizing and framing the slices as preprocessing; extracting MFCC features from the framed slice signals and averaging them; inputting the averaged MFCC features into a machine learning classifier for model training, and taking the trained classification model as the noise classification model; slicing a newly added telephone signal; normalizing and framing the slice signals; performing a preliminary screening of the framed slice signals based on spectral flatness; extracting MFCC features and averaging them; and inputting the averaged MFCC features of each signal segment into the noise classification model for identification. The beneficial effect of the invention is that, by using a machine-learning classification model to identify each signal as human voice or noise, a large number of noise signals can be removed from the telephone signal, thereby reducing the error rate when the signal is sent to ASR for transcription into text.

Description

Noise removing method in intelligent calling system based on machine learning
Technical Field
The invention relates to the technical field of audio processing, in particular to a noise removal method in an intelligent calling system based on machine learning.
Background
In existing intelligent calling systems, the telephone signal is segmented by voice activity detection (VAD) and sent to automatic speech recognition (ASR) for conversion into text. Because the acoustic background is complex, many of the resulting segments are noise. The usual approach is to filter the signal with a noise suppression method before segmentation, estimating the noise mainly from the frequency distribution of the signal; commonly used algorithms include adaptive filtering, spectral subtraction, and Wiener filtering. An adaptive filter uses the filter parameters obtained at the previous moment to automatically adjust the current filter parameters, adapting to the randomly varying statistical characteristics of the signal and noise and thereby filtering out noise. Spectral subtraction removes the noise spectrum in the frequency domain and then restores the frequency-domain signal to a time-domain signal by inverse Fourier transform. Wiener filtering removes noise by designing a digital filter. These noise suppression methods can filter out only part of the noise and cannot completely remove the intercepted noise segments; moreover, as the signal-to-noise ratio of the telephone signal decreases, the noise reduction effect worsens, and audio distortion caused by excessive attenuation occurs in some time intervals.
Disclosure of Invention
In order to solve the above problems, an object of the present invention is to provide a noise removing method in an intelligent calling system based on machine learning, which uses a machine-learning classification model to identify whether a signal is human voice or noise and can thereby remove a large number of noise signals from the telephone signal, reducing the error rate when the signal is sent to ASR for transcription into text.
The invention provides a noise removing method in an intelligent calling system based on machine learning, which comprises the following steps:
step 1, taking the sampled telephone signals as training data, and establishing a noise classification model based on machine learning:
step 101, slicing the telephone signal, and performing normalization and framing preprocessing on the slice signals;
step 102, extracting MFCC features from the framed slice signals, and averaging the extracted MFCC features;
step 103, inputting the averaged MFCC features into a machine learning classifier for model training, and taking the trained classification model as the noise classification model;
step 2, inputting the newly added telephone signal into the established noise classification model to obtain a noise identification result:
step 201, slicing the newly added telephone signal;
step 202, performing normalization and framing preprocessing on the slice signals;
step 203, performing a preliminary screening of the framed slice signals based on spectral flatness;
step 204, after the preliminary spectral-flatness screening, dividing the framed signal into an odd number of segments, extracting the MFCC features of each segment, and averaging them;
step 205, inputting the averaged MFCC features of each segment into the noise classification model for identification, thereby identifying the noise in the slice signal.
As a further improvement of the invention, during preprocessing, normalization is performed using formula (1). The slice signals are uniformly quantized with 16 bits, with a value range between -65535 and 65535, and are normalized to between -1 and 1 by dividing by the maximum absolute value of the signal:
$$\hat{x} = \frac{x}{\max(|x|)} \qquad (1)$$
where $x$ is the slice signal to be processed, $|x|$ is the absolute value of the slice signal, and $\hat{x}$ is the normalized slice signal.
As a further improvement of the invention, when the slice signal is framed, the frame length is 30 ms and the frame shift is 10 ms.
As a further improvement of the present invention, step 203 specifically includes: extracting the spectral flatness feature of each frame of the slice signal, and averaging the extracted spectral flatness features to obtain the average flatness; setting a flatness threshold for the average flatness; if the average flatness of the slice signal is higher than the flatness threshold, judging the slice signal to be noise and discarding it directly; and if the average flatness of the slice signal is lower than the flatness threshold, passing the slice signal on to the next processing step.
As a further improvement of the present invention, the flatness threshold is 0.13.
As a further improvement of the present invention, when averaging the extracted MFCC features, each dimension is averaged over all frames according to formula (2):
$$y_m = \frac{1}{N}\sum_{n=1}^{N} c_{n,m}, \qquad m = 1, \dots, M \qquad (2)$$
where $y_m$ is the average value of the MFCC features in dimension $m$, $c_{n,m}$ denotes the $m$-th MFCC feature of the $n$-th frame, $M$ is the dimension of the MFCC features, and $N$ is the number of frames of the slice signal after framing.
As a further improvement of the present invention, in step 205, the mode of the recognition results of the segments of each slice signal is taken; if noise accounts for the majority of the results, the input slice signal is determined to be noise; otherwise, the input slice signal is determined to be human voice.
As a further improvement of the present invention, in step S1, the time length of each segment is 0.5 s, and the segment shift is 0.25 s.
As a further improvement of the present invention, the slice signal is divided into a human voice signal and a noise signal, a threshold of the human voice signal is set to be 0.2, and in step 205, when the probability of the slice signal to be identified passing through the classification model is greater than the threshold, the slice signal is determined to be the human voice signal.
As a further improvement of the invention, the machine learning classifier is one of a random forest classifier, an SVM classifier and an XGBoost classifier.
The invention has the beneficial effects that:
1. by modeling a large number of telephone signals in the intelligent outbound-calling system and identifying noise, the noise removal method of the invention removes a large number of noise signals from the telephone signal, thereby reducing the error rate when the signal is sent to ASR for transcription into text;
2. in the noise identification process, the noise removal method first screens out obvious noise signals based on spectral flatness, which reduces the workload of subsequent identification;
3. in the noise identification process, the noise removal method tests the signal in an odd number of segments and takes the mode of the identification results, which effectively improves the identification accuracy for slice signals and prevents human voice from being deleted by mistake.
Drawings
Fig. 1 is a flowchart illustrating a noise removing method in an intelligent calling system based on machine learning according to an embodiment of the present invention.
Detailed Description
The present invention will be described in further detail below with reference to specific embodiments and with reference to the attached drawings.
As shown in fig. 1, a method for removing noise in an intelligent calling system based on machine learning according to an embodiment of the present invention includes:
step 1, taking the sampled telephone signals as training data, and establishing a noise classification model based on machine learning. The step 1 specifically comprises:
step 101, slicing the telephone signal, namely VAD slicing, and performing normalization and framing preprocessing on the sliced signal.
Because slice signals differ in volume, with some louder and some quieter, normalizing the telephone signal helps improve the recognition rate. During preprocessing, normalization is performed using formula (1). The slice signals are uniformly quantized with 16 bits, with a value range between -65535 and 65535, and are normalized to between -1 and 1 by dividing by the maximum absolute value of the signal:
$$\hat{x} = \frac{x}{\max(|x|)} \qquad (1)$$
where $x$ is the slice signal to be processed, $|x|$ is the absolute value of the slice signal, and $\hat{x}$ is the normalized slice signal.
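The peak normalization described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function name `normalize_slice` and the sample values are hypothetical, and the handling of an all-zero slice is a choice made here, not something the text specifies. Because the division uses the peak actually present in the slice, the result lies in [-1, 1] regardless of the quantization range.

```python
def normalize_slice(samples):
    """Scale a slice into [-1, 1] by dividing by its peak absolute value."""
    peak = max((abs(s) for s in samples), default=0)
    if peak == 0:
        # Assumption: an all-zero (or empty) slice is returned as zeros,
        # since dividing by a zero peak is undefined.
        return [0.0 for _ in samples]
    return [s / peak for s in samples]


# Hypothetical integer PCM samples, used only for illustration.
pcm = [1200, -16000, 8000, 0]
normalized = normalize_slice(pcm)
```

After normalization the loudest sample has magnitude 1, so slices recorded at different volumes become comparable before feature extraction.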
After the slice signal is normalized, it must be divided into frames: the frequency content of the slice signal changes over time, and this variation would be lost if the slice were analyzed as a whole, whereas each short frame can be treated as a stationary signal from which frequency-domain features are extracted. When the slice signal is framed, the frame length is 30 ms and the frame shift is 10 ms.
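As a rough sketch of the framing step, the following Python function cuts a slice into overlapping frames of 30 ms with a 10 ms shift. The 8 kHz sample rate is an assumption (typical for narrowband telephony, not stated in the text), as are the function name `frame_signal` and the decision to drop a trailing partial frame.

```python
def frame_signal(samples, sample_rate=8000, frame_ms=30, shift_ms=10):
    """Split samples into overlapping frames (frame length 30 ms, shift 10 ms)."""
    frame_len = sample_rate * frame_ms // 1000  # 240 samples at the assumed 8 kHz
    shift = sample_rate * shift_ms // 1000      # 80 samples at the assumed 8 kHz
    frames = []
    start = 0
    # Assumption: a trailing partial frame is dropped rather than zero-padded.
    while start + frame_len <= len(samples):
        frames.append(samples[start:start + frame_len])
        start += shift
    return frames


# A hypothetical 0.5 s slice of silence at 8 kHz (4000 samples).
slice_samples = [0.0] * 4000
frames = frame_signal(slice_samples)
```

With these settings consecutive frames overlap by 20 ms, so a 0.5 s slice yields 48 frames of 240 samples each.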
Step 102, extracting MFCC features from the framed slice signals, and averaging the extracted MFCC features.
Since the MFCC features are relatively consistent with the auditory characteristics of human ears, and can be used as representative features of a machine learning classifier, MFCC features need to be extracted from the preprocessed slice signals. However, since the slice signals have different lengths and therefore yield different numbers of frames, the extracted MFCC features also need to be averaged. When averaging the extracted MFCC features, each dimension is averaged over all frames according to formula (2):
$$y_m = \frac{1}{N}\sum_{n=1}^{N} c_{n,m}, \qquad m = 1, \dots, M \qquad (2)$$
where $y_m$ is the average value of the MFCC features in dimension $m$, $c_{n,m}$ denotes the $m$-th MFCC feature of the $n$-th frame, $M$ is the dimension of the MFCC features, and $N$ is the number of frames of the slice signal after framing.
In the present invention, M is 39.
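The per-dimension averaging over frames can be illustrated as follows. The sketch assumes the MFCC features have already been extracted by some front end and are available as a list of N frames, each a list of M values; the function name `average_features` and the tiny 3-dimensional example (instead of M = 39) are purely illustrative.

```python
def average_features(frame_features):
    """Average an N x M feature matrix over its N frames, giving M values."""
    if not frame_features:
        raise ValueError("at least one frame is required")
    n_frames = len(frame_features)
    n_dims = len(frame_features[0])
    return [
        sum(frame[m] for frame in frame_features) / n_frames
        for m in range(n_dims)
    ]


# Two hypothetical frames with three feature dimensions each.
features = [[1.0, 2.0, 3.0], [3.0, 4.0, 5.0]]
avg = average_features(features)
```

Whatever the number of frames, the output always has M entries, which gives the classifier a fixed-length input for slices of different durations.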
Of course, besides extracting MFCC features, other acoustic features, such as short-time energy, zero-crossing rate, pitch, etc., may also be extracted, or a series of features may be combined for use by the classification model.
Step 103, inputting the averaged MFCC features into a machine learning classifier for model training, and taking the trained classification model as the noise classification model.
The machine learning classifier is one of a random forest classifier, an SVM classifier and an XGBoost classifier. Of course, the present invention is not limited to the above examples, and other classifiers can also be applied to the present invention.
Step 2, inputting the newly added telephone signal into the established noise classification model to obtain a noise identification result. The step 2 specifically comprises:
step 201, slicing the newly added telephone signal.
Step 202, performing normalization and framing preprocessing on the slice signals.
Because slice signals differ in volume, with some louder and some quieter, normalizing the telephone signal helps improve the recognition rate. During preprocessing, normalization is performed using formula (1). The slice signals are uniformly quantized with 16 bits, with a value range between -65535 and 65535, and are normalized to between -1 and 1 by dividing by the maximum absolute value of the signal:
$$\hat{x} = \frac{x}{\max(|x|)} \qquad (1)$$
where $x$ is the slice signal to be processed, $|x|$ is the absolute value of the slice signal, and $\hat{x}$ is the normalized slice signal.
After the slice signal is normalized, it must be divided into frames: the frequency content of the slice signal changes over time, and this variation would be lost if the slice were analyzed as a whole, whereas each short frame can be treated as a stationary signal from which frequency-domain features are extracted. When the slice signal is framed, the frame length is 30 ms and the frame shift is 10 ms.
Step 203, performing a preliminary screening of the framed slice signals based on spectral flatness.
After framing, spectral flatness can be used to distinguish human voice from noise, because the speech spectrum tends to have peaks at the fundamental frequency and its harmonics, whereas the noise spectrum is relatively flat. Step 203 specifically includes: extracting the spectral flatness feature of each frame of the slice signal, and averaging the extracted features to obtain the average flatness; setting a threshold for the average flatness; if the average flatness of the slice signal is higher than the threshold, judging the slice signal to be noise and discarding it directly; and if the average flatness of the slice signal is lower than the threshold, passing the preprocessed slice signal on to the next processing step.
The flatness threshold value of the present invention is set to 0.13.
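The flatness screening can be sketched as below. The text does not give a formula for spectral flatness, so this sketch assumes the conventional definition, the ratio of the geometric mean to the arithmetic mean of the power spectrum, which is close to 1 for a flat spectrum and close to 0 for a peaky one. The naive DFT, the small constant `eps`, and all function names are illustrative choices; a real system would use an FFT library.

```python
import cmath
import math


def power_spectrum(frame):
    """One-sided power spectrum via a naive DFT (adequate for short frames)."""
    n = len(frame)
    spectrum = []
    for k in range(n // 2 + 1):
        acc = sum(frame[t] * cmath.exp(-2j * cmath.pi * k * t / n) for t in range(n))
        spectrum.append(abs(acc) ** 2)
    return spectrum


def spectral_flatness(frame, eps=1e-12):
    """Geometric mean over arithmetic mean of the power spectrum (assumed definition)."""
    power = [p + eps for p in power_spectrum(frame)]  # eps avoids log(0)
    geometric_mean = math.exp(sum(math.log(p) for p in power) / len(power))
    arithmetic_mean = sum(power) / len(power)
    return geometric_mean / arithmetic_mean


def is_noise_by_flatness(frames, threshold=0.13):
    """Average the per-frame flatness and compare it with the threshold."""
    mean_flatness = sum(spectral_flatness(f) for f in frames) / len(frames)
    return mean_flatness > threshold


# Synthetic test frames: an impulse has a perfectly flat spectrum,
# while a pure tone concentrates its energy in a single bin.
impulse = [1.0] + [0.0] * 63
tone = [math.sin(2 * math.pi * 4 * t / 64) for t in range(64)]
```

In this sketch `is_noise_by_flatness([impulse])` flags the slice as noise, while `is_noise_by_flatness([tone])` lets it through to the classifier, mirroring the discard-or-continue decision of step 203.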
Step 204, after the preliminary spectral-flatness screening, dividing the framed signal into an odd number of segments, extracting the MFCC features of each segment, and averaging them.
The invention divides the longer slice signal into odd segments, each segment has a time length of 0.5s, and the segment shift is 0.25 s.
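One possible way to cut a slice into an odd number of overlapping segments is sketched below. The text does not say how an odd count is guaranteed, so dropping the last segment when the count comes out even is an assumption of this sketch, as are the 8 kHz sample rate, the treatment of slices shorter than one segment, and the function name `split_into_odd_segments`.

```python
def split_into_odd_segments(samples, sample_rate=8000, seg_s=0.5, shift_s=0.25):
    """Cut a slice into 0.5 s segments with a 0.25 s shift, keeping an odd count."""
    seg_len = int(seg_s * sample_rate)   # 4000 samples at the assumed 8 kHz
    shift = int(shift_s * sample_rate)   # 2000 samples at the assumed 8 kHz
    if len(samples) <= seg_len:
        # Assumption: a short slice is treated as a single segment.
        return [samples]
    segments = [
        samples[start:start + seg_len]
        for start in range(0, len(samples) - seg_len + 1, shift)
    ]
    if len(segments) % 2 == 0:
        # Assumption: drop the last segment so that a majority vote cannot tie.
        segments.pop()
    return segments


# Hypothetical slices of 1.5 s and 1.25 s at 8 kHz.
five_segments = split_into_odd_segments([0.0] * 12000)
three_segments = split_into_odd_segments([0.0] * 10000)
```

An odd segment count matters because the per-segment results are later combined by taking their mode, and with two classes an odd number of votes always produces a clear majority.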
Step 205, inputting the averaged MFCC features of each segment of the slice signal into the noise classification model for identification, thereby identifying the noise in the slice signal.
The mode of the identification results of the segments of the slice signal is taken: if noise accounts for the majority of the results, the input slice signal is determined to be noise; otherwise, it is determined to be human voice. Because a slice signal may contain both human voice and noise, the processing of step 205 can effectively improve the accuracy of signal identification.
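Taking the mode of the per-segment results amounts to a majority vote, which can be sketched as follows. The label strings `"noise"` and `"voice"` and the function name `vote` are hypothetical; the per-segment labels are assumed to come from the trained noise classification model.

```python
from collections import Counter


def vote(segment_labels):
    """Return the most frequent label among the per-segment results."""
    if not segment_labels:
        raise ValueError("at least one segment label is required")
    counts = Counter(segment_labels)
    return counts.most_common(1)[0][0]


# Hypothetical per-segment results for one slice split into three segments.
decision = vote(["noise", "voice", "noise"])
```

With an odd number of segments and two classes the vote cannot tie, so a slice in which one segment is misclassified is still labeled correctly as long as the majority of segments are right.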
Further, since slice signals are classified as either human voice or noise, the human voice threshold is set to 0.2: in step 205, when the probability obtained by passing the slice signal to be identified through the noise classification model is greater than the threshold, the slice signal is determined to be human voice; otherwise, it is discarded as noise. This can raise the human voice recall rate to 99% and avoids deleting human voice by mistake.
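The thresholded decision can be sketched as below, under the assumption that the classifier outputs a probability for the human voice class; the text does not spell out which class the probability refers to, so `p_voice` and the function name `classify_by_probability` are hypothetical.

```python
def classify_by_probability(p_voice, threshold=0.2):
    """Label a slice as voice when its (assumed) voice probability exceeds the threshold."""
    return "voice" if p_voice > threshold else "noise"


# Hypothetical classifier outputs for two slices.
kept = classify_by_probability(0.25)
dropped = classify_by_probability(0.10)
```

Setting the threshold well below 0.5 biases the decision toward keeping a slice: some noise may slip through to ASR, but far fewer voice slices are discarded, which is consistent with the emphasis on voice recall.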
The above description is only a preferred embodiment of the present invention and is not intended to limit the present invention, and various modifications and changes may be made by those skilled in the art. Any modification, equivalent replacement, or improvement made within the spirit and principle of the present invention should be included in the protection scope of the present invention.

Claims (10)

1. A method for removing noise in an intelligent calling system based on machine learning, comprising:
step 1, taking the sampled telephone signals as training data, and establishing a noise classification model based on machine learning:
step 101, slicing the telephone signal, and performing normalization and framing preprocessing on the slice signals;
step 102, extracting MFCC features from the framed slice signals, and averaging the extracted MFCC features;
step 103, inputting the averaged MFCC features into a machine learning classifier for model training, and taking the trained classification model as the noise classification model;
step 2, inputting the newly added telephone signal into the established noise classification model to obtain a noise identification result:
step 201, slicing the newly added telephone signal;
step 202, performing normalization and framing preprocessing on the slice signals;
step 203, performing a preliminary screening of the framed slice signals based on spectral flatness;
step 204, after the preliminary spectral-flatness screening, dividing the framed signal into an odd number of segments, extracting the MFCC features of each segment, and averaging them;
step 205, inputting the averaged MFCC features of each segment into the noise classification model for identification, thereby identifying the noise in the slice signal.
2. The noise removing method in the intelligent calling system based on machine learning of claim 1, wherein in the preprocessing, normalization is performed using formula (1), the slice signals are uniformly quantized with 16 bits, with a value range between -65535 and 65535, and are normalized to between -1 and 1 by dividing by the maximum absolute value of the signal:
$$\hat{x} = \frac{x}{\max(|x|)} \qquad (1)$$
where $x$ is the slice signal to be processed, $|x|$ is the absolute value of the slice signal, and $\hat{x}$ is the normalized slice signal.
3. The noise removing method in a machine learning-based smart calling system according to claim 1, wherein the frame length of the slice signal is 30ms and the frame shift is 10ms in the framing process.
4. The method for removing noise in an intelligent calling system based on machine learning according to claim 1, wherein step 203 specifically comprises: extracting the spectral flatness characteristics of each frame of slice signals, and averaging the extracted spectral flatness characteristics, namely, the average flatness; setting a flatness threshold value of the average flatness, if the average flatness of the slice signal is higher than the flatness threshold value, judging the slice signal to be noise, and directly discarding the noise; and if the average flatness of the slice signal is lower than the flatness threshold value, carrying out the next processing on the slice signal.
5. The method of claim 4, wherein the flatness threshold is 0.13.
6. The noise removing method in a machine learning-based smart call system as claimed in claim 1, wherein when averaging the extracted MFCC features, each dimension is averaged over all frames according to formula (2):
$$y_m = \frac{1}{N}\sum_{n=1}^{N} c_{n,m}, \qquad m = 1, \dots, M \qquad (2)$$
where $y_m$ is the average value of the MFCC features in dimension $m$, $c_{n,m}$ denotes the $m$-th MFCC feature of the $n$-th frame, $M$ is the dimension of the MFCC features, and $N$ is the number of frames of the slice signal after framing.
7. The method of claim 1, wherein in step 205, the mode of the recognition results of the segments of each slice signal is taken, and if noise accounts for the majority of the results, the input slice signal is determined to be noise; otherwise, the input slice signal is determined to be human voice.
8. The noise removing method in a machine learning-based smart calling system according to claim 7, wherein in step S1, the duration of each segment is 0.5 s, and the segment shift is 0.25 s.
9. The method as claimed in claim 4, wherein the slice signal is divided into a vocal signal and a noise signal, a threshold of the vocal signal is set to 0.2, and in step 205, when the probability of the slice signal to be identified passing through the classification model is greater than the threshold, the slice signal is determined to be the vocal signal.
10. The method of claim 1, wherein the machine learning classifier is one of a random forest classifier, an SVM classifier, and an XGBoost classifier.
CN201911077584.1A 2019-11-06 2019-11-06 Noise identification method in intelligent calling system based on machine learning Active CN110933235B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201911077584.1A CN110933235B (en) 2019-11-06 2019-11-06 Noise identification method in intelligent calling system based on machine learning

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN201911077584.1A CN110933235B (en) 2019-11-06 2019-11-06 Noise identification method in intelligent calling system based on machine learning

Publications (2)

Publication Number Publication Date
CN110933235A (en) 2020-03-27
CN110933235B (en) 2021-07-27

Family

ID=69853364

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201911077584.1A Active CN110933235B (en) 2019-11-06 2019-11-06 Noise identification method in intelligent calling system based on machine learning

Country Status (1)

Country Link
CN (1) CN110933235B (en)

Cited By (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN111028852A (en) * 2019-11-06 2020-04-17 杭州哲信信息技术有限公司 Noise removing method in intelligent calling system based on CNN

Citations (18)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
WO1992012583A1 (en) * 1991-01-04 1992-07-23 Picturetel Corporation Adaptive acoustic echo canceller
US5684921A (en) * 1995-07-13 1997-11-04 U S West Technologies, Inc. Method and system for identifying a corrupted speech message signal
CN104867499A (en) * 2014-12-26 2015-08-26 深圳市微纳集成电路与系统应用研究院 Frequency-band-divided wiener filtering and de-noising method used for hearing aid and system thereof
CN106024002A (en) * 2015-02-11 2016-10-12 恩智浦有限公司 Time zero convergence single microphone noise reduction
CN106548786A (en) * 2015-09-18 2017-03-29 广州酷狗计算机科技有限公司 A kind of detection method and system of voice data
CN107068161A (en) * 2017-04-14 2017-08-18 百度在线网络技术(北京)有限公司 Voice de-noising method, device and computer equipment based on artificial intelligence
CN107609488A (en) * 2017-08-21 2018-01-19 哈尔滨工程大学 A kind of ship noise method for identifying and classifying based on depth convolutional network
CN108172234A (en) * 2017-12-25 2018-06-15 天津天地伟业电子工业制造有限公司 A kind of audio-frequency noise detection method based on SVM
CN109074822A (en) * 2017-10-24 2018-12-21 深圳和而泰智能控制股份有限公司 Specific sound recognition methods, equipment and storage medium
CN109215665A (en) * 2018-07-20 2019-01-15 广东工业大学 A kind of method for recognizing sound-groove based on 3D convolutional neural networks
CN109256150A (en) * 2018-10-12 2019-01-22 北京创景咨询有限公司 Speech emotion recognition system and method based on machine learning
CN109448746A (en) * 2018-09-28 2019-03-08 百度在线网络技术(北京)有限公司 Voice de-noising method and device
CN109616142A (en) * 2013-03-26 2019-04-12 杜比实验室特许公司 Device and method for audio classification and processing
CN109616135A (en) * 2018-11-14 2019-04-12 腾讯音乐娱乐科技(深圳)有限公司 Audio-frequency processing method, device and storage medium
CN109671425A (en) * 2018-12-29 2019-04-23 广州酷狗计算机科技有限公司 Audio frequency classification method, device and storage medium
CN109767785A (en) * 2019-03-06 2019-05-17 河北工业大学 Ambient noise method for identifying and classifying based on convolutional neural networks
CN110164472A (en) * 2019-04-19 2019-08-23 天津大学 Noise classification method based on convolutional neural networks
CN110290280A (en) * 2019-05-28 2019-09-27 同盾控股有限公司 A kind of recognition methods of the SOT state of termination, device and storage medium

Also Published As

Publication number Publication date
CN110933235B (en) 2021-07-27

Similar Documents

Publication Publication Date Title
CN108831499B (en) Speech enhancement method using speech existence probability
CN109788400B (en) Neural network howling suppression method, system and storage medium for digital hearing aid
EP3866165B1 (en) Method for enhancing telephone speech signals based on convolutional neural networks
CN110277087B (en) Pre-judging preprocessing method for broadcast signals
EP3413310B1 (en) Acoustic meaningful signal detection in wind noise
CN110933235B (en) Noise identification method in intelligent calling system based on machine learning
CN113593590A (en) Method for suppressing transient noise in voice
CN111028852A (en) Noise removing method in intelligent calling system based on CNN
CN110299133B (en) Method for judging illegal broadcast based on keyword
CN112466276A (en) Speech synthesis system training method and device and readable storage medium
CN111477246B (en) Voice processing method and device and intelligent terminal
CN116312561A (en) Method, system and device for voice print recognition, authentication, noise reduction and voice enhancement of personnel in power dispatching system
CN114333912B (en) Voice activation detection method, device, electronic equipment and storage medium
CN113593599A (en) Method for removing noise signal in voice signal
CN113948088A (en) Voice recognition method and device based on waveform simulation
CN113012710A (en) Audio noise reduction method and storage medium
EP2063420A1 (en) Method and assembly to enhance the intelligibility of speech
Mehta et al. Robust front-end and back-end processing for feature extraction for Hindi speech recognition
CN110070874A (en) A kind of voice de-noising method and device for Application on Voiceprint Recognition
Gbadamosi et al. Development of non-parametric noise reduction algorithm for GSM voice signal
Sanam et al. A combination of semisoft and μ-law thresholding functions for enhancing noisy speech in wavelet packet domain
Liu et al. Speech enhancement based on the integration of fully convolutional network, temporal lowpass filtering and spectrogram masking
Zehtabian et al. Optimized singular vector denoising approach for speech enhancement
Kumar et al. Noise Reduction Algorithm for Speech Enhancement
CN114664310B (en) Silent attack classification promotion method based on attention enhancement filtering

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant