CN111653272A - Vehicle-mounted voice enhancement algorithm based on deep belief network - Google Patents
- Publication number
- CN111653272A (application number CN202010484415.6A)
- Authority
- CN
- China
- Prior art keywords
- vehicle
- belief network
- deep belief
- layer
- signal
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Pending
Classifications
- G10L 15/063 — Speech recognition; creation of reference templates; training of speech recognition systems, e.g. adaptation to the characteristics of the speaker's voice
- G10K 11/17854 — Active noise control using interference effects; methods/devices of the filter, the filter being an adaptive filter
- G10L 15/16 — Speech classification or search using artificial neural networks
- G10L 15/20 — Speech recognition techniques specially adapted for robustness in adverse environments, e.g. in noise
- G10L 15/22 — Procedures used during a speech recognition process, e.g. man-machine dialogue
- G10L 21/0208 — Speech enhancement; noise filtering
- Y02T 10/40 — Climate change mitigation in transportation; engine management systems
Abstract
The invention provides a vehicle-mounted voice enhancement algorithm based on a deep belief network (DBN), comprising the following steps. Step 1: divide the vehicle-mounted voice signal into training sample signals and testing sample signals. Step 2: optimize the learning rate, the initial weights, and the number of hidden nodes of the DBN using the QPSO (quantum-behaved particle swarm optimization) algorithm. Step 3: replace the sigmoid function with the hyperbolic tangent (tanh) activation function to optimize the deep belief network model. Step 4: perform greedy layer-by-layer unsupervised learning on the optimized deep belief network to obtain abstract voice feature vectors of the input vehicle-mounted voice signal. Step 5: input the abstract voice signal into the least mean square error algorithm to obtain the enhanced voice signal. The invention combines the deep belief network with the traditional least mean square error algorithm for voice enhancement, exploiting both the strong learning and feature-extraction capabilities of the deep belief network and the efficiency of the traditional voice enhancement algorithm.
Description
Technical Field
The invention relates to a voice enhancement technology, in particular to a vehicle-mounted voice enhancement algorithm based on a deep belief network.
Background
In recent years, with rapid economic development and rising consumption levels, the automobile has become a primary means of transportation. Statistics show that by the first half of 2019 the number of motor vehicles nationwide reached 340 million, with 12.42 million newly registered automobiles and 14.08 million newly licensed drivers. Amid ever-increasing market competition, automotive manufacturers are constantly upgrading in-vehicle electronic devices, such as in-vehicle multimedia systems, navigation systems, hands-free systems, and control systems, to meet the diverse demands of users while driving.
Ideally, the driver issues a voice command to control the in-vehicle electronic equipment, and the vehicle-mounted voice recognition system invokes the corresponding device according to the recognized content. In a real vehicle-mounted environment, however, various background noises are present: engine noise, tire noise, wind noise, air-conditioning noise, and human noise from the passenger compartment. At present, speech recognition accuracy in a quiet environment reaches about 98%, but in a real environment, particularly a complex vehicle-mounted noise environment, the accuracy drops sharply.
Disclosure of Invention
The invention aims to provide a vehicle-mounted voice enhancement algorithm that maintains high speech recognition accuracy in real environments, particularly in complex vehicle-mounted noise environments.
The invention provides a vehicle-mounted voice enhancement algorithm based on a deep belief network, which comprises the following steps:
Step 1: dividing the vehicle-mounted voice signal into training sample signals and testing sample signals;
Step 2: optimizing the learning rate, the initial weights and the number of hidden nodes of the DBN by means of a QPSO algorithm, where QPSO denotes the quantum-behaved particle swarm optimization algorithm and DBN denotes the deep belief network;
Step 3: replacing the sigmoid function with a tanh activation function to optimize the deep belief network model;
Step 4: performing greedy layer-by-layer unsupervised learning on the optimized deep belief network to obtain abstract voice feature vectors of the input vehicle-mounted voice signal;
Step 5: inputting the abstract voice signal into a least mean square error algorithm to obtain the enhanced voice signal.
Further, step 2 includes training the DBN using restricted Boltzmann machines (RBMs). The restricted Boltzmann machine represents the current state of the system through its energy:

E(v, h | θ) = −∑_{i=1}^{n} a_i v_i − ∑_{j=1}^{m} b_j h_j − ∑_{i=1}^{n} ∑_{j=1}^{m} v_i W_ij h_j

where n is the number of nodes of the visible layer v, m is the number of nodes of the hidden layer h, a is the bias of the visible layer, b is the bias of the hidden layer, W_ij is the weight from visible-layer node i to hidden-layer node j, and θ = {W, a, b} is the set of all system parameters.

The probability distribution of the whole system is:

P(v, h | θ) = exp(−E(v, h | θ)) / Z(θ),  Z(θ) = ∑_{v,h} exp(−E(v, h | θ))

The logarithmic derivative of the marginal probability distribution is:

∂ ln P(v | θ) / ∂θ = ⟨∂(−E(v, h | θ)) / ∂θ⟩_{P(h|v,θ)} − ⟨∂(−E(v, h | θ)) / ∂θ⟩_{P(v,h|θ)}

For the training samples, the distribution P(h | v, θ) is denoted "data" and the distribution P(v, h | θ) is denoted "model", where ⟨·⟩_P denotes the mathematical expectation with respect to the distribution P. The marginal-distribution logarithmic derivative is then expressed as:

∂ ln P(v | θ) / ∂W_ij = ⟨v_i h_j⟩_data − ⟨v_i h_j⟩_model
∂ ln P(v | θ) / ∂a_i = ⟨v_i⟩_data − ⟨v_i⟩_model
∂ ln P(v | θ) / ∂b_j = ⟨h_j⟩_data − ⟨h_j⟩_model

The RBM parameters are obtained with the following update:

W_ij ← W_ij + ε(⟨v_i h_j⟩_data − ⟨v_i h_j⟩_recon)
a_i ← a_i + ε(⟨v_i⟩_data − ⟨v_i⟩_recon)
b_j ← b_j + ε(⟨h_j⟩_data − ⟨h_j⟩_recon)

where ⟨·⟩_recon is the expectation after k Gibbs sampling steps and ε is the learning rate; RBM denotes the restricted Boltzmann machine.

The learning rate, the initial weights, and the number of hidden nodes are optimized with the following formulas:

Z_b = (1/Z) ∑_{i=1}^{Z} p_i
P = μ p_i + (1 − μ) p_j
X(t + 1) = P ± α_p |Z_b − X(t)| ln(1/u)

where Z is the size of the population; μ and u are random numbers uniformly distributed on the interval [0, 1]; Z_b is the average of the best positions of all particle individuals; p_i is the individual best position of particle i and p_j is the global best position; X(t) is the position of particle i at the t-th iteration; and α_p is the contraction-expansion factor.
Further, in step 3, the sigmoid function is:

sigmoid(x) = 1 / (1 + e^(−x))

and the tanh activation function is:

tanh(x) = (e^x − e^(−x)) / (e^x + e^(−x))
the invention has the beneficial effects that: the deep belief network is combined with the traditional minimum mean square error algorithm to carry out voice enhancement, so that the strong learning capability and the feature extraction capability of the deep belief network are utilized, and the high efficiency of the traditional voice enhancement algorithm is combined. And (3) performing feature learning on the vehicle-mounted voice signal through a deep belief network, and inputting the feature learning into an LMS algorithm to obtain the optimal voice enhancement effect. Speech enhancement, as a pre-processing scheme, is an effective and necessary way to suppress noise and interference, providing convenience for subsequent speech recognition.
Drawings
FIG. 1 is a system block diagram of the present invention.
FIG. 2 is a flow chart of the QPSO algorithm.
Fig. 3 is a schematic diagram of an LMS algorithm adaptive noise canceller.
Detailed Description of the Embodiments
the following provides a more detailed description of the embodiments and the operation of the present invention with reference to the accompanying drawings.
As shown in figure 1, the invention provides a vehicle-mounted voice enhancement algorithm based on a deep belief network.
The method comprises the following specific steps:
Step 1: preprocess the acquired vehicle-mounted voice signals (mean removal and normalization), and divide the samples into training samples and testing samples.
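The preprocessing in step 1 can be sketched in Python as follows — a minimal illustration of mean removal, peak normalization, and the train/test split. The 80/20 split ratio is an assumption; the patent does not specify the proportion.

```python
def preprocess(signal):
    """Remove the DC offset (mean removal) and scale the signal into [-1, 1]."""
    mean = sum(signal) / len(signal)
    centered = [s - mean for s in signal]
    peak = max(abs(s) for s in centered) or 1.0  # guard against an all-constant signal
    return [s / peak for s in centered]

def split(samples, train_ratio=0.8):
    """Divide the samples into training and testing sets (ratio is illustrative)."""
    cut = int(len(samples) * train_ratio)
    return samples[:cut], samples[cut:]
```

In a real pipeline each utterance would be preprocessed independently before being fed to the DBN.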
Step 2: firstly, a deep belief network model is constructed, wherein the DBN model consists of a plurality of RBMs and has the function of automatically extracting high-level features. The RBM represents the current state of the system through energy, and the energy expression is as follows:
E(v, h | θ) = −∑_{i=1}^{n} a_i v_i − ∑_{j=1}^{m} b_j h_j − ∑_{i=1}^{n} ∑_{j=1}^{m} v_i W_ij h_j

where n is the number of nodes of the visible layer v, m is the number of nodes of the hidden layer h, a and b are the biases of the visible and hidden layers respectively, and W_ij is the weight from visible-layer node i to hidden-layer node j. The set of all system parameters is denoted θ = {W, a, b}.

The probability distribution of the whole system is:

P(v, h | θ) = exp(−E(v, h | θ)) / Z(θ),  Z(θ) = ∑_{v,h} exp(−E(v, h | θ))

The denominator Z(θ) is called the normalization factor and keeps the probability values in the range [0, 1]. From this equation the marginal distribution P(v | θ) is obtained and estimated by maximum likelihood; its logarithmic derivative satisfies:

∂ ln P(v | θ) / ∂θ = ⟨∂(−E(v, h | θ)) / ∂θ⟩_{P(h|v,θ)} − ⟨∂(−E(v, h | θ)) / ∂θ⟩_{P(v,h|θ)}

For the training samples, the distribution P(h | v, θ) is denoted "data" and the distribution P(v, h | θ) is denoted "model", where ⟨·⟩_P denotes the mathematical expectation with respect to the distribution P. Since θ = {W, a, b}, combining this with the derivative of the energy, the above can be expressed as:

∂ ln P(v | θ) / ∂W_ij = ⟨v_i h_j⟩_data − ⟨v_i h_j⟩_model
∂ ln P(v | θ) / ∂a_i = ⟨v_i⟩_data − ⟨v_i⟩_model
∂ ln P(v | θ) / ∂b_j = ⟨h_j⟩_data − ⟨h_j⟩_model
and (4) approximating the log likelihood probability by Gibbs sampling, and obtaining the gradient of the RBM parameter for updating. The calculation expression is as follows:
W_ij ← W_ij + ε(⟨v_i h_j⟩_data − ⟨v_i h_j⟩_recon)
a_i ← a_i + ε(⟨v_i⟩_data − ⟨v_i⟩_recon)
b_j ← b_j + ε(⟨h_j⟩_data − ⟨h_j⟩_recon)

where ⟨·⟩_recon is the expectation after k Gibbs sampling steps (contrastive divergence, CD-k) and ε is the learning rate.
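A minimal CD-1 (contrastive divergence with k = 1) update for one binary training vector can be sketched as follows; the learning rate, layer sizes, and seeding are illustrative, not the patent's settings.

```python
import math
import random

def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))

def cd1_update(v0, W, a, b, eta=0.1, seed=0):
    """One CD-1 parameter update for a single binary training vector v0."""
    rng = random.Random(seed)
    n, m = len(v0), len(b)
    # positive phase: P(h_j = 1 | v0)
    ph0 = [sigmoid(b[j] + sum(v0[i] * W[i][j] for i in range(n))) for j in range(m)]
    h0 = [1 if rng.random() < p else 0 for p in ph0]
    # one Gibbs step: reconstruct v, then recompute the hidden probabilities
    pv1 = [sigmoid(a[i] + sum(W[i][j] * h0[j] for j in range(m))) for i in range(n)]
    ph1 = [sigmoid(b[j] + sum(pv1[i] * W[i][j] for i in range(n))) for j in range(m)]
    # gradient step: eta * (<vh>_data - <vh>_recon), likewise for the biases
    for i in range(n):
        for j in range(m):
            W[i][j] += eta * (v0[i] * ph0[j] - pv1[i] * ph1[j])
        a[i] += eta * (v0[i] - pv1[i])
    for j in range(m):
        b[j] += eta * (ph0[j] - ph1[j])
    return W, a, b
```

In practice the update is averaged over mini-batches rather than applied per sample.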
Optimizing the learning rate, the initial weight and the number of hidden layers by utilizing a QPSO algorithm, wherein the calculation expression is as follows:
Z_b = (1/Z) ∑_{i=1}^{Z} p_i
P = μ p_i + (1 − μ) p_j
X(t + 1) = P ± α_p |Z_b − X(t)| ln(1/u)

where Z is the size of the population; μ and u are random numbers uniformly distributed on the interval [0, 1]; Z_b is the average of the best positions of all particle individuals; p_i and p_j are the individual best position of particle i and the global best position; X(t) is the position of particle i at the t-th iteration; and α_p is the contraction-expansion factor.
The flow chart of the QPSO algorithm is shown in fig. 2, and the process of optimizing the DBN with the QPSO algorithm can be described as follows:
1) Initialize the QPSO algorithm, including the particle positions and their search ranges, the contraction-expansion factor, and the number of iterations; the DBN learning rate, initial weights, and number of hidden nodes to be optimized are mapped to the particle positions;
2) Compute the fitness of each particle in the population to obtain each particle's individual best position and the population's global best position;
3) Compute the average of the individual best positions of all particles, then update the particle positions;
4) Repeat steps 2)-3) until the iteration stop condition is met; the output of the optimization is the set of DBN parameters.
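The iteration described in steps 1)-4) can be sketched as a generic QPSO optimizer. Everything here is illustrative: the sphere function stands in for the DBN fitness (which in the patent would require training and evaluating the network), and the population size, iteration count, and contraction-expansion factor are arbitrary choices.

```python
import math
import random

def qpso(fitness, dim, n_particles=10, iters=50, alpha=0.75, seed=1):
    """Quantum-behaved PSO: minimize `fitness` over R^dim."""
    rng = random.Random(seed)
    X = [[rng.uniform(-5, 5) for _ in range(dim)] for _ in range(n_particles)]
    pbest = [x[:] for x in X]                      # individual best positions p_i
    pcost = [fitness(x) for x in X]
    g = min(range(n_particles), key=lambda i: pcost[i])
    gbest, gcost = pbest[g][:], pcost[g]           # global best position p_j
    for _ in range(iters):
        # Z_b: mean of all individual best positions
        mbest = [sum(p[d] for p in pbest) / n_particles for d in range(dim)]
        for i in range(n_particles):
            for d in range(dim):
                mu = rng.random()
                u = 1.0 - rng.random()             # u in (0, 1] so ln(1/u) is finite
                P = mu * pbest[i][d] + (1 - mu) * gbest[d]
                delta = alpha * abs(mbest[d] - X[i][d]) * math.log(1 / u)
                X[i][d] = P + delta if rng.random() < 0.5 else P - delta
            c = fitness(X[i])
            if c < pcost[i]:
                pbest[i], pcost[i] = X[i][:], c
                if c < gcost:
                    gbest, gcost = X[i][:], c
    return gbest, gcost

sphere = lambda x: sum(v * v for v in x)  # stand-in for the DBN fitness
```

Mapping to the patent: each particle position would encode (learning rate, initial weight scale, hidden node count), and `fitness` would be the reconstruction error of a DBN trained with those hyperparameters.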
Step 3: set the basic parameters of the DBN according to the dimension of the voice signal, the size of the sample set, and the results optimized by the QPSO algorithm in step 2, including the number of hidden-layer units, the number of model layers, the number of training epochs, the batch size, the momentum and learning rate, the penalty rate, the initial bias, and the initial weights. The traditional sigmoid function,
sigmoid(x) = 1 / (1 + e^(−x))

is replaced with the tanh activation function:

tanh(x) = (e^x − e^(−x)) / (e^x + e^(−x))
the problem that the gradient of the traditional DBN activation function is easy to disappear is solved, and the convergence and the stability of the network are effectively improved.
Step 4: a high-level representation of the input vehicle-mounted voice signal is obtained by layer-by-layer feature extraction, optimizing the weights of the network connections. First, an unsupervised training mode trains the RBMs layer by layer, preserving the features of the voice signal as much as possible; then the back-propagation network fine-tunes the model to learn the desired high-level abstract voice feature signal. The whole process is as follows:
1) Parameter initialization. Set the parameters according to step 3, including the weight matrix, visible-layer bias vector, hidden-layer bias vector, number of sampling steps, number of iterations, and learning rate.
2) Parameter update. Perform Gibbs sampling several times with the contrastive divergence algorithm and update the parameters with the update formulas above.
3) Layer-by-layer training. Train each RBM in turn until all RBMs are trained.
4) Fine-tuning. Adjust the weights and biases of each layer using the error back-propagation mechanism of the network model.
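The greedy layer-wise structure of steps 1)-3) can be sketched as a skeleton in which each RBM consumes the hidden activations of the one below it. This is deliberately hollow: the weights stay zero-initialized because the actual CD-k training is elided, so only the data flow of the stacking is shown.

```python
import math

def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))

def up(batch, W, b):
    """Propagate a batch through one RBM (deterministic hidden probabilities)."""
    return [[sigmoid(b[j] + sum(v[i] * W[i][j] for i in range(len(v))))
             for j in range(len(b))] for v in batch]

def pretrain(batch, layer_sizes):
    """Greedy layer-wise pass: one (W, b) pair per RBM.

    Weights are left at zero here; a real run would train each (W, b)
    with CD-k on `data` before propagating upward.
    """
    params, data = [], batch
    for n, m in zip(layer_sizes, layer_sizes[1:]):
        W = [[0.0] * m for _ in range(n)]   # placeholder for CD-k-trained weights
        b = [0.0] * m
        params.append((W, b))
        data = up(data, W, b)               # this layer's features feed the next RBM
    return params, data
```

The returned top-layer activations are what step 5 refers to as the abstract voice feature signal.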
Step 5: the high-level abstract voice signal is input into an adaptive filtering algorithm for speech enhancement. In the schematic of fig. 3, s(n) denotes the original speech signal; x(n) the noisy speech signal; y(n) the output signal of the filter; v(n) the noise signal; v0(n) and v1(n) two noise signals uncorrelated with s(n); and e(n) the error signal, expressed as follows:
e(n) = x(n) − y(n) = s(n) + v0(n) − y(n)

Squaring both sides and taking the mathematical expectation gives:

E[e²(n)] = E[s²(n)] + E[(v0(n) − y(n))²] + 2E[s(n)(v0(n) − y(n))]

The adaptive process of the LMS algorithm automatically adjusts the tap weights of the filter so that the error E[e²(n)] is minimized. Because s(n) and v0(n) are mutually independent, the cross term vanishes, so minimizing E[e²(n)] only requires minimizing E[(v0(n) − y(n))²].
The iterative formula of the least mean square error (LMS) algorithm based on the steepest descent method is:

W(n + 1) = W(n) + μ[−∇(n)]

so that, with the instantaneous gradient estimate ∇(n) = −2e(n)X(n), one obtains:

W(n + 1) = W(n) + 2μe(n)X(n)

where μ denotes the step-size factor.
Let the order of the adaptive filter be M, the filter coefficients F, and the input signal sequence X. The output is:

y(n) = ∑_{k=0}^{M−1} w_k x(n − k)

e(n) = d(n) − y(n)

Substituting the output into the error gives:

e(n) = d(n) − ∑_{k=0}^{M−1} w_k x(n − k)

Let F = [w_0 w_1 … w_{M−1}]ᵀ and X_j = [x_{1j} x_{2j} … x_{nj}]ᵀ; then the output of the filter can be written in matrix form:

y_j = Fᵀ X_j

The cost function is defined as:

J(F) = E[e_j²]

When this cost function is minimized, optimal filtering is considered achieved; such adaptive filtering is called least mean square adaptive filtering.
For least mean square adaptive filtering, the filter coefficients that minimize the mean square error must be determined, and gradient descent methods are generally used to solve such problems. The iterative formula for the filter coefficient vector is:

F_{j+1} = F_j + μ[−∇_j]

Because the instantaneous gradient −2X_j e_j is an unbiased estimate of the true gradient, in practical applications the instantaneous gradient can replace the true gradient (with the constant factor absorbed into μ), that is:

F_{j+1} = F_j + μ e_j X_j
and through gradual iteration, the optimal filter coefficient can be obtained, and the self-adaptive filtering of the input vehicle-mounted voice signal is realized to perform voice enhancement.
The LMS-based speech noise reduction algorithm comprises the following specific steps:
1) The noisy speech signal x(n) is first expressed as:

x(n) = s(n) + v0(n)

2) The biomimetic wavelet transform is then applied to the noisy speech signal x(n) to obtain the noisy wavelet coefficients, i.e., the coefficients produced by transforming the noisy speech signal with the biomimetic wavelet.
3) Adaptive filtering. The input signal of the adaptive canceller is N = [n_0 n_1 … n_{M−1}]ᵀ, and the output signal of the filter is computed with the LMS iteration given above.
according to the method, after noise filtering is carried out through a vehicle-mounted voice enhancement algorithm based on a deep belief network, an enhanced vehicle-mounted voice signal is output, and finally, the calculated signal-to-noise ratio and the calculated PESQ value are improved compared with those of a traditional voice enhancement algorithm. The algorithm can effectively eliminate the noise of the original voice signal, furthest reserve the information of the original voice signal and effectively enhance the processed voice signal.
The above description is only a preferred embodiment of the present invention and is not intended to limit it; those skilled in the art may make various modifications and changes. Any modification, equivalent replacement, or improvement made within the spirit and principles of the present invention shall fall within its protection scope.
Claims (3)
1. A vehicle-mounted voice enhancement algorithm based on a deep belief network is characterized by comprising the following steps:
Step 1: dividing the vehicle-mounted voice signal into training sample signals and testing sample signals;
Step 2: optimizing the learning rate, the initial weights and the number of hidden nodes of the DBN by means of a QPSO algorithm;
Step 3: replacing the sigmoid function with a tanh activation function to optimize the deep belief network model;
Step 4: performing greedy layer-by-layer unsupervised learning on the optimized deep belief network to obtain abstract voice feature vectors of the input vehicle-mounted voice signal;
Step 5: inputting the abstract voice signal into a least mean square error algorithm to obtain the enhanced voice signal.
2. The vehicle-mounted speech enhancement algorithm based on the deep belief network according to claim 1, characterized in that: the step 2 comprises training the DBN by adopting a restricted Boltzmann machine;
the restricted Boltzmann machine represents the current state of the system through its energy:

E(v, h | θ) = −∑_{i=1}^{n} a_i v_i − ∑_{j=1}^{m} b_j h_j − ∑_{i=1}^{n} ∑_{j=1}^{m} v_i W_ij h_j

where n is the number of nodes of the visible layer v, m is the number of nodes of the hidden layer h, a is the bias of the visible layer, b is the bias of the hidden layer, W_ij is the weight from visible-layer node i to hidden-layer node j, and θ = {W, a, b} is the set of all system parameters;
the probability distribution of the whole system is:

P(v, h | θ) = exp(−E(v, h | θ)) / Z(θ),  Z(θ) = ∑_{v,h} exp(−E(v, h | θ))

the logarithmic derivative of the marginal probability distribution is:

∂ ln P(v | θ) / ∂θ = ⟨∂(−E(v, h | θ)) / ∂θ⟩_{P(h|v,θ)} − ⟨∂(−E(v, h | θ)) / ∂θ⟩_{P(v,h|θ)}

for the training samples, "data" denotes the distribution P(h | v, θ) and "model" denotes the distribution P(v, h | θ), where ⟨·⟩_P denotes the mathematical expectation with respect to the distribution P;
the marginal-distribution logarithmic derivative is expressed as:

∂ ln P(v | θ) / ∂W_ij = ⟨v_i h_j⟩_data − ⟨v_i h_j⟩_model
∂ ln P(v | θ) / ∂a_i = ⟨v_i⟩_data − ⟨v_i⟩_model
∂ ln P(v | θ) / ∂b_j = ⟨h_j⟩_data − ⟨h_j⟩_model

the RBM parameters are obtained with the following update:

W_ij ← W_ij + ε(⟨v_i h_j⟩_data − ⟨v_i h_j⟩_recon)
a_i ← a_i + ε(⟨v_i⟩_data − ⟨v_i⟩_recon)
b_j ← b_j + ε(⟨h_j⟩_data − ⟨h_j⟩_recon)

where ⟨·⟩_recon is the expectation after k Gibbs sampling steps and ε is the learning rate;
the learning rate, the initial weights and the number of hidden nodes are optimized with the following formulas:

Z_b = (1/Z) ∑_{i=1}^{Z} p_i
P = μ p_i + (1 − μ) p_j
X(t + 1) = P ± α_p |Z_b − X(t)| ln(1/u)

where Z is the size of the population; μ and u are random numbers uniformly distributed on the interval [0, 1]; Z_b is the average of the best positions of all particle individuals; p_i is the individual best position of particle i and p_j is the global best position; X(t) is the position of particle i at the t-th iteration; and α_p is the contraction-expansion factor.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202010484415.6A CN111653272A (en) | 2020-06-01 | 2020-06-01 | Vehicle-mounted voice enhancement algorithm based on deep belief network |
Publications (1)
Publication Number | Publication Date |
---|---|
CN111653272A (en) | 2020-09-11
Family
ID=72352034
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN202010484415.6A Pending CN111653272A (en) | 2020-06-01 | 2020-06-01 | Vehicle-mounted voice enhancement algorithm based on deep belief network |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN111653272A (en) |
Cited By (1)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN112687294A (en) * | 2020-12-21 | 2021-04-20 | 重庆科技学院 | Vehicle-mounted noise identification method |
Citations (6)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN106024001A (en) * | 2016-05-03 | 2016-10-12 | 电子科技大学 | Method used for improving speech enhancement performance of microphone array |
CN107358966A (en) * | 2017-06-27 | 2017-11-17 | 北京理工大学 | Based on deep learning speech enhan-cement without reference voice quality objective evaluation method |
CN108615533A (en) * | 2018-03-28 | 2018-10-02 | 天津大学 | A kind of high-performance sound enhancement method based on deep learning |
CN109086817A (en) * | 2018-07-25 | 2018-12-25 | 西安工程大学 | A kind of Fault Diagnosis for HV Circuit Breakers method based on deepness belief network |
CN109492746A (en) * | 2016-09-06 | 2019-03-19 | 青岛理工大学 | Deepness belief network parameter optimization method based on GA-PSO Hybrid Algorithm |
CN109671433A (en) * | 2019-01-10 | 2019-04-23 | 腾讯科技(深圳)有限公司 | A kind of detection method and relevant apparatus of keyword |
Non-Patent Citations (5)
Title |
---|
ZHU Meng: "Big-data-oriented state information mining and fault diagnosis of high-voltage circuit breakers", China Master's Theses Full-text Database, Engineering Science & Technology II *
WANG Weijun et al.: "Research on a speech noise reduction method based on adaptive filtering", Modern Electronics Technique *
WANG Tao: "Research on speech noise reduction processing technology", China Master's Theses Full-text Database, Information Science *
XING Chuanxi, SONG Yang: "Shallow-Sea Environmental Parameter Inversion and Acoustic Signal Processing Technology", Beijing Institute of Technology Press, 30 June 2018 *
HUANG Jiahua et al.: "TE process fault diagnosis with a deep belief network based on parameter optimization", Abstracts of the 30th Chinese Process Control Conference (CPCC 2019) *
Similar Documents
Publication | Publication Date | Title |
---|---|---|
CN109841226B (en) | Single-channel real-time noise reduction method based on convolution recurrent neural network | |
CN108766419B (en) | Abnormal voice distinguishing method based on deep learning | |
Droppo et al. | Evaluation of the SPLICE algorithm on the Aurora2 database. | |
CN110867181A (en) | Multi-target speech enhancement method based on SCNN and TCNN joint estimation | |
Dharanipragada et al. | A nonlinear unsupervised adaptation technique for speech recognition. | |
CN112331224A (en) | Lightweight time domain convolution network voice enhancement method and system | |
Hilger et al. | Quantile based histogram equalization for noise robust speech recognition | |
CN108735199B (en) | Self-adaptive training method and system of acoustic model | |
CN110634476B (en) | Method and system for rapidly building robust acoustic model | |
CN113936681B (en) | Speech enhancement method based on mask mapping and mixed cavity convolution network | |
WO2022012206A1 (en) | Audio signal processing method, device, equipment, and storage medium | |
WO2005098820A1 (en) | Speech recognition device and speech recognition method | |
CN109346084A (en) | Method for distinguishing speek person based on depth storehouse autoencoder network | |
CN111899757A (en) | Single-channel voice separation method and system for target speaker extraction | |
CN113539293B (en) | Single-channel voice separation method based on convolutional neural network and joint optimization | |
CN112270405A (en) | Filter pruning method and system of convolution neural network model based on norm | |
CN113763965A (en) | Speaker identification method with multiple attention characteristics fused | |
CN111653272A (en) | Vehicle-mounted voice enhancement algorithm based on deep belief network | |
CN114678030A (en) | Voiceprint identification method and device based on depth residual error network and attention mechanism | |
CN111798828A (en) | Synthetic audio detection method, system, mobile terminal and storage medium | |
CN111091809B (en) | Regional accent recognition method and device based on depth feature fusion | |
CN114863938A (en) | Bird language identification method and system based on attention residual error and feature fusion | |
Xu et al. | Robust speech recognition based on noise and SNR classification-a multiple-model framework. | |
CN109871448B (en) | Short text classification method and system | |
CN111667836B (en) | Text irrelevant multi-label speaker recognition method based on deep learning |
Legal Events
Date | Code | Title | Description
---|---|---|---
| PB01 | Publication |
| SE01 | Entry into force of request for substantive examination |
| RJ01 | Rejection of invention patent application after publication | Application publication date: 20200911