WO2016101688A1 - Continuous speech recognition method based on a deep long short-term memory recurrent neural network - Google Patents

Continuous speech recognition method based on a deep long short-term memory recurrent neural network

Info

Publication number
WO2016101688A1
WO2016101688A1
Authority
WO
WIPO (PCT)
Prior art keywords
output
neural network
short term memory
Prior art date
Application number
PCT/CN2015/092380
Other languages
English (en)
Chinese (zh)
Inventor
杨毅
孙甲松
Original Assignee
清华大学
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by 清华大学
Publication of WO2016101688A1


Classifications

    • G — PHYSICS
    • G10 — MUSICAL INSTRUMENTS; ACOUSTICS
    • G10L — SPEECH ANALYSIS OR SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING; SPEECH OR AUDIO CODING OR DECODING
    • G10L 15/00 — Speech recognition
    • G10L 15/08 — Speech classification or search
    • G10L 15/16 — Speech classification or search using artificial neural networks

Definitions

  • The invention belongs to the field of audio technology and in particular relates to a continuous speech recognition method based on a deep long short-term memory (LSTM) recurrent neural network.
  • Speech recognition mainly adopts continuous speech recognition technology based on statistical models; its main goal is to find the most probable word sequence corresponding to a given speech sequence.
  • Continuous speech recognition systems usually include an acoustic model, a language model and a decoding method.
  • Acoustic modeling methods, as the core technology of continuous speech recognition, have developed rapidly in recent years.
  • The commonly used acoustic model is the Gaussian Mixture Model-Hidden Markov Model (GMM-HMM). Its principle is: the Gaussian mixture model is trained to obtain the probability that each frame's features belong to each phoneme state.
  • The hidden Markov model provides the transition probabilities between the phoneme states (including self-transitions), from which the probability that each phoneme-state sequence produces the current sequence of speech feature vectors is obtained.
  • The phonemes are further divided into different modeling units according to their contexts (Context Dependent), which is called the CD-GMM-HMM method.
  • A Recurrent Neural Network (RNN) is a neural network containing directed cycles that express its dynamic temporal behaviour; it is widely used in handwriting recognition and language modeling. Speech signals are complex time-varying signals with correlations on different time scales, so compared with deep feed-forward neural networks, the recurrent connections of RNNs are better suited to processing such complex time-series data.
  • The Long Short-Term Memory (LSTM) model is better suited than a plain recurrent neural network to process and predict long sequences containing events with delays of uncertain duration.
  • The deep LSTM-RNN acoustic model proposed by the University of Toronto, which adds a memory block, combines the multi-level representation capability of deep neural networks with the ability of recurrent neural networks to exploit long-span context flexibly, reducing the phoneme recognition error rate on the TIMIT corpus to 17.1%.
  • However, the gradient descent method used in recurrent neural networks suffers from the vanishing gradient problem: as the number of network layers increases, the gradient dissipates layer by layer during weight adjustment, so the effect of the weight updates becomes smaller and smaller.
  • The two-layer deep LSTM-RNN acoustic model proposed by Google adds a linear Recurrent Projection Layer to the earlier deep LSTM-RNN model to alleviate this gradient dissipation problem.
  • The continuous speech recognition method based on a deep long short-term memory recurrent neural network improves the recognition rate for noisy continuous speech signals, has low computational complexity and fast convergence, and is suitable for implementation on an ordinary CPU.
  • A continuous speech recognition method based on a deep long short-term memory recurrent neural network comprises:
  • Step 1: establishing two deep long short-term memory recurrent neural network modules with identical structures, each comprising a plurality of long short-term memory layers and linear recurrent projection layers;
  • Step 2: sending the original clean speech signal and the noisy speech signal, respectively, as inputs to the two modules of Step 1;
  • Step 3: calculating the cross-entropy of all parameters of the corresponding long short-term memory layers in the two modules to measure the difference between their information distributions, and implementing the cross-entropy parameter update through linear recurrent projection layer 2;
  • Step 4: achieving continuous speech recognition by comparing the final update result with the final output of the deep long short-term memory recurrent neural network module that takes the original clean speech signal as input.
  • The output of a long short-term memory layer is the input of the first linear recurrent projection layer.
  • The output of the first linear recurrent projection layer is the input of the next long short-term memory layer, whose output in turn is the input of the next linear recurrent projection layer, and so on.
  • In the deep long short-term memory recurrent neural network module that takes the original clean speech signal as input, the output of the last linear recurrent projection layer is the output [y_1, ..., y_T] of the entire module, where T is the time length of the speech signal; in the module that takes the noisy signal as input, the output of the last linear recurrent projection layer is discarded.
  • The long short-term memory layer is composed of a memory cell, an input gate, an output gate, a forget gate, a tanh function and multipliers. For the long short-term memory layer, i.e. the long short-term memory neural network sub-module, the parameters at each time t ∈ [1, T] are calculated as follows:
  • G_input = sigmoid(W_ix·x + W_ic·Cell' + b_i)
  • G_forget = sigmoid(W_fx·x + W_fc·Cell' + b_f)
  • G_output = sigmoid(W_ox·x + W_oc·Cell' + b_o)
  • G_input is the output of the input gate
  • G_forget is the output of the forget gate
  • Cell is the output of the memory cell
  • Cell' is the output of the memory cell at time t-1
  • G_output is the output of the output gate
  • G'_output is the output of the output gate at time t-1
  • m is the output of the linear recurrent projection layer
  • m' is the output of the linear recurrent projection layer at time t-1
  • x is the input of the whole long short-term memory recurrent neural network module
  • y is the output of the whole long short-term memory recurrent neural network module
  • b_i is the bias of the input gate i
  • b_f is the bias of the forget gate f
  • b_c is the bias of the memory cell c
  • b_o is the bias of the output gate o
  • b_y is the bias of the output y; different b denote different biases. A minimal numerical sketch of this sub-module is given below.
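The extracted text reproduces only the three gate equations; the recurrences for Cell, m and y (and any recurrent terms involving m' and G'_output) did not survive extraction. The following minimal NumPy sketch therefore assumes the standard LSTM-with-projection form for the missing steps; the weight names W_cx (cell input) and W_pm (projection) are illustrative assumptions, not taken from the patent.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

class LSTMPCell:
    """One long short-term memory sub-module followed by its linear
    recurrent projection layer (block 102 + layer 108 of FIG. 1)."""

    def __init__(self, n_in, n_cell, n_proj, seed=0):
        rng = np.random.default_rng(seed)
        s = 0.1
        # Gate input weights W_ix, W_fx, W_ox and peephole weights
        # W_ic, W_fc, W_oc (taken as diagonal, hence stored as vectors).
        self.W_ix, self.W_fx, self.W_ox = (s * rng.standard_normal((n_cell, n_in)) for _ in range(3))
        self.W_ic, self.W_fc, self.W_oc = (s * rng.standard_normal(n_cell) for _ in range(3))
        self.W_cx = s * rng.standard_normal((n_cell, n_in))    # assumed cell-input weight
        self.W_pm = s * rng.standard_normal((n_proj, n_cell))  # assumed projection weight
        self.b_i, self.b_f, self.b_c, self.b_o = (np.zeros(n_cell) for _ in range(4))

    def step(self, x, cell_prev):
        # G_input  = sigmoid(W_ix·x + W_ic·Cell' + b_i)
        g_in = sigmoid(self.W_ix @ x + self.W_ic * cell_prev + self.b_i)
        # G_forget = sigmoid(W_fx·x + W_fc·Cell' + b_f)
        g_fg = sigmoid(self.W_fx @ x + self.W_fc * cell_prev + self.b_f)
        # G_output = sigmoid(W_ox·x + W_oc·Cell' + b_o)
        g_out = sigmoid(self.W_ox @ x + self.W_oc * cell_prev + self.b_o)
        # Assumed standard cell update: Cell = G_forget⊙Cell' + G_input⊙tanh(W_cx·x + b_c)
        cell = g_fg * cell_prev + g_in * np.tanh(self.W_cx @ x + self.b_c)
        # Assumed projection: m = W_pm·(G_output ⊙ tanh(Cell))
        m = self.W_pm @ (g_out * np.tanh(cell))
        return m, cell
```

A single layer is then unrolled over t ∈ [1, T] by feeding each frame x_t together with the previous Cell into step; the projection output m is what the stacked architecture passes upward.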
  • The outputs of the two long short-term memory neural network sub-modules at the same level are taken as the two inputs of an update sub-module. An update sub-module is composed of a cross-entropy unit and linear recurrent projection layer 2, and a plurality of update sub-modules are connected in series to form the update module: the output of one update sub-module is the input of the next, and the output of the last sub-module is the output of the entire update module.
  • x_1 and x_2 represent the two inputs of the update sub-module, i.e. the outputs of the corresponding long short-term memory neural network sub-modules in the modules that take the original clean speech signal and the noisy signal as input, respectively.
  • The output of linear recurrent projection layer 2 is calculated as y' = softmax(W_y·d + b_y'), where
  • W_y represents the weight from the parameter-update output to the output of the linear recurrent projection layer,
  • d represents the cross-entropy, and
  • b_y' represents the bias.
  • Existing deep neural network acoustic models perform well in quiet environments but fail when the environmental noise is strong and the signal-to-noise ratio drops sharply.
  • The directed cycles between units in the acoustic model of the present invention effectively describe the dynamic temporal behaviour inside the neural network, making it better suited to processing speech data with complex temporal structure.
  • Long short-term memory neural networks are better suited than plain recurrent neural networks to processing and predicting long sequences containing events with delays of uncertain duration, so acoustic models for speech recognition built on them can achieve better results.
  • With the deep long short-term memory recurrent neural network acoustic model structure, it is necessary to reduce the influence of noise characteristics on the neural network parameters and to improve the noise immunity and robustness of the speech recognition system under environmental noise interference.
  • FIG. 1 is a flow chart of the deep long short-term memory recurrent neural network module of the present invention.
  • FIG. 2 is a flow chart of the deep long short-term memory recurrent neural network update module of the present invention.
  • FIG. 3 is a flow chart of the robust deep long short-term memory recurrent neural network acoustic model of the present invention.
  • The present invention proposes a method and apparatus for a robust deep long short-term memory neural network acoustic model, in particular for continuous speech recognition scenarios.
  • These methods and apparatus are not limited to continuous speech recognition and can be applied to any method and apparatus related to speech recognition.
  • Step 1: Establish two deep long short-term memory recurrent neural network modules, each comprising two long short-term memory layers and linear recurrent projection layers, and send the original clean speech signal and the noisy signal, respectively, as inputs to the two modules.
  • FIG. 1 is a flow chart of the deep long short-term memory recurrent neural network module according to the present invention, including the following contents:
  • The module is composed of a memory cell 103, an input gate 104, an output gate 105, a forget gate 106, a tanh function 107 and multipliers; the output of the long short-term memory neural network sub-module is the input of the linear recurrent projection layer 108, whose output is the output of the whole module. The parameters are calculated as follows:
  • G_input = sigmoid(W_ix·x + W_ic·Cell' + b_i)
  • G_forget = sigmoid(W_fx·x + W_fc·Cell' + b_f)
  • G_output = sigmoid(W_ox·x + W_oc·Cell' + b_o)
  • G_input is the output of the input gate
  • G_forget is the output of the forget gate
  • Cell is the output of the memory cell
  • Cell' is the output of the memory cell at time t-1
  • G_output is the output of the output gate
  • G'_output is the output of the output gate at time t-1
  • m is the output of the linear recurrent projection layer
  • m' is the output of the linear recurrent projection layer at time t-1
  • x is the input of the whole long short-term memory recurrent neural network module
  • y is the output of the whole long short-term memory recurrent neural network module
  • b_i is the bias of the input gate i
  • b_f is the bias of the forget gate f
  • b_c is the bias of the memory cell c
  • b_o is the bias of the output gate o
  • b_y is the bias of the output y; different b denote different biases
  • Step 2: Calculate the cross-entropy of all parameters of the corresponding long short-term memory layers in the two modules to measure the difference between their information distributions, and implement the cross-entropy parameter update through linear recurrent projection layer 2.
  • FIG. 2 is a flow chart of the deep long short-term memory recurrent neural network update module according to the present invention, including the following contents: the original clean speech signal and the noisy signal (i.e. the original clean speech signal after interference by environmental noise) are respectively used as inputs of the deep long short-term memory recurrent neural network modules of FIG. 1, yielding the outputs of two long short-term memory neural network sub-modules (i.e. block 102 of FIG. 1).
  • These two outputs are the inputs of the update sub-module 202 of the update module. The update sub-module 202 is composed of a cross-entropy unit 203 and linear recurrent projection layer 2 (204); the output of update sub-module 202 serves as the input of the next update sub-module, and this is repeated multiple times. The output of the last update sub-module is the output 205 of the entire update module.
  • The cross-entropy 203 of the update sub-module 202 is calculated according to the standard cross-entropy formula d = -Σ_{k=1}^{K} x_{1,k}·log(x_{2,k}),
  • where x_1 and x_2 represent the two inputs of the update sub-module, i.e. the outputs of the two long short-term memory recurrent neural network sub-modules fed with the original clean speech signal and the noisy signal, respectively.
  • The output of linear recurrent projection layer 2 (204) is calculated as y' = softmax(W_y·d + b_y'), where
  • W_y represents the weight from the output of the cross-entropy 203 to the output of linear recurrent projection layer 2 (204),
  • d represents the cross-entropy,
  • b_y' represents the bias, and
  • the softmax function is softmax(x_k) = exp(x_k) / Σ_{l=1}^{K} exp(x_l),
  • where x_k represents the input of the k-th (k ∈ [1,K]) softmax function and the summation over l ∈ [1,K] runs over all K inputs. A sketch of this update sub-module is given below.
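As a concrete reading of the update sub-module, the sketch below pairs the cross-entropy 203 with linear recurrent projection layer 2 (204): d is computed between the two sub-module outputs x_1 and x_2, then passed through y' = softmax(W_y·d + b_y'). The cross-entropy formula itself did not survive extraction, so the standard definition over softmax-normalised inputs is assumed; the shapes of W_y and b_y' are likewise illustrative.

```python
import numpy as np

def softmax(x):
    # softmax(x_k) = exp(x_k) / sum over l in [1, K] of exp(x_l)
    e = np.exp(x - np.max(x))  # shifted for numerical stability
    return e / e.sum()

def cross_entropy(x1, x2, eps=1e-12):
    """Cross-entropy d between the clean-branch output x1 and the
    noisy-branch output x2 (standard definition, assumed); both
    inputs are normalised to distributions first."""
    p, q = softmax(x1), softmax(x2)
    return -np.sum(p * np.log(q + eps))

def update_submodule(x1, x2, W_y, b_y):
    """One update sub-module 202: cross-entropy 203 followed by linear
    recurrent projection layer 2 (204), y' = softmax(W_y*d + b_y')."""
    d = cross_entropy(x1, x2)      # scalar measure of distribution difference
    return softmax(W_y * d + b_y)  # W_y, b_y: vectors of the projection size
```

Several such sub-modules are chained in series, the output of one serving as an input of the next, to form the update module 304 of FIG. 2.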
  • Step 3: Continuous speech recognition is achieved by comparing the final update result with the final output of the deep long short-term memory recurrent neural network module that takes the original clean speech signal as input.
  • FIG. 3 is a flow chart of the robust deep long short-term memory neural network acoustic model of the present invention, including the following contents:
  • the deep long short-term memory recurrent neural network module 303 with the original clean speech signal 301 as input, the deep long short-term memory recurrent neural network update module 304, and the deep long short-term memory recurrent neural network module 305 with the noisy signal 302 (i.e. the original clean speech signal after interference by environmental noise) as input, wherein the parameters are calculated as described in Steps 1 and 2.
  • The final outputs are the output 306 of the deep long short-term memory recurrent neural network module that takes the original clean speech signal as input and the output 307 of the deep long short-term memory recurrent neural network update module; continuous speech recognition is achieved by comparing the two. A sketch of this overall flow is given below.
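Putting the pieces together, the following sketch mirrors the FIG. 3 flow: two structurally identical deep modules process the clean signal 301 (branch 303) and the noisy signal 302 (branch 305), the update module 304 chains update sub-modules over the corresponding layer outputs, and recognition compares the update output 307 with the clean-branch output 306. It assumes the LSTMPCell and update_submodule sketches above are in scope; the exact wiring of successive update sub-modules is not fully specified in the extracted text, so the serial chaining shown here is one plausible reading.

```python
import numpy as np

def run_deep_module(layers, features):
    """Propagate a feature sequence through a stack of LSTMP layers (one
    deep LSTM recurrent neural network module of FIG. 1). Returns the
    output sequence [y_1, ..., y_T] of the last projection layer and the
    final-frame output of each layer (for the update module)."""
    seq, per_layer = features, []
    for layer in layers:
        cell, out = np.zeros(len(layer.b_i)), []
        for x in seq:                      # unroll over t in [1, T]
            m, cell = layer.step(x, cell)
            out.append(m)
        per_layer.append(out[-1])
        seq = out                          # projection output feeds the next layer
    return seq, per_layer

def robust_acoustic_model(clean, noisy, layers_clean, layers_noisy, W_y, b_y):
    y_clean, lv_clean = run_deep_module(layers_clean, clean)  # branch 303 -> output 306
    _, lv_noisy = run_deep_module(layers_noisy, noisy)        # branch 305; final output discarded
    u = None
    for x1, x2 in zip(lv_clean, lv_noisy):                    # update sub-modules in series (304)
        u = update_submodule(x1 if u is None else u, x2, W_y, b_y)
    return u, y_clean[-1]                                     # outputs 307 and 306 to compare
```

All layers are assumed to share the same projection size so that the update sub-modules can be chained; in training, the comparison of 307 against 306 drives the noise-robust parameter update.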

Abstract

Disclosed is a continuous speech recognition method based on a deep long short-term memory recurrent neural network, comprising: using a noisy speech signal (302) and the original clean speech signal (301) as training samples; constructing two deep long short-term memory recurrent neural network modules (303, 305) with identical structures; performing a cross-entropy calculation between corresponding deep long short-term memory layers (102) of the two modules (303, 305) to obtain the difference between them; updating the cross-entropy parameter via a linear recurrent projection layer (108); and finally obtaining a deep long short-term memory recurrent neural network acoustic model that is robust to environmental noise. The method builds a deep long short-term memory recurrent neural network acoustic model that increases the recognition rate for continuous noisy speech signals, addresses the problem that the large parameter scale of deep neural networks (DNN) forces most computations onto GPU devices, has low computational complexity and a fast convergence rate, and is widely applicable to a variety of machine-learning fields related to speech recognition, such as speaker recognition, keyword recognition and human-machine interaction.
PCT/CN2015/092380 2014-12-25 2015-10-21 Continuous speech recognition method based on a deep long short-term memory recurrent neural network WO2016101688A1 (fr)

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
CN201410821646.6 2014-12-25
CN201410821646.6A CN104538028B (zh) 2014-12-25 2014-12-25 Continuous speech recognition method based on a deep long short-term memory recurrent neural network

Publications (1)

Publication Number Publication Date
WO2016101688A1 (fr) 2016-06-30

Family

ID=52853544

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/CN2015/092380 WO2016101688A1 (fr) 2014-12-25 2015-10-21 Continuous speech recognition method based on a deep long short-term memory recurrent neural network

Country Status (2)

Country Link
CN (1) CN104538028B (fr)
WO (1) WO2016101688A1 (fr)


Families Citing this family (35)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN104538028B (zh) 2014-12-25 2017-10-17 清华大学 Continuous speech recognition method based on a deep long short-term memory recurrent neural network
CN104952448A (zh) * 2015-05-04 2015-09-30 张爱英 Feature enhancement method and system using a bidirectional long short-term memory recurrent neural network
US10909329B2 (en) * 2015-05-21 2021-02-02 Baidu Usa Llc Multilingual image question answering
CN106611599A (zh) * 2015-10-21 2017-05-03 展讯通信(上海)有限公司 Speech recognition method and apparatus based on an artificial neural network, and electronic device
KR102494139B1 (ko) * 2015-11-06 2023-01-31 삼성전자주식회사 Apparatus and method for training a neural network, and apparatus and method for speech recognition
CN105389980B (zh) * 2015-11-09 2018-01-19 上海交通大学 Short-term traffic flow prediction method based on a long short-term memory recurrent neural network
CN105469065B (zh) * 2015-12-07 2019-04-23 中国科学院自动化研究所 Discrete emotion recognition method based on a recurrent neural network
CN105513591B (zh) * 2015-12-21 2019-09-03 百度在线网络技术(北京)有限公司 Method and apparatus for speech recognition using an LSTM recurrent neural network model
WO2017136077A1 (fr) * 2016-02-04 2017-08-10 Google Inc. Associative long short-term memory neural network layers
US10235994B2 (en) 2016-03-04 2019-03-19 Microsoft Technology Licensing, Llc Modular deep learning model
CN105559777B (zh) * 2016-03-17 2018-10-12 北京工业大学 EEG recognition method based on wavelet packets and an LSTM-type RNN neural network
KR102151682B1 (ko) * 2016-03-23 2020-09-04 구글 엘엘씨 Adaptive audio enhancement for multichannel speech recognition
CN107316198B (zh) * 2016-04-26 2020-05-29 阿里巴巴集团控股有限公司 Account risk identification method and apparatus
EP3451239A4 (fr) 2016-04-29 2020-01-01 Cambricon Technologies Corporation Limited Apparatus and method for performing recurrent neural network and LSTM computations
CN106096729B (zh) * 2016-06-06 2018-11-20 天津科技大学 Deep policy learning method for complex tasks in large-scale environments
CN106126492B (zh) * 2016-06-07 2019-02-05 北京高地信息技术有限公司 Sentence recognition method and apparatus based on a bidirectional LSTM neural network
US11449744B2 (en) 2016-06-23 2022-09-20 Microsoft Technology Licensing, Llc End-to-end memory networks for contextual language understanding
CN107808664B (zh) * 2016-08-30 2021-07-30 富士通株式会社 Speech recognition method, speech recognition apparatus and electronic device based on a sparse neural network
US10366163B2 (en) 2016-09-07 2019-07-30 Microsoft Technology Licensing, Llc Knowledge-guided structural attention processing
CN106383888A (zh) * 2016-09-22 2017-02-08 深圳市唯特视科技有限公司 Method for positioning and navigation using image retrieval
CN108461080A (zh) * 2017-02-21 2018-08-28 中兴通讯股份有限公司 Acoustic modeling method and apparatus based on an HLSTM model
CN116702843A (zh) 2017-05-20 2023-09-05 谷歌有限责任公司 Projection neural networks
CN107293288B (zh) * 2017-06-09 2020-04-21 清华大学 Acoustic model construction method using a residual long short-term memory recurrent neural network
CN107633842B (zh) 2017-06-12 2018-08-31 平安科技(深圳)有限公司 Speech recognition method and apparatus, computer device and storage medium
CN107301864B (zh) * 2017-08-16 2020-12-22 重庆邮电大学 Deep bidirectional LSTM acoustic model based on Maxout neurons
CN107657313B (zh) * 2017-09-26 2021-05-18 上海数眼科技发展有限公司 Transfer learning system and method for natural language processing tasks based on domain adaptation
CN107993636B (zh) * 2017-11-01 2021-12-31 天津大学 Music score modeling and generation method based on a recurrent neural network
CN108364634A (zh) * 2018-03-05 2018-08-03 苏州声通信息科技有限公司 Spoken pronunciation evaluation method based on a deep neural network posterior probability algorithm
CN108831450A (zh) * 2018-03-30 2018-11-16 杭州鸟瞰智能科技股份有限公司 Virtual-robot human-machine interaction method based on user emotion recognition
US10885277B2 (en) 2018-08-02 2021-01-05 Google Llc On-device neural networks for natural language understanding
CN109243494B (zh) * 2018-10-30 2022-10-11 南京工程学院 Child emotion recognition method based on a multi-attention long short-term memory network
CN110517679B (zh) * 2018-11-15 2022-03-08 腾讯科技(深圳)有限公司 Artificial-intelligence audio data processing method and apparatus, and storage medium
CN111368996B (zh) 2019-02-14 2024-03-12 谷歌有限责任公司 Retraining projection networks for transferable natural language representations
CN110570845B (zh) * 2019-08-15 2021-10-22 武汉理工大学 Speech recognition method based on domain-invariant features
CN111429938B (zh) * 2020-03-06 2022-09-13 江苏大学 Single-channel speech separation method and apparatus, and electronic device


Family Cites Families (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US9235799B2 (en) * 2011-11-26 2016-01-12 Microsoft Technology Licensing, Llc Discriminative pretraining of deep neural networks

Patent Citations (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US5133012A (en) * 1988-12-02 1992-07-21 Kabushiki Kaisha Toshiba Speech recognition system utilizing both a long-term strategic and a short-term strategic scoring operation in a transition network thereof
US8005674B2 (en) * 2006-11-29 2011-08-23 International Business Machines Corporation Data modeling of class independent recognition models
CN101937675A (zh) * 2009-06-29 2011-01-05 展讯通信(上海)有限公司 Voice detection method and device
CN102122507A (zh) * 2010-01-08 2011-07-13 龚澍 Speech error detection method using an artificial neural network for front-end processing
CN104538028A (zh) * 2014-12-25 2015-04-22 清华大学 Continuous speech recognition method based on a deep long short-term memory recurrent neural network

Cited By (20)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN109086865A (zh) * 2018-06-11 2018-12-25 上海交通大学 Sequence model building method based on a segmented recurrent neural network
CN109086865B (zh) * 2018-06-11 2022-01-28 上海交通大学 Sequence model building method based on a segmented recurrent neural network
CN110147284A (zh) * 2019-05-24 2019-08-20 湖南农业大学 Supercomputer workload prediction method based on a two-dimensional long short-term memory neural network
CN110377889A (zh) * 2019-06-05 2019-10-25 安徽继远软件有限公司 Text editing method and system based on a feedforward sequential memory neural network
CN110377889B (zh) * 2019-06-05 2023-06-20 安徽继远软件有限公司 Text editing method and system based on a feedforward sequential memory neural network
CN110705743A (zh) * 2019-08-23 2020-01-17 国网浙江省电力有限公司 Method for predicting renewable-energy consumption based on a long short-term memory neural network
CN110705743B (zh) * 2019-08-23 2023-08-18 国网浙江省电力有限公司 Method for predicting renewable-energy consumption based on a long short-term memory neural network
CN112488286A (zh) * 2019-11-22 2021-03-12 大唐环境产业集团股份有限公司 Online monitoring method and system for MBR membrane fouling
CN111191559A (zh) * 2019-12-25 2020-05-22 国网浙江省电力有限公司泰顺县供电公司 Obstacle recognition method for an overhead-line early-warning system based on a temporal convolutional neural network
CN111191559B (zh) * 2019-12-25 2023-07-11 国网浙江省电力有限公司泰顺县供电公司 Obstacle recognition method for an overhead-line early-warning system based on a temporal convolutional neural network
CN111079906B (zh) * 2019-12-30 2023-05-05 燕山大学 Method and system for predicting the specific surface area of finished cement based on a long short-term memory network
CN111079906A (zh) * 2019-12-30 2020-04-28 燕山大学 Method and system for predicting the specific surface area of finished cement based on a long short-term memory network
CN111241466A (zh) * 2020-01-15 2020-06-05 上海海事大学 Ship traffic flow prediction method based on deep learning
CN111241466B (zh) * 2020-01-15 2023-10-03 上海海事大学 Ship traffic flow prediction method based on deep learning
CN111414478A (zh) * 2020-03-13 2020-07-14 北京科技大学 Social network sentiment modeling method based on a deep recurrent neural network
CN111414478B (zh) * 2020-03-13 2023-11-17 北京科技大学 Social network sentiment modeling method based on a deep recurrent neural network
CN112001482A (zh) * 2020-08-14 2020-11-27 佳都新太科技股份有限公司 Vibration prediction and model training method and apparatus, computer device and storage medium
CN112466056A (zh) * 2020-12-01 2021-03-09 上海旷日网络科技有限公司 Self-service locker pickup system and method based on speech recognition
CN112714130A (zh) * 2020-12-30 2021-04-27 南京信息工程大学 Adaptive network security situation awareness method based on big data
CN114740361A (zh) * 2022-04-12 2022-07-12 湖南大学 Fuel cell voltage prediction method based on a long short-term memory neural network model

Also Published As

Publication number Publication date
CN104538028B (zh) 2017-10-17
CN104538028A (zh) 2015-04-22

Similar Documents

Publication Publication Date Title
WO2016101688A1 (fr) Continuous speech recognition method based on a deep long short-term memory recurrent neural network
TWI692751B (zh) Voice wake-up method, apparatus and electronic device
EP3926623A1 (fr) Speech recognition method and apparatus, and neural network training method and apparatus
KR102167719B1 (ko) Method and apparatus for training a language model, and method and apparatus for speech recognition
JP7109302B2 (ja) Sentence generation model update method and sentence generation apparatus
Nakkiran et al. Compressing deep neural networks using a rank-constrained topology
US20190034784A1 (en) Fixed-point training method for deep neural networks based on dynamic fixed-point conversion scheme
US20180018555A1 (en) System and method for building artificial neural network architectures
US9728183B2 (en) System and method for combining frame and segment level processing, via temporal pooling, for phonetic classification
WO2016145850A1 (fr) Construction method for an acoustic model of a long short-term memory recurrent neural network based on a selective attention principle
CN109065032B (zh) External-corpus speech recognition method based on a deep convolutional neural network
JP2019159654A (ja) Time-series information learning system and method, and neural network model
CN108831445A (zh) Sichuan dialect recognition method, acoustic model training method, apparatus and device
US20140142929A1 (en) Deep neural networks training for speech and pattern recognition
US20230196202A1 (en) System and method for automatic building of learning machines using learning machines
CN109147774B (zh) Improved time-delay neural network acoustic model
CN110853630B (zh) Lightweight speech recognition method for edge computing
CN106340297A (zh) Speech recognition method and system based on cloud computing and confidence computation
KR20220130565A (ko) Keyword detection method and apparatus
CN111144124A (zh) Machine learning model training method, intent recognition method, and related apparatus and device
CN113987179A (zh) Dialogue emotion recognition network model based on knowledge enhancement and backtracking loss, construction method, electronic device and storage medium
Zhang et al. High order recurrent neural networks for acoustic modelling
CN108461080A (zh) Acoustic modeling method and apparatus based on an HLSTM model
Kang et al. Advanced recurrent network-based hybrid acoustic models for low resource speech recognition
Li et al. Improving long short-term memory networks using maxout units for large vocabulary speech recognition

Legal Events

Date Code Title Description
121 Ep: the epo has been informed by wipo that ep was designated in this application

Ref document number: 15871761

Country of ref document: EP

Kind code of ref document: A1

NENP Non-entry into the national phase

Ref country code: DE

122 Ep: pct application non-entry in european phase

Ref document number: 15871761

Country of ref document: EP

Kind code of ref document: A1