WO2021212684A1 - Recurrent neural network and training method therefor - Google Patents

Recurrent neural network and training method therefor

Info

Publication number
WO2021212684A1
WO2021212684A1 · PCT/CN2020/105359 · CN2020105359W
Authority
WO
WIPO (PCT)
Prior art keywords
layer
network
neural network
recurrent neural
residual
Prior art date
Application number
PCT/CN2020/105359
Other languages
French (fr)
Chinese (zh)
Inventor
康燕斌
张志齐
Original Assignee
上海依图网络科技有限公司
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by 上海依图网络科技有限公司
Publication of WO2021212684A1 publication Critical patent/WO2021212684A1/en

Links

Classifications

    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06NCOMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00Computing arrangements based on biological models
    • G06N3/02Neural networks
    • G06N3/04Architecture, e.g. interconnection topology
    • G06N3/045Combinations of networks
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06NCOMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00Computing arrangements based on biological models
    • G06N3/02Neural networks
    • G06N3/04Architecture, e.g. interconnection topology
    • G06N3/044Recurrent networks, e.g. Hopfield networks
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06NCOMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00Computing arrangements based on biological models
    • G06N3/02Neural networks
    • G06N3/08Learning methods

Definitions

  • The present invention relates to speech recognition, and in particular to a recurrent neural network.
  • The invention also relates to a training method for the recurrent neural network.
  • FIG. 1 is a model structure diagram of an existing speech recognition device; the existing recurrent neural network (RNN) is formed by connecting two long short-term memory (LSTM) network layers 102.
  • RNN: recurrent neural network
  • LSTM: long short-term memory
  • The recurrent neural network is used in a speech recognition device.
  • The speech recognition device includes: a convolution layer 101, the recurrent neural network, a fully connected layer 103, and a connectionist temporal classification (CTC) layer 104.
  • CTC: connectionist temporal classification
  • The convolutional layer 101 receives the spectrum signal of the sound; the output of the convolutional layer 101 is connected to the recurrent neural network, and the recurrent neural network is connected to the CTC layer 104 through the fully connected layer 103.
  • The CTC layer 104 provides the CTC loss function and is used to train on the speech signal.
  • The number of convolutional layers 101 is 1 to 3, and the convolutional layer 101 is usually an invariant convolution layer.
  • The fully connected layer 103 is one or more layers.
  • The LSTM network layer 102 is formed by connecting multiple LSTM network nodes 105.
  • The LSTM network layer 102 is a bidirectional network layer; that is, in the width direction of each LSTM network layer 102, different LSTM network nodes 105 can pass information to each other, as shown by the two arrowed lines in the dashed circle 106.
  • In an LSTM network node 105, a forget gate is usually provided to control the influence of the output of the preceding LSTM network node 105 on the current node; the control function of the forget gate is a sigmoid function whose output lies between 0 and 1.
  • A multiplication module is provided at the input of the LSTM network node 105: the control signal output by the forget gate is multiplied with the signal coming from the corresponding other node, so that the corresponding input signal either is or is not passed into the LSTM network node 105.
  • Besides the forget gate, the LSTM network node 105 also includes an input gate and an output gate, which likewise multiply gating signals between 0 and 1 with the corresponding signals to realize selective input of signals and control the flow of information.
  • The disadvantage of the existing recurrent neural network composed of LSTM network layers 102 is that only about 2 recurrent layers can be used; when the number of layers is increased, training fails to converge, or the training result is significantly worse than that of the shallower network, so the performance of the recurrent network cannot be improved further.
  • FIG. 1 is a model structure diagram of an existing speech recognition device.
  • FIG. 2 is a model structure diagram of a speech recognition device according to an embodiment of the present invention.
  • FIG. 3 is a flowchart of a recurrent neural network training method according to an embodiment of the present invention.
  • FIG. 2 is a model structure diagram of a speech recognition device according to an embodiment of the present invention.
  • The recurrent neural network of the embodiment of the present invention includes a baseline model and an extension model.
  • The baseline model is formed by connecting two LSTM network layers 2.
  • The extension model includes multiple residual network layers 3; each residual network layer 3 is formed by connecting one LSTM network layer 2 and one addition function layer.
  • The input of the residual network layer 3 is connected to the output of the preceding network layer, and the two inputs of the addition function layer are connected respectively to the output of the LSTM network layer 2 of the residual network layer 3 and to the output of the preceding network layer.
  • The output of the addition function layer serves as the output of the residual network layer 3.
  • The depth of the residual network layers 3 included in the extension model is 1 to 7 layers, and the depth of the recurrent neural network is 3 to 9 layers.
  • The extension depth of the extension model is determined by training.
  • When adding one more residual network layer makes the training result worse, the depth before that residual network layer was added is taken as the depth of the recurrent neural network.
  • The recurrent neural network is used in a speech recognition device.
  • The speech recognition device includes: a convolutional layer 1, the recurrent neural network, a fully connected layer 4, and a CTC layer 5.
  • The convolutional layer 1 receives the spectrum signal of the sound; the output of the convolutional layer 1 is connected to the recurrent neural network, and the recurrent neural network is connected to the CTC layer 5 through the fully connected layer 4.
  • The CTC layer 5 provides the CTC loss function and is used to train on the speech signal.
  • The number of convolutional layers 1 is 1 to 3, and the convolutional layer 1 is usually an invariant convolutional layer.
  • The fully connected layer 4 is one or more layers.
  • In the recurrent neural network, every node within a network layer is of the same type: for an LSTM network layer 2, the network nodes are all LSTM network nodes 6; for a residual network layer 3, the network nodes are all residual network nodes 8.
  • A residual network node 8 is composed of an LSTM network node 6 and an addition function node 9.
  • The addition function node 9 is also labeled ADD in FIG. 2; the addition function nodes 9 together form the addition function layer.
  • Each network layer in the recurrent neural network is a bidirectional network layer; that is, in the width direction of each network layer, different network nodes can pass information to each other, as shown by the two arrowed lines in the dashed circle 7.
  • For each network layer, only the network nodes of one layer are shown in detail, and an ellipsis indicates that the layer contains more network nodes.
  • The number of network nodes is the same in every network layer, with a one-to-one correspondence between layers.
  • For a residual network node 8, the output of the preceding network node is fed both to the LSTM network node 6 and to the addition function node 9; the output of the LSTM network node 6 within the residual network node 8 is also fed to the addition function node 9, and the output of the addition function node 9 serves as the output of the residual network node 8.
  • When the (K+1)-th network layer is a residual network layer 3, the output signal of the corresponding residual network node 8 can be expressed by the following formula: output_{k+1} = LSTM_{k+1}(output_k) + output_k.
  • output_{k+1} denotes the output of the residual network node 8 of the (K+1)-th network layer, i.e. the output of the addition function node 9.
  • output_{k} denotes the output of the residual network node 8 of the K-th network layer, i.e. the output of the addition function node 9.
  • LSTM_{k+1}() denotes the functional expression of the LSTM network node 6 within the residual network node 8 of the (K+1)-th network layer.
  • LSTM_{k+1}(output_k) denotes the output of the LSTM network node 6 within the residual network node 8 of the (K+1)-th network layer when its input is output_k.
  • For the baseline model, i.e. the first two LSTM network layers 2, the output signal of each LSTM network node 6 is LSTM_{k}(output_{k-1}), where LSTM_{k}() denotes the functional expression of the LSTM network node 6 of the K-th LSTM network layer 2.
  • The recurrent neural network adds residual network layers 3 on top of a baseline model composed of two LSTM network layers 2.
  • Each residual network layer 3 is formed by connecting an LSTM network layer 2 and an addition function layer.
  • The residual network layer 3 can increase the depth of the recurrent neural network while still allowing training to converge, so the network depth can be increased and the training result and performance can thereby be improved.
  • FIG. 3 is a flowchart of a recurrent neural network training method according to an embodiment of the present invention.
  • The training method of the recurrent neural network according to an embodiment of the present invention includes the following steps.
  • Step 1: provide a baseline model of the recurrent neural network, the baseline model being formed by connecting two LSTM network layers. Step 1 corresponds to the step marked 301 in FIG. 3.
  • Step 2: initialize the baseline model; this initialization corresponds to the step marked 302 in FIG. 3.
  • Training of the recurrent neural network starts from the first LSTM network layer 2.
  • The training step for the first LSTM network layer 2 is not shown explicitly; it is included in the initialization step.
  • Step 3: add an extension model on the basis of the baseline model.
  • The extension model includes multiple residual network layers 3.
  • Each residual network layer 3 is formed by connecting one LSTM network layer 2 and one addition function layer; the input of the residual network layer 3 is connected to the output of the preceding network layer, the two inputs of the addition function layer are connected respectively to the output of the LSTM network layer 2 of the residual network layer 3 and to the output of the preceding network layer, and the output of the addition function layer serves as the output of the residual network layer 3.
  • The sub-steps of adding a residual network layer 3 include the following.
  • Step 31: add a new residual network layer 3 and let the newly added residual network layer 3 be the (K+1)-th layer.
  • The first K network layers have already been trained; the first K network layers are initialized with the trained model, and the (K+1)-th layer is initialized with random parameters.
  • After K is reset (K = K+1), equivalently, the first K-1 network layers are initialized with the already-trained parameters, and the K-th network layer is initialized with random parameters.
  • Step 32: train the (K+1)-th residual network layer 3; that is, perform the step indicated by mark 303.
  • Step 33: perform a performance test and check whether the improvement in the performance test result is greater than a threshold; that is, perform the step indicated by mark 304.
  • If the improvement is greater than the threshold, step 34 is performed.
  • The threshold in step 33 is 3%.
  • If the improvement is less than the threshold, step 35 is performed.
  • Step 34: add the (K+1)-th residual network layer 3 to the recurrent neural network, then repeat step 31.
  • Step 35: as shown in the step corresponding to mark 309, training ends; stop adding residual network layers 3 and take the existing K network layers as the recurrent neural network.
  • The method of the embodiment of the present invention makes it possible for the depth of the residual network layers 3 included in the extension model to be 1 to 7 layers and the depth of the recurrent neural network to be 3 to 9 layers.

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • Data Mining & Analysis (AREA)
  • General Health & Medical Sciences (AREA)
  • Biomedical Technology (AREA)
  • Biophysics (AREA)
  • Computational Linguistics (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Evolutionary Computation (AREA)
  • Artificial Intelligence (AREA)
  • Molecular Biology (AREA)
  • Computing Systems (AREA)
  • General Engineering & Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Mathematical Physics (AREA)
  • Software Systems (AREA)
  • Health & Medical Sciences (AREA)
  • Image Analysis (AREA)

Abstract

Disclosed is a recurrent neural network comprising a baseline model formed by connecting two LSTM network layers, and an extension model comprising a plurality of residual network layers. Each residual network layer is formed by connecting one LSTM network layer and one addition function layer; the input of the residual network layer is connected to the output of the previous network layer, the two inputs of the addition function layer are connected respectively to the output of the LSTM network layer of the residual network layer and to the output of the previous network layer, and the output of the addition function layer is used as the output of the residual network layer. Also disclosed is a training method for the recurrent neural network. According to the present invention, a recurrent neural network using LSTM network layers can be made deeper, and the training result and performance can be improved.

Description

Recurrent Neural Network and Training Method Therefor
Technical Field
The present invention relates to speech recognition, and in particular to a recurrent neural network. The invention also relates to a training method for the recurrent neural network.
Background Art
As shown in FIG. 1, which is a model structure diagram of an existing speech recognition device, the existing recurrent neural network (RNN) is formed by connecting two long short-term memory (LSTM) network layers 102.
In FIG. 1, the recurrent neural network is used in a speech recognition device.
The speech recognition device includes: a convolution layer 101, the recurrent neural network, a fully connected layer 103, and a connectionist temporal classification (CTC) layer 104.
The convolutional layer 101 receives the spectrum signal of the sound; the output of the convolutional layer 101 is connected to the recurrent neural network, and the recurrent neural network is connected to the CTC layer 104 through the fully connected layer 103. The CTC layer 104 provides the CTC loss function and is used to train on the speech signal.
The number of convolutional layers 101 is 1 to 3, and the convolutional layer 101 is usually an invariant convolution layer.
The fully connected layer 103 is one or more layers.
In the recurrent neural network, the LSTM network layer 102 is formed by connecting multiple LSTM network nodes 105. In FIG. 1, the LSTM network layer 102 is a bidirectional network layer; that is, in the width direction of each LSTM network layer 102, different LSTM network nodes 105 can pass information to each other, as shown by the two arrowed lines in the dashed circle 106. In an LSTM network node 105, a forget gate is usually provided to control the influence of the output of the preceding LSTM network node 105 on the current node. The control function of the forget gate is a sigmoid function whose output lies between 0 and 1; a multiplication module at the input of the LSTM network node 105 multiplies the control signal output by the forget gate with the signal coming from the corresponding other node, so that the corresponding input signal either is or is not passed into the LSTM network node 105. Besides the forget gate, the LSTM network node 105 also includes an input gate and an output gate, which likewise multiply gating signals between 0 and 1 with the corresponding signals to realize selective input of signals and control the flow of information.
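For readers unfamiliar with LSTM gating, a minimal single-step LSTM cell is sketched below in Python/NumPy. This generic textbook formulation is given only to make the multiplicative gating described above concrete; the weight layout, names, and shapes are assumptions of this sketch and are not taken from the application.

```python
# Minimal single-step LSTM cell (generic formulation, illustrative only).
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def lstm_step(x_t, h_prev, c_prev, W, U, b):
    """One LSTM time step. W: (4H, D), U: (4H, H), b: (4H,)."""
    H = h_prev.shape[0]
    z = W @ x_t + U @ h_prev + b
    f = sigmoid(z[0 * H:1 * H])   # forget gate: scales the previous cell state
    i = sigmoid(z[1 * H:2 * H])   # input gate: scales the candidate update
    o = sigmoid(z[2 * H:3 * H])   # output gate: scales the exposed state
    g = np.tanh(z[3 * H:4 * H])   # candidate cell update
    c_t = f * c_prev + i * g      # multiplicative (gated) information flow
    h_t = o * np.tanh(c_t)
    return h_t, c_t
```

The forget, input, and output gates are all sigmoid outputs between 0 and 1 that scale the corresponding signals multiplicatively, which is the selective control of information flow described above.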
The disadvantage of the existing recurrent neural network composed of LSTM network layers 102 is that only about 2 recurrent layers can be used; when the number of layers is increased, training fails to converge, or the training result is significantly worse than that of the shallower network, so the performance of the recurrent network cannot be improved further.
Technical Problem
Type a paragraph describing the technical problem here.
Technical Solution
Type a paragraph describing the technical solution here.
Beneficial Effects
Type a paragraph describing the beneficial effect here.
Brief Description of the Drawings
The present invention will be further described in detail below in conjunction with the drawings and specific embodiments:
FIG. 1 is a model structure diagram of an existing speech recognition device;
FIG. 2 is a model structure diagram of a speech recognition device according to an embodiment of the present invention;
FIG. 3 is a flowchart of a recurrent neural network training method according to an embodiment of the present invention.
Best Mode of the Present Invention
As shown in FIG. 2, which is a model structure diagram of a speech recognition device according to an embodiment of the present invention, the recurrent neural network of the embodiment of the present invention includes:
a baseline model, formed by connecting two LSTM network layers 2; and
an extension model, which includes multiple residual network layers 3. Each residual network layer 3 is formed by connecting one LSTM network layer 2 and one addition function layer; the input of the residual network layer 3 is connected to the output of the preceding network layer, the two inputs of the addition function layer are connected respectively to the output of the LSTM network layer 2 of the residual network layer 3 and to the output of the preceding network layer, and the output of the addition function layer serves as the output of the residual network layer 3.
The depth of the residual network layers 3 included in the extension model is 1 to 7 layers, and the depth of the recurrent neural network is 3 to 9 layers.
The extension depth of the extension model is determined by training: when adding one more residual network layer makes the training result worse, the depth before that residual network layer was added is taken as the depth of the recurrent neural network.
In the embodiment of the present invention, the recurrent neural network is used in a speech recognition device.
The speech recognition device includes: a convolutional layer 1, the recurrent neural network, a fully connected layer 4, and a CTC layer 5.
The convolutional layer 1 receives the spectrum signal of the sound; the output of the convolutional layer 1 is connected to the recurrent neural network, and the recurrent neural network is connected to the CTC layer 5 through the fully connected layer 4. The CTC layer 5 provides the CTC loss function and is used to train on the speech signal.
The number of convolutional layers 1 is 1 to 3, and the convolutional layer 1 is usually an invariant convolutional layer.
The fully connected layer 4 is one or more layers.
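As an illustration only, a possible arrangement of these components is sketched below in PyTorch: a small convolutional front end over the spectrogram, the two baseline bidirectional LSTM layers, a configurable number of residual LSTM layers, and a fully connected layer producing log-probabilities for a CTC loss. All layer sizes, the choice of PyTorch, and every hyper-parameter are assumptions of this sketch; the application does not prescribe any particular implementation.

```python
# Illustrative sketch of the device: convolution -> recurrent network
# (2 baseline BiLSTM layers + residual BiLSTM layers) -> fully connected -> CTC.
import torch
import torch.nn as nn
import torch.nn.functional as F

class ResidualBiLSTM(nn.Module):
    """One residual network layer: the BiLSTM output is added to the layer input."""
    def __init__(self, dim):
        super().__init__()
        # bidirectional output size is 2 * (dim // 2) == dim, so the addition is valid
        self.lstm = nn.LSTM(dim, dim // 2, batch_first=True, bidirectional=True)

    def forward(self, x):
        y, _ = self.lstm(x)
        return y + x              # output_{k+1} = LSTM_{k+1}(output_k) + output_k

class SpeechRecognizer(nn.Module):
    def __init__(self, n_mels=80, dim=256, num_residual=3, num_classes=5000):
        super().__init__()
        self.conv = nn.Sequential(  # 1 to 3 convolutional layers over the spectrogram
            nn.Conv2d(1, 32, kernel_size=3, stride=2, padding=1), nn.ReLU(),
            nn.Conv2d(32, 32, kernel_size=3, stride=2, padding=1), nn.ReLU(),
        )
        conv_dim = 32 * ((n_mels + 3) // 4)    # frequency bins left after two stride-2 convs
        self.proj = nn.Linear(conv_dim, dim)
        self.baseline = nn.ModuleList(          # baseline model: 2 BiLSTM layers
            [nn.LSTM(dim, dim // 2, batch_first=True, bidirectional=True) for _ in range(2)])
        self.extension = nn.ModuleList(         # extension model: residual layers
            [ResidualBiLSTM(dim) for _ in range(num_residual)])
        self.fc = nn.Linear(dim, num_classes)   # fully connected layer before CTC

    def forward(self, spec):                    # spec: (batch, 1, time, n_mels)
        x = self.conv(spec)                     # (batch, 32, time', mels')
        x = x.permute(0, 2, 1, 3).flatten(2)    # (batch, time', 32 * mels')
        x = self.proj(x)
        for lstm in self.baseline:
            x, _ = lstm(x)
        for layer in self.extension:
            x = layer(x)
        return F.log_softmax(self.fc(x), dim=-1)  # log-probabilities for nn.CTCLoss
```

Setting the bidirectional hidden size to dim // 2 keeps the LSTM output dimension equal to its input dimension, which is what makes the residual addition in the extension layers well defined.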
In the recurrent neural network, every node within a network layer is of the same type: for an LSTM network layer 2, the network nodes are all LSTM network nodes 6; for a residual network layer 3, the network nodes are all residual network nodes 8. As shown in FIG. 2, a residual network node 8 is composed of an LSTM network node 6 and an addition function node 9; the addition function node 9 is also labeled ADD in FIG. 2, and the addition function nodes 9 together form the addition function layer.
Each network layer in the recurrent neural network is a bidirectional network layer; that is, in the width direction of each network layer, different network nodes can pass information to each other, as shown by the two arrowed lines in the dashed circle 7. In FIG. 2, only the network nodes of one layer are shown in detail for each network layer, and an ellipsis indicates that the layer contains more network nodes.
In the depth direction of the recurrent neural network, the number of network nodes is the same in every network layer, with a one-to-one correspondence between layers.
For a residual network node 8, the output of the preceding network node is fed both to the LSTM network node 6 and to the addition function node 9; the output of the LSTM network node 6 within the residual network node 8 is also fed to the addition function node 9, and the output of the addition function node 9 serves as the output of the residual network node 8. When the (K+1)-th network layer is a residual network layer 3, the output signal of the corresponding residual network node 8 of the residual network layer 3 can be expressed by the following formula:
output_{k+1} = LSTM_{k+1}(output_k) + output_k;
where output_{k+1} denotes the output of the residual network node 8 of the (K+1)-th network layer, i.e. the output of the addition function node 9;
output_{k} denotes the output of the residual network node 8 of the K-th network layer, i.e. the output of the addition function node 9;
LSTM_{k+1}() denotes the functional expression of the LSTM network node 6 within the residual network node 8 of the (K+1)-th network layer; and
LSTM_{k+1}(output_k) denotes the output of the LSTM network node 6 within the residual network node 8 of the (K+1)-th network layer when its input is output_k.
For the baseline model, i.e. the first two LSTM network layers 2, the output signal of each LSTM network node 6 is LSTM_{k}(output_{k-1}), where LSTM_{k}() denotes the functional expression of the LSTM network node 6 of the K-th LSTM network layer 2, and LSTM_{k}(output_{k-1}) denotes the output of that node when its input is output_{k-1}.
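A minimal numerical check of this per-layer relation is shown below, assuming for illustration a unidirectional nn.LSTM whose input and output sizes are equal (all sizes are arbitrary choices of this sketch):

```python
# Check of output_{k+1} = LSTM_{k+1}(output_k) + output_k on random data.
import torch
import torch.nn as nn

dim = 8
lstm_k1 = nn.LSTM(dim, dim, batch_first=True)   # plays the role of LSTM_{k+1}()
output_k = torch.randn(2, 5, dim)                # output of the K-th layer (batch=2, T=5)

lstm_out, _ = lstm_k1(output_k)                  # LSTM_{k+1}(output_k)
output_k1 = lstm_out + output_k                  # residual node: add the skip connection
print(output_k1.shape)                           # torch.Size([2, 5, 8])
```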
In the embodiment of the present invention, the recurrent neural network adds residual network layers 3 on top of a baseline model composed of two LSTM network layers 2, and each residual network layer 3 is formed by connecting an LSTM network layer 2 and an addition function layer. The residual network layers 3 can increase the depth of the recurrent neural network while still allowing training to converge, so the network depth can be increased and the training result and performance can thereby be improved.
As shown in FIG. 3, which is a flowchart of a recurrent neural network training method according to an embodiment of the present invention, the training method of the recurrent neural network according to the embodiment of the present invention includes the following steps.
Step 1: provide a baseline model of the recurrent neural network, the baseline model being formed by connecting two LSTM network layers 2. Step 1 corresponds to the step marked 301 in FIG. 3.
Step 2: initialize the baseline model; this initialization corresponds to the step marked 302 in FIG. 3.
Training of the recurrent neural network starts from the first LSTM network layer 2. In FIG. 3, the training step for the first LSTM network layer 2 is not shown explicitly; it is included in the initialization step. The step corresponding to mark 303 in FIG. 3 starts from K=2; values of K greater than 2 correspond to the subsequent training of the extension model.
Step 3: add an extension model on the basis of the baseline model. The extension model includes multiple residual network layers 3; each residual network layer 3 is formed by connecting one LSTM network layer 2 and one addition function layer; the input of the residual network layer 3 is connected to the output of the preceding network layer, the two inputs of the addition function layer are connected respectively to the output of the LSTM network layer 2 of the residual network layer 3 and to the output of the preceding network layer, and the output of the addition function layer serves as the output of the residual network layer 3.
Each time a residual network layer 3 is added, one round of training of the recurrent neural network, i.e. the training corresponding to mark 303, is performed. The sub-steps of adding a residual network layer 3 include the following.
Step 31: add a new residual network layer 3 and let the newly added residual network layer 3 be the (K+1)-th layer. The first K network layers have already been trained; the first K network layers are initialized with the trained model, and the (K+1)-th layer is initialized with random parameters.
As shown in the step corresponding to mark 307, after a residual network layer 3 has been added, K is usually reset (K = K+1) to facilitate the training loop.
Then, as shown in the step corresponding to mark 308, after the value of K has been reset, the first K-1 network layers are initialized with the already-trained parameters, and the K-th network layer is initialized with random parameters.
Step 32: train the (K+1)-th residual network layer 3; that is, perform the step indicated by mark 303.
Step 33: perform a performance test and check whether the improvement in the performance test result is greater than a threshold; that is, perform the step indicated by mark 304.
As shown in the step corresponding to mark 304:
if the improvement in the performance test result is greater than the threshold, step 34 is performed; the threshold in step 33 is 3%;
if the improvement in the performance test result is less than the threshold, step 35 is performed.
Step 34: add the (K+1)-th residual network layer 3 to the recurrent neural network, then repeat step 31.
Step 35: as shown in the step corresponding to mark 309, training ends; stop adding residual network layers 3 and take the existing K network layers as the recurrent neural network.
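A sketch of this layer-by-layer growth procedure is given below in PyTorch-style Python. The names make_model and evaluate, the data loader, and the optimizer settings are hypothetical placeholders introduced only for illustration; only the control flow (train the baseline, add one residual layer at a time, reuse the already-trained parameters, train, and stop once the measured improvement no longer exceeds the 3% threshold) follows the steps described above.

```python
# Illustrative greedy layer-wise training loop for the residual LSTM network.
import torch
import torch.nn as nn

def train_model(model, train_loader, epochs=5):
    """Train the whole network with the CTC loss (simplified)."""
    ctc = nn.CTCLoss(blank=0, zero_infinity=True)
    opt = torch.optim.Adam(model.parameters(), lr=1e-4)
    for _ in range(epochs):
        for spec, targets, in_lens, tgt_lens in train_loader:
            log_probs = model(spec).transpose(0, 1)   # (T, batch, classes) for CTCLoss
            loss = ctc(log_probs, targets, in_lens, tgt_lens)
            opt.zero_grad()
            loss.backward()
            opt.step()
    return model

def grow_network(make_model, train_loader, evaluate, max_residual=7, threshold=0.03):
    """Add residual layers one at a time; stop when the gain is not above the threshold."""
    model = train_model(make_model(num_residual=0), train_loader)   # baseline: 2 LSTM layers
    best_score, best_model = evaluate(model), model
    for k in range(1, max_residual + 1):
        candidate = make_model(num_residual=k)
        candidate.load_state_dict(best_model.state_dict(), strict=False)  # reuse trained layers
        candidate = train_model(candidate, train_loader)   # the new layer keeps its random init
        score = evaluate(candidate)
        if score - best_score <= threshold:                # improvement not above 3%: stop
            break
        best_score, best_model = score, candidate          # keep the deeper network, continue
    return best_model
```

Here evaluate is assumed to return a score for which higher is better; with strict=False, parameters of the newly added layer that are absent from the previous checkpoint simply keep their random initialization, which mirrors step 31.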
The method of the embodiment of the present invention makes it possible for the depth of the residual network layers 3 included in the extension model to be 1 to 7 layers and the depth of the recurrent neural network to be 3 to 9 layers.
In the method of the embodiment of the present invention, the recurrent neural network is used in a speech recognition device.
The speech recognition device includes: a convolutional layer 1, the recurrent neural network, a fully connected layer 4, and a CTC layer 5.
The convolutional layer 1 receives the spectrum signal of the sound; the output of the convolutional layer 1 is connected to the recurrent neural network, and the recurrent neural network is connected to the CTC layer 5 through the fully connected layer 4. The CTC layer 5 provides the CTC loss function and is used to train on the speech signal.
The number of convolutional layers 1 is 1 to 3, and the convolutional layer 1 is usually an invariant convolutional layer.
The fully connected layer 4 is one or more layers.
In the recurrent neural network, every node within a network layer is of the same type: for an LSTM network layer 2, the network nodes are all LSTM network nodes 6; for a residual network layer 3, the network nodes are all residual network nodes 8. As shown in FIG. 2, a residual network node 8 is composed of an LSTM network node 6 and an addition function node 9; the addition function node 9 is also labeled ADD in FIG. 2, and the addition function nodes 9 together form the addition function layer.
Each network layer in the recurrent neural network is a bidirectional network layer; that is, in the width direction of each network layer, different network nodes can pass information to each other, as shown by the two arrowed lines in the dashed circle 7. In FIG. 2, only the network nodes of one layer are shown in detail for each network layer, and an ellipsis indicates that the layer contains more network nodes.
In the depth direction of the recurrent neural network, the number of network nodes is the same in every network layer, with a one-to-one correspondence between layers.
For a residual network node 8, the output of the preceding network node is fed both to the LSTM network node 6 and to the addition function node 9; the output of the LSTM network node 6 within the residual network node 8 is also fed to the addition function node 9, and the output of the addition function node 9 serves as the output of the residual network node 8. When the (K+1)-th network layer is a residual network layer 3, the output signal of the corresponding residual network node 8 of the residual network layer 3 can be expressed by the following formula:
output_{k+1} = LSTM_{k+1}(output_k) + output_k;
where output_{k+1} denotes the output of the residual network node 8 of the (K+1)-th network layer, i.e. the output of the addition function node 9;
output_{k} denotes the output of the residual network node 8 of the K-th network layer, i.e. the output of the addition function node 9;
LSTM_{k+1}() denotes the functional expression of the LSTM network node 6 within the residual network node 8 of the (K+1)-th network layer; and
LSTM_{k+1}(output_k) denotes the output of the LSTM network node 6 within the residual network node 8 of the (K+1)-th network layer when its input is output_k.
For the baseline model, i.e. the first two LSTM network layers 2, the output signal of each LSTM network node 6 is LSTM_{k}(output_{k-1}), where LSTM_{k}() denotes the functional expression of the LSTM network node 6 of the K-th LSTM network layer 2, and LSTM_{k}(output_{k-1}) denotes the output of that node when its input is output_{k-1}.
The present invention has been described in detail above through specific embodiments, but these embodiments do not limit the present invention. Without departing from the principle of the present invention, those skilled in the art can make many modifications and improvements, which should also be regarded as falling within the protection scope of the present invention.

Claims (18)

  1. A recurrent neural network, characterized in that it comprises:
    a baseline model, formed by connecting two LSTM network layers; and
    an extension model, the extension model comprising multiple residual network layers, wherein each residual network layer is formed by connecting one LSTM network layer and one addition function layer, the input of the residual network layer is connected to the output of the preceding network layer, the two inputs of the addition function layer are connected respectively to the output of the LSTM network layer of the residual network layer and to the output of the preceding network layer, and the output of the addition function layer serves as the output of the residual network layer.
  2. The recurrent neural network according to claim 1, characterized in that the depth of the residual network layers included in the extension model is 1 to 7 layers, and the depth of the recurrent neural network is 3 to 9 layers.
  3. The recurrent neural network according to claim 2, characterized in that the extension depth of the extension model is determined by training: when adding one more residual network layer makes the training result worse, the depth before that residual network layer was added is taken as the depth of the recurrent neural network.
  4. The recurrent neural network according to claim 1, characterized in that the recurrent neural network is used in a speech recognition device.
  5. The recurrent neural network according to claim 4, characterized in that the speech recognition device comprises: a convolutional layer, the recurrent neural network, a fully connected layer, and a CTC layer;
    the convolutional layer receives the spectrum signal of the sound, the output of the convolutional layer is connected to the recurrent neural network, and the recurrent neural network is connected to the CTC layer through the fully connected layer.
  6. The recurrent neural network according to claim 5, characterized in that the convolutional layer is 1 to 3 layers.
  7. The recurrent neural network according to claim 5, characterized in that the fully connected layer is one or more layers.
  8. The recurrent neural network according to claim 5, characterized in that in the recurrent neural network every node within a network layer is of the same type: for an LSTM network layer, the network nodes are all LSTM network nodes; for a residual network layer, the network nodes are all residual network nodes.
  9. The recurrent neural network according to claim 8, characterized in that each network layer in the recurrent neural network is a bidirectional network layer.
  10. A training method for a recurrent neural network, characterized in that it comprises the following steps:
    step 1: providing a baseline model of the recurrent neural network, the baseline model being formed by connecting two LSTM network layers;
    step 2: initializing the baseline model, and training the recurrent neural network starting from the first LSTM network layer;
    step 3: adding an extension model on the basis of the baseline model, the extension model comprising multiple residual network layers, wherein each residual network layer is formed by connecting one LSTM network layer and one addition function layer, the input of the residual network layer is connected to the output of the preceding network layer, the two inputs of the addition function layer are connected respectively to the output of the LSTM network layer of the residual network layer and to the output of the preceding network layer, and the output of the addition function layer serves as the output of the residual network layer;
    wherein each time a residual network layer is added, one round of training of the recurrent neural network is performed, and the sub-steps of adding a residual network layer comprise:
    step 31: adding a new residual network layer and letting the newly added residual network layer be the (K+1)-th layer, wherein the first K network layers have already been trained, the first K network layers are initialized with the trained model, and the (K+1)-th layer is initialized with random parameters;
    step 32: training the (K+1)-th residual network layer;
    step 33: performing a performance test and checking whether the improvement in the performance test result is greater than a threshold;
    if the improvement in the performance test result is greater than the threshold, performing step 34;
    if the improvement in the performance test result is less than the threshold, performing step 35;
    step 34: adding the (K+1)-th residual network layer to the recurrent neural network, and then repeating step 31;
    step 35: ending training, stopping the addition of residual network layers, and taking the existing K network layers as the recurrent neural network.
  11. The training method for a recurrent neural network according to claim 10, characterized in that the depth of the residual network layers included in the extension model is 1 to 7 layers, and the depth of the recurrent neural network is 3 to 9 layers.
  12. The training method for a recurrent neural network according to claim 10, characterized in that the threshold in step 33 is 3%.
  13. The training method for a recurrent neural network according to claim 10, characterized in that the recurrent neural network is used in a speech recognition device.
  14. The training method for a recurrent neural network according to claim 13, characterized in that the speech recognition device comprises: a convolutional layer, the recurrent neural network, a fully connected layer, and a CTC layer;
    the convolutional layer receives the spectrum signal of the sound, the output of the convolutional layer is connected to the recurrent neural network, and the recurrent neural network is connected to the CTC layer through the fully connected layer.
  15. The training method for a recurrent neural network according to claim 14, characterized in that the convolutional layer is 1 to 3 layers.
  16. The training method for a recurrent neural network according to claim 14, characterized in that the fully connected layer is one or more layers.
  17. The training method for a recurrent neural network according to claim 14, characterized in that in the recurrent neural network every node within a network layer is of the same type: for an LSTM network layer, the network nodes are all LSTM network nodes; for a residual network layer, the network nodes are all residual network nodes.
  18. The training method for a recurrent neural network according to claim 17, characterized in that each network layer in the recurrent neural network is a bidirectional network layer.
PCT/CN2020/105359 2020-04-22 2020-07-29 Recurrent neural network and training method therefor WO2021212684A1 (en)

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
CN202010323668.5 2020-04-22
CN202010323668.5A CN111401530B (en) 2020-04-22 2020-04-22 Training method for neural network of voice recognition device

Publications (1)

Publication Number Publication Date
WO2021212684A1 true WO2021212684A1 (en) 2021-10-28

Family

ID=71429759

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/CN2020/105359 WO2021212684A1 (en) 2020-04-22 2020-07-29 Recurrent neural network and training method therefor

Country Status (2)

Country Link
CN (1) CN111401530B (en)
WO (1) WO2021212684A1 (en)

Families Citing this family (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN111401530B (en) * 2020-04-22 2021-04-09 上海依图网络科技有限公司 Training method for neural network of voice recognition device

Family Cites Families (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN107562784A (en) * 2017-07-25 2018-01-09 同济大学 Short text classification method based on ResLCNN models
US20190130896A1 (en) * 2017-10-26 2019-05-02 Salesforce.Com, Inc. Regularization Techniques for End-To-End Speech Recognition
CN108847223B (en) * 2018-06-20 2020-09-29 陕西科技大学 Voice recognition method based on deep residual error neural network
CN110895933B (en) * 2018-09-05 2022-05-03 中国科学院声学研究所 Far-field speech recognition method based on space-time residual error neural network
WO2020077232A1 (en) * 2018-10-12 2020-04-16 Cambridge Cancer Genomics Limited Methods and systems for nucleic acid variant detection and analysis
CN110148408A (en) * 2019-05-29 2019-08-20 上海电力学院 A kind of Chinese speech recognition method based on depth residual error

Patent Citations (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US10192327B1 (en) * 2016-02-04 2019-01-29 Google Llc Image compression with recurrent neural networks
CN109767759A (en) * 2019-02-14 2019-05-17 重庆邮电大学 End-to-end speech recognition methods based on modified CLDNN structure
CN110992941A (en) * 2019-10-22 2020-04-10 国网天津静海供电有限公司 Power grid dispatching voice recognition method and device based on spectrogram
CN111401530A (en) * 2020-04-22 2020-07-10 上海依图网络科技有限公司 Recurrent neural network and training method thereof

Non-Patent Citations (1)

* Cited by examiner, † Cited by third party
Title
AADITYA PRAKASH, HASAN SADID A, LEE KATHY, DATLA VIVEK, QADIR ASHEQUL, LIU JOEY, FARRI OLADIMEJI: "Neural Paraphrase Generation with Stacked Residual LSTM Networks", COLING 2016, 10 October 2016 (2016-10-10), XP055355306, Retrieved from the Internet <URL:https://arxiv.org/pdf/1610.03098.pdf> [retrieved on 20170315] *

Cited By (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN114756977A (en) * 2022-06-16 2022-07-15 成都飞机工业(集团)有限责任公司 Method, device and equipment for predicting boring cutter yield of intersection hole of airplane and storage medium

Also Published As

Publication number Publication date
CN111401530A (en) 2020-07-10
CN111401530B (en) 2021-04-09

Similar Documents

Publication Publication Date Title
WO2021212684A1 (en) Recurrent neural network and training method therefor
Liu et al. Deep neural network architectures for modulation classification
Liang et al. An iterative BP-CNN architecture for channel decoding
Kim et al. Physical layer communication via deep learning
WO2021057038A1 (en) Apparatus and method for speech recognition and keyword detection based on multi-task model
CN108282264A (en) The polarization code coding method of list algorithm is serially eliminated based on bit reversal
Che et al. Spatial-temporal hybrid feature extraction network for few-shot automatic modulation classification
CN109361404A (en) A kind of LDPC decoding system and interpretation method based on semi-supervised deep learning network
US20230299872A1 (en) Neural Network-Based Communication Method and Related Apparatus
CN108510982A (en) Audio event detection method, device and computer readable storage medium
US11546086B2 (en) Channel decoding method and channel decoding device
Ye et al. Circular convolutional auto-encoder for channel coding
CN113676266B (en) Channel modeling method based on quantum generation countermeasure network
CN115510319A (en) Recommendation method and system based on potential interest multi-view fusion
CN110895933B (en) Far-field speech recognition method based on space-time residual error neural network
CN102832950B (en) A kind of frame error rate method of estimation of low density parity check code
CN112329524A (en) Signal classification and identification method, system and equipment based on deep time sequence neural network
CN115987722A (en) Deep learning assisted OFDM channel estimation and signal detection method
Rupp et al. Supervised learning of perceptron and output feedback dynamic networks: A feedback analysis via the small gain theorem
CN108365918A (en) A kind of multielement LDPC code coding method based on criterion in active set
KR20210058548A (en) Method for modeling automatic transmission using artificial neural network
CN113033695A (en) Method for predicting faults of electronic device
CN112994840B (en) Decoder based on neural network
WO2023226635A1 (en) Channel encoding/decoding method and apparatus
Salami et al. Belief control strategies for interactions over weak graphs

Legal Events

Date Code Title Description
121 Ep: the epo has been informed by wipo that ep was designated in this application

Ref document number: 20931780

Country of ref document: EP

Kind code of ref document: A1

NENP Non-entry into the national phase

Ref country code: DE

122 Ep: pct application non-entry in european phase

Ref document number: 20931780

Country of ref document: EP

Kind code of ref document: A1

32PN Ep: public notification in the ep bulletin as address of the adressee cannot be established

Free format text: NOTING OF LOSS OF RIGHTS PURSUANT TO RULE 112(1) EPC (EPO FORM 1205A DATED 19.05.2023)