CN108197701A - Multi-task learning method based on an RNN - Google Patents

Multi-task learning method based on an RNN

Info

Publication number
CN108197701A
Authority
CN
China
Prior art keywords
task
shared
neural network
gradient
rnn
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN201810112482.8A
Other languages
Chinese (zh)
Inventor
王磊
翟荣安
王纯配
顾仓
王毓
刘晶晶
王飞
于振中
李文兴
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
HRG International Institute for Research and Innovation
Original Assignee
HRG International Institute for Research and Innovation
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by HRG International Institute for Research and Innovation
Priority to CN201810112482.8A
Publication of CN108197701A
Current legal status: Pending


Classifications

    • G - PHYSICS
    • G06 - COMPUTING; CALCULATING OR COUNTING
    • G06N - COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00 - Computing arrangements based on biological models
    • G06N3/02 - Neural networks
    • G06N3/04 - Architecture, e.g. interconnection topology
    • G06N3/044 - Recurrent networks, e.g. Hopfield networks
    • G - PHYSICS
    • G06 - COMPUTING; CALCULATING OR COUNTING
    • G06N - COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00 - Computing arrangements based on biological models
    • G06N3/02 - Neural networks
    • G06N3/04 - Architecture, e.g. interconnection topology
    • G06N3/045 - Combinations of networks
    • G - PHYSICS
    • G06 - COMPUTING; CALCULATING OR COUNTING
    • G06N - COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00 - Computing arrangements based on biological models
    • G06N3/02 - Neural networks
    • G06N3/08 - Learning methods
    • G06N3/084 - Backpropagation, e.g. using gradient descent

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • Data Mining & Analysis (AREA)
  • General Health & Medical Sciences (AREA)
  • Biomedical Technology (AREA)
  • Biophysics (AREA)
  • Computational Linguistics (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Evolutionary Computation (AREA)
  • Artificial Intelligence (AREA)
  • Molecular Biology (AREA)
  • Computing Systems (AREA)
  • General Engineering & Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Mathematical Physics (AREA)
  • Software Systems (AREA)
  • Health & Medical Sciences (AREA)
  • Image Analysis (AREA)

Abstract

The present invention provides a multi-task learning method based on an RNN, the method comprising the following steps. Step S1: initialize the system parameters θ = (W, U, B, V). Step S2: input the samples x_{1,i}, …, x_{R,i}, learn the shared information X_co, and feed the shared information as compensation into the training of each individual task. Step S3: compute the predicted label vector ŷ_{r,i} output by each neural network, and compute the loss L_{r,i} of task r. Step S4: solve for the gradients of θ = (W, U, B, V) by gradient descent and the BPTT algorithm, and determine the gradient of task r with respect to the shared information X_co. Step S5: determine the learning rate η and update each weight by W = W - η·δ_W. Step S6: judge whether the neural network has reached a stable state; if so, go to step S7; if not, return to step S2 and iteratively update the model parameters. Step S7: output the optimized model. The present invention can effectively use an RNN to learn the features shared across multiple tasks and feed these shared features into the learning of each individual task, realizing information sharing. Moreover, by adopting a GRU structure in the RNN, the vanishing-gradient problem can be effectively alleviated.

Description

Multi-task learning method based on an RNN
Technical field
The present invention relates to the field of multi-task learning with neural networks, and in particular to a multi-task learning method based on an RNN.
Background technology
In practical applications, different tasks can be related to each other in various ways, and multi-task learning is often advantageous over single-task learning. For example, when only a small amount of data is available for each task, multi-task learning can learn jointly from the data sets of multiple related tasks. Tasks may also be connected because they share some latent common representation; in object recognition, for instance, the early stages of the human visual system represent all targets through a common learned feature set. Most previous multi-task learning methods connect the tasks through some functional notion of relatedness.
For modelling sequence data, models based on neural networks (Neural Network) have achieved excellent results in speech recognition, language modelling and video classification. Most of these models belong to two families: feedforward neural networks (Feedforward Neural Network) and recurrent neural networks (Recurrent Neural Network). In theory a traditional RNN can learn arbitrarily long sequence dependencies, but in practice its learning ability weakens as the time interval grows, and the recurrent structure is hard to train because it is prone to vanishing and exploding gradients. Many structures have been proposed to alleviate the vanishing-gradient problem, including the long short-term memory (LSTM) recurrent neural network. However, the LSTM structure is relatively complex, which costs more training time, and gradients may still vanish during back-propagation. To address this, the gated recurrent unit (GRU), a simpler structure, was proposed; compared with the LSTM it is easier to implement and train. Its structure is shown in Fig. 1.
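By way of illustration only, the GRU update of Fig. 1 can be sketched as follows (a minimal NumPy sketch with biases omitted; the weight names Wz, Uz, Wr, Ur, Wh, Uh are illustrative and not taken from this description):

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def gru_step(x_t, h_prev, Wz, Uz, Wr, Ur, Wh, Uh):
    """One GRU step: gates decide how much of the previous state to keep."""
    z = sigmoid(Wz @ x_t + Uz @ h_prev)               # update gate
    r = sigmoid(Wr @ x_t + Ur @ h_prev)               # reset gate
    h_tilde = np.tanh(Wh @ x_t + Uh @ (r * h_prev))   # candidate state
    return (1.0 - z) * h_prev + z * h_tilde            # new hidden state
```

The GRU merges the forget and input gates of the LSTM into a single update gate, which is why it has fewer parameters and is simpler to train, as noted above.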
Existing multi-task learning methods mostly connect the relationship between tasks through some functional notion. For example, Baxter determines the relatedness between tasks by a single model selection criterion, i.e. there exists one optimal hypothesis class shared by the tasks. Most existing multi-task learning structures are relatively complex, such as the LSTM model, and gradients easily vanish during back-propagation. On this basis, the present invention provides a multi-task learning method based on an RNN in which the RNN has a GRU structure; this effectively prevents the vanishing-gradient problem, is simpler than the LSTM structure, and makes the obtained features more accurate.
Summary of the invention
In view of the defects of the prior art, the present invention provides a multi-task learning method based on an RNN. Exploiting the RNN's ability to learn contextual information and the fact that, in multi-task learning, the participation of shared information can improve the learning of each individual task, the method uses an RNN to learn the features shared across tasks, feeds these shared features as an input compensation into the learning of each individual task, and finally learns each task through a feedforward compensation layer (Feed Forward Layer, FF). The RNN with a GRU structure effectively prevents the vanishing-gradient problem, is simpler than the LSTM structure, and yields more accurate features.
To achieve the above object, the present invention provides a multi-task learning method based on an RNN, the method comprising the following steps:
Step S1: initialize the system parameters θ = (W, U, B, V), where W denotes the weight matrix connecting the layers of the neural network, U denotes the weight matrix applied to the data entering the neural network, B denotes the bias matrix between the layers of the neural network, and V denotes the weight matrix from the hidden layer of the neural network to the softmax layer;
Step S2: input the samples x_{1,i}, …, x_{R,i}, learn the shared information X_co, and feed the shared information as compensation into the training of each individual task;
Step S3: compute the predicted label vector ŷ_{r,i} output by each neural network, and compute the loss L_{r,i} of task r;
Step S4: solve for the gradients of θ = (W, U, B, V) by gradient descent and the BPTT algorithm, and determine the gradient of task r with respect to the shared information X_co;
Step S5: determine the learning rate η and update each weight by W = W - η·δ_W, where δ_W denotes the partial derivative of the weight matrix obtained by gradient descent during back-propagation of the neural network;
Step S6: judge whether the neural network has reached a stable state; if so, go to step S7; if not, return to step S2 and iteratively update the model parameters;
Step S7: output the optimized model.
Wherein, said step S2 further comprises: drawing one sample from each task and inputting it into the RNN, learning a context vector as shared information, and obtaining the shared information X_co.
Wherein, said step S3 further comprises: feeding the shared information X_co as an input compensation into the learning of each individual task, learning each task through a feedforward compensation layer (Feed Forward Layer, FF), generating the predicted label vector ŷ_{r,i}, and computing the loss L_{r,i} of task r from the generated predicted label vector ŷ_{r,i}.
Wherein, said step S3 further comprises: feeding the hidden-layer output h^(r) into the output layer and producing the predicted label vector through the softmax layer, ŷ_{r,i} = softmax(z_{r,i}), where z_{r,i} = V^(r)·h^(r) and h^(r) = g(U^(r)·x_{r,i} + W^(r)·X_co + b^(r)), g(·) denotes the sigmoid activation function, U^(r), W^(r), V^(r) are weight matrices and b^(r) is the bias vector; the loss of task r is L_{r,i}.
Wherein, said step S5 further comprises determining the learning rate by one of the following: η = A·e^(-λn), where n is the number of iterations in the network training process, 1 ≤ A ≤ 50 and 0.0001 ≤ λ ≤ 0.001; or η(k) = e^(-λ(k-1)), where 0.0001 ≤ λ ≤ 0.001 and k is the number of iterations.
Wherein, said step S7 further comprises: determining the objective function min_θ Σ_{r=1}^{R} Σ_{i=1}^{N_l} L_{r,i} + λ‖θ‖² and minimizing it, where λ is the regularization coefficient.
The present invention can effectively use an RNN to learn the features shared across multiple tasks and feed these shared features into the learning of each individual task, realizing information sharing. Moreover, by adopting a GRU structure in the RNN, the vanishing-gradient problem can be effectively alleviated.
The features and advantages of the present invention will become apparent from the following detailed description of specific embodiments with reference to the accompanying drawings.
Description of the drawings
Fig. 1 is a structural diagram of the gated recurrent unit (GRU) of the prior art;
Fig. 2 is a schematic diagram of generating the predicted label vector in the RNN multi-task learning based on shared-feature compensation according to the present invention;
Fig. 3 is a schematic flow diagram of the parameter-iteration update in the RNN-based multi-task learning method of the present invention.
Specific embodiment
In order to make the technical solution of the present invention clearer, it is described in further detail below with reference to the accompanying drawings. It should be understood that the specific embodiments described herein are merely illustrative of the present invention and are not intended to limit it.
Fig. 2 is a schematic diagram of generating the predicted label vector in the RNN multi-task learning based on shared-feature compensation according to the present invention. The specific method is as follows.
Define x_{r,i} ∈ R^{M_r}, i = 1, …, N_r, as the samples of task r, where N_r denotes the number of samples of the task and M_r denotes the sample dimension. We assume that every task has the same number of samples and write N_r = N for all r, so the samples of the different tasks (views) are represented as {x_{r,i} | r = 1, …, R; i = 1, …, N}.
We divide the samples into two parts: N_l labelled samples are used for training and N_u unlabelled samples are used for testing, with N_l + N_u = N. Our objective function is min_θ Σ_{r=1}^{R} Σ_{i=1}^{N_l} L_{r,i} + λ‖θ‖², where L_{r,i} denotes the loss of the i-th sample of task r, λ is the regularization coefficient and θ denotes the weight matrices.
We draw one sample from each task, input it into the RNN, and learn a context vector as the shared information. After obtaining the shared information X_co, we use R feedforward compensation neural networks to learn each task separately, so that the learning of each task involves both the shared information and its own private information; in this way the dependencies between tasks can be better exploited.
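As an illustrative sketch of this shared-information step (assuming, for simplicity, that all task samples have the same dimensionality and using a plain tanh recurrent cell in place of the GRU of Fig. 1):

```python
import numpy as np

def rnn_step(x_t, h_prev, U, W, b):
    # simple tanh recurrent cell standing in for the GRU of Fig. 1
    return np.tanh(U @ x_t + W @ h_prev + b)

def learn_shared_info(samples, U, W, b, hidden_dim):
    """samples: list of the R per-task sample vectors x_{1,i}, ..., x_{R,i}."""
    h = np.zeros(hidden_dim)
    for x in samples:        # feed one sample per task, in sequence
        h = rnn_step(x, h, U, W, b)
    return h                 # final context vector, used as X_co
```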
The input layer of the neural network of task r takes the sample x_{r,i} and the shared information X_co and feeds them into the hidden layer:
h^(r) = g(U^(r)·x_{r,i} + W^(r)·X_co + b^(r))
where g(·) denotes the sigmoid activation function, U^(r) and W^(r) are weight matrices and b^(r) is the bias vector. Since each task uses a different amount of the shared information, the degree to which the shared information X_co participates in the training of task r is determined by the weight matrix W^(r). The hidden-layer output h^(r) is then fed into the output layer, and the softmax layer produces the predicted label vector ŷ_{r,i} = softmax(z_{r,i}), where z_{r,i} = V^(r)·h^(r) and V^(r) is the weight matrix from the hidden layer to the output layer. We define the loss of task r as L_{r,i}.
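A minimal NumPy sketch of the per-task compensation layer defined by the formulas above (the cross-entropy form of L_{r,i} is an assumption, since the description only names the loss):

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def softmax(z):
    e = np.exp(z - z.max())
    return e / e.sum()

def task_forward(x_ri, X_co, U_r, W_r, b_r, V_r, y_true):
    """Forward pass of task r with shared-information compensation."""
    h_r = sigmoid(U_r @ x_ri + W_r @ X_co + b_r)      # h^(r)
    z_ri = V_r @ h_r                                   # z_{r,i}
    y_hat = softmax(z_ri)                              # predicted label vector
    loss = -np.sum(y_true * np.log(y_hat + 1e-12))     # assumed cross-entropy L_{r,i}
    return y_hat, loss
```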
As shown in Fig. 3, the present invention provides a multi-task learning method based on an RNN. The method includes a parameter-iteration update flow that specifically comprises the following steps.
Step S1: initialize the system parameters θ = (W, U, B, V), where W denotes the weight matrix connecting the layers of the neural network, U denotes the weight matrix applied to the data entering the neural network, B denotes the bias matrix between the layers of the neural network, and V denotes the weight matrix from the hidden layer of the neural network to the softmax layer;
In this step, θ = (W, U, B, V) means that the parameter set θ consists of W, U, B and V. Initializing the system parameters θ = (W, U, B, V) simply means supplying initial values for W, U, B and V, which can be set in advance according to the actual situation.
Step S2: input the samples x_{1,i}, …, x_{R,i}, learn the shared information X_co, and feed the shared information as compensation into the training of each individual task;
In this step, x_{1,i}, …, x_{R,i} are the samples drawn from the respective tasks. Each sample is input into the RNN and learned, and a context vector is learned as the shared information, obtaining the shared information X_co.
Step S3: compute the predicted label vector ŷ_{r,i} output by each neural network, and compute the loss L_{r,i} of task r;
In this step, the shared information X_co is fed as an input compensation into the learning of each individual task; each task is learned through a feedforward compensation layer (Feed Forward Layer, FF), generating the predicted label vector ŷ_{r,i}, and the loss L_{r,i} of task r is computed from the generated predicted label vector ŷ_{r,i}.
The hidden-layer output h^(r) is fed into the output layer, and the softmax layer produces the predicted label vector ŷ_{r,i} = softmax(z_{r,i}), where z_{r,i} = V^(r)·h^(r) and h^(r) = g(U^(r)·x_{r,i} + W^(r)·X_co + b^(r)), g(·) denotes the sigmoid activation function, U^(r), W^(r), V^(r) are weight matrices and b^(r) is the bias vector. The loss of task r is L_{r,i}.
Step S4: solve for the gradients of θ = (W, U, B, V) by gradient descent and the BPTT algorithm, and determine the gradient of task r with respect to the shared information X_co;
In this step, gradient descent (Gradient Descent) is one of the most commonly used methods for solving the model parameters of machine learning algorithms, i.e. unconstrained optimization problems. When minimizing a loss function in a machine learning algorithm, gradient descent can be used to iteratively solve for the parameters step by step and minimize the loss function. The gradient descent algorithm can be expressed either algebraically or in matrix (vector) form; the algebraic form is easier to understand, while the matrix form is more concise. Gradient descent methods include batch gradient descent (Batch Gradient Descent), stochastic gradient descent (Stochastic Gradient Descent) and mini-batch gradient descent (Mini-batch Gradient Descent). Batch gradient descent is the most common form: all samples are used for each parameter update, and the usual gradient-descent algorithm for linear regression is of this type. In terms of training speed, stochastic gradient descent uses only one sample per iteration, so it trains quickly; however, because the gradient direction is determined by a single sample, the resulting solution may well not be optimal. In terms of convergence, because stochastic gradient descent iterates over one sample at a time, the iteration direction varies greatly and it cannot quickly converge to a local optimum. The update formula of stochastic gradient descent is θ ← θ - η·∂L_i/∂θ, where L_i is the loss on a single sample. Mini-batch gradient descent is a compromise between batch gradient descent and stochastic gradient descent. The present invention can use any of the gradient descent methods described above to solve for the gradients of θ = (W, U, B, V).
The BPTT algorithm (back-propagation through time) is a commonly used algorithm for back-propagating errors through time in recurrent neural networks.
The gradients of θ = (W, U, B, V) are solved according to the gradient descent method and the BPTT algorithm described above, and the gradient of task r with respect to the shared information X_co is further determined.
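For illustration only, once BPTT has produced the partial derivatives δ_W, δ_U, δ_B, δ_V, the weight update of step S5 can be sketched as follows (the dictionary layout of θ is an illustrative convention, not part of this description):

```python
def update_parameters(theta, grads, eta):
    """theta and grads are dicts over the keys 'W', 'U', 'B', 'V'."""
    for name in theta:
        theta[name] = theta[name] - eta * grads[name]   # e.g. W = W - eta * delta_W
    return theta
```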
Step S5: determine the learning rate η and update each weight by W = W - η·δ_W, where δ_W denotes the partial derivative of the weight matrix obtained by gradient descent during back-propagation of the neural network;
In this step, the learning rate η is generally set manually and adjusted according to the learning performance of the neural network, typically according to the learning error: as the error gradually decreases, the learning rate is reduced accordingly; for example, the next learning rate may be one tenth of the previous one.
Specifically, the learning rate may also be determined by the following formula: η = A·e^(-λn), where n is the number of iterations in the network training process, 1 ≤ A ≤ 50 and 0.0001 ≤ λ ≤ 0.001;
Alternatively, η(k) = e^(-λ(k-1)), where 0.0001 ≤ λ ≤ 0.001 and k is the number of iterations.
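As an illustrative sketch, the two learning-rate schedules above can be written directly (the decaying exponent of the second schedule is a reading of the formula, and the default A and λ below are merely example values within the stated ranges):

```python
import math

def lr_exponential(n, A=10.0, lam=0.0005):
    """eta = A * exp(-lambda * n), with 1 <= A <= 50 and 1e-4 <= lambda <= 1e-3."""
    return A * math.exp(-lam * n)

def lr_alternative(k, lam=0.0005):
    """eta(k) = exp(-lambda * (k - 1)), assumed decaying form of the second schedule."""
    return math.exp(-lam * (k - 1))
```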
Step S6: judge whether the neural network has reached a stable state; if so, go to step S7; if not, return to step S2 and iteratively update the model parameters;
In this step, whether the neural network has stabilized is generally determined from the overall error between the network output and the true labels: as the number of training epochs increases (one epoch is one complete pass over the training data set), the error curve eventually flattens, and when the error falls below a given threshold the learning of the neural network is considered effective.
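A possible stopping test matching this description (the threshold and window values are illustrative and not taken from this description):

```python
def has_converged(error_history, threshold=1e-3, window=5):
    """Stable if the error curve has levelled off over the last `window` epochs
    and the latest error is below the given threshold."""
    if len(error_history) < window + 1:
        return False
    recent = error_history[-window:]
    flat = max(recent) - min(recent) < 0.1 * threshold   # curve has flattened
    return flat and error_history[-1] < threshold
```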
Step S7: output the optimized model;
In this step, the objective function is determined; specifically, the objective function is L(θ) = Σ_{r=1}^{R} Σ_{i=1}^{N_l} L_{r,i} + λ‖θ‖², where λ is the regularization coefficient. The objective function is then minimized, i.e. the model that minimizes the objective function is the optimal model.
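For illustration only, the regularized objective can be evaluated as follows (the squared-norm form of the penalty on θ is an assumption):

```python
import numpy as np

def objective(per_task_losses, theta, lam):
    """per_task_losses: list over tasks of lists of labelled-sample losses L_{r,i};
    theta: dict of parameter arrays; lam: regularization coefficient lambda."""
    data_term = sum(sum(losses) for losses in per_task_losses)
    reg_term = lam * sum(np.sum(p ** 2) for p in theta.values())  # lambda * ||theta||^2
    return data_term + reg_term
```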
The method provided by the present invention can effectively use an RNN to learn the features shared across multiple tasks and feed these shared features into the learning of each individual task, realizing information sharing. Moreover, by adopting a GRU structure in the RNN, the vanishing-gradient problem can be effectively alleviated.
The foregoing is merely a preferred embodiment of the present invention and is not intended to limit the scope of the invention. Any equivalent structural transformation made using the contents of the description and drawings of the present invention under its inventive concept, and any direct or indirect use in other related technical fields, is likewise included within the patent protection scope of the present invention.

Claims (6)

  1. A multi-task learning method based on an RNN, characterized in that the method comprises the following steps:
    Step S1: initialize the system parameters θ = (W, U, B, V), where W denotes the weight matrix connecting the layers of the neural network, U denotes the weight matrix applied to the data entering the neural network, B denotes the bias matrix between the layers of the neural network, and V denotes the weight matrix from the hidden layer of the neural network to the softmax layer;
    Step S2: input the samples x_{1,i}, …, x_{R,i}, learn the shared information X_co, and feed the shared information as compensation into the training of each individual task;
    Step S3: compute the predicted label vector ŷ_{r,i} output by each neural network, and compute the loss L_{r,i} of task r;
    Step S4: solve for the gradients of θ = (W, U, B, V) by gradient descent and the BPTT algorithm, and determine the gradient of task r with respect to the shared information X_co;
    Step S5: determine the learning rate η and update each weight by W = W - η·δ_W, where δ_W denotes the partial derivative of the weight matrix obtained by gradient descent during back-propagation of the neural network;
    Step S6: judge whether the neural network has reached a stable state; if so, go to step S7; if not, return to step S2 and iteratively update the model parameters;
    Step S7: output the optimized model.
  2. The method according to claim 1, characterized in that step S2 further comprises: drawing one sample from each task and inputting it into the RNN, learning a context vector as shared information, and obtaining the shared information X_co.
  3. The method according to claim 1, characterized in that step S3 further comprises: feeding the shared information X_co as an input compensation into the learning of each individual task, learning each task through a feedforward compensation layer (Feed Forward Layer, FF), generating the predicted label vector ŷ_{r,i}, and computing the loss L_{r,i} of task r from the generated predicted label vector ŷ_{r,i}.
  4. The method according to claim 3, characterized in that step S3 further comprises: feeding the hidden-layer output h^(r) into the output layer and producing the predicted label vector through the softmax layer, ŷ_{r,i} = softmax(z_{r,i}), where z_{r,i} = V^(r)·h^(r) and h^(r) = g(U^(r)·x_{r,i} + W^(r)·X_co + b^(r)), g(·) denotes the sigmoid activation function, U^(r), W^(r), V^(r) are weight matrices and b^(r) is the bias vector; the loss of task r is L_{r,i}.
  5. The method according to claim 1, characterized in that step S5 further comprises determining the learning rate by one of the following: η = A·e^(-λn), where n is the number of iterations in the network training process, 1 ≤ A ≤ 50 and 0.0001 ≤ λ ≤ 0.001; or η(k) = e^(-λ(k-1)), where 0.0001 ≤ λ ≤ 0.001 and k is the number of iterations.
  6. The method according to claim 1, characterized in that step S7 further comprises: determining the objective function min_θ Σ_{r=1}^{R} Σ_{i=1}^{N_l} L_{r,i} + λ‖θ‖² and minimizing it, where λ is the regularization coefficient.
CN201810112482.8A 2018-02-05 2018-02-05 Multi-task learning method based on an RNN Pending CN108197701A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201810112482.8A CN108197701A (en) 2018-02-05 2018-02-05 Multi-task learning method based on an RNN


Publications (1)

Publication Number Publication Date
CN108197701A true CN108197701A (en) 2018-06-22

Family

ID=62592376

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201810112482.8A Pending CN108197701A (en) 2018-02-05 2018-02-05 A kind of multi-task learning method based on RNN

Country Status (1)

Country Link
CN (1) CN108197701A (en)


Non-Patent Citations (1)

* Cited by examiner, † Cited by third party
Title
司马海峰 等: "《遥感图像分类中的智能计算方法》", 31 January 2018, 吉林大学出版社 *

Cited By (13)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN110503640A (en) * 2018-08-21 2019-11-26 深圳科亚医疗科技有限公司 Device, system and the computer-readable medium that medical image is analyzed
US12033056B2 (en) 2018-11-19 2024-07-09 Google Llc Multi-task recurrent neural networks
CN110490128B (en) * 2019-08-16 2022-09-06 南京邮电大学 Handwriting recognition method based on encryption neural network
CN110490128A (en) * 2019-08-16 2019-11-22 南京邮电大学 A kind of hand-written recognition method based on encryption neural network
CN110766231A (en) * 2019-10-30 2020-02-07 上海天壤智能科技有限公司 Crime prediction method and system based on multi-head neural network
CN111222628B (en) * 2019-11-20 2023-09-26 深圳前海微众银行股份有限公司 Method, device, system and readable storage medium for optimizing training of recurrent neural network
CN111222628A (en) * 2019-11-20 2020-06-02 深圳前海微众银行股份有限公司 Method, device and system for optimizing recurrent neural network training and readable storage medium
WO2021162779A1 (en) * 2020-02-13 2021-08-19 Google Llc Multi-stream recurrent neural network transducer(s)
CN111488967A (en) * 2020-02-26 2020-08-04 浙江工业大学 Difference visual analysis method of gradient descent algorithm
CN111815030B (en) * 2020-06-11 2024-02-06 浙江工商大学 Multi-target feature prediction method based on small amount of questionnaire survey data
CN111950602A (en) * 2020-07-21 2020-11-17 江苏大学 Image indexing method based on random gradient descent and multi-example multi-label learning
CN111950602B (en) * 2020-07-21 2024-05-14 江苏大学 Image indexing method based on random gradient descent and multi-example multi-label learning
CN114332914A (en) * 2021-11-29 2022-04-12 中国电子科技集团公司电子科学研究院 Personnel feature identification method, device and computer-readable storage medium


Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination