CN105844332A - Fast recursive Elman neural network modeling and learning algorithm - Google Patents
- Publication number
- CN105844332A CN105844332A CN201610137875.5A CN201610137875A CN105844332A CN 105844332 A CN105844332 A CN 105844332A CN 201610137875 A CN201610137875 A CN 201610137875A CN 105844332 A CN105844332 A CN 105844332A
- Authority
- CN
- China
- Prior art keywords
- layer
- output
- hidden
- error function
- node
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Pending
Classifications
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06N—COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N3/00—Computing arrangements based on biological models
- G06N3/02—Neural networks
- G06N3/08—Learning methods
- G06N3/088—Non-supervised learning, e.g. competitive learning
Abstract
The invention belongs to the field of neural networks and specifically discloses a fast recursive Elman neural network modeling and learning algorithm, which comprises a first step of selecting the model; a second step of initializing the parameters; a third step of computing the error function E_k of the training samples, going to the fourth step if the error function value is less than the error threshold and otherwise computing the gradients of the error function with respect to the weights W_k and V_k; and a fourth step of testing the precision. While optimizing the input weight matrix, the algorithm obtains the weights from the hidden layer to the output layer in a single step by the generalized-inverse method, with no iteration required. In this way, during the weight update the two layers of weights originally computed by the gradient method become one layer, training speed is greatly increased, and the BP algorithm's shortcomings of easily falling into local minima and poor generalization ability are also avoided.
Description
Technical field
The invention belongs to the field of neural networks, and specifically relates to a fast recursive Elman neural network modeling and learning algorithm.
Background art
In the paper "Pham, D.T., and X. Liu. 1992. Dynamic system modeling using partially recurrent neural networks. J. of Systems Engineering, 2:90-97", Pham et al. proposed the modified Elman network (Modified Elman Networks), and it is this network that is now generally treated as the standard Elman network.
Besides an input layer, a hidden layer and an output layer, the Elman network has a special context layer. The units of this layer store the hidden-layer outputs of the previous moment, so the layer can be regarded as a one-step time-delay operator. The Elman neural network has n input nodes, Ñ hidden nodes and m output nodes. The input and recurrent weight matrices are W and V respectively, the merged weight matrix is L = [W, V], and U denotes the weight matrix connecting the hidden nodes to the output nodes.
The training sample set N = {(x_i, t_i) | i = 1, ..., N} is input to the network; H_k denotes the hidden-layer output matrix of the k-th cycle, with H_0 = 0 by definition. Let g(x) be the activation function of the hidden nodes, generally the sigmoid function, i.e. g(x) = 1/(1 + e^(-x)).
The k-th input of the Elman neural network combines the actual input matrix X_k with the hidden-layer output matrix H_{k-1} of the previous cycle, so the input of the hidden layer is V_k H_{k-1} + W_k X_k and the hidden-layer output is H_k = g(V_k H_{k-1} + W_k X_k). The network output is Y_k = f(U H_k), where f(x) is the activation function of the output nodes and is usually taken to be a linear function, so the network output is Y_k = U H_k.
The error function is defined as E_k = (Y_k - T)(Y_k - T)^T, where T is the ideal output. The gradients of E with respect to the merged input/context weight matrix L and the connection weights U from the hidden layer to the output layer, ∂E/∂L and ∂E/∂U, are computed, and L and U are then updated by the steepest descent method: L_{k+1} = L_k - η_k ∂E_k/∂L_k and U_{k+1} = U_k - η_k ∂E_k/∂U_k, where the learning rate η_k is obtained by line search. Iterating in this way yields the optimized network parameters.
This algorithm uses the error back-propagation method and is based mainly on the idea of gradient descent. Because the iteration uses the steepest descent method, the required training time is relatively long, the algorithm easily falls into local minima, and its generalization ability leaves room for improvement.
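The recurrence described above (context layer acting as a one-step delay, sigmoid hidden layer, linear output layer) can be sketched in NumPy. This is a minimal illustration only; all sizes, the (-1, 1) initialization range and the input data are assumptions for the example, not values from the text:

```python
import numpy as np

def sigmoid(x):
    """Hidden-node activation g(x) = 1 / (1 + e^(-x))."""
    return 1.0 / (1.0 + np.exp(-x))

rng = np.random.default_rng(0)
n, n_hidden, m, steps = 4, 8, 2, 5            # illustrative sizes

W = rng.uniform(-1, 1, (n_hidden, n))         # input-to-hidden weights
V = rng.uniform(-1, 1, (n_hidden, n_hidden))  # context (recurrent) weights
U = rng.uniform(-1, 1, (m, n_hidden))         # hidden-to-output weights

X = rng.normal(size=(steps, n))               # one input vector per time step
H = np.zeros(n_hidden)                        # context layer starts at H0 = 0

outputs = []
for x in X:
    # The context layer feeds back the previous hidden output (one-step delay):
    H = sigmoid(V @ H + W @ x)                # H_k = g(V H_{k-1} + W x_k)
    outputs.append(U @ H)                     # linear output layer: y_k = U H_k
Y = np.array(outputs)
```

In the standard algorithm criticized above, all three weight matrices W, V and U would then be adjusted by iterated gradient steps.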
Summary of the invention
In view of the above technical content, the present invention provides a fast algorithm that can markedly increase network training speed, avoid falling into local minima, and improve generalization ability.
The concrete technical scheme is:
The fast recursive Elman neural network modeling and learning algorithm comprises the following steps:
Step 1, model selection
The fast recursive Elman neural network has n input nodes, Ñ hidden nodes and m output nodes;
the input and recurrent weight matrices are W and V respectively, the merged weight matrix is L = [W, V], and U denotes the weight matrix connecting the hidden nodes to the output nodes;
the training sample set N = {(x_i, t_i) | i = 1, ..., N} is input to the network; H_k denotes the hidden-layer output matrix of the k-th cycle, with H_0 = 0 by definition;
g(x) is the activation function of the hidden nodes, taken as the sigmoid function, i.e. g(x) = 1/(1 + e^(-x));
Step 2, parameter initialization
(1). set the initial iteration number k = 0;
(2). define H_0 = 0;
(3). randomly assign the initial weight matrices W (from the input layer to the hidden layer) and V (from the context layer to the hidden layer), choosing random numbers in (-1, 1);
Step 3, optimization of the initial weights
(1). compute the hidden-layer output matrix H_k = g(V_k H_{k-1} + W_k X);
(2). the actual output of the network is Y_k = U_k H_k, and the error function is E_k = Tr[(Y_k - T)(Y_k - T)^T], where T is the ideal output;
the problem of solving for the network weights U_k from the hidden layer to the output layer is then converted into the problem of minimizing the error function, i.e. finding the least-squares solution U_k that makes E_k minimal;
using the Moore-Penrose generalized inverse, U_k = (H_k H_k^T)^{-1} H_k T^T; substituting H_k and T yields the weight matrix U_k from the hidden layer to the output layer;
(3). compute the error function E_k of the training samples; if the error function value is less than the error threshold, go to Step 4; otherwise, compute the gradients ∂E_k/∂W_k and ∂E_k/∂V_k of the error function with respect to the weights W_k and V_k, in which H† denotes the generalized inverse of H;
(4). the weight update formulas are W_{k+1} = W_k - η_k ∂E_k/∂W_k and V_{k+1} = V_k - η_k ∂E_k/∂V_k, where the learning rate η_k is obtained by line search;
(5). set k = k + 1 and return to step (1);
Step 4, testing the precision
From the optimized weights W_k from the input layer to the hidden layer and the context-layer weight matrix V_k, together with the weights U_k from the hidden layer to the output layer obtained in step (2) of Step 3, the network parameters of this algorithm are obtained, and the precision on the test samples is computed.
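The one-step solve for the output weights in step (2) of Step 3 can be illustrated as follows. Sizes and data are illustrative assumptions; `np.linalg.pinv` computes the Moore-Penrose generalized inverse, and the form U = T H† used here matches the closed form above up to transposition conventions:

```python
import numpy as np

rng = np.random.default_rng(1)
n_hidden, m, N = 8, 3, 50           # illustrative sizes

H = rng.normal(size=(n_hidden, N))  # hidden-layer output matrix, one column per sample
T = rng.normal(size=(m, N))         # ideal (target) outputs

# Least-squares solution minimizing E = Tr[(Y - T)(Y - T)^T], obtained in
# one step via the Moore-Penrose generalized inverse -- no iteration needed:
U = T @ np.linalg.pinv(H)
Y = U @ H
E = float(np.trace((Y - T) @ (Y - T).T))
```

Because U is the least-squares minimizer for the given H, any perturbation of U can only increase the error; this closed-form solve is what replaces the gradient iteration for the hidden-to-output layer.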
The fast recursive Elman neural network modeling and learning algorithm provided by the present invention obtains the weights from the hidden layer to the output layer in a single step by the generalized-inverse method while optimizing the input weight matrix, with no iteration required. As a result, in the weight-update process the two layers of weights originally computed by the gradient method become one layer, training speed is greatly increased, and at the same time the BP algorithm's shortcomings of easily falling into local minima and poor generalization ability are avoided.
Brief description of the drawings
Fig. 1 is the structure diagram of the fast recursive Elman neural network of the present invention;
Fig. 2 is the flow chart of the present invention.
Detailed description of the invention
A specific embodiment of the invention is described below with reference to the accompanying drawings.
In a specific implementation, the novel fast recursive Elman neural network modeling and learning algorithm of the present invention includes the following steps:
Step 1,
The training sample set N = {(x_i, t_i) | i = 1, ..., N} is input to the network; the number of input nodes n and the number of output nodes m of the network are determined, and the number of hidden nodes of the network is set to Ñ.
Step 2,
Randomly assign the input and recurrent weight matrices W and V, choosing random numbers in (-1, 1); the merged weight matrix is L = [W, V]; define the hidden-layer output matrix of the cycle H_0 = 0.
Step 3,
Compute the hidden-layer output matrix H_k = g(V_k H_{k-1} + W_k X), where g(x) is the activation function of the hidden nodes, taken as the sigmoid function, i.e. g(x) = 1/(1 + e^(-x)).
Step 4,
Compute the actual output of the network Y_k = U_k H_k and the error function E_k = Tr[(Y_k - T)(Y_k - T)^T], where T is the ideal output; and use the Moore-Penrose generalized inverse, U_k = (H_k H_k^T)^{-1} H_k T^T, substituting H_k and T, to obtain the weight matrix U_k from the hidden layer to the output layer.
Step 5,
Compute the error function E_k of the training samples; if the error function value is less than the error threshold, go to Step 7; otherwise, compute the gradients ∂E_k/∂W_k and ∂E_k/∂V_k of the error function with respect to the weights W_k and V_k, in which H† denotes the generalized inverse of H, and update the weights: W_{k+1} = W_k - η_k ∂E_k/∂W_k and V_{k+1} = V_k - η_k ∂E_k/∂V_k, where the learning rate η_k is obtained by line search.
Step 6,
Set k = k + 1 and return to Step 3.
Step 7,
From the optimized weights W_k from the input layer to the hidden layer and the context-layer weight matrix V_k, together with the computed weights U_k from the hidden layer to the output layer, the network parameters of this algorithm are obtained, and the precision on the test samples is computed.
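The steps above can be sketched end to end on synthetic data. This is a hedged illustration only: the patent's closed-form gradient expressions are not reproduced in this text, so a finite-difference gradient stands in for ∂E_k/∂W_k and ∂E_k/∂V_k, a small fixed learning rate stands in for the line search, and all sizes and data are assumptions:

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

rng = np.random.default_rng(2)
n, n_hidden, m, N = 3, 6, 2, 30          # illustrative sizes

X = rng.normal(size=(n, N))              # training inputs, one column per sample
T = np.vstack([np.sin(X.sum(axis=0)),    # ideal outputs for a toy task
               X[0] * X[1]])

def solve_error(W, V, H_prev):
    """Steps 3-4: compute H, solve U in one step, return (E, U, H)."""
    H = sigmoid(V @ H_prev + W @ X)      # H_k = g(V H_{k-1} + W X)
    U = T @ np.linalg.pinv(H)            # U_k via Moore-Penrose generalized inverse
    R = U @ H - T
    return float(np.trace(R @ R.T)), U, H

W = rng.uniform(-1, 1, (n_hidden, n))    # Step 2: random initial weights in (-1, 1)
V = rng.uniform(-1, 1, (n_hidden, n_hidden))
H_prev = np.zeros((n_hidden, N))         # H0 = 0
eta, eps, threshold = 0.05, 1e-6, 1e-3   # fixed eta stands in for the line search

for k in range(10):                      # Steps 5-6: descend on W and V, re-solve U
    E, U, H = solve_error(W, V, H_prev)
    if E < threshold:                    # Step 5: stop when below the error threshold
        break
    for M in (W, V):
        E_base = solve_error(W, V, H_prev)[0]
        G = np.zeros_like(M)
        for idx in np.ndindex(M.shape):  # finite-difference stand-in for the gradient
            M[idx] += eps
            G[idx] = (solve_error(W, V, H_prev)[0] - E_base) / eps
            M[idx] -= eps
        M -= eta * G                     # W_{k+1} = W_k - eta * dE/dW (same for V)
    H_prev = H                           # this cycle's H feeds the next cycle
E_final, U, H = solve_error(W, V, H_prev)
```

Note that only W and V are updated by gradient steps; U is re-obtained in closed form inside every cycle, which is the one-layer-instead-of-two saving the text describes.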
The existing Elman algorithm, the ELM (Extreme Learning Machine) algorithm and the algorithm of the present invention were tested on the MNIST data set, and their results compared.
The MNIST data set is a handwritten-digit database created by Corinna Cortes of Google Labs and Yann LeCun of the Courant Institute at New York University. The training set contains 60,000 images of handwritten digits 0-9, and the test set contains 10,000. Every image has 8-bit gray levels and can be represented by a vector of 784 elements. It is one of the more common machine-learning data sets. In this experiment, all three algorithms use the sigmoid activation function, and the number of hidden nodes is 128 in every case.
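The "precision" compared in this experiment is classification accuracy: on MNIST, the predicted digit is the argmax of the linear network output, compared against the one-hot target. A minimal sketch with hypothetical stand-in outputs (not real MNIST predictions):

```python
import numpy as np

def accuracy(Y, T):
    """Fraction of samples whose predicted class (argmax over the output
    rows) matches the one-hot target class. Columns are samples."""
    return float(np.mean(np.argmax(Y, axis=0) == np.argmax(T, axis=0)))

# 3 classes, 4 samples: columns are samples, targets are one-hot
T = np.array([[1, 0, 0, 0],
              [0, 1, 0, 1],
              [0, 0, 1, 0]], dtype=float)
Y = np.array([[0.90, 0.10, 0.20, 0.60],   # the 4th sample is misclassified
              [0.05, 0.80, 0.10, 0.30],
              [0.05, 0.10, 0.70, 0.10]])
acc = accuracy(Y, T)                       # 3 of 4 correct -> 0.75
```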
The experimental results of the Elman algorithm, the ELM algorithm and this algorithm are shown in Table 1.
Table 1
As can be seen from Table 1, on the relatively large MNIST data set, with the same number of iterations, and even when the Elman algorithm is given twice as many iterations as this algorithm, both the training precision and the prediction precision obtained by this algorithm are higher than those of Elman. The precision of this algorithm is also higher than that of ELM, which likewise uses the generalized-inverse idea to obtain the network parameters. This shows that, compared with the other two algorithms, this algorithm improves the generalization ability of the network and effectively raises its precision.
Claims (1)
1. A fast recursive Elman neural network modeling and learning algorithm, characterized in that it comprises the following steps:
Step 1, model selection
the fast recursive Elman neural network has n input nodes, Ñ hidden nodes and m output nodes;
the input and recurrent weight matrices are W and V respectively, the merged weight matrix is L = [W, V], and U denotes the weight matrix connecting the hidden nodes to the output nodes;
the training sample set N = {(x_i, t_i) | i = 1, ..., N} is input to the network; H_k denotes the hidden-layer output matrix of the k-th cycle, with H_0 = 0 by definition;
g(x) is the activation function of the hidden nodes, taken as the sigmoid function, i.e. g(x) = 1/(1 + e^(-x));
Step 2, parameter initialization
(1). set the initial iteration number k = 0;
(2). define H_0 = 0;
(3). randomly assign the initial weight matrices W (from the input layer to the hidden layer) and V (from the context layer to the hidden layer), choosing random numbers in (-1, 1);
Step 3, optimization of the initial weights
(1). compute the hidden-layer output matrix H_k = g(V_k H_{k-1} + W_k X);
(2). the actual output of the network is Y_k = U_k H_k, and the error function is E_k = Tr[(Y_k - T)(Y_k - T)^T], where T is the ideal output;
the problem of solving for the network weights U_k from the hidden layer to the output layer is then converted into the problem of minimizing the error function, i.e. finding the least-squares solution U_k that makes E_k minimal;
using the Moore-Penrose generalized inverse, U_k = (H_k H_k^T)^{-1} H_k T^T; substituting H_k and T yields the weight matrix U_k from the hidden layer to the output layer;
(3). compute the error function E_k of the training samples; if the error function value is less than the error threshold, go to Step 4; otherwise, compute the gradients ∂E_k/∂W_k and ∂E_k/∂V_k of the error function with respect to the weights W_k and V_k, in which H† denotes the generalized inverse of H;
(4). the weight update formulas are W_{k+1} = W_k - η_k ∂E_k/∂W_k and V_{k+1} = V_k - η_k ∂E_k/∂V_k, where the learning rate η_k is obtained by line search;
(5). set k = k + 1 and return to step (1);
Step 4, testing the precision
from the optimized weights W_k from the input layer to the hidden layer and the context-layer weight matrix V_k, together with the weights U_k from the hidden layer to the output layer obtained in step (2) of Step 3, the network parameters of this algorithm are obtained, and the precision on the test samples is computed.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201610137875.5A CN105844332A (en) | 2016-03-10 | 2016-03-10 | Fast recursive Elman neural network modeling and learning algorithm |
Publications (1)
Publication Number | Publication Date |
---|---|
CN105844332A true CN105844332A (en) | 2016-08-10 |
Family
ID=56587040
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN201610137875.5A Pending CN105844332A (en) | 2016-03-10 | 2016-03-10 | Fast recursive Elman neural network modeling and learning algorithm |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN105844332A (en) |
Cited By (10)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN106407711A (en) * | 2016-10-10 | 2017-02-15 | 重庆科技学院 | Recommendation method and recommendation system of pet feeding based on cloud data |
CN106472332A (en) * | 2016-10-10 | 2017-03-08 | 重庆科技学院 | Pet feeding method and system based on dynamic intelligent algorithm |
CN109782606A (en) * | 2019-03-12 | 2019-05-21 | 北京理工大学珠海学院 | Recursive small echo Ai Erman neural network based on modified form gravitation searching method |
CN110362881A (en) * | 2019-06-25 | 2019-10-22 | 西安电子科技大学 | Microwave power device nonlinear model method based on extreme learning machine |
CN110838087A (en) * | 2019-11-13 | 2020-02-25 | 燕山大学 | Image super-resolution reconstruction method and system |
CN111881990A (en) * | 2020-08-03 | 2020-11-03 | 江南大学 | Construction type neural network parameter fusion optimization method for digital image recognition |
CN112819086A (en) * | 2021-02-10 | 2021-05-18 | 北京工业大学 | Image classification method for calculating global optimal solution of single-hidden-layer ReLU neural network by dividing network space |
CN112926727A (en) * | 2021-02-10 | 2021-06-08 | 北京工业大学 | Solving method for local minimum value of single hidden layer ReLU neural network |
CN113962369A (en) * | 2021-11-29 | 2022-01-21 | 北京工业大学 | Radial basis function neural network optimization method based on improved Levenberg-Marquardt |
CN111461229B (en) * | 2020-04-01 | 2023-10-31 | 北京工业大学 | Deep neural network optimization and image classification method based on target transfer and line search |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
CN105844332A (en) | Fast recursive Elman neural network modeling and learning algorithm | |
CN109948029B (en) | Neural network self-adaptive depth Hash image searching method | |
KR102302609B1 (en) | Neural Network Architecture Optimization | |
US20200320399A1 (en) | Regularized neural network architecture search | |
CN104598611B (en) | The method and system being ranked up to search entry | |
CN112631717B (en) | Asynchronous reinforcement learning-based network service function chain dynamic deployment system and method | |
CN107239733A (en) | Continuous hand-written character recognizing method and system | |
CN106203625A (en) | A kind of deep-neural-network training method based on multiple pre-training | |
CN110321484A (en) | A kind of Products Show method and device | |
CN107798385A (en) | Recognition with Recurrent Neural Network partially connected method based on block tensor resolution | |
CN109299478A (en) | Intelligent automatic question-answering method and system based on two-way shot and long term Memory Neural Networks | |
CN111723914A (en) | Neural network architecture searching method based on convolution kernel prediction | |
CN108038539A (en) | A kind of integrated length memory Recognition with Recurrent Neural Network and the method for gradient lifting decision tree | |
US20210224650A1 (en) | Method for differentiable architecture search based on a hierarchical grouping mechanism | |
KR20180096469A (en) | Knowledge Transfer Method Using Deep Neural Network and Apparatus Therefor | |
CN112578089B (en) | Air pollutant concentration prediction method based on improved TCN | |
CN111445008A (en) | Knowledge distillation-based neural network searching method and system | |
CN109597998A (en) | A kind of characteristics of image construction method of visual signature and characterizing semantics joint insertion | |
CN110245602A (en) | A kind of underwater quiet target identification method based on depth convolution feature | |
CN108052680B (en) | Image data target identification Enhancement Method based on data map, Information Atlas and knowledge mapping | |
CN106407932B (en) | Handwritten Digit Recognition method based on fractional calculus Yu generalized inverse neural network | |
CN113409157B (en) | Cross-social network user alignment method and device | |
CN108960326B (en) | Point cloud fast segmentation method and system based on deep learning framework | |
CN112487305B (en) | GCN-based dynamic social user alignment method | |
CN110377957B (en) | Bridge crane neural network modeling method of whale search strategy wolf algorithm |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
C06 | Publication | ||
PB01 | Publication | ||
C10 | Entry into substantive examination | ||
SE01 | Entry into force of request for substantive examination | ||
RJ01 | Rejection of invention patent application after publication | | Application publication date: 20160810 |