CN105844332A - Fast recursive Elman neural network modeling and learning algorithm - Google Patents

Fast recursive Elman neural network modeling and learning algorithm

Info

Publication number
CN105844332A
CN105844332A
Authority
CN
China
Prior art keywords
layer
output
hidden
error function
node
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN201610137875.5A
Other languages
Chinese (zh)
Inventor
王健
龚晓玲
叶振昀
时贤
温艳青
杨国玲
张炳杰
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
China University of Petroleum East China
Original Assignee
China University of Petroleum East China
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by China University of Petroleum East China filed Critical China University of Petroleum East China
Priority to CN201610137875.5A
Publication of CN105844332A
Pending legal-status Critical Current

Links

Classifications

    • G — PHYSICS
    • G06 — COMPUTING; CALCULATING OR COUNTING
    • G06N — COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00 — Computing arrangements based on biological models
    • G06N3/02 — Neural networks
    • G06N3/08 — Learning methods
    • G06N3/088 — Non-supervised learning, e.g. competitive learning

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • Data Mining & Analysis (AREA)
  • General Health & Medical Sciences (AREA)
  • Biomedical Technology (AREA)
  • Biophysics (AREA)
  • Computational Linguistics (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Evolutionary Computation (AREA)
  • Artificial Intelligence (AREA)
  • Molecular Biology (AREA)
  • Computing Systems (AREA)
  • General Engineering & Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Mathematical Physics (AREA)
  • Software Systems (AREA)
  • Health & Medical Sciences (AREA)
  • Management, Administration, Business Operations System, And Electronic Commerce (AREA)

Abstract

The invention belongs to the field of neural networks and specifically discloses a fast recursive Elman neural network modeling and learning algorithm, comprising: a first step of selecting the model; a second step of initializing parameters; a third step of computing the error function Ek of the training sample, proceeding to the fourth step if the error function value is less than the error threshold and otherwise computing the gradients of the error function with respect to the weights Wk and Vk; and a fourth step of testing the precision. While optimizing the input weight matrix, the algorithm obtains the weights from the hidden layer to the output layer in a single step by a generalized-inverse method, without iteration; in this way, during the weight update, the two layers of weights originally computed by a gradient method become one, the training speed is greatly increased, and shortcomings of the BP algorithm, such as easily falling into local minima and poor generalization ability, are avoided.

Description

Fast recursive Elman neural network modeling and learning algorithm
Technical field
The invention belongs to the field of neural networks, and specifically relates to a fast recursive Elman neural network modeling and learning algorithm.
Background technology
In the document "Pham, D.T., and X. Liu. 1992. Dynamic system modelling using partially recurrent neural networks. Journal of Systems Engineering, 2:90-97", Pham et al. proposed the modified Elman network (Modified Elman Networks), which has since been used as the standard Elman network.
In addition to the input layer, the hidden layer, and the output layer, the Elman network has a special context (undertaking) layer. The units of this layer memorize the previous-moment output values of the hidden units, so the layer can be regarded as a one-step time-delay operator. The Elman neural network has $n$ input nodes, $\tilde{N}$ hidden nodes, and $m$ output nodes. The input and recurrent weight matrices are $W \in \mathbb{R}^{\tilde{N} \times n}$ and $V \in \mathbb{R}^{\tilde{N} \times \tilde{N}}$, respectively, and the merged weight matrix is $L = [W, V]$; $U \in \mathbb{R}^{\tilde{N} \times m}$ denotes the weight matrix connecting the hidden nodes to the output nodes. The training sample set $N = \{(x_i, t_i) \mid i = 1, \ldots, N\}$ is fed to the network, $H_k \in \mathbb{R}^{\tilde{N} \times N}$ denotes the hidden-layer output matrix of the $k$-th cycle, and $H_0 = 0$. Let $g(x)$ be the activation function of the hidden nodes, usually taken to be the sigmoid function, i.e. $g(x) = 1/(1 + e^{-x})$.
At the $k$-th cycle, the Elman neural network combines the actual input matrix $X_k$ with the previous cycle's hidden-layer output matrix $H_{k-1}$, so the input to the hidden layer is $V_k H_{k-1} + W_k X_k$ and the hidden-layer output is $H_k = g(V_k H_{k-1} + W_k X_k)$. The network output is $Y_k = f(U^T H_k)$, where $f(x)$ is the activation function of the output nodes; it is usually taken to be linear, so the network output is $Y_k = U^T H_k$.
The error function is defined as $E_k = \mathrm{Tr}[(Y_k - T)(Y_k - T)^T]$. The gradients of $E$ with respect to the merged weight matrix $L$ (from the input and context layers to the hidden layer) and the connection weights $U$ (from the hidden layer to the output layer), $\partial E / \partial L$ and $\partial E / \partial U$, are computed, and $L$ and $U$ are then updated by the steepest-descent method: $L_{k+1} = L_k - \eta_k \, \partial E_k / \partial L_k$ and $U_{k+1} = U_k - \eta_k \, \partial E_k / \partial U_k$, where the learning rate $\eta_k$ is obtained by line search. Iterating in this way yields the optimized network parameters.
This algorithm uses the error back-propagation method and is based mainly on the idea of gradient descent. Because the steepest-descent method is used for the iterations, the required training time is relatively long, the algorithm easily falls into local minima, and its generalization ability leaves much room for improvement.
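For concreteness, the forward pass of this standard Elman network can be sketched as follows (a minimal NumPy sketch assuming the shape conventions above; the function and variable names are illustrative, not from the patent):

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def elman_forward(X, H_prev, W, V, U):
    """One cycle of the standard Elman network. The context layer feeds the
    previous hidden output H_prev back into the hidden layer, and the output
    layer is linear. Shapes follow the text: X (n, N), H_prev (Ntilde, N),
    W (Ntilde, n), V (Ntilde, Ntilde), U (Ntilde, m)."""
    H = sigmoid(V @ H_prev + W @ X)  # hidden layer with time-delay feedback
    Y = U.T @ H                      # linear output layer
    return H, Y
```

In the background method, all three weight blocks $W$, $V$, and $U$ are then updated iteratively by steepest descent with a line-searched step size, which is what makes training slow.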
Summary of the invention
In view of the above, the present invention provides a fast algorithm that can significantly increase network training speed, avoid falling into local minima, and improve generalization ability.
Concrete technical scheme is:
The fast recursive Elman neural network modeling and learning algorithm comprises the following steps:
Step 1, Model Selection
The fast recursive Elman neural network has $n$ input nodes, $\tilde{N}$ hidden nodes, and $m$ output nodes;
The input and recurrent weight matrices are $W \in \mathbb{R}^{\tilde{N} \times n}$ and $V \in \mathbb{R}^{\tilde{N} \times \tilde{N}}$, respectively; the merged weight matrix is $L = [W, V]$; $U \in \mathbb{R}^{\tilde{N} \times m}$ denotes the weight matrix connecting the hidden nodes to the output nodes;
The training sample set $N = \{(x_i, t_i) \mid i = 1, \ldots, N\}$ is fed to the network; $H_k \in \mathbb{R}^{\tilde{N} \times N}$ denotes the hidden-layer output matrix of the $k$-th cycle, with $H_0 = 0$;
Let $g(x)$ be the activation function of the hidden nodes; the sigmoid function is used, i.e. $g(x) = 1/(1 + e^{-x})$;
Step 2, parameter initialization
(1). Set the initial iteration number $k = 0$;
(2). Define $H_0 = 0$;
(3). Randomly assign the initial weight matrices $W$ and $V$ from the input layer and the context layer to the hidden layer, choosing random numbers in $(-1, 1)$;
Step 3, optimizing the initial weights
(1). Compute the hidden-layer output matrix $H_k = g(V_k H_{k-1} + W_k X)$;
(2). The actual network output is $Y_k = U_k^T H_k$, and the error function is $E_k = \mathrm{Tr}[(Y_k - T)(Y_k - T)^T]$, where $T$ is the ideal output;
The problem of solving for the network weights $U_k$ from the hidden layer to the output layer is thus converted into the problem of minimizing the error function, i.e. finding the least-squares solution $U_k$ that minimizes $E_k$;
Using the Moore-Penrose generalized inverse gives $U_k = (H_k H_k^T)^{-1} H_k T^T$; substituting $H_k$ and $T$ yields the weight matrix $U_k$ from the hidden layer to the output layer;
(3). Compute the error function $E_k$ of the training sample; if the error function value is less than the error threshold, go to step 4; otherwise, compute the gradients $\partial E_k / \partial W_k$ and $\partial E_k / \partial V_k$ of the error function with respect to the weights $W_k$ and $V_k$, whose expressions involve $H^{\dagger}$, the generalized inverse of $H$;
(4). The weight update formulas are $W_{k+1} = W_k - \eta_k \, \partial E_k / \partial W_k$ and $V_{k+1} = V_k - \eta_k \, \partial E_k / \partial V_k$, where the learning rate $\eta_k$ is obtained by line search (a sketch of sub-steps (2) and (4) follows this list);
(5). Set $k = k + 1$ and return to sub-step (1);
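For illustration, sub-step (2)'s one-step solve and one common way to realize the line search of sub-step (4) can be sketched as follows (NumPy; the ridge term and the backtracking scheme are assumptions of this sketch, since the patent fixes neither):

```python
import numpy as np

def solve_output_weights(H, T, ridge=1e-8):
    # One-step least-squares solve U_k = (H_k H_k^T)^{-1} H_k T^T (sub-step (2)).
    # The small ridge term guards against a singular H H^T; it is an
    # implementation choice of this sketch, not part of the patent.
    n_hidden = H.shape[0]
    return np.linalg.solve(H @ H.T + ridge * np.eye(n_hidden), H @ T.T)

def backtracking_eta(error_fn, W, V, grad_W, grad_V, eta0=1.0, beta=0.5):
    # Backtracking line search for eta_k in sub-step (4): shrink the step
    # until the error decreases. error_fn(W, V) evaluates E_k at given weights.
    e0 = error_fn(W, V)
    eta = eta0
    for _ in range(30):
        if error_fn(W - eta * grad_W, V - eta * grad_V) < e0:
            break
        eta *= beta
    return eta
```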
Step 4, testing the precision
According to the optimized weights $W_k$ from the input layer to the hidden layer and the context-layer weight matrix $V_k$, together with the hidden-to-output-layer weights $U_k$ obtained in sub-step (2) of step 3, the network parameters of the algorithm are obtained and the precision on the test sample is computed.
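A minimal sketch of this precision test follows; taking the context $H_{k-1} = 0$ for unseen test data is an assumption of the sketch, since the patent does not specify the test-time context:

```python
import numpy as np

def test_precision(X_test, T_test, W, V, U):
    # Forward pass with zero context: sigmoid(V @ 0 + W @ X) = sigmoid(W @ X).
    H = 1.0 / (1.0 + np.exp(-(W @ X_test)))
    Y = U.T @ H
    # Classification precision: fraction of samples whose largest output
    # coincides with the one-hot target.
    pred = np.argmax(Y, axis=0)
    labels = np.argmax(T_test, axis=0)
    return float(np.mean(pred == labels))
```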
In the fast recursive Elman neural network modeling and learning algorithm provided by the invention, while the input weight matrix is being optimized, the weights from the hidden layer to the output layer are obtained in a single step by the generalized-inverse method, without iteration; thus, during the weight update, the two layers of weights originally computed by the gradient method become one layer. This greatly increases the training speed while avoiding shortcomings of the BP algorithm such as easily falling into local minima and poor generalization ability.
Description of the drawings
Fig. 1 is the structure diagram of the fast recursive Elman neural network of the present invention;
Fig. 2 is the flow chart of the present invention.
Detailed description of the invention
The invention is described in detail below with reference to the accompanying drawings.
The novel fast recursive Elman neural network modeling and learning algorithm of the present invention includes the following steps in a specific implementation:
Step 1,
The training sample set $N = \{(x_i, t_i) \mid i = 1, \ldots, N\}$ is fed to the network; the number of input nodes $n$ and the number of output nodes $m$ of the network are determined, and the number of hidden nodes of the network is set to $\tilde{N}$.
Step 2,
Randomly assign the input and recurrent weight matrices $W \in \mathbb{R}^{\tilde{N} \times n}$ and $V \in \mathbb{R}^{\tilde{N} \times \tilde{N}}$, choosing random numbers in $(-1, 1)$; the merged weight matrix is $L = [W, V]$; define the hidden-layer output matrix of the cycle as $H_0 = 0$.
Step 3,
Compute the hidden-layer output matrix $H_k = g(V_k H_{k-1} + W_k X)$, where $g(x)$ is the activation function of the hidden nodes; the sigmoid function is used, i.e. $g(x) = 1/(1 + e^{-x})$.
Step 4,
Compute the actual network output $Y_k = U_k^T H_k$ and the error function $E_k = \mathrm{Tr}[(Y_k - T)(Y_k - T)^T]$, where $T$ is the ideal output; then use the Moore-Penrose generalized inverse, $U_k = (H_k H_k^T)^{-1} H_k T^T$, substituting $H_k$ and $T$, to obtain the weight matrix $U_k$ from the hidden layer to the output layer.
Step 5,
Compute the error function $E_k$ of the training sample; if the error function value is less than the error threshold, go to step 7; otherwise, compute the gradients $\partial E_k / \partial W_k$ and $\partial E_k / \partial V_k$ of the error function with respect to the weights $W_k$ and $V_k$ (their expressions involve $H^{\dagger}$, the generalized inverse of $H$) and update the weights by
$W_{k+1} = W_k - \eta_k \, \partial E_k / \partial W_k$ and $V_{k+1} = V_k - \eta_k \, \partial E_k / \partial V_k$.
Step 6,
Set $k = k + 1$ and return to step 3.
Step 7,
According to the optimized weights $W_k$ from the input layer to the hidden layer and the context-layer weight matrix $V_k$, together with the computed hidden-to-output-layer weights $U_k$, obtain the network parameters of the algorithm and compute the precision on the test sample.
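Putting steps 1 through 7 together, one possible end-to-end realization reads as follows. This is a minimal NumPy sketch: because the exact gradient expressions in the source text are not legible, it uses a one-step (truncated) gradient through $H_k$ with the context $H_{k-1}$ held fixed, and a fixed learning rate in place of the line search; both simplifications are assumptions of the sketch, not the patent's formulas.

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def train_fast_elman(X, T, n_hidden, max_iter=200, tol=1e-3, eta=0.1, seed=0):
    """Steps 1-7 above. X: (n, N) input matrix, T: (m, N) ideal output."""
    rng = np.random.default_rng(seed)
    n, N = X.shape
    # Steps 1-2: random W, V in (-1, 1); H_0 = 0.
    W = rng.uniform(-1.0, 1.0, (n_hidden, n))         # input -> hidden
    V = rng.uniform(-1.0, 1.0, (n_hidden, n_hidden))  # context -> hidden
    H_prev = np.zeros((n_hidden, N))
    for _ in range(max_iter):
        # Step 3: hidden-layer output with context feedback.
        H = sigmoid(V @ H_prev + W @ X)
        # Step 4: one-step generalized-inverse solve for U_k (the ridge term
        # is a numerical safeguard of this sketch, not part of the patent).
        U = np.linalg.solve(H @ H.T + 1e-8 * np.eye(n_hidden), H @ T.T)
        Y = U.T @ H
        # Step 5: error E_k = Tr[(Y - T)(Y - T)^T]; stop below the threshold.
        E = np.trace((Y - T) @ (Y - T).T)
        if E < tol:
            break
        # Truncated gradients through H only (assumption; the constant
        # factor 2 from the squared error is folded into eta).
        dZ = (U @ (Y - T)) * H * (1.0 - H)   # backprop through the sigmoid
        W -= eta * (dZ @ X.T)
        V -= eta * (dZ @ H_prev.T)
        # Step 6: this cycle's hidden output becomes the next context.
        H_prev = H
    return W, V, U
```

The one-step solve for $U_k$ inside the loop is what replaces the second gradient-trained layer of the standard Elman network.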
Below, the existing Elman algorithm, the ELM (Extreme Learning Machine), and the algorithm of the present invention are tested on the MNIST data set, and their results are compared.
The MNIST data set is a handwritten-digit database maintained by Corinna Cortes of Google Labs and Yann LeCun of the Courant Institute at New York University. The training set contains 60,000 images of handwritten digits 0-9, and the test set contains 10,000. Every picture is an 8-bit grayscale image and can be represented by a vector of size 784. It is a fairly common machine-learning data set. In the three algorithms of this experiment, the activation function is the sigmoid function and the number of hidden nodes is 128 in every case.
The experimental results of the Elman algorithm, the ELM algorithm, and the present algorithm are shown in Table 1.
Table 1
From Table 1 it can be seen that, on the relatively large MNIST data set, with the same number of iterations, and even when the Elman algorithm runs twice as many iterations as this algorithm, both the training precision and the prediction precision obtained by this algorithm are higher than those of Elman. The precision of this algorithm is also higher than that of ELM, which likewise uses the generalized-inverse idea to solve for the network parameters. This shows that, compared with the other two algorithms, this algorithm improves the generalization ability of the network and effectively improves its precision.
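For reference, the comparison setup described above could be reproduced along the following lines, reusing `train_fast_elman` and `test_precision` from the sketches above. Fetching MNIST via scikit-learn's `fetch_openml` is an assumption of this sketch; the patent does not say how the data were obtained, and the figures of Table 1 are not reproduced here.

```python
import numpy as np
from sklearn.datasets import fetch_openml

# MNIST: 70,000 grayscale images of 28x28 = 784 pixels,
# split 60,000 train / 10,000 test as in the text.
mnist = fetch_openml("mnist_784", version=1, as_frame=False)
X = mnist.data.astype(np.float64).T / 255.0   # (784, 70000), scaled to [0, 1]
T = np.eye(10)[mnist.target.astype(int)].T    # one-hot targets, (10, 70000)
X_tr, X_te = X[:, :60000], X[:, 60000:]
T_tr, T_te = T[:, :60000], T[:, 60000:]

# 128 hidden nodes and sigmoid activation, matching the experiment in the text.
W, V, U = train_fast_elman(X_tr, T_tr, n_hidden=128, max_iter=50)
print("test precision:", test_precision(X_te, T_te, W, V, U))
```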

Claims (1)

1. A fast recursive Elman neural network modeling and learning algorithm, characterized by comprising the following steps:
Step 1, Model Selection
The fast recursive Elman neural network has $n$ input nodes, $\tilde{N}$ hidden nodes, and $m$ output nodes;
The input and recurrent weight matrices are $W \in \mathbb{R}^{\tilde{N} \times n}$ and $V \in \mathbb{R}^{\tilde{N} \times \tilde{N}}$, respectively; the merged weight matrix is $L = [W, V]$; $U \in \mathbb{R}^{\tilde{N} \times m}$ denotes the weight matrix connecting the hidden nodes to the output nodes;
The training sample set $N = \{(x_i, t_i) \mid i = 1, \ldots, N\}$ is fed to the network; $H_k \in \mathbb{R}^{\tilde{N} \times N}$ denotes the hidden-layer output matrix of the $k$-th cycle, with $H_0 = 0$;
Let $g(x)$ be the activation function of the hidden nodes; the sigmoid function is used, i.e. $g(x) = 1/(1 + e^{-x})$;
Step 2, parameter initialization
(1). Set the initial iteration number $k = 0$;
(2). Define $H_0 = 0$;
(3). Randomly assign the initial weight matrices $W$ and $V$ from the input layer and the context layer to the hidden layer, choosing random numbers in $(-1, 1)$;
Step 3, optimizing the initial weights
(1). Compute the hidden-layer output matrix $H_k = g(V_k H_{k-1} + W_k X)$;
(2). The actual network output is $Y_k = U_k^T H_k$, and the error function is $E_k = \mathrm{Tr}[(Y_k - T)(Y_k - T)^T]$, where $T$ is the ideal output;
The problem of solving for the network weights $U_k$ from the hidden layer to the output layer is thus converted into the problem of minimizing the error function, i.e. finding the least-squares solution $U_k$ that minimizes $E_k$;
Using the Moore-Penrose generalized inverse gives $U_k = (H_k H_k^T)^{-1} H_k T^T$; substituting $H_k$ and $T$ yields the weight matrix $U_k$ from the hidden layer to the output layer;
(3). Compute the error function $E_k$ of the training sample; if the error function value is less than the error threshold, go to step 4; otherwise, compute the gradients $\partial E_k / \partial W_k$ and $\partial E_k / \partial V_k$ of the error function with respect to the weights $W_k$ and $V_k$, whose expressions involve $H^{\dagger}$, the generalized inverse of $H$;
(4). The weight update formulas are $W_{k+1} = W_k - \eta_k \, \partial E_k / \partial W_k$ and $V_{k+1} = V_k - \eta_k \, \partial E_k / \partial V_k$, where the learning rate $\eta_k$ is obtained by line search;
(5). Set $k = k + 1$ and return to sub-step (1);
Step 4, testing the precision
According to the optimized weights $W_k$ from the input layer to the hidden layer and the context-layer weight matrix $V_k$, together with the hidden-to-output-layer weights $U_k$ obtained in sub-step (2) of step 3, obtain the network parameters of the algorithm and compute the precision on the test sample.
CN201610137875.5A 2016-03-10 2016-03-10 Fast recursive Elman neural network modeling and learning algorithm Pending CN105844332A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201610137875.5A CN105844332A (en) 2016-03-10 2016-03-10 Fast recursive Elman neural network modeling and learning algorithm

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN201610137875.5A CN105844332A (en) 2016-03-10 2016-03-10 Fast recursive Elman neural network modeling and learning algorithm

Publications (1)

Publication Number Publication Date
CN105844332A true CN105844332A (en) 2016-08-10

Family

ID=56587040

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201610137875.5A Pending CN105844332A (en) 2016-03-10 2016-03-10 Fast recursive Elman neural network modeling and learning algorithm

Country Status (1)

Country Link
CN (1) CN105844332A (en)


Cited By (14)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN106472332A (en) * 2016-10-10 2017-03-08 重庆科技学院 Pet feeding method and system based on dynamic intelligent algorithm
CN106472332B (en) * 2016-10-10 2019-05-10 重庆科技学院 Pet feeding method and system based on dynamic intelligent algorithm
CN106407711A (en) * 2016-10-10 2017-02-15 重庆科技学院 Recommendation method and recommendation system of pet feeding based on cloud data
CN109782606A * 2019-03-12 2019-05-21 北京理工大学珠海学院 Recursive wavelet Elman neural network based on an improved gravitational search method
CN110362881A * 2019-06-25 2019-10-22 西安电子科技大学 Nonlinear modeling method for microwave power devices based on extreme learning machine
CN110838087A (en) * 2019-11-13 2020-02-25 燕山大学 Image super-resolution reconstruction method and system
CN111461229B (en) * 2020-04-01 2023-10-31 北京工业大学 Deep neural network optimization and image classification method based on target transfer and line search
CN111881990A (en) * 2020-08-03 2020-11-03 江南大学 Construction type neural network parameter fusion optimization method for digital image recognition
CN111881990B (en) * 2020-08-03 2024-03-08 江南大学 Construction type neural network parameter fusion optimization method for digital image recognition
CN112926727A (en) * 2021-02-10 2021-06-08 北京工业大学 Solving method for local minimum value of single hidden layer ReLU neural network
CN112819086A (en) * 2021-02-10 2021-05-18 北京工业大学 Image classification method for calculating global optimal solution of single-hidden-layer ReLU neural network by dividing network space
CN112926727B (en) * 2021-02-10 2024-02-27 北京工业大学 Solving method for local minimum value of single hidden layer ReLU neural network
CN112819086B (en) * 2021-02-10 2024-03-15 北京工业大学 Image classification method for calculating global optimal solution of single hidden layer ReLU neural network by dividing network space
CN113962369A (en) * 2021-11-29 2022-01-21 北京工业大学 Radial basis function neural network optimization method based on improved Levenberg-Marquardt

Similar Documents

Publication Publication Date Title
CN105844332A (en) Fast recursive Elman neural network modeling and learning algorithm
CN109948029B (en) Neural network self-adaptive depth Hash image searching method
KR102302609B1 (en) Neural Network Architecture Optimization
US20200320399A1 (en) Regularized neural network architecture search
CN104598611B (en) Method and system for ranking search entries
CN112631717B (en) Asynchronous reinforcement learning-based network service function chain dynamic deployment system and method
CN107239733A (en) Continuous hand-written character recognizing method and system
CN106203625A (en) A kind of deep-neural-network training method based on multiple pre-training
CN110321484A (en) A kind of Products Show method and device
CN107798385A (en) Partially connected recurrent neural network method based on block tensor decomposition
CN109299478A (en) Intelligent automatic question-answering method and system based on bidirectional long short-term memory neural networks
CN111723914A (en) Neural network architecture searching method based on convolution kernel prediction
CN108038539A (en) A method integrating long short-term memory recurrent neural networks and gradient boosting decision trees
US20210224650A1 (en) Method for differentiable architecture search based on a hierarchical grouping mechanism
KR20180096469A (en) Knowledge Transfer Method Using Deep Neural Network and Apparatus Therefor
CN112578089B (en) Air pollutant concentration prediction method based on improved TCN
CN111445008A (en) Knowledge distillation-based neural network searching method and system
CN109597998A (en) A kind of characteristics of image construction method of visual signature and characterizing semantics joint insertion
CN110245602A (en) A kind of underwater quiet target identification method based on depth convolution feature
CN108052680B (en) Image data target identification Enhancement Method based on data map, Information Atlas and knowledge mapping
CN106407932B (en) Handwritten digit recognition method based on fractional calculus and generalized inverse neural networks
CN113409157B (en) Cross-social network user alignment method and device
CN108960326B (en) Point cloud fast segmentation method and system based on deep learning framework
CN112487305B (en) GCN-based dynamic social user alignment method
CN110377957B (en) Bridge crane neural network modeling method of whale search strategy wolf algorithm

Legal Events

Date Code Title Description
C06 Publication
PB01 Publication
C10 Entry into substantive examination
SE01 Entry into force of request for substantive examination
RJ01 Rejection of invention patent application after publication

Application publication date: 20160810

RJ01 Rejection of invention patent application after publication