CN113159299A - Fractional order depth BP neural network optimization method based on extremum optimization - Google Patents
- Publication number
- CN113159299A (application number CN202110484178.8A)
- Authority
- CN
- China
- Prior art keywords
- neural network
- optimization
- fractional order
- layer
- extremum
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Granted
Classifications
- G—PHYSICS > G06—COMPUTING; CALCULATING OR COUNTING > G06N—COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS > G06N3/00—Computing arrangements based on biological models > G06N3/02—Neural networks > G06N3/04—Architecture, e.g. interconnection topology > G06N3/044—Recurrent networks, e.g. Hopfield networks
- G06N3/04—Architecture, e.g. interconnection topology > G06N3/045—Combinations of networks
- G06N3/08—Learning methods > G06N3/084—Backpropagation, e.g. using gradient descent
Abstract
The invention discloses a fractional order deep BP neural network optimization method based on extremum optimization. A fractional order system has the advantages of fast convergence and high convergence accuracy. The initial weights of the established fractional order deep BP (PEO-FODBP) neural network are preferentially selected using a population extremum optimization algorithm, and during the iterative training of the network, the optimal individual of the population and its optimal fitness value are iteratively optimized while the layer weights of the network are corrected, thereby mitigating the adverse effect of poor initial weights on the neural network. The method overcomes the prior-art problems of easily falling into local minima, long run time, and slow convergence. The optimization method can improve the application performance of the fractional order deep BP neural network in various fields. In addition, the population extremum optimization method and the fractional derivative calculation method used here can be extended to other neural network models, promoting research on neural network optimization.
Description
Technical Field
The invention belongs to the technical field of artificial intelligence and relates to a fractional order deep BP neural network optimization method based on extremum optimization, in particular to the modeling of a deep BP neural network and the design of an extremum optimization algorithm.
Background
As a powerful tool for data regression and classification, neural networks have received considerable attention from researchers in machine learning, statistics, and computer vision. A deep BP neural network is a network system formed by a large number of processing units, developed under the inspiration of the working principles of biological neural tissue; it has the basic characteristics of a biological neural system and the advantages of large-scale parallel operation, distributed processing, self-adaptation, and self-learning.
Fractional calculus has been a classic concept in mathematics for hundreds of years; it is based on differentiation and integration of arbitrary fractional order and is a natural generalization of integer-order calculus. Compared with a traditional integer order system, a fractional order system converges faster and with higher accuracy, and is therefore widely applied in the fields of image processing, machine learning, and neural networks.
Extremum optimization is a relatively new optimization method inspired by the far-from-equilibrium dynamics of self-organized criticality, and it has been successfully applied to various combinatorial optimization problems. The basic principle of the extremum optimization algorithm is to select the individual with the lowest fitness value in the current solution range, together with its related variables, for mutation, so that the system continuously improves toward the optimal solution. The extremum optimization algorithm therefore never settles into an equilibrium state but fluctuates continuously, which improves its search ability in the solution domain.
Much research at home and abroad has addressed improving neural network training, but most of it iteratively optimizes the training parameters of the network while ignoring the important influence of the initial weights on network performance; such methods generally suffer from easily falling into local minima, long run time, and slow convergence.
Disclosure of Invention
To address the defects of the prior art, the invention provides an extremum optimization-based fractional order deep BP neural network optimization method, in which a population extremum optimization (PEO) algorithm preferentially selects the initial weights of a fractional order deep BP (PEO-FODBP) neural network, mitigating the adverse effect of poor initial weights on the neural network.
A fractional order depth BP neural network optimization method based on extremum optimization specifically comprises the following steps:
step one, establishing a fractional order depth BP neural network model;
Establishing an L-layer deep BP neural network model, wherein the number of neuron nodes of the l-th layer is n_l, l = 1, 2, ..., L. W^l_{ij} is the weight between the l-th layer and the (l+1)-th layer of the neural network, i = 1, 2, ..., n_l, j = 1, 2, ..., n_{l+1}. f_l(·) is the activation function of layer l, X is the input sample of the neural network, O is the ideal output of the input sample X, Z^l is the input of the l-th layer of the neural network, and A^l is the output of the l-th layer. A^l is composed of two parts of nodes: the external nodes, which are not connected with the neuron nodes of the upper layer, and the internal nodes, which are fully connected with the neuron nodes of the upper layer. Thus, the forward propagation process of the neural network can be expressed as:

Z^{l+1} = (W^l)^T A^l, A^{l+1} = f_{l+1}(Z^{l+1}), l = 1, 2, ..., L-1 (1)
the loss function of the neural network isWherein, | | · | is an euclidean norm, and Σ is a summation symbol;
the back propagation process of the fractional order deep BP neural network can be divided into two parts of gradient propagation and gradient calculation. In the gradient propagation process, the gradient propagation term of the l layer of the neural network can be defined as:
in the formula (I), the compound is shown in the specification,representing the partial derivative calculation;
according to the chain rule, δlAnd deltal+1The relationship of (c) can be expressed as:
wherein f' (. cndot.) represents the first derivative of f (-);
thus, the fractional order gradient calculation under the definition of Caputo can be expressed as:
wherein v is a fraction representing the order of the fractional order,represents the loss function E versus the weight W under the definition of the Caputo fractional derivativelCalculating a fractional derivative, wherein gamma (·) is a gamma function;
the weight correction expression of each layer of the neural network is as follows:
wherein t is a natural number and represents the current training algebra of the neural network, and mu is the learning rate.
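The weight correction of equation (5) can be sketched as below, using the first-order truncation of the Caputo derivative that is common in fractional-order gradient descent, D^v_W E ≈ (∂E/∂W)·|W − W0|^(1−v) / Γ(2−v). The truncation formula, the lower terminal `W0`, and the `eps` guard are assumptions, not the patent's exact expressions:

```python
import math
import numpy as np

def fractional_update(W, grad, W0, v=0.9, mu=0.1, eps=1e-8):
    """One fractional-order weight correction (a sketch).
    grad is the ordinary integer-order gradient dE/dW; W0 is the lower
    terminal (here: the initial weights); eps guards against |W - W0| = 0.
    For v = 1 this reduces to plain gradient descent."""
    frac_grad = grad * np.abs(W - W0 + eps) ** (1.0 - v) / math.gamma(2.0 - v)
    return W - mu * frac_grad  # W(t+1) = W(t) - mu * D^v E
```

With `v=1.0` the factor `|W − W0|^0 / Γ(1)` equals 1, so the update coincides with the classical BP rule, which is a quick sanity check on the implementation.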
Step two, initializing parameters;
The parameters of the method are initialized, including: the maximum number of neural network training iterations I_max, the learning rate μ, the fractional order v, the population size T, the number of population iterations G, the number of neural network training iterations I_f used for calculating an individual fitness value, and the mutation factor b.
Step three, generating an initial population;
P=xmin+(xmax-xmin)×rand(T,c) (6)
An initial population P = {S_1, S_2, ..., S_T} is randomly generated according to equation (6), wherein S_p = {x_1, x_2, ..., x_c}, p = 1, 2, ..., T, c represents the total number of weights contained in the neural network, x_max and x_min represent the upper and lower bounds of the weights respectively, and rand(T, c) represents the generation of a T × c random matrix with entries in (0, 1).
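The population initialization of equation (6) is a one-liner in NumPy; the weight bounds and the seed below are illustrative assumptions:

```python
import numpy as np

def init_population(T, c, x_min=-1.0, x_max=1.0, seed=0):
    """Equation (6): P = x_min + (x_max - x_min) * rand(T, c).
    Each of the T rows encodes one candidate set of all c network weights."""
    rng = np.random.default_rng(seed)
    return x_min + (x_max - x_min) * rng.random((T, c))
```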
Step four, calculating individual fitness;
Each individual S_p of the population is decoded and input into the FODBP neural network as its initial weights; the training samples are then input into the neural network for training, the weights of the neural network are corrected according to the steps above, and after the number of training iterations reaches I_f, the training accuracy is obtained and taken as the fitness function value f_p of the individual.
Step five, selecting an individual;
The T individuals of the population are sorted according to their fitness function values f_p so that f_{Π1} > f_{Π2} > ... > f_{ΠT}; the higher-ranked individuals, starting from Π1, are selected to replace the lower-ranked individuals down to ΠT, and the optimal individual S_best = S_{Π1} and the optimal fitness value f_best = f_{Π1} are obtained.
Step six, generating a new population;
The population selected in step five is subjected to a mutation operation according to the non-uniform mutation (NUM) rule to generate a new population P_N = {S_N1, S_N2, ..., S_NT}; the non-uniform mutation rule is as follows:

x_new = x + (x_max − x)·(1 − e^{(1 − g/G)^b}), if x ≤ x_mid
x_new = x − (x − x_min)·(1 − e^{(1 − g/G)^b}), if x > x_mid (7)

wherein e is a random number in (0, 1), x_mid = (x_max + x_min)/2, b is the mutation factor, and the positive integer g is the current iteration number of the population extremum optimization algorithm.
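A sketch of the non-uniform mutation step, written in the standard Michalewicz form that matches the quantities named above (a random e in (0, 1), x_mid, the mutation factor b, and the iteration counter g out of G). The patent's exact expression is shown only as an image, so this reconstruction is an assumption:

```python
import numpy as np

def nonuniform_mutation(S, g, G, b=2.0, x_min=-1.0, x_max=1.0, rng=None):
    """Non-uniform mutation (NUM) sketch.  Genes at or below x_mid drift
    toward x_max, genes above x_mid drift toward x_min, with a step size
    that shrinks to zero as g approaches G (so late iterations fine-tune)."""
    if rng is None:
        rng = np.random.default_rng()
    e = rng.random(S.shape)                 # random e in (0, 1), per gene
    step = 1.0 - e ** ((1.0 - g / G) ** b)  # decays to 0 as g -> G
    x_mid = 0.5 * (x_max + x_min)
    return np.where(S <= x_mid,
                    S + (x_max - S) * step,
                    S - (S - x_min) * step)
```

By construction the mutated genes stay inside [x_min, x_max], and at g = G the step is exactly zero, so the population is left unchanged on the final iteration.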
Step seven, iterative loop;
Steps four to six are repeated to obtain a new optimal individual S_Nbest and its fitness value f_Nbest. If f_Nbest > f_best, the new optimal individual and optimal fitness value are saved, i.e. S_best = S_Nbest, f_best = f_Nbest. The loop continues until the number of population iterations reaches G; the global optimal solution S_best is then obtained, decoded, and input into the FODBP neural network as its initial weights, completing the optimization of the fractional order deep BP neural network.
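Steps three to seven can be sketched as the following outer loop. The fitness function is abstracted to a plain callable (rather than a full FODBP training run), and the best-half-replaces-worst-half selection rule is an assumption, since the patent does not spell out the split:

```python
import numpy as np

def peo_optimize(fitness, T, c, G, b=2.0, x_min=-1.0, x_max=1.0, seed=0):
    """Population extremum optimization (PEO) outer-loop sketch.
    `fitness` maps one length-c weight vector to a scalar (higher is better).
    Returns the best individual and its fitness over G iterations."""
    rng = np.random.default_rng(seed)
    P = x_min + (x_max - x_min) * rng.random((T, c))   # eq. (6)
    S_best, f_best = None, -np.inf
    for g in range(1, G + 1):
        f = np.array([fitness(s) for s in P])          # step four
        order = np.argsort(-f)                         # step five: sort desc
        P = P[order]
        if f[order[0]] > f_best:                       # step seven: keep best
            f_best, S_best = f[order[0]], P[0].copy()
        P[T // 2:] = P[: T - T // 2].copy()            # best half overwrites worst
        e = rng.random(P.shape)                        # step six: NUM, eq. (7)
        step = 1.0 - e ** ((1.0 - g / G) ** b)
        x_mid = 0.5 * (x_max + x_min)
        P = np.where(P <= x_mid, P + (x_max - P) * step,
                     P - (P - x_min) * step)
    return S_best, f_best
```

For example, `peo_optimize(lambda s: -float(np.sum(s ** 2)), T=8, c=3, G=15)` searches for a weight vector minimizing the squared norm within the box bounds.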
The invention has the following beneficial effects:
1. The initial weights of the fractional order deep BP neural network are preferentially selected by means of a population extremum optimization algorithm, thereby improving the performance of the neural network.
2. The tendency of the fractional order deep BP neural network to fall into local extrema is remarkably reduced and its performance is improved, giving it stronger applicability.
3. The population extremum optimization method and the fractional derivative calculation method can be extended to other neural network models, promoting research on neural network optimization.
Drawings
FIG. 1 is a conceptual diagram of a fractional order deep BP neural network;
FIG. 2 is a flow chart of the present optimization method;
FIG. 3 is a diagram of a fractional depth BP neural network structure built in the embodiment;
FIG. 4 is a graph of error convergence for the embodiment and other algorithms.
Detailed Description
The invention is further explained below with reference to the drawings;
A fractional order deep BP neural network optimization method based on extremum optimization is disclosed; the structure of the fractional order deep BP neural network is shown in FIG. 1, and FIG. 2 is a flow chart of the optimization method.
A fractional order deep BP neural network is established as shown in FIG. 3. The network has 8 layers; each of the first four layers has 196 external nodes and 32 internal nodes, and the internal-node input of the first layer is 1; each of the last four layers has 64 internal nodes and no external nodes. The activation function of each layer is the sigmoid function. The method is applied to handwritten digit recognition; the simulation experiment is programmed in MATLAB, and the MNIST handwritten digit data set is used for the recognition simulation. MNIST consists of 60000 training samples and 10000 test samples, each a standardized handwritten digit image of 28 × 28 gray values between 0 and 255. In the simulation experiment, each digit image is divided into 4 parts of 14 × 14 gray values, each part is arranged into a 196 × 1 vector, and the 4 groups of data are input into the neural network for training.
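The preprocessing described above (splitting each 28 × 28 digit into four 14 × 14 quadrants, each flattened to a 196 × 1 column vector) can be sketched as follows; the row-major quadrant order is an assumption:

```python
import numpy as np

def split_quadrants(img):
    """Split one 28x28 grayscale digit into four 14x14 quadrants and
    flatten each to a 196x1 column vector, as in the embodiment."""
    assert img.shape == (28, 28)
    parts = [img[r:r + 14, c:c + 14] for r in (0, 14) for c in (0, 14)]
    return [p.reshape(196, 1) for p in parts]
```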
The parameters of the simulation are shown in table 1.
TABLE 1
Wherein the batch size is the number of samples input at one time. The generated initial weights of the neural network follow a standard normal distribution.
Table 2 shows the training accuracy and testing accuracy of the FODBP neural network, and Table 3 shows those of the PEO-FODBP neural network. From the data in the two tables it can be seen that the population extremum optimization algorithm obviously improves the training and testing accuracy of the fractional order deep BP neural network, and hence its performance.
TABLE 2
TABLE 3
FIG. 4 is an error convergence graph of the PEO-FODBP algorithm of the invention and the FODBP algorithm over the training process; it can be seen that the fractional order deep BP neural network optimized by the PEO algorithm converges faster and to higher accuracy. From the simulation results the following conclusion can be drawn: the invention provides a fractional order deep BP neural network method based on population extremum optimization whose performance is obviously improved over a plain fractional order deep BP neural network. Using the proposed method, the application performance of the fractional order deep BP neural network in various fields can be improved.
Claims (6)
1. A fractional order depth BP neural network optimization method based on extremum optimization is characterized in that: the method specifically comprises the following steps:
step one, establishing a fractional order depth BP neural network model;
establishing an L-layer deep BP neural network model, wherein the number of neuron nodes of the l-th layer is n_l, l = 1, 2, ..., L; W^l_{ij} is the weight between the l-th layer and the (l+1)-th layer of the neural network, i = 1, 2, ..., n_l, j = 1, 2, ..., n_{l+1}; X is the input sample of the neural network, O is the ideal output of the input sample X, Z^l is the input of the l-th layer of the neural network, and A^l is the output of the l-th layer; the loss function of the neural network is E = (1/2) ||A^L − O||^2, wherein ||·|| is the Euclidean norm and Σ is the summation symbol;
the fractional order gradient under the Caputo definition is

D^v_{W^l} E ≈ (A^l (δ^{l+1})^T) ⊙ |W^l − W^l(0)|^{1−v} / Γ(2−v) (4)

wherein δ^l is the gradient propagation term of the l-th layer of the neural network, v is a fraction representing the fractional order, D^v_{W^l} E represents the fractional derivative of the loss function E with respect to the weight W^l under the Caputo definition, and Γ(·) is the gamma function;
the weight correction expression of each layer of the neural network is as follows:
wherein t is a natural number and represents the current training algebra of the neural network, and mu is a learning rate;
step two, initializing parameters;
the parameters of the method are initialized, including: the maximum number of neural network training iterations I_max, the learning rate μ, the fractional order v, the population size T, the number of population iterations G, the number of neural network training iterations I_f used for calculating an individual fitness value, and the mutation factor b;
step three, generating an initial population;
an initial population P = {S_1, S_2, ..., S_T} is randomly generated, wherein S_p = {x_1, x_2, ..., x_c}, p = 1, 2, ..., T, and c represents the total number of weights contained in the neural network;
step four, calculating individual fitness;
each individual S_p of the population is decoded and input into the FODBP neural network as its initial weights; the MNIST handwritten digit data are then input into the neural network for training, the weights of the neural network are corrected according to the steps above, and after the number of training iterations reaches I_f, the training accuracy is obtained and taken as the fitness function value f_p of the individual;
Step five, selecting an individual;
the T individuals of the population are sorted according to their fitness function values so that f_{Π1} > f_{Π2} > ... > f_{ΠT}; the higher-ranked individuals, starting from Π1, are selected to replace the lower-ranked individuals down to ΠT, and the optimal individual S_best = S_{Π1} and the optimal fitness value f_best = f_{Π1} are obtained;
Step six, generating a new population;
the population selected in step five is subjected to a mutation operation according to the non-uniform mutation rule to generate a new population P_N = {S_N1, S_N2, ..., S_NT};
Step seven, iterative loop;
steps four to six are repeated to obtain a new optimal individual S_Nbest and its fitness value f_Nbest; if f_Nbest > f_best, the new optimal individual and optimal fitness value are saved, i.e. S_best = S_Nbest, f_best = f_Nbest; the loop continues until the number of population iterations reaches G, and the global optimal solution S_best is obtained, decoded, and input into the FODBP neural network as its initial weights, completing the optimization of the fractional order deep BP neural network;
step eight, character recognition;
handwritten digits are input into the fractional order deep BP neural network optimized in step seven, and the network outputs the recognition result.
2. The extremum optimization-based fractional order deep BP neural network optimization method of claim 1, wherein: the output A^l of the l-th layer of the neural network is composed of two parts of nodes: the external nodes, which are not connected with the neuron nodes of the upper layer, and the internal nodes, which are fully connected with the neuron nodes of the upper layer; f_l(·) is the activation function of the l-th layer, and the forward propagation process of the network is:

Z^{l+1} = (W^l)^T A^l, A^{l+1} = f_{l+1}(Z^{l+1}) (1)
3. The extremum optimization-based fractional order deep BP neural network optimization method of claim 1 or 2, wherein: the activation function f_l(·) of the l-th layer of the neural network is the sigmoid function.
4. The extremum optimization-based fractional order deep BP neural network optimization method of claim 1, wherein: the back propagation process of the fractional order deep BP neural network is divided into two parts, gradient propagation and gradient calculation; in the gradient propagation process, the gradient propagation term δ^l of the l-th layer of the neural network is:

δ^l = ∂E/∂Z^l (2)

according to the chain rule, the relationship between δ^l and δ^{l+1} is:

δ^l = f_l'(Z^l) ⊙ (W^l δ^{l+1}) (3)

wherein f'(·) represents the first derivative of f(·).
5. The extremum optimization-based fractional order deep BP neural network optimization method of claim 1, wherein: the random generation method of the initial population comprises the following steps:
P=xmin+(xmax-xmin)×rand(T,c) (6)
wherein x_max and x_min represent the upper and lower bounds of the weights respectively, and rand(T, c) represents the generation of a T × c random matrix with entries in (0, 1).
6. The extremum optimization-based fractional order deep BP neural network optimization method of claim 1, wherein: the non-uniform mutation rule of the population in step six is:

x_new = x + (x_max − x)·(1 − e^{(1 − g/G)^b}), if x ≤ x_mid
x_new = x − (x − x_min)·(1 − e^{(1 − g/G)^b}), if x > x_mid (7)

wherein e is a random number in (0, 1), x_mid = (x_max + x_min)/2, b is the mutation factor, and the positive integer g is the current iteration number of the population extremum optimization algorithm.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202110484178.8A CN113159299B (en) | 2021-04-30 | 2021-04-30 | Fractional order depth BP neural network optimization method based on extremum optimization |
Publications (2)
Publication Number | Publication Date |
---|---|
CN113159299A true CN113159299A (en) | 2021-07-23 |
CN113159299B CN113159299B (en) | 2024-02-06 |
Family
ID=76873052
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN202110484178.8A Active CN113159299B (en) | 2021-04-30 | 2021-04-30 | Fractional order depth BP neural network optimization method based on extremum optimization |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN113159299B (en) |
Citations (2)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN110046710A (en) * | 2019-04-11 | 2019-07-23 | 山东师范大学 | A kind of the nonlinear function Extremal optimization method and system of neural network |
CN111160520A (en) * | 2019-12-06 | 2020-05-15 | 南京理工大学 | BP neural network wind speed prediction method based on genetic algorithm optimization |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
Zhu et al. | Multi-objective evolutionary federated learning | |
Zhang et al. | ANODEV2: A coupled neural ODE framework | |
Sharma | Deep challenges associated with deep learning | |
CN109948029A (en) | Based on the adaptive depth hashing image searching method of neural network | |
Sun et al. | Automatically evolving cnn architectures based on blocks | |
CN107330902B (en) | Chaotic genetic BP neural network image segmentation method based on Arnold transformation | |
JPH07296117A (en) | Constitution method of sort weight matrix for pattern recognition system using reduced element feature section set | |
CN111898689A (en) | Image classification method based on neural network architecture search | |
CN112465120A (en) | Fast attention neural network architecture searching method based on evolution method | |
CN112700060A (en) | Station terminal load prediction method and prediction device | |
Bakhshi et al. | Fast evolution of CNN architecture for image classification | |
CN110135498A (en) | Image identification method based on deep evolution neural network | |
CN113157919A (en) | Sentence text aspect level emotion classification method and system | |
CN107590538B (en) | Danger source identification method based on online sequence learning machine | |
Hao et al. | Annealing genetic GAN for imbalanced web data learning | |
Rad et al. | GP-RVM: Genetic programing-based symbolic regression using relevance vector machine | |
CN116259109A (en) | Human behavior recognition method based on generation type self-supervision learning and contrast learning | |
Wilson et al. | Positional cartesian genetic programming | |
CN111695689B (en) | Natural language processing method, device, equipment and readable storage medium | |
CN113159299B (en) | Fractional order depth BP neural network optimization method based on extremum optimization | |
CN114662678B (en) | Image identification method based on variable activation function convolutional neural network | |
CN111222529A (en) | GoogLeNet-SVM-based sewage aeration tank foam identification method | |
CN115906959A (en) | Parameter training method of neural network model based on DE-BP algorithm | |
Kundu et al. | Autosparse: Towards automated sparse training of deep neural networks | |
CN110059806B (en) | Multi-stage weighted network community structure detection method based on power law function |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | |
SE01 | Entry into force of request for substantive examination | |
GR01 | Patent grant | |