CN113159299A - Fractional order depth BP neural network optimization method based on extremum optimization - Google Patents

Fractional order depth BP neural network optimization method based on extremum optimization Download PDF

Info

Publication number
CN113159299A
CN113159299A CN202110484178.8A CN202110484178A CN113159299A CN 113159299 A CN113159299 A CN 113159299A CN 202110484178 A CN202110484178 A CN 202110484178A CN 113159299 A CN113159299 A CN 113159299A
Authority
CN
China
Prior art keywords
neural network
optimization
fractional order
layer
extremum
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Granted
Application number
CN202110484178.8A
Other languages
Chinese (zh)
Other versions
CN113159299B (en
Inventor
陈碧鹏
陈云
曾国强
佘青山
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Hangzhou Dianzi University
Original Assignee
Hangzhou Dianzi University
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Hangzhou Dianzi University filed Critical Hangzhou Dianzi University
Priority to CN202110484178.8A priority Critical patent/CN113159299B/en
Publication of CN113159299A publication Critical patent/CN113159299A/en
Application granted granted Critical
Publication of CN113159299B publication Critical patent/CN113159299B/en
Active legal-status Critical Current
Anticipated expiration legal-status Critical

Links

Images

Classifications

    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06NCOMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00Computing arrangements based on biological models
    • G06N3/02Neural networks
    • G06N3/04Architecture, e.g. interconnection topology
    • G06N3/044Recurrent networks, e.g. Hopfield networks
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06NCOMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00Computing arrangements based on biological models
    • G06N3/02Neural networks
    • G06N3/04Architecture, e.g. interconnection topology
    • G06N3/045Combinations of networks
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06NCOMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00Computing arrangements based on biological models
    • G06N3/02Neural networks
    • G06N3/08Learning methods
    • G06N3/084Backpropagation, e.g. using gradient descent

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • Data Mining & Analysis (AREA)
  • General Health & Medical Sciences (AREA)
  • Biomedical Technology (AREA)
  • Biophysics (AREA)
  • Computational Linguistics (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Evolutionary Computation (AREA)
  • Artificial Intelligence (AREA)
  • Molecular Biology (AREA)
  • Computing Systems (AREA)
  • General Engineering & Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Mathematical Physics (AREA)
  • Software Systems (AREA)
  • Health & Medical Sciences (AREA)
  • Image Analysis (AREA)

Abstract

The invention discloses a fractional order deep BP neural network optimization method based on extremum optimization. The fractional order system has the advantages of high convergence speed and high convergence accuracy, the initial weight of the established fractional order depth BP (PEO-FODBP) neural network is preferentially selected by utilizing a group extremum optimization algorithm, and in the iterative training process of the network, the optimal individuals in the group and the optimal fitness value of the optimal individuals are iteratively optimized while the network level weight is corrected, so that the adverse effect of the initial weight on the neural network is improved. The problems of easy falling into local minimum, long time consumption and low convergence speed in the prior art are solved. By utilizing the optimization method, the application performance of the fractional order deep BP neural network in various fields can be improved. In addition, the group extremum optimization method and the fractional derivative calculation method in the method can be popularized to other more neural network models, and the optimization research of the neural network is promoted.

Description

Fractional order depth BP neural network optimization method based on extremum optimization
Technical Field
The invention belongs to the technical field of artificial intelligence, relates to a fractional order deep BP neural network optimization method based on extremum optimization, and particularly relates to modeling of a deep BP neural network and design of an extremum optimization algorithm.
Background
Neural networks have received considerable attention from researchers in the fields of machine learning, statistics, and computer vision as a powerful tool for data regression and classification. The deep BP neural network is a network system formed by a large number of processing units developed by inspiring by the working principle of biological neural tissue, has the basic characteristics of a biological neural system, and has the advantages of large-scale parallel operation, distributed processing, self-adaptation and self-learning.
Fractional calculus theory has been used as a classic concept in mathematics for hundreds of years, is based on any fractional differentiation and integration, and is a popular popularization of integral calculus at present. Compared with the traditional integer order system, the fractional order system has the advantages of high convergence speed and high convergence accuracy, so that the fractional order system is widely applied to the fields of image processing, machine learning and neural networks.
Extreme value optimization is a new optimization method developed by being inspired by equilibrium dynamics far away from self-organization criticality, and has been successfully applied to various combined optimization problems. The basic principle of the extremum optimization algorithm is to select the individual with the lowest fitness value in the current solving range and the related variable thereof for variation, so that the system is continuously improved towards the optimal solution. Therefore, the extremum optimization algorithm does not end up in an equilibrium state, but fluctuates continuously to improve the searching ability of the algorithm in the solution domain.
Many researches on improving neural network training have been carried out at home and abroad, but most of the researches are carried out on the training parameters of the network in an iterative optimization mode, the important influence of the initial weight of the neural network on the performance of the neural network is ignored, and the methods generally have the problems of easy falling into local minimum values, long time consumption and low convergence speed.
Disclosure of Invention
Aiming at the defects of the prior art, the invention provides an extremum optimization-based fractional order depth BP neural network optimization method, wherein a Population extremum optimization algorithm (PEO) is used for carrying out preferred selection on an initial weight of a fractional order depth BP (PEO-FODBP) neural network, and the adverse effect of the initial weight on the neural network is improved.
A fractional order depth BP neural network optimization method based on extremum optimization specifically comprises the following steps:
step one, establishing a fractional order depth BP neural network model;
establishing an L-layer deep BP neural network model, wherein the number of neuron nodes of the L-th layer is nl,l=1,2,...,L,
Figure BDA0003050179850000021
Is the weight between the l layer and the (l +1) layer of the neural network, i is 1,2l,j=1,2,...,nl +1。fl(. h) is the activation function of layer l, X is the input sample of the neural network, O is the ideal output of the input sample X,
Figure BDA0003050179850000022
is an input of the l-th neural network, AlIs the output of the l-th layer neural network.
Figure BDA0003050179850000023
The system is composed of two parts of nodes,
Figure BDA0003050179850000024
is an external node which is not connected with the neuron node of the upper layer;
Figure BDA0003050179850000025
it is an internal node, and is fully connected with the neuron node of the upper layer. Thus, the forward propagation process of a neural network can be expressed as:
Figure BDA0003050179850000026
the loss function of the neural network is
Figure BDA0003050179850000027
Wherein, | | · | is an euclidean norm, and Σ is a summation symbol;
the back propagation process of the fractional order deep BP neural network can be divided into two parts of gradient propagation and gradient calculation. In the gradient propagation process, the gradient propagation term of the l layer of the neural network can be defined as:
Figure BDA0003050179850000028
in the formula (I), the compound is shown in the specification,
Figure BDA0003050179850000029
representing the partial derivative calculation;
according to the chain rule, δlAnd deltal+1The relationship of (c) can be expressed as:
Figure BDA00030501798500000210
wherein f' (. cndot.) represents the first derivative of f (-);
thus, the fractional order gradient calculation under the definition of Caputo can be expressed as:
Figure BDA00030501798500000211
wherein v is a fraction representing the order of the fractional order,
Figure BDA00030501798500000212
represents the loss function E versus the weight W under the definition of the Caputo fractional derivativelCalculating a fractional derivative, wherein gamma (·) is a gamma function;
the weight correction expression of each layer of the neural network is as follows:
Figure BDA00030501798500000213
wherein t is a natural number and represents the current training algebra of the neural network, and mu is the learning rate.
Initializing parameters;
initializing various parameters in the method, including: number of neural network trainings ImaxLearning rate mu, fractional order v, population size T, population iteration times G, and neural network training times I for calculating individual fitness valuefAnd a mutation factor b.
Step three, generating an initial population;
P=xmin+(xmax-xmin)×rand(T,c) (6)
randomly generating an initial population P ═ S according to equation (6)1,S2,...,STIn which S isp={x1,x2,...,xc},p=1,2,...,T,
Figure BDA0003050179850000031
Representing the total number of weights, x, contained in the neural networkmaxAnd xminThe upper and lower bounds of the weight are represented, respectively, and rand (T, c) represents the generation of a random matrix with dimension T × c ranging between (0, 1).
Step four, calculating individual fitness;
the individuals S of the populationpThe decoded product is input into FODBP neural network as initial weight, then the training sample is input into the neural network for training, the weight of the neural network is corrected according to the steps, and the training frequency reaches IfThen obtaining the training precision and taking the training precision as the fitness function value f of the individualp
Step five, selecting an individual;
the individuals of the T populations are subjected to fitness function values f according to the individualspSorting is performed so that fΠ1>fΠ2>...>fΠTSelecting pi 1 to pi
Figure BDA0003050179850000032
Individual replacement of pi
Figure BDA0003050179850000033
Individuals to pi T and obtaining optimal individuals Sbest=SΠ1And an optimal fitness value fbest=fΠ1
Step six, generating a new population;
performing mutation operation on the population selected in the fifth step according to Non-uniform mutation (NUM) rule to generate a new population PN={SN1,SN2,…,SNT}; the non-uniform variation rule is as follows:
Figure BDA0003050179850000034
wherein e is a random number between (0,1), xmid=(xmax+xmin)/2,
Figure BDA0003050179850000035
The positive integer g is the current iteration number of the group extremum optimization algorithm.
Step seven, iterative loop;
repeating the fourth step to the sixth step to obtain a new optimal individual SNbestAnd a fitness value fNbestIf f isNbest>fbestThe new optimal individual is then saved with the optimal fitness value, i.e. Sbest=SNbest,fbest=fNbestCirculating for multiple times until the iterative optimization times of the group reach G, and obtaining the global optimal solution SbestAnd the decoded result is input into the FODBP neural network as an initial weight to complete the optimization of the fractional order depth BP neural network.
The invention has the following beneficial effects:
1. and carrying out preferential selection on the initial weight of the fractional-order depth BP neural network by utilizing a population extremum optimization algorithm, thereby achieving the purpose of improving the performance of the neural network.
2. The problem that the fractional order depth BP neural network is easy to fall into a local extremum is remarkably improved, and the performance of the fractional order depth BP neural network is improved, so that the fractional order depth BP neural network has stronger applicability.
3. The group extremum optimization method and the fractional derivative calculation method can be popularized to other more neural network models, and the optimization research of the neural network is promoted.
Drawings
FIG. 1 is a conceptual diagram of a fractional order deep BP neural network;
FIG. 2 is a flow chart of the present optimization method;
FIG. 3 is a diagram of a fractional depth BP neural network structure built in the embodiment;
FIG. 4 is a graph of error convergence for the embodiment and other algorithms.
Detailed Description
The invention is further explained below with reference to the drawings;
a fractional order deep BP neural network optimization method based on extremum optimization is disclosed, the structure of the fractional order deep BP neural network is shown in figure 1, and figure 2 is a flow chart of the optimization method.
Establishing a fractional order deep BP neural network as shown in FIG. 3, wherein the network has 8 layers, external nodes of the first four layers are 196, internal nodes are 32, and the input of the internal node of the first layer neural network is 1; the four layers of neural networks have 64 internal nodes without external nodes. The activation function of each layer of the neural network is a sigmoid function. The method is used for recognizing handwritten fonts, MATLAB is adopted for software programming in a simulation experiment, and MNIST handwritten digital data is adopted for recognition and simulation. The MNIST handwritten digital data consists of 60000 training data and 10000 test data, each set of data being standardized by handwritten digital images. Wherein, a digital image has 28 multiplied by 28 gray data, and the gray value is between 0 and 255. In the simulation experiment, a digital image is divided into 4 parts, each part has 14 × 14 gray data, the gray data are arranged into 196 × 1 vectors, and finally 4 groups of data are respectively input into a neural network for training.
The parameters of the simulation are shown in table 1.
Figure BDA0003050179850000041
Figure BDA0003050179850000051
TABLE 1
Wherein, the Batch size is the number of the one-time input samples. The initial weight of the generated neural network follows standard normal distribution.
Table 2 shows the training accuracy and the testing accuracy of the FODBP neural network. Table 3 shows the training accuracy and testing accuracy of the PEO-FODBP neural network. From the data in the two tables, it can be found that the training precision and the testing precision of the fractional order deep BP neural network are obviously improved by using the group extremum optimization algorithm, and the performance of the fractional order deep BP neural network is obviously improved.
Figure BDA0003050179850000052
TABLE 2
Figure BDA0003050179850000053
TABLE 3
Fig. 4 is an error convergence graph of the PEO-FODBP algorithm and the FODBP algorithm of the present invention along with the training process, and it can be found from the graph that the convergence speed of the fractional order deep BP neural network optimized by the PEO algorithm is faster and the convergence accuracy is higher. According to the simulation experiment result, the following conclusion can be drawn: the method provides a fractional order depth BP neural network method based on population extremum optimization, and compared with a common fractional order depth BP neural network, the performance is obviously improved. By utilizing the method provided by the invention, the application performance of the fractional order deep BP neural network in various fields can be improved.

Claims (6)

1. A fractional order depth BP neural network optimization method based on extremum optimization is characterized in that: the method specifically comprises the following steps:
step one, establishing a fractional order depth BP neural network model;
establishing an L-layer deep BP neural network model, wherein the number of neuron nodes of the L-th layer is nl,l=1,2,2,L,
Figure FDA0003050179840000011
As the weight between the l layer and the (l +1) layer of the neural network, i is 1,2,2, nl,j=1,2,2,nl+1(ii) a X is the input sample of the neural network, O is the ideal output of the input sample X,
Figure FDA0003050179840000012
is an input of the l-th neural network, AlIs the output of the l layer neural network; the loss function of the neural network is
Figure FDA0003050179840000013
Wherein, | | · | is an euclidean norm, and Σ is a summation symbol;
fractional step size under Caputo definition
Figure FDA0003050179840000014
E is as follows:
Figure FDA0003050179840000015
wherein, deltalIs the gradient propagation term of the l layer of the neural network, v is a fraction, represents the order of the fractional order,
Figure FDA0003050179840000016
e represents the loss function E versus the weight W under the definition of the Caputo fractional derivativelCalculating a fractional derivative, wherein gamma (·) is a gamma function;
the weight correction expression of each layer of the neural network is as follows:
Figure FDA0003050179840000017
wherein t is a natural number and represents the current training algebra of the neural network, and mu is a learning rate;
initializing parameters;
initializing various parameters in the method, including: number of neural network trainings ImaxLearning rate mu, fractional order v, population size T, population iteration times G, and neural network training times I for calculating individual fitness valuefA mutation factor b;
step three, generating an initial population;
randomly generating an initial population P ═ S1,S2,2,STIn which S isp={x1,x2,...,xc},p=1,2,...,T,
Figure FDA0003050179840000018
Representing the total number of weights contained in the neural network;
step four, calculating individual fitness;
the individuals S of the populationpInputting the decoded data into FODBP neural network as initial weight, inputting MNIST handwritten digital data into the neural network for training, correcting the weight of the neural network according to the steps, and training the number of times to reach IfThen obtaining the training precision and taking the training precision as the fitness function value f of the individualp
Step five, selecting an individual;
ordering the individuals of the T populations according to the fitness function values thereof so that fΠ1>fΠ2>...>fΠTSelecting pi 1 to
Figure FDA0003050179840000021
Individual replacement of
Figure FDA0003050179840000022
Individuals to pi T and obtaining optimal individuals Sbest=SΠ1And an optimal fitness value fbest=fΠ1
Step six, generating a new population;
performing variation operation on the population selected in the fifth step according to a non-uniform variation rule to generate a new population PN={SN1,SN2,…,SNT};
Step seven, iterative loop;
repeating the fourth step to the sixth step to obtain a new optimal individual SNbestAnd a fitness value fNbestIf f isNbest>fbestThe new optimal individual is then saved with the optimal fitness value, i.e. Sbest=SNbest,fbest=fNbestCirculating for multiple times until the iterative optimization times of the group reach G, and obtaining the global optimal solution SbestThe decoded offspring is put into an FODBP neural network to be used as an initial weight to complete the optimization of the fractional order depth BP neural network;
step eight, character recognition
Inputting handwritten numbers to the fractional order depth BP neural network optimized in the step seven, and outputting a recognition result by the network.
2. The extremum optimization-based fractional order deep BP neural network optimization method of claim 1, wherein: output of layer I neural network
Figure FDA0003050179840000023
The system is composed of two parts of nodes,
Figure FDA0003050179840000024
is an external node which is not connected with the neuron node of the upper layer;
Figure FDA0003050179840000025
the internal node is fully connected with the neuron node of the upper layer; f. ofl(. h) is the activation function of the l-th layer, and the forward propagation process of the network is as follows:
Figure FDA0003050179840000026
3. the extremum optimization-based fractional order deep BP neural network optimization method of claim 1 or 2, wherein: activation function f of layer I of neural networkl(. cndot.) is sigmoid function.
4. The extremum optimization-based fractional order deep BP neural network optimization method of claim 1, wherein: the back propagation process of the fractional depth BP neural network is divided into two parts of gradient propagation and gradient calculation; in the gradient propagation process, the gradient propagation term delta of the l layer of the neural networklComprises the following steps:
Figure FDA0003050179840000031
according to the chain rule, δlAnd deltal+1The relationship of (1) is:
Figure FDA0003050179840000032
where f' (. cndot.) represents the first derivative of f (-).
5. The extremum optimization-based fractional order deep BP neural network optimization method of claim 1, wherein: the random generation method of the initial population comprises the following steps:
P=xmin+(xmax-xmin)×rand(T,c) (6)
wherein xmaxAnd xminThe upper and lower bounds of the weight are represented, respectively, and rand (T, c) represents the generation of a random matrix with dimension T × c ranging between (0, 1).
6. The extremum optimization-based fractional order deep BP neural network optimization method of claim 1, wherein: the uneven variation rule of the population in the sixth step is as follows:
Figure FDA0003050179840000033
wherein e is a random number between (0,1), xmid=(xmax+xmin)/2,
Figure FDA0003050179840000034
The positive integer g is the current iteration number of the group extremum optimization algorithm.
CN202110484178.8A 2021-04-30 2021-04-30 Fractional order depth BP neural network optimization method based on extremum optimization Active CN113159299B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202110484178.8A CN113159299B (en) 2021-04-30 2021-04-30 Fractional order depth BP neural network optimization method based on extremum optimization

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202110484178.8A CN113159299B (en) 2021-04-30 2021-04-30 Fractional order depth BP neural network optimization method based on extremum optimization

Publications (2)

Publication Number Publication Date
CN113159299A true CN113159299A (en) 2021-07-23
CN113159299B CN113159299B (en) 2024-02-06

Family

ID=76873052

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202110484178.8A Active CN113159299B (en) 2021-04-30 2021-04-30 Fractional order depth BP neural network optimization method based on extremum optimization

Country Status (1)

Country Link
CN (1) CN113159299B (en)

Citations (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN110046710A (en) * 2019-04-11 2019-07-23 山东师范大学 A kind of the nonlinear function Extremal optimization method and system of neural network
CN111160520A (en) * 2019-12-06 2020-05-15 南京理工大学 BP neural network wind speed prediction method based on genetic algorithm optimization

Patent Citations (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN110046710A (en) * 2019-04-11 2019-07-23 山东师范大学 A kind of the nonlinear function Extremal optimization method and system of neural network
CN111160520A (en) * 2019-12-06 2020-05-15 南京理工大学 BP neural network wind speed prediction method based on genetic algorithm optimization

Also Published As

Publication number Publication date
CN113159299B (en) 2024-02-06

Similar Documents

Publication Publication Date Title
Zhu et al. Multi-objective evolutionary federated learning
Zhang et al. ANODEV2: A coupled neural ODE framework
Sharma Deep challenges associated with deep learning
CN109948029A (en) Based on the adaptive depth hashing image searching method of neural network
Sun et al. Automatically evolving cnn architectures based on blocks
CN107330902B (en) Chaotic genetic BP neural network image segmentation method based on Arnold transformation
JPH07296117A (en) Constitution method of sort weight matrix for pattern recognition system using reduced element feature section set
CN111898689A (en) Image classification method based on neural network architecture search
CN112465120A (en) Fast attention neural network architecture searching method based on evolution method
CN112700060A (en) Station terminal load prediction method and prediction device
Bakhshi et al. Fast evolution of CNN architecture for image classification
CN110135498A (en) Image identification method based on deep evolution neural network
CN113157919A (en) Sentence text aspect level emotion classification method and system
CN107590538B (en) Danger source identification method based on online sequence learning machine
Hao et al. Annealing genetic GAN for imbalanced web data learning
Rad et al. GP-RVM: Genetic programing-based symbolic regression using relevance vector machine
CN116259109A (en) Human behavior recognition method based on generation type self-supervision learning and contrast learning
Wilson et al. Positional cartesian genetic programming
CN111695689B (en) Natural language processing method, device, equipment and readable storage medium
CN113159299B (en) Fractional order depth BP neural network optimization method based on extremum optimization
CN114662678B (en) Image identification method based on variable activation function convolutional neural network
CN111222529A (en) GoogLeNet-SVM-based sewage aeration tank foam identification method
CN115906959A (en) Parameter training method of neural network model based on DE-BP algorithm
Kundu et al. Autosparse: Towards automated sparse training of deep neural networks
CN110059806B (en) Multi-stage weighted network community structure detection method based on power law function

Legal Events

Date Code Title Description
PB01 Publication
PB01 Publication
SE01 Entry into force of request for substantive examination
SE01 Entry into force of request for substantive examination
GR01 Patent grant
GR01 Patent grant