CN113159299B - Fractional-order deep BP neural network optimization method based on extremum optimization - Google Patents

Fractional-order deep BP neural network optimization method based on extremum optimization

Info

Publication number
CN113159299B
CN113159299B (application CN202110484178.8A)
Authority
CN
China
Prior art keywords
neural network
fractional order
optimization
population
layer
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN202110484178.8A
Other languages
Chinese (zh)
Other versions
CN113159299A (en)
Inventor
陈碧鹏
陈云
曾国强
佘青山
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Hangzhou Dianzi University
Original Assignee
Hangzhou Dianzi University
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Hangzhou Dianzi University filed Critical Hangzhou Dianzi University
Priority to CN202110484178.8A priority Critical patent/CN113159299B/en
Publication of CN113159299A publication Critical patent/CN113159299A/en
Application granted granted Critical
Publication of CN113159299B publication Critical patent/CN113159299B/en
Active legal-status Critical Current
Anticipated expiration legal-status Critical

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00 Computing arrangements based on biological models
    • G06N3/02 Neural networks
    • G06N3/04 Architecture, e.g. interconnection topology
    • G06N3/044 Recurrent networks, e.g. Hopfield networks
    • G06N3/045 Combinations of networks
    • G06N3/08 Learning methods
    • G06N3/084 Backpropagation, e.g. using gradient descent

Abstract

The invention discloses a fractional-order deep BP neural network optimization method based on extremum optimization. Fractional-order systems offer fast convergence and high convergence accuracy. The method uses a population-based extremal optimization (PEO) algorithm to select favorable initial weights for the constructed fractional-order deep BP (FODBP) neural network; the combined method is denoted PEO-FODBP. During iterative training, while the layer weights are corrected, the best individual in the population and its best fitness value are also iteratively optimized, mitigating the adverse effect of the initial weights on the neural network. This addresses the problems of the prior art: susceptibility to local minima, long training time, and slow convergence. The optimization method can improve the application performance of fractional-order deep BP neural networks in various fields. Moreover, the population extremal optimization method and the fractional derivative calculation used here can be extended to other neural network models, advancing research on neural network optimization.

Description

Fractional-order deep BP neural network optimization method based on extremum optimization
Technical Field
The invention belongs to the technical field of artificial intelligence and relates to a fractional-order deep BP neural network optimization method based on extremum optimization, in particular to the modeling of a deep BP neural network and the design of an extremum optimization algorithm.
Background
Neural networks have received considerable attention from researchers in machine learning, statistics, and computer vision as a powerful tool for data regression and classification. The deep BP neural network, inspired by the working principles of biological neural tissue, is a network system composed of a large number of processing units; it shares the basic characteristics of biological neural systems and offers large-scale parallel operation, distributed processing, self-adaptation, and self-learning.
Fractional calculus has been a classical branch of mathematics for hundreds of years; it extends differentiation and integration to arbitrary (fractional) order and is thus a generalization of the integer-order calculus in common use. Compared with traditional integer-order systems, fractional-order systems offer fast convergence and high convergence accuracy, and are therefore widely applied in image processing, machine learning, and neural networks.
Extremal optimization is a relatively new optimization method inspired by the far-from-equilibrium dynamics of self-organized criticality, and it has been successfully applied to various combinatorial optimization problems. Its basic principle is to select the individual with the lowest fitness value in the current solution, together with its associated variables, for mutation, so that the system continually improves toward the optimal solution. The algorithm therefore does not settle into an equilibrium state but fluctuates continually, which improves its search ability over the solution domain.
Much research at home and abroad addresses improving neural network training, but most of it iteratively optimizes the training parameters of the network while ignoring the important influence of the initial weights on network performance; such methods commonly suffer from entrapment in local minima, long training time, and slow convergence.
Disclosure of Invention
Aiming at the defects of the prior art, the invention provides a fractional-order deep BP neural network optimization method based on extremum optimization, which uses a population-based extremal optimization (PEO) algorithm to select favorable initial weights for a fractional-order deep BP (FODBP) neural network, mitigating the adverse effect of the initial weights on the network.
The fractional-order deep BP neural network optimization method based on extremum optimization specifically comprises the following steps:
Step one, establishing a fractional-order deep BP neural network model;
Establish an L-layer deep BP neural network model, wherein the number of neuron nodes in layer $l$ is $n_l$, $l = 1, 2, \ldots, L$, and $w^l_{ij}$, $i = 1, 2, \ldots, n_l$, $j = 1, 2, \ldots, n_{l+1}$, is the weight between layer $l$ and layer $l+1$ of the neural network. $f_l(\cdot)$ is the activation function of layer $l$, X is the input sample of the neural network, O is the ideal output for input sample X, $Z^l$ is the input of the layer-$l$ neural network, and $A^l$ is the output of the layer-$l$ neural network. $A^l$ consists of two kinds of nodes: the external nodes $A^l_E$, which are not connected to the neuron nodes of the previous layer, and the internal nodes $A^l_I$, which are fully connected to the neuron nodes of the previous layer. Thus, the forward propagation process of the neural network can be expressed as:

$$Z^{l+1} = (W^l)^{\mathrm T} A^l, \qquad A^{l+1} = \begin{bmatrix} A^{l+1}_E \\ f_{l+1}(Z^{l+1}) \end{bmatrix} \tag{1}$$
the neural network has a loss function ofWherein, I is Euclidean norm, and Sigma is sum symbol;
the back propagation process of the fractional order depth BP neural network can be divided into two parts of gradient propagation and gradient calculation. In the gradient propagation process, the gradient propagation term of the first layer of the neural network can be defined as:
in the method, in the process of the invention,representing the partial derivative;
delta according to the chain law l And delta l+1 The relationship of (2) can be expressed as:
wherein f' (. Cndot.) represents the first derivative of f (. Cndot.);
thus, the fractional gradient computation under the Caputo definition can be expressed as:
where v is a fraction, representing the order of the fractional order,representing the weight W of the loss function E under the definition of the fractional derivative of Caputo l Solving fractional derivatives, wherein Γ (·) is a gamma function;
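For orientation, the Caputo fractional derivative of order $v \in (0, 1)$ with lower limit $w_0$, and the first-term truncation commonly used in fractional-order gradient descent, are shown below; this is a standard construction consistent with the $\Gamma(2-v)$ factor above, not a verbatim reproduction of the patent's derivation:

$$ {}_{w_0}^{C}D_w^v\, f(w) = \frac{1}{\Gamma(1-v)} \int_{w_0}^{w} \frac{f'(\tau)}{(w-\tau)^v}\,\mathrm{d}\tau \;\approx\; f'(w)\,\frac{(w-w_0)^{1-v}}{\Gamma(2-v)} $$

The approximation follows by treating $f'(\tau) \approx f'(w)$ inside the integral and using $\Gamma(2-v) = (1-v)\,\Gamma(1-v)$.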
the weight correction expression of each layer of the neural network is as follows:
wherein t is a natural number, represents the current training algebra of the neural network, and mu is the learning rate.
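As an illustration, the following is a minimal NumPy sketch of one Caputo fractional-order weight update for a single layer, assuming the reconstructed forms of Eqs. (4) and (5) above; the names (`W0`, `A_prev`, `delta_next`) and the absolute value guarding the fractional power are illustrative assumptions, not taken from the patent.

```python
import numpy as np
from scipy.special import gamma  # the gamma function, Γ(·)

def caputo_fractional_update(W, W0, A_prev, delta_next, v=0.9, mu=0.1, eps=1e-8):
    """One fractional-order gradient step for a layer's weights (sketch of Eqs. 4-5).

    W          -- current weights W^l(t), shape (n_l, n_{l+1})
    W0         -- initial weights W^l(0), the lower limit of the Caputo derivative
    A_prev     -- layer output A^l, shape (n_l,)
    delta_next -- gradient propagation term delta^{l+1}, shape (n_{l+1},)
    v          -- fractional order, 0 < v < 1
    mu         -- learning rate
    """
    grad = np.outer(A_prev, delta_next)  # integer-order gradient dE/dW^l = A^l (delta^{l+1})^T
    # element-wise (W - W0)^(1-v) / Γ(2-v); the abs() and eps keep the
    # fractional power real and finite (an assumption, common in practice)
    frac = (np.abs(W - W0) + eps) ** (1.0 - v) / gamma(2.0 - v)
    return W - mu * grad * frac  # Eq. (5): W^l(t+1) = W^l(t) - mu * d^v E / d(W^l)^v
```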
Step two, initializing parameters;
The parameters initialized in the method include: the maximum number of neural network training epochs $I_{max}$, the learning rate $\mu$, the fractional order v, the population size T, the number of population iterations G, the number of training epochs $I_f$ used when computing an individual's fitness value, and the mutation factor b.
Step three, generating an initial population;
$$P = x_{min} + (x_{max} - x_{min}) \times \mathrm{rand}(T, c) \tag{6}$$

Randomly generate an initial population $P = \{S_1, S_2, \ldots, S_T\}$ according to Eq. (6), where $S_p = \{x_1, x_2, \ldots, x_c\}$, $p = 1, 2, \ldots, T$, and c represents the total number of weights contained in the neural network; $x_{max}$ and $x_{min}$ are the upper and lower bounds of the weights, respectively, and rand(T, c) generates a T×c random matrix with entries in (0, 1).
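A direct NumPy rendering of Eq. (6) might look as follows (a sketch; the default bounds and the seed handling are illustrative):

```python
import numpy as np

def init_population(T, c, x_min=-1.0, x_max=1.0, seed=None):
    """Generate an initial population P: T individuals, each a c-vector of weights (Eq. 6)."""
    rng = np.random.default_rng(seed)
    return x_min + (x_max - x_min) * rng.random((T, c))  # rand(T, c), entries in (0, 1)
```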
Step four, calculating individual fitness;
After decoding, an individual $S_p$ of the population is substituted into the FODBP neural network as its initial weights; training samples are input into the network for training, the network weights are corrected according to step one, and after the number of training epochs reaches $I_f$, the training accuracy is obtained and used as the fitness function value $f_p$ of the individual.
Step five, selecting individuals;
Sort the T individuals of the population by their fitness function values $f_p$ so that $f_{\Pi 1} > f_{\Pi 2} > \ldots > f_{\Pi T}$; replace individuals $\Pi(T/2+1)$ to $\Pi T$ with individuals $\Pi 1$ to $\Pi(T/2)$, and obtain the optimal individual $S_{best} = S_{\Pi 1}$ and the optimal fitness value $f_{best} = f_{\Pi 1}$.
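A sketch of the selection step, under the reconstructed reading above that the better half of the ranked population replaces the worse half:

```python
import numpy as np

def select(P, fitness):
    """Rank individuals so f_Pi1 > f_Pi2 > ... > f_PiT, copy the top half over
    the bottom half, and return the new population with the best individual."""
    order = np.argsort(fitness)[::-1]          # descending fitness
    P_sorted = P[order].copy()
    half = len(P_sorted) // 2
    P_sorted[-half:] = P_sorted[:half]         # Pi1..Pi(T/2) replace Pi(T/2+1)..PiT
    return P_sorted, P_sorted[0].copy(), float(fitness[order[0]])
```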
Step six, generating a new population;
Perform a mutation operation on the population selected in step five according to the non-uniform mutation (NUM) rule to generate a new population $P_N = \{S_{N1}, S_{N2}, \ldots, S_{NT}\}$. The non-uniform mutation rule is:

$$x' = \begin{cases} x + (x_{max} - x)\,e\,(1 - g/G)^b, & x \le x_{mid} \\[2pt] x - (x - x_{min})\,e\,(1 - g/G)^b, & x > x_{mid} \end{cases} \tag{7}$$

wherein e is a random number in (0, 1), $x_{mid} = (x_{max} + x_{min})/2$, b is the mutation factor, and the positive integer g is the current iteration number of the population extremal optimization algorithm.
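A vectorized sketch of the non-uniform mutation, following the reconstructed rule (7); the branch on $x_{mid}$ (mutating toward the farther bound) is an assumption consistent with the definitions given:

```python
import numpy as np

def non_uniform_mutation(P, g, G, b=3.0, x_min=-1.0, x_max=1.0, rng=None):
    """Non-uniform mutation (NUM) of the whole population; the mutation step
    shrinks as the iteration count g approaches the limit G (rule 7 sketch)."""
    rng = rng or np.random.default_rng()
    e = rng.random(P.shape)                    # random numbers in (0, 1)
    x_mid = 0.5 * (x_max + x_min)
    decay = (1.0 - g / G) ** b                 # annealing factor
    return np.where(P <= x_mid,
                    P + (x_max - P) * e * decay,
                    P - (P - x_min) * e * decay)
```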
Step seven, iterative loop;
Repeat steps four to six to obtain a new optimal individual $S_{Nbest}$ with fitness value $f_{Nbest}$; if $f_{Nbest} > f_{best}$, save the new optimal individual and optimal fitness value, i.e. $S_{best} = S_{Nbest}$, $f_{best} = f_{Nbest}$. Loop until the number of population iterations reaches G; the global optimal solution $S_{best}$ is then decoded and substituted into the FODBP neural network as its initial weights, completing the optimization of the fractional-order deep BP neural network.
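Putting steps three to seven together, a schematic driver loop could look like this; `fitness_fn` (which decodes an individual into FODBP initial weights, trains for $I_f$ epochs, and returns training accuracy) is left abstract, and `init_population`, `select`, and `non_uniform_mutation` are the sketches above:

```python
import numpy as np

def peo_optimize(fitness_fn, T, c, G, x_min=-1.0, x_max=1.0, seed=0):
    """Population extremal optimization of the FODBP initial weights (steps 3-7)."""
    P = init_population(T, c, x_min, x_max, seed)             # step three
    S_best, f_best = None, -np.inf
    for g in range(1, G + 1):
        fitness = np.array([fitness_fn(ind) for ind in P])    # step four
        P, S_new, f_new = select(P, fitness)                  # step five
        if f_new > f_best:                                    # step seven bookkeeping
            S_best, f_best = S_new, f_new
        P = non_uniform_mutation(P, g, G, x_min=x_min, x_max=x_max)  # step six
    return S_best, f_best  # decode S_best into the FODBP network as initial weights
```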
The invention has the following beneficial effects:
1. The initial weights of the fractional-order deep BP neural network are selected by the population extremal optimization algorithm, improving the performance of the neural network.
2. The tendency of the fractional-order deep BP neural network to become trapped in local extrema is significantly reduced, and its performance is improved, giving it wider applicability.
3. The population extremal optimization method and the fractional derivative calculation can be extended to other neural network models, advancing research on neural network optimization.
Drawings
FIG. 1 is a conceptual diagram of the fractional-order deep BP neural network;
FIG. 2 is a flow chart of the optimization method;
FIG. 3 is a structural diagram of the fractional-order deep BP neural network built in the embodiment;
FIG. 4 shows the error convergence of the embodiment compared with other algorithms.
Detailed Description
The invention is further explained below with reference to the drawings.
In the fractional-order deep BP neural network optimization method based on extremum optimization, the structure of the fractional-order deep BP neural network is shown in FIG. 1, and FIG. 2 is a flow chart of the optimization method.
Establish the fractional-order deep BP neural network shown in FIG. 3. The network has 8 layers in total: each of the first four layers has 196 external nodes and 32 internal nodes, and the input to the internal nodes of the first layer is 1; the last four layers have no external nodes and each has 64 internal nodes. The activation function of every layer is the sigmoid function. For handwritten digit recognition, the simulation experiment was programmed in MATLAB and uses the MNIST handwritten digit data set, which consists of 60,000 training samples and 10,000 test samples of normalized handwritten digit images. Each digit image contains 28×28 gray values between 0 and 255. In the simulation, each digit image is divided into 4 parts of 14×14 gray values, each part is arranged into a 196×1 vector, and the 4 groups of data are input into the neural network in turn for training.
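The input arrangement described above (each 28×28 digit image split into four 14×14 quadrants, each flattened to a 196×1 vector) could be sketched as follows; the quadrant ordering and the 0-1 normalization are assumptions:

```python
import numpy as np

def split_digit_image(img):
    """Split a 28x28 grayscale digit image into four 14x14 quadrants, each
    flattened to a 196-vector for the 196 external nodes of the first four layers."""
    assert img.shape == (28, 28)
    quads = [img[:14, :14], img[:14, 14:], img[14:, :14], img[14:, 14:]]
    return [q.reshape(196).astype(np.float64) / 255.0 for q in quads]  # gray 0-255 -> 0-1
```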
The parameters of the simulation are shown in table 1.
TABLE 1
Here, the batch size is the number of samples input at one time. The generated initial weights of the neural network obey a standard normal distribution.
Table 2 shows the training and test accuracy of the FODBP neural network; Table 3 shows the training and test accuracy of the PEO-FODBP neural network. The data in the two tables show that the population extremal optimization algorithm markedly improves both the training and the test accuracy of the fractional-order deep BP neural network, significantly improving its performance.
TABLE 2
TABLE 3
FIG. 4 shows the error convergence of the PEO-FODBP and FODBP algorithms over the course of training; the fractional-order deep BP neural network optimized by the PEO algorithm converges faster and to higher accuracy. From the simulation results it can be concluded that the fractional-order deep BP neural network method based on population extremal optimization proposed by the invention performs markedly better than the plain fractional-order deep BP neural network, and can improve the application performance of fractional-order deep BP neural networks in various fields.

Claims (6)

1. A fractional-order deep BP neural network optimization method based on extremum optimization, characterized in that the method specifically comprises the following steps:
step one, establishing a fractional-order deep BP neural network model;
establishing an L-layer deep BP neural network model, wherein the number of neuron nodes in layer $l$ is $n_l$, $l = 1, 2, \ldots, L$; $w^l_{ij}$, $i = 1, 2, \ldots, n_l$, $j = 1, 2, \ldots, n_{l+1}$, is the weight between layer $l$ and layer $l+1$ of the neural network; X is the input sample of the neural network, O is the ideal output for input sample X, $Z^l$ is the input of the layer-$l$ neural network, and $A^l$ is the output of the layer-$l$ neural network; the loss function of the neural network is $E = \frac{1}{2} \sum \lVert O - A^L \rVert^2$, wherein $\lVert \cdot \rVert$ is the Euclidean norm and $\Sigma$ is the summation symbol;
fractional order gradient under the Caputo definitionE is:
wherein delta l The gradient propagation term of the first layer of the neural network, v is a fraction, representing the order of the fractional order,e represents the weight W of the loss function E under the definition of the fractional derivative of Caputo l Solving fractional derivatives, wherein Γ (·) is a gamma function;
the weight correction expression for each layer of the neural network is:

$$W^l(t+1) = W^l(t) - \mu\,\frac{\partial^v E}{\partial (W^l)^v} \tag{5}$$

wherein t is a natural number representing the current training epoch of the neural network, and $\mu$ is the learning rate;
step two, initializing parameters;
the parameters initialized in the method include: the maximum number of neural network training epochs $I_{max}$, the learning rate $\mu$, the fractional order v, the population size T, the number of population iterations G, the number of training epochs $I_f$ used when computing an individual's fitness value, and the mutation factor b;
step three, generating an initial population;
randomly generating an initial population $P = \{S_1, S_2, \ldots, S_T\}$, where $S_p = \{x_1, x_2, \ldots, x_c\}$, $p = 1, 2, \ldots, T$, and c represents the total number of weights contained in the neural network;
step four, calculating individual fitness;
after decoding, an individual $S_p$ of the population is substituted into the FODBP neural network as its initial weights; MNIST handwritten digit data are input into the network for training, the network weights are corrected according to step one, and after the number of training epochs reaches $I_f$, the training accuracy is obtained and used as the fitness function value $f_p$ of the individual;
Step five, selecting individuals;
sorting the individuals of the T populations according to their fitness function values so that f Π1 >f Π2 >...>f ΠT Selecting pi 1 toIndividual replacement of->To pi T, and obtaining an optimal individual S best =S Π1 Optimal fitness value f best =f Π1
Step six, generating a new population;
performing mutation operation on the population selected in the fifth step according to the non-uniform mutation rule to generate a new population P N ={S N1 ,S N2 ,…,S NT };
Step seven, iteration circulation;
repeating the fourth to sixth steps to obtain a new optimal individual S Nbest Fitness value f Nbest If f Nbest >f best Then the new optimal individual and the optimal fitness value are saved, namely S best =S Nbest ,f best =f Nbest Cycling for multiple times until the number of times of group iterative optimization reaches G, and obtaining a global optimal solution S best After decoding, substituting the obtained product into the FODBP neural network as an initial weight to finish fractional order depthOptimizing a BP neural network;
step eight, character recognition;
inputting handwritten digits into the fractional-order deep BP neural network optimized in step seven, and the network outputs the recognition result.
2. The fractional-order deep BP neural network optimization method based on extremum optimization according to claim 1, characterized in that: the output $A^l$ of the layer-$l$ neural network consists of two kinds of nodes: $A^l_E$ are external nodes, not connected to the neuron nodes of the previous layer; $A^l_I$ are internal nodes, fully connected to the neuron nodes of the previous layer; $f_l(\cdot)$ is the activation function of layer $l$, and the forward propagation process of the network is:

$$Z^{l+1} = (W^l)^{\mathrm T} A^l, \qquad A^{l+1} = \begin{bmatrix} A^{l+1}_E \\ f_{l+1}(Z^{l+1}) \end{bmatrix} \tag{1}$$
3. The fractional-order deep BP neural network optimization method based on extremum optimization according to claim 1 or 2, characterized in that: the activation function $f_l(\cdot)$ of layer $l$ of the neural network is a sigmoid function.
4. The fractional-order deep BP neural network optimization method based on extremum optimization according to claim 1, characterized in that: the back-propagation process of the fractional-order deep BP neural network is divided into two parts, gradient propagation and gradient calculation; in the gradient propagation process, the gradient propagation term $\delta^l$ of layer $l$ of the neural network is:

$$\delta^l = \frac{\partial E}{\partial Z^l} \tag{2}$$

according to the chain rule, the relation between $\delta^l$ and $\delta^{l+1}$ is:

$$\delta^l = f'_l(Z^l) \odot \left( W^l \delta^{l+1} \right) \tag{3}$$

wherein $f'(\cdot)$ represents the first derivative of $f(\cdot)$.
5. The fractional-order deep BP neural network optimization method based on extremum optimization according to claim 1, characterized in that: the random generation method of the initial population is:

$$P = x_{min} + (x_{max} - x_{min}) \times \mathrm{rand}(T, c) \tag{6}$$

wherein $x_{max}$ and $x_{min}$ are the upper and lower bounds of the weights, respectively, and rand(T, c) generates a T×c random matrix with entries in (0, 1).
6. The fractional-order deep BP neural network optimization method based on extremum optimization according to claim 1, characterized in that: the non-uniform mutation rule for the population in step six is:

$$x' = \begin{cases} x + (x_{max} - x)\,e\,(1 - g/G)^b, & x \le x_{mid} \\[2pt] x - (x - x_{min})\,e\,(1 - g/G)^b, & x > x_{mid} \end{cases} \tag{7}$$

wherein e is a random number in (0, 1), $x_{mid} = (x_{max} + x_{min})/2$, b is the mutation factor, and the positive integer g is the current iteration number of the population extremal optimization algorithm.
CN202110484178.8A 2021-04-30 2021-04-30 Fractional-order deep BP neural network optimization method based on extremum optimization Active CN113159299B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202110484178.8A CN113159299B (en) Fractional-order deep BP neural network optimization method based on extremum optimization


Publications (2)

Publication Number Publication Date
CN113159299A CN113159299A (en) 2021-07-23
CN113159299B true CN113159299B (en) 2024-02-06

Family

ID=76873052

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202110484178.8A Active CN113159299B (en) 2021-04-30 2021-04-30 Fractional order depth BP neural network optimization method based on extremum optimization

Country Status (1)

Country Link
CN (1) CN113159299B (en)

Citations (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN110046710A (en) * 2019-04-11 2019-07-23 山东师范大学 A kind of the nonlinear function Extremal optimization method and system of neural network
CN111160520A (en) * 2019-12-06 2020-05-15 南京理工大学 BP neural network wind speed prediction method based on genetic algorithm optimization


Also Published As

Publication number Publication date
CN113159299A (en) 2021-07-23

Similar Documents

Publication Publication Date Title
Zhu et al. Multi-objective evolutionary federated learning
Kim et al. SplitNet: Learning to semantically split deep networks for parameter reduction and model parallelization
Zhang et al. AS-NAS: Adaptive scalable neural architecture search with reinforced evolutionary algorithm for deep learning
CN109948029A (en) Based on the adaptive depth hashing image searching method of neural network
CN111898689B (en) Image classification method based on neural network architecture search
CN107330902B (en) Chaotic genetic BP neural network image segmentation method based on Arnold transformation
CN108268947A (en) For improving the device and method of the processing speed of neural network and its application
Zou et al. Bare-bones teaching-learning-based optimization
Li et al. Automatic design of convolutional neural network architectures under resource constraints
CN103914527B (en) Graphic image recognition and matching method based on genetic programming algorithms of novel coding modes
CN114118369A (en) Image classification convolution neural network design method based on group intelligent optimization
CN113157919A (en) Sentence text aspect level emotion classification method and system
Hao et al. Annealing genetic GAN for imbalanced web data learning
CN111126560A (en) Method for optimizing BP neural network based on cloud genetic algorithm
CN113159299B (en) Fractional order depth BP neural network optimization method based on extremum optimization
Phan et al. Efficiency enhancement of evolutionary neural architecture search via training-free initialization
CN107590538B (en) Danger source identification method based on online sequence learning machine
CN113971367A (en) Automatic design method of convolutional neural network framework based on shuffled frog-leaping algorithm
CN111695689B (en) Natural language processing method, device, equipment and readable storage medium
CN112819161B (en) Neural network construction system, method and storage medium for variable-length gene genetic algorithm
CN110059806B (en) Multi-stage weighted network community structure detection method based on power law function
Hu et al. Apenas: An asynchronous parallel evolution based multi-objective neural architecture search
Brouwer A hybrid neural network for input that is both categorical and quantitative
CN111639797A (en) Gumbel-softmax technology-based combined optimization method
CN109918659A (en) A method of based on not retaining optimum individual genetic algorithm optimization term vector

Legal Events

Code Title
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant