CN113159299A

CN113159299A - Fractional order depth BP neural network optimization method based on extremum optimization

Info

Publication number: CN113159299A
Application number: CN202110484178.8A
Authority: CN
Inventors: 陈碧鹏; 陈云; 曾国强; 佘青山
Original assignee: Hangzhou Dianzi University
Current assignee: Hangzhou Dianzi University
Priority date: 2021-04-30
Filing date: 2021-04-30
Publication date: 2021-07-23
Anticipated expiration: 2041-04-30
Also published as: CN113159299B

Abstract

The invention discloses a fractional order deep BP neural network optimization method based on extremum optimization. The fractional order system has the advantages of high convergence speed and high convergence accuracy, the initial weight of the established fractional order depth BP (PEO-FODBP) neural network is preferentially selected by utilizing a group extremum optimization algorithm, and in the iterative training process of the network, the optimal individuals in the group and the optimal fitness value of the optimal individuals are iteratively optimized while the network level weight is corrected, so that the adverse effect of the initial weight on the neural network is improved. The problems of easy falling into local minimum, long time consumption and low convergence speed in the prior art are solved. By utilizing the optimization method, the application performance of the fractional order deep BP neural network in various fields can be improved. In addition, the group extremum optimization method and the fractional derivative calculation method in the method can be popularized to other more neural network models, and the optimization research of the neural network is promoted.

Description

Fractional order depth BP neural network optimization method based on extremum optimization

Technical Field

The invention belongs to the technical field of artificial intelligence, relates to a fractional order deep BP neural network optimization method based on extremum optimization, and particularly relates to modeling of a deep BP neural network and design of an extremum optimization algorithm.

Background

Neural networks have received considerable attention from researchers in the fields of machine learning, statistics, and computer vision as a powerful tool for data regression and classification. The deep BP neural network is a network system formed by a large number of processing units developed by inspiring by the working principle of biological neural tissue, has the basic characteristics of a biological neural system, and has the advantages of large-scale parallel operation, distributed processing, self-adaptation and self-learning.

Fractional calculus theory has been used as a classic concept in mathematics for hundreds of years, is based on any fractional differentiation and integration, and is a popular popularization of integral calculus at present. Compared with the traditional integer order system, the fractional order system has the advantages of high convergence speed and high convergence accuracy, so that the fractional order system is widely applied to the fields of image processing, machine learning and neural networks.

Extreme value optimization is a new optimization method developed by being inspired by equilibrium dynamics far away from self-organization criticality, and has been successfully applied to various combined optimization problems. The basic principle of the extremum optimization algorithm is to select the individual with the lowest fitness value in the current solving range and the related variable thereof for variation, so that the system is continuously improved towards the optimal solution. Therefore, the extremum optimization algorithm does not end up in an equilibrium state, but fluctuates continuously to improve the searching ability of the algorithm in the solution domain.

Many researches on improving neural network training have been carried out at home and abroad, but most of the researches are carried out on the training parameters of the network in an iterative optimization mode, the important influence of the initial weight of the neural network on the performance of the neural network is ignored, and the methods generally have the problems of easy falling into local minimum values, long time consumption and low convergence speed.

Disclosure of Invention

Aiming at the defects of the prior art, the invention provides an extremum optimization-based fractional order depth BP neural network optimization method, wherein a Population extremum optimization algorithm (PEO) is used for carrying out preferred selection on an initial weight of a fractional order depth BP (PEO-FODBP) neural network, and the adverse effect of the initial weight on the neural network is improved.

A fractional order depth BP neural network optimization method based on extremum optimization specifically comprises the following steps:

step one, establishing a fractional order depth BP neural network model;

establishing an L-layer deep BP neural network model, wherein the number of neuron nodes of the L-th layer is n^l，l＝1,2,...,L，

Is the weight between the l layer and the (l +1) layer of the neural network, i is 1,2^l,j＝1,2,...,n^l ⁺¹。f_l(. h) is the activation function of layer l, X is the input sample of the neural network, O is the ideal output of the input sample X,

is an input of the l-th neural network, A^lIs the output of the l-th layer neural network.

The system is composed of two parts of nodes,

is an external node which is not connected with the neuron node of the upper layer;

it is an internal node, and is fully connected with the neuron node of the upper layer. Thus, the forward propagation process of a neural network can be expressed as:

the loss function of the neural network is

Wherein, | | · | is an euclidean norm, and Σ is a summation symbol;

the back propagation process of the fractional order deep BP neural network can be divided into two parts of gradient propagation and gradient calculation. In the gradient propagation process, the gradient propagation term of the l layer of the neural network can be defined as:

in the formula (I), the compound is shown in the specification,

representing the partial derivative calculation;

according to the chain rule, δ^lAnd delta^l+1The relationship of (c) can be expressed as:

wherein f' (. cndot.) represents the first derivative of f (-);

thus, the fractional order gradient calculation under the definition of Caputo can be expressed as:

wherein v is a fraction representing the order of the fractional order,

represents the loss function E versus the weight W under the definition of the Caputo fractional derivative^lCalculating a fractional derivative, wherein gamma (·) is a gamma function;

the weight correction expression of each layer of the neural network is as follows:

wherein t is a natural number and represents the current training algebra of the neural network, and mu is the learning rate.

Initializing parameters;

initializing various parameters in the method, including: number of neural network trainings I_maxLearning rate mu, fractional order v, population size T, population iteration times G, and neural network training times I for calculating individual fitness value_fAnd a mutation factor b.

Step three, generating an initial population;

P＝x_min+(x_max-x_min)×rand(T,c) (6)

randomly generating an initial population P ═ S according to equation (6)₁,S₂,...,S_TIn which S is_p＝{x₁,x₂,...,x_c}，p＝1,2,...,T，

Representing the total number of weights, x, contained in the neural network_maxAnd x_minThe upper and lower bounds of the weight are represented, respectively, and rand (T, c) represents the generation of a random matrix with dimension T × c ranging between (0, 1).

Step four, calculating individual fitness;

the individuals S of the population_pThe decoded product is input into FODBP neural network as initial weight, then the training sample is input into the neural network for training, the weight of the neural network is corrected according to the steps, and the training frequency reaches I_fThen obtaining the training precision and taking the training precision as the fitness function value f of the individual_p。

Step five, selecting an individual;

the individuals of the T populations are subjected to fitness function values f according to the individuals_pSorting is performed so that f_Π1>f_Π2>...>f_ΠTSelecting pi 1 to pi

Individual replacement of pi

Individuals to pi T and obtaining optimal individuals S_best＝S_Π1And an optimal fitness value f_best＝f_Π1。

Step six, generating a new population;

performing mutation operation on the population selected in the fifth step according to Non-uniform mutation (NUM) rule to generate a new population P_N＝{S_N1,S_N2,…,S_NT}; the non-uniform variation rule is as follows:

wherein e is a random number between (0,1), x_mid＝(x_max+x_min)/2，

The positive integer g is the current iteration number of the group extremum optimization algorithm.

Step seven, iterative loop;

repeating the fourth step to the sixth step to obtain a new optimal individual S_NbestAnd a fitness value f_NbestIf f is_Nbest>f_bestThe new optimal individual is then saved with the optimal fitness value, i.e. S_best＝S_Nbest,f_best＝f_NbestCirculating for multiple times until the iterative optimization times of the group reach G, and obtaining the global optimal solution S_bestAnd the decoded result is input into the FODBP neural network as an initial weight to complete the optimization of the fractional order depth BP neural network.

The invention has the following beneficial effects:

1. and carrying out preferential selection on the initial weight of the fractional-order depth BP neural network by utilizing a population extremum optimization algorithm, thereby achieving the purpose of improving the performance of the neural network.

2. The problem that the fractional order depth BP neural network is easy to fall into a local extremum is remarkably improved, and the performance of the fractional order depth BP neural network is improved, so that the fractional order depth BP neural network has stronger applicability.

3. The group extremum optimization method and the fractional derivative calculation method can be popularized to other more neural network models, and the optimization research of the neural network is promoted.

Drawings

FIG. 1 is a conceptual diagram of a fractional order deep BP neural network;

FIG. 2 is a flow chart of the present optimization method;

FIG. 3 is a diagram of a fractional depth BP neural network structure built in the embodiment;

FIG. 4 is a graph of error convergence for the embodiment and other algorithms.

Detailed Description

The invention is further explained below with reference to the drawings;

a fractional order deep BP neural network optimization method based on extremum optimization is disclosed, the structure of the fractional order deep BP neural network is shown in figure 1, and figure 2 is a flow chart of the optimization method.

Establishing a fractional order deep BP neural network as shown in FIG. 3, wherein the network has 8 layers, external nodes of the first four layers are 196, internal nodes are 32, and the input of the internal node of the first layer neural network is 1; the four layers of neural networks have 64 internal nodes without external nodes. The activation function of each layer of the neural network is a sigmoid function. The method is used for recognizing handwritten fonts, MATLAB is adopted for software programming in a simulation experiment, and MNIST handwritten digital data is adopted for recognition and simulation. The MNIST handwritten digital data consists of 60000 training data and 10000 test data, each set of data being standardized by handwritten digital images. Wherein, a digital image has 28 multiplied by 28 gray data, and the gray value is between 0 and 255. In the simulation experiment, a digital image is divided into 4 parts, each part has 14 × 14 gray data, the gray data are arranged into 196 × 1 vectors, and finally 4 groups of data are respectively input into a neural network for training.

The parameters of the simulation are shown in table 1.

TABLE 1

Wherein, the Batch size is the number of the one-time input samples. The initial weight of the generated neural network follows standard normal distribution.

Table 2 shows the training accuracy and the testing accuracy of the FODBP neural network. Table 3 shows the training accuracy and testing accuracy of the PEO-FODBP neural network. From the data in the two tables, it can be found that the training precision and the testing precision of the fractional order deep BP neural network are obviously improved by using the group extremum optimization algorithm, and the performance of the fractional order deep BP neural network is obviously improved.

TABLE 2

TABLE 3

Fig. 4 is an error convergence graph of the PEO-FODBP algorithm and the FODBP algorithm of the present invention along with the training process, and it can be found from the graph that the convergence speed of the fractional order deep BP neural network optimized by the PEO algorithm is faster and the convergence accuracy is higher. According to the simulation experiment result, the following conclusion can be drawn: the method provides a fractional order depth BP neural network method based on population extremum optimization, and compared with a common fractional order depth BP neural network, the performance is obviously improved. By utilizing the method provided by the invention, the application performance of the fractional order deep BP neural network in various fields can be improved.

Claims

1. A fractional order depth BP neural network optimization method based on extremum optimization is characterized in that: the method specifically comprises the following steps:

step one, establishing a fractional order depth BP neural network model;

establishing an L-layer deep BP neural network model, wherein the number of neuron nodes of the L-th layer is n^l，l＝1,2,2,L，

As the weight between the l layer and the (l +1) layer of the neural network, i is 1,2,2, n^l,j＝1,2,2,n^l+1(ii) a X is the input sample of the neural network, O is the ideal output of the input sample X,

is an input of the l-th neural network, A^lIs the output of the l layer neural network; the loss function of the neural network is

Wherein, | | · | is an euclidean norm, and Σ is a summation symbol;

fractional step size under Caputo definition

E is as follows:

wherein, delta^lIs the gradient propagation term of the l layer of the neural network, v is a fraction, represents the order of the fractional order,

e represents the loss function E versus the weight W under the definition of the Caputo fractional derivative^lCalculating a fractional derivative, wherein gamma (·) is a gamma function;

wherein t is a natural number and represents the current training algebra of the neural network, and mu is a learning rate;

initializing parameters;

initializing various parameters in the method, including: number of neural network trainings I_maxLearning rate mu, fractional order v, population size T, population iteration times G, and neural network training times I for calculating individual fitness value_fA mutation factor b;

step three, generating an initial population;

randomly generating an initial population P ═ S₁,S₂,2,S_TIn which S is_p＝{x₁,x₂,...,x_c}，p＝1,2,...,T，

Representing the total number of weights contained in the neural network;

step four, calculating individual fitness;

the individuals S of the population_pInputting the decoded data into FODBP neural network as initial weight, inputting MNIST handwritten digital data into the neural network for training, correcting the weight of the neural network according to the steps, and training the number of times to reach I_fThen obtaining the training precision and taking the training precision as the fitness function value f of the individual_p；

Step five, selecting an individual;

ordering the individuals of the T populations according to the fitness function values thereof so that f_Π1>f_Π2>...>f_ΠTSelecting pi 1 to

Individual replacement of

Individuals to pi T and obtaining optimal individuals S_best＝S_Π1And an optimal fitness value f_best＝f_Π1；

Step six, generating a new population;

performing variation operation on the population selected in the fifth step according to a non-uniform variation rule to generate a new population P_N＝{S_N1,S_N2,…,S_NT}；

Step seven, iterative loop;

repeating the fourth step to the sixth step to obtain a new optimal individual S_NbestAnd a fitness value f_NbestIf f is_Nbest>f_bestThe new optimal individual is then saved with the optimal fitness value, i.e. S_best＝S_Nbest,f_best＝f_NbestCirculating for multiple times until the iterative optimization times of the group reach G, and obtaining the global optimal solution S_bestThe decoded offspring is put into an FODBP neural network to be used as an initial weight to complete the optimization of the fractional order depth BP neural network;

step eight, character recognition

Inputting handwritten numbers to the fractional order depth BP neural network optimized in the step seven, and outputting a recognition result by the network.

2. The extremum optimization-based fractional order deep BP neural network optimization method of claim 1, wherein: output of layer I neural network

The system is composed of two parts of nodes,

the internal node is fully connected with the neuron node of the upper layer; f. of_l(. h) is the activation function of the l-th layer, and the forward propagation process of the network is as follows:

3. the extremum optimization-based fractional order deep BP neural network optimization method of claim 1 or 2, wherein: activation function f of layer I of neural network_l(. cndot.) is sigmoid function.

4. The extremum optimization-based fractional order deep BP neural network optimization method of claim 1, wherein: the back propagation process of the fractional depth BP neural network is divided into two parts of gradient propagation and gradient calculation; in the gradient propagation process, the gradient propagation term delta of the l layer of the neural network^lComprises the following steps:

according to the chain rule, δ^lAnd delta^l+1The relationship of (1) is:

where f' (. cndot.) represents the first derivative of f (-).

5. The extremum optimization-based fractional order deep BP neural network optimization method of claim 1, wherein: the random generation method of the initial population comprises the following steps:

P＝x_min+(x_max-x_min)×rand(T,c) (6)

wherein x_maxAnd x_minThe upper and lower bounds of the weight are represented, respectively, and rand (T, c) represents the generation of a random matrix with dimension T × c ranging between (0, 1).

6. The extremum optimization-based fractional order deep BP neural network optimization method of claim 1, wherein: the uneven variation rule of the population in the sixth step is as follows:

wherein e is a random number between (0,1), x_mid＝(x_max+x_min)/2，