CN114220127B

CN114220127B - Image recognition method based on gradient guided evolutionary algorithm

Info

Publication number: CN114220127B
Application number: CN202111639517.1A
Authority: CN
Inventors: 田野; 石子睿; 杨尚尚; 张兴义
Original assignee: Anhui University
Current assignee: Anhui University
Priority date: 2021-12-29
Filing date: 2021-12-29
Publication date: 2024-05-10
Anticipated expiration: 2041-12-29
Also published as: CN114220127A

Abstract

The invention discloses an image recognition method based on a gradient-guided evolutionary algorithm, which mainly comprises the following steps: 1. acquiring image samples to construct a training sample data set; 2. initializing a parent population, using the gSBX operator to generate an offspring population during mating selection, merging the parent population into the offspring population, performing non-dominated sorting on the merged population, and selecting the top-ranked individuals from the sorted population as the optimal individual population; 3. deleting the dominated solutions from the optimal individual population, fine-tuning the weight variables of each remaining individual with sparse stochastic gradient descent (SGD), and selecting the weight variable set of an individual in the knee region of the population's Pareto front as the variables of the final trained model. By optimizing the image recognition model with an evolutionary algorithm, the invention can improve the accuracy of the model in image recognition and reduce the training cost and memory consumption of the neural network.

Description

Image recognition method based on gradient guided evolutionary algorithm

Technical Field

The invention relates to the technical field of artificial intelligence, in particular to an image recognition method.

Background

Image recognition works by comparing stored information with the information in the current picture through a series of processing steps. It is an important field of artificial intelligence; face recognition, for example, is a biometric technology that identifies people based on their facial feature information. As the technology advances, how to extract target features quickly and efficiently and build a corresponding image recognition model remains a key problem in image recognition.

The image recognition method in common use today is the convolutional neural network (CNN). Since the weights of a CNN play a decisive role in its classification performance, the high complexity of CNNs makes finding the optimal weights for a given dataset a significant challenge. Over the last decades, many algorithms have been proposed for neural network training, most of which optimize weights based on gradient information; typical algorithms include SGD, RMSProp and Adam. Owing to the fast convergence that gradient information provides, gradient-based methods have proven promising for training CNNs. However, these methods still have limitations. For example, they tend to get trapped in local optima and saddle points; regularization terms often have to be introduced to mitigate overfitting; and hyper-parameters such as the learning rate, momentum and decay rate must be carefully predefined, at additional cost. In addition, in a conventionally trained neural network model, learned image features are lost over a large number of propagation iterations and the lost information cannot be retained, which ultimately lowers the image recognition accuracy.

As a family of meta-heuristic algorithms characterized by population evolution, evolutionary algorithms (EAs) have shown their effectiveness on many complex optimization problems in various research fields, such as non-linear, non-convex and combinatorial optimization problems. Compared with gradient-based algorithms, EAs have an attractive exploration capability and are insensitive to local optima. Therefore, since the 1980s, many EAs have been proposed for training neural networks. For example, Whitley et al. used genetic algorithms to optimize the weights of a neural network, and their experiments showed that genetic algorithms are competitive for neural network training. Montana and Davis trained a feedforward neural network with a total of 126 weights using a customized genetic algorithm, which was shown to outperform the gradient-based approach.

Although EAs have advantages in training neural networks (NNs), they scale poorly when optimizing the weights of large-scale NNs, since large-scale optimization has always been a challenging topic in evolutionary computation. To address the curse of dimensionality, attempts have been made to improve the scalability of EAs for training large-scale NNs, especially deep neural networks (DNNs). Gong et al. proposed a bi-objective EA that optimizes the weights of sparse DNNs by considering only the biases rather than all the weights, since the biases are the primary factor controlling the sparsity of the hidden layers. Sun et al. developed a single-objective EA to optimize the weights of DNNs, in which weight optimization is converted into the optimization of orthogonal basis vectors in the weight space. Most evolutionary methods for training large-scale neural networks thus tackle the curse of dimensionality by reducing the search space (i.e., the number of weights to be optimized); however, this may miss the best search region and increase the likelihood of getting trapped in local optima, so the best image recognition result is not achieved.

Disclosure of Invention

The invention provides an image recognition method based on a gradient-guided evolutionary algorithm to overcome the shortcomings of existing image recognition technology; it optimizes the image recognition model through the evolutionary algorithm so as to improve the accuracy of the model in image recognition.

In order to achieve the aim of the invention, the invention adopts the following technical scheme:

The invention relates to an image recognition method based on a gradient guided evolutionary algorithm, which is characterized by comprising the following steps:

Step one, acquiring T image samples and their category labels, and extracting the attribute features of each image sample according to its category label, so as to obtain an image sample set $L=\{(x_t,y_t)\mid t=1,2,\dots,T\}$, wherein $x_t$ represents the attribute features of the t-th image sample, $y_t$ represents the true class label of the t-th image sample, and $(x_t,y_t)$ represents the sample data of the t-th image, $t=1,2,\dots,T$;

Defining the maximum number of iterations as GEN and the current iteration number as gen, and initializing gen=1;

Step two, setting the population size as N, and defining the population of generation gen as $P_{gen}=\{p_{gen}^{i}\mid i=1,2,\dots,N\}$, wherein $p_{gen}^{i}$ represents the i-th individual in the population of generation gen;

The attributes of each individual in the population of generation gen comprise: R weight variables, their gradients, and two objectives to be optimized; the two objectives are the loss function value and the model complexity;

Initializing the R weight variables of each individual in the population of generation gen to random values;

Initializing the gradient set of each individual in the population of generation gen as an empty set;

Step 2.1, creating an image recognition model $Model_1$ containing R weight variables, and initializing i=1;

Step 2.2, replacing the R weight variables of the image recognition model $Model_1$ with the R weight variables of the i-th individual in the population $P_{gen}$ to obtain the replaced image recognition model $Model'_1$; taking the sample data in the image sample set L as the input of $Model'_1$, and obtaining the predicted class label set $\{y'_t\mid t=1,2,\dots,T\}$, wherein $y'_t$ represents the class label predicted by $Model'_1$ for the sample data $(x_t,y_t)$ of the t-th image;

Step 2.3, inputting the predicted class label set $\{y'_t\}$ and the true class label set $\{y_t\}$ into the loss function to obtain the loss function value Loss of the i-th individual in the population $P_{gen}$ on the image sample set L; then counting the number of non-zero weight variables of the i-th individual in $P_{gen}$ as the model complexity of that individual;

Step 2.4, calculating the gradients from the loss function value of the image recognition model $Model'_1$, one gradient for the r-th weight variable of the i-th individual in $P_{gen}$ for each $r\in[1,R]$, and replacing the gradient set of the i-th individual in $P_{gen}$ with these gradients;

Step 2.5, assigning i+1 to i and judging whether $i\le N$ is satisfied; if so, returning to step 2.2; otherwise, every individual in the population $P_{gen}$ has finished learning the image feature information of the image sample set L, the learned population is denoted $P'_{gen}$, and step 2.6 is executed;
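As a non-limiting illustration, the evaluation of the two objectives in steps 2.2 to 2.3 may be sketched as follows. The text does not name the loss function, so mean cross-entropy is assumed here; `predict` is a hypothetical callable standing in for the forward pass of the model with an individual's weight variables substituted in, and all function names are illustrative only.

```python
import math

def cross_entropy(probs, labels):
    """Mean cross-entropy (assumed loss); probs[t][c] is the predicted
    probability of class c for sample t, labels[t] is the true class index."""
    eps = 1e-12  # guards against log(0)
    total = sum(math.log(max(p[y], eps)) for p, y in zip(probs, labels))
    return -total / len(labels)

def model_complexity(weights):
    """Second objective: the number of non-zero weight variables."""
    return sum(1 for w in weights if w != 0.0)

def evaluate(weights, predict, samples):
    """Return (loss, complexity) for one individual on the sample set.

    `predict(weights, x)` is a hypothetical forward pass returning class
    probabilities; `samples` is a list of (x_t, y_t) pairs.
    """
    probs = [predict(weights, x) for x, _ in samples]
    labels = [y for _, y in samples]
    return cross_entropy(probs, labels), model_complexity(weights)
```

Under these assumptions, both objectives are minimized: a lower loss means better recognition and a lower count means a sparser model.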

Step 2.6, selecting N individuals from the learned population $P'_{gen}$ with the binary tournament selection algorithm to form the mating pool $M_1$, and creating an empty offspring population $Z_{gen}$ for generation gen;

Step 2.7, randomly selecting two individuals $q_1$ and $q_2$ from the mating pool $M_1$, deleting the two individuals $q_1$ and $q_2$ from the mating pool $M_1$ at the same time, and performing a crossover operation on the two selected individuals with the gSBX operator:

Step 2.7.0, defining ten parameters $\eta,\lambda,\mu_1,\mu_2,k_1,k_2,\beta_1,\beta_2,\alpha_1,\alpha_2$, wherein $\eta$ is the mean of the image feature information, $\lambda$ is a constant sampled from the uniform distribution U[0,1] in each dimension, $\mu_1,\mu_2$ are two random numbers determined by the weight variables and the gradient information, and $\beta_1,\beta_2,\alpha_1,\alpha_2$ are intermediate variables;

For l=1,2, defining the j-th weight variable and the j-th gradient of the l-th individual $q_l$, and initializing j=1;

Defining $z_1$ as the 1st offspring individual, and initializing the weight variable set of $z_1$ as an empty set;

Defining $z_2$ as the 2nd offspring individual, and initializing the weight variable set of $z_2$ as an empty set;

Step 2.7.1, calculating $k_1$ [formula omitted]; if $k_1>0$, $\mu_1$ is drawn at random from [0, 0.5]; if $k_1<0$, $\mu_1$ is drawn at random from (0.5, 1); if $k_1=0$, $\mu_1$ is drawn at random from [0, 1];

Step 2.7.2, calculating $k_2$ [formula omitted]; if $k_2<0$, $\mu_2$ is drawn at random from [0, 0.5]; if $k_2>0$, $\mu_2$ is drawn at random from (0.5, 1); if $k_2=0$, $\mu_2$ is drawn at random from [0, 1];

Step 2.7.3, if the l-th random number $\mu_l\le 0.5$, calculating the intermediate variable $\beta_l$ by a first formula [formula omitted]; otherwise, calculating the intermediate variable $\beta_l$ by a second formula [formula omitted], wherein l=1,2;

Step 2.7.4, performing variable substitution on the two intermediate variables $\beta_1,\beta_2$:

If $\lambda\le 0.5$ and two further conditions hold [conditions omitted], let $\alpha_1=-\beta_2-1$ and $\alpha_2=-\beta_1-1$; otherwise, let $\alpha_1=\beta_1-1$ and $\alpha_2=\beta_2-1$;

Step 2.7.5, correcting the two intermediate variables $\alpha_1,\alpha_2$:

If the j-th weight variable of the l-th individual $q_l$ is 0, let $\alpha_l=0$; otherwise $\alpha_l$ remains unchanged, wherein l=1,2;

Step 2.7.6, calculating the j-th weight variable of the 1st offspring $z_1$ and the j-th weight variable of the 2nd offspring $z_2$ [formulas omitted];

Step 2.7.7, assigning j+1 to j; if $j\le R$, returning to step 2.7.1; otherwise, all weight variables of the two offspring individuals $z_1,z_2$ have been updated, and step 2.8 is executed;
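Two parts of the gSBX loop are fully specified in the text without reference to the omitted formulas: the interval from which each random number is drawn given the sign of $k_1$ or $k_2$ (steps 2.7.1 and 2.7.2), and the sparsity correction (step 2.7.5). A minimal sketch of just these two parts is given below. The values of `k` are treated as inputs because their formulas are not reproduced here, and the function names and the `low_when_positive` flag are hypothetical.

```python
import random

def sample_mu(k, rng, low_when_positive=True):
    """Draw mu given the sign of k.

    For the first parent (step 2.7.1) a positive k selects [0, 0.5];
    for the second parent (step 2.7.2) the rule is mirrored, which is
    expressed here by passing low_when_positive=False.
    """
    if k == 0:
        return rng.uniform(0.0, 1.0)
    if (k > 0) == low_when_positive:
        return rng.uniform(0.0, 0.5)
    mu = rng.uniform(0.5, 1.0)
    while not 0.5 < mu < 1.0:  # enforce the open interval (0.5, 1)
        mu = rng.uniform(0.5, 1.0)
    return mu

def correct_alpha(alpha, parent_weight):
    """Step 2.7.5: a zero weight in the parent forces alpha to zero."""
    return 0.0 if parent_weight == 0.0 else alpha
```

For example, `sample_mu(k1, rng)` would serve the first parent and `sample_mu(k2, rng, low_when_positive=False)` the second.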

Step 2.8, adding the two offspring individuals $z_1,z_2$ with updated weight variables to the offspring population $Z_{gen}$ of generation gen;

Step 2.9, judging whether the mating pool $M_1$ is empty or whether the number of individuals in the offspring population $Z_{gen}$ has reached N; if so, executing step 2.10; otherwise, returning to step 2.7;

Step 2.10, taking the image sample set L as input, calculating the loss function value and the model complexity of each individual in the offspring population $Z_{gen}$ of generation gen;

Step 2.11, sorting and selecting from the parent population $P_{gen}$ and the offspring population $Z_{gen}$ of generation gen according to the NSGA-II environmental selection strategy, thereby obtaining the new population of generation gen;

Step 2.12, assigning gen+1 to gen and judging whether gen > GEN holds; if so, deleting the dominated individuals from the new population of generation GEN according to the NSGA-II dominance relation to obtain the final population $P_{final}$; otherwise, assigning the new population of generation gen-1 to the population $P_{gen}$ and returning to step 2.1;

Step three, letting n be the number of individuals in the final population $P_{final}$, denoting the m-th individual of $P_{final}$ accordingly, and initializing m=1;

Step 3.1, replacing the R weight variables of the image recognition model $Model'_1$ with the R weight variables of the m-th individual of $P_{final}$ to obtain the updated image recognition model $Model''_1$;

Defining the maximum number of iterations as G and the current iteration number as g, and initializing g=1;

Step 3.2, inputting the image sample set L into the updated image recognition model $Model''_1$ to obtain the current image recognition result $\{y''_t\mid t=1,2,\dots,T\}$, wherein $y''_t$ represents the class label predicted by $Model''_1$ for the sample data $(x_t,y_t)$ of the t-th image; and determining, according to the loss function, the loss function value Loss between the current image recognition result and the true class label set $\{y_t\}$;

Step 3.3, back-propagating the loss function value Loss through the image recognition model $Model''_1$ to obtain the gradient information $\{g^r\mid r=1,2,\dots,R\}$, wherein $g^r$ represents the r-th gradient in $Model''_1$; and updating the weight variables $\{w^r\mid r=1,2,\dots,R\}$ of $Model''_1$ according to the weight variables and the gradient information, wherein $w^r$ represents the r-th weight variable in $Model''_1$;

Step 3.4, assigning g+1 to g; if g > G, training of the image recognition model is completed, the current final trained model is obtained, and step 3.5 is executed; otherwise, returning to step 3.2;

Step 3.5, taking the sample data set L as the input of the current final trained model and performing one round of forward propagation to obtain the current image recognition result; determining, according to the loss function, the loss function value Loss between the labels of the current image recognition result and the true labels; then extracting all weight variables of the final trained model and using them to replace the weight variables of the m-th individual of the final population $P_{final}$; at the same time counting the number num of non-zero weight variables, and replacing the two objective values of the m-th individual of $P_{final}$ with the loss function value Loss and the number num;

Step 3.6, assigning m+1 to m; if m > n, the weight variable sets of all individuals in the final population $P_{final}$ have been adjusted, and step 3.7 is executed; otherwise, returning to step 3.1;
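By way of illustration only, the fine-tuning loop of steps 3.2 to 3.5 for one individual may be sketched as below. The text calls the method sparse SGD but does not spell out the update rule, so two assumptions are made: the update is the plain rule w ← w − lr·g, and weights that start at zero are kept at zero by a fixed mask. `grad_fn` is a hypothetical callable standing in for back-propagation, and `lr` is an assumed learning rate.

```python
def sparse_sgd(weights, grad_fn, lr=0.1, max_iter=50):
    """Fine-tune one individual's weights for max_iter iterations (G in the text).

    Assumption: zero-valued weights are masked out so that the sparsity
    pattern found by the evolutionary search is preserved.
    """
    w = list(weights)
    mask = [wi != 0.0 for wi in w]  # fixed once from the initial weights
    for _ in range(max_iter):
        g = grad_fn(w)  # hypothetical back-propagation returning R gradients
        w = [wi - lr * gi if keep else 0.0
             for wi, gi, keep in zip(w, g, mask)]
    num_nonzero = sum(1 for wi in w if wi != 0.0)  # `num` in step 3.5
    return w, num_nonzero
```

The returned weights and non-zero count correspond to the values written back into the m-th individual in step 3.5; the loss value would be obtained from one further forward pass.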

Step 3.7, plotting the two-dimensional Pareto front from the two objective values of the n individuals in the updated final population, selecting the weight variable set of an individual at the knee point of the front, and substituting it into the image recognition model $Model_1$ to obtain the optimal image classification model, which is used to recognize and classify images.
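The text does not state how the knee point of the front is located. One common heuristic, assumed here purely for illustration, is to normalize both objectives and pick the point with the largest perpendicular distance from the straight line joining the two extreme points of the front, on the side facing the origin. The function name `knee_index` is hypothetical.

```python
def knee_index(front):
    """Index of the knee of a 2-D minimization front.

    `front` is a list of (loss, complexity) pairs. Assumed heuristic:
    the knee is the point farthest below the line through the two extremes.
    """
    if len(front) < 3:
        return 0
    f1 = [p[0] for p in front]
    f2 = [p[1] for p in front]

    def scale(v, lo, hi):
        return (v - lo) / (hi - lo) if hi > lo else 0.0

    # Normalize each objective to [0, 1] so both carry equal weight.
    pts = [(scale(a, min(f1), max(f1)), scale(b, min(f2), max(f2)))
           for a, b in front]
    ax, ay = min(pts, key=lambda p: p[0])  # extreme with the best first objective
    bx, by = min(pts, key=lambda p: p[1])  # extreme with the best second objective
    dx, dy = bx - ax, by - ay
    length = (dx * dx + dy * dy) ** 0.5
    if length == 0.0:
        return 0

    def bulge(p):
        # Signed distance from the line; positive towards the origin.
        return (dy * (p[0] - ax) - dx * (p[1] - ay)) / length

    return max(range(len(pts)), key=lambda i: bulge(pts[i]))
```

Under this heuristic, the selected individual trades a small increase in loss for a large reduction in the number of non-zero weights, or vice versa.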

Compared with the prior art, the invention has the beneficial effects that:

1. The invention provides a powerful gradient-based simulated binary crossover operator, called gSBX, whose advantages are twofold. On the one hand, gSBX generates offspring solutions with a strategy similar to SBX, which preserves the exploration capability typical of evolutionary algorithms. On the other hand, the search direction of gSBX is always set to follow the gradient of the parent, in order to strengthen the exploitation of the gradient information and the image features. gSBX therefore alleviates the curse of dimensionality that earlier evolutionary algorithms face when optimizing neural network parameters, and balances exploitation and exploration during that optimization, so as to obtain more accurate image recognition results than a conventional single method.

2. The invention provides an image recognition method using an evolutionary algorithm, called GEMONN. Existing evolutionary neural network optimization algorithms address the curse of dimensionality by reducing the search space, i.e., the number of weights to be optimized, but this may miss the best search region and increase the risk of falling into local optima. To make up for these shortcomings, the invention provides a gradient-guided evolutionary algorithm for training neural networks, which uses evolutionary multi-objective optimization to minimize the training loss while controlling network sparsity and uses the proposed gSBX to generate offspring solutions, so that it inherits the advantages of both gradient descent and evolutionary algorithms and retains more image feature information in the image recognition model.

Drawings

FIG. 1 is a flow chart of the method of the present invention;

FIG. 2 is a schematic diagram of the structure of a CNN used in the present invention;

FIG. 3 is a schematic diagram of a process for reading and identifying image features according to the present invention.

Detailed Description

In this embodiment, as shown in FIG. 1, an image recognition method based on a gradient-guided evolutionary algorithm combines an evolutionary algorithm with a gradient method to exploit the advantages of both: the gradient accelerates the convergence of the evolution, while the evolution helps escape local optima and learn more image feature information, thereby achieving better model performance than optimization with either method alone. Specifically, the method comprises the following steps:

Step one, obtaining T animal image samples and their category labels, and extracting the attribute features of each image sample according to its category label, so as to obtain an image sample set $L=\{(x_t,y_t)\mid t=1,2,\dots,T\}$, wherein $x_t$ represents the attribute features of the t-th image sample, used for subsequent feature information calculation, $y_t$ represents the true class label of the t-th image sample, used for subsequent calculation of loss function values, and $(x_t,y_t)$ represents the sample data of the t-th image, $t=1,2,\dots,T$; in this embodiment, the attribute features of an image sample are mainly the pixel values of the image;

Initializing the R weight variables of each individual in the population of generation gen to random values, so that each model recognizes images differently;

Step 2.2, replacing the R weight variables of the image recognition model $Model_1$ with the R weight variables of the i-th individual in the population $P_{gen}$ to obtain the replaced image recognition model $Model'_1$; taking the sample data in the image sample set L as the input of $Model'_1$; as shown in FIG. 2, the input image passes through several convolution, pooling and fully connected layers to yield the image recognition result $\{y'_t\mid t=1,2,\dots,T\}$, wherein $y'_t$ represents the class label predicted by $Model'_1$ for the sample data $(x_t,y_t)$ of the t-th image;

Step 2.3, FIG. 3 shows a brief example of the flow of image feature learning and image recognition: local pixel information of a dog image is read to find local features such as the dog's eyes and nose, from which the animal in the image is judged to be a dog with high probability, giving the predicted class label set. The predicted class label set and the true class label set are input into the loss function; if, in the case of FIG. 3, the predicted labels assign a higher probability to a cat, the prediction does not match the true label and image feature information has been lost. The loss function then yields the loss function value Loss of the i-th individual in the population $P_{gen}$ on the image sample set L; the number of non-zero weight variables of the i-th individual is then counted as the model complexity of that individual, thereby obtaining the two objective values of the i-th individual in $P_{gen}$;

Step 2.4, calculating the gradients from the loss function value of the image recognition model $Model'_1$, one gradient for the r-th weight variable of the i-th individual in $P_{gen}$ for each $r\in[1,R]$, and replacing the gradient set of the i-th individual in $P_{gen}$ with these gradients;

Step 2.6, selecting N individuals from the learned population $P'_{gen}$ with the binary tournament selection algorithm to form the mating pool $M_1$, and creating an empty offspring population $Z_{gen}$ for generation gen. In binary tournament selection, two individuals are drawn from the population at a time (sampling with replacement), the fitter of the two is selected, and the operation is repeated until the number of selected individuals equals the original population size. The mating pool $M_1$ supplies pairs of parent individuals that undergo crossover and mutation to generate offspring;
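The binary tournament described above may be sketched as follows. The text does not define how "fitter" is decided for two objectives, so the comparison is left to a caller-supplied function `better(a, b)`; that function, like the name `binary_tournament`, is a hypothetical placeholder.

```python
import random

def binary_tournament(population, better, n, rng=random):
    """Fill a mating pool of size n by repeated two-way tournaments.

    Each tournament draws two individuals with replacement and keeps the
    one for which better(a, b) is true (ties go to the second draw).
    """
    pool = []
    while len(pool) < n:
        a = rng.choice(population)
        b = rng.choice(population)
        pool.append(a if better(a, b) else b)
    return pool
```

Because sampling is with replacement, a strong individual may appear in the pool several times, which biases crossover towards better parents without excluding weaker ones entirely.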

Step 2.7, randomly selecting two individuals $q_1$ and $q_2$ from the mating pool $M_1$, deleting the two individuals $q_1$ and $q_2$ from the mating pool $M_1$ at the same time, and performing a crossover operation on the two selected individuals with the gSBX operator:

Step 2.7.2, calculating $k_2$ [formula omitted]; if $k_2<0$, $\mu_2$ is drawn at random from [0, 0.5]; if $k_2>0$, $\mu_2$ is drawn at random from (0.5, 1); if $k_2=0$, $\mu_2$ is drawn at random from [0, 1]; the random number is drawn only to ensure the distribution of the weight variables;

Step 2.7.3, if the l-th random number $\mu_l\le 0.5$, calculating the intermediate variable $\beta_l$ by a first formula [formula omitted]; otherwise, calculating the intermediate variable $\beta_l$ by a second formula [formula omitted], wherein l=1,2; the two formulas use the image feature information to influence the distribution of individuals, so that the weight variables are more closely tied to the image feature information;

If $\lambda\le 0.5$ and two further conditions hold [conditions omitted], let $\alpha_1=-\beta_2-1$ and $\alpha_2=-\beta_1-1$; otherwise, let $\alpha_1=\beta_1-1$ and $\alpha_2=\beta_2-1$; because the sign of the gradient affects the direction in which an individual evolves, this variable substitution ensures that the weight variables change in the direction that improves the image recognition result;

Step 2.7.5, correcting the two intermediate variables $\alpha_1,\alpha_2$:

If the j-th weight variable of the l-th individual $q_l$ is 0, let $\alpha_l=0$; otherwise $\alpha_l$ remains unchanged, wherein l=1,2; the correction preserves sparsity: where a variable is 0 in the parent, it should also be 0 in the offspring;

Step 2.7.6, calculating the j-th weight variable of the 1st offspring $z_1$ and the j-th weight variable of the 2nd offspring $z_2$ [formulas omitted], thereby ensuring that the offspring individuals evolve along the direction of the gradient to obtain a better image recognition model;

Step 2.11, sorting and selecting from the parent population $P_{gen}$ and the offspring population $Z_{gen}$ of generation gen according to the NSGA-II environmental selection strategy, thereby obtaining the new population of generation gen. The NSGA-II environmental selection strategy comprises fast non-dominated sorting and crowded-comparison selection:

Dominance relation: for individuals a and b, each having two objective values $f_1$ and $f_2$, individual a is said to dominate individual b when $f_{a1}\le f_{b1}$ and $f_{a2}\le f_{b2}$; if instead $f_{a1}\le f_{b1}$ but $f_{a2}\ge f_{b2}$, or $f_{a1}\ge f_{b1}$ but $f_{a2}\le f_{b2}$, neither individual dominates the other, i.e., a and b are mutually non-dominated;

Fast non-dominated sorting: for a population P, two parameters $n_p$ and $S_p$ are calculated for each individual p in P, where $n_p$ is the number of individuals in the population that dominate p, and $S_p$ is the set of individuals in the population that are dominated by p. The main steps of the algorithm are as follows:

(1) Finding all individuals with $n_p=0$ in the population, and storing them in the current set $F_1$;

(2) For each individual i in the current set, whose set of dominated individuals is $S_i$, traversing each individual l in $S_i$ and performing $n_l=n_l-1$; if $n_l=0$, saving individual l in the set H;

(3) Recording the individuals in $F_1$ as the individuals of the first non-dominated layer, and repeating the above operation with H as the current set until the whole population has been ranked;
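Steps (1) to (3) above may be sketched as follows, with `dom_count` playing the role of $n_p$ and `dominated_by` the role of $S_p$. One assumption is made: the `dominates` helper uses the usual Pareto definition, which additionally requires a strict improvement in at least one objective so that two identical individuals do not dominate each other.

```python
def dominates(a, b):
    """a dominates b: no worse in every objective, strictly better in one."""
    return (all(x <= y for x, y in zip(a, b))
            and any(x < y for x, y in zip(a, b)))

def fast_non_dominated_sort(objs):
    """Return a list of fronts, each a list of indices into objs."""
    n = len(objs)
    dominated_by = [[] for _ in range(n)]  # S_p: indices that p dominates
    dom_count = [0] * n                    # n_p: how many dominate p
    for p in range(n):
        for q in range(n):
            if p == q:
                continue
            if dominates(objs[p], objs[q]):
                dominated_by[p].append(q)
            elif dominates(objs[q], objs[p]):
                dom_count[p] += 1
    fronts = []
    current = [p for p in range(n) if dom_count[p] == 0]  # step (1)
    while current:
        fronts.append(current)
        nxt = []                                          # the set H
        for p in current:                                 # step (2)
            for q in dominated_by[p]:
                dom_count[q] -= 1
                if dom_count[q] == 0:
                    nxt.append(q)
        current = nxt                                     # step (3)
    return fronts
```

For objective vectors `[(1, 5), (2, 2), (3, 3), (5, 1), (4, 4)]` this yields the fronts `[[0, 1, 3], [2], [4]]`.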

Crowding distance: the density of the individuals surrounding a given individual in the population;

Crowded-comparison operator: decides which of two individuals a and b is better according to their non-dominated ranks $a_{rank}, b_{rank}$ and crowding distances $a_d, b_d$; when $a_{rank}\le b_{rank}$ and $a_d>b_d$, individual a is said to be better than individual b, and the environmental selection strategy selects a;
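The text does not give a formula for the crowding distance. A sketch using the standard NSGA-II definition (for each objective, the normalized gap between an individual's two neighbours, with boundary individuals set to infinity) is shown below as an assumption. The comparison function likewise follows the standard NSGA-II form, in which a lower rank wins outright and crowding distance only breaks ties, which is slightly stricter than the condition as worded above.

```python
def crowding_distance(objs):
    """Standard NSGA-II crowding distance for one front (assumed definition)."""
    n = len(objs)
    dist = [0.0] * n
    if n == 0:
        return dist
    for m in range(len(objs[0])):
        order = sorted(range(n), key=lambda i: objs[i][m])
        lo, hi = objs[order[0]][m], objs[order[-1]][m]
        # Boundary individuals are always preferred.
        dist[order[0]] = dist[order[-1]] = float("inf")
        if hi == lo:
            continue
        for k in range(1, n - 1):
            gap = objs[order[k + 1]][m] - objs[order[k - 1]][m]
            dist[order[k]] += gap / (hi - lo)
    return dist

def crowded_better(rank_a, dist_a, rank_b, dist_b):
    """True if a is preferred: lower rank, or equal rank and less crowded."""
    return rank_a < rank_b or (rank_a == rank_b and dist_a > dist_b)
```

A larger distance means an individual sits in a sparser part of the front, so preferring it helps keep the selected population spread out along the trade-off between loss and model complexity.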

Step 2.12, assigning gen+1 to gen and judging whether gen > GEN holds; if so, deleting the dominated individuals from the new population of generation GEN according to the NSGA-II dominance relation to obtain the final population $P_{final}$; otherwise, assigning the new population of generation gen-1 to the population $P_{gen}$ and returning to step 2.1;

Step 3.2, inputting the image sample set L into the updated image recognition model $Model''_1$ to obtain the current image recognition result $\{y''_t\mid t=1,2,\dots,T\}$, wherein $y''_t$ represents the class label predicted by $Model''_1$ for the sample data $(x_t,y_t)$ of the t-th image; and determining, according to the loss function, the loss function value Loss between the current image recognition result and the true class label set $\{y_t\}$;

Step 3.3, back-propagating the loss function value Loss through the image recognition model $Model''_1$ to obtain the gradient information $\{g^r\mid r=1,2,\dots,R\}$, wherein $g^r$ represents the r-th gradient in $Model''_1$; and updating the weight variables $\{w^r\mid r=1,2,\dots,R\}$ of $Model''_1$ according to the weight variables and the gradient information, wherein $w^r$ represents the r-th weight variable in $Model''_1$;

Step 3.5, taking the sample data set L as the input of the current final trained model and performing one round of forward propagation to obtain the current image recognition result; determining, according to the loss function, the loss function value Loss between the labels of the current image recognition result and the true labels; then extracting all weight variables of the final trained model and using them to replace the weight variables of the m-th individual of the final population $P_{final}$; at the same time counting the number num of non-zero weight variables, and replacing the two objective values of the m-th individual of $P_{final}$ with the loss function value Loss and the number num;

Step 3.7, plotting the two-dimensional Pareto front from the two objective values of the n individuals in the updated final population, selecting the weight variable set of an individual at the knee point of the front, and substituting it into the image recognition model $Model_1$ to obtain the optimal image classification model for recognizing and classifying animal images, so that the animal in FIG. 3 is recognized more accurately. The image recognition accuracy of the resulting model is higher than that of a conventional model, while its time complexity is the same as that of the conventional method.

Claims

1. The image recognition method based on the gradient guided evolutionary algorithm is characterized by comprising the following steps:

Step 2.2, replacing the R weight variables of the image recognition model $Model_1$ with the R weight variables of the i-th individual in the population $P_{gen}$ to obtain the replaced image recognition model $Model'_1$; taking the sample data in the image sample set L as the input of $Model'_1$, and obtaining the predicted class label set $\{y'_t\mid t=1,2,\dots,T\}$, wherein $y'_t$ represents the class label predicted by $Model'_1$ for the sample data $(x_t,y_t)$ of the t-th image;

Step 2.5, assigning i+1 to i and judging whether $i\le N$ is satisfied; if so, returning to step 2.2; otherwise, every individual in the population $P_{gen}$ has finished learning the image feature information of the image sample set L, the learned population is denoted $P'_{gen}$, and step 2.6 is executed;

Step 2.6, selecting N individuals from the learned population $P'_{gen}$ with the binary tournament selection algorithm to form the mating pool $M_1$, and creating an empty offspring population $Z_{gen}$ for generation gen;

Step 2.7.3, if the l-th random number $\mu_l\le 0.5$, calculating the intermediate variable $\beta_l$ by a first formula [formula omitted]; otherwise, calculating the intermediate variable $\beta_l$ by a second formula [formula omitted], wherein l=1,2;

Step 2.7.5, correcting the two intermediate variables $\alpha_1,\alpha_2$:

Step 3.1, replacing the R weight variables of the image recognition model $Model'_1$ with the R weight variables of the m-th individual of $P_{final}$ to obtain the updated image recognition model $Model''_1$;

Step 3.2, inputting the image sample set L into the updated image recognition model $Model''_1$ to obtain the current image recognition result $\{y''_t\mid t=1,2,\dots,T\}$, wherein $y''_t$ represents the class label predicted by $Model''_1$ for the sample data $(x_t,y_t)$ of the t-th image; and determining, according to the loss function, the loss function value Loss between the current image recognition result and the true class label set $\{y_t\}$;

Step 3.3, back-propagating the loss function value Loss through the image recognition model $Model''_1$ to obtain the gradient information $\{g^r\mid r=1,2,\dots,R\}$, wherein $g^r$ represents the r-th gradient in $Model''_1$; and updating the weight variables $\{w^r\mid r=1,2,\dots,R\}$ of $Model''_1$ according to the weight variables and the gradient information, wherein $w^r$ represents the r-th weight variable in $Model''_1$;

Step 3.5, taking the sample data set L as the input of the current final trained model and performing one round of forward propagation to obtain the current image recognition result; determining, according to the loss function, the loss function value Loss between the labels of the current image recognition result and the true class label set; then extracting all weight variables of the final trained model and using them to replace the weight variables of the m-th individual of the final population $P_{final}$; at the same time counting the number num of non-zero weight variables, and replacing the two objective values of the m-th individual of $P_{final}$ with the loss function value Loss and the number num;