CN112561039A - Improved search method of evolutionary neural network architecture based on hyper-network - Google Patents

Improved search method of evolutionary neural network architecture based on hyper-network

Info

Publication number
CN112561039A
CN112561039A (application CN202011567363.5A)
Authority
CN
China
Prior art keywords
population
neural network
chromosome
node
individuals
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202011567363.5A
Other languages
Chinese (zh)
Inventor
Yaochu Jin (金耀初)
Xiuping Shen (沈修平)
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
SHANGHAI ULUCU ELECTRONIC TECHNOLOGY CO LTD
Original Assignee
SHANGHAI ULUCU ELECTRONIC TECHNOLOGY CO LTD
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by SHANGHAI ULUCU ELECTRONIC TECHNOLOGY CO LTD filed Critical SHANGHAI ULUCU ELECTRONIC TECHNOLOGY CO LTD
Priority to CN202011567363.5A priority Critical patent/CN112561039A/en
Publication of CN112561039A publication Critical patent/CN112561039A/en
Pending legal-status Critical Current

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N 3/00 Computing arrangements based on biological models
    • G06N 3/02 Neural networks
    • G06N 3/04 Architecture, e.g. interconnection topology
    • G06N 3/045 Combinations of networks
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F 18/00 Pattern recognition
    • G06F 18/20 Analysing
    • G06F 18/21 Design or setup of recognition systems or techniques; Extraction of features in feature space; Blind source separation
    • G06F 18/214 Generating training patterns; Bootstrap methods, e.g. bagging or boosting
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N 3/00 Computing arrangements based on biological models
    • G06N 3/004 Artificial life, i.e. computing arrangements simulating life
    • G06N 3/006 Artificial life, i.e. computing arrangements simulating life based on simulated virtual individual or collective life forms, e.g. social simulations or particle swarm optimisation [PSO]
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N 3/00 Computing arrangements based on biological models
    • G06N 3/02 Neural networks
    • G06N 3/06 Physical realisation, i.e. hardware implementation of neural networks, neurons or parts of neurons
    • G06N 3/063 Physical realisation, i.e. hardware implementation of neural networks, neurons or parts of neurons using electronic means

Landscapes

  • Engineering & Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • Theoretical Computer Science (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Data Mining & Analysis (AREA)
  • Health & Medical Sciences (AREA)
  • Artificial Intelligence (AREA)
  • Biomedical Technology (AREA)
  • Biophysics (AREA)
  • Evolutionary Computation (AREA)
  • General Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • Computing Systems (AREA)
  • Molecular Biology (AREA)
  • General Health & Medical Sciences (AREA)
  • Mathematical Physics (AREA)
  • Software Systems (AREA)
  • Computational Linguistics (AREA)
  • Bioinformatics & Cheminformatics (AREA)
  • Bioinformatics & Computational Biology (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Evolutionary Biology (AREA)
  • Neurology (AREA)
  • Image Analysis (AREA)

Abstract

The invention relates to an improved search method for an evolutionary neural network architecture based on a hyper-network. The method comprises the following steps. Step S1: with the input layer as the first layer, encapsulate five computation modules. Step S2: binarize the connections between the internal computing nodes of the neural network. Step S3: learn a structure weight for each computing node. Step S4: construct a parent population P using binary tournament selection. Step S5: form an offspring population Q. Step S6: apply mutation operations to the individuals in the offspring population Q. Step S7: decode each individual in the offspring population Q into its corresponding neural network and obtain its structure weights. Step S8: merge the parent population P and the offspring population Q into a population R, select individuals as the original population of the next generation by an environmental selection method, and feed back to step S4 until a preset maximum number of generations is reached. After the evolution finishes, the individual with the highest fitness value is output as the optimal neural network architecture.

Description

Improved search method of evolutionary neural network architecture based on hyper-network
Technical Field
The invention relates to the technical field of image classification model construction, in particular to an improved search method of an evolutionary neural network architecture based on a hyper-network.
Background
An image classification task is an image processing technique that distinguishes objects of different categories based on the feature information reflected in a picture. Because many models built for image classification can be transferred to other computer vision fields as feature extraction networks, image classification is a foundational task in computer vision, and the design of image classification models is a focus of researchers' attention. However, manually designing a neural network model requires experienced experts, who must carefully study the distribution and characteristics of the data set and run repeated experiments before a high-performing model emerges; this consumes a huge amount of time and labor.
Currently, neural architecture search (NAS) algorithms are attracting wide attention from researchers. Such algorithms can automatically design an efficient neural network architecture for a given data set without much expert knowledge. Since NAS algorithms typically must evaluate neural network models in the search space continuously, they demand a great deal of computational effort. To improve the search efficiency of NAS algorithms, there are two main approaches:
the first approach is to construct an End-to-End Performance Predictor. This approach requires a coding method that uniquely maps the neural network architecture into a set of digital decision variables. The coding of the neural network architecture and its performance (e.g., accuracy of classification) are then formed into a data pair that is used as input to a performance predictor, which is trained. After the performance predictor is trained, the performance of the neural network model in the search space can be directly predicted without training the neural network model, and the search efficiency is further improved. However, this approach follows a training-then-prediction approach, requiring the performance predictor to be trained first using a set of training samples. In general, the more samples trained, the better the performance of the predictor. However, collecting more training samples means consuming more computing resources, and thus has a certain impact on search efficiency. Therefore, in practical use, a neural network architecture which is more effective by using an incremental strategy needs to be sampled, and certain calculation cost is needed.
The second approach is one-shot neural architecture search based on a super-network. This method first trains a super-network (one-shot model) as the search space; it then randomly samples a certain number of sub-networks from the super-network for performance evaluation and ranks them by performance; finally, the sub-network with the best evaluated performance is taken as the output of the algorithm. Because a sub-network can inherit its weights from the super-network and be evaluated without training, this effectively improves the search efficiency of NAS algorithms. However, existing super-network-based NAS algorithms have certain defects. First, the training of the nodes inside the super-network is unbalanced, which makes the performance ranking in the sub-network evaluation stage inaccurate, so the algorithm may fail to find the best-performing architecture. Second, during super-network training, mutual interference among different sub-networks can make super-network-based NAS unstable: the super-network converges slowly, or may not converge at all, so the performance predictions for the sub-models are poor.
Disclosure of Invention
Aiming at the defects of the prior-art super-network-based neural architecture search methods, namely unstable performance and slow or even failed convergence of super-network training, the invention provides a super-network-based neural architecture search method that uses an evolutionary algorithm as the search strategy to automatically generate neural network architectures from the super-network, so as to improve the classification accuracy of image classification tasks.
In order to solve the technical problems, the invention adopts the technical scheme that:
an improved search method of an evolutionary neural network architecture based on a hyper-network is characterized by comprising the following steps:
Step S1, with the input layer as the first layer, encapsulate five computation modules; each module encapsulates M computing nodes, and finally a fully connected layer serves as the output layer of the neural network; M is a natural number greater than 1.
Step S2, encode the neural network structure with a hybrid coding scheme and binarize the connections between the network's internal computing nodes; randomly generate N chromosomes to construct the initial population; N is a natural number greater than 1.
Step S3, uniformly sample individuals from the population, train them on the training data so that a structure weight is learned for each computing node, and evaluate each individual's fitness using the classification accuracy on a validation set as the fitness function.
Step S4, construct a parent population P using binary tournament selection.
Step S5, based on a given crossover rate pc, cross chromosome individuals of the parent population pairwise with a hybrid crossover method to obtain new chromosomes that form an offspring population Q.
Step S6, based on a given mutation rate pm, apply a hybrid mutation method to the individuals in the offspring population Q.
Step S7, decode each individual in the offspring population Q into its corresponding neural network, obtain its structure weights by inheritance or random initialization, and evaluate its fitness using the classification accuracy on the validation set as the fitness function.
Step S8, merge the parent population P and the offspring population Q into a population R, select individuals as the original population of the next generation by an environmental selection method, and feed back to step S4 until a preset maximum number of generations is reached. After the evolution finishes, output the individual with the highest fitness value as the optimal neural network architecture.
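For orientation, steps S3 through S8 amount to a standard generational loop. The following Python sketch is an illustrative reading of the method, not part of the original disclosure; the callables select, crossover, mutate, and evaluate stand for the binary tournament selection, hybrid crossover, hybrid mutation, and fitness evaluation detailed below.

```python
from dataclasses import dataclass

@dataclass
class Individual:
    int_part: list        # integer genes: computing-unit codes and connection indexes
    bin_part: list        # binary genes: connection activation bits J
    fitness: float = 0.0  # validation accuracy, filled in by evaluation

def evolve(population, generations, p_c, p_m, n, select, crossover, mutate, evaluate):
    """Sketch of steps S3-S8; the four operators are supplied as callables."""
    for ind in population:
        ind.fitness = evaluate(ind)                        # step S3
    for _ in range(generations):
        parents = [select(population) for _ in range(n)]   # step S4
        offspring = []
        for p1, p2 in zip(parents[0::2], parents[1::2]):   # step S5
            offspring.extend(crossover(p1, p2, p_c))
        offspring = [mutate(q, p_m) for q in offspring]    # step S6
        for q in offspring:
            q.fitness = evaluate(q)                        # step S7
        merged = parents + offspring                       # step S8: merge P and Q
        merged.sort(key=lambda ind: ind.fitness, reverse=True)
        population = merged[:n]                            # environmental selection
    return max(population, key=lambda ind: ind.fitness)
```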
Further, in step S1, the input layer is built, in order, from a convolutional layer, a ReLU activation function, and a batch normalization (BN) layer.
Further, in step S1, a computing node is a computing unit of the neural network and can be selected at random from the operation search space θ. All computing nodes in the first, third, and fifth computation modules have stride 1; all computing nodes in the second and fourth computation modules have stride 2.
Further, in step S2, the hybrid coding scheme combines integer and binary coding. Integer codes describe the types of the computing nodes in the neural network architecture and the connections between nodes; binary numbers binarize those connections to describe whether the connection between two computing nodes is active. Specifically:
further, in the above step S21, a compute node is encoded as a quintuple
Figure BDA0002861999840000031
Wherein the content of the first and second substances,
Figure BDA0002861999840000032
represents a calculation unit a included in a calculation node i; i is1,I2Indexes of computing units representing connections of computing node I, i.e. computing node I and computing node I1,I2Are connected with each other;
Figure BDA0002861999840000033
I1,I2is a set of integers; j. the design is a square1,J2Representing a compute node I and a compute node I for a set of binary numbers1,I2The four states of the connection mode are specifically: j. the design is a square1=0,J 20 denotes a compute node I and a compute node I1,I2All the connections are in an activated state; then at this point, node I is computed1,I2After the feature maps of the outputs are fused, the fused feature maps are used as the inputs of the computing nodes i. The output δ of the computing node i is:
Figure BDA0002861999840000041
J1=0,J 21, denotes a compute node I and a compute node I1Is activated, calculates node i and countsCalculation node I2The connection of (2) is closed; then, at this time, the output δ of the computing node i is:
Figure BDA0002861999840000042
J1=1,J 20 denotes a compute node I and a compute node I1Is closed, computing node I and computing node I2Is activated; then, at this time, the output δ of the computing node i is:
Figure BDA0002861999840000043
J1=1,J 21, denotes a compute node I and a compute node I1,I2All the connections are in a closed state; i.e. the current compute node i is masked. Then at this point, node I is computed1,I2After the output feature graph is fused, the feature graph does not pass through a computing node
Figure BDA0002861999840000044
Processing is done directly as the output value δ of the compute node i:
δ=I1(xc)+I2(xd)
wherein x isc,xdAre respectively a computing node I1,I2Input of (1)1(xc),I2(xd) Are respectively a computing node I1,I2Output of (2)
Figure BDA0002861999840000045
Representing a computing node I1,I2Output characteristic diagram I1(xc),I2(xd) Fusion, as input to compute node i, by a compute unit
Figure BDA0002861999840000046
After processing, as the output of compute node i.
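The four connection states can be made concrete with a minimal Python sketch of a node's forward computation; theta_a, out1, and out2 are illustrative names for the node's computing unit and the feature maps produced by nodes I1 and I2.

```python
def node_output(theta_a, J1, J2, out1, out2):
    """Output delta of a node encoded as (theta_a, I1, I2, J1, J2).

    theta_a is the node's computing unit (a callable such as a convolution);
    out1 and out2 are the output feature maps I1(xc) and I2(xd) of the two
    predecessor nodes; J = 0 marks an active connection, J = 1 a closed one.
    """
    if J1 == 0 and J2 == 0:       # both connections active: fuse, then process
        return theta_a(out1 + out2)
    if J1 == 0 and J2 == 1:       # only the connection to I1 is active
        return theta_a(out1)
    if J1 == 1 and J2 == 0:       # only the connection to I2 is active
        return theta_a(out2)
    return out1 + out2            # both closed: node masked, fusion bypasses theta_a
```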
Further, in step S22, a computation module contains M computing nodes, so the coding structure of one computation module is the concatenation of its M node quintuples:

((θ_a^1, I1, I2, J1, J2), (θ_a^2, I1, I2, J1, J2), …, (θ_a^M, I1, I2, J1, J2))

In step S23, a chromosome is a neural network architecture, and each architecture contains five computation modules; the coding structure of one neural network architecture is accordingly the concatenation of the codes of its five computation modules.
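Read as a data structure, the hybrid encoding might be sketched as follows (an assumed representation, with integer genes a, I1, I2 and binary genes J1, J2 per node):

```python
from dataclasses import dataclass
from typing import List

@dataclass
class NodeGene:
    a: int    # integer code of the computing unit chosen from the space theta
    I1: int   # index of the first predecessor node (integer gene)
    I2: int   # index of the second predecessor node (integer gene)
    J1: int   # binary gene: 0 = connection to I1 active, 1 = closed
    J2: int   # binary gene: 0 = connection to I2 active, 1 = closed

Module = List[NodeGene]       # one computation module: M node genes
Chromosome = List[Module]     # one architecture: five computation modules
```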
further, in step S3, the individuals in the population are uniformly sampled, training is performed based on training data, a structure weight is generated for each computing node, and fitness evaluation is performed on the individuals by using the classification accuracy of the verification set as a fitness function. The method specifically comprises the following steps:
further, in step S31, the predetermined training data set is divided into B batches (batch) on average according to the size of the batch data (batch size). And B is a natural number larger than N. In each batch, randomly selecting an individual from the parent population P, decoding the individual into a corresponding neural network, and training until a maximum training batch B is reached.
Further, in step S32, the fitness value of each individual in the parent population is evaluated, using the classification accuracy on the pictures of the validation data set as the fitness function:

fitness = G / H

where G is the number of pictures the model identifies correctly and H is the total number of pictures in the validation set.
Further, in step S4, the binary tournament selection method comprises the following steps:
Step S41, randomly select two individuals from the original population; according to their fitness values, retain the individual with the higher fitness in the parent population P and return the individual with the lower fitness to the original population.
Step S42, repeat step S41 until the number of individuals in the parent population P reaches a preset number K, where K is a natural number greater than 1.
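A minimal sketch of the binary tournament, assuming each individual carries a fitness attribute:

```python
import random

def tournament_select(population):
    """Binary tournament (steps S41-S42): draw two individuals at random and
    keep the fitter one; the loser stays in the pool."""
    a, b = random.sample(population, 2)
    return a if a.fitness >= b.fitness else b
```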
Further, in step S5, based on the given crossover rate pc, chromosome individuals in the parent population P are crossed pairwise with a hybrid crossover method to obtain new chromosome individuals. The specific steps are:
Step S51, split each chromosome into its integer chromosome part and its binary chromosome part.
Step S52, randomly generate a random number r in the interval [0, 1] and randomly select two individuals p1 and p2 from the parent population P; the random number r determines whether p1 and p2 undergo a crossover operation.
Step S53, if r ≤ pc, align the left ends of the integer chromosome parts of the two chromosomes and perform single-point crossover: a crossover point is set at random in the two integer chromosomes, at the same position in both, and the genes after the crossover point are exchanged. Align the left ends of the binary chromosome parts of the two chromosomes and perform multi-point crossover: several crossover points are selected at random in the two binary chromosomes, at the same positions in both, and the genes at the crossover points are exchanged. Store the two individuals q1 and q2 produced by this hybrid crossover method in the offspring population Q.
Step S54, if r > pc, store the two individuals p1 and p2 selected in step S52 in the offspring population Q unchanged.
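A sketch of the hybrid crossover under these rules; the number of multi-point crossover positions (n_points) is an assumption, since the text only says "several":

```python
import copy
import random

def hybrid_crossover(p1, p2, p_c, n_points=3):
    """Hybrid crossover (steps S51-S54) on chromosomes that expose flat
    .int_part and .bin_part gene lists."""
    if random.random() > p_c:                       # step S54: no crossover
        return copy.deepcopy(p1), copy.deepcopy(p2)
    q1, q2 = copy.deepcopy(p1), copy.deepcopy(p2)
    # Single-point crossover on the integer parts (same cut point in both).
    cut = random.randrange(1, len(q1.int_part))
    q1.int_part[cut:], q2.int_part[cut:] = q2.int_part[cut:], q1.int_part[cut:]
    # Multi-point crossover on the binary parts (same positions in both).
    for j in random.sample(range(len(q1.bin_part)), k=n_points):
        q1.bin_part[j], q2.bin_part[j] = q2.bin_part[j], q1.bin_part[j]
    return q1, q2
```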
Further, in step S6, based on the given mutation rate pm, mutation operations are performed on the individuals in the offspring population Q with a hybrid mutation method. The specific steps are:
Step S61, split each chromosome into its integer chromosome part and its binary chromosome part.
Step S62, for each gene position in each chromosome individual, randomly generate a random number t in the interval [0, 1]; this random number determines whether that gene position of the individual is mutated.
Step S63, if t ≤ pm, perform a polynomial mutation operation on the integer chromosome part of the chromosome:

a_i' = a_i + δ · (a_i^max − a_i^min),  with  δ = (2u)^(1/(η+1)) − 1 if u < 0.5,  δ = 1 − (2(1 − u))^(1/(η+1)) if u ≥ 0.5

where a_i denotes the gene at the i-th gene position in the chromosome, a_i' is the new gene produced from a_i, u is a random number generated in the interval [0, 1], a_i^max and a_i^min are the upper and lower bounds of variation of gene a_i, and η is the mutation distribution index.
Step S64, if t ≤ pm, perform a flip mutation operation on the binary chromosome part of the chromosome: several mutation points are selected at random in the chromosome, and the gene at each mutation point is flipped, a 0 becoming 1 and a 1 becoming 0.
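A sketch of the hybrid mutation; the distribution index eta of the polynomial mutation and the rounding of the mutated integer gene back to a valid code are assumptions the text leaves open:

```python
import random

def hybrid_mutation(ind, p_m, bounds, eta=20):
    """Hybrid mutation (steps S61-S64): polynomial mutation on the integer
    part, bit-flip mutation on the binary part. bounds[i] gives the
    (lower, upper) variation bounds of gene i."""
    for i, a in enumerate(ind.int_part):
        if random.random() <= p_m:                 # step S63
            u = random.random()
            lo, hi = bounds[i]
            if u < 0.5:
                delta = (2 * u) ** (1 / (eta + 1)) - 1
            else:
                delta = 1 - (2 * (1 - u)) ** (1 / (eta + 1))
            a_new = a + delta * (hi - lo)
            ind.int_part[i] = int(round(min(max(a_new, lo), hi)))
    for j in range(len(ind.bin_part)):
        if random.random() <= p_m:                 # step S64
            ind.bin_part[j] = 1 - ind.bin_part[j]  # flip 0 <-> 1
    return ind
```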
In step S7, each individual in the offspring population obtains its structure weights by inheritance or random initialization. Specifically, for any chromosome individual in the offspring population Q: if a computing node in the individual was produced by the hybrid crossover method of step S5, it inherits the weights of the corresponding computing node in the parent chromosome; if it was produced by the hybrid mutation method of step S6, the weights of the computing node are generated by random initialization.
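One way this inherit-or-reinitialise rule could be realised, assuming each node records whether crossover or mutation produced it (the origin and key attributes are invented bookkeeping, not part of the disclosure):

```python
def assign_offspring_weights(child, parent_nodes, random_init):
    """Step S7 sketch: nodes kept or exchanged by crossover inherit the
    weights of the matching parent node; nodes altered by mutation get
    freshly initialised weights."""
    for node in child.nodes:
        if node.origin == "crossover":
            node.weights = parent_nodes[node.key].weights   # inherit
        else:                                               # "mutation"
            node.weights = random_init(node)                # re-initialise
```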
In step S8, the parent population P and the offspring population Q are merged into a population R, and individuals are selected as the original population of the next generation with an environmental selection method. The specific steps are:
Step S81, sort the individuals in population R by fitness value, from high to low.
Step S82, according to the preset population size N, select the individuals ranked 1 through N in population R as the next-generation population.
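The environmental selection of steps S81 and S82 is plain elitist truncation; a sketch:

```python
def environmental_selection(P, Q, n):
    """Steps S81-S82: merge parents and offspring, sort by fitness from high
    to low, and keep the top n individuals as the next generation."""
    R = P + Q
    R.sort(key=lambda ind: ind.fitness, reverse=True)
    return R[:n]
```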
Compared with the prior art, the invention has the following beneficial effects:
(1) The hyper-network is encoded with a hybrid coding scheme: integer codes describe the types of the computing nodes inside the neural network architecture and the connections between nodes, while binary codes binarize those connection relations. The advantage of this design is that, during population evolution, different parts of a chromosome can be selected at random for crossover, so global and local search of the search space proceed simultaneously. Specifically, the single-point crossover operation exchanges internal computing nodes between two individuals to generate new neural network architectures, realizing global exploration of the search space; the multi-point crossover operation only exchanges the binarized connection information, changing the flow of data within a single neural network to generate a new architecture, realizing local exploration of the search space.
(2) Based on the hybrid coding scheme, different parts of a chromosome can also be selected at random for mutation during population evolution; the polynomial mutation method introduces computing nodes that do not belong to the super-network, and the weights of these nodes are randomly initialized. The advantage of this design is that it mitigates the deep coupling between node weights that forms during conventional super-network training and makes convergence difficult in the later stages. The introduced nodes are merged into the super-network as the population evolves; because their weights are assigned at random, the deep coupling in super-network training is reduced, the algorithm is helped to escape local optima, and the difficulty of converging the super-network is avoided.
Building on beneficial effects (1) and (2), the proposed method alleviates the difficulty of training the super-network to convergence and, compared with existing methods, enables neural architecture search on large-scale data sets (for example, ImageNet).
Drawings
Fig. 1 is an overall architecture of the neural network of the present invention.
FIG. 2 is a flow chart of the algorithm of the present invention.
FIG. 3 is a flow chart of chromosome generation in the present invention.
FIG. 4 is a schematic diagram of the hybrid cross method and hybrid mutation method of the present invention.
FIG. 5 shows the neural network architecture optimization process of the present invention on the ImageNet classification task.
FIG. 6 shows the training process, on the ImageNet classification task, of the neural network architecture found by the present invention.
Detailed Description
The present invention will be described in further detail with reference to the accompanying drawings and specific embodiments.
As shown in fig. 1 to fig. 3, the present embodiment provides an improved searching method for an evolutionary neural network architecture based on a super network, which mainly includes the following steps:
the method comprises the following steps that firstly, an input layer is used as a first layer, and five calculation modules are packaged; m computing nodes are packaged in each module, and finally, a full connection layer is used as an output layer of the neural network; in this embodiment, each computing module is configured to include 9 computing nodes, that is, M is 9; the input layer is formed by packaging a convolution layer, a ReLU activation function and a Batch Normalization (BN) layer in sequence; the computing nodes are computing units in the neural network and can be randomly selected from the operation search space theta. All the calculation node steps in the first calculation module, the third calculation module and the fifth calculation module are 1; the step length of all the computing nodes in the second computing module and the fourth computing module is 2.
Second, the neural network structure is encoded with the hybrid coding scheme, and the connections between the network's internal computing nodes are binarized; N chromosomes are randomly generated to construct the initial population. In this embodiment the initial population contains 40 chromosomes, i.e., N = 40.
In this embodiment, the initial population is generated at random with the hybrid coding scheme to initialize the population; each individual in the initial population represents a corresponding neural network architecture, and the connections of the internal computing nodes are binarized at the same time. Each computing node represents a computing unit of the neural network; the coding information of the computing units is shown in Table 1. During gene coding, computing units are randomly encoded into the overall neural network architecture to form a chromosome, i.e., the final neural network architecture.
TABLE 1 Coding information of the neural network computing units (the table is reproduced as an image in the original publication)
The specific coding mode is as follows:
the mixed coding mode is a coding mode combining integer and binary number. Describing the types of the computing nodes in the neural network architecture and the connection relation between the nodes by using integer coding; and binarizing the connection relation of the computing nodes in the neural network architecture by using binary numbers to describe whether the connection between the two computing nodes is activated or not. The method specifically comprises the following steps:
in step S21, a compute node is encoded as a five tuple
Figure BDA0002861999840000092
Wherein the content of the first and second substances,
Figure BDA0002861999840000093
represents a calculation unit a included in a calculation node i; i is1,I2Indexes of computing units representing connections of computing node I, i.e. computing node I and computing node I1,I2Are connected with each other;
Figure BDA0002861999840000094
I1,I2is a set of integers; j. the design is a square1,J2Representing a compute node I and a compute node I for a set of binary numbers1,I2The four states of the connection mode are specifically:
J1=0,J 20 denotes a compute node I and a compute node I1,I2All the connections are in an activated state; then at this point, node I is computed1,I2After the feature maps of the outputs are fused, the fused feature maps are used as the inputs of the computing nodes i. The output δ of the computing node i is:
Figure BDA0002861999840000095
J1=0,J 21, denotes a compute node I and a compute node I1Is activated, computing node I and computing node I2The connection of (2) is closed; then, at this time, the output δ of the computing node i is:
Figure BDA0002861999840000096
J1=1,J 20 denotes a compute node I and a compute node I1Is closed, computing node I and computing node I2Is activated; then, at this time, the output δ of the computing node i is:
Figure BDA0002861999840000097
J1=1,J 21, denotes a compute node I and a compute node I1,I2All the connections are in a closed state; i.e. the current compute node i is masked. Then at this point, node I is computed1,I2After the output feature graph is fused, the feature graph does not pass through a computing node
Figure BDA0002861999840000101
Processing is done directly as the output value δ of the compute node i:
δ=I1(xc)+I2(xd)
wherein x isc,xdAre respectively a computing node I1,I2Input of (1)1(xc),I2(xd) Are respectively a computing node I1,I2Output of (2)
Figure BDA0002861999840000102
Representing a computing node I1,I2Output characteristic diagram I1(xc),I2(xd) Fusion, as input to compute node i, by a compute unit
Figure BDA0002861999840000103
After processing, as the output of compute node i.
In step S22, the computing module includes M computing nodes. Then the coding structure of a computing module at this time is:
Figure BDA0002861999840000104
in step S23, the chromosome is a neural network architecture, and each neural network architecture includes five computing modules. Then, at this time, the coding structure of a neural network architecture is:
Figure BDA0002861999840000105
and thirdly, uniformly sampling the individuals in the population, training based on training data, learning a structure weight for each computing node, and evaluating the fitness of the individuals by adopting the classification precision of a verification set as a fitness function. The method specifically comprises the following steps:
in step S31, the predetermined training data set is divided into B batches (batch) on average according to the size of the batch data (batch size). And B is a natural number larger than N. In each batch, randomly selecting an individual from the parent population P, decoding the individual into a corresponding neural network, and training until a maximum training batch B is reached. Based on this, in the present embodiment, the batch size is set to 256.
In step S32, the fitness value of each individual in the parent population is evaluated, using the classification accuracy on the pictures of the validation data set as the fitness function:

fitness = G / H

where G is the number of pictures the model identifies correctly and H is the total number of pictures in the validation set.
Fourth, a parent population P is constructed using binary tournament selection. Specifically:
Step S41, randomly select two individuals from the original population; retain the individual with the higher fitness value in the parent population P and return the individual with the lower fitness to the original population.
Step S42, repeat step S41 until the number of individuals in the parent population P reaches the preset number K; in this embodiment K = 40.
Fifth, based on the given crossover rate pc, chromosome individuals in the parent population are crossed pairwise with the hybrid crossover method to obtain new chromosomes, which form the offspring population Q. In this embodiment pc = 0.95. The hybrid crossover method, shown in FIG. 4, proceeds as follows:
step S51, the integer part and the binary part of each chromosome are split into an integer chromosome part and a binary chromosome part.
Step S52, randomly generate a random number r in the interval [0, 1] and randomly select two individuals p1 and p2 from the parent population P; the random number r determines whether p1 and p2 undergo a crossover operation.
Step S53, if r ≤ pc, align the left ends of the integer chromosome parts of the two chromosomes and perform single-point crossover: a crossover point is set at random in the two integer chromosomes, at the same position in both, and the genes after the crossover point are exchanged. Align the left ends of the binary chromosome parts of the two chromosomes and perform multi-point crossover: several crossover points are selected at random in the two binary chromosomes, at the same positions in both, and the genes at the crossover points are exchanged. Store the two individuals q1 and q2 produced by this hybrid crossover method in the offspring population Q.
Step S54, if r > pc, store the two individuals p1 and p2 selected in step S52 in the offspring population Q unchanged.
Sixth, based on the given mutation rate pm, mutation operations are performed on the individuals in the offspring population Q with the hybrid mutation method. In this embodiment pm = 0.1. The hybrid mutation method, shown in FIG. 4, proceeds as follows:
step S61, the integer part and the binary part of each chromosome are split into an integer chromosome part and a binary chromosome part.
Step S62, for each gene position in each chromosome individual, randomly generate a random number t in the interval [0, 1]; this random number determines whether that gene position of the individual is mutated.
Step S63, if t ≤ pm, perform a polynomial mutation operation on the integer chromosome part of the chromosome:

a_i' = a_i + δ · (a_i^max − a_i^min),  with  δ = (2u)^(1/(η+1)) − 1 if u < 0.5,  δ = 1 − (2(1 − u))^(1/(η+1)) if u ≥ 0.5

where a_i denotes the gene at the i-th gene position in the chromosome, a_i' is the new gene produced from a_i, u is a random number generated in the interval [0, 1], a_i^max and a_i^min are the upper and lower bounds of variation of gene a_i, and η is the mutation distribution index.
Step S64, if t ≤ pm, perform a flip mutation operation on the binary chromosome part of the chromosome: several mutation points are selected at random in the chromosome, and the gene at each mutation point is flipped, a 0 becoming 1 and a 1 becoming 0.
Seventh, each individual in the offspring population Q is decoded into its corresponding neural network, its structure weights are obtained by inheritance or random initialization, and its fitness is evaluated using the classification accuracy on the validation set as the fitness function.
Each individual in the offspring population obtains its structure weights by inheritance or random initialization as follows: for any chromosome individual in the offspring population Q, if a computing node in the individual was produced by the hybrid crossover method of step S5, it inherits the weights of the corresponding computing node in the parent chromosome; if it was produced by the hybrid mutation method of step S6, the weights of the computing node are generated by random initialization.
Eighth, the parent population P and the offspring population Q are merged into a population R, and individuals are selected as the original population of the next generation with the environmental selection method, feeding back to step S4 until the preset maximum number of generations is reached. After the evolution finishes, the individual with the highest fitness value is output as the optimal neural network architecture.
Step S81, sort the individuals in population R by fitness value, from high to low.
Step S82, according to the preset population size N, select the individuals ranked 1 through N in population R as the next-generation population.
To verify the advantages of the present invention, the following comparisons were made:
the dataset used by the invention is ImageNet. ImageNet is a large visual data set used for visual object recognition studies. The image classification method comprises more than 1400 million images, and is divided into a training set, a verification set and a test set, and the training set, the verification set and the test set comprise 20000 categories.
The hyper-parameters of the algorithm are designed as follows:
the initial channel number C = 32 and the maximum number of generations Generation = 100. The SGD optimizer is initialized with an initial learning rate lr = 0.1, a weight decay coefficient w = 0.0003, and a momentum coefficient m = 0.9.
After the algorithm iterations finish, the individual with the best fitness value is output and decoded into the corresponding neural network architecture, EvoNet. The network's structure parameters are re-initialized, and the architecture is trained on the training data set until convergence; the test data set is then used to test the performance of the architecture.
The optimization process and the final individual's test process on ImageNet are shown in FIGS. 5 and 6 respectively; the invention attains high predicted classification accuracy during the search, with a top-1 classification accuracy of 77.4%.
Table 2 compares the performance of the neural network architecture found by the invention with existing manually designed architectures and neural architecture search algorithms. From Table 2 it can be seen that the architecture found by the invention outperforms both.
Table 2 Comparison of experimental results (the table is reproduced as an image in the original publication)

Claims (10)

1. An improved search method of an evolutionary neural network architecture based on a hyper-network is characterized by comprising the following steps:
step S1, packaging five calculation modules by taking the input layer as the first layer; m computing nodes are packaged in each module, and finally, a full connection layer is used as an output layer of the neural network; m is a natural number greater than 1;
step S2, coding the neural network structure by a mixed coding mode, and binarizing the connection of the internal calculation nodes of the neural network; randomly generating N chromosomes to construct an original population; the number of the computing nodes in any chromosome is less than the total number of the computing nodes of a preset chromosome; n is a natural number greater than 1;
step S3, uniformly sampling the individuals in the population, training based on training data, learning a structure weight for each computing node, and performing fitness evaluation on the individuals by adopting the classification precision of a verification set as a fitness function;
step S4, constructing a parent population P by adopting a binary tournament selection method;
step S5, based on a given crossover rate pc, crossing chromosome individuals of the parent population pairwise with a hybrid crossover method to obtain new chromosomes that form an offspring population Q;
step S6, based on a given mutation rate pm, performing mutation operations on individuals in the offspring population Q with a hybrid mutation method;
step S7, decoding each individual in the offspring population Q into a corresponding neural network, obtaining a structure weight value in an inheritance or random initialization mode, and adopting the classification precision of a verification set as a fitness function to evaluate the fitness of the individual;
step S8, merging the parent population P and the child population Q into a population R, selecting a plurality of individuals as the original population of the next generation by adopting an environment selection method, and feeding back to the step S4 until a preset maximum evolution generation is reached; and after the evolution is finished, outputting the individual with the highest fitness value as an optimal neural network architecture.
2. The improved search method of an evolutionary neural network architecture based on a hyper-network as claimed in claim 1, wherein the input layer is built, in order, from a convolutional layer, a ReLU activation function, and a batch normalization layer.
3. The improved search method of an evolutionary neural network architecture based on a hyper-network as claimed in claim 1, wherein in step S1 the computing node is a computing unit in the neural network and can be selected at random from the operation search space θ; all computing nodes in the first, third, and fifth computation modules have stride 1; all computing nodes in the second and fourth computation modules have stride 2.
4. The improved searching method for neural network architecture based on super network as claimed in claim 1, wherein in step S2, said hybrid coding mode is a coding mode combining integer and binary number; describing the types of the computing nodes in the neural network architecture and the connection relation between the nodes by using integer coding; binarizing the connection relation of the computing nodes in the neural network architecture by using binary numbers to describe whether the connection between the two computing nodes is activated or not; the method specifically comprises the following steps:
in step S21, a computing node is encoded as a quintuple (θ_a^i, I1, I2, J1, J2), wherein θ_a^i denotes the computing unit a contained in computing node i, with θ_a^i ∈ θ; I1 and I2 are integers giving the indexes of the computing nodes connected to node i, i.e. node i is connected to nodes I1 and I2; J1 and J2 are binary numbers describing the four possible states of the two connections, specifically: J1 = 0, J2 = 0 denotes that the connections of node i to both node I1 and node I2 are active; the output feature maps of nodes I1 and I2 are then fused and used as the input of node i, and the output δ of node i is:

δ = θ_a^i(I1(xc) + I2(xd))

J1 = 0, J2 = 1 denotes that the connection of node i to node I1 is active and the connection to node I2 is closed; the output δ of node i is then:

δ = θ_a^i(I1(xc))

J1 = 1, J2 = 0 denotes that the connection of node i to node I1 is closed and the connection to node I2 is active; the output δ of node i is then:

δ = θ_a^i(I2(xd))

J1 = 1, J2 = 1 denotes that the connections of node i to both node I1 and node I2 are closed, i.e. the current node i is masked; the fused output feature maps of nodes I1 and I2 then bypass the computing unit θ_a^i and serve directly as the output value δ of node i:

δ = I1(xc) + I2(xd)

wherein xc and xd are the inputs of nodes I1 and I2 respectively, I1(xc) and I2(xd) are their outputs, and I1(xc) + I2(xd) denotes the fusion of the two output feature maps, which serves as the input of node i and, after processing by the computing unit θ_a^i, becomes the output of node i;
step S22, a computation module contains M computing nodes, and the coding structure of one computation module is the concatenation of its M node quintuples;
step S23, a chromosome is a neural network architecture, each neural network architecture contains five computation modules, and the coding structure of one neural network architecture is the concatenation of the codes of its five computation modules.
5. the improved search method for an evolutionary neural network architecture based on a super network as claimed in claim 1, wherein in step S3, aiming at the individuals in the population, uniform sampling is performed, training is performed based on training data, a structure weight is generated for each computing node, and fitness evaluation is performed on the individuals by using the classification precision of the validation set as a fitness function; the method specifically comprises the following steps:
step S31, equally dividing a predetermined training data set into B batches (batch) according to the size of the given batch size; b is a natural number larger than N; in each batch, randomly selecting an individual from the parent population P, decoding the individual into a corresponding neural network for training until a maximum training batch B is reached;
step S32, evaluating the fitness value fitness of each individual in the parent population; the method comprises the following steps of adopting the classification accuracy of the pictures in the verification data set as a fitness function to evaluate the fitness, wherein the expression is as follows:
Figure FDA0002861999830000035
wherein G is the number of pictures with correct model identification, and H is the total number of pictures in the verification set.
6. The improved searching method for neural network architecture based on super networks as claimed in claim 1, wherein in step S4, for said binary tournament selection method, the steps are as follows:
step S41, randomly selecting two individuals from the original population, reserving the individual with higher fitness value to the parent population P according to the fitness value, and putting the individual with lower fitness value back to the original population;
and step S42, repeating step S41 until the number of individuals contained in the parent population P reaches a preset number of individuals K, wherein K is a natural number more than 1.
7. The improved search method of an evolutionary neural network architecture based on a hyper-network as claimed in claim 1, wherein in step S5, based on a given crossover rate pc, chromosome individuals in the parent population P are crossed pairwise with a hybrid crossover method to obtain new chromosome individuals; the steps are as follows:
step S51, split each chromosome into its integer chromosome part and its binary chromosome part;
step S52, randomly generate a random number r in the interval [0, 1] and randomly select two individuals p1 and p2 from the parent population P; the random number r determines whether p1 and p2 undergo a crossover operation;
step S53, if r ≤ pc, align the left ends of the integer chromosome parts of the two chromosomes and perform single-point crossover, i.e. a crossover point is set at random in the two integer chromosomes, at the same position in both, and the genes after the crossover point are exchanged; align the left ends of the binary chromosome parts of the two chromosomes and perform multi-point crossover, i.e. several crossover points are selected at random in the two binary chromosomes, at the same positions in both, and the genes at the crossover points are exchanged; store the two individuals q1 and q2 produced by the hybrid crossover method in the offspring population Q;
step S54, if r > pc, store the two individuals p1 and p2 selected in step S52 in the offspring population Q.
8. The improved search method of an evolutionary neural network architecture based on a hyper-network as claimed in claim 1, wherein in step S6, based on a given mutation rate pm, mutation operations are performed on individuals in the offspring population Q with a hybrid mutation method; the specific steps are:
step S61, split each chromosome into its integer chromosome part and its binary chromosome part;
step S62, for each gene position in each chromosome individual, randomly generate a random number t in the interval [0, 1]; this random number determines whether that gene position of the individual is mutated;
step S63, if t ≤ pm, perform a polynomial mutation operation on the integer chromosome part of the chromosome:

a_i' = a_i + δ · (a_i^max − a_i^min),  with  δ = (2u)^(1/(η+1)) − 1 if u < 0.5,  δ = 1 − (2(1 − u))^(1/(η+1)) if u ≥ 0.5

wherein a_i denotes the gene at the i-th gene position in the chromosome, a_i' is the new gene produced from a_i, u is a random number generated in the interval [0, 1], a_i^max and a_i^min are the upper and lower bounds of variation of gene a_i, and η is the mutation distribution index;
step S64, if t ≤ pm, perform a flip mutation operation on the binary chromosome part of the chromosome, i.e. several mutation points are selected at random in the chromosome and the gene at each mutation point is flipped, a 0 becoming 1 and a 1 becoming 0.
9. The improved searching method for the neural network architecture based on the evolution of the super network as claimed in claim 1, wherein in step S7, each individual in the offspring population obtains the structure weight through inheritance or random initialization, specifically: for any chromosome individual in the offspring population Q, if any calculation node in the chromosome individual is obtained by the hybrid crossover method of the step S5, inheriting the weight from the corresponding calculation node in the parent generation chromosome individual; if the hybrid mutation method in step S6 is used, the weight of the computing node is generated by random initialization.
10. The improved searching method for the neural network architecture based on the evolution of the super network as claimed in claim 1, wherein in step S8, the parent population P and the child population Q are combined into a population R, and a plurality of individuals are selected as the original population of the next generation by using the environment selection method, which comprises the following specific steps:
step S81, sort the individuals in population R by fitness value, from high to low;
step S82, according to the preset population size N, select the individuals ranked 1 through N in population R as the next-generation population.
CN202011567363.5A 2020-12-26 2020-12-26 Improved search method of evolutionary neural network architecture based on hyper-network Pending CN112561039A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202011567363.5A CN112561039A (en) 2020-12-26 2020-12-26 Improved search method of evolutionary neural network architecture based on hyper-network

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202011567363.5A CN112561039A (en) 2020-12-26 2020-12-26 Improved search method of evolutionary neural network architecture based on hyper-network

Publications (1)

Publication Number Publication Date
CN112561039A (en) 2021-03-26

Family

ID=75033047

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202011567363.5A Pending CN112561039A (en) 2020-12-26 2020-12-26 Improved search method of evolutionary neural network architecture based on hyper-network

Country Status (1)

Country Link
CN (1) CN112561039A (en)

Cited By (11)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN113128432A (en) * 2021-04-25 2021-07-16 四川大学 Multi-task neural network architecture searching method based on evolutionary computation
CN113128432B (en) * 2021-04-25 2022-09-06 四川大学 Machine vision multitask neural network architecture searching method based on evolution calculation
CN113537399A (en) * 2021-08-11 2021-10-22 西安电子科技大学 Polarized SAR image classification method and system of multi-target evolutionary graph convolution neural network
CN113642730A (en) * 2021-08-30 2021-11-12 Oppo广东移动通信有限公司 Convolutional network pruning method and device and electronic equipment
WO2023124342A1 (en) * 2021-12-31 2023-07-06 江南大学 Low-cost automatic neural architecture search method for image classification
CN114997360A (en) * 2022-05-18 2022-09-02 四川大学 Evolution parameter optimization method, system and storage medium of neural architecture search algorithm
CN114997360B (en) * 2022-05-18 2024-01-19 四川大学 Evolution parameter optimization method, system and storage medium of neural architecture search algorithm
CN114943866A (en) * 2022-06-17 2022-08-26 之江实验室 Image classification method based on evolutionary neural network structure search
CN114943866B (en) * 2022-06-17 2024-04-02 之江实验室 Image classification method based on evolutionary neural network structure search
CN115359337A (en) * 2022-08-23 2022-11-18 四川大学 Searching method, system and application of pulse neural network for image recognition
CN115359337B (en) * 2022-08-23 2023-04-18 四川大学 Searching method, system and application of pulse neural network for image recognition

Similar Documents

Publication Publication Date Title
CN112561039A (en) Improved search method of evolutionary neural network architecture based on hyper-network
CN102413029B (en) Method for partitioning communities in complex dynamic network by virtue of multi-objective local search based on decomposition
CN111737535B (en) Network characterization learning method based on element structure and graph neural network
CN111275172B (en) Feedforward neural network structure searching method based on search space optimization
CN112465120A (en) Fast attention neural network architecture searching method based on evolution method
CN110232434A (en) A kind of neural network framework appraisal procedure based on attributed graph optimization
Gao et al. An improved clonal selection algorithm and its application to traveling salesman problems
Wen et al. Learning ensemble of decision trees through multifactorial genetic programming
CN113128432B (en) Machine vision multitask neural network architecture searching method based on evolution calculation
Pawar et al. Optimized ensembled machine learning model for IRIS plant classification
Bedboudi et al. An heterogeneous population-based genetic algorithm for data clustering
Pan et al. Neural architecture search based on evolutionary algorithms with fitness approximation
Chen et al. A new multiobjective evolutionary algorithm for community detection in dynamic complex networks
Broni-Bediako et al. Evolutionary NAS with gene expression programming of cellular encoding
CN114241267A (en) Structural entropy sampling-based multi-target architecture search osteoporosis image identification method
Wei et al. MOO-DNAS: Efficient neural network design via differentiable architecture search based on multi-objective optimization
CN116611504A (en) Neural architecture searching method based on evolution
Parsa et al. Multi-objective hyperparameter optimization for spiking neural network neuroevolution
Hu et al. Apenas: An asynchronous parallel evolution based multi-objective neural architecture search
Chen et al. MFENAS: multifactorial evolution for neural architecture search
CN115620046A (en) Multi-target neural architecture searching method based on semi-supervised performance predictor
Ma et al. Auto-ORVNet: Orientation-boosted volumetric neural architecture search for 3D shape classification
Xue et al. RARTS: an efficient first-order relaxed architecture search method
Zhang et al. A fast evolutionary knowledge transfer search for multiscale deep neural architecture
Dong et al. Conditionally tractable density estimation using neural networks

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination