CN112784949A

CN112784949A - Neural network architecture searching method and system based on evolutionary computation

Info

Publication number: CN112784949A
Application number: CN202110120132.8A
Authority: CN
Inventors: 高明柯; 杜欣军; 赵�卓; 逄涛; 冒睿瑞; 张浩博; 郭威; 王熠; 刘晓娟; 于楠
Original assignee: CETC 32 Research Institute
Current assignee: CETC 32 Research Institute
Priority date: 2021-01-28
Filing date: 2021-01-28
Publication date: 2021-05-11
Anticipated expiration: 2041-01-28
Also published as: CN112784949B

Abstract

The invention provides a neural network architecture searching method and a system based on evolutionary computation, which comprises the following steps: setting a target requirement through a target function according to the target requirement and the platform requirement; according to the set size of the search space, randomly generating N directed cyclic graphs based on the sub-network module set to serve as a network search space for evolutionary optimization; under the guidance of heuristic information, combining with a pheromone dynamic volatilization and probability path selection mechanism, searching N directed acyclic graphs for optimizing paths in N randomly generated acyclic graphs through an ant colony algorithm to form a candidate set; acquiring the accuracy and reasoning time delay of N optimizing paths in the candidate set through training and testing, and selecting an optimal result as a current optimal network structure; and evaluating whether the current network architecture meets the target requirements. The invention has certain application flexibility and expandability, and obtains the neural network model which realizes good balance between precision and speed under the condition of resource limitation.

Description

Neural network architecture searching method and system based on evolutionary computation

Technical Field

The invention relates to the technical field of architecture design and optimization of a deep neural network, in particular to a neural network architecture searching method and system based on evolutionary computation.

Background

Deep learning has a powerful automatic feature extraction function on unstructured data and has powerful automatic representation capability, so that the deep learning makes major breakthrough and progress in many fields such as machine translation, image recognition, voice recognition, target detection, natural language processing and the like. Based on the importance of the design of neural network architectures to the characterization of data and the ultimate performance, researchers have focused on designing various complex neural network architectures to obtain good data characterization. However, the design of neural network architectures relies heavily on a priori knowledge and experience of researchers, requiring a great deal of time and effort. The prior knowledge and the fixed thinking paradigm of human beings are difficult to find a better network architecture to a certain extent, and beginners are also difficult to reasonably modify the network architecture according to the actual needs of the beginners. Therefore, Neural Architecture Search (NAS) has come into play. The NAS aims at automatically designing a neural network architecture with optimal performance by utilizing an algorithm under the condition of limited computing resources, and manual intervention is reduced as far as possible. The research that the network architecture obtained by using the reinforcement learning method achieves the SOTA classification precision on the image classification task is considered as the pioneering work of the NAS, and the idea of designing the automatic network architecture is also shown to be feasible. Subsequently, the feasibility of this idea was again verified by the research work of large-scale evolutionary computation using evolutionary learning to obtain similar results. NAS has been rapidly applied in target detection, semantic segmentation, counterlearning, building scale, and multi-objective optimization.

As the NAS method requires a strong computational support and consumes a huge amount of computation, research has been conducted to reconstruct a search space to reduce a search range and reduce search complexity, and to accelerate a search of a network architecture through strategies such as parameter sharing, model reuse, gradient optimization, and the like to reduce the amount of computation. Early NAS trained each candidate network architecture from scratch in the architecture search phase, resulting in a proliferation of computational effort; although a parameter sharing strategy is adopted to speed up the process of architecture search, the inaccurate ranking of the candidate architectures is likely to cause the NAS to have difficulty in selecting the optimal network architecture from a large number of candidate architectures, thereby further reducing the performance of the finally searched network architecture. The search space is relaxed from discrete to continuous by the differentiable neural network architecture search method based on One-Shot, so that the architecture and the learning weight can be searched simultaneously by using gradient descent, the search time is shortened, but when the number of search rounds is too large, a searched architecture contains a lot of jump connections, and the network becomes shallower. Shallow networks can learn fewer parameters and have weaker expression ability, resulting in a sharp decline in network performance. Although the improved differentiable neural network architecture searching method adopts an early-stopping mechanism to directly control the number of the jump connections, the control of the early-stopping mechanism is an important problem, and the early stopping can cause the incomplete architecture searching. Therefore, how to balance performance and efficiency under the condition of limited resources is a problem to be solved urgently.

Domestic patent CN111353313A discloses an emotion analysis model construction method based on evolutionary neural network architecture search, which comprises the following steps: group initialization; packaging a plurality of convolutional layer units, a plurality of pooling units and a plurality of full-connection units by taking the embedded layer as a first layer, and ending by the full-connection units to randomly generate M chromosomes; the accuracy is adopted as a fitness function to carry out fitness evaluation; selecting a plurality of chromosome individuals by adopting a roulette method to form a first chromosome population; carrying out pairwise crossing on chromosome individuals of the first chromosome population by adopting an unequal length chromosome crossing method to obtain a plurality of chromosome individuals to form a second chromosome population; adding or replacing or deleting a certain convolution layer unit or pooling unit or full-link unit of chromosome individuals of the second chromosome population; and calculating the fitness of the chromosome individuals of the second chromosome population until reaching a preset iteration number, and selecting the chromosome individuals with the optimal neural network structure by adopting the fitness.

The domestic patent CN111144555A discloses a cyclic neural network architecture searching method, a system and a medium based on an improved evolutionary algorithm, and the method comprises the steps of training a plurality of cyclic neural network submodels to update shared weight; initializing a generation population and a historical record table for recording the performance of all the recurrent neural network models; randomly sampling from the population to generate samples, selecting an optimal sample model for mutation operation, removing the oldest or worst model in the population according to specified probability, and adding the mutated child nodes into the population and a historical record table; and judging whether a preset finishing condition is met, if not, continuing to perform sample variation, and otherwise, outputting the optimal model in the history table. The invention can accelerate the process of searching the recurrent neural network architecture, and can greatly improve the efficiency of searching the recurrent neural network architecture by simultaneously considering the performance and the searching time when updating the seed group in each step.

Domestic patent CN110728355A discloses a neural network architecture search method, device, computer equipment and storage medium, relating to the technical field of deep learning, wherein the method can include: dividing a neural network architecture into M substructures, wherein M is a positive integer greater than 1; searching the topological structures in the substructures respectively; the neural network architecture is obtained by connecting the topological structures in the substructures, so that the searching speed can be improved.

The domestic patent CN110232434A discloses a neural network architecture evaluation method based on attribute graph optimization, which models a neural network architecture into an attribute graph and constructs a Bayesian graph neural network agent model; randomly generating, training and testing a group of neural network architectures, and taking the group of neural network architectures and performance indexes corresponding to the tests as an initial training set, wherein the training set is used for training a Bayesian graph neural network agent model; generating a new neural network candidate set through an evolutionary algorithm according to the current training set and training a Bayesian graph neural network agent model; selecting a potential individual from the neural network candidate set through a maximized acquisition function, then training and testing the individual, and adding the potential individual and a performance index corresponding to the test into the current training set; and repeating the steps under the constraint of fixed cost until the best neural network architecture and the corresponding weight of the architecture are obtained in the current training set.

The foreign patent JP2020522035A discloses a method, system and apparatus for determining the structure of a neural network. The method includes generating a batch of output sequences from current values of controller parameters using a controller neural network having the controller parameters. Generating an instance of a sub-Convolutional Neural Network (CNN) in a batch process from an output sequence, the CNN comprising a plurality of instances having a first convolution unit of an architecture defined by the output sequence; training an instance of the sub-CNN to perform an image processing task and evaluating performance of the instance of the sub-CNS training for determining a performance metric of the sub-CNN training instance, and including adjusting current values of controller parameters of the controller neural network using the performance metric of the trained CNN training instance.

The foreign patent WO2018081563a1 discloses a method, system and apparatus for determining the architecture of a neural network. The method comprises the following steps: generating a batch of output sequences using a controller neural network, each output sequence in the batch being an architecture of a sub-neural network for performing a particular neural network task; for each output sequence in the batch: training an architecture defined by the output sequence through respective sub-neural network instances; evaluating the performance of the sub-neural network training instance on the particular neural network task to determine a performance metric of the sub-neural network training instance on the particular neural network task; and adjusting current values of controller neural network controller parameters using the performance metrics of the sub-neural network training instances.

Disclosure of Invention

Aiming at the defects in the prior art, the invention aims to provide a neural network architecture searching method and system based on evolutionary computation.

The invention provides a neural network architecture searching method based on evolutionary computation, which comprises the following steps:

step S1: setting target requirements through a target function according to the target requirements and the platform requirements, wherein the target requirements comprise: expected accuracy, reasoning time delay, search space size and evolution times;

step S2: according to the set size of the search space, randomly generating N directed cyclic graphs based on the sub-network module set to serve as a network search space for evolutionary optimization;

step S3: under the guidance of heuristic information, combining with a pheromone dynamic volatilization and probability path selection mechanism, searching N directed acyclic graphs for optimizing paths in N randomly generated acyclic graphs through an ant colony algorithm to form a candidate set;

step S4: acquiring the accuracy and reasoning time delay of N optimizing paths in the candidate set through training and testing, and selecting an optimal result as a current optimal network structure;

step S5: evaluating whether the current network architecture meets the target requirement, when the current network architecture does not meet the preset target requirement, and when the current network architecture meets the speed and precision requirements, pausing the optimization result, carrying out real-time evolution mutation on the internal structures of all the sub-network modules based on the current optimal network architecture, and continuing iteration until the preset target requirement is met; and when the preset target requirement is met, outputting the current optimal network result, otherwise, quitting the searching process and outputting the abnormal searching.

Preferably, the step S1 includes:

the objective function formula is as follows:

s.t.LAT(Net)≤T&ACC(Net)≥A

wherein, the target function is defined as a multi-target search; net represents the network obtained by the evolutionary algorithm; acc (net) indicates the accuracy of the network; lat (net) represents the inference delay of the network; t represents the expected inference time delay; a represents the desired accuracy; the expected accuracy is set according to preset target requirements; the expected inference time delay is set according to mobile, embedded or general platform types.

Preferably, the step S2 includes:

the sub-network modules are nodes in an undirected cyclic graph; the sub-network modules comprise a plurality of types of sub-networks with M layers, and the types of the sub-networks can be expanded and selected according to target requirements and platform requirements;

the structure of the sub-network comprises: multi-convolution layers, ResNet blocks, depth separable convolutions, reversed residual structure with linear bottlenecks, and lightweight attention structure based on compression-excitation structures.

Preferably, the network search space comprises: the network search space comprises N search subspaces;

wherein ,

representing the i-th generation of a collection of sub-network modules,

represents the jth sub-network module in the ith generation,

representing an edge set of the ith generation of search space, and connecting the sub-network modules through edges;

representing an edge between a jth sub-network module and a kth sub-network module in an ith generation;

representing the ith generation nth search space; i denotes an iteration number.

Preferably, the step S3 includes:

step S3.1: selecting any point in each search subspace as a starting point, selecting a node farthest from the starting point as an end point, and initializing the ant number, the pheromone intensity constant and the cycle number;

step S3.2: calculating heuristic information;

wherein ,η_I，J(t) heuristic information from node I to node J at time t; dep_I，J、Wig_I，J、Con_I，J、Fil_I，JRespectively representing the depth, the width, the connectivity and the number of filters of the joint J, wherein omega represents an excitation factor, and omega is more than or equal to 0 and less than or equal to 1; the reward mechanism is defined as exciting all nodes in the current optimal network architecture; the smaller the omega value is, the larger heuristic information eta is, and the initial value of omega is set to be 1 because evolution is not generated in the initial network search space;

step S3.3: selecting a probability path;

wherein ,

representing the probability that the ant m moves from the point I to the point J at the t-th moment; allowed_mRepresenting nodes which can be selected by ants in the next step; alpha represents pheromone elicitation factors, represents the effect of residual pheromones on the paths in the optimizing process, and the larger the value is, the stronger the cooperation capability among ants is, and the paths passed by other ants are prone to be selected; beta represents an expected value heuristic factor, which shows the accuracy and the degree of importance of reasoning time delay when ants select paths, and the larger the value is, the closer the state transition rule is to the greedy rule; tau is_I，J(t) pheromones on the path from point I to point J at time t; tau is_I,S(t) indicates points I to allowed_mPheromone on any point path;

representing I to allowed_mHeuristic information on the path of any point in the tree;

step S3.4: the pheromone is volatilized dynamically;

wherein ,ρ_I，J(t) represents the volatility coefficient on the path from I to J at the moment t; eta_I，J(t) heuristic information on the path from time I to J at t;

η_irepresenting all initiation information, wherein L represents the total number of nodes in the current network;

step S3.5: performing pheromone increment calculation;

wherein Q is a pheromone strength constant, which is the total amount of pheromones released by ants on a path traveled in one cycle; eta_mTo representThe total amount of heuristic information suffered by the mth ant in the cycle;

step S3.6: updating pheromone;

τ_I，J(t+1)＝(1-ρ)τ_I，J(t)+Δτ_I，J(t,t+1)

wherein rho is the pheromone dynamic volatility coefficient;

represents the pheromone increment left by the mth ant on the path (I, J) in the current cycle, delta tau_I,J(t, t +1) represents pheromone increment left by all ants passing through the path (I, J) in the current cycle; k represents the total number of ants passing through the paths (I, J) in the current cycle;

step S3.7: and (4) optimizing and judging: when the optimization of all the search subspaces reaches the maximum cycle times, the circulation is exited, and the optimization results of all the search subspaces are output as a current candidate set; otherwise, step S3.2 to step S3.7 are repeated until the maximum number of cycles is reached.

Preferably, the evolving mutations in step S5 include: setting the excitation factors omega in all the sub-network modules in the current optimal network architecture as constants, wherein omega is more than or equal to 0 and less than or equal to 1; and simultaneously, randomly selecting mutation operation in the mutation set, promoting the internal structure of the sub-network module, generating a next generation sub-network module, and repeatedly executing the steps S2 to S5 until the preset target requirement is met.

The invention provides a neural network architecture search system based on evolutionary computing, which comprises:

module M1: setting target requirements through a target function according to the target requirements and the platform requirements, wherein the target requirements comprise: expected accuracy, reasoning time delay, search space size and evolution times;

module M2: according to the set size of the search space, randomly generating N directed cyclic graphs based on the sub-network module set to serve as a network search space for evolutionary optimization;

module M3: under the guidance of heuristic information, combining with a pheromone dynamic volatilization and probability path selection mechanism, searching N directed acyclic graphs for optimizing paths in N randomly generated acyclic graphs through an ant colony algorithm to form a candidate set;

module M4: acquiring the accuracy and reasoning time delay of N optimizing paths in the candidate set through training and testing, and selecting an optimal result as a current optimal network structure;

module M5: evaluating whether the current network architecture meets the target requirement, when the current network architecture does not meet the preset target requirement, and when the current network architecture meets the speed and precision requirements, pausing the optimization result, carrying out real-time evolution mutation on the internal structures of all the sub-network modules based on the current optimal network architecture, and continuing iteration until the preset target requirement is met; and when the preset target requirement is met, outputting the current optimal network result, otherwise, quitting the searching process and outputting the abnormal searching.

Preferably, said module M1 comprises:

the objective function formula is as follows:

s.t.LAT(Net)≤T&ACC(Net)≥A

Preferably, said module M2 comprises:

the structure of the sub-network comprises: the system comprises a multi-convolution layer, a ResNet block, a depth separable convolution, an inverted residual error structure with a linear bottleneck and a lightweight attention structure based on a compression-excitation structure;

the network search space includes: the network search space comprises N search subspaces;

wherein ,

representing the i-th generation of a collection of sub-network modules,

represents the jth sub-network module in the ith generation,

Preferably, said module M3 comprises:

module M3.1: selecting any point in each search subspace as a starting point, selecting a node farthest from the starting point as an end point, and initializing the ant number, the pheromone intensity constant and the cycle number;

module M3.2: calculating heuristic information;

module M3.3: selecting a probability path;

wherein ,

module M3.4: the pheromone is volatilized dynamically;

module M3.5: performing pheromone increment calculation;

wherein Q is a pheromone strength constant, which is the total amount of pheromones released by ants on a path traveled in one cycle; eta_mRepresenting the total amount of heuristic information suffered by the mth ant in the cycle;

module M3.6: updating pheromone;

τ_I，J(t+1)＝(1-ρ)τ_I，J(t)+Δτ_I，J(t，t+1)

wherein rho is the pheromone dynamic volatility coefficient;

represents the pheromone increment left by the mth ant on the path (I, J) in the current cycle, delta tau_I，J(t, t +1) represents pheromone increment left by all ants passing through the path (I, J) in the current cycle; k represents the total number of ants passing through the paths (I, J) in the current cycle;

module M3.7: and (4) optimizing and judging: when the optimization of all the search subspaces reaches the maximum cycle times, the circulation is exited, and the optimization results of all the search subspaces are output as a current candidate set; otherwise, repeatedly triggering the execution of the modules M3.2 to M3.7 until the maximum cycle number is reached;

the mutations evolved in the module M5 include: setting the excitation factors omega in all the sub-network modules in the current optimal network architecture as constants, wherein omega is more than or equal to 0 and less than or equal to 1; and meanwhile, randomly selecting mutation operation in the mutation set, promoting the internal structure of the sub-network module, generating a next generation sub-network module, and repeatedly triggering the execution of the modules M2 to M5 until the preset target requirement is met.

Compared with the prior art, the invention has the following beneficial effects:

1. according to the invention, through the idea of modular graph theory construction, a system level searching subspace taking the sub-network modules as basic components is constructed, so that the complexity of the searching space can be effectively reduced, the framework searching process is accelerated in a module level searching mode, and the searching performance is improved; by utilizing the heuristic optimization capability of the ant colony algorithm, the network performance reduction caused by the fact that the framework search is trapped into local optimization can be prevented from the global angle; the reward and mutation evolution mechanism are fused, and the evolution is as comprehensive as possible through a structure level mode, so that the incomplete framework search can be avoided.

2. The invention can set the expected target through the target function according to the actual application requirement and the platform requirement, is not limited by an application platform, simultaneously encourages the diversity of the structure in the module of the whole network, has certain application flexibility and expandability, and can obtain a neural network model realizing good balance between precision and speed under the condition of resource limitation.

Drawings

Other features, objects and advantages of the invention will become more apparent upon reading of the detailed description of non-limiting embodiments with reference to the following drawings:

fig. 1 is a flowchart of a neural network architecture search method based on evolutionary computation.

Detailed Description

The present invention will be described in detail with reference to specific examples. The following examples will assist those skilled in the art in further understanding the invention, but are not intended to limit the invention in any way. It should be noted that it would be obvious to those skilled in the art that various changes and modifications can be made without departing from the spirit of the invention. All falling within the scope of the present invention.

Example 1

Specifically, the step S1 includes:

the objective function formula is as follows:

s.t.LAT(Net)≤T&ACC(Net)≥A

Specifically, the step S2 includes:

Specifically, the network search space includes: the network search space comprises N search subspaces;

wherein ,

representing the i-th generation of a collection of sub-network modules,

represents the jth sub-network module in the ith generation,

Specifically, the step S3 includes:

step S3.2: calculating heuristic information;

step S3.3: selecting a probability path;

wherein ,

indicates the t-th timeThe probability that ant m moves from point I to point J; allowed_mRepresenting nodes which can be selected by ants in the next step; alpha represents pheromone elicitation factors, represents the effect of residual pheromones on the paths in the optimizing process, and the larger the value is, the stronger the cooperation capability among ants is, and the paths passed by other ants are prone to be selected; beta represents an expected value heuristic factor, which shows the accuracy and the degree of importance of reasoning time delay when ants select paths, and the larger the value is, the closer the state transition rule is to the greedy rule; tau is_I，J(t) pheromones on the path from point I to point J at time t; tau is_I,S(t) indicates points I to allowed_mPheromone on any point path;

step S3.4: the pheromone is volatilized dynamically;

step S3.5: performing pheromone increment calculation;

step S3.6: updating pheromone;

τ_I，J(t+1)＝(1-ρ)τ_I，J(t)+Δτ_I，J(t，t+1)

wherein rho is the pheromone dynamic volatility coefficient;

Specifically, the evolving mutations in step S5 include: setting the excitation factors omega in all the sub-network modules in the current optimal network architecture as constants, wherein omega is more than or equal to 0 and less than or equal to 1; and simultaneously, randomly selecting mutation operation in the mutation set, promoting the internal structure of the sub-network module, generating a next generation sub-network module, and repeatedly executing the steps S2 to S5 until the preset target requirement is met.

Specifically, the module M1 includes:

the objective function formula is as follows:

s.t.LAT(Net)≤T&ACC(Net)≥A

Specifically, the module M2 includes:

wherein ,

representing the i-th generation of a collection of sub-network modules,

represents the jth sub-network module in the ith generation,

Specifically, the module M3 includes:

module M3.2: calculating heuristic information;

module M3.3: selecting a probability path;

wherein ,

module M3.4: the pheromone is volatilized dynamically;

module M3.5: performing pheromone increment calculation;

module M3.6: updating pheromone;

τ_I，J(t+1)＝(1-ρ)τ_I，J(t)+Δτ_I，J(t,t+1)

wherein rho is the pheromone dynamic volatility coefficient;

Example 2

Example 2 is a modification of example 1

The invention provides a neural network architecture searching method and system based on evolutionary computation, which are used for exploring an optimal neural network architecture in a mode of combining module-level search and structure-level evolution by utilizing the optimization capability and a reward evolution mechanism of an ant colony algorithm on the basis of a neural network initialization module according to the idea constructed by a modular graph theory, and can take account of speed and precision to solve the technical problems under the condition of resource limitation.

Aiming at the problem that the performance and efficiency are difficult to balance under the condition of limited resources in the existing neural network architecture search, the invention aims to provide a neural network architecture search method and system based on evolutionary computation. The invention takes a sub-network module as a basic component, randomly generates a plurality of network searching subspaces (directed acyclic graphs), takes the depth, the breadth, the connectivity and the number of filters of the sub-network module (node) as the complexity of the sub-network module and combines an incentive factor as heuristic information, utilizes an improved ant colony algorithm to search for optimization in the subspaces through an pheromone dynamic volatilization and random probability path selection mechanism, and simultaneously integrates a reward and an evolution mutation mechanism to continuously evolve so as to search for an optimal neural network architecture which gives consideration to both performance and efficiency.

The Evolutionary computing is in the field of computer science, and Evolutionary computing (evolution computing) is a sub-domain related to a combinatorial optimization problem in intelligent computing (Computational Intelligence). The evolution algorithm is influenced by a natural selection mechanism of 'winning or losing' in the biological evolution process and a transmission rule of genetic information, the process is simulated by program iteration, the problem to be solved is regarded as the environment, and in a population consisting of some possible solutions, the optimal solution is sought through natural evolution.

The Evolutionary algorithm or "Evolutionary algorithm" is an algorithm cluster, which generates the inspiration from the biological evolution of nature, despite its many variations, different genetic expression patterns, different crossover and mutation operators, the introduction of special operators, and different regeneration and selection methods. Compared with the traditional optimization algorithms such as a calculus-based method and an exhaustion method, the evolutionary algorithm is a mature global optimization method with high robustness and wide applicability, has the characteristics of self-organization, self-adaptation and self-learning, can not be limited by problem properties, and effectively solves the complex problem which is difficult to solve by the traditional optimization algorithm.

The Neural Network Architecture Search (NAS) is one of the hot spots for deep learning research. NAS aims to design a neural network architecture with optimal performance in an automated manner, with minimal human intervention, by using limited computational resources.

The method is one of group intelligent algorithms in the bionics, which is proposed in 1991 by the italian scholars m.Dorigo et al after inspired by ant foraging in the real world. During the foraging process of ants, each ant releases a chemical substance called pheromone on the path which can exchange information with other ants. As time goes on, the pheromone volatilizes, but the shorter the path that the ants climb over, the slower the corresponding pheromone volatilizes, and the concentration of the pheromone left on the path is relatively higher. Ants can find the pheromone and perceive the concentration of the pheromone, and can select a path with the highest pheromone concentration with higher probability. Thus, more ants can select a path containing high-concentration pheromone, and more ants are attracted to the path to form positive feedback. Based on the principle, ants can quickly find a shortest path away from the food source.

The invention relates to a neural network architecture searching method and system based on evolutionary computation, which comprises the steps of searching target setting, searching space initialization, ant colony optimization, target evaluation, evolution mutation and the like. Firstly, setting a search target parameter from the aspects of accuracy, reasoning time delay, evolution times and the like according to target requirements and platform requirements; then initializing a search space, and randomly generating N search subspaces; on the basis, ant colony optimization is started, and a current candidate set is constructed; then, target evaluation is completed on the data set, and the current optimal network architecture is selected from the candidate set; and finally, carrying out evolution mutation on the internal structures of all the sub-network modules of the current optimal network architecture, and continuing iteration until the target requirement is met.

The searching target setting is to set the expected accuracy and the reasoning time delay through a defined target function according to the target requirement and the platform requirement, and set parameters such as the size of a searching space and the evolution times.

The search space is initialized, namely N directed cyclic graphs are randomly generated for the sub-network module set according to the set size of the search space and are used as the network search space for evolutionary optimization.

Specifically, the network search space is a hierarchically partitioned unidirectional cyclic graph, each sub-network module representing a node in the unidirectional cyclic graph, allowing module-level searching and module internal structure-level mutation and searching.

Specifically, the sub-network modules are nodes in an undirected cyclic graph, are defined as a plurality of types of sub-networks with M layers, can expand the types of the sub-networks and are selected according to target requirements and application platforms, and the sub-network structures of the sub-networks comprise but are not limited to multi-convolution layers, ResNet blocks, deep separable convolutions, inverse residual structure with linear bottleneck, lightweight attention structure based on compression-excitation structure and the like.

The ant colony optimization is that under the guidance of heuristic information, by combining with a pheromone dynamic volatilization and probability path selection mechanism, N directed acyclic graphs are searched for optimizing paths in N randomly generated acyclic graphs through an ant colony algorithm, so that a candidate set of the generation is formed.

And the target evaluation is to obtain the accuracy and reasoning time delay of the N optimizing paths in the candidate set through training and testing, select the optimal result as the optimal network architecture of the generation, and evaluate whether the target requirement is met.

And (3) the evolution mutation, namely awarding rewards for internal structures of all sub-network modules in the optimal network architecture of the generation, and randomly selecting mutation operation in a mutation set to generate the sub-network module of the next generation, so as to simulate the process that excellent individuals in the nature are easier to leave offspring.

Specifically, the mutation set comprises module structure keeping unchanged, convolution type randomly selected, convolution kernel size, filter size, convolution layer insertion, convolution layer deletion, connection addition and connection deletion operations.

A neural network architecture searching method and system based on evolutionary computation comprises the following steps: as shown in figure 1 of the drawings, in which,

step 1, search target setting:

according to the target requirement and the platform requirement, the expected accuracy and the reasoning time delay are set through the defined target function, and parameters such as the size of a search space and the evolution times are set at the same time. In order to seek the balance between accuracy and reasoning time delay, an objective function is defined as a multi-objective searching problem, and the aim is to find a neural network architecture with high accuracy and low thrust time delay. The accuracy can be set according to the specific requirements of users, and TOP-1 or TOP-5 is adopted; the inference delay may be set according to mobile, embedded, or general platform types. The overall objective function is formalized specifically as follows:

s.t. LAT(Net)≤T&ACC(Net)≥A

where Net represents the network obtained by the evolutionary algorithm, acc (Net) represents the accuracy of the network, lat (Net) represents the inference delay of the network, and T represents the desired target delay. A represents the desired accuracy. Lat (Net) in s.t.

Step 2, initializing a search space:

and according to the set size of the search space, randomly generating N directed cyclic graphs from the sub-network module set to serve as the network search space for evolutionary optimization.

Each sub-network module represents a node in the undirected cyclic graph, is defined as a plurality of types of sub-networks with M layers, can extend the sub-network module types and select according to target requirements and application platform requirements, and has a sub-network structure including, but not limited to, multi-convolution layer (2D conv +1x1 conv with 1x1 conv + 3 x 3 filtering), ResNet block, deep separable convolution, reverse residual structure with linear bottleneck, lightweight attention structure based on compression-excitation structure, etc. Edges represent connections between sub-network modules. A search space is composed of N search subspaces, i.e.

The formalization is as follows:

wherein ,

representing the i-th generation of a collection of sub-network modules,

represents the jth sub-network module in the ith generation,

a set of edges representing the ith generation of search space,

Step 3, ant colony optimization:

under the guidance of heuristic information, combining with a pheromone dynamic volatilization and random probability path selection mechanism, searching N directed acyclic optimizing paths in N randomly generated acyclic graphs through an ant colony algorithm, and forming a candidate set of the generation.

Step 3.1: and initializing parameters. And selecting any point in each search subspace as a starting point, automatically selecting the node farthest from the starting point as an end point, initializing the ant number, the pheromone strength constant, the cycle number and the like.

Step 3.2: and (5) heuristic information calculation. The heuristic function has an important influence on the convergence and stability of the optimization process. In the ant colony algorithm, heuristic information eta is usually the inverse ratio of the distance between two points, but the standard of neural network architecture search balances accuracy and reasoning time delay, and the complexity of the internal structure of a sub-network module influencing the calculated quantity, such as depth, width, connectivity and the number of filters, and the excitation given by a reward mechanism need to be considered in the optimization process, so the heuristic information is defined as

wherein ,η_I，J(t) heuristic information from node I to node J at time t; dep_I，J、Wig_I，J、Con_I，J、Fil_I，JRespectively representing the depth, the width, the connectivity and the number of filters of the node J, wherein omega represents an excitation factor, omega is more than or equal to 0 and less than or equal to 1, and a reward mechanism is defined to excite all nodes in the optimal network architecture of the generation. The smaller the value of ω is, the larger heuristic information η is, and since no evolution is generated in the initial network search space, the initial value of ω is set to 1.

Step 3.3: and (3) probability path selection: in each step of the path selection of the subspace search process, a pheromone heuristic factor is usedInfluence of expected value heuristic factor, according to probability

And deciding which way to move next.

Represents the probability that the ant m moves from the point I to the point J at the t-th moment, allowed_mRepresenting nodes which can be selected by ants in the next step; the pheromone heuristic factor alpha represents the effect of the residual pheromone on the path in the optimizing process, and the larger the value is, the stronger the cooperation capability among the ants is, and the paths passed by other ants are prone to be selected; and the expected value heuristic factor beta shows the accuracy and the degree of importance of the reasoning delay when ants select paths, and the larger the value is, the closer the state transition rule is to the greedy rule. Tau is_I，JAnd (t) represents pheromone on the path from point I to point J at time t. Tau is_I,S(t) indicates points I to allowed_mThe pheromone on the path of any point in the tree.

Step 3.4: dynamic volatilization of pheromones: the volatilization speed of pheromones in nature is a dynamic changing process and can be changed by environmental factors such as temperature, humidity and the like along with the lapse of time. In the optimization process of the neural network architecture, the more complex the structure of the nodes, the smaller the heuristic information is; the larger the reasoning time delay, the faster the volatilization speed of the node is, and the less the pheromone remains, so that after all ants reach the end point, the pheromone dynamic volatilization coefficient rho is calculated according to the following relation,

ρ_I，J(t) represents the volatility coefficient on the path from I to J at time t, eta_I，J(t) indicates heuristic information on the path from time I to J at t,

η_irepresenting all the initiation information and L representing the total number of nodes under the network.

Step 3.5: pheromone increment calculation: the pheromone updating model is an important link of random search and rapid convergence of a basic ant colony algorithm. According to the global optimization requirement, the ant surrounding model is modified and formalized as follows:

q is a pheromone strength constant, which is the total amount of pheromones released by ants on the path traveled in a cycle, and affects the convergence rate of the algorithm to some extent. Eta_mRepresenting the total amount of heuristic information the mth ant receives in the cycle.

Step 3.6: and (3) updating pheromone: the pheromone amount on each path is equal at the initial moment, and after an ant completes one cycle, the pheromone gradually volatilizes along with the time, so that the pheromone concentration needs to be updated before the ant enters the next cycle, and the pheromone concentration is formalized as follows:

τ_I，J(t+1)＝(1-ρ)τ_I，J(t)+Δτ_I，J(t，t+1)

p is the dynamic volatility coefficient of the pheromone,

represents the amount of pheromone, delta tau, left on the path (I, J) by the mth ant in the current cycle_I，J(t, t +1) represents the pheromone increment left by all ants passing through path (I, J) in this cycle. K represents the total number of ants passing through the path (I, J) in the current cycle.

Step 3.7: and (4) optimizing and judging: when the optimization of all the search subspaces reaches the maximum cycle times, the circulation is exited, the optimization results of all the search subspaces are output as a current candidate set, and the step 4 is entered; otherwise, the heuristic information calculation is carried out for 3.2, and the circulation is continued.

And 4, step 4: and (4) target evaluation. Acquiring the accuracy and reasoning time delay of N optimizing paths in the candidate set through training and testing, selecting an optimal result as an optimal network architecture of the generation, if the optimal result does not reach the evolution times and does not meet the final target requirement but meets the speed and precision requirements, temporarily storing the optimizing result, entering the step 5, and otherwise discarding the search subspace; and if the maximum evolution times are reached and the target requirements are met, sequencing the optimal network architecture results of each generation in all the evolution processes, entering the step 6, otherwise, exiting the searching process and outputting search abnormity.

And 5: and (5) evolving the mutation. All the sub-network modules in the optimal network architecture are rewarded, namely, an incentive factor omega is set to be a constant, omega is more than or equal to 0 and less than or equal to 1, and the smaller the value of omega is, the larger the incentive effect is. And meanwhile, randomly selecting mutation operation in the mutation set, developing the internal structure of the sub-network module, generating a next generation sub-network module, and returning to the step 2.

Step 6: and outputting a final result. And outputting the optimal neural network architecture.

Those skilled in the art will appreciate that, in addition to implementing the systems, apparatus, and various modules thereof provided by the present invention in purely computer readable program code, the same procedures can be implemented entirely by logically programming method steps such that the systems, apparatus, and various modules thereof are provided in the form of logic gates, switches, application specific integrated circuits, programmable logic controllers, embedded microcontrollers and the like. Therefore, the system, the device and the modules thereof provided by the present invention can be considered as a hardware component, and the modules included in the system, the device and the modules thereof for implementing various programs can also be considered as structures in the hardware component; modules for performing various functions may also be considered to be both software programs for performing the methods and structures within hardware components.

The foregoing description of specific embodiments of the present invention has been presented. It is to be understood that the present invention is not limited to the specific embodiments described above, and that various changes or modifications may be made by one skilled in the art within the scope of the appended claims without departing from the spirit of the invention. The embodiments and features of the embodiments of the present application may be combined with each other arbitrarily without conflict.

Claims

1. A neural network architecture searching method based on evolutionary computing is characterized by comprising the following steps:

2. The evolutionary computing-based neural network architecture search method of claim 1, wherein the step S1 comprises:

the objective function formula is as follows:

s.t.LAT(Net)≤T&ACC(Net)≥A

3. The evolutionary computing-based neural network architecture search method of claim 1, wherein the step S2 comprises:

4. The evolutionary computing-based neural network architecture search method of claim 3, wherein the network search space comprises: the network search space comprises N search subspaces;

wherein ,

representing the i-th generation of a collection of sub-network modules,

represents the jth sub-network module in the ith generation,

5. The evolutionary computing-based neural network architecture search method of claim 1, wherein the step S3 comprises:

step S3.2: calculating heuristic information;

wherein ,η_I，J(t) heuristic information from node I to node J at time t; dep_I，J、Wig_I，J、Con_I，J、Fil_I，JRespectively representing the depth, the width, the connectivity and the number of filters of the joint J, wherein omega represents an excitation factor, and omega is more than or equal to 0 and less than or equal to 1; prize-giving deviceThe excitation mechanism is defined to excite all nodes in the current optimal network architecture; the smaller the omega value is, the larger heuristic information eta is, and the initial value of omega is set to be 1 because evolution is not generated in the initial network search space;

step S3.3: selecting a probability path;

wherein ,

representing the probability that the ant m moves from the point I to the point J at the t-th moment; allowed_mRepresenting nodes which can be selected by ants in the next step; alpha represents pheromone elicitation factors, represents the effect of residual pheromones on the paths in the optimizing process, and the larger the value is, the stronger the cooperation capability among ants is, and the paths passed by other ants are prone to be selected; beta represents an expected value heuristic factor, which shows the accuracy and the degree of importance of reasoning time delay when ants select paths, and the larger the value is, the closer the state transition rule is to the greedy rule; tau is_I，J(t) pheromones on the path from point I to point J at time t; tau is_I，S(t) indicates points I to allowed_mPheromone on any point path;

step S3.4: the pheromone is volatilized dynamically;

representing all initiation information, wherein L represents the total number of nodes in the current network;

step S3.5: performing pheromone increment calculation;

step S3.6: updating pheromone;

τ_I，J(t+1)＝(1-ρ)τ_I，J(t)+Δτ_I，J(t，t+1)

wherein rho is the pheromone dynamic volatility coefficient;

6. The evolutionary computing-based neural network architecture searching method of claim 1, wherein the evolving mutations in step S5 comprise: setting the excitation factors omega in all the sub-network modules in the current optimal network architecture as constants, wherein omega is more than or equal to 0 and less than or equal to 1; and simultaneously, randomly selecting mutation operation in the mutation set, promoting the internal structure of the sub-network module, generating a next generation sub-network module, and repeatedly executing the steps S2 to S5 until the preset target requirement is met.

7. An evolutionary computing-based neural network architecture search system, comprising:

8. The evolutionary computing-based neural network architecture search system of claim 7, wherein the module M1 comprises:

the objective function formula is as follows: