CN110490320B - Deep neural network structure optimization method based on fusion of prediction mechanism and genetic algorithm - Google Patents
Deep neural network structure optimization method based on fusion of prediction mechanism and genetic algorithm Download PDFInfo
- Publication number
- CN110490320B CN110490320B CN201910696239.XA CN201910696239A CN110490320B CN 110490320 B CN110490320 B CN 110490320B CN 201910696239 A CN201910696239 A CN 201910696239A CN 110490320 B CN110490320 B CN 110490320B
- Authority
- CN
- China
- Prior art keywords
- network
- individual
- training
- code
- population
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Active
Links
Classifications
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F18/00—Pattern recognition
- G06F18/20—Analysing
- G06F18/24—Classification techniques
- G06F18/241—Classification techniques relating to the classification model, e.g. parametric or non-parametric approaches
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06N—COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N3/00—Computing arrangements based on biological models
- G06N3/02—Neural networks
- G06N3/08—Learning methods
- G06N3/084—Backpropagation, e.g. using gradient descent
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06N—COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N3/00—Computing arrangements based on biological models
- G06N3/12—Computing arrangements based on biological models using genetic models
- G06N3/126—Evolutionary algorithms, e.g. genetic algorithms or genetic programming
Abstract
The invention discloses a deep neural network structure optimization method based on the fusion of a prediction mechanism and a genetic algorithm, which is used for solving the technical problem of low search efficiency of the conventional network structure search method. The technical scheme is that a deep network structure is coded and expressed to form a network structure code, and then the network structure code is randomly generated to be used as the initial generation of a genetic algorithm; then, carrying out selection, crossing, mutation and prediction processes on individuals in the initial generation, and only carrying out actual training on a network corresponding to an individual with higher expected performance; finally, all individual performances are evaluated and the next round of selection is entered. And after the algorithm is finished, selecting the individual with the best fitness as the optimal network structure under the specific task. By predicting the network performance before the actual training of the network, the time cost for training the search algorithm on the low-price network can be reduced, and the search process of the search algorithm is greatly accelerated.
Description
Technical Field
The invention relates to a network structure searching method, in particular to a deep neural network structure optimization method based on the fusion of a prediction mechanism and a genetic algorithm.
Background
Document 1 "Lingxi Xie, Alan Yuille, Genetic CNN. computer Vision and Pattern Recognition (2017)" proposes a network structure searching method based on Genetic algorithm, which introduces Darwinian theory of evolution, considers the network structure as an individual in a population, and continuously updates the network structure through the processes of selection, intersection, variation and evaluation. However, the network structure search method requires complete training of the network before evaluating the network performance, which consumes a lot of time and computing resources.
Document 2, "Bowen Baker, Otkrist Gupta1, Ramesh rake: additive Neural Architecture Search Performance prediction, international Conference on Learning recovery (2018)", predicts the final Performance of the network by using the time sequence information of the network training earlier stage, and introduces an "Early Stop" mechanism to terminate the training process of the network with poor effect in advance. Although the method has a certain acceleration effect on the network search algorithm, the method still needs to carry out partial training on the network, thereby limiting the acceleration effect on the structure search algorithm.
Disclosure of Invention
In order to overcome the defect of low searching efficiency of the conventional network structure searching method, the invention provides a deep neural network structure optimization method based on the fusion of a prediction mechanism and a genetic algorithm. The method randomly generates neural networks with different structures to carry out complete training, and trains a network performance prediction model by utilizing information in the network training process; in the network structure searching stage, firstly, coding expression is carried out on a deep network structure to form a network structure code, and then, the network structure code is randomly generated to be used as the initial generation of a genetic algorithm; then, carrying out selection, crossing, mutation and prediction processes on the individuals in the primary generation, and only carrying out actual training on the network corresponding to the individual with higher expected performance; finally, all individual performances are evaluated and the next round of selection is entered. And after the algorithm is finished, selecting the individual with the best fitness as the optimal network structure under the specific task. By predicting the network performance before the actual training of the network, the time cost for training the search algorithm on the low-price network can be reduced, and the search process of the search algorithm is greatly accelerated.
The technical scheme adopted by the invention for solving the technical problems is as follows: a deep neural network structure optimization method based on the fusion of a prediction mechanism and a genetic algorithm is characterized by comprising the following steps:
step one, data preprocessing:
firstly, defining image classification database X ═ X 1 ,x 2 ...x n } T ∈R n×b ,x n ∈R 1×b Representing the nth sample data; the class label vector is Y ═ Y 1 ,y 2 ...y n } T ∈R n×l ,y n ∈R 1×l Is a one-hot label of the nth sample data, where N ═ 1,2.. N }, N is the total number of samples, l represents the total number of classes of the samples, and b represents the spectral dimension; each sample in the image classification database X is then normalized to a range of 0-1, and N is randomly selected therefrom train Obtaining training data X by individual sample data and class labels thereof train And its corresponding category label Y train Wherein N is train < N. In addition, the rest data and labels in the data set are all classified into a test set, and the data and labels are respectively marked as X test And Y test 。
Step two, determining a coding rule of a network structure:
firstly, M different network structures are generated, wherein the structure code of the mth neural network is C m The code includes S stages, i.e.WhereinIs the coding segment of the s-th stage. The stage comprises K s Each node represents a mixed operation composed of convolution, batch normalization and ReLU activation, and is recorded asThe nodes with small numbers in the same stage are connected to the nodes with large numbers, and the connection mode between the nodes is usedBit binary encoding for representation. Wherein the 1 st bit is binary coded to represent (v) s,1 ,v s,2 ) The bit is 1 if there is connection, and is 0 if there is no connection; the next two bits represent three nodes (v) s,1 ,v s,3 ),(v s,2 ,v s,3 ) The situation of the connection between them. Setting S to 3, K 1 =3,K 2 =4,K 3 Network structure code length is 19 bits, i.e. 5
Step three, collecting training data of the network performance prediction model:
randomly generating m mutually different structural codes C 1 ,C 2 ,...,C m And after automatic compilation, the depth network corresponding to the code is completely trained on a specified data set. The Adam optimizer is used for training to learn the network parameters, and the training is iterated for T times. When the network is trained in a batch size, recording the iteration times t of the current network and the classification accuracy Ag on the verification set t And taking the data as data required by the prediction model training: data ═ C m ,t,Ag t ],t={1,2...T}。
Step four, constructing and training a network performance prediction model:
defining a network performance prediction model f, inputting a structure code C into the model and mapping mu, and measuring the accuracy rate Ap of the structure neural network on a test set after t times of iterative training by the model t Namely:
Ap t =f(μ(C m ),t) (2)
in the mapping phase, the model maps the structure code C into a network structure code group consisting of s structure codesWherein p is s First, theBit to bitThe value of each bit is equal to the value of the corresponding position of the original structure code, and the rest positions are filled with zero values, namely:
wherein p is s [idx]And C [ idx ]]Coding p for a structure s And the idx-th bit of C.
After the structure code is mapped, p is mapped 1 ,p 2 ...p s And sequentially inputting a single-layer long and short term memory network with the hidden layer size of 128 and finally obtaining the hidden state h of the long and short term memory network unit, wherein the hidden state h is called as a network structure characteristic. Meanwhile, inputting the iteration times t into a multilayer perceptron consisting of a full-link layer with the size of (1,64), a ReLU activation function layer, a full-link layer with the size of (64,32) and a full-link layer with the size of (32,1), and obtaining the contribution D of the iteration times to the final classification accuracy of the network t 。
Degree of contribution D t Element-by-element multiplication is carried out with the structural feature h of the network:
h[id]=D t ×h[id],id={1,2,...,len(h)} (4)
and inputting the calculation result into a small-sized full-connection module. It comprises a full-junction layer with the size of (128 ), a random inactivation layer with inactivation probability of 0.5, a ReLU kinaseAn activity function layer, a full link layer of size (128,32), a ReLU activation function layer, and a full link layer of size (32, 1). The output result of the full connection module is the predicted value Ap of the final classification accuracy of the current network t 。
Before training the performance prediction network, randomly initializing network parameters, and solving the following optimization problem by using a back propagation algorithm to learn the network parameters to obtain the optimal parameters theta of the network:
wherein | · | purple sweet 2 Is the norm of L2.
Step five, initializing a genetic algorithm:
setting parameters of genetic algorithm, including population individual number G N Number of iteration rounds G T Probability of mutation G M Cross probability G C Variation parameter q M Cross parameter q C And threshold fit mgn And randomly generating G N Coding of a structureAs initial population Ge 0 The initial generation population is marked as 0 th generation, and the ith individual in the population is marked asThen, the score of each individual in the population is evaluated to obtain the score of the individualRecording the current highest accuracy as fit max 。
Step six, selecting the individuals:
the selection operation is directed to each individual in the previous generation population. The method is Ge of the previous generation population j-1 ,j=1,2...G T According to the rules of Russian roulette, according to the individual scoresSelecting a new generation of Ge population j (ii) a The higher the individual score, the greater the probability of being selected and retained to the next generation.
Step seven, performing cross operation on the individuals:
interleaving operations for encoding each stage of an individual within a groupAccording to G between every two individuals in the population C Probability crossing, the operation of which is that the code string of three stages in two individuals is according to q C The exchange of probabilities occurs.
Step eight, performing mutation operation on individuals:
the mutation operation aims at each bit of the individual code, and the mutation is represented by that each binary digit on the individual code is according to the probability q M Inversion occurs, i.e., from 0 to 1 or from 1 to 0.
Step nine, predicting the performance of the network corresponding to the individual:
inputting the iteration times of the network structure coding and training ending into the network performance prediction model to obtain the expected score of each individual in the populationI.e. the expected classification accuracy after the network has been fully trained.
Step ten, evaluating the individual:
will score the expected scoreFit with the current best score max And (6) comparing. If it isThe algorithm will test the network on the test set after it has been fully trained, and take the actual performance on the test set as the actual score of the individualIf it isThen no actual training of the network is performed and only the lower expected performance is taken as the score for that individualAfter the evaluation is finished, the current best individual score fit is updated max And returning to the step six until the total iteration number is more than T. And obtaining the optimal network structure after the algorithm is finished.
The invention has the beneficial effects that: the method randomly generates neural networks with different structures to carry out complete training, and trains a network performance prediction model by utilizing information in the network training process; in the network structure searching stage, firstly, coding expression is carried out on a deep network structure to form a network structure code, and then, the network structure code is randomly generated to be used as the initial generation of a genetic algorithm; then, carrying out selection, crossing, mutation and prediction processes on the individuals in the primary generation, and only carrying out actual training on the network corresponding to the individual with higher expected performance; finally, all individual performances are evaluated and the next round of selection is entered. And after the algorithm is finished, selecting the individual with the best fitness as the optimal network structure under the specific task. By predicting the network performance before the actual training of the network, the time cost of training the search algorithm on the low-price network can be reduced, and the search process of the search algorithm is greatly accelerated.
Because the network performance prediction model is introduced into the deep neural network structure optimization method based on the genetic algorithm, the network performance can be predicted by the algorithm before the actual training of the network, and the actual training process of the network with poor expected performance is cancelled, so that the time consumption of the structure optimization algorithm is greatly reduced. Compared with the network structure searching algorithm based on the genetic algorithm in the background art, the method has the advantage that the searching speed is improved by 55% on the premise of keeping the searched network performance similar.
The present invention will be described in detail with reference to the following embodiments.
Detailed Description
The deep neural network structure optimization method based on the fusion of the prediction mechanism and the genetic algorithm specifically comprises the following steps:
1. and (4) preprocessing data.
Defining an image classification database X ═ { X ═ X 1 ,x 2 ...x n } T ∈R n×b The class label vector is Y ═ Y 1 ,y 2 ...y n } T ∈R n×l Wherein x is n ∈R 1×b Represents the nth sample data, y n ∈R 1×l Is a one-hot label of the nth sample data, where N ═ 1,2.. N }, N is the total number of samples, l represents the total number of classes of the samples, and b represents the spectral dimension; normalizing each sample in the hyperspectral image data X to be in the range of 0-1, and randomly selecting N from the samples train Obtaining training data X by individual sample data and class labels thereof train And its corresponding category label Y train Wherein N is train < N. In addition, the rest data and labels in the data set are all classified into a test set, and the data and labels are respectively marked as X test And Y test 。
2. And determining a deep network structure coding rule.
In order to optimize the deep network structure, the topological structure of the deep network structure needs to be represented by coding. The network is divided into a plurality of stages in the coding process, parameters (channel number, convolution kernel size and the like) of convolution operation in the same stage are kept unchanged, and different stages are connected through pooling operation. Each stage of the deep network comprises a plurality of nodes with ordered numbers, and each node represents a mixed operation of convolution, batch standardization and ReLU activation; the small-number nodes in the same stage can be connected to the large-number nodes, and the connection mode among the nodes represents the flowing condition of data in the network in the stage.
M different network structures are generated in the network structure optimization process, and the structure of the mth (M ═ {1,2.., M }) neural network is coded as C m The code includes S stages, i.e.WhereinIs the code segment of the S (S) {1,2., S }). The s-th stage in the coding comprising K s A node, is marked asTherefore, this stage needs to be usedA bit binary code (hereinafter, a bit binary code is referred to as a bit) represents a connection relationship between nodes. Wherein the 1 st bit represents (v) s,1 ,v s,2 ) The bit is 1 if there is connection, and is 0 if there is no connection; the next two bits represent three nodes (v) s,1 ,v s,3 ),(v s,2 ,v s,3 ) The situation of the connection between them. In the experiment, S is 3, K 1 =3,K 2 =4,K 3 The total length of the network structure code is 19 bits, that is:
where len () represents the length of the code (i.e., the number of bits in the binary code).
3. And collecting training data of the network performance prediction model.
Randomly generating m mutually different structural codes C 1 ,C 2 ,...,C m . After the codes are generated, the codes are automatically compiled into calculation graphs, and then the depth networks corresponding to the calculation graphs are carried out on the specified data setsAnd (4) complete training. The network parameters are learned by using an Adam optimizer, and the parameters of the optimizer are set to be a learning rate alpha of 0.001 and an exponential decay factor beta 1 =0.9,β 2 0.999. The training process is iterated for T times. Meanwhile, in the training process, every time the network is trained in a batch size, the iteration times t of the current network experience and the classification accuracy rate Ag on the verification set need to be recorded t After arrangement, data [ C ] needed by the prediction model training is obtained m ,t,Ag t ],t={1,2...T}。
4. And constructing and training a network performance prediction model.
Recording the network performance prediction model as f, the model firstly encodes the structure C m Mapping mu is performed, and then the mapping result mu (C) can be obtained m ) Predicting the accuracy rate Ap of the structural neural network on the test set after t times of iterative training t Namely:
Ap t =f(μ(C m ),t) (2)
the specific structure of the prediction model is as follows:
(a) structure code mapping
In the mapping phase, the model maps a single structure code C into a network structure code group consisting of s structure codesNoting the mapping process as μ, the mapping for encoding the structure can be expressed as:
for a structure-coded set:
wherein ps isBit to bitThe value of each bit is equal to the value of the corresponding position of the original structure code, and the rest positions are filled with zero values. The invention marks the value of idx position of p and C of the structural code as p [ idx ]]And C [ idx ]]Then the mapping can be expressed as:
(b) network performance prediction model f:
the structure code is mapped to obtain a structure code groupThen, p1, p2... ps are input into a single-layer long-short term memory network (LSTM) with hidden layer size of 128 in sequence, and finally a one-dimensional array h with length of 128 is obtained, which is called the network structure characteristic of the predicted network.
And inputting the iteration times t into the multilayer perceptron while obtaining the network structure characteristics h. The multi-layered perceptron consists of a fully-connected layer of size (1,64), a ReLU activation function layer, a fully-connected layer of size (64,32), and a fully-connected layer of size (32, 1). Outputting a scalar value by the multi-layer perception opportunity, thereby giving the contribution D of the iteration number to the final classification accuracy of the network t 。
Then the contribution degree D t Element-by-element multiplication is performed with the structural feature h of the network, and the operation can be expressed as:
h[id]=D t ×h[id],id={1,2,...,len(h)} (4)
and passing the operation result through a small-sized full-connection module. The full-link module is composed of a full-link module with the size of (128 ), a random deactivation layer with the deactivation probability of 0.5, a ReLU activation function layer, a full-link layer with the size of (128,32), a ReLU activation function layer and a full-link layer with the size of (32,1) which are sequentially connected. All-purposeThe output result of the connection module is the predicted value Ap of the final classification accuracy of the current network t 。
Before using a network performance prediction model to guide the network optimization process, random initialization needs to be performed on network parameters, and a back propagation algorithm is used to solve the following optimization problem for network training, so as to obtain the optimal parameter theta of the network:
wherein r is the number of samples contained in a single training batch, | · | | computationally 2 Is the norm of L2.
5. And initializing a genetic algorithm.
First, a parameter of the genetic algorithm, i.e. the number of population individuals G, is determined N Number of iteration rounds G T Probability of variation G M Cross probability G C Variation parameter q M Cross parameter q C And threshold fit mgn . Random generation of G N Coding of a structureAs 0 th generation initial population Ge 0 The ith individual (i.e., the ith structural code) in the population is notedAnd then, carrying out complete training on the deep network corresponding to each individual in the population, and after the test set tests, taking the classification accuracy of the network as the score of the individualRecording the current highest accuracy as fit max 。
6. And carrying out selection operation on the individual.
Then, the selection operation O needs to be carried out on the individuals in the population s . In the j-1 generation Ge population j-1 ,j=1,2...G T Selecting the j generation Ge population according to the rule of Russian roulette j (ii) a The selection is based on the score of each individual in the current populationBy using the russian roulette approach, individuals with higher scores have a greater probability of remaining in the next generation, and the process is iterated.
7. And performing cross operation on the individuals.
For individuals in the population, the probability is G C Parameter is q C The interleaving operation of (2); the interleaving process is directed to a segment of the code string for each stage in the individualBetween each two individuals in the population according to G C Probability is crossed, and the specific operation of the cross is that the code strings of three stages in two individuals are crossed according to q C The exchange of probabilities occurs.
8. Performing mutation operation on individuals.
For individuals without crossover, the probability is G M Is represented by each binary digit on the individual code string according to the probability q M Inversion occurs, i.e., from 0 to 1 or from 1 to 0. The mutation process is directed to the change of a single binary digit.
9. And predicting the performance of the network corresponding to the individual.
Inputting the iteration times when the network structure coding and training are finished into the network performance prediction model to obtain the expected score of each individual in the populationI.e. the expected classification accuracy after the network has been fully trained.
10. And performing evaluation operation on the individual.
Obtained in step 8After the expected score of the individual is obtained, the expected score is obtainedFit with the current best score max And (6) comparing. If it isThe expected performance of the individual is better, the algorithm can fully train the individual and then test the individual on the test set, and the actual performance on the test set is used as the actual score of the individual. If it isThis indicates that the expected performance of the individual is poor. For an individual with poor expected performance, the algorithm is not actually trained, and only the lower expected performance is taken as the score of the individualAfter the evaluation is finished, the current best individual score fit is updated max And returning to the step 6 until the total iteration number of the algorithm is more than G T Until now. After the algorithm is finished, the optimal network structure can be given.
The method has better acceleration effect on the optimization tasks of various image classification network structures. Taking the optimization process of the classification network structure on the Pa via University data set as an example, the traditional network structure optimization method based on the genetic algorithm needs 0.99 hour to provide the optimal deep network structure with the classification accuracy rate of 89.1%; the method can provide the optimal deep network structure with the classification accuracy rate of 88.6% only in 0.635 hour. Therefore, the deep neural network structure optimization method based on the fusion of the prediction mechanism and the genetic algorithm can greatly accelerate the structure optimization process, and the classification accuracy of the finally searched network optimal structure on the designated data set is almost the same as the result of the traditional network structure optimization method based on the genetic algorithm.
Claims (1)
1. A deep neural network structure optimization method based on the fusion of a prediction mechanism and a genetic algorithm is characterized by comprising the following steps:
step one, data preprocessing:
first, an image classification database X is defined as X 1 ,x 2 ...x n } T ∈R n×b ,x n ∈R 1×b Representing the nth sample data; the class label vector is Y ═ Y 1 ,y 2 ...y n } T ∈R n×l ,y n ∈R 1×l Is a one-hot tag of the nth sample data, where N ═ {1,2.. N }, N is the total number of samples, l denotes the total number of categories of the samples, and b denotes the spectral dimension; each sample in the image classification database X is then normalized to a range of 0-1, and N is randomly selected therefrom train Obtaining training data X by individual sample data and class labels thereof train And its corresponding category label Y train Wherein N is train < N; in addition, the rest data and labels in the data set are all classified into a test set, and the data and labels are respectively marked as X test And Y test ;
Step two, determining a coding rule of a network structure:
firstly, M different network structures are generated, wherein the structure code of the mth neural network is C m The code includes S stages, i.e.WhereinIs the coding segment of the s stage; the stage comprises K s Each node represents a mixed operation composed of convolution, batch normalization and ReLU activation, and is recorded asThe nodes with small numbers in the same stage are connected to the nodes with large numbers, and the connection mode between the nodes is usedBit binary coding for representation; wherein the 1 st bit is binary coded to represent (v) s,1 ,v s,2 ) The connection condition between the two is that if the connection exists, the bit is 1, and if the connection does not exist, the bit is 0; the next two bits represent three nodes (v) s,1 ,v s,3 ),(v s,2 ,v s,3 ) The connection condition between the two; setting S to 3, K 1 =3,K 2 =4,K 3 Network structure code length is 19 bits, i.e. 5
Where "len () represents the length of the structure code in parentheses;
step three, collecting training data of the network performance prediction model:
randomly generating m mutually different structural codes C 1 ,C 2 ,...,C m After automatic compilation, the depth network corresponding to the code is completely trained on a specified data set; training and learning network parameters by using an Adam optimizer, and training for T times in total; when the network is trained in a batch size, recording the iteration times t of the current network and the classification accuracy Ag on the verification set t And taking the data as data required by the prediction model training: data ═ C m ,t,Ag t ],t={1,2...T};
Step four, constructing and training a network performance prediction model:
defining a network performance prediction model f, inputting a structure code C into the model and mapping mu, and measuring the accuracy rate Ap of the neural network of the structure on a test set after t times of iterative training by the model t Namely:
Ap t =f(μ(C m ),t) (2)
in the mapping phase, the model maps the structure code C into a network structure code group consisting of s structure codesWherein p is s First, theBit to bitThe value of each bit is equal to the value of the corresponding position of the original structure code, and the rest positions are filled with zero values, namely:
wherein p is s [idx]And C [ idx ]]Coding p for a structure s And the value at idx-th bit of C;
after the structure code is mapped, p is mapped 1 ,p 2 ...p s Sequentially inputting a single-layer long and short term memory network with a hidden layer size of 128 and finally obtaining a hidden state h of a long and short term memory network unit, wherein the hidden state h is called a network structure characteristic; meanwhile, inputting the iteration times t into a multilayer perceptron consisting of a full-link layer with the size of (1,64), a ReLU activation function layer, a full-link layer with the size of (64,32) and a full-link layer with the size of (32,1), and obtaining the contribution D of the iteration times to the final classification accuracy of the network t ;
Degree of contribution D t Element-by-element multiplication is carried out with the structural feature h of the network:
h[id]=D t ×h[id],id={1,2,...,len(h)} (4)
inputting the calculation result into a small-sized full-connection module; it contains a full-link layer of size (128 ), a random deactivation layer with deactivation probability of 0.5, a ReLU activation function layer, a full-link layer of size (128,32), a ReLU activation function layer and a full-link layer of size (32, 1); the output result of the full connection module is the predicted value Ap of the final classification accuracy of the current network t ;
Before training the performance prediction network, randomly initializing network parameters, and solving the following optimization problem by using a back propagation algorithm to learn the network parameters to obtain the optimal parameters theta of the network:
wherein | · | purple sweet 2 Is the norm of L2;
step five, initializing a genetic algorithm:
setting parameters of genetic algorithm, including population individual number G N Number of iteration rounds G T Probability of mutation G M Cross probability G C Coding parameter q M Cross parameter q C And threshold fit mgn And randomly generating G N Coding of a structureAs initial population Ge 0 The initial generation population is marked as 0 th generation, and the ith individual in the population is marked asThen, the score of each individual in the population is evaluated to obtain the score of the individualRecording the current highest accuracy as fit max ;
Step six, selecting the individuals:
selecting operation is directed to each individual in the previous generation population; the method is Ge of the previous generation population j-1 ,j=1,2...G T According to the rules of Russian roulette, according to the individual scoresSelecting a new generation of Ge population j (ii) a The higher the individual score is, the greater the probability of being selected and retained to the next generation;
step seven, performing cross operation on the individuals:
interleaving encodes for each stage of an individual within a populationBetween each two individuals in the population according to G C Probability crossing, the operation of which is that the code string of three stages in two individuals is according to q C Exchanging probability;
step eight, carrying out mutation operation on individuals
The mutation operation aims at each bit of the individual code, and the mutation is represented by that each binary digit on the individual code is according to the probability q M Inversion occurs, i.e., from 0 to 1 or from 1 to 0;
step nine, predicting the performance of the network corresponding to the individual:
inputting the iteration times of the network structure coding and training ending into the network performance prediction model to obtain the expected score of each individual in the populationNamely the expected classification precision after the network is fully trained;
step ten, evaluating the individual:
will score the expectationFit with the current best score max Comparing; if it isThe algorithm will fully train the network and then test it on the test set, and take the actual performance on the test set as the actual score of the individualIf it isThen no actual training of the network is performed and only the lower expected performance is taken as the score for that individualAfter the evaluation is finished, the current best individual score fit is updated max And returning to the step six until the total iteration times are more than T; and obtaining the optimal network structure after the algorithm is finished.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201910696239.XA CN110490320B (en) | 2019-07-30 | 2019-07-30 | Deep neural network structure optimization method based on fusion of prediction mechanism and genetic algorithm |
Applications Claiming Priority (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201910696239.XA CN110490320B (en) | 2019-07-30 | 2019-07-30 | Deep neural network structure optimization method based on fusion of prediction mechanism and genetic algorithm |
Publications (2)
Publication Number | Publication Date |
---|---|
CN110490320A CN110490320A (en) | 2019-11-22 |
CN110490320B true CN110490320B (en) | 2022-08-23 |
Family
ID=68548791
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN201910696239.XA Active CN110490320B (en) | 2019-07-30 | 2019-07-30 | Deep neural network structure optimization method based on fusion of prediction mechanism and genetic algorithm |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN110490320B (en) |
Families Citing this family (7)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN111415009B (en) * | 2020-03-19 | 2021-02-09 | 四川大学 | Convolutional variational self-encoder network structure searching method based on genetic algorithm |
CN112084877B (en) * | 2020-08-13 | 2023-08-18 | 西安理工大学 | NSGA-NET-based remote sensing image recognition method |
CN112001485B (en) * | 2020-08-24 | 2024-04-09 | 平安科技(深圳)有限公司 | Group convolution number searching method and device |
CN112183749B (en) * | 2020-10-26 | 2023-04-18 | 天津大学 | Deep learning library test method based on directed model variation |
CN114842328B (en) * | 2022-03-22 | 2024-03-22 | 西北工业大学 | Hyperspectral change detection method based on collaborative analysis autonomous perception network structure |
CN114943866B (en) * | 2022-06-17 | 2024-04-02 | 之江实验室 | Image classification method based on evolutionary neural network structure search |
CN115994575B (en) * | 2023-03-22 | 2023-06-02 | 方心科技股份有限公司 | Power failure diagnosis neural network architecture design method and system |
Citations (9)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN102915445A (en) * | 2012-09-17 | 2013-02-06 | 杭州电子科技大学 | Method for classifying hyperspectral remote sensing images of improved neural network |
CN103971162A (en) * | 2014-04-04 | 2014-08-06 | 华南理工大学 | Method for improving BP (back propagation) neutral network and based on genetic algorithm |
CN105303252A (en) * | 2015-10-12 | 2016-02-03 | 国家计算机网络与信息安全管理中心 | Multi-stage nerve network model training method based on genetic algorithm |
CN106503802A (en) * | 2016-10-20 | 2017-03-15 | 上海电机学院 | A kind of method of utilization genetic algorithm optimization BP neural network system |
US9785886B1 (en) * | 2017-04-17 | 2017-10-10 | SparkCognition, Inc. | Cooperative execution of a genetic algorithm with an efficient training algorithm for data-driven model creation |
CN108021983A (en) * | 2016-10-28 | 2018-05-11 | 谷歌有限责任公司 | Neural framework search |
CN108229657A (en) * | 2017-12-25 | 2018-06-29 | 杭州健培科技有限公司 | A kind of deep neural network training and optimization algorithm based on evolution algorithmic |
CN109243172A (en) * | 2018-07-25 | 2019-01-18 | 华南理工大学 | Traffic flow forecasting method based on genetic algorithm optimization LSTM neural network |
CN110020667A (en) * | 2019-02-21 | 2019-07-16 | 广州视源电子科技股份有限公司 | Searching method, system, storage medium and the equipment of neural network structure |
-
2019
- 2019-07-30 CN CN201910696239.XA patent/CN110490320B/en active Active
Patent Citations (9)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN102915445A (en) * | 2012-09-17 | 2013-02-06 | 杭州电子科技大学 | Method for classifying hyperspectral remote sensing images of improved neural network |
CN103971162A (en) * | 2014-04-04 | 2014-08-06 | 华南理工大学 | Method for improving BP (back propagation) neutral network and based on genetic algorithm |
CN105303252A (en) * | 2015-10-12 | 2016-02-03 | 国家计算机网络与信息安全管理中心 | Multi-stage nerve network model training method based on genetic algorithm |
CN106503802A (en) * | 2016-10-20 | 2017-03-15 | 上海电机学院 | A kind of method of utilization genetic algorithm optimization BP neural network system |
CN108021983A (en) * | 2016-10-28 | 2018-05-11 | 谷歌有限责任公司 | Neural framework search |
US9785886B1 (en) * | 2017-04-17 | 2017-10-10 | SparkCognition, Inc. | Cooperative execution of a genetic algorithm with an efficient training algorithm for data-driven model creation |
CN108229657A (en) * | 2017-12-25 | 2018-06-29 | 杭州健培科技有限公司 | A kind of deep neural network training and optimization algorithm based on evolution algorithmic |
CN109243172A (en) * | 2018-07-25 | 2019-01-18 | 华南理工大学 | Traffic flow forecasting method based on genetic algorithm optimization LSTM neural network |
CN110020667A (en) * | 2019-02-21 | 2019-07-16 | 广州视源电子科技股份有限公司 | Searching method, system, storage medium and the equipment of neural network structure |
Non-Patent Citations (6)
Title |
---|
ACCELERATING NEURAL ARCHITECTURE SEARCH USING PERFORMANCE PREDICTION;Bowen Baker 等;《ICLR 2018》;20181231;1-19 * |
Genetic CNN;Lingxi Xie 等;《2017 IEEE International Conference on Computer Vision》;20171231;1388-1397 * |
Hyperspectral Image Classification Based on Convolutional Neural Networks With Adaptive Network Structure;Chen Ding 等;《2018 international conference on orange technologies》;20190506;1-5 * |
NSGA-Net: Neural Architecture Search using Multi-Objective Genetic Algorithm;Zhichao Lu 等;《arXiv》;20190418;1-13 * |
动态贝叶斯网络结构搜索法辨识生物神经网络连接;陈晓艳 等;《生命科学研究》;20171231;第21卷(第6期);527-533 * |
遥感影像要素提取的可变结构卷积神经网络方法;王华斌 等;《测绘学报》;20190531;第48卷(第5期);583-596 * |
Also Published As
Publication number | Publication date |
---|---|
CN110490320A (en) | 2019-11-22 |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
CN110490320B (en) | Deep neural network structure optimization method based on fusion of prediction mechanism and genetic algorithm | |
CN108984724B (en) | Method for improving emotion classification accuracy of specific attributes by using high-dimensional representation | |
WO2022083624A1 (en) | Model acquisition method, and device | |
WO2023024412A1 (en) | Visual question answering method and apparatus based on deep learning model, and medium and device | |
US11087086B2 (en) | Named-entity recognition through sequence of classification using a deep learning neural network | |
CN109753571B (en) | Scene map low-dimensional space embedding method based on secondary theme space projection | |
CN111898689A (en) | Image classification method based on neural network architecture search | |
CN112465120A (en) | Fast attention neural network architecture searching method based on evolution method | |
CN111882042B (en) | Neural network architecture automatic search method, system and medium for liquid state machine | |
Tirumala | Evolving deep neural networks using coevolutionary algorithms with multi-population strategy | |
CN114528835A (en) | Semi-supervised specialized term extraction method, medium and equipment based on interval discrimination | |
CN114625882B (en) | Network construction method for improving unique diversity of image text description | |
CN113239897A (en) | Human body action evaluation method based on space-time feature combination regression | |
CN112084877A (en) | NSGA-NET-based remote sensing image identification method | |
Jastrzebska et al. | Fuzzy cognitive map-driven comprehensive time-series classification | |
CN112651499A (en) | Structural model pruning method based on ant colony optimization algorithm and interlayer information | |
CN111461229A (en) | Deep neural network optimization and image classification method based on target transfer and line search | |
CN116167353A (en) | Text semantic similarity measurement method based on twin long-term memory network | |
CN115422945A (en) | Rumor detection method and system integrating emotion mining | |
CN116208399A (en) | Network malicious behavior detection method and device based on metagraph | |
CN114863508A (en) | Expression recognition model generation method, medium and device of adaptive attention mechanism | |
CN115063374A (en) | Model training method, face image quality scoring method, electronic device and storage medium | |
CN112416358B (en) | Intelligent contract code defect detection method based on structured word embedded network | |
CN111259860A (en) | Multi-order characteristic dynamic fusion sign language translation method based on data self-driving | |
Qu et al. | Two-stage coevolution method for deep CNN: A case study in smart manufacturing |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | ||
PB01 | Publication | ||
SE01 | Entry into force of request for substantive examination | ||
SE01 | Entry into force of request for substantive examination | ||
GR01 | Patent grant | ||
GR01 | Patent grant |