CN110490320A - Deep neural network structure optimization method based on fusion of prediction mechanism and genetic algorithm - Google Patents

Deep neural network structure optimization method based on fusion of prediction mechanism and genetic algorithm

Info

Publication number
CN110490320A
CN110490320A (application CN201910696239.XA)
Authority
CN
China
Prior art keywords
network
individual
coding
data
population
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Granted
Application number
CN201910696239.XA
Other languages
Chinese (zh)
Other versions
CN110490320B (en)
Inventor
魏巍
徐松正
李威
王聪
张艳宁
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Northwestern Polytechnical University
Original Assignee
Northwestern Polytechnical University
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Northwestern Polytechnical University
Priority to CN201910696239.XA
Publication of CN110490320A
Application granted
Publication of CN110490320B
Legal status: Active
Anticipated expiration


Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00 Pattern recognition
    • G06F18/20 Analysing
    • G06F18/24 Classification techniques
    • G06F18/241 Classification techniques relating to the classification model, e.g. parametric or non-parametric approaches
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00 Computing arrangements based on biological models
    • G06N3/02 Neural networks
    • G06N3/08 Learning methods
    • G06N3/084 Backpropagation, e.g. using gradient descent
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00 Computing arrangements based on biological models
    • G06N3/12 Computing arrangements based on biological models using genetic models
    • G06N3/126 Evolutionary algorithms, e.g. genetic algorithms or genetic programming

Landscapes

  • Engineering & Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • Theoretical Computer Science (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Data Mining & Analysis (AREA)
  • Health & Medical Sciences (AREA)
  • Biophysics (AREA)
  • General Engineering & Computer Science (AREA)
  • Evolutionary Computation (AREA)
  • Bioinformatics & Cheminformatics (AREA)
  • Artificial Intelligence (AREA)
  • General Physics & Mathematics (AREA)
  • Bioinformatics & Computational Biology (AREA)
  • Evolutionary Biology (AREA)
  • Molecular Biology (AREA)
  • General Health & Medical Sciences (AREA)
  • Computing Systems (AREA)
  • Computational Linguistics (AREA)
  • Biomedical Technology (AREA)
  • Mathematical Physics (AREA)
  • Software Systems (AREA)
  • Physiology (AREA)
  • Genetics & Genomics (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Image Analysis (AREA)
  • Management, Administration, Business Operations System, And Electronic Commerce (AREA)

Abstract

The invention discloses a deep neural network structure optimization method based on the fusion of a prediction mechanism and a genetic algorithm, addressing the low search efficiency of existing network structure search methods. The technical solution is to first encode the deep network structure to form a structure coding, then randomly generate structure codings as the initial population of a genetic algorithm. The individuals in the population then undergo selection, crossover, mutation, and prediction, and only the networks corresponding to individuals with higher predicted performance are actually trained. Finally, the performance of all individuals is evaluated and the next round of selection begins. When the algorithm terminates, the individual with the best fitness is selected as the optimal network structure for the given task. By predicting network performance before actual training, the time the search algorithm spends training low-value networks is reduced, greatly accelerating the search process.

Description

Deep neural network structure optimization method based on fusion of prediction mechanism and genetic algorithm
Technical field
The present invention relates to network structure search methods, and more particularly to a deep neural network structure optimization method based on the fusion of a prediction mechanism and a genetic algorithm.
Background technique
" Lingxi Xie, the Alan Yuille:Genetic CNN.Computer Vision and Pattern of document 1 Recognition (2017) " proposes a kind of network structure searching method based on genetic algorithm, this method introduce Darwin into Change and discuss thought, regard network structure as individual in population, network is constantly updated by selection, intersection, variation and evaluation process Structure.However, the network structure searching method before evaluating network performance, needs completely to train network, This process consumes plenty of time and computing resource.
" Bowen Baker, Otkrist Gupta1, the Ramesh Raskar:Accelerating Neural of document 2 Architecture Search using Performance Prediction.International Conference on Learning Representations (2018) " utilizes the time serial message of network training early period to the final performance of network It is predicted, and introduces " Early Stop " mechanism, terminate the training process of the poor network of effect in advance.This method is although right Searching algorithm has certain acceleration, but this method still needs to carry out network part training, to limit To the acceleration effect of search structure algorithm.
Summary of the invention
To overcome the low search efficiency of existing network structure search methods, the present invention provides a deep neural network structure optimization method based on the fusion of a prediction mechanism and a genetic algorithm. The method randomly generates neural networks of various structures and fully trains them, using information from their training processes to train a network performance prediction model. In the network structure search phase, the deep network structure is first encoded to form a structure coding; structure codings are then generated at random as the initial population of the genetic algorithm. The individuals in the population undergo selection, crossover, mutation, and prediction, and only the networks corresponding to individuals with higher predicted performance are actually trained. Finally, the performance of all individuals is evaluated and the next round of selection begins. When the algorithm terminates, the individual with the best fitness is selected as the optimal network structure for the given task. By predicting network performance before actual training, the time the search algorithm spends training low-value networks is reduced, greatly accelerating the search process.
The technical solution adopted by the present invention to solve the technical problem is a deep neural network structure optimization method based on the fusion of a prediction mechanism and a genetic algorithm, characterized by comprising the following steps:
Step 1: data preprocessing:
First define the image classification database $X = [x_1, x_2, \ldots, x_N]^T \in \mathbb{R}^{N \times b}$, where $x_n \in \mathbb{R}^{1 \times b}$ denotes the $n$-th sample; its class label matrix is $Y = [y_1, y_2, \ldots, y_N]^T \in \mathbb{R}^{N \times L}$, where $y_n \in \mathbb{R}^{1 \times L}$ is the one-hot label of the $n$-th sample, $n = 1, 2, \ldots, N$, $N$ is the total number of samples, $L$ is the number of classes, and $b$ is the spectral dimension. Each sample in $X$ is then normalized to the range $[0, 1]$, and $N_{train}$ samples and their class labels are randomly selected to obtain the training data $X_{train}$ and corresponding class labels $Y_{train}$, where $N_{train} < N$. In addition, all remaining data and labels in the dataset are assigned to the test set, denoted $X_{test}$ and $Y_{test}$.
Step 2: determining the coding rule of the network structure:
First generate $M$ different network structures, and denote the structure coding of the $m$-th neural network by $C_m$. The coding contains $S$ stages, i.e. $C_m = \{C_m^1, C_m^2, \ldots, C_m^S\}$, where $C_m^s$ is the coding segment of the $s$-th stage. Stage $s$ contains $K_s$ nodes, each representing a hybrid operation composed of convolution + batch normalization + ReLU activation and denoted $v_{s,1}, v_{s,2}, \ldots, v_{s,K_s}$. Within a stage, lower-numbered nodes connect to higher-numbered nodes, and the connections between nodes are represented by a $\frac{1}{2}K_s(K_s-1)$-bit binary code. The 1st bit represents the connection between $(v_{s,1}, v_{s,2})$: the bit is 1 if the connection exists and 0 if not; the next two bits represent the connections to the third node, i.e. $(v_{s,1}, v_{s,3})$ and $(v_{s,2}, v_{s,3})$; and so on. Setting $S = 3$, $K_1 = 3$, $K_2 = 4$, $K_3 = 5$, the total length of the structure coding is 19, i.e.:

$len(C_m) = \sum_{s=1}^{S} \frac{1}{2} K_s (K_s - 1) = 3 + 6 + 10 = 19 \quad (1)$
Step 3: collecting the training data of the network performance prediction model:
Randomly generate $M$ mutually distinct structure codings $C_1, C_2, \ldots, C_M$; after automatic compilation, the deep network corresponding to each coding is fully trained on the specified dataset. Training uses the Adam optimizer to learn the network parameters and runs for $T$ iterations in total. Each time the network completes training on one batch, the iteration count $t$ it has undergone and its classification accuracy $Ag_t$ on the validation set are recorded, yielding the data required for training the prediction model: $data = \{C_m, t, Ag_t\}$, $t = \{1, 2, \ldots, T\}$.
Step 4: construction and training of the network performance prediction model:
Define the network performance prediction model $f$: after the input structure coding $C_m$ is given and the mapping $\mu$ is applied to it, the model predicts the accuracy $Ap_t$ of the corresponding network on the test set after $t$ training iterations, that is:

$Ap_t = f(\mu(C_m), t) \quad (2)$
In the mapping phase, the model maps the structure coding $C$ into a set of $S$ structure codings $\mu(C) = \{p_1, p_2, \ldots, p_S\}$. In $p_s$, the bits from position $\sum_{i=1}^{s-1} len(C^i) + 1$ through position $\sum_{i=1}^{s} len(C^i)$ take the values of the corresponding positions of the original coding, and all remaining positions are filled with zeros, that is:

$p_s[idx] = \begin{cases} C[idx], & \sum_{i=1}^{s-1} len(C^i) < idx \le \sum_{i=1}^{s} len(C^i) \\ 0, & \text{otherwise} \end{cases} \quad (3)$

where $p[idx]$ and $C[idx]$ denote the value of the $idx$-th bit of the structure codings $p$ and $C$.
After the structure coding has been mapped, $p_1, p_2, \ldots, p_S$ are fed in sequence into a single-layer long short-term memory (LSTM) network with hidden size 128, finally yielding the hidden state $h$ of the LSTM unit, referred to as the network structure feature. Meanwhile, the iteration count $t$ is fed into a multi-layer perceptron composed of a fully connected layer of size (1, 64), a ReLU activation layer, a fully connected layer of size (64, 32), and a fully connected layer of size (32, 1), producing the contribution degree $D_t$ of the iteration count to the final classification accuracy of the network.
The contribution degree $D_t$ is multiplied element-wise with the structure feature $h$ of the network:

$h[id] = D_t \times h[id], \quad id = \{1, 2, \ldots, len(h)\} \quad (4)$
The result is fed into a small fully connected block consisting of a fully connected layer of size (128, 128), a random dropout layer with drop probability 0.5, a ReLU activation layer, a fully connected layer of size (128, 32), a ReLU activation layer, and a fully connected layer of size (32, 1). The output of the fully connected block is the predicted value $Ap_t$ of the final classification accuracy of the current network.
Before the performance prediction network is trained, its parameters are randomly initialized, and backpropagation is used to solve the following optimization problem to learn the network parameters, yielding the optimal parameters $\theta$:

$\theta = \arg\min_\theta \frac{1}{r} \sum_{i=1}^{r} \left\| Ap_t^{(i)} - Ag_t^{(i)} \right\|_2^2 \quad (5)$

where the sum runs over a training batch of $r$ samples and $\|\cdot\|_2$ is the L2 norm.
Step 5: initializing the genetic algorithm:
Set the parameters of the genetic algorithm, including the number of individuals in the population $G_N$, the number of iteration rounds $G_T$, the mutation probability $G_M$, the crossover probability $G_C$, the mutation parameter $q_M$, the crossover parameter $q_C$, and the threshold $A_{mgn}$, and randomly generate $G_N$ structure codings as the initial population $Ge_0$, recorded as generation 0, with the $i$-th individual in the population denoted $Ge_0^i$. The score of each individual in the population is then assessed to obtain its score $fit^i$, and the current highest accuracy is recorded as $fit_{max}$.
Step 6: performing a selection operation on individuals:
The selection operation is applied to each individual in the previous generation's population: from the population $Ge_{j-1}$, $j = 1, 2, \ldots, G_T$, a new generation $Ge_j$ is selected by roulette-wheel selection according to the individuals' scores $fit^i$; the higher an individual's score, the greater its probability of being chosen and surviving into the next generation.
Step 7: performing a crossover operation on individuals:
The crossover operation acts on the coding segments of each stage of the individuals: every pair of individuals in the population crosses with probability $G_C$, and the crossover exchanges each of the three stage segments between the two individuals with probability $q_C$.
Step 8: performing a mutation operation on individuals:
The mutation operation acts on each bit of an individual's coding: each binary digit of the coding is inverted with probability $q_M$, i.e. changed from 0 to 1 or from 1 to 0.
Step 9: predicting the performance of the networks corresponding to individuals:
The structure codings, together with the iteration count at the end of training, are input into the network performance prediction model to obtain the expected score $fit_{pre}^i$ of each individual in the population, i.e. the expected classification accuracy of the network after full training.
Step 10: performing an evaluation operation on individuals:
The expected score $fit_{pre}^i$ is compared with the current best score $fit_{max}$. If $fit_{pre}^i \ge fit_{max} - A_{mgn}$, the algorithm fully trains the network, tests it on the test set, and takes the actual performance on the test set as the individual's actual score $fit^i$. If $fit_{pre}^i < fit_{max} - A_{mgn}$, the network is not actually trained, and the lower predicted performance alone is taken as the individual's score $fit^i$. After the assessment, the current best individual score $fit_{max}$ is updated, and the process returns to step 6 until the total number of iterations exceeds $G_T$. The optimal network structure is obtained when the algorithm terminates.
The beneficial effects of the present invention are as follows. The method randomly generates neural networks of various structures and fully trains them, using information from their training processes to train a network performance prediction model. In the network structure search phase, the deep network structure is first encoded to form a structure coding; structure codings are then generated at random as the initial population of the genetic algorithm. The individuals in the population undergo selection, crossover, mutation, and prediction, and only the networks corresponding to individuals with higher predicted performance are actually trained. Finally, the performance of all individuals is evaluated and the next round of selection begins. When the algorithm terminates, the individual with the best fitness is selected as the optimal network structure for the given task. By predicting network performance before actual training, the time the search algorithm spends training low-value networks is reduced, greatly accelerating the search process.
Because a network performance prediction model is introduced into the genetic-algorithm-based deep neural network structure optimization method, the algorithm predicts each network's performance before actually training it and cancels the actual training of networks with poor predicted performance, thereby greatly reducing the time consumed by the structure optimization algorithm. Compared with the genetic-algorithm-based network structure search algorithm of the background art, this method improves search speed by 55% while the performance of the networks found remains similar.
The present invention is elaborated below with reference to specific embodiments.
Specific embodiment
The specific steps of the deep neural network structure optimization method based on the fusion of a prediction mechanism and a genetic algorithm according to the present invention are as follows:
1. Data preprocessing.
Define the image classification database $X = [x_1, x_2, \ldots, x_N]^T \in \mathbb{R}^{N \times b}$ with class label matrix $Y = [y_1, y_2, \ldots, y_N]^T \in \mathbb{R}^{N \times L}$, where $x_n \in \mathbb{R}^{1 \times b}$ denotes the $n$-th sample and $y_n \in \mathbb{R}^{1 \times L}$ is its one-hot label, $n = 1, 2, \ldots, N$; $N$ is the total number of samples, $L$ the number of classes, and $b$ the spectral dimension. After normalizing each sample of the hyperspectral image data $X$ to the range $[0, 1]$, $N_{train}$ samples and their class labels are randomly selected to obtain the training data $X_{train}$ and corresponding class labels $Y_{train}$, where $N_{train} < N$. In addition, all remaining data and labels in the dataset are assigned to the test set, denoted $X_{test}$ and $Y_{test}$.
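As an illustration, a minimal Python sketch of this preprocessing step might look as follows; per-sample min-max scaling is an assumption (the patent does not specify the exact normalization scheme), and the function name `preprocess` is ours:

```python
import numpy as np

def preprocess(X, Y, n_train, seed=0):
    """Normalize each sample to [0, 1] and split into train/test sets."""
    lo = X.min(axis=1, keepdims=True)
    hi = X.max(axis=1, keepdims=True)
    X = (X - lo) / (hi - lo + 1e-12)            # per-sample scaling to [0, 1]
    idx = np.random.default_rng(seed).permutation(len(X))
    tr, te = idx[:n_train], idx[n_train:]       # N_train random training samples
    return X[tr], Y[tr], X[te], Y[te]           # X_train, Y_train, X_test, Y_test
```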
2. Determining the coding rule of the deep network structure.
In order to optimize the deep network structure, its topology must be represented by a coding. The encoding process divides the network into multiple stages; within a stage, the parameters of the convolution operations (number of channels, kernel size, etc.) are kept constant, and different stages are connected by pooling operations. Each stage of the deep network contains several ordered, numbered nodes, each representing a "convolution + batch normalization + ReLU activation" hybrid operation. Within a stage, lower-numbered nodes may connect to higher-numbered nodes, and the connection pattern between nodes describes how data flows through the network at that stage.
During structure optimization, $M$ different network structures are generated. Denote the structure coding of the $m$-th ($m = \{1, 2, \ldots, M\}$) neural network by $C_m$; the coding contains $S$ stages, i.e. $C_m = \{C_m^1, C_m^2, \ldots, C_m^S\}$, where $C_m^s$ is the coding segment of the $s$-th ($s = 1, 2, \ldots, S$) stage. Stage $s$ contains $K_s$ nodes, denoted $v_{s,1}, v_{s,2}, \ldots, v_{s,K_s}$; the stage therefore needs a $\frac{1}{2}K_s(K_s-1)$-position binary code (each position hereafter called a bit) to represent the connection relationships among its nodes. The 1st bit represents the connection between $(v_{s,1}, v_{s,2})$: it is 1 if the connection exists and 0 if not; the next two bits represent the connections to the third node, i.e. $(v_{s,1}, v_{s,3})$ and $(v_{s,2}, v_{s,3})$; and so on. In the experiments, $S = 3$, $K_1 = 3$, $K_2 = 4$, $K_3 = 5$, and the total length of the structure coding is 19, that is:

$len(C_m) = \sum_{s=1}^{S} \frac{1}{2} K_s (K_s - 1) = 3 + 6 + 10 = 19 \quad (1)$

where $len(\cdot)$ denotes the length (i.e. the number of binary digits) of a coding.
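A minimal sketch of this encoding, assuming each coding is stored as a flat list of bits (the helper names are illustrative, not from the patent):

```python
import random

STAGE_NODES = (3, 4, 5)        # K_1, K_2, K_3

def stage_code_len(k):
    """Connection bits needed by a stage with k nodes: k*(k-1)/2."""
    return k * (k - 1) // 2

def random_structure_coding(stage_nodes=STAGE_NODES):
    """Sample one structure coding C_m as a flat list of 0/1 bits."""
    total = sum(stage_code_len(k) for k in stage_nodes)
    return [random.randint(0, 1) for _ in range(total)]

coding = random_structure_coding()
assert len(coding) == 19       # matches equation (1)
```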
3. Collecting the training data of the network performance prediction model.
Randomly generate $M$ mutually distinct structure codings $C_1, C_2, \ldots, C_M$. Once generated, the codings are automatically compiled into computation graphs, and the deep networks corresponding to these graphs are fully trained on the specified dataset. Training uses the Adam optimizer to learn the network parameters, with the optimizer parameters set to learning rate $\alpha = 0.001$ and exponential decay factors $\beta_1 = 0.9$, $\beta_2 = 0.999$. Training runs for $T$ iterations in total. During training, each time the network completes one batch, the iteration count $t$ it has undergone and its classification accuracy $Ag_t$ on the validation set are recorded; after collation this yields the data required for training the prediction model, $data = \{C_m, t, Ag_t\}$, $t = \{1, 2, \ldots, T\}$.
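The data collection could be sketched in Python (PyTorch) roughly as below; `build_network` and `validation_accuracy` are hypothetical helpers standing in for the automatic compilation of a coding into a computation graph and for evaluation on the validation set:

```python
import torch

def collect_prediction_data(codings, train_loader, val_loader, T):
    """Record (C_m, t, Ag_t) triples while fully training each candidate."""
    records = []
    for c in codings:
        net = build_network(c)          # hypothetical: coding -> network
        opt = torch.optim.Adam(net.parameters(), lr=1e-3, betas=(0.9, 0.999))
        loss_fn = torch.nn.CrossEntropyLoss()
        t = 0
        while t < T:
            for x, y in train_loader:
                opt.zero_grad()
                loss_fn(net(x), y).backward()
                opt.step()
                t += 1                  # one batch counts as one iteration
                records.append((c, t, validation_accuracy(net, val_loader)))
                if t == T:
                    break
    return records
```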
4. Construction and training of the network performance prediction model.
Denote the network performance prediction model by $f$. The model first applies a mapping $\mu$ to the structure coding $C_m$ and then, from the mapping result $\mu(C_m)$, predicts the accuracy $Ap_t$ of the corresponding network on the test set after $t$ training iterations, that is:

$Ap_t = f(\mu(C_m), t) \quad (2)$
The specific structure of the prediction model is as follows:
(a) Structure coding mapping
In the mapping phase, the model maps a single structure coding $C$ into a set of $S$ structure codings. Denoting the mapping by $\mu$, the mapping of a structure coding may be expressed as:

$\mu(C) = \{p_1, p_2, \ldots, p_S\}$

Within this structure coding group, the bits of $p_s$ from position $\sum_{i=1}^{s-1} len(C^i) + 1$ through position $\sum_{i=1}^{s} len(C^i)$ take the values of the corresponding positions of the original coding, and the remaining positions are filled with zeros. Writing the values of the $idx$-th bits of codings $p$ and $C$ as $p[idx]$ and $C[idx]$, the mapping may be expressed as:

$p_s[idx] = \begin{cases} C[idx], & \sum_{i=1}^{s-1} len(C^i) < idx \le \sum_{i=1}^{s} len(C^i) \\ 0, & \text{otherwise} \end{cases} \quad (3)$
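A minimal sketch of the mapping $\mu$ under the flat-bit-list representation used in the earlier sketches (the function name is ours):

```python
def map_coding(coding, stage_nodes=(3, 4, 5)):
    """Map a coding C into {p_1, ..., p_S}: each p_s keeps only its own
    stage segment of C and zero-fills every other position, per (3)."""
    ps, start = [], 0
    for k in stage_nodes:
        seg = k * (k - 1) // 2
        p = [0] * len(coding)
        p[start:start + seg] = coding[start:start + seg]
        ps.append(p)
        start += seg
    return ps
```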
(b) network performance prediction model f:
After the structure coding mapping yields the coding group $\{p_1, p_2, \ldots, p_S\}$, the codings $p_1, p_2, \ldots, p_S$ are fed in sequence into a single-layer long short-term memory network (LSTM) with hidden size 128, finally producing a one-dimensional array $h$ of length 128, which we call the network structure feature of the network being predicted.
While the network structure feature $h$ is being computed, the iteration count $t$ is fed into a multi-layer perceptron composed of a fully connected layer of size (1, 64), a ReLU activation layer, a fully connected layer of size (64, 32), and a fully connected layer of size (32, 1). The multi-layer perceptron outputs a scalar value giving the contribution degree $D_t$ of the iteration count to the final classification accuracy of the network.
The contribution degree $D_t$ is then multiplied element-wise with the structure feature $h$ of the network, which may be expressed as:

$h[id] = D_t \times h[id], \quad id = \{1, 2, \ldots, len(h)\} \quad (4)$
The result is passed through a small fully connected block, formed by connecting in sequence a fully connected layer of size (128, 128), a random dropout layer with drop probability 0.5, a ReLU activation layer, a fully connected layer of size (128, 32), a ReLU activation layer, and a fully connected layer of size (32, 1). The output of the fully connected block is the predicted value $Ap_t$ of the final classification accuracy of the current network.
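Under the layer sizes stated above, the prediction model could be realized in PyTorch roughly as follows; the class name and tensor shapes are our assumptions, not part of the patent:

```python
import torch
import torch.nn as nn

class PerformancePredictor(nn.Module):
    """Sketch of f: LSTM over the mapped codings, MLP over t, FC head."""
    def __init__(self, code_len=19, hidden=128):
        super().__init__()
        self.lstm = nn.LSTM(input_size=code_len, hidden_size=hidden,
                            num_layers=1, batch_first=True)
        # Contribution degree D_t of the iteration count t.
        self.t_mlp = nn.Sequential(
            nn.Linear(1, 64), nn.ReLU(),
            nn.Linear(64, 32),
            nn.Linear(32, 1))
        # Small fully connected block producing the predicted accuracy Ap_t.
        self.head = nn.Sequential(
            nn.Linear(hidden, 128), nn.Dropout(0.5), nn.ReLU(),
            nn.Linear(128, 32), nn.ReLU(),
            nn.Linear(32, 1))

    def forward(self, p_seq, t):
        # p_seq: (batch, S, code_len) mapped codings p_1..p_S; t: (batch, 1)
        _, (h, _) = self.lstm(p_seq)
        h = h.squeeze(0)                 # network structure feature
        return self.head(h * self.t_mlp(t))   # element-wise product, per (4)
```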
Before the network performance prediction model is used to guide the network search process, its parameters must be randomly initialized, and backpropagation is used to solve the following optimization problem to train the network, yielding the optimal parameters $\theta$:

$\theta = \arg\min_\theta \frac{1}{r} \sum_{i=1}^{r} \left\| Ap_t^{(i)} - Ag_t^{(i)} \right\|_2^2 \quad (5)$

where $r$ is the number of samples in a single training batch and $\|\cdot\|_2$ is the L2 norm.
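Training the predictor against the recorded validation accuracies, per the squared-error objective (5), might then be sketched as follows (reusing `map_coding` and `PerformancePredictor` from the sketches above):

```python
import random
import torch

def train_predictor(model, records, epochs=50, batch_size=64):
    """Fit f by backpropagation on the error between Ap_t and Ag_t."""
    opt = torch.optim.Adam(model.parameters())
    for _ in range(epochs):
        random.shuffle(records)
        for i in range(0, len(records), batch_size):
            batch = records[i:i + batch_size]
            p = torch.stack([torch.tensor(map_coding(c), dtype=torch.float32)
                             for c, _, _ in batch])       # (r, S, 19)
            t = torch.tensor([[float(it)] for _, it, _ in batch])
            ag = torch.tensor([[acc] for _, _, acc in batch])
            loss = ((model(p, t) - ag) ** 2).mean()       # objective (5)
            opt.zero_grad()
            loss.backward()
            opt.step()
```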
5. Genetic algorithm initialization.
First determine the parameters of the genetic algorithm, namely the number of individuals in the population $G_N$, the number of iteration rounds $G_T$, the mutation probability $G_M$, the crossover probability $G_C$, the mutation parameter $q_M$, the crossover parameter $q_C$, and the threshold $A_{mgn}$. Randomly generate $G_N$ structure codings as the generation-0 initial population $Ge_0$, with the $i$-th individual (i.e. the $i$-th structure coding) in the population denoted $Ge_0^i$. The deep network corresponding to each individual in the population is then fully trained and tested on the test set, and its classification accuracy is taken as the individual's score $fit^i$. The current highest accuracy is recorded as $fit_{max}$.
6. Selection operation on individuals.
Next, a selection operation $O_S$ is applied to the individuals in the population. From the $(j-1)$-th generation population $Ge_{j-1}$ ($j = 1, 2, \ldots, G_T$), the $j$-th generation population $Ge_j$ is selected by roulette-wheel selection, using the score $fit^i$ of each individual in the current population as the basis. The roulette-wheel scheme gives higher-scoring individuals a greater probability of surviving into the next generation, and this process is iterated continually.
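A minimal sketch of the roulette-wheel selection, assuming non-negative scores (helper name ours):

```python
import random

def select(population, scores, g_n):
    """Draw G_N survivors with probability proportional to fitness score."""
    return random.choices(population, weights=scores, k=g_n)
```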
7. Crossover operation on individuals.
A crossover operation with probability $G_C$ and parameter $q_C$ is applied to the individuals in the population. Crossover acts on the coding segment of each stage within an individual: every pair of individuals in the population crosses with probability $G_C$, and the concrete operation exchanges each of the three stage segments between the two individuals with probability $q_C$.
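A sketch of the stage-wise crossover under the flat-bit-list representation (helper name ours):

```python
import random

def crossover(a, b, g_c, q_c, stage_nodes=(3, 4, 5)):
    """With probability G_C, swap each stage segment of a pair w.p. q_C."""
    if random.random() >= g_c:
        return a, b
    a, b = a[:], b[:]
    start = 0
    for k in stage_nodes:
        seg = k * (k - 1) // 2
        if random.random() < q_c:       # exchange this stage's segment
            a[start:start + seg], b[start:start + seg] = \
                b[start:start + seg], a[start:start + seg]
        start += seg
    return a, b
```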
8. Mutation operation on individuals.
A mutation operation with probability $G_M$ is applied to individuals that did not undergo crossover. Mutation manifests as each binary digit in the individual's sequence being inverted with probability $q_M$, i.e. changed from 0 to 1 or from 1 to 0. Mutation acts on single binary digits.
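A corresponding mutation sketch (helper name ours):

```python
import random

def mutate(coding, g_m, q_m):
    """With probability G_M, invert each bit of the coding w.p. q_M."""
    if random.random() >= g_m:
        return coding
    return [bit ^ 1 if random.random() < q_m else bit for bit in coding]
```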
9. Predicting the performance of the networks corresponding to individuals.
The structure codings, together with the iteration count at the end of training, are input into the network performance prediction model to obtain the expected score $fit_{pre}^i$ of each individual in the population, i.e. the expected classification accuracy of the network after full training.
10. Evaluation operation on individuals.
After the expected scores of the individuals are obtained in step 9, each expected score $fit_{pre}^i$ is compared with the current best score $fit_{max}$. If $fit_{pre}^i \ge fit_{max} - A_{mgn}$, the predicted performance of the individual is good: the algorithm fully trains the network, tests it on the test set, and takes the actual performance on the test set as the individual's actual score $fit^i$. If $fit_{pre}^i < fit_{max} - A_{mgn}$, the predicted performance of the individual is poor: the algorithm does not actually train it and takes the lower predicted performance alone as the individual's score $fit^i$. After the assessment, the current best individual score $fit_{max}$ is updated, and the process returns to step 6 until the total number of iterations of the algorithm exceeds $G_T$. When the algorithm terminates, the optimal network structure is produced.
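The prediction-gated evaluation could be sketched as follows; `predict_accuracy` (the trained predictor f applied at iteration T) and `train_and_test` (full training plus test-set evaluation) are hypothetical helpers:

```python
def evaluate(population, fit_max, a_mgn):
    """Fully train only individuals predicted within A_mgn of fit_max."""
    scores = []
    for coding in population:
        fit_pre = predict_accuracy(coding)     # expected score fit_pre^i
        if fit_pre >= fit_max - a_mgn:
            fit = train_and_test(coding)       # promising: real training
        else:
            fit = fit_pre                      # unpromising: keep prediction
        scores.append(fit)
        fit_max = max(fit_max, fit)
    return scores, fit_max
```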
This method provides a good acceleration effect for a wide range of image classification structure optimization tasks. Taking the optimization of a classification network structure on the Pavia University dataset as an example, the traditional genetic-algorithm-based structure optimization method needs 0.99 hours to produce an optimal deep network structure with a classification accuracy of 89.1%, whereas our method needs only 0.635 hours to produce an optimal deep network structure with a classification accuracy of 88.6%. The deep neural network structure optimization method based on the fusion of a prediction mechanism and a genetic algorithm proposed by the present invention can thus greatly accelerate the structure optimization process, while the final classification accuracy of the optimal network structure found on the specified dataset is nearly identical to that of the traditional genetic-algorithm-based structure optimization method.

Claims (1)

1. A deep neural network structure optimization method based on the fusion of a prediction mechanism and a genetic algorithm, characterized by comprising the following steps:
Step 1: data preprocessing:
First define the image classification database $X = [x_1, x_2, \ldots, x_N]^T \in \mathbb{R}^{N \times b}$, where $x_n \in \mathbb{R}^{1 \times b}$ denotes the $n$-th sample; its class label matrix is $Y = [y_1, y_2, \ldots, y_N]^T \in \mathbb{R}^{N \times L}$, where $y_n \in \mathbb{R}^{1 \times L}$ is the one-hot label of the $n$-th sample, $n = 1, 2, \ldots, N$, $N$ is the total number of samples, $L$ is the number of classes, and $b$ is the spectral dimension; then normalize each sample in $X$ to the range $[0, 1]$ and randomly select $N_{train}$ samples and their class labels to obtain the training data $X_{train}$ and corresponding class labels $Y_{train}$, where $N_{train} < N$; in addition, assign all remaining data and labels in the dataset to the test set, denoted $X_{test}$ and $Y_{test}$;
Step 2: determining the coding rule of the network structure:
First generate $M$ different network structures, and denote the structure coding of the $m$-th neural network by $C_m$; the coding contains $S$ stages, i.e. $C_m = \{C_m^1, C_m^2, \ldots, C_m^S\}$, where $C_m^s$ is the coding segment of the $s$-th stage; stage $s$ contains $K_s$ nodes, each representing a hybrid operation composed of convolution + batch normalization + ReLU activation and denoted $v_{s,1}, v_{s,2}, \ldots, v_{s,K_s}$; within a stage, lower-numbered nodes connect to higher-numbered nodes, and the connections between nodes are represented by a $\frac{1}{2}K_s(K_s-1)$-bit binary code; the 1st bit represents the connection between $(v_{s,1}, v_{s,2})$: the bit is 1 if the connection exists and 0 if not; the next two bits represent the connections to the third node, i.e. $(v_{s,1}, v_{s,3})$ and $(v_{s,2}, v_{s,3})$; setting $S = 3$, $K_1 = 3$, $K_2 = 4$, $K_3 = 5$, the total length of the structure coding is 19, i.e. $len(C_m) = \sum_{s=1}^{S} \frac{1}{2} K_s (K_s - 1) = 3 + 6 + 10 = 19 \quad (1)$;
Step 3: collecting the training data of the network performance prediction model:
Randomly generate $M$ mutually distinct structure codings $C_1, C_2, \ldots, C_M$; after automatic compilation, the deep network corresponding to each coding is fully trained on the specified dataset; training uses the Adam optimizer to learn the network parameters and runs for $T$ iterations in total; each time the network completes training on one batch, the iteration count $t$ it has undergone and its classification accuracy $Ag_t$ on the validation set are recorded, yielding the data required for training the prediction model: $data = \{C_m, t, Ag_t\}$, $t = \{1, 2, \ldots, T\}$;
Step 4: construction and training of the network performance prediction model:
Define the network performance prediction model $f$: after the input structure coding $C_m$ is given and the mapping $\mu$ is applied to it, the model predicts the accuracy $Ap_t$ of the corresponding network on the test set after $t$ training iterations, that is:

$Ap_t = f(\mu(C_m), t) \quad (2)$
In the mapping phase, the model maps the structure coding $C$ into a set of $S$ structure codings $\mu(C) = \{p_1, p_2, \ldots, p_S\}$; in $p_s$, the bits from position $\sum_{i=1}^{s-1} len(C^i) + 1$ through position $\sum_{i=1}^{s} len(C^i)$ take the values of the corresponding positions of the original coding, and all remaining positions are filled with zeros, that is:

$p_s[idx] = \begin{cases} C[idx], & \sum_{i=1}^{s-1} len(C^i) < idx \le \sum_{i=1}^{s} len(C^i) \\ 0, & \text{otherwise} \end{cases} \quad (3)$

where $p[idx]$ and $C[idx]$ denote the value of the $idx$-th bit of the structure codings $p$ and $C$;
After the structure coding has been mapped, $p_1, p_2, \ldots, p_S$ are fed in sequence into a single-layer long short-term memory (LSTM) network with hidden size 128, finally yielding the hidden state $h$ of the LSTM unit, referred to as the network structure feature; meanwhile, the iteration count $t$ is fed into a multi-layer perceptron composed of a fully connected layer of size (1, 64), a ReLU activation layer, a fully connected layer of size (64, 32), and a fully connected layer of size (32, 1), producing the contribution degree $D_t$ of the iteration count to the final classification accuracy of the network;
The contribution degree $D_t$ is multiplied element-wise with the structure feature $h$ of the network:

$h[id] = D_t \times h[id], \quad id = \{1, 2, \ldots, len(h)\} \quad (4)$
The result is fed into a small fully connected block consisting of a fully connected layer of size (128, 128), a random dropout layer with drop probability 0.5, a ReLU activation layer, a fully connected layer of size (128, 32), a ReLU activation layer, and a fully connected layer of size (32, 1); the output of the fully connected block is the predicted value $Ap_t$ of the final classification accuracy of the current network;
Before the performance prediction network is trained, its parameters are randomly initialized, and backpropagation is used to solve the following optimization problem to learn the network parameters, yielding the optimal parameters $\theta$:

$\theta = \arg\min_\theta \frac{1}{r} \sum_{i=1}^{r} \left\| Ap_t^{(i)} - Ag_t^{(i)} \right\|_2^2 \quad (5)$

where $r$ is the number of samples in a training batch and $\|\cdot\|_2$ is the L2 norm;
Step 5: initializing the genetic algorithm:
Set the parameters of the genetic algorithm, including the number of individuals in the population $G_N$, the number of iteration rounds $G_T$, the mutation probability $G_M$, the crossover probability $G_C$, the mutation parameter $q_M$, the crossover parameter $q_C$, and the threshold $A_{mgn}$, and randomly generate $G_N$ structure codings as the initial population $Ge_0$, recorded as generation 0, with the $i$-th individual in the population denoted $Ge_0^i$; then assess the score of each individual in the population to obtain its score $fit^i$, and record the current highest accuracy as $fit_{max}$;
Step 6: performing a selection operation on individuals:
The selection operation is applied to each individual in the previous generation's population: from the population $Ge_{j-1}$, $j = 1, 2, \ldots, G_T$, a new generation $Ge_j$ is selected by roulette-wheel selection according to the individuals' scores $fit^i$; the higher an individual's score, the greater its probability of being chosen and surviving into the next generation;
Step 7: performing a crossover operation on individuals:
The crossover operation acts on the coding segments of each stage of the individuals: every pair of individuals in the population crosses with probability $G_C$, and the crossover exchanges each of the three stage segments between the two individuals with probability $q_C$;
Step 8: performing a mutation operation on individuals:
The mutation operation acts on each bit of an individual's coding: each binary digit of the coding is inverted with probability $q_M$, i.e. changed from 0 to 1 or from 1 to 0;
Step 9: predicting the performance of the networks corresponding to individuals:
The structure codings, together with the iteration count at the end of training, are input into the network performance prediction model to obtain the expected score $fit_{pre}^i$ of each individual in the population, i.e. the expected classification accuracy of the network after full training;
Step 10: performing an evaluation operation on individuals:
The expected score $fit_{pre}^i$ is compared with the current best score $fit_{max}$; if $fit_{pre}^i \ge fit_{max} - A_{mgn}$, the algorithm fully trains the network, tests it on the test set, and takes the actual performance on the test set as the individual's actual score $fit^i$; if $fit_{pre}^i < fit_{max} - A_{mgn}$, the network is not actually trained, and the lower predicted performance alone is taken as the individual's score $fit^i$; after the assessment, the current best individual score $fit_{max}$ is updated, and the process returns to step 6 until the total number of iterations exceeds $G_T$; the optimal network structure is obtained when the algorithm terminates.
CN201910696239.XA 2019-07-30 2019-07-30 Deep neural network structure optimization method based on fusion of prediction mechanism and genetic algorithm Active CN110490320B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201910696239.XA CN110490320B (en) 2019-07-30 2019-07-30 Deep neural network structure optimization method based on fusion of prediction mechanism and genetic algorithm

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN201910696239.XA CN110490320B (en) 2019-07-30 2019-07-30 Deep neural network structure optimization method based on fusion of prediction mechanism and genetic algorithm

Publications (2)

Publication Number Publication Date
CN110490320A 2019-11-22
CN110490320B CN110490320B (en) 2022-08-23

Family

ID=68548791

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201910696239.XA Active CN110490320B (en) 2019-07-30 2019-07-30 Deep neural network structure optimization method based on fusion of prediction mechanism and genetic algorithm

Country Status (1)

Country Link
CN (1) CN110490320B (en)



Patent Citations (9)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN102915445A (en) * 2012-09-17 2013-02-06 杭州电子科技大学 Method for classifying hyperspectral remote sensing images based on improved neural network
CN103971162A (en) * 2014-04-04 2014-08-06 华南理工大学 Method for improving BP (back propagation) neural network based on genetic algorithm
CN105303252A (en) * 2015-10-12 2016-02-03 国家计算机网络与信息安全管理中心 Multi-stage neural network model training method based on genetic algorithm
CN106503802A (en) * 2016-10-20 2017-03-15 上海电机学院 Method for optimizing BP neural network system using genetic algorithm
CN108021983A (en) * 2016-10-28 2018-05-11 谷歌有限责任公司 Neural architecture search
US9785886B1 (en) * 2017-04-17 2017-10-10 SparkCognition, Inc. Cooperative execution of a genetic algorithm with an efficient training algorithm for data-driven model creation
CN108229657A (en) * 2017-12-25 2018-06-29 杭州健培科技有限公司 Deep neural network training and optimization algorithm based on evolutionary algorithm
CN109243172A (en) * 2018-07-25 2019-01-18 华南理工大学 Traffic flow forecasting method based on genetic algorithm optimization LSTM neural network
CN110020667A (en) * 2019-02-21 2019-07-16 广州视源电子科技股份有限公司 Neural network structure search method, system, storage medium and device

Non-Patent Citations (6)

* Cited by examiner, † Cited by third party
Title
BOWEN BAKER et al.: "Accelerating Neural Architecture Search Using Performance Prediction", ICLR 2018 *
CHEN DING et al.: "Hyperspectral Image Classification Based on Convolutional Neural Networks With Adaptive Network Structure", 2018 International Conference on Orange Technologies *
LINGXI XIE et al.: "Genetic CNN", 2017 IEEE International Conference on Computer Vision *
ZHICHAO LU et al.: "NSGA-Net: Neural Architecture Search using Multi-Objective Genetic Algorithm", arXiv *
王华斌 et al.: "A variable-structure convolutional neural network method for element extraction from remote sensing imagery", Acta Geodaetica et Cartographica Sinica (《测绘学报》) *
陈晓艳 et al.: "Identifying biological neural network connections via dynamic Bayesian network structure search", Life Science Research (《生命科学研究》) *

Cited By (14)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN111415009A (en) * 2020-03-19 2020-07-14 四川大学 Convolutional variational autoencoder network structure search method based on genetic algorithm
CN112084877B (en) * 2020-08-13 2023-08-18 西安理工大学 NSGA-NET-based remote sensing image recognition method
CN112084877A (en) * 2020-08-13 2020-12-15 西安理工大学 NSGA-NET-based remote sensing image identification method
CN112001485A (en) * 2020-08-24 2020-11-27 平安科技(深圳)有限公司 Group convolution number searching method and device
WO2021151311A1 (en) * 2020-08-24 2021-08-05 平安科技(深圳)有限公司 Group convolution number searching method and apparatus
CN112001485B (en) * 2020-08-24 2024-04-09 平安科技(深圳)有限公司 Group convolution number searching method and device
CN112183749A (en) * 2020-10-26 2021-01-05 天津大学 Deep learning library test method based on directed model mutation
CN112183749B (en) * 2020-10-26 2023-04-18 天津大学 Deep learning library test method based on directed model mutation
CN114842328A (en) * 2022-03-22 2022-08-02 西北工业大学 Hyperspectral change detection method based on cooperative analysis autonomous sensing network structure
CN114842328B (en) * 2022-03-22 2024-03-22 西北工业大学 Hyperspectral change detection method based on collaborative analysis autonomous perception network structure
CN114943866A (en) * 2022-06-17 2022-08-26 之江实验室 Image classification method based on evolutionary neural network structure search
CN114943866B (en) * 2022-06-17 2024-04-02 之江实验室 Image classification method based on evolutionary neural network structure search
CN115994575B (en) * 2023-03-22 2023-06-02 方心科技股份有限公司 Power failure diagnosis neural network architecture design method and system
CN115994575A (en) * 2023-03-22 2023-04-21 方心科技股份有限公司 Power failure diagnosis neural network architecture design method and system

Also Published As

Publication number Publication date
CN110490320B (en) 2022-08-23

Similar Documents

Publication Publication Date Title
CN110490320A (en) Deep neural network structural optimization method based on forecasting mechanism and Genetic Algorithm Fusion
Zhang et al. Efficient evolutionary search of attention convolutional networks via sampled training and node inheritance
CN104751842B (en) The optimization method and system of deep neural network
CN109948029A (en) Based on the adaptive depth hashing image searching method of neural network
CN109299262A (en) A kind of text implication relation recognition methods for merging more granular informations
Foster et al. Structure in the space of value functions
CN105279555A (en) Self-adaptive learning neural network implementation method based on evolutionary algorithm
CN110826638A (en) Zero sample image classification model based on repeated attention network and method thereof
CN108629326A (en) The action behavior recognition methods of objective body and device
CN106777402B (en) A kind of image retrieval text method based on sparse neural network
CN109460855A (en) A kind of throughput of crowded groups prediction model and method based on focus mechanism
CN108763376A (en) Syncretic relation path, type, the representation of knowledge learning method of entity description information
CN103905246B (en) Link prediction method based on grouping genetic algorithm
CN106874655A (en) Traditional Chinese medical science disease type classification Forecasting Methodology based on Multi-label learning and Bayesian network
CN111461437B (en) Data-driven crowd motion simulation method based on generation of countermeasure network
CN110580727B (en) Depth V-shaped dense network imaging method with increased information flow and gradient flow
CN110222838A (en) Deep neural network and its training method, device, electronic equipment and storage medium
CN114861890A (en) Method and device for constructing neural network, computing equipment and storage medium
CN112634019A (en) Default probability prediction method for optimizing grey neural network based on bacterial foraging algorithm
CN115328971A (en) Knowledge tracking modeling method and system based on double-graph neural network
CN116306902A (en) Time sequence data environment analysis and decision method, device, equipment and storage medium
CN111882042A (en) Automatic searching method, system and medium for neural network architecture of liquid state machine
Baruah et al. Data augmentation and deep neuro-fuzzy network for student performance prediction with MapReduce framework
CN109948589A (en) Facial expression recognizing method based on quantum deepness belief network
CN116258504B (en) Bank customer relationship management system and method thereof

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant