CN111242281A - Weight optimization method for deep convolutional neural network - Google Patents

Weight optimization method for deep convolutional neural network

Info

Publication number
CN111242281A
CN111242281A
Authority
CN
China
Prior art keywords
individuals
neural network
convolutional neural
initial population
population
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202010014858.9A
Other languages
Chinese (zh)
Inventor
安竹林
杨传广
徐勇军
程坦
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Xiamen Institute Of Data Intelligence Institute Of Computing Technology Chinese Academy Of Sciences
Original Assignee
Xiamen Institute Of Data Intelligence Institute Of Computing Technology Chinese Academy Of Sciences
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Xiamen Institute Of Data Intelligence Institute Of Computing Technology Chinese Academy Of Sciences
Priority to CN202010014858.9A
Publication of CN111242281A
Legal status: Pending

Classifications

    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06NCOMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00Computing arrangements based on biological models
    • G06N3/02Neural networks
    • G06N3/04Architecture, e.g. interconnection topology
    • G06N3/045Combinations of networks
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00Pattern recognition
    • G06F18/20Analysing
    • G06F18/24Classification techniques
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06NCOMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00Computing arrangements based on biological models
    • G06N3/02Neural networks
    • G06N3/08Learning methods
    • G06N3/086Learning methods using evolutionary algorithms, e.g. genetic algorithms or genetic programming

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • Data Mining & Analysis (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • General Engineering & Computer Science (AREA)
  • Artificial Intelligence (AREA)
  • Health & Medical Sciences (AREA)
  • General Physics & Mathematics (AREA)
  • Evolutionary Computation (AREA)
  • Molecular Biology (AREA)
  • General Health & Medical Sciences (AREA)
  • Software Systems (AREA)
  • Bioinformatics & Computational Biology (AREA)
  • Biomedical Technology (AREA)
  • Biophysics (AREA)
  • Computational Linguistics (AREA)
  • Evolutionary Biology (AREA)
  • Bioinformatics & Cheminformatics (AREA)
  • Computing Systems (AREA)
  • Mathematical Physics (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Physiology (AREA)
  • Image Analysis (AREA)

Abstract

The invention discloses a method for optimizing the weights of a deep convolutional neural network, comprising the following steps: obtaining an initial population and performing initialization and gene encoding; training all individuals in the initial population by gradient descent until a preset number of steps is reached; calculating and ranking individual fitness; applying the selection, crossover, and mutation operations of a genetic algorithm to the initial population to obtain a new generation; and judging whether a termination condition is reached, and if not, iteratively training and evolving the new generation. By combining a genetic algorithm with gradient descent to optimize the weights of a deep convolutional neural network, the invention can improve the network's recognition rate while also speeding up its training.

Description

Weight optimization method for deep convolutional neural network
Technical Field
The invention relates to the technical field of artificial intelligence, in particular to a weight optimization method of a deep convolutional neural network.
Background
Deep learning has developed rapidly within artificial intelligence, approaching or surpassing human performance in image recognition, speech recognition, natural language processing, and other tasks, but its development has not stopped there: optimizing deep learning algorithms and combining them with techniques from other fields have become new directions for its development. As is well known, the artificial neural network underlying deep learning has been studied for decades, and, like other techniques in artificial intelligence, its inventive inspiration came from biology, namely the information-transfer mechanism of the human brain. In the field of optimization, intelligent algorithms such as the genetic algorithm and particle swarm optimization likewise originate from biology. These heuristic, biologically inspired algorithms have the advantages of strong global search capability and fast convergence, and in deep learning such heuristic methods are currently used mainly to search for optimal solutions or to assist gradient descent.
The gradient descent method exploits the fact that a function decreases fastest in the direction of the negative gradient: subtracting the gradient at the current point from the parameter values descends step by step toward a minimum of the function, thereby optimizing the parameters. The method has shortcomings, however: it easily falls into local optima, requires computing complex gradients, converges slowly, and fails to converge effectively on non-convex functions. The field combining deep learning with intelligent optimization algorithms arose to overcome these shortcomings of gradient descent by exploiting the complementary advantages of the two, and it has gradually attracted the attention of more researchers.
Genetic algorithms have long been applied to neural network optimization, and the emergence of deep neural networks brings new challenges to this research. Compared with traditional neural networks, deep neural networks have more layers and a larger network scale. In addition, the introduction of convolution operations makes the object being optimized more complex, so traditional genetic-algorithm-based methods for optimizing neural networks cannot be applied directly to deep neural networks.
At present, research on optimizing deep neural networks with genetic algorithms focuses mainly on reinforcement learning, for example Neuroevolution (NE). Its main idea is to let neural networks compete for survival according to the survival-of-the-fittest principle of the biological world, finally obtaining the individual with the highest fitness, that is, the best reinforcement-learning neural network. The biological inspiration of this field lies in the evolutionary process of the neurons of the human brain: the learning and memory functions the brain possesses today are inseparable from its complex neural network system, which is the product of a long evolutionary process rather than of the gradient descent and backpropagation used in deep learning. On this basis, evolving neural networks with evolutionary algorithms has become an emerging field that is feasible in principle.
Neuroevolution takes two main forms. The first is fixed-topology neuroevolution: the topology of the neural network is still designed by the researcher, and only the network weights are handed to the evolutionary algorithm to evolve toward the optimal solution. The second is Topology and Weight Evolving Artificial Neural Networks (TWEANN), in which both the topology and the network weights are evolved, finally yielding the optimal combination of topology and weights. More and more algorithms that evolve both topology and weights have since appeared. The first was NeuroEvolution of Augmenting Topologies (NEAT); it introduced historical gene markers and speciation, solved the problem that immature individuals in TWEANN could go extinct, and enlarged the scale of networks that can be evolved. HyperNEAT (Hypercube-based NEAT) then built on NEAT by encoding genes indirectly, using a Compositional Pattern Producing Network (CPPN) to generate network connections and greatly increasing the scale of evolvable network topologies. The emergence of Novelty Search (NS) meant that fitness was no longer the sole criterion for judging a neural network; by adding the notion of novelty, it became easier to find potentially optimal individuals.
Neuroevolution has achieved remarkable results in reinforcement learning, and agents evolved with it can even outperform deep learning in certain games. In supervised learning, image classifiers trained with evolutionary algorithms have also achieved good results. For deep convolutional neural networks, however, evolution takes so long that the overall performance still falls short of training with gradient descent alone, so no effective optimization method or result yet exists.
Disclosure of Invention
In order to solve the above problems, the invention provides a weight optimization method for a deep convolutional neural network.
The invention adopts the following technical scheme:
A deep convolutional neural network weight optimization method comprises the following steps:
S1, obtaining an initial population, initializing the weights and biases of its individuals, and performing gene encoding;
S2, performing gradient-descent parameter training on all individuals in the initial population until a preset number of steps is reached;
S3, loading the weights and biases of the individuals in the initial population into a computation graph, obtaining each individual's fitness on the test-set data, ranking the individuals by fitness in ascending order, and saving the individual with the highest fitness;
S4, applying the selection, crossover, and mutation operations of a genetic algorithm to the initial population to obtain a new generation;
S5, judging whether a termination condition is reached: if so, terminating; otherwise, executing step S6. The termination condition is that the maximum fitness meets the requirement or the maximum number of iteration steps is reached;
S6, jumping back to step S2 to iteratively train and evolve the new generation.
Preferably, step S4 comprises the following substeps:
S41, selecting n-1 individuals from the initial population with a roulette-wheel algorithm based on the fitness ranking;
S42, performing crossover among the selected individuals to obtain n-1 crossed individuals;
S43, performing mutation on the n-1 crossed individuals to obtain n-1 mutated individuals;
S44, combining the n-1 mutated individuals with the highest-fitness individual saved in step S3 to obtain a new generation of n individuals.
Preferably, the preset number of times described in step S2 is set by the following formula:
[formula image not reproduced in the text]
where ga_step is the interval, in steps, at which the genetic algorithm is applied, and step is the number of training steps.
Preferably, obtaining the initial population in step S1 specifically comprises: randomly generating an initial population according to preset parameters.
Preferably, the initialization in step S1 uses a normal-distribution initialization method.
Preferably, the crossover operation in step S42 uses single-point crossover: a crossover point is randomly selected and the gene sequence is split there, and the paired parents then exchange gene segments to form new offspring.
Compared with the background art, the above technical scheme gives the invention the following advantage:
By combining a genetic algorithm with gradient descent to optimize the weights of a deep convolutional neural network, the invention can improve the network's recognition rate while also speeding up its training.
Drawings
FIG. 1 is a schematic flow diagram of the present invention;
FIG. 2 shows the gene encoding scheme;
FIG. 3 shows the parameter dimension information;
FIG. 4 is a schematic diagram of single-point crossover on a chromosome;
FIG. 5 is a schematic diagram of the topology under single-point crossover.
Detailed Description
In order to make the objects, technical solutions and advantages of the present invention more apparent, the present invention is described in further detail below with reference to the accompanying drawings and embodiments. It should be understood that the specific embodiments described herein are merely illustrative of the invention and are not intended to limit the invention.
Examples
Referring to fig. 1, the invention discloses a method for optimizing weights of a deep convolutional neural network, which comprises the following steps:
s1, obtaining initial population, initializing individual weight and bias, and encoding gene. The initial population is randomly generated according to preset parameters, the preset parameters mainly aim at a genetic algorithm part, in the embodiment, a single-point crossing is adopted in a crossing mode, the probability is pc equal to 0.5, a replacement variation mode is adopted in a variation mode, values in a parameter range (set according to the weight and the bias distribution diagram of the convolutional neural network) are randomly generated firstly, then gene values are replaced, the individual variation probability pm is 0.1, and the parameter variation probability pw is 0.8. The initialization adopts a normal distribution initialization method. Since the weights and biases of each layer in the convolutional neural network are real numbers, the genes of the genetic algorithm are encoded by real numbers. Visual coding As shown in FIG. 2, the chromosome is a parameter list including 5 layers of parameters w of the neural network1,b1,w2,b2,w3,b3,w4,b4,w5,b5Therefore, the length of the chromosome is 10. The weights and offsets are not real scalars as in the conventional encoding method, but are arrays with dimension information, and the dimensions of the parameters are as shown in fig. 3 according to the structural design of the convolutional neural network.
S2, performing gradient-descent parameter training on all individuals in the initial population until a preset number of steps is reached. The preset number is set by the following formula:
[formula image not reproduced in the text]
where ga_step is the interval, in steps, at which the genetic algorithm is applied, and step is the number of training steps.
The formula above is essentially a step-decay schedule. Preliminary experiments showed that networks trained with ga_step = 2 or ga_step = 5 barely improve after 1000 generations, and networks with ga_step = 40 perform poorly in the first 700 generations, while networks using the decaying schedule maintain good results early in training and far surpass the others later. Because the classifier performs poorly early in training, applying the genetic algorithm frequently at that stage helps the network find good parameters faster and accelerates training; late in training, frequent application of the genetic algorithm would disturb gradient descent and disorder training, so its use must be reduced to let gradient descent proceed steadily.
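The decaying schedule just described can be sketched as below. The patent's exact formula is an image not reproduced in the text, so the thresholds and return values here are purely illustrative assumptions; only the qualitative behaviour, applying the genetic algorithm often early in training and rarely later, follows the description.

```python
def ga_interval(step):
    """Hypothetical step-decay schedule for ga_step (the real formula is
    not reproduced in the source). Early training: small interval, so the
    GA runs often; late training: large interval, so gradient descent
    proceeds undisturbed."""
    if step < 1000:
        return 5    # early: run the GA every 5 gradient steps
    if step < 5000:
        return 20   # middle: back off
    return 40       # late: rarely interrupt gradient descent

print(ga_interval(0), ga_interval(2000), ga_interval(10000))  # 5 20 40
```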
S3, loading the weights and biases of the individuals in the initial population into the computation graph, obtaining each individual's fitness on the test-set data, ranking the individuals by fitness in ascending order, and saving the individual with the highest fitness.
S4, applying the selection, crossover, and mutation operations of the genetic algorithm to the initial population to obtain a new generation. This comprises the following steps:
and S41, sequentially selecting n-1 individuals by adopting a roulette algorithm according to the ranking of the individual fitness for all the individuals of the initial population. The selection operation is responsible for selecting individuals in the population on a certain basis for reproduction or retention in the next generation. In the selection operation of the experiment, uniform sequencing is adopted, namely the fitness of all individuals in the population is arranged in an ascending order, and the obtained ranking is used as the selected basis. Roulette selection is then used, which is a pull-back sampling operation where the probability of a particular individual being selected into the next generation is the ratio of the individual's rank to the sum of the entire population's ranks. Firstly, a fan-shaped wheel disc is manufactured according to the fitness proportion of individuals, and the individuals are selected by rotating the wheel disc each time. It should be noted that the reason why the selection operation does not directly use the fitness as the sampling basis is the difference of the individual fitness, and in the experimental process, it is observed that the difference of all the individual fitness is mostly 10-2Level, so directly using the fitness ratio to sample is not obvious and cannotThe advantages of the selection operation are exerted to the maximum extent. In order to ensure that the optimal individuals can be reserved, the individuals with the maximum fitness automatically join the new population after each selection is finished.
S42, performing crossover among the selected individuals to obtain n-1 crossed individuals. Crossover is the process in which two paired individuals exchange part of their genes, producing two new individuals. Since single-point crossover preserves the integrity of the topology to the greatest extent while still achieving a topology-level exchange, this embodiment uses single-point crossover: a crossover point is randomly selected and the gene sequence is split there, and the paired parents then exchange gene segments to form new offspring (as shown in FIG. 4 and FIG. 5). During crossover, individuals are paired as the 1st with the 2nd, the 3rd with the 4th, ..., and the (n-2)th with the (n-1)th, yielding n-1 individuals after crossover.
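A minimal sketch of the single-point crossover in step S42, with placeholder strings standing in for the ten array-valued genes of the chromosome:

```python
import random

def single_point_crossover(parent_a, parent_b, rng=random):
    """Step S42: cut both chromosomes at one random point and swap the
    tail segments, producing two offspring (cf. FIG. 4)."""
    point = rng.randrange(1, len(parent_a))   # cut somewhere strictly inside
    child_a = parent_a[:point] + parent_b[point:]
    child_b = parent_b[:point] + parent_a[point:]
    return child_a, child_b

random.seed(0)
pa = ["w1", "b1", "w2", "b2", "w3", "b3"]
pb = ["W1", "B1", "W2", "B2", "W3", "B3"]
ca, cb = single_point_crossover(pa, pb)
print(ca, cb)
```

Because each gene is an entire layer's weight or bias array, cutting between genes swaps whole layers, which is why the text says single-point crossover preserves topological integrity.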
S43, performing mutation on the n-1 crossed individuals to obtain n-1 mutated individuals.
S44, combining the n-1 mutated individuals with the highest-fitness individual saved in step S3 to obtain a new generation of n individuals. Mutation is the operation of changing gene values on a chromosome with a certain probability; by the manner of change, it is mainly classified into perturbation mutation and replacement mutation. As the names imply, perturbation mutation increases or decreases a parameter value by the mutation value, while replacement mutation substitutes the mutation value for the parameter value. In this embodiment, the mutation operation uses replacement mutation: an individual to mutate is first determined according to the individual mutation probability, and the chosen individual then undergoes, once per position along the coding length, a parameter mutation governed each time by the parameter mutation probability.
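The replacement mutation just described can be sketched as below, using the embodiment's probabilities pm = 0.1 and pw = 0.8 as defaults and scalar genes for simplicity (the patent mutates array-valued parameters).

```python
import random

def replace_mutate(chromosome, low, high, pm=0.1, pw=0.8, rng=random):
    """Steps S43/S44 mutation: with individual mutation probability pm the
    chromosome mutates at all; each gene of a mutating individual is then
    replaced, with parameter mutation probability pw, by a fresh random
    value from the parameter range [low, high] (replacement mutation)."""
    if rng.random() >= pm:
        return list(chromosome)          # individual not chosen to mutate
    return [rng.uniform(low, high) if rng.random() < pw else gene
            for gene in chromosome]

random.seed(3)
out = replace_mutate([0.5, -0.2, 0.1], low=-1.0, high=1.0)
print(len(out))
```

The range [low, high] corresponds to the parameter range set from the network's weight and bias distributions in step S1.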
S5, judging whether a termination condition is reached: if so, terminating; otherwise, executing step S6. The termination condition is that the maximum fitness meets the requirement or the maximum number of iteration steps is reached.
S6, jumping back to step S2 to iteratively train and evolve the new generation.
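Putting steps S2 through S6 together, the hybrid training loop can be sketched as follows. The trainer, fitness function, and GA operators are passed in as stand-in callables, since the patent does not fix a concrete network or framework; the toy run at the bottom uses trivial stand-ins only to exercise the control flow.

```python
def evolve(population, train_step, fitness, select, crossover, mutate,
           generations, ga_interval):
    """Skeleton of steps S2-S6: train every individual by gradient descent
    for ga_interval(step) steps, then apply one round of selection,
    crossover, and mutation, always carrying the elite individual over."""
    step = 0
    for _ in range(generations):
        for ind in population:                    # S2: gradient training
            for _ in range(ga_interval(step)):
                train_step(ind)
        step += ga_interval(step)
        ranked = sorted(population, key=fitness)  # S3: ascending fitness
        elite = ranked[-1]                        # best individual is saved
        parents = select(ranked)                  # S41: choose n-1 parents
        children = mutate(crossover(parents))     # S42/S43
        population = children + [elite]           # S44: elitism, n individuals
    return max(population, key=fitness)

# Toy run: individuals are dicts, "training" nudges the weight upward,
# and the GA operators are identity maps (placeholders, not the patent's).
pop = [{"w": 0.0}, {"w": 1.0}]
best = evolve(pop,
              train_step=lambda ind: ind.update(w=ind["w"] + 0.1),
              fitness=lambda ind: ind["w"],
              select=lambda ranked: [dict(r) for r in ranked[:-1]],
              crossover=lambda ps: ps,
              mutate=lambda ps: ps,
              generations=3,
              ga_interval=lambda s: 1)
print(round(best["w"], 1))  # elite individual after 3 generations: 1.3
```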
The above description is only for the preferred embodiment of the present invention, but the scope of the present invention is not limited thereto, and any changes or substitutions that can be easily conceived by those skilled in the art within the technical scope of the present invention are included in the scope of the present invention. Therefore, the protection scope of the present invention shall be subject to the protection scope of the claims.

Claims (6)

1. A deep convolutional neural network weight optimization method, characterized by comprising the following steps:
S1, obtaining an initial population, initializing the weights and biases of its individuals, and performing gene encoding;
S2, performing gradient-descent parameter training on all individuals in the initial population until a preset number of steps is reached;
S3, loading the weights and biases of the individuals in the initial population into a computation graph, obtaining each individual's fitness on the test-set data, ranking the individuals by fitness in ascending order, and saving the individual with the highest fitness;
S4, applying the selection, crossover, and mutation operations of a genetic algorithm to the initial population to obtain a new generation;
S5, judging whether a termination condition is reached: if so, terminating; otherwise, executing step S6, wherein the termination condition is that the maximum fitness meets the requirement or the maximum number of iteration steps is reached;
S6, jumping back to step S2 to iteratively train and evolve the new generation.
2. The deep convolutional neural network weight optimization method of claim 1, wherein step S4 comprises the following substeps:
S41, selecting n-1 individuals from the initial population with a roulette-wheel algorithm based on the fitness ranking;
S42, performing crossover among the selected individuals to obtain n-1 crossed individuals;
S43, performing mutation on the n-1 crossed individuals to obtain n-1 mutated individuals;
S44, combining the n-1 mutated individuals with the highest-fitness individual saved in step S3 to obtain a new generation of n individuals.
3. The deep convolutional neural network weight optimization method of claim 1 or 2, wherein the preset number of steps in step S2 is set by the following formula:
[formula image not reproduced in the text]
where ga_step is the interval, in steps, at which the genetic algorithm is applied, and step is the number of training steps.
4. The deep convolutional neural network weight optimization method of claim 1 or 2, wherein obtaining the initial population in step S1 specifically comprises: randomly generating an initial population according to preset parameters.
5. The deep convolutional neural network weight optimization method of claim 1 or 2, wherein the initialization in step S1 uses a normal-distribution initialization method.
6. The deep convolutional neural network weight optimization method of claim 2, wherein the crossover operation in step S42 uses single-point crossover: a crossover point is randomly selected and the gene sequence is split there, and the paired parents then exchange gene segments to form new offspring.
CN202010014858.9A 2020-01-07 2020-01-07 Weight optimization method for deep convolutional neural network Pending CN111242281A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202010014858.9A CN111242281A (en) 2020-01-07 2020-01-07 Weight optimization method for deep convolutional neural network

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202010014858.9A CN111242281A (en) 2020-01-07 2020-01-07 Weight optimization method for deep convolutional neural network

Publications (1)

Publication Number Publication Date
CN111242281A (en) 2020-06-05

Family

ID=70877602

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202010014858.9A Pending CN111242281A (en) 2020-01-07 2020-01-07 Weight optimization method for deep convolutional neural network

Country Status (1)

Country Link
CN (1) CN111242281A (en)

Cited By (8)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN112446432A (en) * 2020-11-30 2021-03-05 西安电子科技大学 Handwritten picture classification method based on quantum self-learning self-training network
CN112446432B (en) * 2020-11-30 2023-06-30 西安电子科技大学 Handwriting picture classification method based on quantum self-learning self-training network
WO2022127393A1 (en) * 2020-12-15 2022-06-23 International Business Machines Corporation Reinforcement learning for testing suite generation
GB2617737A (en) * 2020-12-15 2023-10-18 Ibm Reinforcement learning for testing suite generation
CN112580259A (en) * 2020-12-16 2021-03-30 天津水泥工业设计研究院有限公司 Intelligent mine automatic ore blending method and system based on genetic algorithm
CN112580259B (en) * 2020-12-16 2022-05-13 天津水泥工业设计研究院有限公司 Intelligent mine automatic ore blending method and system based on genetic algorithm
CN114239792A (en) * 2021-11-01 2022-03-25 荣耀终端有限公司 Model quantization method, device and storage medium
CN114879494A (en) * 2022-04-25 2022-08-09 复旦大学 Robot self-adaptive design method based on evolution and learning

Similar Documents

Publication Publication Date Title
CN111242281A (en) Weight optimization method for deep convolutional neural network
CN104751842B (en) The optimization method and system of deep neural network
CN108985515B (en) New energy output prediction method and system based on independent cyclic neural network
CN110222830B (en) Deep feed-forward network fault diagnosis method based on adaptive genetic algorithm optimization
CN113688573A (en) Extension spring optimization method based on improved black widow spider algorithm
CN112330487A (en) Photovoltaic power generation short-term power prediction method
CN113095477A (en) Wind power prediction method based on DE-BP neural network
CN117349732A (en) High-flow humidification therapeutic apparatus management method and system based on artificial intelligence
CN110188861A (en) A kind of Web service combination optimization method based on I-PGA algorithm
CN116401037B (en) Genetic algorithm-based multi-task scheduling method and system
CN113705098A (en) Air duct heater modeling method based on PCA and GA-BP network
CN116665786A (en) RNA layered embedding clustering method based on graph convolution neural network
CN117253037A (en) Semantic segmentation model structure searching method, automatic semantic segmentation method and system
CN117518792A (en) Ship motion non-parametric modeling method based on improved Gaussian process regression
CN113807005B (en) Bearing residual life prediction method based on improved FPA-DBN
CN115169754A (en) Energy scheduling method and device, electronic equipment and storage medium
CN112819161A (en) Variable-length gene genetic algorithm-based neural network construction system, method and storage medium
Sivaraj et al. An efficient grouping genetic algorithm
CN111639797A (en) Gumbel-softmax technology-based combined optimization method
Londt et al. A Two-Stage Hybrid GA-Cellular Encoding Approach to Neural Architecture Search
Garay et al. A GH-SOM optimization with SOM labelling and dunn index
Takaishi et al. Percolative Learning: Time-Series Prediction from Future Tendencies
CN113780575B (en) Visual classification method based on progressive deep learning model
CN112270952B (en) Method for identifying cancer drive pathway
CN117951558A (en) RBF water quality parameter spatial distribution prediction method and system based on adjacent points

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
RJ01 Rejection of invention patent application after publication

Application publication date: 20200605