CN113762370A - Depth network set generation method combined with Gaussian random field - Google Patents
- Publication number
- CN113762370A (application CN202111001978.6A)
- Authority
- CN
- China
- Prior art keywords
- individual
- individuals
- value
- accuracy
- network
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Withdrawn
Classifications
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F18/00—Pattern recognition
- G06F18/20—Analysing
- G06F18/21—Design or setup of recognition systems or techniques; Extraction of features in feature space; Blind source separation
- G06F18/214—Generating training patterns; Bootstrap methods, e.g. bagging or boosting
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06N—COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N3/00—Computing arrangements based on biological models
- G06N3/02—Neural networks
- G06N3/04—Architecture, e.g. interconnection topology
- G06N3/045—Combinations of networks
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06N—COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N3/00—Computing arrangements based on biological models
- G06N3/02—Neural networks
- G06N3/08—Learning methods
Abstract
The invention discloses a deep network set generation method combined with a Gaussian random field, which uses a pre-screening strategy based on the Gaussian random field to accelerate the generation of a deep neural network set. The method comprises the following steps. Step one: train a Gaussian random field model using preprocessed neural networks. Step two: initialize the neural network set to generate a network set with higher accuracy. Step three: optimize the neural network set in combination with the Gaussian random field model. Step four: compress the network set. The method predicts the fitness of a neural network with the Gaussian random field model and, combined with the pre-screening strategy, reduces the number of fitness evaluations, so that the neural network set is obtained more quickly and model performance is improved.
Description
Technical Field
The invention belongs to the field of machine learning algorithms and relates to a deep network set generation method combined with a Gaussian random field.
Background
With the development of modern science and technology and advances in productivity, computer performance has improved greatly, and acquiring massive amounts of data is no longer difficult. Deep learning, now practically synonymous with artificial neural networks, has become the most active research direction in machine learning, and algorithms for speeding up neural network training have been proposed one after another. On this basis, neural networks have developed rapidly, and more and more deep network models have been proposed and widely applied in fields such as face recognition, image classification and natural language processing.
The neural network model simulates the working principle of neurons in the human brain: it performs a series of calculations on the input information and finally extracts useful information as the output. A classical fully connected neural network can be roughly divided into an input layer, hidden layers and an output layer. Each layer of the fully connected network comprises a number of nodes; each node is connected to all nodes of the adjacent layers, and nodes within the same layer are not connected to each other. Each layer processes the information from the previous layer and passes it to the next layer, up to the output layer. The information held by the output layer nodes is the information the neural network has extracted from the original input, and it can be used for the corresponding task. However, calculating the accuracy of a neural network model usually requires constructing the network and computing it in full, which takes a large amount of time, so the training speed of the neural network model is difficult to improve.
If a Gaussian random field model is used to predict accuracy, model performance can be improved and training time reduced. The basic principle is that, before the true fitness function is evaluated, a Gaussian random field model is built from the solutions already evaluated and is then used to predict the function value of an unknown solution. By setting a pre-screening rule, only solutions with greater room for improvement are retained, which reduces the number of true fitness evaluations the evolutionary algorithm performs during optimization and increases the overall training speed of the model.
From the above background, obtaining a set of neural network models with high accuracy and large differences between its members takes a great deal of computation time, and the resulting set of models is too large to store conveniently.
Disclosure of Invention
In order to reduce the execution time of the differential evolution algorithm, the invention provides a pre-screening strategy combined with a Gaussian random field model, which reduces unnecessary evaluations of the fitness function and thereby improves the overall training speed of the model.
The technical scheme of the invention is as follows:
A deep network set generation method combined with a Gaussian random field. First, a Gaussian random field model is trained on neural network individuals whose exact fitness has already been calculated; this model is used to predict the fitness of individuals. Next, a differential evolution algorithm initializes the network set to obtain a group of networks with higher accuracy. Then, starting from the generated network set, two objective functions are constructed from the two indexes of difference and accuracy, so that the task becomes a multi-objective problem solved with a multi-objective differential evolution algorithm. In the selection stage of the evolutionary algorithm, the value of the fitness function is predicted with the established Gaussian random field model, and the pre-screening strategy decides whether the exact fitness is calculated, which improves the overall efficiency of the method. For the final network set, a clustering algorithm reduces the size of the set so that it is easier to store.
The method comprises the following steps:
step one, constructing Gaussian random field prediction models for the two indexes, accuracy and difference, using neural network individuals whose fitness functions have been calculated exactly;
step two, performing population initialization with a single-objective differential evolution algorithm to obtain a neural network set with higher accuracy, i.e. initializing the neural network set; predicting the accuracy of mutated individuals with the Gaussian random field prediction model established in step one;
step three, further optimizing the neural network set obtained by the initialization in step two;
calculating the fitness function of each individual in the initial population and applying Gaussian mutation to the individual; predicting the fitness function of the new individual with the Gaussian random field prediction model and judging it with the pre-screening strategy; updating the reference point and the neighborhood of the individual; judging whether to update the external storage set; outputting the external storage set;
step four, reducing the size of the network set with a clustering algorithm;
acquiring the external storage set output in step three; narrowing the selection range of the networks with a clustering algorithm, clustering according to the accuracy and difference of each network in the set; and selecting a central network from each cluster, constructing a new deep neural network set and outputting it as the final result.
Further, the deep network set generation method combined with a Gaussian random field specifically comprises the following steps:
Step one: build a prediction model in combination with the Gaussian random field.
Gaussian random field prediction models are constructed separately for the two indexes, accuracy and difference, using neural network individuals whose exact fitness functions have been calculated. First, a kernel function for Gaussian process regression is defined, which supplies the covariance between input individuals. Then Gaussian process regression is fitted to the neural network individuals and their fitness values to obtain a regression model. Finally, the hyper-parameters of the kernel function are optimized by maximizing the log-marginal likelihood with the passed optimizer, yielding the final Gaussian random field prediction model.
Further, the first step specifically comprises:
step 1.1, acquiring a set of neural network individuals whose fitness functions have been calculated exactly, establishing the accuracy set and the difference set of the individuals, and ensuring that the sets have the same length;
step 1.2, defining the kernel function for Gaussian process regression, which supplies the covariance between input individuals; the kernel function is WhiteKernel, which estimates the noise of the objective function and reduces its influence;
step 1.3, fitting Gaussian process regression to the neural network individuals and their exact fitness function values to obtain a regression model;
step 1.4, optimizing the hyper-parameters of the kernel function by maximizing the log-marginal likelihood with the passed optimizer; because the log-marginal likelihood may have several local optima, the optimization is repeated several times by specifying the parameter n_restarts_optimizer; the first optimization run starts from the initial hyper-parameter values of the kernel; in subsequent runs the hyper-parameter values are chosen at random from the allowed range;
step 1.5, obtaining the prediction model combined with the Gaussian random field, and establishing the reference network set S_eval of the Gaussian random field model, which is used subsequently to predict the fitness of individuals and is updated in real time during prediction.
Still further, the step 1.3 further comprises:
step 1.3.1: taking a neural network individual x from the set obtained in step 1.1 as the independent variable and the corresponding fitness function value as the dependent variable y, and constructing the objective function y = g(x);
step 1.3.2: following Gaussian process regression, assuming that the objective function y = g(x) follows a normal distribution with mean μ and variance δ^2;
step 1.3.3: constructing the likelihood function used to fit the normal distribution; the likelihood function (PDF) is shown in formula (1.1):
$$\frac{1}{(2\pi\delta^{2})^{K/2}\sqrt{\det(C)}}\exp\left[-\frac{(y-\mu\mathbf{1})^{T}C^{-1}(y-\mu\mathbf{1})}{2\delta^{2}}\right] \qquad (1.1)$$
wherein exp denotes the exponential function with base e, and det denotes the determinant of a matrix; the matrix C is a K × K matrix whose entries are $C_{i,j} = c(x_i, x_j)$, where $c(x_i, x_j)$ is the value of the correlation function; for arbitrary independent variables $x, x' \in R^n$, where $R^n$ denotes the n-dimensional real space, the correlation function $c(x, x') = \exp[-d(x, x')]$ characterizes the correlation between the objective function values g(x) and g(x') corresponding to x and x'; the distance function is $d(x, x') = \sum_{i=1}^{n} \theta_i |x_i - x'_i|^{p_i}$, in which $\theta_i$ and $p_i$ are hyper-parameters controlling the distance function and are independent of x and x', so that the value of the correlation function depends only on the magnitude of (x - x'): the greater (x - x'), the smaller the correlation, and vice versa; the vector $y = (y_1, y_2, \ldots, y_K)^T$, and $\mathbf{1}$ is a K-dimensional column vector of ones;
step 1.3.4: according to the properties of the normal distribution, maximizing the likelihood function gives the estimates of the mean and the variance:
$$\hat{\mu} = \frac{\mathbf{1}^{T}C^{-1}y}{\mathbf{1}^{T}C^{-1}\mathbf{1}}, \qquad \hat{\delta}^{2} = \frac{(y-\mathbf{1}\hat{\mu})^{T}C^{-1}(y-\mathbf{1}\hat{\mu})}{K}$$
the unbiased estimate of the objective function g(x) is
$$\hat{g}(x) = \hat{\mu} + r^{T}C^{-1}(y-\mathbf{1}\hat{\mu})$$
and its variance is
$$\hat{s}^{2}(x) = \hat{\delta}^{2}\left[1 - r^{T}C^{-1}r + \frac{(1-\mathbf{1}^{T}C^{-1}r)^{2}}{\mathbf{1}^{T}C^{-1}\mathbf{1}}\right]$$
wherein $r = (c(x, x_1), c(x, x_2), \ldots, c(x, x_K))^T$, $\hat{\mu}$ denotes the estimate of the mean μ, and $\hat{\delta}^{2}$ denotes the estimate of the variance δ^2; the objective function g(x) is then considered to follow the normal distribution $N(\hat{g}(x), \hat{s}^{2}(x))$;
step 1.3.5: fitting the Gaussian random field prediction model to the neural network individuals and fitness values used in step 1.3.1 to obtain the final regression model.
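The estimates in step 1.3.4 can be illustrated with a minimal pure-Python sketch. It assumes the standard Kriging forms for the mean, variance and predictor, fixed hyper-parameters theta and p (no likelihood optimization), and a tiny diagonal `nugget` added for numerical stability; the function names `correlation`, `solve`, `dot` and `fit_predict` are hypothetical and not taken from the patent.

```python
import math


def correlation(x, x2, theta, p):
    """c(x, x') = exp(-d(x, x')) with d = sum_i theta_i * |x_i - x'_i| ** p_i."""
    d = sum(t * abs(a - b) ** q for t, q, a, b in zip(theta, p, x, x2))
    return math.exp(-d)


def solve(A, b):
    """Solve A z = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    M = [row[:] + [b[i]] for i, row in enumerate(A)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(M[r][col]))
        M[col], M[pivot] = M[pivot], M[col]
        for r in range(col + 1, n):
            factor = M[r][col] / M[col][col]
            for k in range(col, n + 1):
                M[r][k] -= factor * M[col][k]
    z = [0.0] * n
    for i in range(n - 1, -1, -1):
        s = sum(M[i][k] * z[k] for k in range(i + 1, n))
        z[i] = (M[i][n] - s) / M[i][i]
    return z


def dot(u, v):
    return sum(a * b for a, b in zip(u, v))


def fit_predict(X, y, x_new, theta, p, nugget=1e-10):
    """Return (g_hat, s2): predicted fitness and predictive variance at x_new."""
    K = len(X)
    # Correlation matrix C; the nugget is an assumption for numerical stability.
    C = [[correlation(X[i], X[j], theta, p) + (nugget if i == j else 0.0)
          for j in range(K)] for i in range(K)]
    ones = [1.0] * K
    Cinv_1 = solve(C, ones)
    mu = dot(ones, solve(C, y)) / dot(ones, Cinv_1)       # estimate of the mean
    resid = [yi - mu for yi in y]
    Cinv_resid = solve(C, resid)
    delta2 = dot(resid, Cinv_resid) / K                    # estimate of the variance
    r = [correlation(x_new, xi, theta, p) for xi in X]
    Cinv_r = solve(C, r)
    g_hat = mu + dot(r, Cinv_resid)
    s2 = delta2 * (1.0 - dot(r, Cinv_r)
                   + (1.0 - dot(ones, Cinv_r)) ** 2 / dot(ones, Cinv_1))
    return g_hat, max(s2, 0.0)


# Toy data: three one-dimensional "individuals" with known fitness values.
X = [[0.0], [1.0], [2.0]]
y = [1.0, 0.5, 2.0]
g_at_known, s2_at_known = fit_predict(X, y, [1.0], theta=[1.0], p=[2.0])
g_far, s2_far = fit_predict(X, y, [100.0], theta=[1.0], p=[2.0])
```

At an already evaluated individual the predictor reproduces the stored fitness with near-zero variance, while far from all evaluated individuals the variance is strictly positive; this is the uncertainty that the pre-screening rule relies on.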
Step two: initialize the neural network set.
First, hyper-parameters of the deep convolutional network, such as the size and number of convolution kernels, the filter size and stride of the pooling layers and the number of nodes in the fully connected layers, are encoded as individuals of the evolutionary algorithm. Then a single-objective evolutionary algorithm, with accuracy as the only objective function, generates a series of neural networks of sufficiently high accuracy. When the accuracy of the individuals in the set exceeds the control threshold, i.e. the minimum output accuracy r_1 of the single-objective algorithm, the calculation ends and the last generation of individuals is output as the initial population of the subsequent multi-objective algorithm. During this process, the accuracy of each mutated individual is predicted with the Gaussian random field prediction model established in step one, and the pre-screening strategy decides whether the true accuracy is calculated.
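The encoding described above can be sketched as a mapping from a real-coded individual to integer network hyper-parameters. The field names, the bounds and the rounding rule below are illustrative assumptions; the text lists the kinds of hyper-parameters but does not give their ranges.

```python
# (name, lower bound, upper bound) per dimension.
# These ranges are hypothetical placeholders, not values from the patent.
SEARCH_SPACE = [
    ("conv_kernel_size", 2, 7),
    ("conv_kernel_count", 8, 64),
    ("pool_filter_size", 2, 4),
    ("pool_stride", 1, 3),
    ("fc_nodes", 32, 512),
]


def decode(individual):
    """Round each real-valued gene to an integer and clip it to its bounds."""
    config = {}
    for gene, (name, low, high) in zip(individual, SEARCH_SPACE):
        config[name] = int(min(max(round(gene), low), high))
    return config


example = decode([3.4, 100.0, 1.2, 2.0, 128.7])
```

Keeping the individual real-valued and decoding only when a network is built lets the differential evolution operators work on continuous vectors, which is one plausible reading of the real number codes mentioned in step 2.1.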
Further, the second step specifically comprises:
step 2.1, generating a plurality of individuals uniformly at random in the decision space Ω, wherein every individual consists of real number codes corresponding to the hyper-parameters of a neural network structure, as shown in formula (2.1):
$$x_i(0) = (x_{i,1}(0), x_{i,2}(0), \ldots, x_{i,d}(0)), \quad i = 1, 2, 3, \ldots, M, \quad d = 1, 2, 3, \ldots, V \qquad (2.1)$$
wherein M represents the number of individuals in the generated population and V represents the maximum dimension of the decision space Ω; the j-th dimension of the i-th individual is initialized as shown in formula (2.2):
$$x_{i,j}(0) = L_{j\_min} + \mathrm{rand}(0,1)\,(L_{j\_max} - L_{j\_min}), \quad i = 1, 2, 3, \ldots, M, \quad j = 1, 2, 3, \ldots, d \qquad (2.2)$$
wherein $L_{j\_min}$ and $L_{j\_max}$ respectively represent the lower and upper bounds of the j-th dimension of the parameter vector, and rand(0, 1) represents a random number between 0 and 1;
step 2.2, performing the mutation operation of the population initialization algorithm, namely randomly selecting different individuals from the population, scaling the vector difference of two of them, and combining the result with a base individual by vector addition, as shown in formula (2.3):
$$x'_i(g) = x_{r1}(g) + F \cdot (x_{r2}(g) - x_{r3}(g)) \qquad (2.3)$$
wherein the scaling factor F ∈ [0, 2] and r1 ≠ r2 ≠ r3 ≠ i; $x_{ri}(g)$ represents an individual before mutation and $x'_i(g)$ represents the new individual generated by mutation; boundary control is applied to the newly generated individual during mutation;
step 2.3, performing the crossover operation of the population initialization, wherein the value of each dimension of the crossed individual is chosen at random from the corresponding dimension of either the mutated individual or the original individual, as shown in formula (2.4):
$$x''_{i,j}(g) = \begin{cases} x'_{i,j}(g), & \mathrm{rand}(0,1) \le cr \\ x_{i,j}(g), & \text{otherwise} \end{cases} \qquad (2.4)$$
wherein the crossover probability cr ∈ [0, 1] and $x''_{i,j}$ represents the j-th dimension value of the crossed individual $x''_i$.
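Steps 2.1 to 2.3 can be sketched as three small functions. This is a minimal illustration under stated assumptions: boundary control is done by clipping, crossover is a plain per-dimension choice with probability cr, and the names `init_population`, `mutate` and `crossover` as well as the bounds and parameter values are hypothetical.

```python
import random


def init_population(M, lower, upper, rng):
    """Formula (2.2): each gene is lower + rand(0, 1) * (upper - lower)."""
    return [[lo + rng.random() * (hi - lo) for lo, hi in zip(lower, upper)]
            for _ in range(M)]


def mutate(population, i, F, lower, upper, rng):
    """Formula (2.3) with r1, r2, r3 distinct and different from i."""
    candidates = [k for k in range(len(population)) if k != i]
    r1, r2, r3 = rng.sample(candidates, 3)
    mutant = [a + F * (b - c)
              for a, b, c in zip(population[r1], population[r2], population[r3])]
    # Boundary control (assumed here to be simple clipping).
    return [min(max(v, lo), hi) for v, lo, hi in zip(mutant, lower, upper)]


def crossover(parent, mutant, cr, rng):
    """Per dimension, take the mutant's value with probability cr, else the parent's."""
    return [m if rng.random() <= cr else p for p, m in zip(parent, mutant)]


rng = random.Random(0)
lower, upper = [2.0, 8.0, 32.0], [7.0, 64.0, 512.0]   # illustrative bounds
population = init_population(6, lower, upper, rng)
mutant = mutate(population, 0, 0.5, lower, upper, rng)
trial = crossover(population[0], mutant, 0.9, rng)
```

The population needs at least four individuals so that three indices different from i can be drawn.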
Step 2.4, performing the selection operation of the population initialization: the Gaussian random field prediction model established in step one is called, and the mutated individuals are pre-screened by a pre-screening rule; the pre-screening rule is based on the probability of improvement defined in formula (2.5):
$$PoI(x) = \Phi\left(\frac{f_{min} - \hat{g}(x)}{\hat{s}(x)}\right) \qquad (2.5)$$
wherein x represents an individual solution vector, $\hat{g}(x)$ represents the unbiased estimate of the objective function, $f_{min}$ represents the minimum value of the fitness function, $\hat{s}(x)$ represents the square root of the estimated variance, Φ is the standard normal cumulative distribution function, and PoI(x) represents the probability of improvement for the individual solution vector x;
selecting M/2 individuals by roulette according to the PoI values of all mutated individuals for evaluation with the true fitness function, comparing their fitness function values with those of the corresponding parent individuals, keeping the better individual according to the greedy rule, and constructing the new generation of the population, wherein the selection rule is shown in formula (2.6):
$$x_i(g+1) = \begin{cases} x''_i(g), & f(x''_i(g)) \le f(x_i(g)) \\ x_i(g), & \text{otherwise} \end{cases} \qquad (2.6)$$
wherein f(x) represents the fitness function value of the individual x;
when an individual in the population is updated, the new individual is added to the reference network set S_eval of the Gaussian random field model; if no change occurs, no update is made.
Step 2.5: checking the output accuracy of the population; when the accuracy of the networks corresponding to all individuals in the population is greater than the minimum output accuracy r_1 = 0.9 of the single-objective algorithm, the algorithm terminates and the final generation of the population is output; otherwise, return to step 2.2.
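The selection in step 2.4 can be sketched as follows. The sketch assumes that the fitness is minimized (for example, an error rate), that the probability of improvement is the normal CDF of (f_min - predicted mean) / predicted standard deviation, and that roulette selection draws distinct individuals; the helper names `poi`, `roulette_select` and `greedy_select` are hypothetical.

```python
import math
import random


def poi(g_hat, s_hat, f_min):
    """Probability that the true fitness is below f_min under N(g_hat, s_hat^2)."""
    if s_hat <= 0.0:
        return 1.0 if g_hat < f_min else 0.0
    z = (f_min - g_hat) / s_hat
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))


def roulette_select(scores, count, rng):
    """Draw `count` distinct indices with probability proportional to score."""
    remaining = list(range(len(scores)))
    chosen = []
    for _ in range(min(count, len(remaining))):
        # A tiny offset keeps the wheel valid when every score is zero.
        weights = [scores[k] + 1e-12 for k in remaining]
        pick = rng.choices(remaining, weights=weights, k=1)[0]
        chosen.append(pick)
        remaining.remove(pick)
    return chosen


def greedy_select(parent, parent_f, trial, trial_f):
    """Keep the trial individual only if its true fitness is no worse."""
    return (trial, trial_f) if trial_f <= parent_f else (parent, parent_f)


scores = [poi(0.12, 0.05, 0.10), poi(0.08, 0.05, 0.10),
          poi(0.10, 0.02, 0.10), poi(0.30, 0.01, 0.10)]
to_evaluate = roulette_select(scores, len(scores) // 2, random.Random(1))
```

Only the individuals in `to_evaluate` would be trained and scored for real; the rest are skipped, which is where the saving in fitness evaluations comes from.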
Step three: optimize the network set with the multi-objective differential evolution algorithm. First, the last generation of individuals from step two is taken as the initial population of the multi-objective differential evolution algorithm. Then the fitness function of each individual in the initial population is calculated and the individual is placed in the external storage set. The adjacent sub-problems of each individual are calculated and Gaussian mutation is applied to the individual. The fitness function of each new individual is predicted with the Gaussian random field prediction model constructed in step one and judged with the pre-screening strategy, and the reference point and neighborhood of the individual are updated. Finally, the accuracy is compared with the control threshold, i.e. the minimum accuracy r_2 with which the multi-objective algorithm controls joining the external storage set, to decide whether the external storage set is updated. When the accuracy of every individual in the external storage set is higher than the control threshold, i.e. the minimum output accuracy r_3 of the multi-objective algorithm, or the number of generations exceeds the generation threshold, the loop ends and the external storage set is output.
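The control flow of one generation of step three can be sketched as below. The thresholds (PoI above 0.5, r_2 = 0.9, r_3 = 0.95, at most 20 generations) are the values quoted in the detailed steps; the function names, the callback signatures and the guard against an empty external set are assumptions made for illustration.

```python
POI_THRESHOLD = 0.5    # evaluate for real only if PoI exceeds this
R2 = 0.9               # minimum accuracy to update reference point / neighborhood
R3 = 0.95              # minimum accuracy to be saved for the external set
MAX_GENERATIONS = 20   # generation threshold


def process_generation(mutants, predict_poi, true_accuracy, archive):
    """Pre-screen mutants, evaluate the survivors, and extend the external set."""
    evaluated = 0
    eligible_for_update = []   # may update the reference point and neighborhood
    pending = []               # joins the external set after the whole generation
    for individual in mutants:
        if predict_poi(individual) <= POI_THRESHOLD:
            continue           # discarded without an expensive evaluation
        accuracy = true_accuracy(individual)
        evaluated += 1
        if accuracy > R2:
            eligible_for_update.append(individual)
        if accuracy > R3:
            pending.append((individual, accuracy))
    archive.extend(pending)
    return evaluated, eligible_for_update


def should_stop(archive, generation):
    """Stop when every archived accuracy exceeds R3 or the generation limit is passed."""
    # Treating an empty archive as "not finished" is an assumption of this sketch.
    all_good = bool(archive) and all(acc > R3 for _, acc in archive)
    return all_good or generation > MAX_GENERATIONS


# Toy run: each "individual" is a number that doubles as its PoI and its accuracy.
archive = []
evaluated, eligible = process_generation(
    [0.2, 0.6, 0.8, 0.97], lambda x: x, lambda x: x, archive)
```

In the toy run one of the four mutants is rejected by the surrogate alone, so only three true evaluations are needed.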
Further, the third step specifically comprises:
step 3.1: taking the neural network individuals of the last generation of the single-objective algorithm obtained in step two as the initial population of the multi-objective algorithm;
step 3.2: calculating the fitness function of each individual of the population, and storing the individuals in the external storage set outEP and in the reference network set S_eval of the Gaussian random field model;
step 3.3: generating adjacent sub-problems: the multi-objective differential evolution algorithm generates a set of evenly distributed weight vectors {λ_1, ..., λ_M} for all sub-problems, wherein each component of the weight vector of the i-th sub-problem represents a single weight value; the T sub-problems closest to each sub-problem, i.e. its neighborhood, are obtained by calculating the Euclidean distances between the weight vectors of the sub-problems;
step 3.4: Gaussian mutation of the neural network individuals: during the evolution of a generation of the population, two indexes p and q are repeatedly selected at random from the neighborhood of the i-th sub-problem, and the corresponding individuals are obtained; a mutated individual is obtained according to the basic mutation formula of the differential evolution algorithm, and, with a given probability, a Gaussian random variable is added to each dimension value of the mutated individual, as shown in formula (3.1):
wherein the scaling factor F ∈ [0, 2]; rnd_U(0, 1) represents a number obtained by uniform random sampling from the range [0, 1]; rnd_G(0, σ) represents a Gaussian random vector with mean 0 and standard deviation σ, where σ is one twentieth of the value range of the corresponding dimension element; the value is 0.5; the remaining symbols in the formula denote the individual after mutation and the original individual;
after every individual in the population has been traversed, the loop ends and the next step begins;
step 3.5: predicting the fitness of the new individual with the Gaussian random field prediction model and calculating its PoI value according to formula (2.5); if PoI is greater than 0.5, the true fitness of the individual is calculated; otherwise, the mutated individual is discarded and prediction continues with the next mutated individual;
step 3.6: updating the reference point and the neighborhood: before the reference point is updated, it is first judged whether the accuracy of the network corresponding to the mutated individual is greater than the threshold, i.e. the minimum accuracy r_2 = 0.9 with which the multi-objective algorithm controls joining the external storage set; if the accuracy is greater than r_2, the reference point is updated, otherwise it is not; the reference point is updated component by component:
wherein f_i(x) represents the fitness function value of the corresponding individual;
when the neighborhood replacement operation is carried out, the T neighborhood individuals of the i-th individual are examined, and when formula (3.3) is satisfied, the corresponding neighborhood individual is replaced by the new mutated individual;
wherein the index in the formula ranges over the elements of the neighborhood B(i) corresponding to the i-th individual;
the neighborhood replacement operation is performed only when the accuracy of the network corresponding to the mutated individual is greater than the threshold r_2 = 0.9 and the replacement condition shown in formula (3.3) is satisfied;
step 3.7: after the update in step 3.6, the mutated individual is added to the reference network set S_eval of the Gaussian random field model;
step 3.8: continuing the loop until this generation of individuals has completed its iteration;
step 3.9: updating the external set outEP; in each generation of the multi-objective algorithm, the mutated individuals whose network accuracy is greater than the minimum output accuracy r_3 = 0.95 of the multi-objective algorithm are first saved, and they are added to the external storage set after all mutations of the current population are finished; the generation counter is then incremented, G = G + 1;
step 3.10: when the accuracy of every individual in the external storage set is greater than the output accuracy threshold, i.e. the minimum output accuracy r_3 = 0.95 of the multi-objective algorithm, or the generation counter G is greater than the control threshold 20, the algorithm ends and all individuals in the external storage set are output; when the condition is not met, return to step 3.4 and continue.
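The neighborhood construction of step 3.3 can be sketched for the two objectives used here (accuracy and difference). The even spacing rule, the inclusion of each sub-problem in its own neighborhood, and the names `uniform_weights` and `neighborhoods` are assumptions; the text only states that the weight vectors are evenly distributed and that the T closest sub-problems by Euclidean distance form the neighborhood.

```python
import math


def uniform_weights(M):
    """M evenly spaced two-objective weight vectors (w, 1 - w)."""
    if M == 1:
        return [(0.5, 0.5)]
    return [(k / (M - 1), 1.0 - k / (M - 1)) for k in range(M)]


def neighborhoods(weights, T):
    """For each sub-problem, the indices of the T closest weight vectors."""
    result = []
    for w in weights:
        order = sorted(range(len(weights)),
                       key=lambda j: math.dist(w, weights[j]))
        result.append(order[:T])
    return result


weights = uniform_weights(5)
B = neighborhoods(weights, 3)
```

Because the distance from a weight vector to itself is zero, each sub-problem appears first in its own neighborhood in this sketch; the indexes p and q of step 3.4 would then be drawn from `B[i]`.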
Step four: reduce the size of the network set with a clustering algorithm. First, the external storage set output in step three is taken as the network set for this step. Then a clustering algorithm narrows the selection range of the networks: K clusters are formed according to the two objective function values, accuracy and difference, of each network in the set, so that networks lying close to each other are grouped into the same cluster. Finally, a central network is selected from each cluster, and the new deep neural network set is constructed and output as the final result.
Further, the fourth step further comprises:
step 4.1: obtaining the difference and the accuracy of each deep neural network in the external storage set outEP from step three to generate a data set;
step 4.2: randomly initializing the centers of the K clusters;
step 4.3: constructing the initial cluster center matrix;
step 4.4: traversing the distances between every data point in the data set and the K cluster centers, the distance between two individuals being measured with the Euclidean distance as shown in formula (4.1):
$$dis = \sqrt{\sum_{i=1}^{2}\left(f_i(x_1) - f_i(x_2)\right)^{2}} \qquad (4.1)$$
wherein x_1 and x_2 represent two different network individuals, f_i(x) represents the i-th objective function value of the network individual x, i takes the value 1 or 2, and dis represents the distance between the two individuals;
step 4.5: finding the minimum distance and determining whether to update the cluster center matrix; if a smaller minimum distance is found, the cluster center individual is updated; the K clusters are traversed, and the loop ends after every cluster has been traversed;
step 4.6: selecting the central individual from each cluster and combining the selected individuals as the final output set of deep neural networks.
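Step four can be sketched with a plain K-means loop over (accuracy, difference) pairs, followed by picking the network nearest to each cluster center. The text does not name the clustering algorithm, so standard Lloyd-style K-means is used here as a stand-in; `kmeans_select`, the iteration cap and the sample data are hypothetical.

```python
import math
import random


def kmeans_select(points, K, rng, iterations=50):
    """Cluster the points and return the index of the point nearest each center."""
    centres = rng.sample(points, K)          # random initial centers (step 4.2)
    assignment = [0] * len(points)
    for _ in range(iterations):
        new_assignment = [
            min(range(K), key=lambda c: math.dist(p, centres[c])) for p in points
        ]
        for c in range(K):
            members = [p for p, a in zip(points, new_assignment) if a == c]
            if members:                      # keep the old center if a cluster is empty
                centres[c] = tuple(sum(v) / len(members) for v in zip(*members))
        if new_assignment == assignment:
            break
        assignment = new_assignment
    selected = []
    for c in range(K):
        members = [i for i, a in enumerate(assignment) if a == c]
        if members:
            selected.append(
                min(members, key=lambda i: math.dist(points[i], centres[c])))
    return sorted(selected)


# Six networks described by (accuracy, difference); values are made up.
networks = [(0.95, 0.10), (0.96, 0.11), (0.97, 0.12),
            (0.99, 0.50), (0.98, 0.51), (0.97, 0.52)]
kept = kmeans_select(networks, 2, random.Random(0))
```

The returned indices identify the central network of each cluster, so only K networks need to be stored instead of the whole external set.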
The invention has the beneficial effects that: the method introduces the Gaussian random field model to predict the fitness function of the deep neural network, and combines the pre-screening strategy to reduce the calculation times, so that a proper deep neural network set can be obtained more quickly and accurately, the training speed of the model is accelerated, and the performance of the model is improved.
Drawings
FIG. 1 is a flow chart of a method for generating a deep network set by combining Gaussian random fields according to the invention.
FIG. 2 is an algorithm flow diagram of the population initialization strategy of the present invention.
Detailed Description
The following detailed description of embodiments of the invention refers to the accompanying drawings.
The embodiment of the invention was carried out on laboratory equipment, using a PC running Windows 10. The CPU used for the experiment was an Intel Core i7-7700K processor with 4 cores and 8 threads, a base frequency of 4.2 GHz and a turbo frequency of 4.5 GHz. The GPU was an NVIDIA GTX 1080 Ti with 3584 CUDA cores, 11 GB of video memory and a 352-bit memory bus. The PC was also equipped with 16 GB of memory and a 1 TB hard disk.
The data set used in the invention is the MNIST data set. MNIST (Modified National Institute of Standards and Technology database) is a classic data set in the field of computer vision; it contains 70000 grayscale pictures of handwritten digits in total, each of 28 × 28 pixels. In this task, each picture has a label, which is the actual digit shown in the handwritten picture. The entire MNIST data set is divided into two parts, a training data set of 60000 pictures and a test data set of 10000 pictures. The training data set is further divided into a training set of 55000 pictures and a validation set of 5000 pictures.
The meanings of some parameters used in the steps of the invention are as follows:
PoI: the probability of improvement, derived from the Gaussian random field model.
r1: the minimum accuracy that controls the output of the single-objective algorithm.
r2: the minimum accuracy required by the multi-objective algorithm for joining the external storage set.
r3: the minimum accuracy that controls the output of the multi-objective algorithm.
Seval: the set of reference network individuals of the Gaussian random field model.
outEP: the external storage set, used for outputting the final individuals.
λi: the evenly distributed weight vector associated with a network individual.
G: the generation count of the multi-objective evolutionary algorithm.
The method for generating the deep network set combined with the Gaussian random field comprises the following specific steps:
Step one: Build a prediction model combined with the Gaussian random field.
Step 1.1: Acquire a set of neural network individuals whose exact fitness function has been calculated, containing 300 network individuals; build the accuracy set and the difference set of the individuals, and ensure that the sets have the same length.
Step 1.2: Define the kernel function of the Gaussian process regression, which conveys the covariance between input individuals. The kernel function uses WhiteKernel, which can estimate the noise of the objective function and reduce its influence.
Step 1.3: Fit the neural network individuals and their fitness values to obtain a regression model.
Step 1.4: The hyperparameters of the kernel function are optimized by maximizing the log-marginal likelihood with the passed optimizer. Because the log-marginal likelihood may have multiple local optima, the optimization can be repeated several times by specifying the parameter n_restarts_optimizer. The first optimization run starts from the initial values of the kernel's hyperparameters; in subsequent runs, the initial hyperparameter values are chosen randomly from the allowed range.
Step 1.5: Obtain the prediction model combined with the Gaussian random field and establish the reference network set $S_{eval}$, which is used subsequently to predict the fitness of individuals and is updated in real time during prediction.
Step 1.3 is divided into the following five steps:
Step 1.3.1: Take a neural network individual x from the set obtained in step 1.1 as the independent variable and the corresponding fitness function value as the dependent variable y, and construct the objective function y = g(x).
Step 1.3.2: Following Gaussian process regression, assume that the objective function y = g(x) obeys a normal distribution with mean $\mu$ and variance $\delta^2$.
Step 1.3.3: Construct the maximum likelihood estimation function for fitting the normal distribution curve. The likelihood function (probability density function, PDF) is shown in equation (1.1):
where exp denotes the exponential function with base e (the base of the natural logarithm), and det denotes the determinant of a matrix. The matrix C is a K × K matrix whose entries are $C_{i,j} = c(x_i, x_j)$, where $c(x_i, x_j)$ is the value of the correlation function. For arbitrary arguments $x, x' \in R^n$, where $R^n$ denotes the n-dimensional real space, the correlation function $c(x, x') = \exp[-d(x, x')]$ characterizes the correlation between the objective function values g(x) and g(x') corresponding to x and x'. In the distance function d(x, x'), $\theta_i$ and $p_i$ are hyperparameters that control the distance function and are independent of x and x'. The value of the correlation function therefore depends only on the size of (x - x'): the larger (x - x'), the smaller the correlation, and vice versa. The vector $y = (y_1, y_2, \ldots, y_K)$, and 1 is a K-dimensional column vector of ones.
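The construction of the correlation matrix C can be sketched as follows. The expression of the distance function is not reproduced in the text above, so this sketch assumes the form commonly used in Kriging models, d(x, x') = Σ θ_i |x_i − x'_i|^{p_i}; the function names and the sample values of θ and p are hypothetical.

```python
import math

def distance(x, x2, theta, p):
    """Assumed distance: d(x, x') = sum_i theta_i * |x_i - x'_i| ** p_i."""
    return sum(t * abs(a - b) ** q for a, b, t, q in zip(x, x2, theta, p))

def correlation(x, x2, theta, p):
    """Correlation c(x, x') = exp(-d(x, x')), as stated in the text."""
    return math.exp(-distance(x, x2, theta, p))

def correlation_matrix(xs, theta, p):
    """K x K matrix C with C[i][j] = c(x_i, x_j)."""
    return [[correlation(a, b, theta, p) for b in xs] for a in xs]

# Three hypothetical individuals in a 2-dimensional decision space.
xs = [[0.0, 0.0], [1.0, 0.0], [3.0, 4.0]]
theta = [0.5, 0.5]  # hypothetical hyperparameter values
p = [2.0, 2.0]      # hypothetical hyperparameter values
C = correlation_matrix(xs, theta, p)
```

The diagonal entries equal 1 because d(x, x) = 0, and individuals that lie further apart receive a smaller correlation, which matches the statement that the correlation decreases as (x - x') grows.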
Step 1.3.4: Maximize the likelihood function according to the properties of the normal distribution. This yields the estimate $\hat{\mu}$ of the mean $\mu$ and the estimate $\hat{\delta}^2$ of the variance $\delta^2$. The unbiased estimate of the objective function g(x) is $\hat{g}(x)$ and its variance is $\hat{s}^2(x)$, both expressed in terms of $\hat{\mu}$, $\hat{\delta}^2$, the matrix C and the vector $r = (c(x, x_1), c(x, x_2), \ldots, c(x, x_K))^T$. At this point, the objective function g(x) can be considered to obey the normal distribution $N(\hat{g}(x), \hat{s}^2(x))$.
Step 1.3.5: Train the Gaussian random field prediction model on the neural network individuals used in step 1.3.1 and their fitness values to obtain the final regression model.
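The equations referred to in steps 1.3.3 and 1.3.4 are not reproduced in this text. For orientation, the standard Kriging forms that agree with the symbols defined above (C, y, 1, r, μ, δ², K) are written out below. They are an assumption based on the usual formulation of Gaussian process regression, not a transcription of the patent's own equations, and the hatted symbols are notation chosen here.

```latex
% Likelihood of the K observed fitness values (assumed form of equation (1.1))
\frac{1}{(2\pi\delta^{2})^{K/2}\sqrt{\det(C)}}
\exp\!\left[-\frac{(y-\mu\mathbf{1})^{T}C^{-1}(y-\mu\mathbf{1})}{2\delta^{2}}\right]

% Maximizing the likelihood gives the estimates of mean and variance
\hat{\mu}=\frac{\mathbf{1}^{T}C^{-1}y}{\mathbf{1}^{T}C^{-1}\mathbf{1}},
\qquad
\hat{\delta}^{2}=\frac{(y-\mathbf{1}\hat{\mu})^{T}C^{-1}(y-\mathbf{1}\hat{\mu})}{K}

% Predictor and predictive variance at a new individual x
\hat{g}(x)=\hat{\mu}+r^{T}C^{-1}(y-\mathbf{1}\hat{\mu}),
\qquad
\hat{s}^{2}(x)=\hat{\delta}^{2}\left[1-r^{T}C^{-1}r+
\frac{(1-\mathbf{1}^{T}C^{-1}r)^{2}}{\mathbf{1}^{T}C^{-1}\mathbf{1}}\right]
```

Under these forms, g(x) at an unevaluated individual is treated as normally distributed with mean ĝ(x) and variance ŝ²(x), which is the statement that closes step 1.3.4.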
Step two: Obtain a neural network set with higher accuracy by a single-objective differential evolution algorithm.
Step 2.1: 100 individuals are generated uniformly at random in the decision space Ω. Each individual consists of the real-number codes of the hyperparameters of the neural network structure to be constructed, as shown in formula (2.1).
$$x_i(0) = (x_{i,1}(0), x_{i,2}(0), \ldots, x_{i,d}(0)), \quad i = 1, 2, 3, \ldots, M, \quad d = 1, 2, 3, \ldots, V \tag{2.1}$$
where M denotes the number of individuals in the population and V denotes the maximum dimension of the decision space Ω. The j-th dimension of the i-th individual is initialized as shown in formula (2.2).
$$x_{i,j}(0) = L_{j\_min} + rand(0,1) \cdot (L_{j\_max} - L_{j\_min}), \quad i = 1, 2, 3, \ldots, M, \quad j = 1, 2, 3, \ldots, d \tag{2.2}$$
where $L_{j\_min}$ and $L_{j\_max}$ denote the lower and upper bounds of the j-th dimension of the parameter vector, respectively, and rand(0, 1) denotes a random number generated between 0 and 1.
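The initialization of formula (2.2) can be sketched as follows. The bounds used in the example are hypothetical placeholders for the ranges of the encoded network hyperparameters, and the function name is chosen for illustration.

```python
import random

def init_population(m, lower, upper, rng):
    """Create m individuals; dimension j is L_j_min + rand(0,1) * (L_j_max - L_j_min)."""
    return [
        [lo + rng.random() * (hi - lo) for lo, hi in zip(lower, upper)]
        for _ in range(m)
    ]

rng = random.Random(0)
lower = [1.0, 16.0, 0.0]   # hypothetical lower bounds of three hyperparameters
upper = [7.0, 128.0, 1.0]  # hypothetical upper bounds of the same hyperparameters
population = init_population(100, lower, upper, rng)
```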
Step 2.2: Mutation operation of the population initialization algorithm. Two different individuals are randomly selected from the population, their vector difference is scaled, and the result is combined by vector addition with the individual to be mutated, as shown in formula (2.3):
$$x'_i(g) = x_{r_1}(g) + F \cdot (x_{r_2}(g) - x_{r_3}(g)) \tag{2.3}$$
where the scaling factor $F \in [0, 2]$, $r_1 \neq r_2 \neq r_3 \neq i$, $x_{r_i}(g)$ denotes an individual before mutation, and $x'_i(g)$ denotes the new individual generated by mutation. Strict boundary control must be applied to newly generated individuals during mutation: when the value of some dimension falls outside its interval, the algorithm remaps the value into the valid range.
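A minimal sketch of the mutation of formula (2.3) is given below. The text does not specify how out-of-range values are remapped, so clipping to the nearest bound is used here purely as an assumption; the function name and the example values are hypothetical.

```python
import random

def mutate(population, i, f, lower, upper, rng):
    """Return x_r1 + F * (x_r2 - x_r3) with r1, r2, r3 distinct and different from i."""
    candidates = [k for k in range(len(population)) if k != i]
    r1, r2, r3 = rng.sample(candidates, 3)
    mutant = [
        a + f * (b - c)
        for a, b, c in zip(population[r1], population[r2], population[r3])
    ]
    # Assumed boundary handling: clip each dimension back into [lower, upper].
    return [min(max(v, lo), hi) for v, lo, hi in zip(mutant, lower, upper)]

rng = random.Random(1)
lower, upper = [0.0, 0.0], [1.0, 1.0]
population = [[rng.random(), rng.random()] for _ in range(10)]
mutant = mutate(population, i=0, f=0.5, lower=lower, upper=upper, rng=rng)
```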
Step 2.3: Crossover operation of the population initialization. The value of each dimension of a crossover individual is randomly selected from the corresponding dimension of either the mutant individual or the original individual. The crossover individual is thus obtained, as shown in formula (2.4):
where the crossover probability $cr \in [0, 1]$ and $x''_{i,j}$ denotes the j-th dimension value of the individual $x''_i$ obtained after crossover.
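The crossover described in step 2.3 can be sketched as a per-dimension choice between the mutant and the original individual. Formula (2.4) is not reproduced in the text, so the plain binomial rule below (take the mutant value when a uniform random number is below cr) is an assumption, and the names are hypothetical.

```python
import random

def crossover(original, mutant, cr, rng):
    """For each dimension, take the mutant value with probability cr, else the original value."""
    return [m if rng.random() < cr else o for o, m in zip(original, mutant)]

rng = random.Random(2)
original = [1.0, 2.0, 3.0, 4.0]
mutant = [10.0, 20.0, 30.0, 40.0]
trial = crossover(original, mutant, cr=0.5, rng=rng)
```

With cr = 1 the trial equals the mutant and with cr = 0 it equals the original individual, so cr controls how much of the mutation is carried into the next step.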
Step 2.4: Selection operation of the population initialization. The Gaussian random field prediction model established in step one is called, and the mutant individuals are pre-screened according to a pre-screening rule. The pre-screening rule is based on the probability of improvement (PoI) defined in formula (2.5).
where x denotes an individual solution vector, $\hat{g}(x)$ denotes the unbiased estimate of the objective function, $f_{min}$ denotes the minimum value of the fitness function, $\hat{s}^2(x)$ denotes the unbiased estimate of the variance, and PoI(x) denotes the probability of improvement corresponding to the individual solution vector x, whose value is about 0.5.
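A short sketch of a probability-of-improvement computation follows. Formula (2.5) is not reproduced in the text; the sketch assumes the usual definition for a minimized fitness, PoI(x) = Φ((f_min − ĝ(x)) / ŝ(x)), where Φ is the standard normal distribution function. The function names are hypothetical.

```python
import math

def normal_cdf(z):
    """Standard normal cumulative distribution function."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))

def probability_of_improvement(g_hat, s_hat, f_min):
    """Assumed PoI: probability that the true fitness falls below f_min."""
    if s_hat <= 0.0:
        # No predictive uncertainty: improvement is either certain or impossible.
        return 1.0 if g_hat < f_min else 0.0
    return normal_cdf((f_min - g_hat) / s_hat)

poi_equal = probability_of_improvement(g_hat=0.3, s_hat=0.1, f_min=0.3)
poi_better = probability_of_improvement(g_hat=0.2, s_hat=0.1, f_min=0.3)
```

Under this assumed definition, an individual whose predicted fitness equals the current best receives PoI = 0.5, and a better prediction pushes the value above 0.5.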
M/2 individuals (half of the population size) are selected by roulette-wheel selection according to the PoI values of all mutant individuals and are evaluated with the real fitness function. Their fitness values are compared with those of the parent individuals, the better individuals are chosen according to the greedy rule, and the new generation of the population is constructed. The selection rule is shown in formula (2.6).
where f(x) denotes the fitness function value of the target individual.
When an individual in the population is updated, the new individual is added to the reference network set $S_{eval}$ of the Gaussian random field model. If no change occurs, the set is not updated.
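The pre-screening and greedy selection of step 2.4 can be sketched as follows. Sampling without replacement and the minimization convention in the greedy rule are assumptions made for this sketch (formula (2.6) is not reproduced in the text), and all names and sample values are hypothetical.

```python
import random

def roulette_select(poi_values, count, rng):
    """Pick `count` distinct indices with probability proportional to their PoI."""
    remaining = list(range(len(poi_values)))
    chosen = []
    for _ in range(min(count, len(remaining))):
        weights = [poi_values[k] for k in remaining]
        total = sum(weights)
        if total <= 0.0:
            pick = rng.randrange(len(remaining))  # all weights zero: uniform choice
        else:
            threshold = rng.random() * total
            acc = 0.0
            pick = len(remaining) - 1
            for pos, w in enumerate(weights):
                acc += w
                if threshold < acc:
                    pick = pos
                    break
        chosen.append(remaining.pop(pick))
    return chosen

def greedy_select(parent, parent_fitness, trial, trial_fitness):
    """Keep the trial only if its real fitness is no worse (minimization assumed)."""
    if trial_fitness <= parent_fitness:
        return trial, trial_fitness
    return parent, parent_fitness

rng = random.Random(3)
poi_values = [0.1, 0.9, 0.5, 0.7, 0.0, 0.3]  # hypothetical PoI values of M = 6 mutants
chosen = roulette_select(poi_values, len(poi_values) // 2, rng)
```

Only the chosen individuals would then be trained and evaluated with the real fitness function, which is where the saving in computation comes from.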
Step 2.5: Judge the output accuracy of the population. The role of the single-objective algorithm here is to generate a group of networks with sufficiently high accuracy, so an accuracy parameter $r_1 = 0.9$ is added to the algorithm to control its termination. When the accuracy of the networks corresponding to all individuals in the population is greater than $r_1 = 0.9$, the algorithm terminates and outputs the final generation of the population. Otherwise, return to step 2.2.
Step three: Further optimize the network set obtained by initialization.
This step applies the multi-objective differential evolution algorithm to the last-generation population obtained by the single-objective evolutionary algorithm.
Step 3.1: Obtain the 100 neural network individuals of the last-generation population of the single-objective algorithm and take them as the initial population of the multi-objective algorithm.
Step 3.2: Calculate the fitness function of each individual of the population, and store the individuals in the external storage set outEP and in the reference network set $S_{eval}$ of the Gaussian random field model.
Step 3.3: Generation of adjacent sub-problems. The multi-objective differential evolution algorithm generates a set of evenly distributed weight vectors $\{\lambda_1, \ldots, \lambda_M\}$ for all sub-problems, where each component of the weight vector corresponding to the i-th sub-problem represents a single weight value. The T sub-problems closest to each sub-problem (called its neighborhood) are then obtained by calculating the Euclidean distances between the weight vectors of the sub-problems; the evolution of the multi-objective algorithm is realized by information exchange between adjacent sub-problems.
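The neighborhood construction of step 3.3 can be sketched for the two-objective case (accuracy and difference). The way the evenly distributed weight vectors are generated is not detailed in the text, so equally spaced vectors of the form (w, 1 − w) are assumed; the function names are hypothetical.

```python
import math

def uniform_weights(m):
    """m evenly spaced two-objective weight vectors (w, 1 - w), assuming m >= 2."""
    return [(k / (m - 1), 1.0 - k / (m - 1)) for k in range(m)]

def neighborhoods(weights, t):
    """For each sub-problem, the indices of the t closest weight vectors (itself included)."""
    result = []
    for wi in weights:
        order = sorted(range(len(weights)), key=lambda j: math.dist(wi, weights[j]))
        result.append(order[:t])
    return result

weights = uniform_weights(5)
nb = neighborhoods(weights, t=3)
```

Each sub-problem is its own nearest neighbour (distance 0), so the neighborhood B(i) in this sketch always contains i itself.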
Step 3.4: Gaussian mutation of the neural network individuals. During the evolution of the current generation of population individuals, the algorithm loops over the sub-problems: two indices p and q are randomly selected from the neighborhood of the i-th sub-problem, the corresponding individuals are obtained, a mutant individual is produced according to the basic mutation formula of the differential evolution algorithm, and a Gaussian random variable is added to each dimension value of the mutant individual with a certain probability, as shown in formula (3.1):
where the scaling factor $F \in [0, 2]$; $rnd_U(0, 1)$ denotes a decimal obtained by uniform random sampling from the range [0, 1]; $rnd_G(0, \sigma)$ denotes a Gaussian random vector with mean 0 and standard deviation $\sigma$, where $\sigma$ is one twentieth of the value range of the corresponding dimension element; a further parameter of the formula takes the value 0.5; and the remaining two symbols denote the individual after mutation and the original individual, respectively.
After every individual in the population has been traversed, the loop ends and the next step begins.
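A sketch of the Gaussian mutation of step 3.4 is given below. Formula (3.1) is not reproduced in the text, so several points are assumptions: the base vector is the i-th individual, the value 0.5 is read as the per-dimension probability of adding Gaussian noise, and out-of-range values are clipped. The function name and example values are hypothetical.

```python
import random

def gaussian_mutation(x_i, x_p, x_q, f, lower, upper, prob, rng):
    """DE step x_i + F * (x_p - x_q), plus Gaussian noise on each dimension with probability prob."""
    mutant = []
    for a, b, c, lo, hi in zip(x_i, x_p, x_q, lower, upper):
        value = a + f * (b - c)
        if rng.random() < prob:
            sigma = (hi - lo) / 20.0  # one twentieth of the value range of this dimension
            value += rng.gauss(0.0, sigma)
        mutant.append(min(max(value, lo), hi))  # assumed boundary handling: clipping
    return mutant

rng = random.Random(4)
lower, upper = [0.0, 0.0], [10.0, 10.0]
plain = gaussian_mutation([1.0, 2.0], [3.0, 4.0], [1.0, 1.0], 0.5, lower, upper, 0.0, rng)
noisy = gaussian_mutation([1.0, 2.0], [3.0, 4.0], [1.0, 1.0], 0.5, lower, upper, 0.5, rng)
```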
Step 3.5: Predict the fitness of the new individual with the Gaussian random field prediction model and calculate its PoI value according to formula (2.5). If PoI > 0.5, calculate the true fitness of the individual. Otherwise, discard the mutant individual and continue with the prediction of the next mutant individual.
Step 3.6: Update the reference point and the neighborhood. Before the reference point is updated, it is first determined whether the accuracy of the network corresponding to the mutant individual is greater than the threshold accuracy $r_2 = 0.9$. If the accuracy is greater than $r_2$, the reference point is updated; otherwise it is not. The reference point and each of its components are defined as shown in formula (3.2):
where $f_i(x)$ denotes the fitness function value of the i-th individual.
When the neighborhood replacement operation is performed, the T neighborhood individuals of the i-th individual (sub-problem) are examined, and when formula (3.3) is satisfied, the corresponding neighborhood network individual is replaced by the new mutant individual.
where the index used in formula (3.3) ranges over the elements of the neighborhood B(i) corresponding to the i-th individual.
The neighborhood replacement operation is performed only when the accuracy of the network corresponding to the mutant individual is greater than the threshold $r_2 = 0.9$ and the replacement condition of formula (3.3) is satisfied. This prevents an individual with higher accuracy from being replaced by an individual with lower accuracy.
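The gated update of step 3.6 can be sketched as follows. Formulas (3.2) and (3.3) are not reproduced in the text, so this sketch assumes the conventions of decomposition-based multi-objective algorithms: the reference point keeps the per-objective minimum, and a neighbor is replaced when the Tchebycheff value of the mutant is not worse. Both choices, the minimization convention, and all names are assumptions.

```python
def tchebycheff(objectives, weights, reference):
    """Assumed scalarization: max_k w_k * |f_k - z_k|."""
    return max(w * abs(f - z) for f, w, z in zip(objectives, weights, reference))

def update_neighborhood(i, trial, trial_objs, trial_accuracy, population, objs,
                        neighbors, weights, reference, r2=0.9):
    """Update the reference point and replace neighbors of sub-problem i; return the number replaced."""
    if trial_accuracy <= r2:
        return 0  # accuracy gate: low-accuracy mutants never update anything
    for k in range(len(reference)):
        reference[k] = min(reference[k], trial_objs[k])
    replaced = 0
    for j in neighbors[i]:
        if tchebycheff(trial_objs, weights[j], reference) <= tchebycheff(objs[j], weights[j], reference):
            population[j] = list(trial)
            objs[j] = tuple(trial_objs)
            replaced += 1
    return replaced

population = [[0.0], [1.0]]
objs = [(0.5, 0.5), (0.6, 0.4)]           # hypothetical objective values (minimized)
neighbors = [[0, 1], [0, 1]]
weights = [(0.5, 0.5), (0.5, 0.5)]
reference = [0.4, 0.4]
replaced = update_neighborhood(0, [2.0], (0.45, 0.45), 0.95,
                               population, objs, neighbors, weights, reference)
```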
Step 3.7: After the update with the mutant individual in step 3.6, the mutant individual is added to the reference network set $S_{eval}$ of the Gaussian random field model.
Step 3.8: Continue the loop until all individuals of the current generation have been iterated.
Step 3.9: Update the external set outEP. In each generation of the multi-objective evolution, the mutant individuals whose network accuracy is greater than $r_3 = 0.95$ are retained; after all mutations of the current generation are complete, they are added to the external storage set. The generation count is then updated: G = G + 1.
Step 3.10: When the accuracy of all individuals in the external storage set is greater than the output accuracy threshold $r_3 = 0.95$, or the generation count G is greater than the control threshold 20, the algorithm ends and all individuals in the external storage set are output. Otherwise, return to step 3.4.
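Steps 3.9 and 3.10 can be sketched as two small helpers. Individuals are represented here as dictionaries with an "accuracy" key, which is a hypothetical data layout chosen for the example; treating an empty external set as "not finished" is also an assumption.

```python
def update_external_set(out_ep, mutants, r3=0.95):
    """Add the mutants of the current generation whose accuracy exceeds r3 to the external set."""
    out_ep.extend(m for m in mutants if m["accuracy"] > r3)

def should_stop(out_ep, generation, r3=0.95, max_generations=20):
    """Stop when every stored individual exceeds r3 or the generation count exceeds the limit."""
    all_accurate = bool(out_ep) and all(m["accuracy"] > r3 for m in out_ep)
    return all_accurate or generation > max_generations

out_ep = [{"accuracy": 0.96}, {"accuracy": 0.93}]  # hypothetical initial external set
update_external_set(out_ep, [{"accuracy": 0.97}, {"accuracy": 0.94}])
```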
Step four: Reduce the scale of the neural network set with a clustering algorithm.
Step 4.1: Obtain the difference and the accuracy of the deep neural networks in the external storage set outEP from step three to generate a data set.
Step 4.2: Randomly initialize the centers of the 10 clusters.
Step 4.3: Initialize the cluster center matrix.
Step 4.4: Traverse the distances between all data in the data set and the 10 cluster centers. The distance between two individuals is measured by the Euclidean distance, calculated as shown in formula (4.1):
where $x_1$ and $x_2$ denote two different network individuals, $f_i(x_i)$ denotes the objective function value of network individual $x_i$, i takes the value 1 or 2, and dis denotes the distance between the two individuals.
Step 4.5: Find the minimum distance and determine whether to update the cluster center matrix. If a smaller minimum distance is found, update the cluster center individual. Traverse the 10 clusters, and end the loop after every cluster has been traversed.
Step 4.6: From each cluster, the central individual is selected, and the selected individuals are combined and output as the final set of deep neural networks.
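Step four can be sketched as a small k-means style procedure over the (accuracy, difference) pairs. The update rule of the cluster centers is only outlined in the text, so a standard k-means iteration is assumed here, and the "central individual" is taken to be the member closest to its cluster center. The function name and the sample points are hypothetical, and the example uses 2 clusters instead of 10 to stay small.

```python
import math
import random

def kmeans_select(points, k, rng, iterations=50):
    """Cluster the points into k groups and return the index of the central member of each group."""
    centers = rng.sample(points, k)
    for _ in range(iterations):
        clusters = [[] for _ in range(k)]
        for idx, pt in enumerate(points):
            nearest = min(range(k), key=lambda c: math.dist(pt, centers[c]))
            clusters[nearest].append(idx)
        new_centers = []
        for c, members in enumerate(clusters):
            if members:
                dims = range(len(points[0]))
                new_centers.append(tuple(sum(points[m][d] for m in members) / len(members) for d in dims))
            else:
                new_centers.append(centers[c])  # keep the old center of an empty cluster
        if new_centers == centers:
            break
        centers = new_centers
    selected = []
    for c, members in enumerate(clusters):
        if members:
            selected.append(min(members, key=lambda m: math.dist(points[m], centers[c])))
    return selected

# Hypothetical (accuracy, difference) pairs forming two well-separated groups.
points = [(0.0, 0.0), (0.1, 0.0), (0.0, 0.1), (5.0, 5.0), (5.1, 5.0), (5.0, 5.1)]
selected = kmeans_select(points, k=2, rng=random.Random(0))
```

The returned indices identify one representative network per cluster, so the final set has at most as many networks as there are clusters.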
In order to verify the effectiveness of the pre-screening strategy combined with the Gaussian random field in the method provided by the invention, five experiments were carried out with and without the Gaussian random field model while all other algorithm settings were kept unchanged, so as to reduce the influence of chance and make the experimental results as reliable as possible. The results were averaged and the running time was rounded, giving the performance of the model on the test set and the running time of the optimization part with and without the pre-screening strategy. The specific experimental data are shown in Table 1:
TABLE 1 influence of pre-screening strategy based on Gaussian random field on experimental results
In conclusion, the deep network set generation method combined with the Gaussian random field effectively improves the training speed of the network set while achieving higher accuracy.
The above description of exemplary embodiments has been presented only to illustrate the technical solution of the invention and is not intended to be exhaustive or to limit the invention to the precise form described. Obviously, many modifications and variations are possible in light of the above teaching to those skilled in the art. The exemplary embodiments were chosen and described in order to explain certain principles of the invention and its practical application to thereby enable others skilled in the art to understand, implement and utilize the invention in various exemplary embodiments and with various alternatives and modifications. It is intended that the scope of the invention be defined by the following claims and their equivalents.
Claims (9)
1. A deep network set generation method combined with a Gaussian random field is characterized by comprising the following steps:
step one, respectively constructing a Gaussian random field prediction model of two indexes of accuracy and difference by using a neural network individual of which the fitness function is accurately calculated;
secondly, after population initialization operation is carried out by using a single-target differential evolution algorithm, a neural network set with higher accuracy is obtained, namely the neural network set is initialized; predicting the accuracy of the variant individual by using the Gaussian random field prediction model established in the step one;
step three, further optimizing the neural network set obtained by initialization in the step two;
calculating a fitness function of each individual in the initial population, and carrying out Gaussian variation on the individual; predicting the fitness function of the new individual by using a Gaussian random field prediction model, and judging by combining a pre-screening strategy; updating the reference point and neighborhood problems of the individual; judging whether to update an external storage set; outputting the external storage set;
step four, the clustering algorithm reduces the scale of the network set;
acquiring an external storage set output in the step three; adopting a clustering algorithm to narrow the selection range of the networks, and clustering according to the accuracy and difference of each network in the set; and selecting a central network from each cluster, constructing a new deep neural network set and outputting the new deep neural network set as a final result.
2. The method of claim 1, wherein the step one further comprises:
step 1.1, acquiring a neural network individual set of which the fitness function is accurately calculated, establishing the accuracy and difference set of each individual, and ensuring the consistency of the lengths of the sets;
step 1.2, defining a kernel function of the Gaussian process regression for conveying the covariance between input individuals; the kernel function is WhiteKernel and is used for estimating the noise of the objective function and reducing the influence of the noise;
step 1.3, performing Gaussian process regression fitting on the neural network individuals and their exact fitness function values to obtain a regression model;
step 1.4, optimizing the hyperparameters of the kernel function by maximizing the log-marginal likelihood with the passed optimizer; because the log-marginal likelihood may have a plurality of local optima, the optimization process is repeated a plurality of times by specifying the parameter n_restarts_optimizer; the first optimization run starts from the initial values of the hyperparameters of the kernel; in subsequent runs, the initial hyperparameter values are randomly selected from the allowed range;
step 1.5, obtaining a prediction model combined with the Gaussian random field, and establishing a reference network set $S_{eval}$ of the Gaussian random field model, which is used for subsequently predicting the fitness of individuals and is updated in real time during prediction.
3. The method of claim 2, wherein the step 1.3 further comprises:
step 1.3.1, taking a neural network individual x in the neural network individual set obtained in step 1.1 as an independent variable, taking the corresponding fitness function value as a dependent variable y, and constructing an objective function y = g(x);
step 1.3.2, following Gaussian process regression, assuming that the objective function y = g(x) obeys a normal distribution with mean $\mu$ and variance $\delta^2$;
step 1.3.3, constructing a maximum likelihood estimation function for fitting a normal distribution curve; the likelihood function (probability density function, PDF) is shown in equation (1.1):
wherein exp denotes the exponential function with base e, and det denotes the determinant of a matrix; the matrix C is a K × K matrix whose entries are $C_{i,j} = c(x_i, x_j)$, where $c(x_i, x_j)$ is the value of the correlation function; for arbitrary arguments $x, x' \in R^n$, where $R^n$ denotes the n-dimensional real space, the correlation function $c(x, x') = \exp[-d(x, x')]$ characterizes the correlation between the objective function values g(x) and g(x') corresponding to x and x'; in the distance function d(x, x'), $\theta_i$ and $p_i$ are hyperparameters controlling the distance function and are independent of x and x', so that the value of the correlation function depends only on the size of (x - x'): the larger (x - x'), the smaller the correlation, and vice versa; the vector $y = (y_1, y_2, \ldots, y_K)$, and 1 is a K-dimensional column vector of ones;
step 1.3.4, maximizing the likelihood function according to the properties of the normal distribution to obtain the estimate $\hat{\mu}$ of the mean $\mu$ and the estimate $\hat{\delta}^2$ of the variance $\delta^2$; the unbiased estimate of the objective function g(x) is $\hat{g}(x)$ and its variance is $\hat{s}^2(x)$, both expressed in terms of $\hat{\mu}$, $\hat{\delta}^2$, the matrix C and the vector $r = (c(x, x_1), c(x, x_2), \ldots, c(x, x_K))^T$; the objective function g(x) is then considered to obey the normal distribution $N(\hat{g}(x), \hat{s}^2(x))$;
step 1.3.5, training the Gaussian regression prediction model on the neural network individuals used in step 1.3.1 and their fitness values to obtain a final regression model.
4. The method according to any one of claims 1 to 3, wherein in the second step, the hyperparameters of the deep convolutional network, namely the size and number of convolution kernels, the filter size and stride of the pooling layers, and the number of nodes in the fully-connected layers, are encoded into the individuals of the evolutionary algorithm, and a series of neural networks with sufficiently high accuracy is generated by a single-objective evolutionary algorithm that takes only the accuracy as the objective function; when the accuracies of the individuals in the set are all higher than the control threshold, namely the minimum output accuracy $r_1$ of the single-objective algorithm, the calculation ends and the last generation of individuals is output.
5. The method of claim 4, wherein the step two further comprises:
step 2.1, generating a plurality of individuals uniformly and randomly in a decision space Ω, wherein all the individuals are composed of real number codes corresponding to hyper-parameters of a neural network structure, and the composition is shown in a formula (2.1):
$$x_i(0) = (x_{i,1}(0), x_{i,2}(0), \ldots, x_{i,d}(0)), \quad i = 1, 2, 3, \ldots, M, \quad d = 1, 2, 3, \ldots, V \tag{2.1}$$
wherein M denotes the number of individuals in the population, V denotes the maximum dimension of the decision space Ω, and the initialization of the j-th dimension of the i-th individual is shown in formula (2.2):
$$x_{i,j}(0) = L_{j\_min} + rand(0,1) \cdot (L_{j\_max} - L_{j\_min}), \quad i = 1, 2, 3, \ldots, M, \quad j = 1, 2, 3, \ldots, d \tag{2.2}$$
wherein $L_{j\_min}$ and $L_{j\_max}$ denote the lower and upper bounds of the j-th dimension of the parameter vector, respectively, and rand(0, 1) denotes a random number generated between 0 and 1;
step 2.2, performing the mutation operation of the population initialization algorithm, namely randomly selecting two different individuals from the population, scaling their vector difference, and combining the result by vector addition with the individual to be mutated, as shown in formula (2.3):
$$x'_i(g) = x_{r_1}(g) + F \cdot (x_{r_2}(g) - x_{r_3}(g)) \tag{2.3}$$
wherein the scaling factor $F \in [0, 2]$, $r_1 \neq r_2 \neq r_3 \neq i$, $x_{r_i}(g)$ denotes an individual before mutation, and $x'_i(g)$ denotes the new individual generated by mutation; boundary control is applied to the newly generated individuals during mutation;
step 2.3, initializing a cross operation by the population, wherein the value of each dimension of the crossed individuals is randomly selected from the corresponding dimension value of the variant individuals or the corresponding original individuals, so as to obtain the crossed individuals, and the specific generation method is shown as a formula (2.4):
wherein the crossover probability $cr \in [0, 1]$, and $x''_{i,j}$ denotes the j-th dimension value of the individual $x''_i$ obtained after crossover;
step 2.4, selecting operation of population initialization, calling the Gaussian random field prediction model established in the step one, and pre-screening the variant individuals by setting a pre-screening rule; wherein the pre-filtering rule is set based on the possible boosting probability defined in equation (2.5):
wherein x denotes an individual solution vector, $\hat{g}(x)$ denotes the unbiased estimate of the objective function, $f_{min}$ denotes the minimum value of the fitness function, $\hat{s}^2(x)$ denotes the unbiased estimate of the variance, and PoI(x) denotes the probability of improvement corresponding to the individual solution vector x;
selecting M/2 individuals by roulette-wheel selection according to the PoI values of all mutant individuals and evaluating them with the real fitness function, comparing their fitness values with those of the parent individuals, selecting the better individuals according to the greedy rule, and constructing a new generation of the population, wherein the selection rule is shown in formula (2.6):
wherein f (x) represents a fitness function value for the target individual;
when an individual in the population is updated, adding the new individual to the reference network set $S_{eval}$ of the Gaussian random field model; if no change occurs, no update is performed;
step 2.5: judging the output accuracy of the population; when the accuracy of the networks corresponding to all individuals in the population is greater than the minimum output accuracy $r_1 = 0.9$ of the single-objective algorithm, the algorithm terminates and the final generation of the population is output; otherwise, return to step 2.2.
6. The method according to claim 5, wherein in the third step, the last generation of individuals obtained by the single-objective evolutionary algorithm in the second step is taken as the initial population of the multi-objective differential evolution algorithm; the fitness function of each individual in the initial population is calculated and the individual is placed in an external storage set; the adjacent sub-problems of each individual are calculated and Gaussian mutation is performed on the individual; the fitness function of the new individual is predicted by using the Gaussian random field prediction model and judged in combination with the pre-screening strategy; the reference point and the neighborhood of the individual are updated; the accuracy of the network is compared with the control threshold, namely the minimum accuracy $r_2$ required by the multi-objective algorithm for joining the external storage set, to judge whether to update the external storage set; when the accuracies of the individuals in the external storage set are all higher than the control threshold, namely the minimum output accuracy $r_3$ of the multi-objective algorithm, or the generation count is greater than the generation threshold, the loop ends and the external storage set is output.
7. The method of claim 6, wherein the step three further comprises:
step 3.1: taking a plurality of neural network individuals of the last generation population of the single-target algorithm obtained in the step two as an initial population of the multi-target algorithm;
step 3.2: calculating the fitness function of each individual of the population, and storing the individuals in the external storage set outEP and in the reference network set $S_{eval}$ of the Gaussian random field model;
step 3.3: generation of adjacent sub-problems: the multi-objective differential evolution algorithm generates a set of evenly distributed weight vectors $\{\lambda_1, \ldots, \lambda_M\}$ for all sub-problems, wherein each component of the weight vector corresponding to the i-th sub-problem represents a single weight value; the T sub-problems closest to each sub-problem, namely its neighborhood, are obtained by calculating the Euclidean distances between the weight vectors corresponding to the sub-problems;
step 3.4: Gaussian mutation of the neural network individuals: during the evolution of the current generation of population individuals, looping over the sub-problems, two indices p and q are randomly selected from the neighborhood of the i-th sub-problem, the corresponding individuals are obtained, a mutant individual is produced according to the basic mutation formula of the differential evolution algorithm, and a Gaussian random variable is added to each dimension value of the mutant individual with a certain probability, as shown in formula (3.1):
wherein the scaling factor $F \in [0, 2]$; $rnd_U(0, 1)$ denotes a decimal obtained by uniform random sampling from the range [0, 1]; $rnd_G(0, \sigma)$ denotes a Gaussian random vector with mean 0 and standard deviation $\sigma$, where $\sigma$ is one twentieth of the value range of the corresponding dimension element; a further parameter of the formula takes the value 0.5; and the remaining two symbols denote the individual after mutation and the original individual, respectively;
after traversing each individual in the population, ending the circulation and entering the next step;
step 3.5: predicting the fitness of the new individual according to a Gaussian random field prediction model, calculating a PoI value according to a formula (2.5), and if the PoI is more than 0.5, calculating the true fitness of the individual; otherwise, discarding the variant individual and continuing to predict the next variant individual;
step 3.6: updating the reference point and the neighborhood: before the reference point is updated, it is first determined whether the accuracy of the network corresponding to the mutant individual is greater than the threshold, namely the minimum accuracy $r_2 = 0.9$ required by the multi-objective algorithm for joining the external storage set; if the accuracy is greater than $r_2$, the reference point is updated, otherwise it is not; wherein the reference point and each of its components are defined as shown in formula (3.2):
wherein f isi(x) Representing fitness function values of corresponding individuals;
when neighborhood replacement operation is carried out, T neighborhood individuals of the ith individual need to be judged, and when a formula (3.3) is met, the corresponding domain network individual is replaced by a new variant individual;
wherein isRepresenting elements in a neighborhood B (i) corresponding to the ith individual;
only if the accuracy of the network corresponding to the variant individual is greater than the threshold r2When the replacement condition shown in the formula (3.3) is satisfied and 0.9 is set, performing neighborhood replacement operation;
step 3.7: after the variant individual has been processed in step 3.6, adding the variant individual to the reference network set S_eavl of the Gaussian random field model;
step 3.8: continuing the loop until every individual of the current generation has been iterated;
step 3.9: updating the external set outEP; in each generation of the multi-objective algorithm's evolution, the variant individuals whose network accuracy is greater than the minimum output accuracy r_3 = 0.95 controlled by the multi-objective algorithm are first saved, and they are added to the external storage set after all mutations of the current population are finished; then the generation counter is updated as G = G + 1;
step 3.10: when the accuracy of every individual in the external storage set is greater than the minimum output accuracy threshold r_3 = 0.95, or when the generation counter G is greater than the control threshold 20, ending the operation of the algorithm and outputting all individuals in the external storage set; when the condition is not met, returning to step 3.4 to continue the operation.
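Steps 3.9 and 3.10 can be sketched as below, under the reading that the algorithm stops when either every archived individual exceeds r_3 or the generation counter exceeds 20. Individuals are represented as (identifier, accuracy) pairs purely for illustration, an empty archive is assumed not to satisfy the accuracy condition, and `update_external_set` and `should_stop` are hypothetical names.

```python
def update_external_set(out_ep, variants, r3=0.95):
    """Step 3.9: append this generation's variants whose accuracy exceeds r3."""
    out_ep.extend(ind for ind in variants if ind[1] > r3)
    return out_ep


def should_stop(out_ep, generation, r3=0.95, max_generations=20):
    """Step 3.10: stop when the archive meets the accuracy bar or G exceeds the cap."""
    # Assumption: an empty archive does not count as meeting the accuracy bar.
    archive_ok = bool(out_ep) and all(acc > r3 for _, acc in out_ep)
    return archive_ok or generation > max_generations
```

Because step 3.9 admits only individuals above r_3, the accuracy condition would matter mainly if the external set already holds individuals admitted under a looser threshold in earlier steps.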
8. The method according to claim 7, wherein in the fourth step, the external storage set output in the third step is obtained; a clustering algorithm is adopted to narrow the selection range of the networks: K clusters are formed according to the two objective function values (accuracy and difference) of each network in the set, so that networks that are close to each other are grouped into one class; finally, a central network is selected from each cluster, and a new deep neural network set is constructed and output as the final result.
9. The method of claim 8, wherein the fourth step further comprises:
step 4.1: obtaining the difference and the accuracy of the deep neural network in the external storage set outEP in the third step to generate a data set;
step 4.2: randomly initializing the centers of the K categories;
step 4.3: initializing the cluster center matrix;
step 4.4: traversing all data in the data set and computing their distances to the K centers in the cluster center matrix, the distance between two individuals being measured by the Euclidean distance, calculated as shown in formula (4.1):
wherein x_1 and x_2 denote two different network individuals, f_i(x) denotes the i-th objective function value of a network individual x, with i taking the value 1 or 2, and dis denotes the distance between the two individuals;
step 4.5: finding the minimum distance and determining whether to update the cluster center matrix; if the minimum distance is smaller, updating the cluster center individual; traversing the K clusters, and ending the loop after each cluster has been traversed;
step 4.6: from each cluster, the central individual outputs are selected and combined together as the final set of deep neural networks output.
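Steps 4.1 to 4.6 resemble a k-means procedure over the two objective values. The sketch below is a minimal pure-Python version under stated assumptions: each network is represented by a (difference, accuracy) pair, the initial centers are sampled from the data, and the "central individual" of a cluster is taken to be the member nearest the cluster mean. The function names, iteration cap, and seed are illustrative.

```python
import math
import random


def euclidean(p, q):
    """Euclidean distance between two individuals in objective space (step 4.4)."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(p, q)))


def select_central_networks(points, k, iterations=50, seed=0):
    """Cluster objective vectors into k groups; return one member index per cluster."""
    rng = random.Random(seed)
    centers = [list(p) for p in rng.sample(points, k)]  # steps 4.2 and 4.3
    for _ in range(iterations):
        clusters = [[] for _ in range(k)]
        for idx, p in enumerate(points):  # step 4.4: assign to the nearest center
            nearest = min(range(k), key=lambda c: euclidean(p, centers[c]))
            clusters[nearest].append(idx)
        new_centers = []
        for c, members in enumerate(clusters):  # step 4.5: recompute the centers
            if not members:
                new_centers.append(centers[c])  # keep the center of an empty cluster
                continue
            dims = zip(*(points[i] for i in members))
            new_centers.append([sum(d) / len(members) for d in dims])
        if new_centers == centers:
            break  # assignments are stable
        centers = new_centers
    # Step 4.6: the member closest to each center represents its cluster.
    return [min(m, key=lambda i: euclidean(points[i], centers[c]))
            for c, m in enumerate(clusters) if m]
```

For example, calling `select_central_networks(points, 2)` on two well-separated groups of objective vectors returns one index from each group, and those indices identify the networks kept in the final set.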
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202111001978.6A CN113762370A (en) | 2021-08-30 | 2021-08-30 | Depth network set generation method combined with Gaussian random field |
Publications (1)
Publication Number | Publication Date |
---|---|
CN113762370A (en) | 2021-12-07 |
Family
ID=78791715
Family Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202111001978.6A (Withdrawn; published as CN113762370A) | 2021-08-30 | 2021-08-30 | Depth network set generation method combined with Gaussian random field |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN113762370A (en) |
Citations (4)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN104921727A (en) * | 2015-06-24 | 2015-09-23 | 上海海事大学 | Brain function connectivity detection system and method based on self-adaptive priori information guidance |
CN106484758A (en) * | 2016-08-09 | 2017-03-08 | 浙江经济职业技术学院 | A kind of real-time stream Density Estimator method being optimized based on grid and cluster |
CN110417015A (en) * | 2019-06-18 | 2019-11-05 | 湖北追日电气股份有限公司 | Micro-capacitance sensor Multiobjective Optimal Operation method and system based on Model Predictive Control |
CN110443364A (en) * | 2019-06-21 | 2019-11-12 | 深圳大学 | A kind of deep neural network multitask hyperparameter optimization method and device |
Non-Patent Citations (2)
Title |
---|
CHEN ZHANG等: "an evolutionary generation method of deep neural networks sets combined with gaussian random field", WIRELESS NETWORKS, pages 1 - 10 * |
ISMAIL M ALI等: "a novel design of differential evolution for solving discrete traveling salesman problems", SWARM AND EVOLUTIONARY COMPUTATION, pages 1 - 17 * |
Cited By (4)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN115099393A (en) * | 2022-08-22 | 2022-09-23 | 荣耀终端有限公司 | Neural network structure searching method and related device |
CN115099393B (en) * | 2022-08-22 | 2023-04-07 | 荣耀终端有限公司 | Neural network structure searching method and related device |
CN117152568A (en) * | 2023-11-01 | 2023-12-01 | 常熟理工学院 | Deep integration model generation method and device and computer equipment |
CN117152568B (en) * | 2023-11-01 | 2024-01-30 | 常熟理工学院 | Deep integration model generation method and device and computer equipment |
Similar Documents
Publication | Title | Publication Date |
---|---|---|
CN110334843B (en) | Time-varying attention improved Bi-LSTM hospitalization and hospitalization behavior prediction method and device | |
CN111723674B (en) | Remote sensing image scene classification method based on Markov chain Monte Carlo and variation deduction and semi-Bayesian deep learning | |
CN108985515B (en) | New energy output prediction method and system based on independent cyclic neural network | |
CN111259738B (en) | Face recognition model construction method, face recognition method and related device | |
CN113128671B (en) | Service demand dynamic prediction method and system based on multi-mode machine learning | |
CN111898689A (en) | Image classification method based on neural network architecture search | |
CN113762370A (en) | Depth network set generation method combined with Gaussian random field | |
CN108732931A (en) | A kind of multi-modal batch process modeling method based on JIT-RVM | |
CN113241122A (en) | Gene data variable selection and classification method based on fusion of adaptive elastic network and deep neural network | |
CN116187835A (en) | Data-driven-based method and system for estimating theoretical line loss interval of transformer area | |
CN115393632A (en) | Image classification method based on evolutionary multi-target neural network architecture structure | |
Louhichi et al. | Shapley values for explaining the black box nature of machine learning model clustering | |
CN116629352A (en) | Hundred million-level parameter optimizing platform | |
CN113988358A (en) | Carbon emission index prediction and treatment method based on transfer reinforcement learning | |
Ma | An Efficient Optimization Method for Extreme Learning Machine Using Artificial Bee Colony. | |
Rad et al. | GP-RVM: Genetic programing-based symbolic regression using relevance vector machine | |
Loni et al. | Densedisp: Resource-aware disparity map estimation by compressing siamese neural architecture | |
CN110555530B (en) | Distributed large-scale gene regulation and control network construction method | |
CN116842354A (en) | Feature selection method based on quantum artificial jellyfish search mechanism | |
Shin et al. | A novel method for fashion clothing image classification based on deep learning | |
Cong et al. | Self-paced weight consolidation for continual learning | |
CN116912600A (en) | Image classification method based on variable step length ADMM algorithm extreme learning machine | |
CN117151277A (en) | Two-dimensional irregular layout method based on network migration and hybrid positioning and application | |
Wang et al. | Evolving connectivity for recurrent spiking neural networks | |
Liu et al. | Predicting stock trend using multi-objective diversified Echo State Network |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | |
SE01 | Entry into force of request for substantive examination | |
WW01 | Invention patent application withdrawn after publication | Application publication date: 20211207 |