CN115081323A - Method for solving multi-objective constrained optimization problem and storage medium thereof - Google Patents

Method for solving multi-objective constrained optimization problem and storage medium thereof

Info

Publication number
CN115081323A
Authority
CN
China
Prior art keywords
population
individuals
constraint
objective
optimization problem
Prior art date
Legal status
Pending
Application number
CN202210688940.9A
Other languages
Chinese (zh)
Inventor
何克晶
黄秋越
Current Assignee
South China University of Technology SCUT
Original Assignee
South China University of Technology SCUT
Priority date
Filing date
Publication date
Application filed by South China University of Technology SCUT
Priority to CN202210688940.9A
Publication of CN115081323A

Classifications

    • G - PHYSICS
    • G06 - COMPUTING; CALCULATING OR COUNTING
    • G06F - ELECTRIC DIGITAL DATA PROCESSING
    • G06F30/00 - Computer-aided design [CAD]
    • G06F30/20 - Design optimisation, verification or simulation
    • G06F30/27 - Design optimisation, verification or simulation using machine learning, e.g. artificial intelligence, neural networks, support vector machines [SVM] or training a model
    • G - PHYSICS
    • G06 - COMPUTING; CALCULATING OR COUNTING
    • G06N - COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00 - Computing arrangements based on biological models
    • G06N3/12 - Computing arrangements based on biological models using genetic models
    • G06N3/126 - Evolutionary algorithms, e.g. genetic algorithms or genetic programming
    • G - PHYSICS
    • G06 - COMPUTING; CALCULATING OR COUNTING
    • G06F - ELECTRIC DIGITAL DATA PROCESSING
    • G06F2111/00 - Details relating to CAD techniques
    • G06F2111/06 - Multi-objective optimisation, e.g. Pareto optimisation using simulated annealing [SA], ant colony algorithms or genetic algorithms [GA]

Abstract

The invention discloses a method for solving a multi-objective constrained optimization problem and a storage medium thereof. The method comprises the following steps: generating an initial population and calculating the objective function value and constraint value of each individual; adaptively generating a new difference factor and a constraint relaxation factor with the policy network of deep reinforcement learning; generating new individuals by evolution with the difference factor and combining them with the previous-generation population into a temporary population; obtaining the Chebyshev value and constraint value of each individual in the temporary population based on the decomposition-based multi-objective evolutionary algorithm MOEA/D, and then updating to a new-generation population; and, if the maximum number of iterations is reached, taking the finally updated population as the optimal solution of the multi-objective constrained optimization problem, otherwise calculating a feedback reward, returning it to the policy network of deep reinforcement learning to update the loss function, and continuing the iteration. The method adaptively adjusts the sensitive parameters of the evolutionary algorithm and balances convergence and distribution during evolution.

Description

Method for solving multi-objective constrained optimization problem and storage medium thereof
Technical Field
The invention belongs to the technical field of optimization algorithms, and particularly relates to a method for solving a multi-objective constrained optimization problem and a storage medium thereof.
Background
Complex problems in science and engineering, such as workshop scheduling, model design and path optimization, are generally NP-hard. These problems involve multiple objectives and are at the same time subject to numerous constraints.
The evolutionary algorithm simulates the self-learning, self-adapting and problem-solving processes of biological genetic evolution and comprises four basic operations: reproduction, recombination, competition and selection; it is often used to solve such complex problems. The decomposition-based multi-objective evolutionary algorithm (MOEA/D) is a classic prior-art algorithm for solving multi-objective problems. Its core idea is to decompose a multi-objective optimization problem into several single-objective optimization sub-problems and solve them one by one, finally obtaining the Pareto-optimal solutions on the Pareto front.
However, in the existing decomposition-based multi-objective optimization algorithm, the parameters used in each generation of the iterative MOEA/D process are fixed, and fixed parameters are not necessarily suitable for every iteration. Moreover, most practical scientific and engineering problems are constrained, and the industry focuses on constraint-handling mechanisms for multi-objective optimization problems without fundamentally solving the performance loss that parameter fixation in the constraint mechanism causes. On the other hand, the prior art contains schemes that introduce a differential evolution algorithm into the decomposition-based multi-objective optimization algorithm, but the performance of differential evolution depends heavily on the mutation and crossover strategy and the associated control parameters, and the parameter-setting process is time-consuming and has certain limitations.
Disclosure of Invention
In order to overcome one or more of the drawbacks and deficiencies of the prior art, a first object of the present invention is to provide a method for solving a multi-objective constrained optimization problem that introduces reinforcement learning and an evolutionary algorithm into the solution process, and a second object of the present invention is to provide a corresponding storage medium.
In order to achieve the above object, the present invention adopts the following technical means.
A method for solving a multi-objective constrained optimization problem comprises the following steps:
generating an initial population P_0 from the raw data of the multi-objective constrained optimization problem, then calculating the objective function value F(x) and the constraint value Φ(x) of each individual x in the initial population;
adaptively generating a new difference factor θ_DE^t and a constraint relaxation factor ε_t using the policy network of deep reinforcement learning, where t denotes the number of iterations;
starting the iteration from the initial population P_0, generating new individuals by evolution with the difference factor θ_DE^t, and combining the new individuals with the previous-generation population P_{t-1} to form a temporary population Q_t;
converting the multi-objective constrained optimization problem into single-objective optimization sub-problems based on the decomposition-based multi-objective evolutionary algorithm MOEA/D, obtaining the Chebyshev value and the constraint value Φ(x) of each individual in the temporary population Q_t, and then updating to a new-generation population based on the constraint relaxation factor ε_t;
judging whether the maximum number of iterations t_end has been reached; if so, taking the finally updated population as the optimal solution of the multi-objective constrained optimization problem; if not, using the difference between the new population and the previous-generation population as the feedback reward of reinforcement learning, returning it to the policy network of deep reinforcement learning to update the loss function, and then continuing the iteration until the maximum number of iterations t_end is reached.
Preferably, the process of generating the initial population P_0 and calculating the objective function value F(x) and the constraint value Φ(x) comprises:
forming a search space from the raw data of the multi-objective constrained optimization problem, and randomly generating an initial population P_0 = {x_1, x_2, …, x_N} from the search space, where N denotes the population size;
setting the maximum number of iterations t_end for iterating the population;
calculating the objective function value F(x), the inequality constraint value g(x) and the equality constraint value h(x) of each individual in the initial population P_0, where Φ(x) = Σ g(x) + Σ c(x), g(x) being the inequality constraint value and c(x) the equality constraint value.
Preferably, the policy network of deep reinforcement learning is specifically a long short-term memory artificial neural network; the input of the long short-term memory artificial neural network is the objective function values F(x) of the individuals in the population, and its output is the difference factor θ_DE^t = (CR, F) and the constraint relaxation factor ε_t, where CR is the mutation probability and F is the scaling factor.
Further, the input and output of the learning of the long short-term memory artificial neural network are as follows:
(θ_DE^t, ε_t, c_t, h_t) = LSTM(I_t, W, c_{t-1}, h_{t-1})
where I_t is the information of the current population, including the objective function value F(x), the inequality constraint value g(x) and the equality constraint value h(x) of all individuals in the current population, W is the weight information of the LSTM, c_{t-1} is the cell state of the LSTM, h_{t-1} is the hidden state of the LSTM, and LSTM denotes the learning process of the long short-term memory artificial neural network.
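As an illustration of such a policy network, the following is a minimal sketch in PyTorch (the patent does not name a framework, so this is an assumption). The state I_t is assumed to be flattened into a single input vector; the class name PolicyLSTM, the layer sizes, and the sigmoid/softplus squashing of the outputs are illustrative choices, not taken from the patent.

```python
import torch
import torch.nn as nn

class PolicyLSTM(nn.Module):
    # hypothetical policy network: maps population state I_t to (CR, F, eps_t)
    def __init__(self, state_dim, hidden_dim=64):
        super().__init__()
        self.lstm = nn.LSTM(state_dim, hidden_dim, batch_first=True)
        self.head = nn.Linear(hidden_dim, 3)

    def forward(self, state, hc=None):
        # state: (batch, seq_len, state_dim); hc: previous (h, c) of the LSTM,
        # playing the role of (h_{t-1}, c_{t-1}) in the formula above
        out, hc = self.lstm(state, hc)
        raw = self.head(out[:, -1])
        cr = torch.sigmoid(raw[:, 0])            # mutation probability in (0, 1)
        f = torch.sigmoid(raw[:, 1])             # scaling factor in (0, 1)
        eps = nn.functional.softplus(raw[:, 2])  # relaxation factor >= 0
        return cr, f, eps, hc
```

One forward pass per generation yields the parameter triple (CR, F, ε_t), while the recurrent state (h, c) carries information across generations.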
Preferably, the process of generating new individuals by evolution with the difference factor θ_DE^t and forming the temporary population Q_t comprises:
starting from the initial population P_0: randomly selecting two individuals from the neighborhood of any individual of the previous-generation population P_{t-1}, and operating on the three individuals with the difference factor θ_DE^t based on the mutation probability CR to generate a new individual;
combining the newly generated individuals with the previous-generation population P_{t-1} to form the temporary population Q_t.
Further, the process of generating new individuals is:
in the neighborhood Λ^{t-1} of any individual x_i^{t-1} of the previous-generation population P_{t-1}, randomly selecting two individuals x_{i_j}^{t-1} and x_{i_k}^{t-1}, and operating on the three individuals x_i^{t-1}, x_{i_j}^{t-1} and x_{i_k}^{t-1} with the difference factor θ_DE^t based on the mutation probability CR to generate a new individual y_i^t, as shown in the following formula:
y_i^t = x_i^{t-1} + F · (x_{i_j}^{t-1} - x_{i_k}^{t-1}) if rand(0,1) ≤ CR, otherwise y_i^t = x_i^{t-1}, with i = 1, …, N and j, k ∈ {1, …, M}
where rand(0,1) denotes a probability between 0 and 1, N denotes the total number of individuals in the population, M denotes the total number of individuals in the neighborhood, i is the index of the individual, j and k are indices within the neighborhood, x_{i_j}^{t-1} denotes the jth individual in the neighborhood of the ith individual, and x_{i_k}^{t-1} denotes the kth individual in the neighborhood of the ith individual.
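A minimal NumPy sketch of this neighborhood-based differential mutation follows; the helper name de_generate, the (N, D) array layout, and the per-individual (rather than per-dimension) application of CR are illustrative assumptions, not the patent's implementation.

```python
import numpy as np

def de_generate(pop, neighbors, cr, f, rng):
    # pop: (N, D) array holding the previous generation P_{t-1};
    # neighbors[i]: the M neighbor indices of individual i
    n = pop.shape[0]
    new = pop.copy()
    for i in range(n):
        j, k = rng.choice(neighbors[i], size=2, replace=False)
        if rng.random() <= cr:                    # mutate with probability CR
            new[i] = pop[i] + f * (pop[j] - pop[k])
    return np.vstack([pop, new])                  # temporary population of size 2N
```

For example, q_t = de_generate(pop, nbrs, cr=0.9, f=0.5, rng=np.random.default_rng()) returns the 2N-row temporary population formed from P_{t-1} and the new individuals.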
Preferably, the process of updating to the next-generation population based on the decomposition-based multi-objective evolutionary algorithm MOEA/D is as follows:
from the neighborhood of an individual x_i of the temporary population Q_t, randomly selecting an individual x_j, calculating the Chebyshev value and the constraint value Φ(x) of each of the two, and comparing them;
if Φ(x_i) and Φ(x_j) are both less than the constraint relaxation factor ε_t, letting the one of the two with the smaller Chebyshev value enter the next-generation population P_t;
if Φ(x_i) and Φ(x_j) are not both less than the constraint relaxation factor ε_t, letting the one of the two with the smaller constraint value Φ(x) enter the next-generation population P_t;
where i denotes the ith individual of the temporary population Q_t and j denotes the jth individual in its neighborhood.
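A sketch of this ε-constraint comparison follows, assuming the standard Chebyshev (Tchebycheff) aggregation of MOEA/D, which the embodiment below spells out; the function names and the tie-breaking with <= are illustrative assumptions.

```python
import numpy as np

def chebyshev(fx, lam, z_star):
    # g^te(x | lambda, z*) = max_k lambda_k * |f_k(x) - z*_k|
    return np.max(lam * np.abs(fx - z_star))

def eps_select(fx_i, fx_j, phi_i, phi_j, lam, z_star, eps_t):
    # returns True if individual i should enter the next generation
    if phi_i < eps_t and phi_j < eps_t:      # both within the relaxed threshold
        return chebyshev(fx_i, lam, z_star) <= chebyshev(fx_j, lam, z_star)
    return phi_i <= phi_j                    # otherwise the smaller violation wins
```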
Preferably, when it is judged that the maximum number of iterations t_end has not been reached, the steps are specifically:
using the difference between the next-generation population P_t and the previous-generation population P_{t-1} as the feedback reward of reinforcement learning, returning it to the policy network of deep reinforcement learning to update the loss function, and thereby training the policy network;
then continuing the iteration to update the population until the maximum number of iterations is reached.
Further, the calculation of the feedback reward is specifically as follows:
the difference between the new population P_t and the previous-generation population P_{t-1} is expressed by the inverted generational distance index IGD; the feedback reward is denoted R_t, and R_t is calculated according to the following formula:
R_t = abs(IGD_{t-1} - IGD_t)
where abs denotes the absolute-value operation, and:
IGD(A, P*) = ( Σ_{y*∈P*} d(y*, A) ) / |P*|, with d(y*, A) = min_{y∈A} √( Σ_{i=1}^m (y_i - y_i*)² )
where A denotes the solution set to be evaluated, P* denotes uniformly distributed sample individuals on the true Pareto front, d(y*, A) denotes the minimum Euclidean distance from y* to A, y* denotes an individual in P*, y denotes an individual in A, y_i and y_i* denote the data of each dimension of y and y*, the subscript i denotes the dimension, and m is the total dimension of the data.
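A sketch of the IGD computation and the resulting reward under these formulas, assuming A and P* are given as NumPy arrays of objective vectors:

```python
import numpy as np

def igd(a, p_star):
    # a: (na, m) objective vectors of the solution set A;
    # p_star: (ns, m) uniform samples on the true Pareto front P*
    dists = np.linalg.norm(p_star[:, None, :] - a[None, :, :], axis=2)
    return dists.min(axis=1).mean()          # mean of min distances d(y*, A)

def reward(a_new, a_prev, p_star):
    # R_t = abs(IGD_{t-1} - IGD_t)
    return abs(igd(a_prev, p_star) - igd(a_new, p_star))
```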
A storage medium is used to store a computer program arranged to perform any of the foregoing methods of solving a multi-objective constrained optimization problem.
Compared with the prior art, the technical scheme of the invention has the following beneficial effects:
compared with existing multi-objective optimization algorithms and differential evolution algorithms, the invention introduces deep reinforcement learning on the basis of the multi-objective evolutionary algorithm MOEA/D and designs an IGD-based reward feedback mechanism for it; the invention can adaptively adjust the sensitive parameters of the evolutionary algorithm (the difference factor and the constraint relaxation factor), effectively balance convergence and distribution during evolution, and adjust the attention paid to the constraint conditions at different stages of population evolution; this effectively solves the performance loss of multi-objective evolutionary algorithms caused by parameter sensitivity in constrained scenarios, realizes adaptive parameter adjustment, and achieves better performance.
Drawings
FIG. 1 is a general flow diagram of a method of solving a multi-objective constrained optimization problem according to the present invention;
FIG. 2 is a box plot illustrating the performance of the method of FIG. 1 on the LIR-CMOP1 test problem;
FIG. 3 is a box plot illustrating the performance of the method of FIG. 1 on the LIR-CMOP7 test problem.
Detailed Description
In order to make the objects, technical solutions and advantages of the present invention more apparent, the present invention will be described in further detail below with reference to the accompanying drawings and embodiments thereof. It should be understood that the specific embodiments described herein are merely illustrative of the invention and are not intended to limit the invention.
Example 1
For a multi-objective constrained optimization problem, it can be described in the form:
minimize F(x) = (f_1(x), f_2(x), …, f_m(x))
subject to g_u(x) ≤ 0, u = 1, …, p; h_v(x) = 0, v = 1, …, q
where F(x) is the objective vector, g_u(x) ≤ 0 is the uth inequality constraint, h_v(x) = 0 is the vth equality constraint, x = (x_1, x_2, …, x_D) ∈ R^D is the decision variable, and f_m(x) is the mth objective function.
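To make this form concrete, here is a hypothetical toy instance (m = 2, p = 1, q = 1) together with the violation aggregation Φ(x) = Σ g(x) + Σ c(x) used later in the method; the specific functions are invented for illustration only.

```python
import numpy as np

def objectives(x):
    # F(x) = (f1(x), f2(x)): a toy bi-objective function, m = 2
    return np.array([x[0], 1.0 - np.sqrt(abs(x[0])) + x[1] ** 2])

def violation(x):
    # Phi(x) aggregates max(0, g_u(x)) for g_u(x) <= 0 and |h_v(x)| for h_v(x) = 0
    g = [x[0] + x[1] - 1.5]                  # p = 1 inequality constraint
    h = [x[0] ** 2 + x[1] ** 2 - 1.0]        # q = 1 equality constraint
    return sum(max(0.0, gu) for gu in g) + sum(abs(hv) for hv in h)
```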
As shown in FIG. 1, the method for solving the multi-objective constrained optimization problem in this embodiment comprises the following specific steps:
S1, forming a search space from the raw data of the multi-objective constrained optimization problem, and randomly generating an initial population P_0 from the search space; the tth-generation population is denoted P_t = {x_1^t, x_2^t, …, x_N^t}, where N denotes the size of the population, t denotes the number of iterations, x denotes an individual in the population and P_t denotes the population;
then setting the maximum number of iterations for iterating the population, and calculating the objective function value F(x), the inequality constraint value g(x) and the equality constraint value h(x) of each individual in the initial population P_0;
s2, generating new difference factor by strategy network self-adaption using deep reinforcement learning
Figure BDA0003700827290000065
And a constraint relaxation factor epsilon t Wherein the difference factor
Figure BDA0003700827290000066
The subscript DE means deep reinforcement learning, CR means variation probability, F means a scaling factor, and the influence degree of differential disturbance on the generated test vector is determined by the value of F; the specific process is as follows:
constructing a Long Short-Term Memory artificial neural network (LSTM) as a strategy network for deep reinforcement learning to learn; the input of the long-short term memory artificial neural network is the objective function value F (x) of the current population individual, and the output is a differential factor
Figure BDA0003700827290000067
And a constraint relaxation factor epsilon t (ii) a The input and output of the long-short term memory artificial neural network for learning are shown as follows:
Figure BDA0003700827290000068
wherein the content of the first and second substances,
Figure BDA0003700827290000069
is the information of the current population,
Figure BDA00037008272900000610
including the objective function value F (x), the inequality constraint value g (x), the equality constraint value h (x) of all the individuals in the current population,
Figure BDA00037008272900000611
is the weight information of the LSTM,
Figure BDA0003700827290000071
is a cell in the LSTM, and is a cell,
Figure BDA0003700827290000072
is a hidden unit in the LSTM, and the meaning of the LSTM is the learning process of the long-term and short-term memory artificial neural network;
s3, starting from the initial population
Figure BDA0003700827290000073
Using difference factors
Figure BDA0003700827290000074
Evolving to generate new individuals, and combining the new individuals with the previous generation population
Figure BDA0003700827290000075
Forming a temporary population
Figure BDA0003700827290000076
The method comprises the following steps:
s31, starting from the initial population
Figure BDA0003700827290000077
From the previous generation population
Figure BDA0003700827290000078
Any one of the individuals
Figure BDA0003700827290000079
Of (2)
Figure BDA00037008272900000710
In, randomly selecting Lambda t-1 Two individuals in
Figure BDA00037008272900000711
And
Figure BDA00037008272900000712
will be provided with
Figure BDA00037008272900000713
Three individuals using difference factors
Figure BDA00037008272900000714
Performing operation based on the variation probability CR to generate new individuals
Figure BDA00037008272900000715
The process is shown as follows:
Figure BDA00037008272900000716
wherein rand (0,1) represents the probability between 0 and 1, N represents the total number of individuals in the population, M represents the total number of individuals in the neighborhood, i represents the number of individuals, j represents the number of individuals in the neighborhood, k represents the number of individuals in the neighborhood, i represents the number of individuals in the neighborhood, and j representing the jth individual, i, in the neighborhood of the ith individual j Representing taking the kth individual in the neighborhood of the ith individual;
s32, combining N newly generated individuals with the previous generation population
Figure BDA00037008272900000717
Composition of temporary populations of size 2N
Figure BDA00037008272900000718
S4, converting the multi-objective constrained optimization problem into single-objective optimization sub-problems with the decomposition-based multi-objective evolutionary algorithm MOEA/D, obtaining the Chebyshev value and the constraint value Φ(x) of each individual in the temporary population Q_t, and updating to the next-generation population; the Chebyshev value is calculated by the following formula:
g^te(x_i | λ^i, z*) = max_{1≤k≤m} λ_k^i · |f_k(x_i) - z_k*|
where g^te(x_i | λ^i, z*) is the Chebyshev value of the ith individual, z* = (z_1*, …, z_m*) is the set of ideal points, λ^i is the weight coefficient, f_k(x) is the objective function at the kth ideal point, x_i denotes the ith individual, and k indexes the ideal points;
the steps of obtaining the Chebyshev value and the constraint value Φ(x) and updating the population are as follows:
S41, for each single individual x of the population Q_t, recording the corresponding constraint value Φ(x) = Σ g(x) + Σ c(x), where g(x) is the inequality constraint value and c(x) is the equality constraint value;
S42, based on the obtained constraint relaxation factor ε_t, updating Q_t to a new population P_t of size N according to the ε-constraint rule, specifically:
from the neighborhood of an individual x_i, randomly selecting an individual x_j, calculating the Chebyshev value and the constraint value Φ(x) of each of the two, and comparing them;
if Φ(x_i) and Φ(x_j) are both less than the constraint relaxation factor ε_t, letting the one of the two with the smaller Chebyshev value enter the next-generation population P_t;
if Φ(x_i) and Φ(x_j) are not both less than the constraint relaxation factor ε_t, letting the one of the two with the smaller constraint value Φ(x) enter the next-generation population P_t;
S5, judging whether the current number of iterations has reached the termination condition of the set maximum number of iterations, i.e., whether the optimal solution has been obtained:
S51, if the set maximum number of iterations has not been reached, using the difference between the new population P_t and the previous-generation population P_{t-1} as the feedback reward of reinforcement learning, and returning it to the long short-term memory artificial neural network to update its loss function, thereby realizing the training of the network; then returning to execute steps S2 to S4 in sequence; the difference between the new population P_t and the previous-generation population P_{t-1} is expressed using the inverted generational distance (IGD) index;
the feedback reward is denoted R_t, and R_t is calculated according to the following formula:
R_t = abs(IGD_{t-1} - IGD_t)
where abs denotes the absolute-value operation, and:
IGD(A, P*) = ( Σ_{y*∈P*} d(y*, A) ) / |P*|, with d(y*, A) = min_{y∈A} √( Σ_{i=1}^m (y_i - y_i*)² )
in the above formula, A denotes the solution set to be evaluated, P* denotes uniformly distributed sample individuals on the true Pareto front, d(y*, A) denotes the minimum Euclidean distance from y* to A, y* denotes an individual in P*, y denotes an individual in A, y_i and y_i* denote the data of each dimension of y and y*, the subscript i denotes the dimension, and m is the total dimension of the data;
S52, if the set maximum number of iterations has been reached, outputting the population P_{t_end} representing the optimal solution of the multi-objective constrained optimization problem, where t_end is the preset maximum number of iterations.
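Pulling S1 to S5 together, the following condensed sketch strings the steps into one loop. It simplifies under several stated assumptions: a hypothetical toy problem, random neighborhoods and weight vectors, in-place replacement instead of the 2N temporary population, and the LSTM policy network stubbed out by a simple callable; the reward feedback of S51 is indicated only in a comment, since the toy problem's true Pareto front is not specified.

```python
import numpy as np

rng = np.random.default_rng(0)
N, D, M, T_END = 20, 2, 5, 50

def evaluate(x):
    # toy bi-objective F(x), invented for illustration
    return np.array([x[0], 1.0 - x[0] + x[1] ** 2])

def violation(x):
    # toy Phi(x) = max(0, g1(x)) + |h1(x)|, invented for illustration
    return max(0.0, x[0] + x[1] - 1.5) + abs(x[0] ** 2 + x[1] ** 2 - 1.0)

def chebyshev(fx, lam, z_star):
    return np.max(lam * np.abs(fx - z_star))

pop = rng.random((N, D))                                      # S1: initial P_0
lams = rng.dirichlet(np.ones(2), size=N)                      # weight vectors
nbrs = [rng.choice(N, size=M, replace=False) for _ in range(N)]
policy = lambda progress: (0.9, 0.5, max(0.0, 1.0 - progress))  # LSTM stub

for t in range(1, T_END + 1):
    cr, f, eps = policy(t / T_END)                            # S2: adaptive parameters
    z_star = np.min([evaluate(x) for x in pop], axis=0)       # ideal point z*
    for i in range(N):                                        # S3 + S4
        j, k = rng.choice(nbrs[i], size=2, replace=False)
        y = pop[i] + f * (pop[j] - pop[k]) if rng.random() <= cr else pop[i].copy()
        phi_y, phi_i = violation(y), violation(pop[i])
        if phi_y < eps and phi_i < eps:                       # epsilon-constraint rule
            keep_y = (chebyshev(evaluate(y), lams[i], z_star)
                      <= chebyshev(evaluate(pop[i]), lams[i], z_star))
        else:
            keep_y = phi_y <= phi_i
        if keep_y:
            pop[i] = y
    # S5/S51: R_t = abs(IGD_{t-1} - IGD_t) would be computed here and fed back
    # to train the policy network; omitted in this stub
```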
The inverted generational distance (IGD) index describes the convergence and distribution of an algorithm; the smaller the IGD, the better the convergence and distribution. FIG. 2 and FIG. 3 are box plots of the IGD values of the optimization results of this embodiment on the classical constrained multi-objective test problems LIR-CMOP1 and LIR-CMOP7, respectively, with the population size set to 300, the number of evolution generations to 500 and the number of runs to 30.
Compared with the prior art, the method for solving a multi-objective constrained optimization problem and its storage medium have the following advantages:
compared with existing multi-objective optimization algorithms and differential evolution algorithms, the method introduces deep reinforcement learning on the basis of the multi-objective evolutionary algorithm MOEA/D and designs an IGD-based reward feedback mechanism for it; the method adaptively adjusts the sensitive parameters of the evolutionary algorithm, namely the difference factor and the constraint relaxation factor, effectively balances convergence and distribution during evolution, adjusts the attention paid to the constraint conditions at different stages of population evolution, effectively solves the performance loss of multi-objective evolutionary algorithms caused by parameter sensitivity in constrained scenarios, realizes adaptive parameter adjustment, and achieves better performance.
Example 2
The storage medium of this embodiment stores a computer program for executing the method of solving a multi-objective constrained optimization problem of Embodiment 1; the computer program is retained in the storage medium in the form of data.
The storage medium of this embodiment is provided in a computer device; when a processing unit of the computer device executes the computer program of the method of solving a multi-objective constrained optimization problem of Embodiment 1, the data corresponding to the computer program are read from the storage medium of this embodiment.
The above embodiments are preferred embodiments of the present invention, but the present invention is not limited to them; any other change, modification, substitution, combination or simplification that does not depart from the spirit and principle of the present invention shall be regarded as an equivalent and is included in the scope of the present invention.

Claims (10)

1. A method for solving a multi-objective constrained optimization problem, characterized by comprising the following steps:
generating an initial population P_0 from the raw data of the multi-objective constrained optimization problem, then calculating the objective function value F(x) and the constraint value Φ(x) of each individual x in the initial population;
adaptively generating a new difference factor θ_DE^t and a constraint relaxation factor ε_t using the policy network of deep reinforcement learning, where t denotes the number of iterations;
starting the iteration from the initial population P_0, generating new individuals by evolution with the difference factor θ_DE^t, and combining the new individuals with the previous-generation population P_{t-1} to form a temporary population Q_t;
converting the multi-objective constrained optimization problem into single-objective optimization sub-problems based on the decomposition-based multi-objective evolutionary algorithm MOEA/D, obtaining the Chebyshev value and the constraint value Φ(x) of each individual in the temporary population Q_t, and then updating to a new-generation population based on the constraint relaxation factor ε_t;
judging whether the maximum number of iterations t_end has been reached; if so, taking the finally updated population as the optimal solution of the multi-objective constrained optimization problem; if not, using the difference between the new population and the previous-generation population as the feedback reward of reinforcement learning, returning it to the policy network of deep reinforcement learning to update the loss function, and then continuing the iteration until the maximum number of iterations t_end is reached.
2. The method for solving a multi-objective constrained optimization problem according to claim 1, wherein the process of generating the initial population P_0 and calculating the objective function value F(x) and the constraint value Φ(x) comprises:
forming a search space from the raw data of the multi-objective constrained optimization problem, and randomly generating an initial population P_0 = {x_1, x_2, …, x_N} from the search space, where N denotes the population size;
setting the maximum number of iterations t_end for iterating the population;
calculating the objective function value F(x), the inequality constraint value g(x) and the equality constraint value h(x) of each individual in the initial population P_0, where Φ(x) = Σ g(x) + Σ c(x), g(x) being the inequality constraint value and c(x) the equality constraint value.
3. The method for solving a multi-objective constrained optimization problem according to claim 1, wherein the policy network of deep reinforcement learning is specifically a long short-term memory artificial neural network, the input of the long short-term memory artificial neural network is the objective function values F(x) of the individuals in the population, and its output is the difference factor θ_DE^t = (CR, F) and the constraint relaxation factor ε_t, where CR is the mutation probability and F is the scaling factor.
4. The method for solving a multi-objective constrained optimization problem according to claim 3, wherein the input and output of the learning of the long short-term memory artificial neural network are as follows:
(θ_DE^t, ε_t, c_t, h_t) = LSTM(I_t, W, c_{t-1}, h_{t-1})
where I_t is the information of the current population, including the objective function value F(x), the inequality constraint value g(x) and the equality constraint value h(x) of all individuals in the current population, W is the weight information of the LSTM, c_{t-1} is the cell state of the LSTM, h_{t-1} is the hidden state of the LSTM, and LSTM denotes the learning process of the long short-term memory artificial neural network.
5. The method for solving a multi-objective constrained optimization problem according to claim 1, wherein the process of generating new individuals by evolution with the difference factor θ_DE^t and forming the temporary population Q_t comprises:
starting from the initial population P_0: randomly selecting two individuals from the neighborhood of any individual of the previous-generation population P_{t-1}, and operating on the three individuals with the difference factor θ_DE^t based on the mutation probability CR to generate a new individual;
combining the newly generated individuals with the previous-generation population P_{t-1} to form the temporary population Q_t.
6. The method for solving a multi-objective constrained optimization problem according to claim 5, wherein the process of generating new individuals is:
in the neighborhood Λ^{t-1} of any individual x_i^{t-1} of the previous-generation population P_{t-1}, randomly selecting two individuals x_{i_j}^{t-1} and x_{i_k}^{t-1}, and operating on the three individuals x_i^{t-1}, x_{i_j}^{t-1} and x_{i_k}^{t-1} with the difference factor θ_DE^t based on the mutation probability CR to generate a new individual y_i^t, as shown in the following formula:
y_i^t = x_i^{t-1} + F · (x_{i_j}^{t-1} - x_{i_k}^{t-1}) if rand(0,1) ≤ CR, otherwise y_i^t = x_i^{t-1}, with i = 1, …, N and j, k ∈ {1, …, M}
where rand(0,1) denotes a probability between 0 and 1, N denotes the total number of individuals in the population, M denotes the total number of individuals in the neighborhood, i is the index of the individual, j and k are indices within the neighborhood, x_{i_j}^{t-1} denotes the jth individual in the neighborhood of the ith individual, and x_{i_k}^{t-1} denotes the kth individual in the neighborhood of the ith individual.
7. The method for solving a multi-objective constrained optimization problem according to claim 1, wherein the process of updating to the next-generation population based on the decomposition-based multi-objective evolutionary algorithm MOEA/D is as follows:
from the neighborhood of an individual x_i of the temporary population Q_t, randomly selecting an individual x_j, calculating the Chebyshev value and the constraint value Φ(x) of each of the two, and comparing them;
if Φ(x_i) and Φ(x_j) are both less than the constraint relaxation factor ε_t, letting the one of the two with the smaller Chebyshev value enter the next-generation population P_t;
if Φ(x_i) and Φ(x_j) are not both less than the constraint relaxation factor ε_t, letting the one of the two with the smaller constraint value Φ(x) enter the next-generation population P_t;
where i denotes the ith individual of the temporary population Q_t and j denotes the jth individual in its neighborhood.
8. The method for solving a multi-objective constrained optimization problem according to claim 1, wherein, when it is judged that the maximum number of iterations t_end has not been reached, the steps are specifically:
using the difference between the next-generation population P_t and the previous-generation population P_{t-1} as the feedback reward of reinforcement learning, returning it to the policy network of deep reinforcement learning to update the loss function, and thereby training the policy network;
then continuing the iteration to update the population until the maximum number of iterations is reached.
9. The method for solving a multi-objective constrained optimization problem according to claim 8, wherein the calculation of the feedback reward is specifically:
the difference between the new population P_t and the previous-generation population P_{t-1} is expressed by the inverted generational distance index IGD; the feedback reward is denoted R_t, and R_t is calculated according to the following formula:
R_t = abs(IGD_{t-1} - IGD_t)
where abs denotes the absolute-value operation, and:
IGD(A, P*) = ( Σ_{y*∈P*} d(y*, A) ) / |P*|, with d(y*, A) = min_{y∈A} √( Σ_{i=1}^m (y_i - y_i*)² )
where A denotes the solution set to be evaluated, P* denotes uniformly distributed sample individuals on the true Pareto front, d(y*, A) denotes the minimum Euclidean distance from y* to A, y* denotes an individual in P*, y denotes an individual in A, y_i and y_i* denote the data of each dimension of y and y*, the subscript i denotes the dimension, and m is the total dimension of the data.
10. A storage medium for storing a computer program configured to execute the method for solving a multi-objective constrained optimization problem according to any one of claims 1 to 9.
CN202210688940.9A 2022-06-17 2022-06-17 Method for solving multi-objective constrained optimization problem and storage medium thereof Pending CN115081323A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202210688940.9A CN115081323A (en) 2022-06-17 2022-06-17 Method for solving multi-objective constrained optimization problem and storage medium thereof

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202210688940.9A CN115081323A (en) 2022-06-17 2022-06-17 Method for solving multi-objective constrained optimization problem and storage medium thereof

Publications (1)

Publication Number Publication Date
CN115081323A true CN115081323A (en) 2022-09-20

Family

ID=83254237

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202210688940.9A Pending CN115081323A (en) 2022-06-17 2022-06-17 Method for solving multi-objective constrained optimization problem and storage medium thereof

Country Status (1)

Country Link
CN (1) CN115081323A (en)

Cited By (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN116667467A (en) * 2023-08-01 2023-08-29 齐齐哈尔市君威节能科技有限公司 Intelligent control magnetic suspension breeze power generation capacity-increasing compensation device
CN116667467B (en) * 2023-08-01 2023-10-13 齐齐哈尔市君威节能科技有限公司 Intelligent control magnetic suspension breeze power generation capacity-increasing compensation device

Similar Documents

Publication Publication Date Title
Stoyanov et al. Empirical risk minimization of graphical model parameters given approximate inference, decoding, and model structure
CN111260030B (en) A-TCN-based power load prediction method and device, computer equipment and storage medium
Modares et al. Parameter estimation of bilinear systems based on an adaptive particle swarm optimization
CN110909926A (en) TCN-LSTM-based solar photovoltaic power generation prediction method
CN111260124A (en) Chaos time sequence prediction method based on attention mechanism deep learning
Han et al. Network traffic prediction using variational mode decomposition and multi-reservoirs echo state network
CN112884236B (en) Short-term load prediction method and system based on VDM decomposition and LSTM improvement
Wang et al. A compact constraint incremental method for random weight networks and its application
CN111832817A (en) Small world echo state network time sequence prediction method based on MCP penalty function
CN116542382A (en) Sewage treatment dissolved oxygen concentration prediction method based on mixed optimization algorithm
CN115081323A (en) Method for solving multi-objective constrained optimization problem and storage medium thereof
Liang et al. A wind speed combination forecasting method based on multifaceted feature fusion and transfer learning for centralized control center
Upadhyay et al. IIR system identification using differential evolution with wavelet mutation
CN112381591A (en) Sales prediction optimization method based on LSTM deep learning model
Yang Combination forecast of economic chaos based on improved genetic algorithm
CN116667322A (en) Power load prediction method based on phase space reconstruction and improved RBF neural network
CN116415177A (en) Classifier parameter identification method based on extreme learning machine
Sujamol et al. A genetically optimized method for weight updating in fuzzy cognitive maps
Ortelli et al. Faster estimation of discrete choice models via dataset reduction
CN114202063A (en) Fuzzy neural network greenhouse temperature prediction method based on genetic algorithm optimization
Karadede et al. A hierarchical soft computing model for parameter estimation of curve fitting problems
CN111639797A (en) Gumbel-softmax technology-based combined optimization method
Zhu et al. Application of Improved Deep Belief Network Based on Intelligent Algorithm in Stock Price Prediction
Phiromlap et al. A frequency-based updating strategy in compact genetic algorithm
Li et al. Macroeconomics modelling on UK GDP growth by neural computing

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination