CN115204062A - Reinforced hybrid differential evolution method and system for interplanetary exploration orbit design

Publication number
CN115204062A
Authority
CN
China
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Granted
Application number
CN202211118194.6A
Other languages
Chinese (zh)
Other versions
CN115204062B (en)
Inventor
彭雷
袁卓铭
戴光明
王茂才
宋志明
陈晓宇
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
China University of Geosciences
Original Assignee
China University of Geosciences

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F30/00 Computer-aided design [CAD]
    • G06F30/20 Design optimisation, verification or simulation
    • G06F30/27 Design optimisation, verification or simulation using machine learning, e.g. artificial intelligence, neural networks, support vector machines [SVM] or training a model
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F17/00 Digital computing or data processing equipment or methods, specially adapted for specific functions
    • G06F17/10 Complex mathematical operations
    • G06F17/16 Matrix or vector computation, e.g. matrix-matrix or matrix-vector multiplication, matrix factorization
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00 Computing arrangements based on biological models
    • G06N3/12 Computing arrangements based on biological models using genetic models
    • G06N3/126 Evolutionary algorithms, e.g. genetic algorithms or genetic programming
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F2111/00 Details relating to CAD techniques
    • G06F2111/08 Probabilistic or stochastic CAD

Landscapes

  • Engineering & Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • Theoretical Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Mathematical Physics (AREA)
  • Software Systems (AREA)
  • Health & Medical Sciences (AREA)
  • Biophysics (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • General Engineering & Computer Science (AREA)
  • Evolutionary Computation (AREA)
  • Data Mining & Analysis (AREA)
  • Pure & Applied Mathematics (AREA)
  • Mathematical Optimization (AREA)
  • Bioinformatics & Cheminformatics (AREA)
  • Bioinformatics & Computational Biology (AREA)
  • Mathematical Analysis (AREA)
  • Evolutionary Biology (AREA)
  • Computational Mathematics (AREA)
  • Artificial Intelligence (AREA)
  • Computing Systems (AREA)
  • Geometry (AREA)
  • Computational Linguistics (AREA)
  • General Health & Medical Sciences (AREA)
  • Molecular Biology (AREA)
  • Biomedical Technology (AREA)
  • Genetics & Genomics (AREA)
  • Physiology (AREA)
  • Computer Hardware Design (AREA)
  • Medical Informatics (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Algebra (AREA)
  • Databases & Information Systems (AREA)
  • Feedback Control In General (AREA)
  • Complex Calculations (AREA)

Abstract

The invention discloses a reinforced hybrid differential evolution method and system (RL_HDE) for interplanetary exploration orbit design. The method comprises the following: (1) RL_HDE uses the Q-Learning algorithm to adaptively control six different mutation strategies, which enhances the optimization ability of the algorithm. The global operator uses the LSHADE_EIG method, which modifies LSHADE_SPS_EIG, an algorithm from the international evolutionary computation competition (CEC 2015), by not using the SPS framework; (2) the reinforcement learning Q-Learning algorithm adaptively controls the trigger parameters ρ_{1,max} and ρ_{2,max}, balancing the exploration and exploitation capabilities of the algorithm. The beneficial effects of the invention are that the solving speed of the interplanetary exploration orbit optimization design is effectively improved and the calculation accuracy of the probe orbit is improved.

Description

Reinforced hybrid differential evolution method and system for interplanetary exploration orbit design
Technical Field
The invention relates to the field of interplanetary orbit exploration, and in particular to a reinforced hybrid differential evolution method and system for interplanetary exploration orbit design.
Background
The design and optimization of interplanetary exploration orbits is one of the key engineering problems of deep space exploration systems. Deep space exploration must take many complex factors into account, and when an interplanetary mission (in particular an asteroid exploration mission) has to choose among thousands of candidate bodies, the size of the problem grows rapidly. The search space typically has a large parameter space, strong nonlinearity, closely associated extreme points, and a global optimum whose basin of attraction is sensitive, all of which make interplanetary orbit design difficult. Existing deep space orbit optimization design methods have the following defects:
(1) Insufficient generality. They can only solve a single problem or several problems whose characteristics are consistent.
(2) Because of the strong nonlinearity, the closely associated extreme points, and the sensitivity of the basin of attraction of the global optimum, the algorithms have difficulty finding the optimal feasible solution, their search performance is unstable, and their robustness is poor. Existing optimization methods have not sufficiently studied how to use the implicit knowledge in deep space orbit data and the analytical knowledge of the problem when designing the method.
(3) High time cost. The MIDACO algorithm, which currently has relatively high generality, depends on supercomputing equipment. Even so, it still needs a long time (days to weeks) to find a good solution.
Disclosure of Invention
In order to solve the above technical problems, the invention provides a reinforced hybrid differential evolution method and system (English name: RL_HDE) for interplanetary exploration orbit design. It can effectively improve the solving speed of interplanetary exploration orbit optimization design and the calculation accuracy of the probe orbit, and it provides a new solution for China's orbit design for exploring distant bodies such as Jupiter, Saturn, and asteroids.
The method comprises the following steps:
S1, determining the probe deep space orbit design problem M to be solved;
S2, constructing the objective function f(x) of problem M, the decision vector x, and the upper boundary x_ub and lower boundary x_lb of the global search region;
S3, initializing the parameters for Q-learning: the learning rate α and the discount factor γ;
initializing the control parameters Bound_init and Bound_min for the boundary of the CMA-ES local search region;
initializing the maximum stagnation generation count ρ_{1,max} and the current stagnation generation count ρ_1 of the global operator LSHADE_EIG;
initializing the maximum stagnation generation count ρ_{2,max} and the current stagnation generation count ρ_2 of the local operator CMA-ES;
initializing the scale factor parameter ls_eval of the interior point method;
initializing the maximum number of objective function evaluations MAX_FES and the current number of evaluations FES;
the global operator LSHADE_EIG is used to preliminarily explore the entire search space and obtain a preliminary global optimal solution;
the stagnation generation count ρ_1 records the accumulated number of stagnations while the global operator LSHADE_EIG is solving;
the local operator is used to search further within the preliminary solution space, accelerating the solving of the objective function f(x);
the stagnation generation count ρ_2 records the accumulated number of stagnations while the local operator CMA-ES is solving;
initializing separately the Q-Tables that adaptively control the mutation strategy, the parameter ρ_{1,max}, and the parameter ρ_{2,max}, wherein each individual in the LSHADE_EIG operator initializes its own Q-Table to adaptively control the selection of the mutation strategy;
S4, adaptively updating the trigger parameter ρ_{1,max} with the Q-Learning algorithm;
S5, judging whether ρ_1 is less than ρ_{1,max} and FES is less than MAX_FES; if yes, proceeding to step S6; otherwise, the solution of the global search space is finished, the Q-Table matrix that adaptively controls the trigger parameter ρ_{1,max} is updated, and the method proceeds to step S10 to start local solving;
S6, adaptively selecting a mutation strategy with the Q-Learning algorithm;
S7, starting the global operator LSHADE_EIG and beginning the preliminary exploration of the entire search space;
S8, updating the Q-Table matrix that adaptively controls the mutation strategy;
S9, judging whether the optimal solution obtained by LSHADE_EIG improves on the global optimal solution x_gmin;
if yes, the stagnation generation count ρ_1 is set to zero and x_gmin is replaced by the optimal solution obtained by LSHADE_EIG; otherwise ρ_1 is incremented by 1. After ρ_1 is updated, the method returns to step S5;
S10, determining the local search space according to the optimal solution obtained by LSHADE_EIG and the control parameters Bound_init and Bound_min;
S11, in the local search space, adaptively updating the trigger parameter ρ_{2,max} with the Q-Learning algorithm;
S12, judging whether ρ_2 is less than ρ_{2,max} and FES is less than MAX_FES; if yes, proceeding to step S13; otherwise, the CMA-ES local search is finished, the Q-Table matrix that adaptively controls the trigger parameter ρ_{2,max} is updated, and the method proceeds to step S15;
S13, starting the local operator CMA-ES and beginning to solve in the local search space;
S14, judging whether the optimal solution obtained by CMA-ES improves on the global optimal solution x_gmin;
if yes, the stagnation generation count ρ_2 is set to zero and x_gmin is replaced by the optimal solution obtained by CMA-ES; otherwise ρ_2 is incremented by 1. After ρ_2 is updated, the method returns to step S12;
S15, judging whether the current number of evaluations FES is less than 0.75 MAX_FES; if so, returning to step S4. If FES is not less than MAX_FES, proceeding to step S16. If FES is greater than 0.75 MAX_FES and less than MAX_FES, updating the global optimal solution with the local operator interior point method: judging whether the optimal solution obtained by the interior point method improves on x_gmin, and if so, replacing x_gmin with the optimal solution obtained by the interior point method. Finally, updating the parameters of the local operator interior point method and proceeding to step S16.
S16, judging whether the current number of evaluations FES is greater than or equal to MAX_FES; if yes, the solving ends and the current x_gmin is the final result; if not, returning to step S4.
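The budget test in steps S15 and S16 can be sketched as a small routing function. This is a minimal sketch under stated assumptions, not the patented implementation: the names `next_phase` and `run_budget_demo` are hypothetical, and because the text does not say what happens when FES equals exactly 0.75 MAX_FES, the sketch assigns that boundary case to the interior point branch.

```python
def next_phase(fes: int, max_fes: int, refine_fraction: float = 0.75) -> str:
    """Decide what follows a finished CMA-ES local search (steps S15 and S16)."""
    if fes >= max_fes:
        return "stop"            # budget exhausted: x_gmin is the final result
    if fes < refine_fraction * max_fes:
        return "global"          # early in the run: go back to step S4
    # Assumption: FES == 0.75 * MAX_FES is also routed here.
    return "interior_point"      # late in the run: refine x_gmin, then step S16


def run_budget_demo(max_fes: int = 1000, cost_per_cycle: int = 100) -> list:
    """Trace the phases visited when every cycle costs a fixed number of evaluations."""
    fes, trace = 0, []
    while True:
        fes += cost_per_cycle    # stand-in for one global plus local search cycle
        phase = next_phase(fes, max_fes)
        trace.append((fes, phase))
        if phase == "stop":
            return trace
```

With the default values, the trace stays in the global branch up to 700 evaluations, switches to interior point refinement at 800 and 900, and stops at 1000.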
The system comprises:
a deep space orbit design problem construction module, configured to:
determine the probe deep space orbit design problem M to be solved; construct the objective function f(x) of problem M, the decision vector x, and the upper boundary x_ub and lower boundary x_lb of the global search region;
a deep space orbit design problem parameter initialization module, configured to:
initialize the parameters for Q-learning: the learning rate α and the discount factor γ;
initialize the control parameters Bound_init and Bound_min for the boundary of the CMA-ES local search region;
initialize the maximum stagnation generation count ρ_{1,max} and the current stagnation generation count ρ_1 of the global operator LSHADE_EIG;
initialize the maximum stagnation generation count ρ_{2,max} and the current stagnation generation count ρ_2 of the local operator CMA-ES;
initialize the scale factor parameter ls_eval of the interior point method;
initialize the maximum number of objective function evaluations MAX_FES and the current number of evaluations FES;
the global operator LSHADE_EIG is used to preliminarily explore the entire search space and obtain a preliminary global optimal solution;
the stagnation generation count ρ_1 records the accumulated number of stagnations while the global operator LSHADE_EIG is solving;
the local operator is used to search further within the preliminary solution space, accelerating the solving of the objective function f(x);
the stagnation generation count ρ_2 records the accumulated number of stagnations while the local operator CMA-ES is solving;
initialize separately the Q-Tables that adaptively control the mutation strategy, the parameter ρ_{1,max}, and the parameter ρ_{2,max}; each individual in the LSHADE_EIG operator initializes its own Q-Table to adaptively control the selection of the mutation strategy;
a deep space orbit design problem global solving module, configured to:
adaptively update the trigger parameter ρ_{1,max} with the Q-Learning algorithm;
judge whether ρ_1 is less than ρ_{1,max} and FES is less than MAX_FES; if yes, adaptively select a mutation strategy with the Q-Learning algorithm and start the global operator LSHADE_EIG; otherwise, update the Q-Table matrix that adaptively controls the trigger parameter ρ_{1,max} and enter the deep space orbit design problem local solving module;
start the global operator LSHADE_EIG and begin the preliminary exploration of the entire search space;
update the Q-Table matrix that adaptively controls the mutation strategy;
judge whether the optimal solution obtained by LSHADE_EIG improves on the global optimal solution x_gmin;
if yes, set the stagnation generation count ρ_1 to zero and replace x_gmin with the optimal solution obtained by LSHADE_EIG; otherwise, increment ρ_1 by 1; after ρ_1 is updated, return to the deep space orbit design problem global solving module;
a deep space orbit design problem local solving module, configured to:
determine the local search space according to the optimal solution obtained by LSHADE_EIG and the control parameters Bound_init and Bound_min;
in the local search space, adaptively update the trigger parameter ρ_{2,max} with the Q-Learning algorithm;
judge whether ρ_2 is less than ρ_{2,max} and FES is less than MAX_FES; if yes, start the local operator CMA-ES and begin to solve in the local search space; otherwise, update the Q-Table matrix that adaptively controls the trigger parameter ρ_{2,max} and enter the deep space orbit design problem convergence solving module;
judge whether the optimal solution obtained by CMA-ES improves on the global optimal solution x_gmin; if yes, set the stagnation generation count ρ_2 to zero and replace x_gmin with the optimal solution obtained by CMA-ES; otherwise, increment ρ_2 by 1; after ρ_2 is updated, return to the deep space orbit design problem local solving module;
a deep space orbit design problem convergence solving module, comprising:
a first solving part: judging whether the current number of evaluations FES is less than 0.75 MAX_FES; if yes, returning to the deep space orbit design problem global solving module; if FES is not less than MAX_FES, entering the second solving part; if FES is greater than 0.75 MAX_FES and less than MAX_FES, updating the global optimal solution with the local operator interior point method: judging whether the optimal solution obtained by the interior point method improves on x_gmin, and if so, replacing x_gmin with the optimal solution obtained by the interior point method; finally, updating the parameters of the local operator interior point method and entering the second solving part;
a second solving part: judging whether the current number of evaluations FES is greater than or equal to MAX_FES; if yes, the solving ends and the current x_gmin is the final result; if not, returning to the deep space orbit design problem global solving module.
The beneficial effects provided by the invention are as follows: the solving speed of the interplanetary exploration orbit optimization design can be effectively improved, and the calculation accuracy of the probe orbit is improved.
Drawings
FIG. 1 is a schematic flow diagram of the method of the present invention;
FIG. 2, FIG. 3, and FIG. 4 are detailed flowcharts of the method of the present invention, in which Q-learning adaptively controls the mutation strategy, the trigger parameter ρ_{1,max}, and the trigger parameter ρ_{2,max};
FIG. 5 is a schematic diagram of the division of the Q-Table matrix Q_DE of ρ_{1,max};
FIG. 6 is a schematic diagram of the division of the Q-Table matrix that adaptively controls the mutation strategy;
FIG. 7 is a schematic diagram of the division of the Q-Table matrix Q_CMA of ρ_{2,max}.
Detailed Description
To make the objects, technical solutions and advantages of the present invention more apparent, embodiments of the present invention will be further described with reference to the accompanying drawings.
Before describing the present application in detail, some basic concepts that will be mentioned later are introduced in advance.
(1) The maximum stagnation generation count ρ_{1,max} and the current stagnation generation count ρ_1 of the global operator LSHADE_EIG; the global operator LSHADE_EIG is used to preliminarily explore the entire search space and obtain a preliminary solution space;
the stagnation generation count ρ_1 records the accumulated number of stagnations while the global operator LSHADE_EIG is solving;
(2) The maximum stagnation generation count ρ_{2,max} and the current stagnation generation count ρ_2 of the local operator CMA-ES
The local operator is used to search further within the preliminary solution space, accelerating the solving of the objective function f(x); the stagnation generation count ρ_2 records the accumulated number of stagnations while the local operator CMA-ES is solving;
(3) The maximum number of objective function evaluations MAX_FES and the current number of evaluations FES
MAX_FES caps the number of objective function evaluations over the whole solving process, so that the process can terminate properly; FES, as the name implies, records how many times the objective function has been evaluated so far.
(4) Q-Learning algorithm and Q-Table matrix
Q-Learning is an off-policy learning algorithm that does not require a model of the environment, and it is widely used because it is simple to apply, converges quickly, and has a low computational cost. The main idea of Q-Learning is to use the immediate reward r and the current Q-value (a value in the Q-Table matrix) to evaluate taking action a in the next state s. The whole process is computed iteratively through the Q-Table (see the table below). Given a state set S = {s_1, s_2, ..., s_i} and an action set A = {a_1, a_2, ..., a_j}, the agent has i states and j actions in total. Q(s_t, a_t) denotes the expected cumulative future return of selecting action a_t in the current state s_t. In each interaction, the agent is in a state s_t and selects the best action a_t to perform; after the action is taken, the environment gives a reward r_{t+1} and the agent transitions to a new state s_{t+1}. Over these iterations, the agent learns from past experience to form an estimate Q(s_t, a_t) of the expected value of each action.
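As an illustration of the Q-Table described above, the following minimal sketch stores one row per state and one column per action and looks up the best action for a state. The class name `QTable` and its methods are hypothetical and are not taken from the patent.

```python
class QTable:
    """A states-by-actions table of Q-values, all starting at zero."""

    def __init__(self, n_states: int, n_actions: int) -> None:
        # One row per state s, one column per action a.
        self.values = [[0.0] * n_actions for _ in range(n_states)]

    def get(self, state: int, action: int) -> float:
        return self.values[state][action]

    def set(self, state: int, action: int, value: float) -> None:
        self.values[state][action] = value

    def best_action(self, state: int) -> int:
        """Index of the action with the largest Q-value (first one on ties)."""
        row = self.values[state]
        return max(range(len(row)), key=row.__getitem__)
```

A table with six states and seven actions, for example, would be created as `QTable(6, 7)`.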
Referring to fig. 1, fig. 1 is a simplified flow chart of the method of the present invention.
The method mainly uses a Q-Learning-based hybrid differential evolution algorithm to solve the orbit design problem of a deep space probe. For the problem to be solved, the corresponding population parameters are initialized, the global operator LSHADE_EIG performs iterative evolutionary solving, and a local search space is obtained within the global search space. Then, in the local search space, the local operator CMA-ES performs iterative evolutionary solving to obtain a more accurate solution space. Finally, within this more accurate solution space, the interior point method updates the optimal solution until the iteration condition or the maximum number of objective function evaluations is reached, which completes the solving of the objective function.
Please refer to fig. 2, fig. 3 and fig. 4.
Fig. 2, fig. 3 and fig. 4 are detailed flowcharts of the method of the present invention. The invention provides a reinforced hybrid differential evolution method for interplanetary exploration orbit design, which specifically comprises the following steps:
S1, determining the probe deep space orbit design problem M to be solved;
It should be noted that the deep space orbit design problem M may be, for example, calculating the accumulated velocity change ΔV of the probe during a deep space exploration mission, or the accumulated energy change. The present application does not limit the specific problem; this is only a schematic illustration.
S2, constructing the objective function f(x) of problem M, the decision vector x, and the upper boundary x_ub and lower boundary x_lb of the global search region; it should be noted that the decision vector x has dimension D;
S3, initializing the parameters for Q-learning: the learning rate α and the discount factor γ; as an example, in this application the learning rate α is initialized to 0.1 and the discount factor γ is initialized to 0.9;
initializing the control parameters Bound_init and Bound_min for the boundary of the CMA-ES local search region; as an example, in this application Bound_init is initialized to 0.5 and Bound_min is initialized to 0.1;
initializing the maximum stagnation generation count ρ_{1,max} and the current stagnation generation count ρ_1 of the global operator LSHADE_EIG; as an example, in this application ρ_{1,max} is initialized to 20 and ρ_1 is initialized to 0;
initializing the maximum stagnation generation count ρ_{2,max} and the current stagnation generation count ρ_2 of the local operator CMA-ES;
as an example, in this application ρ_{2,max} is initialized to 10 and ρ_2 is initialized to 0;
initializing the scale factor parameter ls_eval of the interior point method; as an example, in this application ls_eval is initialized to 0.01;
initializing the maximum number of objective function evaluations MAX_FES and the current number of evaluations FES;
the global operator LSHADE_EIG is used to preliminarily explore the entire search space and obtain a preliminary solution space;
the stagnation generation count ρ_1 records the accumulated number of stagnations while the global operator LSHADE_EIG is solving;
the local operator is used to search further within the preliminary solution space, accelerating the solving of the objective function f(x);
the stagnation generation count ρ_2 records the accumulated number of stagnations while the local operator CMA-ES is solving;
S4, adaptively updating the trigger parameter ρ_{1,max} with the Q-Learning algorithm;
It should be noted that the Q-Table matrix Q_DE, which Q-Learning uses to adaptively control the trigger parameter ρ_{1,max}, is initialized. Q_DE is divided according to the states of the first parameter Sc_DE1 and the first parameter f_DE1, which combine into six population evolution states, and it includes seven preset first action update values. The population evolution states are the rows of the matrix Q_DE and the first action update values are its columns. The first parameter Sc_DE1 has three states and the first parameter f_DE1 has two states.
Referring to fig. 5, fig. 5 is a schematic diagram of the states after the matrix is divided; the six combined states are obtained by combining the states of the first parameter Sc_DE1 with the states of the first parameter f_DE1;
The first parameter Sc_DE1 and the first parameter f_DE1 satisfy the following formulas:
[Equation image not reproduced] (4.1)
[Equation image not reproduced] (4.2)
where X_DE is the final population obtained after the LSHADE_EIG operator has evolved, X_0 is the initial population from which the LSHADE_EIG operator starts to evolve, the diversity function evaluates the population diversity of the input population, and the avg_fitness function calculates the average fitness of the individuals in the input population:
$$diversity(X) = \frac{1}{L \cdot NP} \sum_{i=1}^{NP} \sqrt{\sum_{j=1}^{D} \left(x_{j,i} - \bar{x}_j\right)^2} \qquad (4.3)$$
$$avg\_fitness(X) = \frac{1}{NP} \sum_{i=1}^{NP} f(x_i) \qquad (4.4)$$
where L is the length of the diagonal of the search space S ⊂ R^D, NP is the population size, f(x_i) is the objective function value of individual x_i, \bar{x}_j is the mean of the j-th dimension variable over all individuals in the population, and x_{j,i} is the value of the j-th dimension variable of the i-th individual in the population;
the seven preset first action update values are -5, -3, -1, 0, 1, 3, and 5;
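The two population measures used to define the states can be sketched as follows. This is a hedged sketch, not the patent's code: it assumes that diversity is the mean distance of the individuals from the population centroid, normalized by the diagonal length L of the search box, which is the form suggested by the symbol definitions above. The function names and the list-of-lists population layout are illustrative choices.

```python
import math


def diversity(population, lower, upper):
    """Mean distance to the centroid, divided by the search-box diagonal L (assumed form)."""
    n = len(population)
    dims = len(lower)
    diagonal = math.sqrt(sum((u - l) ** 2 for l, u in zip(lower, upper)))
    centroid = [sum(ind[j] for ind in population) / n for j in range(dims)]
    total = sum(
        math.sqrt(sum((ind[j] - centroid[j]) ** 2 for j in range(dims)))
        for ind in population
    )
    return total / (diagonal * n)


def avg_fitness(population, objective):
    """Average objective value over all individuals in the population."""
    return sum(objective(ind) for ind in population) / len(population)
```

For two individuals (0, 0) and (2, 0) in the box [0, 3] x [0, 4], the diagonal is 5 and each individual lies at distance 1 from the centroid, so the diversity is 0.2.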
S5, judging whether ρ_1 is less than ρ_{1,max} and FES is less than MAX_FES; if yes, proceeding to step S6; otherwise, the solution of the global search space is finished, the Q-Table matrix that adaptively controls the trigger parameter ρ_{1,max} is updated, and the method proceeds to step S10 to start local solving;
Note that the adaptive selection of the value of ρ_{1,max} in S4 mainly proceeds as follows:
the probability of selecting each of the preset first actions in the current state is calculated according to formula (5.1), and one action is randomly selected for execution according to these probabilities.
Here p(s_i, a_j) is the probability of selecting action a_j in state s_i, Q(s_i, a_j) is the Q-value in the Q-Table for selecting action a_j in state s_i, and n is the number of action types.
[Equation image not reproduced] (5.1)
$$\rho_{1,max} = \rho_{1,max} + \varepsilon_1 \qquad (5.2)$$
where ε_1 is one of the seven first action update values.
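The probabilistic action choice and the update of ρ_{1,max} can be sketched as below. Two points are assumptions and are not confirmed by the text: the probabilities are computed with a softmax over the Q-values of the current state (formula (5.1) is not legible in the source), and the updated value is floored at 1 so the trigger never becomes non-positive. All function names are hypothetical.

```python
import math
import random

FIRST_ACTIONS = (-5, -3, -1, 0, 1, 3, 5)  # the seven first action update values


def action_probabilities(q_row):
    """Softmax over one Q-Table row (an assumed form of formula (5.1))."""
    peak = max(q_row)  # subtract the maximum for numerical stability
    weights = [math.exp(q - peak) for q in q_row]
    total = sum(weights)
    return [w / total for w in weights]


def choose_action(q_row, rng):
    """Draw an action index at random according to its selection probability."""
    probs = action_probabilities(q_row)
    return rng.choices(range(len(q_row)), weights=probs, k=1)[0]


def update_rho_max(rho_max, q_row, rng, minimum=1):
    """Add the chosen update value to rho_max; the floor is an assumption."""
    action = choose_action(q_row, rng)
    return max(minimum, rho_max + FIRST_ACTIONS[action]), action
```

With an all-zero row every action is equally likely, so early in a run the trigger parameter changes by a uniformly chosen update value.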
If ρ_1 is greater than or equal to ρ_{1,max}, the global search ends and the reward is assigned according to formula (5.3); at this time x_gmin is the global optimal solution;
[Equation image not reproduced] (5.3)
Q_DE is then updated according to formula (8.1);
S6, adaptively selecting a mutation strategy with the Q-Learning algorithm;
It should be noted that the Q-Table matrix that adaptively controls the mutation strategy is Q_strategy. Q_strategy is divided according to the states of the second parameter Sc_DE2 and the second parameter f_DE2, which combine into twenty population evolution states, and it includes six mutation strategies. The population evolution states are the rows of the matrix Q_strategy and the mutation strategies are its columns. The second parameter Sc_DE2 has five states and the second parameter f_DE2 has four states. The six mutation strategies are shown in Table 1.
Table 1. Six different mutation strategies
[Table image not reproduced]
Here i ≠ r_1 ≠ r_2 ≠ r_3 ≠ r_4; x_{r1,G}, x_{r3,G}, and x_{r4,G} are individuals randomly selected from the population, and x_{r2,G} is an individual randomly selected from the population together with the external archive. The external archive protects the diversity of the population and stores the parent vectors that fail in the selection process. x_{best,G} is the best individual in the population, and x_{pbest,G} is one of the top-p ranked individuals in the population.
Referring to fig. 6, fig. 6 is a schematic diagram of the Q-Table matrix that adaptively controls the mutation strategy after it is divided. This step divides the population into twenty states based on the state of the second parameter Sc_DE2 and the value of the second parameter f_DE2.
The second parameter Sc_DE2 and the second parameter f_DE2 satisfy the following formulas:
[Equation image not reproduced] (6.1)
[Equation image not reproduced] (6.2)
where
[Equation image not reproduced] (6.3)
[Equation image not reproduced] (6.4)
The relevant parameters in these formulas are defined as above, and X_G is the G-th generation population in the evolution of the LSHADE_EIG operator;
S7, starting the global operator LSHADE_EIG and beginning the preliminary exploration of the entire search space;
NP individuals (solution vectors of M) are randomly generated within the range between the lower boundary x_lb and the upper boundary x_ub; these individuals x_i together form the initial population X_0. Population initialization satisfies formula (7.1):
$$x_{j,i} = x_{lb,j} + rand_{i,j}(0,1) \cdot \left(x_{ub,j} - x_{lb,j}\right) \qquad (7.1)$$
where rand_{i,j}(0, 1) is a random variable distributed in the range [0, 1];
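The random initialization of the population within the bounds can be sketched in a few lines. This is a minimal sketch that assumes the random variable is drawn uniformly; the function name `init_population` and the list-of-lists layout (one inner list per individual) are illustrative choices, not taken from the patent.

```python
import random


def init_population(lower, upper, pop_size, rng):
    """Create pop_size individuals, each component drawn between its lower and upper bound."""
    return [
        [lo + rng.random() * (hi - lo) for lo, hi in zip(lower, upper)]
        for _ in range(pop_size)
    ]


# Example: five individuals in a three-dimensional box, with a fixed seed.
population = init_population([0.0, -1.0, 10.0], [1.0, 1.0, 20.0], 5, random.Random(42))
```

Passing an explicit `random.Random` instance keeps the sketch reproducible without touching the global random state.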
the probability of selecting each mutation strategy in the current state is calculated according to formula (5.1), one mutation strategy is selected for execution according to these probabilities, and the mutation operation is applied to the population;
after mutation is finished, the EIG crossover operator performs the crossover operation on the population to generate trial individuals;
the selection operation is applied to the resulting trial individuals and the reward is assigned according to formula (7.2), where u_i is the individual obtained from individual x_i after mutation and crossover.
[Equation image not reproduced] (7.2)
S8, updating the Q-Table matrix that adaptively controls the mutation strategy;
Q_strategy is updated according to formula (8.1), where α is the learning rate and γ is the discount factor. r_{t+1} is the reward the agent obtains after performing action a_t, s_{t+1} is the state the agent moves to after performing action a_t in state s_t, and max_a Q(s_{t+1}, a) is the largest Q-value in Q_strategy for state s_{t+1};
$$Q(s_t, a_t) \leftarrow Q(s_t, a_t) + \alpha \left[ r_{t+1} + \gamma \max_a Q(s_{t+1}, a) - Q(s_t, a_t) \right] \qquad (8.1)$$
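The Q-Table update can be sketched as a single function. This is a minimal sketch using the standard Q-Learning update rule with the example values α = 0.1 and γ = 0.9 given in step S3; the function name `q_update` and the nested-list table layout are hypothetical.

```python
def q_update(q, state, action, reward, next_state, alpha=0.1, gamma=0.9):
    """Apply one Q-Learning update in place and return the new Q-value."""
    # Target: immediate reward plus the discounted best value of the next state.
    target = reward + gamma * max(q[next_state])
    q[state][action] += alpha * (target - q[state][action])
    return q[state][action]


# Example: two states, two actions; taking action 1 in state 0 yields reward 1.0
# and leads to state 1, whose best Q-value is 2.0.
q_table = [[0.0, 0.0], [1.0, 2.0]]
new_value = q_update(q_table, state=0, action=1, reward=1.0, next_state=1)
```

In this example the target is 1.0 + 0.9 x 2.0 = 2.8, so the entry moves one tenth of the way from 0 toward 2.8.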
S9. Determine whether the condition [formula image 001] holds, where [formula image 002] is the best solution obtained by LSHADE_EIG and $x_{gmin}$ is the global optimal solution.
If it holds, the stagnation count $\rho_1$ is reset to zero and $x_{gmin}$ is replaced by [formula image 002]; otherwise, $\rho_1$ is incremented by 1. After $\rho_1$ is updated, return to step S5.
$\rho_1$ is updated according to formula (9.1):
[formula image 023] (9.1)
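The stagnation bookkeeping of steps S5 and S9 can be sketched as below. The comparison itself is given only as an image in the source, so the test `f_best < f_gmin` (minimization, strict improvement) is an assumption, and the helper names are hypothetical.

```python
def update_stagnation(x_best, f_best, x_gmin, f_gmin, rho):
    """Compare an operator's best solution with the global best.

    Returns the (possibly replaced) global best, its objective value,
    and the updated stagnation count.
    """
    if f_best < f_gmin:             # assumed improvement test (minimization)
        return x_best, f_best, 0    # improvement: reset the stagnation count
    return x_gmin, f_gmin, rho + 1  # no improvement: count one more stagnation

def keep_running(rho, rho_max, fes, max_fes):
    """Loop condition of step S5: continue while both budgets remain."""
    return rho < rho_max and fes < max_fes

# Hypothetical values for illustration
x_gmin, f_gmin, rho_1 = [0.2, 0.4], 5.0, 3
x_gmin, f_gmin, rho_1 = update_stagnation([0.1, 0.3], 4.0, x_gmin, f_gmin, rho_1)
improved_state = (x_gmin, f_gmin, rho_1)
x_gmin, f_gmin, rho_1 = update_stagnation([0.9, 0.9], 7.0, x_gmin, f_gmin, rho_1)
stalled_state = (x_gmin, f_gmin, rho_1)
```

The same bookkeeping applies to $\rho_2$ and CMA-ES in step S14, with the loop condition of step S12.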
S10. Determine the local search space according to [formula image 002] and the control parameters $Bound_{init}$ and $Bound_{min}$.
The search space of CMA-ES is determined according to formulas (10.1), (10.2) and (10.3):
[formula image 024] (10.1)
[formula image 025] (10.2)
[formula image 026] (10.3)
where $x_{LSlb}$ and $x_{LSub}$ are the minimum and maximum boundary vectors of the CMA-ES local search space, respectively, and $x_{lb}$ and $x_{ub}$ are the minimum and maximum boundary vectors of the global search space, respectively. [formula image 002] is the best solution obtained by LSHADE_EIG. $Bound_{init}$ and $Bound_{min}$ are the control parameters of the local search space, initialized to 0.5 and 0.1, respectively. $Bound$ is a scale factor that controls the degree of scaling of $x_{LSlb}$ and $x_{LSub}$.
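Formulas (10.1) to (10.3) are not reproduced in this text, so the following is only one plausible way to build a shrinking local box around the best LSHADE_EIG solution. Both the linear schedule for `bound` (from 0.5 down to 0.1 as FES approaches MAX_FES) and the clipping of the box to the global bounds are illustrative assumptions, not the patent's formulas; the function name is hypothetical.

```python
def local_search_box(x_best, x_lb, x_ub, fes, max_fes,
                     bound_init=0.5, bound_min=0.1):
    """Return (x_LSlb, x_LSub, bound) for a box centred on x_best."""
    progress = min(max(fes / max_fes, 0.0), 1.0)
    # Assumed schedule: scale factor shrinks linearly with the spent budget
    bound = bound_init - (bound_init - bound_min) * progress
    ls_lb, ls_ub = [], []
    for xb, lb, ub in zip(x_best, x_lb, x_ub):
        half_width = bound * (ub - lb)
        # Keep the local box inside the global search space
        ls_lb.append(max(lb, xb - half_width))
        ls_ub.append(min(ub, xb + half_width))
    return ls_lb, ls_ub, bound

# Hypothetical two-dimensional problem
early = local_search_box([0.9, 5.0], [0.0, 0.0], [1.0, 10.0], fes=0, max_fes=1000)
late = local_search_box([0.9, 5.0], [0.0, 0.0], [1.0, 10.0], fes=1000, max_fes=1000)
```

Early in the run the box covers most of the global space; late in the run it is narrow, which matches the stated role of $Bound$ as a scale factor on the local boundaries.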
S11. In the local search space, adaptively update the trigger parameter $\rho_{2,\max}$ using the Q-Learning algorithm.
S12. Determine whether $\rho_2$ is less than $\rho_{2,\max}$ and FES is less than MAX_FES. If so, go to step S13; otherwise, update the Q-Table matrix that adaptively controls the trigger parameter $\rho_{2,\max}$ and go to step S15, which marks the end of the CMA-ES local search.
In step S11, the probability of selecting each action in the current state is calculated according to formula (5.1); one action is then selected at random according to these probabilities and executed, and $\rho_{2,\max}$ is updated by formula (12.1), where $\varepsilon_2$ is one of the five second-action update values:
[formula image 027] (12.1)
The Q-Table matrix of the parameter $\rho_{2,\max}$ is $Q_{CMA}$. $Q_{CMA}$ is divided according to the states of the parameter $Suc_{CMA}$ and the parameter $Ratio_{CMA}$, which combine to give eight population evolution states, and it includes five preset second-action update values. The population evolution states form the rows of $Q_{CMA}$ and the second-action update values form its columns. $Suc_{CMA}$ has two states and $Ratio_{CMA}$ has four states.
Referring to FIG. 7, FIG. 7 is a schematic diagram of the division of the Q-Table matrix $Q_{CMA}$ of $\rho_{2,\max}$.
$Suc_{CMA}$ and $Ratio_{CMA}$ are calculated by formulas (12.2) and (12.3). The five second-action update values used to adaptively update $\rho_{2,\max}$ are -5, -3, 0, 1 and 2:
[formula image 028] (12.2)
[formula image 029] (12.3)
If $\rho_2$ is greater than or equal to $\rho_{2,\max}$, the CMA-ES local search ends and the reward is assigned according to formula (12.4), where $x_{gmin}$ is the global optimal solution:
[formula image 030] (12.4)
$Q_{CMA}$ is then updated according to formula (8.1).
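The layout of $Q_{CMA}$ and the adaptation of $\rho_{2,\max}$ can be sketched as follows. The 8 × 5 shape and the action values -5, -3, 0, 1, 2 come from the text. Three things are assumptions made for illustration: the row ordering in `state_index`, the softmax rule standing in for formula (5.1), and the additive update with a lower limit of 1 standing in for formula (12.1). All function names are hypothetical.

```python
import math
import random

ACTIONS = [-5, -3, 0, 1, 2]  # the five second-action update values

def state_index(suc_state, ratio_state):
    """Map (Suc_CMA state in 0..1, Ratio_CMA state in 0..3) to a row 0..7."""
    return suc_state * 4 + ratio_state

def select_action(q_row, rng):
    """Pick a column at random with softmax probabilities (assumed rule)."""
    top = max(q_row)
    weights = [math.exp(q - top) for q in q_row]  # shift by max for stability
    return rng.choices(range(len(q_row)), weights=weights, k=1)[0]

def update_rho_max(rho_max, action_idx, floor=1):
    """Assumed form of (12.1): add the chosen update value, never below floor."""
    return max(floor, rho_max + ACTIONS[action_idx])

Q_CMA = [[0.0] * len(ACTIONS) for _ in range(8)]  # 8 states x 5 actions
rng = random.Random(1)
row = state_index(suc_state=1, ratio_state=2)
action = select_action(Q_CMA[row], rng)
rho_2_max = update_rho_max(10, action)
```

After CMA-ES stops, the entry `Q_CMA[row][action]` would be updated with the reward of formula (12.4) using the Q-learning rule of formula (8.1).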
S13. Start the local operator CMA-ES and begin solving in the local search space.
S14. Determine whether the condition [formula image 003] holds, where [formula image 004] is the best solution obtained by CMA-ES and $x_{gmin}$ is the global optimal solution.
If it holds, the stagnation count $\rho_2$ is reset to zero and $x_{gmin}$ is replaced by [formula image 004]; otherwise, $\rho_2$ is incremented by 1. After $\rho_2$ is updated, return to step S12.
Steps S12 to S14 proceed as follows:
Initialize the parameters; the initial values satisfy formula (14.1):
[formula image 031] (14.1)
where $E_n$ is the identity matrix of order $n$ and $n$ is the dimension of the problem.
New individuals are generated from the normal distribution [formula image 032]; see formula (14.2):
[formula image 033] (14.2)
The individuals in the population are sorted by fitness, and the best $\mu$ individuals are used to update the mean vector $m$ according to formula (14.3), where [formula image 034] and [formula image 035], and [formula image 036] is calculated according to formula (14.4):
[formula image 037] (14.3)
[formula image 038] (14.4)
Update the step size [formula image 039] and the covariance matrix $C_{t+1}$.
$\rho_2$ is updated according to formula (14.5), where [formula image 040] is the best solution obtained by CMA-ES and $x_{gmin}$ is the global optimal solution:
[formula image 041] (14.5)
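The sampling and mean-update part of one CMA-ES generation can be sketched in plain Python as below. It follows the textbook scheme: sample from $m + \sigma\,N(0, C)$, sort by fitness, and recombine the best $\mu$ individuals with positive weights that sum to one. The log-decreasing weights are a common CMA-ES choice and are assumed here, since the patent's weight formulas are not reproduced; the step-size and covariance updates are omitted. All names and the sphere test function are hypothetical.

```python
import math
import random

def cholesky(C):
    """Lower-triangular L with L L^T = C, for a symmetric positive definite C."""
    n = len(C)
    L = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1):
            s = sum(L[i][k] * L[j][k] for k in range(j))
            if i == j:
                L[i][j] = math.sqrt(C[i][i] - s)
            else:
                L[i][j] = (C[i][j] - s) / L[j][j]
    return L

def sample_population(m, sigma, C, lam, rng):
    """Draw lam individuals from the normal distribution m + sigma * N(0, C)."""
    n, L = len(m), cholesky(C)
    pop = []
    for _ in range(lam):
        z = [rng.gauss(0.0, 1.0) for _ in range(n)]
        y = [sum(L[i][k] * z[k] for k in range(i + 1)) for i in range(n)]
        pop.append([m[i] + sigma * y[i] for i in range(n)])
    return pop

def recombination_weights(mu):
    """Assumed log-decreasing weights, normalised to sum to one."""
    raw = [math.log(mu + 0.5) - math.log(i) for i in range(1, mu + 1)]
    total = sum(raw)
    return [w / total for w in raw]

def update_mean(pop, fitness, mu):
    """Weighted mean of the mu best individuals (lower fitness is better)."""
    order = sorted(range(len(pop)), key=lambda k: fitness[k])
    w = recombination_weights(mu)
    return [sum(w[r] * pop[order[r]][i] for r in range(mu))
            for i in range(len(pop[0]))]

rng = random.Random(0)
identity = [[1.0, 0.0], [0.0, 1.0]]  # C starts as E_n, as stated for (14.1)
pop = sample_population([1.0, 1.0], 0.5, identity, lam=8, rng=rng)
fitness = [sum(v * v for v in x) for x in pop]  # toy sphere objective
new_mean = update_mean(pop, fitness, mu=4)
```

Because the weights are positive and sum to one, each component of `new_mean` lies between the smallest and largest value of that component among the four selected individuals.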
S15. Determine whether the current number of objective-function evaluations FES is less than 0.75 × MAX_FES; if so, return to step S4. If FES is not less than MAX_FES, go to step S16. If FES is greater than 0.75 × MAX_FES and less than MAX_FES, update the global optimal solution using the local operator, the interior point method: determine whether the condition [formula image 005] holds and, if so, replace $x_{gmin}$ with [formula image 006], where [formula image 006] is the best solution obtained by the interior point method. Finally, update the interior point method parameter and go to step S16.
S16. Determine whether FES is greater than or equal to MAX_FES. If so, the solution process ends and the current $x_{gmin}$ is the final result; if not, return to step S4.
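The branching of steps S15 and S16 can be sketched as a small dispatcher. The function and the `interior_point` callback are hypothetical; the callback is assumed to refine $x_{gmin}$ and return the updated evaluation count. The text does not say what happens when FES equals exactly 0.75 × MAX_FES, so that case is treated here as part of the interior point branch.

```python
def convergence_step(fes, max_fes, interior_point):
    """Return (next_step, fes) after steps S15 and S16."""
    # S15: below 75 % of the budget, go straight back to the global phase
    if fes < 0.75 * max_fes:
        return "S4", fes
    # S15: in the last quarter of the budget, refine with the interior point method
    if fes < max_fes:
        fes = interior_point(fes)
    # S16: stop once the evaluation budget is exhausted
    if fes >= max_fes:
        return "done", fes
    return "S4", fes

# Hypothetical refinement that costs 50 evaluations per call
spend_50 = lambda fes: fes + 50
early = convergence_step(100, 1000, spend_50)
refine = convergence_step(800, 1000, spend_50)
finish = convergence_step(980, 1000, spend_50)
exhausted = convergence_step(1000, 1000, spend_50)
```

In this sketch the interior point method only runs during the final quarter of the evaluation budget, which is how the text confines the more expensive local refinement to the convergence stage.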
The local operator used in steps S15 and S16, the interior point method, proceeds as follows:
Take an initial penalty factor $\mu_0$ and an allowable error $\varepsilon > 0$.
Take an initial point $x_0$ in the feasible region and set $k = 1$, where $x_0$ is the current global optimal solution.
Construct the penalty function [formula image 042]; in iteration $k$, solve for the point $x_k$ starting from $x_{k-1}$.
If the termination condition is satisfied, $x_k$ is taken as the optimal solution; otherwise, go to the next step:
[formula image 043]
The interior point method parameter ls_eval is updated according to formula (16.1), where $x_{gmin}$ is the global optimal solution:
[formula image 044] (16.1)
The solution process ends when [formula image 045] holds.
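The interior point procedure above can be illustrated on a one-dimensional toy problem. The patent's penalty function, termination test and penalty-factor update are given only as images, so this sketch assumes a logarithmic barrier $\varphi(x, \mu) = f(x) - \mu[\ln(x - lb) + \ln(ub - x)]$, a geometric reduction of $\mu$, and termination once $\mu$ is no greater than $\varepsilon$. The name `barrier_minimize_1d` and all parameter names are hypothetical.

```python
import math

def barrier_minimize_1d(f, df, d2f, lb, ub, x0,
                        mu0=1.0, shrink=0.1, eps=1e-6, inner_iters=100):
    """Minimise f on (lb, ub) with a log barrier and damped Newton steps."""
    x, mu = float(x0), float(mu0)
    while mu > eps:                       # assumed termination: mu <= eps
        phi = lambda z: f(z) - mu * (math.log(z - lb) + math.log(ub - z))
        for _ in range(inner_iters):      # iteration k starts from x_{k-1}
            g = df(x) - mu / (x - lb) + mu / (ub - x)
            h = d2f(x) + mu / (x - lb) ** 2 + mu / (ub - x) ** 2
            if abs(g) < 1e-8:
                break
            step = -g / h
            t = 1.0
            while t > 1e-12:              # backtrack: stay feasible and decrease phi
                z = x + t * step
                if lb < z < ub and phi(z) <= phi(x) + 1e-4 * t * g * step:
                    break
                t *= 0.5
            else:
                break                     # no acceptable step size was found
            x = z
        mu *= shrink                      # assumed penalty-factor update
    return x

# Toy problem: minimise (x - 2)^2 subject to 0 <= x <= 1; the optimum is x = 1
x_star = barrier_minimize_1d(lambda z: (z - 2.0) ** 2,
                             lambda z: 2.0 * (z - 2.0),
                             lambda z: 2.0,
                             lb=0.0, ub=1.0, x0=0.5)
```

Each outer pass warm-starts from the previous solution, mirroring the instruction to solve for $x_k$ starting from $x_{k-1}$; the iterate approaches the constrained optimum from inside the feasible region without ever leaving it.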
Based on the above method, the invention provides a reinforced hybrid differential evolution system for interplanetary exploration orbit design. The system comprises:
A deep space orbit design problem construction module, which:
determines the probe deep space orbit design problem $M$ to be solved, and constructs the objective function $f(x)$ of problem $M$, the decision vector $x$, and the upper boundary $x_{ub}$ and lower boundary $x_{lb}$ of the global search area.
A deep space orbit design problem parameter initialization module, which:
initializes the Q-learning parameters: the learning rate $\alpha$ and the discount factor $\gamma$;
initializes the control parameters $Bound_{init}$ and $Bound_{min}$ of the CMA-ES local search area boundary;
initializes the maximum stagnation count $\rho_{1,\max}$ and the current stagnation count $\rho_1$ of the global operator LSHADE_EIG;
initializes the maximum stagnation count $\rho_{2,\max}$ and the current stagnation count $\rho_2$ of the local operator CMA-ES;
initializes the scale factor parameter ls_eval of the interior point method;
initializes the maximum number of objective-function evaluations MAX_FES and the current number of evaluations FES.
The global operator LSHADE_EIG performs a preliminary exploration of the whole search space to obtain a preliminary global optimal solution;
the stagnation count $\rho_1$ records the accumulated number of stagnations while the global operator LSHADE_EIG is solving;
the local operator performs a further search within the preliminary solution space to accelerate the solution of the objective function $f(x)$;
the stagnation count $\rho_2$ records the accumulated number of stagnations while the local operator CMA-ES is solving;
the Q-Tables used to adaptively control the mutation strategy and the parameters $\rho_{1,\max}$ and $\rho_{2,\max}$ are initialized separately, and each individual in the LSHADE_EIG operator initializes its own Q-Table to adaptively control the selection of the mutation strategy.
A deep space orbit design problem global solving module, which:
adaptively updates the trigger parameter $\rho_{1,\max}$ using the Q-Learning algorithm;
determines whether $\rho_1$ is less than $\rho_{1,\max}$ and FES is less than MAX_FES; if so, adaptively selects a mutation strategy using the Q-Learning algorithm and starts the global operator LSHADE_EIG; otherwise, updates the Q-Table matrix that adaptively controls the trigger parameter $\rho_{1,\max}$ and passes control to the deep space orbit design problem local solving module;
starts the global operator LSHADE_EIG and performs a preliminary exploratory solution over the whole search space;
updates the Q-Table matrix that adaptively controls the mutation strategy;
determines whether the condition [formula image 046] holds, where [formula image 002] is the best solution obtained by LSHADE_EIG and $x_{gmin}$ is the global optimal solution; if it holds, resets the stagnation count $\rho_1$ to zero and replaces $x_{gmin}$ with [formula image 002]; otherwise, increments $\rho_1$ by 1; after $\rho_1$ is updated, control returns to the deep space orbit design problem global solving module.
A deep space orbit design problem local solving module, which:
determines the local search space according to [formula image 002] and the control parameters $Bound_{init}$ and $Bound_{min}$;
adaptively updates the trigger parameter $\rho_{2,\max}$ in the local search space using the Q-Learning algorithm;
determines whether $\rho_2$ is less than $\rho_{2,\max}$ and FES is less than MAX_FES; if so, starts the local operator CMA-ES and begins solving in the local search space; otherwise, updates the Q-Table matrix that adaptively controls the trigger parameter $\rho_{2,\max}$ and passes control to the deep space orbit design problem convergence solving module;
determines whether the condition [formula image 003] holds, where [formula image 004] is the best solution obtained by CMA-ES and $x_{gmin}$ is the global optimal solution; if it holds, resets the stagnation count $\rho_2$ to zero and replaces $x_{gmin}$ with [formula image 004]; otherwise, increments $\rho_2$ by 1; after $\rho_2$ is updated, control returns to the deep space orbit design problem local solving module.
A deep space orbit design problem convergence solving module, which comprises:
a first solving part, which determines whether the current number of evaluations FES is less than 0.75 × MAX_FES and, if so, returns to the deep space orbit design problem global solving module; if FES is not less than MAX_FES, enters the second solving part; if FES is greater than 0.75 × MAX_FES and less than MAX_FES, updates the global optimal solution using the local operator, the interior point method: determines whether the condition [formula image 005] holds and, if so, replaces $x_{gmin}$ with [formula image 006], where [formula image 006] is the best solution obtained by the interior point method; finally updates the interior point method parameter and enters the second solving part;
a second solving part, which determines whether FES is greater than or equal to MAX_FES; if so, the solution process ends and the current $x_{gmin}$ is the final result; if not, returns to the deep space orbit design problem global solving module.
As an example, the invention compares the proposed method with other methods; see Table 2.
Table 2. Friedman test results of RL_HDE (the method of the present application) and the other methods
[table image 047]
The method was used to solve seven well-known interplanetary exploration orbit design tasks, and its performance was verified to be superior to that of the other design methods.
The seven interplanetary exploration tasks are: the Cassini Saturn exploration tasks (Cassini1 and Cassini2 for short), the asteroid TW229 exploration task (Gtoc1 for short), the Rosetta task for exploring comet 67P/Churyumov-Gerasimenko (Rosetta for short), the Jupiter flyby exploration task (Sagas for short), and the Messenger Mercury rendezvous exploration tasks (Messenger and Messenger-Full for short).
In the Friedman analysis of the design results in Table 2, a lower score indicates better performance of the corresponding design method. RL_HDE obtained the lowest score among the compared methods, 2.7143, indicating that its optimization performance on the seven interplanetary orbit design tasks above is better than that of the compared methods.
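A Friedman score of this kind is normally the mean rank of a method across the test problems; a score of 2.7143 is, for example, what a rank sum of 19 over seven tasks would give. The sketch below shows the mean-rank computation under that assumption. The input numbers are made up purely for illustration and are not the values behind Table 2; the function name is hypothetical.

```python
def average_ranks(scores):
    """scores[p][a] is the result of algorithm a on problem p (lower is better).

    Returns the mean rank of each algorithm; tied results share the mean
    of the rank positions they occupy.
    """
    n_alg = len(scores[0])
    totals = [0.0] * n_alg
    for row in scores:
        order = sorted(range(n_alg), key=lambda a: row[a])
        i = 0
        while i < n_alg:
            j = i
            while j + 1 < n_alg and row[order[j + 1]] == row[order[i]]:
                j += 1
            shared = (i + j) / 2 + 1      # mean of 1-based positions i+1 .. j+1
            for k in range(i, j + 1):
                totals[order[k]] += shared
            i = j + 1
    return [t / len(scores) for t in totals]

# Illustrative only: 3 problems x 3 algorithms, not data from the patent
toy_scores = [
    [1.0, 2.0, 3.0],
    [2.0, 1.0, 3.0],
    [1.0, 1.0, 2.0],   # first two algorithms tie on this problem
]
mean_ranks = average_ranks(toy_scores)
```

For the toy input the first two algorithms each obtain a mean rank of 1.5 and the third obtains 3.0, so the first two would be reported as the better performers.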
The innovations of the invention are as follows:
(1) RL_HDE uses the Q-Learning algorithm to adaptively control six different mutation strategies, which enhances the optimization capability of the algorithm. To support this adaptive control, the global operator uses the LSHADE_EIG method, a modification of LSHADE_SPS_EIG (an algorithm from the CEC 2015 international evolutionary computation competition) that no longer uses the SPS framework.
(2) The invention uses the Q-Learning algorithm to adaptively control the trigger parameters $\rho_{1,\max}$ and $\rho_{2,\max}$, which better balances the exploration and exploitation capabilities of the method.
The beneficial effects of the invention are that the solution speed of interplanetary exploration orbit optimization design is effectively improved and the calculation accuracy of the probe orbit is improved.
The above description covers only preferred embodiments of the invention and does not limit its scope; any modification, equivalent replacement or improvement made within the spirit and principles of the invention shall fall within its scope of protection.

Claims (7)

1. A reinforced hybrid differential evolution method for interplanetary exploration orbit design, characterized by comprising the following steps:
S1. Determine the probe deep space orbit design problem $M$ to be solved;
S2. Construct the objective function $f(x)$ of problem $M$, the decision vector $x$, and the upper boundary $x_{ub}$ and lower boundary $x_{lb}$ of the global search area;
S3. Initialize the Q-learning parameters: the learning rate $\alpha$ and the discount factor $\gamma$;
initialize the control parameters $Bound_{init}$ and $Bound_{min}$ of the CMA-ES local search area boundary;
initialize the maximum stagnation count $\rho_{1,\max}$ and the current stagnation count $\rho_1$ of the global operator LSHADE_EIG;
initialize the maximum stagnation count $\rho_{2,\max}$ and the current stagnation count $\rho_2$ of the local operator CMA-ES;
initialize the scale factor parameter ls_eval of the interior point method;
initialize the maximum number of objective-function evaluations MAX_FES and the current number of evaluations FES;
the global operator LSHADE_EIG performs a preliminary exploration of the whole search space to obtain a preliminary global optimal solution;
the stagnation count $\rho_1$ records the accumulated number of stagnations while the global operator LSHADE_EIG is solving;
the local operator performs a further search within the preliminary solution space to accelerate the solution of the objective function $f(x)$;
the stagnation count $\rho_2$ records the accumulated number of stagnations while the local operator CMA-ES is solving;
the Q-Tables used to adaptively control the mutation strategy and the parameters $\rho_{1,\max}$ and $\rho_{2,\max}$ are initialized separately, and each individual in the LSHADE_EIG operator initializes its own Q-Table to adaptively control the selection of the mutation strategy;
S4. Adaptively update the trigger parameter $\rho_{1,\max}$ using the Q-Learning algorithm;
S5. Determine whether $\rho_1$ is less than $\rho_{1,\max}$ and FES is less than MAX_FES; if so, go to step S6; otherwise, go to step S10: the solution of the global search space ends, the Q-Table matrix that adaptively controls the trigger parameter $\rho_{1,\max}$ is updated, and local solving begins;
S6. Adaptively select a mutation strategy using the Q-Learning algorithm;
S7. Start the global operator LSHADE_EIG and begin a preliminary exploratory solution over the whole search space;
S8. Update the Q-Table matrix that adaptively controls the mutation strategy;
S9. Determine whether the condition [formula image 001] holds, where [formula image 002] is the best solution obtained by LSHADE_EIG and $x_{gmin}$ is the global optimal solution;
if it holds, reset the stagnation count $\rho_1$ to zero and replace $x_{gmin}$ with [formula image 002]; otherwise, increment $\rho_1$ by 1; after $\rho_1$ is updated, return to step S5;
S10. Determine the local search space according to [formula image 002] and the control parameters $Bound_{init}$ and $Bound_{min}$;
S11. In the local search space, adaptively update the trigger parameter $\rho_{2,\max}$ using the Q-Learning algorithm;
S12. Determine whether $\rho_2$ is less than $\rho_{2,\max}$ and FES is less than MAX_FES; if so, go to step S13; otherwise, update the Q-Table matrix that adaptively controls the trigger parameter $\rho_{2,\max}$ and go to step S15, which marks the end of the CMA-ES local search;
S13. Start the local operator CMA-ES and begin solving in the local search space;
S14. Determine whether the condition [formula image 003] holds, where [formula image 004] is the best solution obtained by CMA-ES and $x_{gmin}$ is the global optimal solution;
if it holds, reset the stagnation count $\rho_2$ to zero and replace $x_{gmin}$ with [formula image 004]; otherwise, increment $\rho_2$ by 1; after $\rho_2$ is updated, return to step S12;
S15. Determine whether the current number of evaluations FES is less than 0.75 × MAX_FES; if so, return to step S4; if FES is not less than MAX_FES, go to step S16; if FES is greater than 0.75 × MAX_FES and less than MAX_FES, update the global optimal solution using the local operator, the interior point method: determine whether the condition [formula image 005] holds and, if so, replace $x_{gmin}$ with [formula image 006], where [formula image 006] is the best solution obtained by the interior point method; finally, update the interior point method parameter and go to step S16;
S16. Determine whether FES is greater than or equal to MAX_FES; if so, the solution process ends and the current $x_{gmin}$ is the final result; if not, return to step S4.
2. The reinforced hybrid differential evolution method for interplanetary exploration orbit design according to claim 1, characterized in that: the Q-Table matrix of the parameter $\rho_{1,\max}$ is $Q_{DE}$; $Q_{DE}$ is divided according to the states of the first parameter $Sc_{DE1}$ and the first parameter $f_{DE1}$, which combine to give six population evolution states, and it includes seven preset first-action update values; the population evolution states form the rows of $Q_{DE}$ and the first-action update values form its columns; the first parameter $Sc_{DE1}$ has three states; the first parameter $f_{DE1}$ has two states.
3. The reinforced hybrid differential evolution method for interplanetary exploration orbit design according to claim 2, characterized in that: the trigger parameter $\rho_{1,\max}$ is updated by the following formula:
[formula image 007]
where $\varepsilon_1$ is one of the seven first-action update values.
4. The reinforced hybrid differential evolution method for interplanetary exploration orbit design according to claim 1, characterized in that: the Q-Table matrix that adaptively controls the mutation strategy is $Q_{strategy}$; $Q_{strategy}$ is divided according to the states of the second parameter $Sc_{DE2}$ and the second parameter $f_{DE2}$, which combine to give twenty population evolution states, and it includes six mutation strategies; the population evolution states form the rows of $Q_{strategy}$ and the mutation strategies form its columns; the second parameter $Sc_{DE2}$ has five states; the second parameter $f_{DE2}$ has four states.
5. The reinforced hybrid differential evolution method for interplanetary exploration orbit design according to claim 1, characterized in that: the Q-Table matrix of the parameter $\rho_{2,\max}$ is $Q_{CMA}$; $Q_{CMA}$ is divided according to the states of the parameter $Suc_{CMA}$ and the parameter $Ratio_{CMA}$, which combine to give eight population evolution states, and it includes five preset second-action update values; the population evolution states form the rows of $Q_{CMA}$ and the second-action update values form its columns; the parameter $Suc_{CMA}$ has two states; the parameter $Ratio_{CMA}$ has four states.
6. The reinforced hybrid differential evolution method for interplanetary exploration orbit design according to claim 5, characterized in that: the trigger parameter $\rho_{2,\max}$ is updated by the following formula:
[formula image 008]
where $\varepsilon_2$ is one of the five second-action update values.
7. A reinforced hybrid differential evolution system for interplanetary exploration orbit design, characterized in that the system comprises:
a deep space orbit design problem construction module, which:
determines the probe deep space orbit design problem $M$ to be solved, and constructs the objective function $f(x)$ of problem $M$, the decision vector $x$, and the upper boundary $x_{ub}$ and lower boundary $x_{lb}$ of the global search area;
a deep space orbit design problem parameter initialization module, which:
initializes the Q-learning parameters: the learning rate $\alpha$ and the discount factor $\gamma$;
initializes the control parameters $Bound_{init}$ and $Bound_{min}$ of the CMA-ES local search area boundary;
initializes the maximum stagnation count $\rho_{1,\max}$ and the current stagnation count $\rho_1$ of the global operator LSHADE_EIG;
initializes the maximum stagnation count $\rho_{2,\max}$ and the current stagnation count $\rho_2$ of the local operator CMA-ES;
initializes the scale factor parameter ls_eval of the interior point method;
initializes the maximum number of objective-function evaluations MAX_FES and the current number of evaluations FES;
the global operator LSHADE_EIG performs a preliminary exploration of the whole search space to obtain a preliminary global optimal solution;
the stagnation count $\rho_1$ records the accumulated number of stagnations while the global operator LSHADE_EIG is solving;
the local operator performs a further search within the preliminary solution space to accelerate the solution of the objective function $f(x)$;
the stagnation count $\rho_2$ records the accumulated number of stagnations while the local operator CMA-ES is solving;
the Q-Tables used to adaptively control the mutation strategy and the parameters $\rho_{1,\max}$ and $\rho_{2,\max}$ are initialized separately, and each individual in the LSHADE_EIG operator initializes its own Q-Table to adaptively control the selection of the mutation strategy;
a deep space orbit design problem global solving module, which:
adaptively updates the trigger parameter $\rho_{1,\max}$ using the Q-Learning algorithm;
determines whether $\rho_1$ is less than $\rho_{1,\max}$ and FES is less than MAX_FES; if so, adaptively selects a mutation strategy using the Q-Learning algorithm and starts the global operator LSHADE_EIG; otherwise, updates the Q-Table matrix that adaptively controls the trigger parameter $\rho_{1,\max}$ and passes control to the deep space orbit design problem local solving module;
starts the global operator LSHADE_EIG and performs a preliminary exploratory solution over the whole search space;
updates the Q-Table matrix that adaptively controls the mutation strategy;
determines whether the condition [formula image 009] holds, where [formula image 002] is the best solution obtained by LSHADE_EIG and $x_{gmin}$ is the global optimal solution;
if it holds, resets the stagnation count $\rho_1$ to zero and replaces $x_{gmin}$ with [formula image 002]; otherwise, increments $\rho_1$ by 1; after $\rho_1$ is updated, control returns to the deep space orbit design problem global solving module;
a deep space orbit design problem local solving module, which:
determines the local search space according to [formula image 002] and the control parameters $Bound_{init}$ and $Bound_{min}$;
adaptively updates the trigger parameter $\rho_{2,\max}$ in the local search space using the Q-Learning algorithm;
determines whether $\rho_2$ is less than $\rho_{2,\max}$ and FES is less than MAX_FES; if so, starts the local operator CMA-ES and begins solving in the local search space; otherwise, updates the Q-Table matrix that adaptively controls the trigger parameter $\rho_{2,\max}$ and passes control to the deep space orbit design problem convergence solving module;
determines whether the condition [formula image 003] holds, where [formula image 004] is the best solution obtained by CMA-ES and $x_{gmin}$ is the global optimal solution; if it holds, resets the stagnation count $\rho_2$ to zero and replaces $x_{gmin}$ with [formula image 004]; otherwise, increments $\rho_2$ by 1; after $\rho_2$ is updated, control returns to the deep space orbit design problem local solving module;
a deep space orbit design problem convergence solving module, which comprises:
a first solving part, which determines whether the current number of evaluations FES is less than 0.75 × MAX_FES and, if so, returns to the deep space orbit design problem global solving module; if FES is not less than MAX_FES, enters the second solving part; if FES is greater than 0.75 × MAX_FES and less than MAX_FES, updates the global optimal solution using the local operator, the interior point method: determines whether the condition [formula image 010] holds and, if so, replaces $x_{gmin}$ with [formula image 006], where [formula image 006] is the best solution obtained by the interior point method; finally updates the interior point method parameter and enters the second solving part;
a second solving part, which determines whether FES is greater than or equal to MAX_FES; if so, the solution process ends and the current $x_{gmin}$ is the final result; if not, returns to the deep space orbit design problem global solving module.
CN202211118194.6A 2022-09-15 2022-09-15 Reinforced hybrid differential evolution method and system for interplanetary exploration orbit design Active CN115204062B (en)

Publications (2)

Publication Number Publication Date
CN115204062A true CN115204062A (en) 2022-10-18
CN115204062B CN115204062B (en) 2022-12-30

Family

ID=83572851

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202211118194.6A Active CN115204062B (en) 2022-09-15 2022-09-15 Reinforced hybrid differential evolution method and system for interplanetary exploration orbit design

Country Status (1)

Country Link
CN (1) CN115204062B (en)

Cited By (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN116258090A (en) * 2023-05-16 2023-06-13 中国地质大学(武汉) Differential evolution deep space orbit design method and system based on double-stage information migration
CN116627027A (en) * 2023-07-19 2023-08-22 济南大学 Optimal robustness control method based on improved PID
CN118551672A (en) * 2024-07-30 2024-08-27 威鹏晟(山东)机械有限公司 Performance evaluation and optimization system and method for vacuum pump cooling system

Citations (11)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20100131440A1 (en) * 2008-11-11 2010-05-27 Nec Laboratories America Inc Experience transfer for the configuration tuning of large scale computing systems
CN103050983A (en) * 2012-12-18 2013-04-17 河海大学 Mixed algorithm-based economic operation optimization method for regional power grid
CN106600025A (en) * 2016-10-10 2017-04-26 昆明市环境科学研究院(昆明环境工程技术研究中心、昆明低碳城市发展研究中心、昆明市环境污染损害鉴定评估中心) Multi-level urban sewage water reuse-and-recycle configuration data's dynamic processing method based on multi-objective hybrid genetic algorithm
CN107909140A (en) * 2017-11-21 2018-04-13 中国地质大学(武汉) A kind of optimization method, equipment and storage device for preserving outstanding sample individual strategy
US20180343567A1 (en) * 2016-08-05 2018-11-29 Nxgen Partners Ip, Llc Private multefire network with sdr-based massive mimo, multefire and network slicing
CN109379780A (en) * 2018-10-23 2019-02-22 华南理工大学 Wireless sensor network locating method based on adaptive differential evolution algorithm
CN111090898A (en) * 2019-11-07 2020-05-01 郑州大学 Building indoor layout design method
CN111625936A (en) * 2020-05-06 2020-09-04 中国电子科技集团公司第三十八研究所 Aperiodic planar sparse phased array design method
CN112712193A (en) * 2020-12-02 2021-04-27 南京航空航天大学 Multi-unmanned aerial vehicle local route planning method and device based on improved Q-Learning
CN113204417A (en) * 2021-04-30 2021-08-03 武汉大学 Multi-satellite multi-point target observation task planning method based on improved genetic and firefly combined algorithm
CN114066122A (en) * 2020-08-06 2022-02-18 兰州理工大学 Scheduling method based on multi-strategy water wave optimization algorithm

Patent Citations (11)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20100131440A1 (en) * 2008-11-11 2010-05-27 Nec Laboratories America Inc Experience transfer for the configuration tuning of large scale computing systems
CN103050983A (en) * 2012-12-18 2013-04-17 河海大学 Mixed algorithm-based economic operation optimization method for regional power grid
US20180343567A1 (en) * 2016-08-05 2018-11-29 Nxgen Partners Ip, Llc Private multefire network with sdr-based massive mimo, multefire and network slicing
CN106600025A (en) * 2016-10-10 2017-04-26 昆明市环境科学研究院(昆明环境工程技术研究中心、昆明低碳城市发展研究中心、昆明市环境污染损害鉴定评估中心) Multi-level urban sewage water reuse-and-recycle configuration data's dynamic processing method based on multi-objective hybrid genetic algorithm
CN107909140A (en) * 2017-11-21 2018-04-13 China University of Geosciences (Wuhan) Optimization method, equipment and storage device with a strategy for preserving outstanding sample individuals
CN109379780A (en) * 2018-10-23 2019-02-22 South China University of Technology Wireless sensor network locating method based on adaptive differential evolution algorithm
CN111090898A (en) * 2019-11-07 2020-05-01 Zhengzhou University Building indoor layout design method
CN111625936A (en) * 2020-05-06 2020-09-04 38th Research Institute of China Electronics Technology Group Corporation Aperiodic planar sparse phased array design method
CN114066122A (en) * 2020-08-06 2022-02-18 Lanzhou University of Technology Scheduling method based on multi-strategy water wave optimization algorithm
CN112712193A (en) * 2020-12-02 2021-04-27 Nanjing University of Aeronautics and Astronautics Multi-unmanned aerial vehicle local route planning method and device based on improved Q-Learning
CN113204417A (en) * 2021-04-30 2021-08-03 Wuhan University Multi-satellite multi-point target observation task planning method based on improved genetic and firefly combined algorithm

Non-Patent Citations (4)

* Cited by examiner, † Cited by third party
Title
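MOHAMED ABD ELAZIZ et al.: "Improved evolutionary-based feature selection technique using extension of knowledge based on the rough approximations", Information Sciences *
YANG ZUO et al.: "A knowledge-based differential covariance matrix adaptation cooperative algorithm", Expert Systems with Applications *
Yuan Yangfei (原杨飞): "Research on differential evolution algorithms for solving constrained optimization problems", China Master's Theses Full-text Database (Information Science and Technology) *
Li Xiangping (李香平): "Research on multi-strategy differential evolution algorithms and their application to multi-satellite cooperative task planning", China Doctoral Dissertations Full-text Database (Engineering Science and Technology II) *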

Cited By (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN116258090A (en) * 2023-05-16 2023-06-13 China University of Geosciences (Wuhan) Differential evolution deep space orbit design method and system based on double-stage information migration
CN116258090B (en) * 2023-05-16 2023-08-18 China University of Geosciences (Wuhan) Differential evolution deep space orbit design method and system based on double-stage information migration
CN116627027A (en) * 2023-07-19 2023-08-22 University of Jinan Optimal robustness control method based on improved PID
CN116627027B (en) * 2023-07-19 2024-01-30 University of Jinan Optimal robustness control method based on improved PID
CN118551672A (en) * 2024-07-30 2024-08-27 威鹏晟(山东)机械有限公司 Performance evaluation and optimization system and method for vacuum pump cooling system

Also Published As

Publication number Publication date
CN115204062B (en) 2022-12-30

Similar Documents

Publication Publication Date Title
CN115204062B (en) Reinforced hybrid differential evolution method and system for interplanetary exploration orbit design
Such et al. Deep neuroevolution: Genetic algorithms are a competitive alternative for training deep neural networks for reinforcement learning
KR102242516B1 (en) Train machine learning models on multiple machine learning tasks
Le et al. A simple way to initialize recurrent networks of rectified linear units
EP3899797A1 (en) Multi-agent reinforcement learning with matchmaking policies
JP7419547B2 (en) Planning for agent control using learned hidden states
CN111931067A (en) Interest point recommendation method, device, equipment and medium
EP3776363A1 (en) Reinforcement learning using agent curricula
Bäck et al. Evolutionary algorithms for parameter optimization—thirty years later
CN115437795B (en) Video memory recalculation optimization method and system for heterogeneous GPU cluster load perception
KR20240138087A (en) Routing to expert subnetworks in a mixture-of-expert neural network
Xin et al. Exploration entropy for reinforcement learning
Gaier et al. Data-efficient neuroevolution with kernel-based surrogate models
Rohmatillah et al. Hierarchical reinforcement learning with guidance for multi-domain dialogue policy
CN118192472A (en) Improved sparrow optimization method for scheduling problem of flexible job shop
CN113449182A (en) Knowledge information personalized recommendation method and system
CN113407820A (en) Model training method, related system and storage medium
CN117332693A (en) Slope stability evaluation method based on DDPG-PSO-BP algorithm
Lin Evolutionary multi-armed bandits with genetic thompson sampling
Feng et al. On the application of data-driven deep neural networks in linear and nonlinear structural dynamics
Zamstein et al. Koolio: Path planning using reinforcement learning on a real robot platform
Tang et al. Deep sparse representation via deep dictionary learning for reinforcement learning
TWI851438B (en) Optimizing algorithms for hardware devices
Lin et al. Empirical explorations of strategic reinforcement learning: a case study in the sorting problem.
Buruzs et al. Fuzzy cognitive maps and bacterial evolutionary algorithm approach to integrated waste management systems

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant