CN112344944B - Reinforced learning path planning method introducing artificial potential field - Google Patents
- Publication number: CN112344944B
- Application number: CN202011327198.6A
- Authority: CN (China)
- Prior art keywords: value, action, algorithm, path planning, state
- Prior art date: 2020-11-24
- Legal status: Active (the legal status is an assumption and is not a legal conclusion; Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed)
Classifications

- G—PHYSICS
  - G01—MEASURING; TESTING
    - G01C—MEASURING DISTANCES, LEVELS OR BEARINGS; SURVEYING; NAVIGATION; GYROSCOPIC INSTRUMENTS; PHOTOGRAMMETRY OR VIDEOGRAMMETRY
      - G01C21/00—Navigation; Navigational instruments not provided for in groups G01C1/00 - G01C19/00
        - G01C21/20—Instruments for performing navigational calculations
- G—PHYSICS
  - G05—CONTROLLING; REGULATING
    - G05D—SYSTEMS FOR CONTROLLING OR REGULATING NON-ELECTRIC VARIABLES
      - G05D1/00—Control of position, course, altitude or attitude of land, water, air or space vehicles, e.g. using automatic pilots
        - G05D1/0088—Control of position, course, altitude or attitude of land, water, air or space vehicles, e.g. using automatic pilots characterized by the autonomous decision making process, e.g. artificial intelligence, predefined behaviours
- G—PHYSICS
  - G05—CONTROLLING; REGULATING
    - G05D—SYSTEMS FOR CONTROLLING OR REGULATING NON-ELECTRIC VARIABLES
      - G05D1/00—Control of position, course, altitude or attitude of land, water, air or space vehicles, e.g. using automatic pilots
        - G05D1/02—Control of position or course in two dimensions
          - G05D1/021—Control of position or course in two dimensions specially adapted to land vehicles
            - G05D1/0212—Control of position or course in two dimensions specially adapted to land vehicles with means for defining a desired trajectory
              - G05D1/0221—Control of position or course in two dimensions specially adapted to land vehicles with means for defining a desired trajectory involving a learning process
Abstract
The invention discloses a reinforcement learning path planning method that introduces an artificial potential field, comprising the following steps: S1, establish a grid map and initialize the state values with a gravitational field function, obtaining a simulation environment for training the reinforcement learning agent; S2, initialize the algorithm parameters; S3, select actions with a dynamically adjusted greedy factor; S4, execute the action and update the Q value; S5, repeat steps S3 and S4 until a preset number of steps or a convergence condition is reached; S6, select the action with the maximum Q value at each step to obtain the optimal path; S7, send the optimal path to the controller of the mobile robot and control the robot to walk along it. Compared with the traditional algorithm, the improved Q-learning algorithm shortens the path planning time by 85.1%, reduces the number of iterations before convergence by 74.7%, and improves the stability of the algorithm's convergence result.
Description
Technical Field
The invention relates to the technical field of robot path planning, and in particular to a reinforcement learning path planning method that introduces an artificial potential field.
Background
With the development of science and technology, more and more mobile robots are entering people's daily lives, and the path planning problem for mobile robots is becoming increasingly important. Path planning helps the robot avoid obstacles and plan an optimal route from a start point to a target point with respect to some index. Depending on how much of the environment is known during planning, path planning divides into global path planning and local path planning. Widely used global path planning algorithms include the A* algorithm, Dijkstra's algorithm, the visibility graph method, and the free space method; local path planning algorithms include the artificial potential field method, genetic algorithms, neural network algorithms, and reinforcement learning. Reinforcement learning is comparatively adaptive: it can find an optimal path by continual trial and error in a completely unknown environment, so it has attracted growing attention in the field of mobile robot path planning.
The reinforcement learning algorithm most widely applied to mobile robot path planning is Q-learning. The conventional Q-learning algorithm has the following problems: (1) all Q values are set to 0 or to random values during initialization, so the agent can only explore blindly in the initial stage, producing excessive invalid iterations early in the algorithm; (2) an ε-greedy strategy is used for action selection, and too large an ε makes the agent explore the environment so much that convergence becomes difficult, while too small an ε leaves the environment under-explored so the agent settles on a suboptimal solution; the trade-off between exploration and exploitation is hard to balance.
Disclosure of Invention
To address the defects of the prior art, the invention provides a reinforcement learning path planning method that introduces an artificial potential field. The gravitational field function of the artificial potential field is introduced during Q-value initialization, so that state values are larger closer to the target position. The agent therefore explores toward the target position in the initial stage, invalid early iterations are reduced, and the path planning time of reinforcement-learning-based mobile robots is shortened.
A reinforcement learning path planning method for introducing an artificial potential field comprises the following steps:
s1, establishing a grid map and introducing a gravitational field function to initialize the state values, so as to obtain a simulation environment for training the reinforcement learning agent;
s2, initializing algorithm parameters;
s3, selecting actions by adopting a dynamic factor adjustment strategy;
s4, executing the action and updating the Q value;
s5, repeating steps S3 and S4 until a preset number of steps or a convergence condition is reached;
s6, selecting the action with the maximum Q value in each step to obtain an optimal path;
and S7, sending the optimal path to a controller of the mobile robot, and controlling the mobile robot to walk according to the optimal path.
Preferably, the specific process of step S1 is as follows: the environment image obtained by the mobile robot is segmented into a 20×20 grid, and an environment model is established by the grid method. If an obstacle is found in a grid cell, that cell is defined as an obstacle position through which the robot cannot pass; if the target point is found in a grid cell, that cell is defined as the target position, i.e., the position the mobile robot must finally reach; all other cells are defined as obstacle-free cells through which the robot can pass. The attraction value of each cell is calculated according to formula (1),
where ζ is a scale factor greater than 0 used to adjust the magnitude of the attraction; |d| is the distance between the current position and the target point; and η is a positive constant that prevents the attraction value at the target point from becoming infinite.
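The formula (1) image itself is not reproduced in this text. Based on the stated properties (the value grows as the distance |d| to the goal shrinks, ζ scales the magnitude, and η keeps the value finite at the goal itself), a plausible reconstruction, offered as an assumption rather than the patent's verbatim formula, is:

```latex
U_{att} = \frac{\zeta}{\lvert d \rvert + \eta}
```

At the target point |d| = 0, so the value peaks at ζ/η instead of diverging.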
Preferably, in step S2, the parameters include: the learning rate α ∈ (0, 1), the discount factor γ ∈ (0, 1), the maximum number of iterations, the reward function r, and the greedy-factor dynamic adjustment strategy parameters ε_max, ε_min, T, n;
The Q function is initialized using equation (2),
where P(s′|s, a) is the probability of transitioning to the next state s′ given the current state s and the chosen action a, and V(s′) is the state value function of the next state, with V(s′) = U_att.
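Equation (2) is likewise missing from this extract. Given the description (the initial Q value combines the immediate reward with the discounted, transition-weighted value of the next state, where the next-state value equals the attraction field), the standard form it describes would be:

```latex
Q(s,a) = R(s,a) + \gamma \sum_{s' \in S} P(s' \mid s, a)\, V(s'), \qquad V(s') = U_{att}
```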
Preferably, in step S3, the greedy factor adjustment strategy is given by formula (3),
where the tanh function has the specific form of formula (4):
e is the base of the natural logarithm, and when the independent variable t is greater than 0, tanh(t) takes values in (0, 1); std_n is the standard deviation of the step counts over n consecutive iterations; T is a coefficient whose effect is opposite to that of the temperature in the simulated annealing algorithm: the larger T is, the smaller the randomness; ε_max and ε_min are the set maximum and minimum exploration rates, respectively.
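Formulas (3) and (4) are not reproduced in this text. One reconstruction consistent with the description (ε bounded by ε_min and ε_max, driven through tanh by the step-count standard deviation std_n, and larger T giving smaller randomness) is the following; the exact functional form is an assumption, since the patent text here only states the qualitative behavior:

```latex
\varepsilon = \varepsilon_{min} + (\varepsilon_{max} - \varepsilon_{min}) \tanh\!\left(\frac{std_n}{T}\right),
\qquad
\tanh(t) = \frac{e^{t} - e^{-t}}{e^{t} + e^{-t}}
```

As the algorithm converges, std_n shrinks toward 0 and ε decays toward ε_min, shifting the agent from exploration to exploitation.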
Preferably, in step S4, the action a selected in step S3 is executed to arrive at the next state s′, and the instant reward R(s, a) is obtained; the Q-value function is updated by the Q-learning algorithm introducing the artificial potential field, with the update rule given by formula (5),
where (s, a) is the current state-action pair, (s′, a′) is the state-action pair at the next time step, and R(s, a) is the instant reward for performing action a in state s.
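Equation (5) is not shown in this extract; the update described is the standard Q-learning temporal-difference rule:

```latex
Q(s,a) \leftarrow Q(s,a) + \alpha \left[ R(s,a) + \gamma \max_{a'} Q(s', a') - Q(s,a) \right]
```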
The invention has the beneficial effects that:
To solve the problems of slow convergence, excessive iterations, and unstable convergence results when traditional reinforcement learning is applied to path planning in unknown environments for mobile robots, an improved Q-learning algorithm is proposed. The artificial potential field method is introduced during state initialization, so that state values are larger nearer the target position, guiding the agent toward the target; and the ε-greedy strategy is improved for action selection, dynamically adjusting the greedy factor ε according to the algorithm's degree of convergence, which balances exploration and exploitation well. Simulation results on the grid map show that, compared with the traditional algorithm, the improved Q-learning algorithm shortens path planning time by 85.1%, reduces the number of iterations before convergence by 74.7%, and improves the stability of the convergence result.
Drawings
FIG. 1 is a schematic view of the general flow of the process of the present invention.
Fig. 2 is a grid map of the operation of the mobile robot according to the embodiment of the present invention.
FIG. 3 is a diagram of conventional Q-learning convergence.
FIG. 4 is a diagram of improved Q-learning convergence according to an embodiment of the invention.
FIG. 5 is a diagram of an optimized path drawn by the improved Q-learning scheme according to the embodiment of the invention.
Detailed Description
Embodiments of the present invention will be described in detail below with reference to the accompanying drawings. The following examples are only for illustrating the technical solutions of the present invention more clearly, and therefore are only used as examples, and the protection scope of the present invention is not limited thereby.
Referring to fig. 1, the method for planning a reinforcement learning path by introducing an artificial potential field according to the present invention includes the following steps:
the first step is as follows: the method comprises the steps of carrying out segmentation processing on an environment image obtained by a mobile robot, segmenting the image into 20 x 20 grids, establishing an environment model by adopting a grid method, and if an obstacle is found in the grids, defining the grids as the position of the obstacle, wherein the robot cannot pass through the grids; if the target point is found in the grid, determining that the grid is the target position and the position to which the mobile robot finally arrives; the other grids are defined as barrier-free grids, and the robot can pass through, and calculate the attraction value of each grid according to formula (1).
Zeta is a scale factor greater than 0 and is used for adjusting the size of the attraction force, | d | is the distance between the current position and the position of the target point, and η is a normal number, so that the attraction force value at the target point is prevented from being infinite.
Through the above steps, a simulation environment for training the reinforcement learning agent can be obtained, and the grid map for the mobile robot in the embodiment is shown in fig. 2.
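The first step can be sketched in code. This is not the patent's implementation: the inverse-distance form of the attraction function, the obstacle layout, and the parameter values below are illustrative assumptions based on the description of formula (1).

```python
import numpy as np

def build_attraction_field(grid, goal, zeta=0.6, eta=1.0):
    """Initialize per-cell state values with an attraction field (step S1).

    grid: 2-D array where 1 marks an obstacle cell and 0 a free cell.
    goal: (row, col) of the target cell.
    Cells closer to the goal get larger values; eta keeps the value
    finite at the goal itself.
    """
    rows, cols = grid.shape
    values = np.zeros((rows, cols))
    for r in range(rows):
        for c in range(cols):
            if grid[r, c] == 1:          # obstacle cell: leave value at 0
                continue
            d = np.hypot(r - goal[0], c - goal[1])  # Euclidean distance |d|
            values[r, c] = zeta / (d + eta)
    return values

# 20x20 map as in the patent; this obstacle position is illustrative.
grid = np.zeros((20, 20))
grid[5, 5] = 1
V = build_attraction_field(grid, goal=(19, 19))
```

The goal cell receives the maximum value ζ/η, so even before any learning the initial values already slope toward the target, which is what lets the agent avoid blind early exploration.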
The second step: initialize the algorithm parameters, which include: the learning rate α ∈ (0, 1), the discount factor γ ∈ (0, 1), the maximum number of iterations, the reward function r, and the greedy-factor dynamic adjustment strategy parameters ε_max, ε_min, T, n.
The Q-value function is initialized using equation (2),
where P(s′|s, a) is the probability of transitioning to the next state s′ given the current state s and the chosen action a, and V(s′) is the state value function of the next state, with V(s′) = U_att.
The third step: select actions with the greedy-factor dynamic adjustment strategy, given by formula (3),
where the tanh function has the specific form of formula (4):
e is the base of the natural logarithm, and when the independent variable t is greater than 0, tanh(t) takes values in (0, 1); std_n is the standard deviation of the step counts over n consecutive iterations; T is a coefficient whose effect is opposite to that of the temperature in the simulated annealing algorithm: the larger T is, the smaller the randomness; ε_max and ε_min are the set maximum and minimum exploration rates, respectively.
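A minimal sketch of the greedy-factor adjustment, assuming the reconstruction ε = ε_min + (ε_max − ε_min)·tanh(std_n/T); this exact functional form is an assumption, as the patent text here describes only its qualitative behavior.

```python
import math

def dynamic_epsilon(step_counts, eps_max=0.5, eps_min=0.01, T=500, n=10):
    """Greedy-factor adjustment (step S3), reconstructed from the description.

    step_counts: steps taken per recent episode; the standard deviation over
    the last n iterations measures how far the algorithm is from convergence.
    A larger T shrinks the tanh argument, i.e. gives less randomness.
    """
    recent = step_counts[-n:]
    mean = sum(recent) / len(recent)
    std_n = math.sqrt(sum((x - mean) ** 2 for x in recent) / len(recent))
    return eps_min + (eps_max - eps_min) * math.tanh(std_n / T)

# While episode lengths fluctuate wildly, epsilon stays large (explore);
# once they stabilize near convergence, epsilon decays toward eps_min (exploit).
eps_noisy = dynamic_epsilon([900, 100, 800, 150, 700, 120, 850, 90, 760, 140])
eps_stable = dynamic_epsilon([52, 50, 51, 50, 52, 51, 50, 51, 52, 50])
```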
The fourth step: performing the action a selected in the third step , Arrival s , Obtaining an instant reward R (s, a), updating the Q value function by using a Q-learning algorithm introducing an artificial potential field, and updating the rule as shown in the formula (5)
Wherein (s, a) is a current state-action pair; (s) , ,a , ) Is the state-action pair at the next time; r (s, a) is an instant reward for performing action a in state s.
The third and fourth steps are executed repeatedly until a preset number of steps or a convergence condition is reached.
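The fourth-step update can be sketched with the standard Q-learning rule on a toy table; the two-state example and its reward are illustrative, not taken from the patent.

```python
def q_update(Q, s, a, r, s_next, alpha=0.01, gamma=0.9):
    """One Q-learning update (step S4): the standard temporal-difference rule
    Q(s,a) += alpha * (r + gamma * max_a' Q(s',a') - Q(s,a))."""
    best_next = max(Q[s_next].values())
    Q[s][a] += alpha * (r + gamma * best_next - Q[s][a])
    return Q[s][a]

# Tiny two-state table with actions 'L'/'R'. The nonzero initial value on
# ("start","R") stands in for the attraction-field initialization, which
# biases the very first greedy choices toward the goal.
Q = {
    "start": {"L": 0.0, "R": 0.2},
    "goal":  {"L": 0.0, "R": 0.0},
}
q_update(Q, "start", "R", r=1.0, s_next="goal")
```

After the update, Q["start"]["R"] has moved from 0.2 toward the target value r + γ·max Q(goal, ·) = 1.0 by a step of size α.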
The fifth step: and selecting the action with the maximum Q value in each step to obtain the optimal path.
And a sixth step: and sending the optimal path to a controller of the mobile robot, and controlling the mobile robot to walk according to the optimal path.
The parameter settings in this embodiment are as follows: learning rate α = 0.01, discount factor γ = 0.9, maximum number of iterations 20000, scale factor ζ = 0.6, constant η = 1, ε_max = 0.5, ε_min = 0.01, T = 500, n = 10; the reward function is set as:
in this embodiment, we can obtain the optimal path by using the above method and setting the parameters as shown in fig. 5.
Comparing fig. 3 and fig. 4 shows that, relative to the conventional Q-learning algorithm, the improved algorithm shortens the convergence time by 85.1%, reduces the number of iterations by 74.7%, and improves the stability of the convergence result.
It is to be noted that, unless otherwise specified, technical or scientific terms used herein shall have the ordinary meaning as understood by those skilled in the art to which the invention pertains.
Claims (4)
1. A reinforcement learning path planning method introducing an artificial potential field, characterized by comprising the following steps:
s1, establishing a grid map and introducing a gravitational field function to initialize the state values, so as to obtain a simulation environment for training the reinforcement learning agent;
s2, initializing algorithm parameters;
s3, selecting actions by adopting a dynamic factor adjustment strategy;
s4, executing the action and updating the Q value;
s5, repeating steps S3 and S4 until a preset number of steps or a convergence condition is reached;
s6, selecting the action with the maximum Q value in each step to obtain an optimal path;
s7, sending the optimal path to a controller of the mobile robot, and controlling the mobile robot to walk according to the optimal path;
the specific process of step S1 is as follows: the environment image obtained by the mobile robot is segmented into a 20×20 grid, and an environment model is established by the grid method; if an obstacle is found in a grid cell, that cell is defined as an obstacle position through which the robot cannot pass; if the target point is found in a grid cell, that cell is defined as the target position, i.e., the position the mobile robot must finally reach; the other cells are defined as obstacle-free cells through which the robot can pass, and the attraction value of each cell is calculated according to formula (1),
where ζ is a scale factor greater than 0 used to adjust the magnitude of the attraction; |d| is the distance between the current position and the target point; η is a positive constant preventing the attraction value at the target point from becoming infinite;
in step S2, the parameters include: the learning rate α ∈ (0, 1), the discount factor γ ∈ (0, 1), the maximum number of iterations, the reward function r, and the greedy-factor dynamic adjustment strategy parameters ε_max, ε_min, T, n;
the Q function is initialized using equation (2),
where P(s′|s, a) is the probability of transitioning to the next state s′ given the current state s and the chosen action a; V(s′) is the state value function of the next state, with V(s′) = U_att; R(s, a) is the reward obtained by taking action a in the current state s; S is the state set; and U_att is the attraction value of the current position;
in step S3, the greedy factor adjustment strategy is given by formula (3),
where the tanh function has the specific form of formula (4):
e is the base of the natural logarithm, and when the independent variable is greater than 0, tanh() takes values in (0, 1); std_n is the standard deviation of the step counts over n consecutive iterations; T is a coefficient whose effect is opposite to that of the temperature in the simulated annealing algorithm: the larger T is, the smaller the randomness; ε_max and ε_min are the set maximum and minimum exploration rates.
2. The reinforcement learning path planning method introducing an artificial potential field according to claim 1, characterized in that:
in step S4, the action a selected in step S3 is executed to arrive at s′ and obtain the instant reward R(s, a); the Q-value function is updated with the Q-learning algorithm introducing the artificial potential field, the update rule being given by formula (5),
where (s, a) is the current state-action pair; (s′, a′) is the state-action pair at the next time step; R(s, a) is the instant reward for executing action a in state s; α is the learning rate; and γ is the discount factor.
3. The method of reinforcement learning path planning with introduction of an artificial potential field according to claim 1, characterized by: the scale factor ζ is set to 0.6 and the constant η is set to 1.
Priority Applications (1)
| Application Number | Priority Date | Filing Date | Title |
| --- | --- | --- | --- |
| CN202011327198.6A | 2020-11-24 | 2020-11-24 | Reinforced learning path planning method introducing artificial potential field |
Applications Claiming Priority (1)
| Application Number | Priority Date | Filing Date | Title |
| --- | --- | --- | --- |
| CN202011327198.6A | 2020-11-24 | 2020-11-24 | Reinforced learning path planning method introducing artificial potential field |
Publications (2)
| Publication Number | Publication Date |
| --- | --- |
| CN112344944A (en) | 2021-02-09 |
| CN112344944B (en) | 2022-08-05 |
Family
ID=74365572
Family Applications (1)
| Application Number | Title | Priority Date | Filing Date |
| --- | --- | --- | --- |
| CN202011327198.6A | Reinforced learning path planning method introducing artificial potential field | 2020-11-24 | 2020-11-24 |
Country Status (1)
| Country | Link |
| --- | --- |
| CN | CN112344944B (en) |
Families Citing this family (8)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN112964272A (en) * | 2021-03-16 | 2021-06-15 | 湖北汽车工业学院 | Improved Dyna-Q learning path planning algorithm |
CN113534819B (en) * | 2021-08-26 | 2024-03-15 | 鲁东大学 | Method and storage medium for pilot following type multi-agent formation path planning |
CN113848911B (en) * | 2021-09-28 | 2023-06-27 | 华东理工大学 | Mobile robot global path planning method based on Q-learning and RRT |
CN114296440B (en) * | 2021-09-30 | 2024-04-09 | 中国航空工业集团公司北京长城航空测控技术研究所 | AGV real-time scheduling method integrating online learning |
CN113790729B (en) * | 2021-11-16 | 2022-04-08 | 北京科技大学 | Unmanned overhead traveling crane path planning method and device based on reinforcement learning algorithm |
CN114518758B (en) * | 2022-02-08 | 2023-12-12 | 中建八局第三建设有限公司 | Indoor measurement robot multi-target point moving path planning method based on Q learning |
CN115542912B (en) * | 2022-09-29 | 2024-06-07 | 福州大学 | Mobile robot path planning method based on improved Q-learning algorithm |
CN116700258B (en) * | 2023-06-13 | 2024-05-03 | 万基泰科工集团数字城市科技有限公司 | Intelligent vehicle path planning method based on artificial potential field method and reinforcement learning |
Citations (7)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN102799179A (en) * | 2012-07-06 | 2012-11-28 | 山东大学 | Mobile robot path planning algorithm based on single-chain sequential backtracking Q-learning |
WO2018120739A1 (en) * | 2016-12-30 | 2018-07-05 | 深圳光启合众科技有限公司 | Path planning method, apparatus and robot |
CN110132296A (en) * | 2019-05-22 | 2019-08-16 | 山东师范大学 | Multiple agent sub-goal based on dissolution potential field divides paths planning method and system |
CN110307848A (en) * | 2019-07-04 | 2019-10-08 | 南京大学 | A kind of Mobile Robotics Navigation method |
CN110726416A (en) * | 2019-10-23 | 2020-01-24 | 西安工程大学 | Reinforced learning path planning method based on obstacle area expansion strategy |
CN110794842A (en) * | 2019-11-15 | 2020-02-14 | 北京邮电大学 | Reinforced learning path planning algorithm based on potential field |
CN111896006A (en) * | 2020-08-11 | 2020-11-06 | 燕山大学 | Path planning method and system based on reinforcement learning and heuristic search |
Family Cites Families (3)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US8626565B2 (en) * | 2008-06-30 | 2014-01-07 | Autonomous Solutions, Inc. | Vehicle dispatching method and system |
US10839302B2 (en) * | 2015-11-24 | 2020-11-17 | The Research Foundation For The State University Of New York | Approximate value iteration with complex returns by bounding |
CN110462544A (en) * | 2017-03-20 | 2019-11-15 | 御眼视觉技术有限公司 | The track of autonomous vehicle selects |
Non-Patent Citations (3)

- Yukiyasu Noguchi et al., "Path Planning Method Based on Artificial Potential Field and Reinforcement Learning for Intervention AUVs," 2019 IEEE Underwater Technology (UT), 2019, pp. 1-6.
- Song Yong et al., "Initialization of reinforcement learning for mobile robot path planning," Control Theory & Applications (控制理论与应用), 2012, Vol. 29, No. 12, pp. 1623-1628.
- Xu Xiaosu et al., "Mobile robot path planning method based on improved reinforcement learning," Journal of Chinese Inertial Technology (中国惯性技术学报), 2019, Vol. 27, No. 3, pp. 314-320.
Legal Events
| Date | Code | Title |
| --- | --- | --- |
| | PB01 | Publication |
| | SE01 | Entry into force of request for substantive examination |
| | GR01 | Patent grant |