CN110470306A - Multi-robot formation navigation method capable of guaranteeing connectivity constraint and based on deep reinforcement learning
- Publication number
- CN110470306A (application number CN201910795982.0A)
- Authority
- CN
- China
- Prior art keywords
- robot
- function
- constraint condition
- connectivity
- parameterized function
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Granted
Classifications
- G—PHYSICS
- G01—MEASURING; TESTING
- G01C—MEASURING DISTANCES, LEVELS OR BEARINGS; SURVEYING; NAVIGATION; GYROSCOPIC INSTRUMENTS; PHOTOGRAMMETRY OR VIDEOGRAMMETRY
- G01C21/00—Navigation; Navigational instruments not provided for in groups G01C1/00 - G01C19/00
- G01C21/20—Instruments for performing navigational calculations
Landscapes
- Engineering & Computer Science (AREA)
- Radar, Positioning & Navigation (AREA)
- Remote Sensing (AREA)
- Automation & Control Theory (AREA)
- Physics & Mathematics (AREA)
- General Physics & Mathematics (AREA)
- Feedback Control In General (AREA)
- Manipulator (AREA)
Abstract
The present invention relates to the field of mobile robot technology, and more particularly to a multi-robot formation navigation method based on deep reinforcement learning that can guarantee a connectivity constraint. The method efficiently navigates the geometric center of a multi-robot formation to a target point without collisions while guaranteeing the connectivity of the formation throughout navigation. The invention represents the navigation policy with a parameterized function that can satisfy constraints, and thereby guarantees the connectivity of the formation during navigation. The invention also implements this constraint-satisfying parameterized function with a virtual policy-extended environment framework, so that it remains compatible with deep reinforcement learning algorithms that require the parameterized function to be differentiable.
Description
Technical field
The present invention relates to the field of mobile robot technology, and more particularly to a multi-robot formation navigation method based on deep reinforcement learning that can guarantee a connectivity constraint.
Background art
Multi-robot formations have broad application prospects, such as rescue, search, exploration, agricultural spraying, and cooperative transport. While carrying out a task, a formation may need to operate in an unknown, complex scene, where the formation navigation strategy is critical to the safety and efficiency of the formation. The communication range of a robot is usually limited, so to keep the robots in communication with one another, the navigation strategy must account for the connectivity of the formation. Multi-robot formation navigation methods include rule-based methods and methods based on deep reinforcement learning. Rule-based methods depend on a constructed obstacle map, and building such a map is sometimes difficult and can consume considerable computing resources. By comparison, methods based on deep reinforcement learning map raw perception data directly to robot control commands without building an obstacle map, and have therefore attracted wide attention. However, in these methods the final control command is usually produced by an unconstrained parameterized function, so it may break the connectivity of the formation and interrupt communication among the robots.
Several approaches add constraints to a policy obtained by deep reinforcement learning so that it avoids control commands that violate the constraints.
Reward shaping adds constraints to the policy by modifying the reward function. Because the final control command is still produced by an unconstrained parameterized function, the added constraint is a soft constraint: it raises the probability that the constraint is satisfied and lowers the probability that it is violated, but it cannot guarantee satisfaction.
Normalization methods place a normalizing function (such as Sigmoid, Tanh, or Clipping) at the end of the unconstrained parameterized function so that the output falls within a fixed interval, and then multiply the output by a coefficient to obtain the final control command. This handles interval-type constraints well but cannot handle connectivity constraints.
Control-theoretic methods add constraints using tools such as control barrier functions and Lyapunov functions, and have a strong theoretical foundation. However, they require additional assumptions, and these assumptions may not hold in multi-robot formation navigation.
Hierarchical methods apply divide and conquer, splitting the decision process into a high-level decision and a low-level decision. The high-level decision is learned by deep reinforcement learning, and the low-level decision is made by a low-level decision-maker that can guarantee the constraints. However, designing the hierarchy (that is, deciding where the boundary between high-level and low-level decisions lies) is not easy, and a low-level decision-maker that guarantees the constraints is sometimes unavailable.
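The limitation of the normalization method described above can be illustrated with a minimal Python sketch. The function names, the speed bound `v_max`, the one-step motion model `p + a*dt`, and the numeric values are assumptions chosen for illustration; they do not come from the patent.

```python
import math

def normalized_control(raw, v_max):
    """Squash each raw network output with tanh, then scale into [-v_max, v_max]."""
    return [v_max * math.tanh(r) for r in raw]

def next_distance(p1, a1, p2, a2, dt):
    """Distance between two robots after one step of the assumed model p <- p + a*dt."""
    dx = (p1[0] + a1[0] * dt) - (p2[0] + a2[0] * dt)
    dy = (p1[1] + a1[1] * dt) - (p2[1] + a2[1] * dt)
    return math.hypot(dx, dy)

# Two robots 3.0 apart whose raw outputs push them in opposite directions.
a1 = normalized_control([-5.0, 0.0], v_max=1.0)
a2 = normalized_control([5.0, 0.0], v_max=1.0)
gap = next_distance([0.0, 0.0], a1, [3.0, 0.0], a2, dt=1.0)

# Every component respects the interval bound, yet the robots can still
# end up farther apart than an assumed communication distance of 3.5.
interval_ok = all(abs(c) <= 1.0 for c in a1 + a2)
connected = gap <= 3.5
```

Each command stays inside its interval, but the bound is applied to every robot separately, so nothing couples the commands of different robots. That coupling is what a connectivity constraint needs.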
Summary of the invention
To overcome the above defects in the prior art, the present invention provides a multi-robot formation navigation method based on deep reinforcement learning that can guarantee a connectivity constraint, and that effectively guarantees the connectivity of the multi-robot formation during navigation.
To solve the above technical problem, the technical solution adopted by the invention is a multi-robot formation navigation method based on deep reinforcement learning that can guarantee a connectivity constraint. Its key idea is to represent the multi-robot formation navigation policy with a parameterized function that can satisfy constraints, and thereby guarantee the connectivity of the formation during navigation. The invention also implements this constraint-satisfying parameterized function with a virtual policy-extended environment framework, so that it is compatible with deep reinforcement learning algorithms that require the parameterized function to be differentiable.
Further, the constraint-satisfying parameterized function consists of two parts: a general unconstrained parameterized function (such as a neural network) and a constrained optimization module. Because the final control command is produced by the constrained optimization module rather than directly by the unconstrained parameterized function, the constraint is guaranteed to be satisfied.
Within the constraint-satisfying parameterized function, the value $z_\theta(o)$ that the unconstrained parameterized function with parameter $\theta$ computes from the observation $o$ is no longer the final control command. Instead, it is passed as an input to the constrained optimization module. The module solves a constrained optimization problem that depends on the incoming $z_\theta(o)$ and obtains a control command $a$ that guarantees connectivity. Specifically, the objective function $f(z_\theta(o), a)$ of the constrained optimization problem depends on the variable $a$ to be optimized and on the output $z_\theta(o)$ of the unconstrained parameterized function, and the constraint of the problem is the connectivity constraint.
For a given observation $o$, different parameters $\theta_1$ and $\theta_2$ produce different outputs $z_{\theta_1}(o)$ and $z_{\theta_2}(o)$, and hence different objective functions $f(z_{\theta_1}(o), a)$ and $f(z_{\theta_2}(o), a)$. After passing through the constrained optimization module, these different objective functions yield different final control commands $a_1$ and $a_2$. Because the final control command is always produced by the constrained optimization module, it always satisfies the connectivity constraint.
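The two-part structure described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the linear map stands in for a neural network, the objective `a.a + z.a` is the one the patent chooses later, and the norm bound `||a|| <= a_max` is a simple stand-in for the connectivity constraint. It is used here because its minimizer has a closed form.

```python
import math

def unconstrained_fn(theta, o):
    """Toy linear stand-in for z_theta(o): one weight row per output component."""
    return [sum(w * x for w, x in zip(row, o)) for row in theta]

def constrained_module(z, a_max):
    """Return argmin_a a.a + z.a subject to ||a|| <= a_max (stand-in constraint)."""
    a = [-0.5 * zi for zi in z]               # unconstrained minimizer of a.a + z.a
    norm = math.sqrt(sum(ai * ai for ai in a))
    if norm > a_max:                           # outside the ball: project onto it
        a = [ai * a_max / norm for ai in a]
    return a

def policy(theta, o, a_max=1.0):
    """Constraint-satisfying function: unconstrained part, then optimization module."""
    return constrained_module(unconstrained_fn(theta, o), a_max)

o = [1.0, 2.0]
theta1 = [[0.2, 0.0], [0.0, 0.1]]
theta2 = [[-4.0, 0.0], [0.0, 3.0]]
a1 = policy(theta1, o)   # small z: the unconstrained minimizer is already feasible
a2 = policy(theta2, o)   # large z: the module pulls the command back to the boundary
```

Both parameter settings give different commands, and both commands are feasible, because feasibility is enforced by the module and does not depend on what the parameterized part outputs.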
Further, the constraint-satisfying parameterized function is a function with the observation $o$ as input, the control command $a$ as output, and $\theta$ as parameter. Its mathematical form is as follows:
$$\pi_\theta(o) = \arg\min_{a} f(z_\theta(o), a) \quad \text{s.t.} \quad g_i(z_\theta(o), a) \le 0,\; h_i(z_\theta(o), a) = 0 \tag{1}$$
In the formula, $z_\theta(o)$ is the unconstrained parameterized function and $f(z_\theta(o), a)$ is the objective function of the constrained optimization problem; $g_i(z_\theta(o), a)$ and $h_i(z_\theta(o), a)$ are the inequality constraint functions and equality constraint functions of the constrained optimization problem, respectively. With $f(z, a)$ acting as the intermediary, the constraint-satisfying parameterized function has both the expressive power of the unconstrained parameterized function and the ability of the constrained optimization problem to restrict the final control command.
Further, regarding the derivation of the connectivity constraint, assume that the kinematic model of the i-th robot is:
$$p_{t+1}^{i} = p_{t}^{i} + a_{t}^{i}\,\Delta t \tag{2}$$
where $p_t^i$ and $a_t^i$ are the position and the control command of the i-th robot at time t, and $\Delta t$ is the time interval. Choosing the concrete objective function $f(z, a) = a^{T}a + z^{T}a$, the constraint-satisfying parameterized function for the connectivity constraint takes the mathematical form given by formula (3), where N is the number of robots in the formation, d is the communication distance, and $a_t$ and $o_t$ are the concatenations of the control commands and observations of all N robots at time t, i.e. $a_t = [a_t^1, \dots, a_t^N]$ and $o_t = [o_t^1, \dots, o_t^N]$. The observation $o_t^i$ of the i-th robot at time t comprises its perception of the environment, its own current velocity, the positions of the other robots, and the position of the target point.
Further, for each robot i, the teammate-information part of its observation can be redefined as in formula (4). Substituting formula (4) into formula (3) above and taking robot 1 as an example gives the final mathematical form of the constraint-satisfying parameterized function for the connectivity constraint, formula (5).
This constrained optimization problem is convex and can therefore be solved efficiently. In summary, each robot uses the shared unconstrained parameterized function $z_\theta$ to compute its own $z_\theta(o_t^i)$ from its own observation $o_t^i$. All robots then exchange information so that each holds every $z_\theta(o_t^i)$, and each robot solves the same, equivalent constrained optimization problem to obtain an $a_t$ that satisfies the connectivity constraint. Finally, each robot takes the component $a_t^i$ corresponding to its own index from $a_t$ and executes it as its control command.
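For intuition, the smallest instance of such a problem, two robots in the plane, can be solved in closed form. The sketch below is hedged: it assumes the single-integrator step `p + a*dt` and assumes the connectivity constraint means "the distance between the two robots after the step is at most d". The function name `solve_two_robot` and the test values are hypothetical. A larger formation would need a general convex solver, which is not shown here.

```python
import math

def solve_two_robot(z, p1, p2, dt, d):
    """argmin_a a.a + z.a  s.t.  ||(p1 + a1*dt) - (p2 + a2*dt)|| <= d.

    z stacks the outputs for robot 1 and robot 2 as [z1x, z1y, z2x, z2y].
    Returns the two per-robot commands (a1, a2).
    """
    # The unconstrained minimizer of a.a + z.a is -z/2.
    a1 = [-0.5 * z[0], -0.5 * z[1]]
    a2 = [-0.5 * z[2], -0.5 * z[3]]
    # Relative position after the step if those commands were executed.
    rx = (p1[0] + a1[0] * dt) - (p2[0] + a2[0] * dt)
    ry = (p1[1] + a1[1] * dt) - (p2[1] + a2[1] * dt)
    dist = math.hypot(rx, ry)
    if dist > d:
        # Constraint violated: shift both commands symmetrically along the
        # line joining the robots until the next-step distance equals d.
        shift = (dist - d) / (2.0 * dt)
        ux, uy = rx / dist, ry / dist
        a1 = [a1[0] - shift * ux, a1[1] - shift * uy]
        a2 = [a2[0] + shift * ux, a2[1] + shift * uy]
    return a1, a2

# Robots 3.0 apart; z asks them to separate at unit speed each.
a1, a2 = solve_two_robot([2.0, 0.0, -2.0, 0.0], [0.0, 0.0], [3.0, 0.0], dt=1.0, d=3.5)
```

The symmetric shift is the Euclidean projection of the proposed commands onto the feasible set, which is why it is the exact minimizer for this objective. Splitting the correction equally between the robots is what makes the total change in the commands smallest.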
In the present invention, because the constraint-satisfying parameterized function contains a constrained optimization module, the function as a whole may be non-differentiable even if the unconstrained parameterized function inside it is differentiable. If the multi-robot formation navigation policy were represented directly by the constraint-satisfying parameterized function, deep reinforcement learning methods that require the parameterized function to be differentiable could not be used.
To make the constraint-satisfying parameterized function compatible with deep reinforcement learning methods that require a differentiable parameterized function, the invention implements it in the virtual policy-extended environment manner. Through the virtual policy-extended environment framework, the original reinforcement learning problem (formed by the possibly non-differentiable constraint-satisfying parameterized function and the original environment) is converted into an equivalent reinforcement learning problem (formed by a differentiable virtual policy and an extended environment), which can be solved with deep reinforcement learning methods that require a differentiable parameterized function. Substituting the constraint-satisfying parameterized function described above, together with its implementation based on the virtual policy-extended environment framework, into a multi-robot formation navigation method based on deep reinforcement learning then yields the final navigation policy.
Compared with the prior art, the beneficial effects are as follows. The invention represents the multi-robot formation navigation policy with a parameterized function that can satisfy constraints, and thereby guarantees the connectivity of the formation during navigation. Compared with hierarchical methods, the method of the invention guarantees the connectivity constraint while being closer to plug-and-play: it needs no explicitly designed hierarchy and does not rely on a low-level decision-maker that guarantees the constraints. In addition, the invention implements the constraint-satisfying parameterized function with the virtual policy-extended environment framework, so that it is compatible with deep reinforcement learning algorithms that require the parameterized function to be differentiable.
Brief description of the drawings
Fig. 1 shows the constraint-satisfying parameterized function.
Fig. 2 illustrates the constraint-satisfying parameterized function with an example.
Fig. 3 is a schematic diagram of the policy-environment framework.
Fig. 4 is a schematic diagram of the virtual policy-extended environment framework.
Fig. 5 is the decision flowchart.
Detailed description of the embodiments
The drawings are for illustration only and shall not be construed as limiting the invention. To better illustrate the embodiments, some components in the drawings may be omitted, enlarged, or reduced, and do not represent the dimensions of the actual product. Those skilled in the art will understand that some well-known structures and their descriptions may be omitted from the drawings. The positional relationships depicted in the drawings are for illustration only and shall not be construed as limiting the invention.
Embodiment 1
The key idea of the proposed multi-robot formation navigation method based on deep reinforcement learning with a guaranteed connectivity constraint is to represent the multi-robot formation navigation policy with a parameterized function that can satisfy constraints, and thereby guarantee the connectivity of the formation during navigation. The invention also implements this constraint-satisfying parameterized function with a virtual policy-extended environment framework, so that it is compatible with deep reinforcement learning algorithms that require the parameterized function to be differentiable.
As shown in Fig. 1, the constraint-satisfying parameterized function consists of two parts: a general unconstrained parameterized function (such as a neural network) and a constrained optimization module. Because the final control command is produced by the constrained optimization module rather than directly by the unconstrained parameterized function, the constraint is guaranteed to be satisfied.
Within the constraint-satisfying parameterized function, the value $z_\theta(o)$ that the unconstrained parameterized function with parameter $\theta$ computes from the observation $o$ is no longer the final control command. Instead, it is passed as an input to the constrained optimization module. The module solves a constrained optimization problem that depends on the incoming $z_\theta(o)$ and obtains a control command $a$ that guarantees connectivity. Specifically, the objective function $f(z_\theta(o), a)$ of the constrained optimization problem depends on the variable $a$ to be optimized and on the output $z_\theta(o)$ of the unconstrained parameterized function, and the constraint of the problem is the connectivity constraint.
The input/output behavior of the constraint-satisfying parameterized function is illustrated below with the example of Fig. 2. For a given observation $o$, different parameters $\theta_1$ and $\theta_2$ produce different outputs $z_{\theta_1}(o)$ and $z_{\theta_2}(o)$, and hence different objective functions $f(z_{\theta_1}(o), a)$ and $f(z_{\theta_2}(o), a)$. After passing through the constrained optimization module, these different objective functions yield different final control commands $a_1$ and $a_2$. Because the final control command is always produced by the constrained optimization module, it always satisfies the connectivity constraint.
In summary, the constraint-satisfying parameterized function is a function with the observation $o$ as input, the control command $a$ as output, and $\theta$ as parameter. Its mathematical form is as follows:
$$\pi_\theta(o) = \arg\min_{a} f(z_\theta(o), a) \quad \text{s.t.} \quad g_i(z_\theta(o), a) \le 0,\; h_i(z_\theta(o), a) = 0 \tag{1}$$
With $f(z, a)$ acting as the intermediary, the constraint-satisfying parameterized function has both the expressive power of the unconstrained parameterized function and the ability of the constrained optimization problem to restrict the final control command.
The derivation of the connectivity constraint is supplemented next. Assume that the kinematic model of the i-th robot is:
$$p_{t+1}^{i} = p_{t}^{i} + a_{t}^{i}\,\Delta t \tag{2}$$
where $p_t^i$ and $a_t^i$ are the position and the control command of the i-th robot at time t, and $\Delta t$ is the time interval. Choosing the concrete objective function $f(z, a) = a^{T}a + z^{T}a$, the constraint-satisfying parameterized function for the connectivity constraint takes the mathematical form given by formula (3), where N = 3 is the number of robots in the formation, d = 3.5 is the communication distance, and $a_t$ and $o_t$ are the concatenations of the control commands and observations of all N robots at time t, i.e. $a_t = [a_t^1, \dots, a_t^N]$ and $o_t = [o_t^1, \dots, o_t^N]$. The observation $o_t^i$ of the i-th robot at time t comprises its perception of the environment (the point cloud of a two-dimensional lidar), its own current velocity, the positions of the other robots, and the position of the target point.
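One way to read the objective $a^{T}a + z^{T}a$, offered here as an interpretation rather than as part of the patent text, is to complete the square. The symbol $\mathcal{C}$ below is introduced only for illustration and denotes the set of control commands that satisfy the connectivity constraint.

```latex
\begin{aligned}
a^{T}a + z^{T}a
  &= \left\lVert a + \tfrac{1}{2}z \right\rVert^{2} - \tfrac{1}{4}\lVert z \rVert^{2}, \\
\arg\min_{a \in \mathcal{C}} \left( a^{T}a + z^{T}a \right)
  &= \arg\min_{a \in \mathcal{C}} \left\lVert a - \left(-\tfrac{1}{2}z\right) \right\rVert^{2}.
\end{aligned}
```

The term $\tfrac{1}{4}\lVert z \rVert^{2}$ does not depend on $a$, so under this reading the unconstrained function proposes the command $-z/2$ and the constrained optimization module returns the feasible command closest to that proposal. When $-z/2$ is already feasible, it is returned unchanged.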
For each robot i, the teammate-information part of its observation can be redefined as in formula (4).
Substituting formula (4) into formula (3) above and taking robot 1 as an example gives the final mathematical form of the constraint-satisfying parameterized function for the connectivity constraint, formula (5).
This constrained optimization problem is convex and can therefore be solved efficiently. In summary, the overall process of multi-robot formation navigation based on deep reinforcement learning with a guaranteed connectivity constraint is shown in Fig. 5. Each robot uses the shared unconstrained parameterized function $z_\theta$ to compute its own $z_\theta(o_t^i)$ from its own observation $o_t^i$. All robots then exchange information so that each holds every $z_\theta(o_t^i)$, and each robot solves the same, equivalent constrained optimization problem to obtain an $a_t$ that satisfies the connectivity constraint. Finally, each robot takes the component $a_t^i$ corresponding to its own index from $a_t$ and executes it as its control command.
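The decision flow just described (compute locally, exchange, solve the same problem everywhere, keep your own slice) can be sketched as follows. Everything here is a stand-in chosen for illustration: `shared_fn` replaces the shared network, the message exchange is simulated by copying a list, and `solve_joint` uses a per-component box bound in place of the connectivity constraint so that the example stays short.

```python
def shared_fn(theta, obs):
    """Stand-in for the shared unconstrained function; theta is a single scalar weight."""
    return [theta * x for x in obs]

def solve_joint(z_all, a_max=1.0):
    """Stand-in for the constrained optimization module.

    Minimizes a.a + z.a under the box bound |a_k| <= a_max, whose solution is
    -z/2 clipped component-wise. It is deterministic, so every robot that
    solves it with the same inputs obtains the same joint command.
    """
    return [[max(-a_max, min(a_max, -0.5 * zk)) for zk in z] for z in z_all]

def decision_step(theta, observations):
    """One decision step for all robots; returns the command each robot executes."""
    n = len(observations)
    # 1. Each robot evaluates the shared function on its own observation.
    z_local = [shared_fn(theta, o) for o in observations]
    # 2. Information exchange: afterwards every robot holds all z values.
    z_received = [list(z_local) for _ in range(n)]
    # 3. Each robot solves the same constrained problem on its own copy.
    joint = [solve_joint(z_received[i]) for i in range(n)]
    # 4. Each robot keeps only the slice that matches its own index.
    return [joint[i][i] for i in range(n)]

commands = decision_step(1.0, [[1.0, 0.0], [0.0, -4.0], [2.0, 2.0]])
```

Because step 3 is identical on every robot, no robot has to transmit commands back to the others. The only communication is the exchange of the `z` values in step 2.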
Because the constraint-satisfying parameterized function contains a constrained optimization module, the function as a whole may be non-differentiable even if the unconstrained parameterized function inside it is differentiable. Therefore, if the multi-robot formation navigation policy were represented directly by the constraint-satisfying parameterized function, as in Fig. 3, deep reinforcement learning methods that require the parameterized function to be differentiable could not be used.
To make the constraint-satisfying parameterized function compatible with deep reinforcement learning methods that require a differentiable parameterized function, the invention implements it in the virtual policy-extended environment manner. As shown in Fig. 4, in the virtual policy-extended environment framework the boundary between the agent and the environment no longer lies between the constraint-satisfying parameterized function and the environment, as it does in Fig. 3, but between the virtual policy and the extended environment. Here the virtual policy is the unconstrained parameterized function inside the constraint-satisfying parameterized function, and the extended environment consists of the constrained optimization module of the constraint-satisfying parameterized function together with the original environment.
Through the virtual policy-extended environment framework, the original reinforcement learning problem (formed by the possibly non-differentiable constraint-satisfying parameterized function and the original environment) is converted into an equivalent reinforcement learning problem (formed by a differentiable virtual policy and an extended environment), which can be solved with deep reinforcement learning methods that require a differentiable parameterized function. Substituting the constraint-satisfying parameterized function described above, together with its implementation based on the virtual policy-extended environment framework, into a multi-robot formation navigation method based on deep reinforcement learning then yields the final navigation policy.
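In code, the virtual policy-extended environment idea amounts to wrapping the original environment so that the constrained optimization module runs inside `step`. The sketch below is a hypothetical illustration only: `ToyEnv`, its one-dimensional dynamics, its reward, and the bound `a_max` are invented for the example, and the scalar clipping stands in for the real constrained optimization problem.

```python
class ToyEnv:
    """Hypothetical original environment: a 1-D position driven by a velocity command."""

    def __init__(self):
        self.x = 0.0

    def reset(self):
        self.x = 0.0
        return self.x

    def step(self, a):
        self.x += a
        reward = -abs(self.x - 1.0)   # closer to the goal at x = 1 is better
        return self.x, reward

def constrained_module(z, a_max=0.5):
    """Stand-in module: argmin_a a*a + z*a subject to |a| <= a_max."""
    return max(-a_max, min(a_max, -0.5 * z))

class ExtendedEnv:
    """Original environment with the constrained module moved into its input side."""

    def __init__(self, env):
        self.env = env

    def reset(self):
        return self.env.reset()

    def step(self, z):
        a = constrained_module(z)     # preprocessing: map z to a feasible command
        return self.env.step(a)

env = ExtendedEnv(ToyEnv())
obs = env.reset()
obs, reward = env.step(-10.0)         # the learner only ever emits z, never a
```

From the learner's point of view the action is `z` and the policy is just the unconstrained function, so a standard algorithm that needs a differentiable policy can be applied to it without differentiating through the optimization module.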
In this embodiment, the multi-robot formation navigation policy is represented by a parameterized function that can satisfy constraints, which guarantees the connectivity of the formation during navigation. The constraint-satisfying parameterized function is implemented with the virtual policy-extended environment framework, so that it is compatible with deep reinforcement learning algorithms that require the parameterized function to be differentiable.
Obviously, the above embodiment is merely an example given to illustrate the invention clearly and does not limit its embodiments. Those of ordinary skill in the art can make other variations or changes in different forms on the basis of the above description. It is neither necessary nor possible to enumerate all embodiments here. Any modification, equivalent replacement, or improvement made within the spirit and principles of the invention shall fall within the protection scope of the claims of the invention.
Claims (6)
1. A multi-robot formation navigation method based on deep reinforcement learning that can guarantee a connectivity constraint, characterized in that the multi-robot formation navigation policy is represented by a parameterized function that can satisfy constraints; the constraint-satisfying parameterized function consists of two parts, namely an unconstrained parameterized function and a constrained optimization module; the constraint-satisfying parameterized function is a function with the observation $o$ as input, the control command $a$ as output, and $\theta$ as parameter, and its mathematical form is as follows:
$$\pi_\theta(o) = \arg\min_{a} f(z_\theta(o), a) \quad \text{s.t.} \quad g_i(z_\theta(o), a) \le 0,\; h_i(z_\theta(o), a) = 0 \tag{1}$$
In the formula, $z_\theta(o)$ is the unconstrained parameterized function and $f(z_\theta(o), a)$ is the objective function of the constrained optimization problem; $g_i(z_\theta(o), a)$ and $h_i(z_\theta(o), a)$ are the inequality constraint functions and equality constraint functions of the constrained optimization problem, respectively.
2. The multi-robot formation navigation method based on deep reinforcement learning that can guarantee a connectivity constraint according to claim 1, characterized in that the kinematic model of the i-th robot is assumed to be:
$$p_{t+1}^{i} = p_{t}^{i} + a_{t}^{i}\,\Delta t \tag{2}$$
where $p_t^i$ and $a_t^i$ are the position and the control command of the i-th robot at time t, and $\Delta t$ is the time interval; the concrete objective function is chosen as $f(z, a) = a^{T}a + z^{T}a$, and the constraint-satisfying parameterized function for the connectivity constraint then takes the mathematical form given by formula (3), where N is the number of robots in the formation, d is the communication distance, and $a_t$ and $o_t$ are the concatenations of the control commands and observations of all N robots at time t, i.e. $a_t = [a_t^1, \dots, a_t^N]$ and $o_t = [o_t^1, \dots, o_t^N]$; the observation $o_t^i$ of the i-th robot at time t comprises its perception of the environment, its own current velocity, the positions of the other robots, and the position of the target point.
3. The multi-robot formation navigation method based on deep reinforcement learning that can guarantee a connectivity constraint according to claim 2, characterized in that, for each robot i, the teammate-information part of its observation can be redefined as in formula (4); formula (4) is substituted into the above formula (3) of the constraint-satisfying parameterized function for the connectivity constraint; taking robot 1 as an example, the final mathematical form of the constraint-satisfying parameterized function for the connectivity constraint is given by formula (5).
4. The multi-robot formation navigation method based on deep reinforcement learning that can guarantee a connectivity constraint according to claim 3, characterized in that each robot uses the shared unconstrained parameterized function $z_\theta$ to compute its own $z_\theta(o_t^i)$ from its own observation $o_t^i$; all robots then exchange information so that each holds every $z_\theta(o_t^i)$, and each solves the same, equivalent constrained optimization problem to obtain an $a_t$ that satisfies the connectivity constraint; finally, each robot takes the component $a_t^i$ corresponding to its own index from $a_t$ and executes it as its control command.
5. The multi-robot formation navigation method based on deep reinforcement learning that can guarantee a connectivity constraint according to any one of claims 1 to 4, characterized in that the constraint-satisfying parameterized function is implemented in the virtual policy-extended environment manner; through the virtual policy-extended environment framework, the original reinforcement learning problem is converted into an equivalent reinforcement learning problem, so that it can be solved with deep reinforcement learning methods that require the parameterized function to be differentiable; wherein the original reinforcement learning problem is formed by the possibly non-differentiable constraint-satisfying parameterized function and the original environment, and the equivalent reinforcement learning problem is formed by a differentiable virtual policy and an extended environment.
6. The multi-robot formation navigation method based on deep reinforcement learning that can guarantee a connectivity constraint according to claim 5, characterized in that, letting a be the agent, e the environment, and f the constraint-satisfying parameterized function, f consists of two parts b and c, where b is the unconstrained parameterized function and c is the constrained optimization module; in the code implementation, the virtual policy-extended environment manner is used to implement the constrained optimization module c in the data preprocessing part of the environment e, that is, the constrained optimization module c originally located at the back end of the agent a is moved to the front end of the environment e.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201910795982.0A CN110470306B (en) | 2019-08-27 | 2019-08-27 | Multi-robot formation navigation method capable of guaranteeing connectivity constraint and based on deep reinforcement learning |
Applications Claiming Priority (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201910795982.0A CN110470306B (en) | 2019-08-27 | 2019-08-27 | Multi-robot formation navigation method capable of guaranteeing connectivity constraint and based on deep reinforcement learning |
Publications (2)
Publication Number | Publication Date |
---|---|
CN110470306A true CN110470306A (en) | 2019-11-19 |
CN110470306B CN110470306B (en) | 2023-03-10 |
Family
ID=68512365
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN201910795982.0A Active CN110470306B (en) | 2019-08-27 | 2019-08-27 | Multi-robot formation navigation method capable of guaranteeing connectivity constraint and based on deep reinforcement learning |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN110470306B (en) |
Cited By (7)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN111781922A (en) * | 2020-06-15 | 2020-10-16 | 中山大学 | Multi-robot collaborative navigation method based on deep reinforcement learning and suitable for complex dynamic scene |
CN111897224A (en) * | 2020-08-13 | 2020-11-06 | 福州大学 | Multi-agent formation control method based on actor-critic reinforcement learning and fuzzy logic |
CN112051780A (en) * | 2020-09-16 | 2020-12-08 | 北京理工大学 | Brain-computer interface-based mobile robot formation control system and method |
CN112711261A (en) * | 2020-12-30 | 2021-04-27 | 浙江大学 | Multi-agent formation planning method based on local visual field |
CN112817327A (en) * | 2020-12-30 | 2021-05-18 | 北京航空航天大学 | Multi-unmanned aerial vehicle collaborative search method under communication constraint |
CN114326438A (en) * | 2021-12-30 | 2022-04-12 | 北京理工大学 | Safety reinforcement learning four-rotor control system and method based on control barrier function |
CN115328143A (en) * | 2022-08-26 | 2022-11-11 | 齐齐哈尔大学 | Master-slave water surface robot recovery guiding method based on environment driving |
Citations (8)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
JP2004199160A (en) * | 2002-12-16 | 2004-07-15 | Canon Inc | Optimum design method |
JP2016009354A (en) * | 2014-06-25 | 2016-01-18 | 日本電信電話株式会社 | Behavior control system, method therefor, and program |
US20180074524A1 (en) * | 2015-07-17 | 2018-03-15 | Mitsubishi Heavy Industries, Ltd. | Aircraft control device, aircraft, and method for computing aircraft trajectory |
US20180260700A1 (en) * | 2017-03-09 | 2018-09-13 | Alphaics Corporation | Method and system for implementing reinforcement learning agent using reinforcement learning processor |
CN109270934A (en) * | 2018-11-01 | 2019-01-25 | 中国科学技术大学 | Multi-robot formation continuation of the journey method based on pilotage people's switching |
US20190094866A1 (en) * | 2017-09-22 | 2019-03-28 | Locus Robotics Corporation | Dynamic window approach using optimal reciprocal collision avoidance cost-critic |
CN109540151A (en) * | 2018-03-25 | 2019-03-29 | 哈尔滨工程大学 | A kind of AUV three-dimensional path planning method based on intensified learning |
CN110147101A (en) * | 2019-05-13 | 2019-08-20 | 中山大学 | A kind of end-to-end distributed robots formation air navigation aid based on deeply study |
-
2019
- 2019-08-27 CN CN201910795982.0A patent/CN110470306B/en active Active
Non-Patent Citations (1)
Title |
---|
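王洋: "Research on Optimized Operation Strategies for Active Distribution Networks", 《中国优秀硕士学位论文全文数据库工程科技II辑》 (China Master's Theses Full-text Database, Engineering Science and Technology II) *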
Cited By (12)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN111781922A (en) * | 2020-06-15 | 2020-10-16 | 中山大学 | Multi-robot collaborative navigation method based on deep reinforcement learning and suitable for complex dynamic scene |
CN111781922B (en) * | 2020-06-15 | 2021-10-26 | 中山大学 | Multi-robot collaborative navigation method based on deep reinforcement learning |
CN111897224A (en) * | 2020-08-13 | 2020-11-06 | 福州大学 | Multi-agent formation control method based on actor-critic reinforcement learning and fuzzy logic |
CN112051780A (en) * | 2020-09-16 | 2020-12-08 | 北京理工大学 | Brain-computer interface-based mobile robot formation control system and method |
CN112051780B (en) * | 2020-09-16 | 2022-05-17 | 北京理工大学 | Brain-computer interface-based mobile robot formation control system and method |
CN112711261A (en) * | 2020-12-30 | 2021-04-27 | 浙江大学 | Multi-agent formation planning method based on local visual field |
CN112817327A (en) * | 2020-12-30 | 2021-05-18 | 北京航空航天大学 | Multi-unmanned aerial vehicle collaborative search method under communication constraint |
CN112817327B (en) * | 2020-12-30 | 2022-07-08 | 北京航空航天大学 | Multi-unmanned aerial vehicle collaborative search method under communication constraint |
CN114326438A (en) * | 2021-12-30 | 2022-04-12 | 北京理工大学 | Safety reinforcement learning four-rotor control system and method based on control barrier function |
CN114326438B (en) * | 2021-12-30 | 2023-12-19 | 北京理工大学 | Safety reinforcement learning four-rotor control system and method based on control barrier function |
CN115328143A (en) * | 2022-08-26 | 2022-11-11 | 齐齐哈尔大学 | Master-slave water surface robot recovery guiding method based on environment driving |
CN115328143B (en) * | 2022-08-26 | 2023-04-18 | 齐齐哈尔大学 | Master-slave water surface robot recovery guiding method based on environment driving |
Also Published As
Publication number | Publication date |
---|---|
CN110470306B (en) | 2023-03-10 |
Similar Documents
Publication | Title | Publication Date |
---|---|---|
CN110470306A (en) | Multi-robot formation navigation method based on deep reinforcement learning with guaranteed connectivity constraints | |
Deo et al. | Trajectory forecasts in unknown environments conditioned on grid-based plans | |
CN110147101B (en) | End-to-end distributed multi-robot formation navigation method based on deep reinforcement learning | |
Qin et al. | Review of autonomous path planning algorithms for mobile robots | |
Li | An effective methodology for solving matrix games with fuzzy payoffs | |
Xiang et al. | Continuous control with deep reinforcement learning for mobile robot navigation | |
Yang et al. | UAV formation trajectory planning algorithms: A review | |
Cui et al. | Learning world transition model for socially aware robot navigation | |
WO2021116875A1 (en) | Formally safe symbolic reinforcement learning on visual inputs | |
Huang et al. | Recoat: A deep learning-based framework for multi-modal motion prediction in autonomous driving application | |
Wang et al. | Oracle-guided deep reinforcement learning for large-scale multi-UAVs flocking and navigation | |
Dey | Applied Genetic Algorithm and Its Variants: Case Studies and New Developments | |
Tao et al. | A path-planning method for wall surface inspection robot based on improved genetic algorithm | |
Qiu | Multi-agent navigation based on deep reinforcement learning and traditional pathfinding algorithm | |
Li et al. | When digital twin meets deep reinforcement learning in multi-UAV path planning | |
Zhu et al. | Tri-HGNN: Learning triple policies fused hierarchical graph neural networks for pedestrian trajectory prediction | |
Bialas et al. | Coverage path planning for unmanned aerial vehicles in complex 3d environments with deep reinforcement learning | |
Desai et al. | Auxiliary tasks for efficient learning of point-goal navigation | |
CN116718190A (en) | Mobile robot path planning method in long-distance dense crowd scene | |
Wang et al. | Vision-Based Autonomous Driving: A Hierarchical Reinforcement Learning Approach | |
CN114326826B (en) | Multi-unmanned aerial vehicle formation transformation method and system | |
Wu et al. | 3D multi-constraint route planning for UAV low-altitude penetration based on multi-agent genetic algorithm | |
Braglia et al. | Online Motion Planning for Safe Human–Robot Cooperation Using B-Splines and Hidden Markov Models | |
CN115527272A (en) | Construction method of pedestrian trajectory prediction model | |
CN111723941B (en) | Rule generation method and device, electronic equipment and storage medium | |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | ||
SE01 | Entry into force of request for substantive examination | ||
GR01 | Patent grant | ||