CN110470306A - Multi-robot formation navigation method based on deep reinforcement learning and capable of guaranteeing connectivity constraints - Google Patents

Multi-robot formation navigation method based on deep reinforcement learning and capable of guaranteeing connectivity constraints

Info

Publication number
CN110470306A
CN110470306A CN201910795982.0A CN201910795982A CN110470306A CN 110470306 A CN110470306 A CN 110470306A CN 201910795982 A CN201910795982 A CN 201910795982A CN 110470306 A CN110470306 A CN 110470306A
Authority
CN
China
Prior art keywords
robot
function
constraint condition
connectivity
parameterized function
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Granted
Application number
CN201910795982.0A
Other languages
Chinese (zh)
Other versions
CN110470306B (en)
Inventor
林俊潼
成慧
杨旭韵
郑培炜
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
National Sun Yat Sen University
Original Assignee
National Sun Yat Sen University
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by National Sun Yat Sen University
Priority to CN201910795982.0A
Publication of CN110470306A
Application granted
Publication of CN110470306B
Legal status: Active
Anticipated expiration

Classifications

    • G PHYSICS
    • G01 MEASURING; TESTING
    • G01C MEASURING DISTANCES, LEVELS OR BEARINGS; SURVEYING; NAVIGATION; GYROSCOPIC INSTRUMENTS; PHOTOGRAMMETRY OR VIDEOGRAMMETRY
    • G01C21/00 Navigation; Navigational instruments not provided for in groups G01C1/00 - G01C19/00
    • G01C21/20 Instruments for performing navigational calculations

Landscapes

  • Engineering & Computer Science (AREA)
  • Radar, Positioning & Navigation (AREA)
  • Remote Sensing (AREA)
  • Automation & Control Theory (AREA)
  • Physics & Mathematics (AREA)
  • General Physics & Mathematics (AREA)
  • Feedback Control In General (AREA)
  • Manipulator (AREA)

Abstract

The present invention relates to the technical field of mobile robots, and more particularly to a multi-robot formation navigation method based on deep reinforcement learning that is capable of guaranteeing connectivity constraints. The method efficiently navigates the geometric center of a multi-robot formation to a target point without collision, while guaranteeing the connectivity of the formation throughout navigation. The navigation policy is represented by a parameterized function capable of satisfying constraints, which guarantees the connectivity of the formation during navigation. In addition, this parameterized function is implemented with a virtual policy-extended environment framework, so that it is compatible with deep reinforcement learning algorithms that require the parameterized function to be differentiable.

Description

Multi-robot formation navigation method based on deep reinforcement learning and capable of guaranteeing connectivity constraints
Technical field
The present invention relates to the technical field of mobile robots, and more particularly to a multi-robot formation navigation method based on deep reinforcement learning that is capable of guaranteeing connectivity constraints.
Background
Multi-robot formations have broad application prospects, for example in rescue, search, exploration, agricultural spraying and cooperative transport. While carrying out a task, a formation may have to operate in an unknown, complex scene, so the formation navigation policy is critical to its safety and efficiency. The communication range of a robot is usually limited; to keep the robots in communication with one another, the navigation policy must therefore take the connectivity of the formation into account. Multi-robot formation navigation methods include rule-based methods and methods based on deep reinforcement learning. Rule-based methods depend on a constructed obstacle map, and building such a map is sometimes difficult and computationally expensive. By contrast, methods based on deep reinforcement learning map raw perception data directly to robot control inputs without building an obstacle map, and have therefore attracted wide attention. In most such methods, however, the final control input is produced by an unconstrained parameterized function, so it may break the connectivity of the formation and interrupt communication among the robots.
Several approaches add constraints to a policy obtained by deep reinforcement learning so that it avoids control inputs that violate them. Reward shaping adds constraints to the policy by modifying the reward function. Because the final control input is still produced by an unconstrained parameterized function, the added constraint is a soft constraint: it raises the probability that the constraint is satisfied and lowers the probability that it is violated, but it cannot guarantee satisfaction. Normalization methods append a normalizing function (such as Sigmoid, Tanh or Clipping) to the unconstrained parameterized function so that its output falls within a fixed interval, and then multiply by a coefficient to obtain the final control input. This handles interval-type constraints well but cannot handle connectivity constraints. Methods based on control theory add constraints with tools such as control barrier functions and Lyapunov functions and have a strong theoretical foundation, but they introduce additional assumptions that may not hold in multi-robot formation navigation. Hierarchical methods follow a divide-and-conquer idea and split the decision process into high-level and low-level decisions: the high-level decision is learned by deep reinforcement learning, and the low-level decision is made by a low-level decision maker that can guarantee the constraints. However, designing the hierarchy (that is, deciding where the boundary between high-level and low-level decisions lies) is not easy, and sometimes no low-level decision maker that can guarantee the constraints is available.
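The limitation of normalization methods noted above can be illustrated with a minimal Python sketch. It is not part of the patent: the speed limit `V_MAX`, the time step `DT`, the starting positions and the raw outputs are hypothetical, and only the communication distance 3.5 is borrowed from the embodiment. Each control input stays inside its interval, yet the two robots still move out of communication range.

```python
import math

V_MAX = 1.0  # hypothetical per-axis speed limit
D = 3.5      # communication distance (value used in the embodiment)
DT = 1.0     # hypothetical time step

def normalized_control(raw):
    """Squash an unconstrained output into [-V_MAX, V_MAX] with tanh, then scale."""
    return [V_MAX * math.tanh(r) for r in raw]

def step(position, control):
    """Move one robot for one time step (control treated as a velocity)."""
    return [p + a * DT for p, a in zip(position, control)]

# Two robots 3.0 apart whose raw outputs push them in opposite directions.
p1, p2 = [0.0, 0.0], [3.0, 0.0]
a1 = normalized_control([-5.0, 0.0])
a2 = normalized_control([5.0, 0.0])

interval_ok = all(abs(v) <= V_MAX for v in a1 + a2)  # interval constraint holds
separation = math.dist(step(p1, a1), step(p2, a2))   # distance after the step
connected = separation <= D                          # connectivity check
```

Here `interval_ok` is true while `connected` is false, because a per-robot interval bound says nothing about the distance between robots.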
Summary of the invention
To overcome the above defects in the prior art, the present invention provides a multi-robot formation navigation method based on deep reinforcement learning that is capable of guaranteeing connectivity constraints and that effectively guarantees the connectivity of the multi-robot formation during navigation.
To solve the above technical problem, the present invention adopts the following technical solution: a multi-robot formation navigation method based on deep reinforcement learning and capable of guaranteeing connectivity constraints, the key of which is to represent the multi-robot formation navigation policy with a parameterized function capable of satisfying constraints, thereby guaranteeing the connectivity of the formation during navigation. In addition, the present invention implements this parameterized function with a virtual policy-extended environment framework, so that it is compatible with deep reinforcement learning algorithms that require the parameterized function to be differentiable.
Further, the parameterized function capable of satisfying constraints comprises two parts: a general unconstrained parameterized function (such as a neural network) and a constrained optimization module. Because the final control input is produced by the constrained optimization module rather than directly by the unconstrained parameterized function, satisfaction of the constraints is guaranteed.
In the parameterized function capable of satisfying constraints, the value $z_\theta(o)$ that the unconstrained parameterized function with parameters $\theta$ computes from the observation $o$ is no longer the final control input; it is instead passed as an input to the constrained optimization module. The module solves a constrained optimization problem based on the incoming $z_\theta(o)$ and obtains a control input $a$ that guarantees connectivity. Specifically, the objective function $f(z_\theta(o), a)$ of the constrained optimization problem depends on both the variable $a$ to be optimized and the output $z_\theta(o)$ of the unconstrained parameterized function, and the constraint of the problem is the connectivity constraint.
For a given observation $o$, different parameters $\theta_1$ and $\theta_2$ produce different $z_{\theta_1}(o)$ and $z_{\theta_2}(o)$, and hence different objective functions $f(z_{\theta_1}(o), a)$ and $f(z_{\theta_2}(o), a)$. After passing through the constrained optimization module, these different objective functions yield different final control inputs $a_1$ and $a_2$. Because the final control input is always produced by the constrained optimization module, it always satisfies the connectivity constraint.
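The two-part structure can be sketched in Python for the smallest case of two robots, where the connectivity constraint is a single distance bound and the optimization has a closed-form solution. This is an illustrative sketch only: the toy linear `unconstrained_fn`, the observations, the positions and the time step are hypothetical stand-ins, and the objective $a^\top a + z^\top a$ is the one the description designs later.

```python
import math

def unconstrained_fn(theta, obs):
    """Toy stand-in for z_theta(o): a scalar gain applied to the observation."""
    return [theta * x for x in obs]

def constrained_module(z1, z2, p1, p2, d, dt):
    """Minimise a1.a1 + a2.a2 + z1.a1 + z2.a2 subject to
    ||(p1 + a1*dt) - (p2 + a2*dt)|| <= d for two robots, in closed form."""
    c1 = [-0.5 * v for v in z1]            # unconstrained minimiser, robot 1
    c2 = [-0.5 * v for v in z2]            # unconstrained minimiser, robot 2
    u0 = [x - y for x, y in zip(c1, c2)]   # desired relative control a1 - a2
    s0 = [x + y for x, y in zip(c1, c2)]   # sum a1 + a2, untouched by the constraint
    # The constraint restricts a1 - a2 to a ball with this centre and radius.
    centre = [-(x - y) / dt for x, y in zip(p1, p2)]
    radius = d / dt
    off = [x - c for x, c in zip(u0, centre)]
    norm = math.hypot(*off)
    if norm > radius:                      # project onto the ball
        u = [c + o * radius / norm for c, o in zip(centre, off)]
    else:
        u = u0
    a1 = [(s + x) / 2 for s, x in zip(s0, u)]
    a2 = [(s - x) / 2 for s, x in zip(s0, u)]
    return a1, a2

def next_separation(p1, p2, a1, a2, dt):
    q1 = [p + a * dt for p, a in zip(p1, a1)]
    q2 = [p + a * dt for p, a in zip(p2, a2)]
    return math.dist(q1, q2)

obs1, obs2 = [1.0, 0.0], [-1.0, 0.0]       # toy observations
p1, p2, d, dt = [0.0, 0.0], [3.0, 0.0], 3.5, 1.0
results = {}
for theta in (0.2, 4.0):                   # two different parameter values
    z1, z2 = unconstrained_fn(theta, obs1), unconstrained_fn(theta, obs2)
    a1, a2 = constrained_module(z1, z2, p1, p2, d, dt)
    results[theta] = (a1, a2, next_separation(p1, p2, a1, a2, dt))
```

The two parameter values lead to different control inputs, but in both cases the separation after the step stays within `d`, since the output always comes from the optimization module.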
Further, the parameterized function capable of satisfying constraints is a function with the observation $o$ as input, the control input $a$ as output and $\theta$ as parameters, whose mathematical form is as follows:
$$a = \arg\min_{a} f(z_\theta(o), a) \quad \text{s.t.} \quad g_i(z_\theta(o), a) \le 0, \quad h_i(z_\theta(o), a) = 0 \tag{1}$$
In the formula, $z_\theta(o)$ is the unconstrained parameterized function and $f(z_\theta(o), a)$ is the objective function of the constrained optimization problem; $g_i(z_\theta(o), a)$ and $h_i(z_\theta(o), a)$ are respectively the inequality and equality constraint functions of the problem. With $f(z, a)$ as the medium, the parameterized function capable of satisfying constraints has both the expressive power of the unconstrained parameterized function and the restricting power of the constrained optimization problem over the final control input.
Further, regarding the derivation of the connectivity constraint, assume that the kinematic model of the i-th robot is:
$$p_i^{t+1} = p_i^{t} + a_i^{t}\,\Delta t \tag{2}$$
where $p_i^{t}$ and $a_i^{t}$ are respectively the position and control input of the i-th robot at time $t$, and $\Delta t$ is the time interval. With the objective function designed as $f(z, a) = a^{\top}a + z^{\top}a$, the mathematical form of the parameterized function satisfying the connectivity constraint is as follows:
[Formula (3)]
where $N$ is the number of robots in the formation, $d$ is the communication distance, and $a^{t}$ and $o^{t}$ are respectively the concatenations of the control inputs and of the observations of all $N$ robots at time $t$, i.e. $a^{t} = [a_1^{t}, \dots, a_N^{t}]$ and $o^{t} = [o_1^{t}, \dots, o_N^{t}]$. The observation $o_i^{t}$ of the i-th robot at time $t$ comprises its perception of the environment, its own current velocity, the positions of the other robots, and the position of the target point.
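A short Python sketch of the kinematic model and of a connectivity check is given below. It is illustrative only. The text does not spell out which robot pairs the constraint covers, so two readings are shown and both are assumptions: every pair within distance `d`, and the weaker requirement that the communication graph be connected. The positions and controls are hypothetical.

```python
import math
from collections import deque

def step(positions, controls, dt):
    """Kinematic model: p_i(t+1) = p_i(t) + a_i(t) * dt for every robot i."""
    return [[p + a * dt for p, a in zip(pi, ai)]
            for pi, ai in zip(positions, controls)]

def all_pairs_within(positions, d):
    """Stricter reading: every pair of robots is within distance d."""
    n = len(positions)
    return all(math.dist(positions[i], positions[j]) <= d
               for i in range(n) for j in range(i + 1, n))

def graph_connected(positions, d):
    """Weaker reading: the graph with an edge for each pair within d is connected."""
    n = len(positions)
    if n == 0:
        return True
    seen, queue = {0}, deque([0])
    while queue:                      # breadth-first search from robot 0
        i = queue.popleft()
        for j in range(n):
            if j not in seen and math.dist(positions[i], positions[j]) <= d:
                seen.add(j)
                queue.append(j)
    return len(seen) == n

positions = [[0.0, 0.0], [3.0, 0.0], [6.0, 0.0]]
controls = [[1.5, 0.0], [0.0, 0.0], [-1.5, 0.0]]
before = (all_pairs_within(positions, 3.5), graph_connected(positions, 3.5))
moved = step(positions, controls, 1.0)
after = (all_pairs_within(moved, 3.5), graph_connected(moved, 3.5))
```

In this example the initial line of robots is graph-connected but not all-pairs connected; after the outer robots move inward, both checks pass.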
Further, for each robot $i$, its observation of teammate information can be further defined as:
[Formula (4)]
Substituting formula (4) into the above parameterized function of formula (3) that satisfies the connectivity constraint, and taking the 1st robot as an example, the final mathematical form of the parameterized function satisfying the connectivity constraint is as follows:
[Formula for the 1st robot]
The above constrained optimization problem is a convex optimization problem and can therefore be solved efficiently. In summary, each robot uses the shared unconstrained parameterized function $z_\theta$ to compute its own $z_\theta(o_i^{t})$ from its own observation $o_i^{t}$; all robots then obtain every $z_\theta(o_i^{t})$ through information exchange, and each solves an equivalent constrained optimization problem to obtain an $a^{t}$ that satisfies the connectivity constraint; finally, each robot takes the component $a_i^{t}$ corresponding to its own index from $a^{t}$ and executes it as its control input.
In the present invention, because the parameterized function capable of satisfying constraints contains a constrained optimization module, the function as a whole may be non-differentiable even if the unconstrained parameterized function inside it is differentiable. If the multi-robot formation navigation policy were represented directly by this function, deep reinforcement learning methods that require the parameterized function to be differentiable could not be used.
To make the parameterized function capable of satisfying constraints compatible with such methods, the present invention implements it with a virtual policy-extended environment approach. Under the virtual policy-extended environment framework, the original reinforcement learning problem (composed of the possibly non-differentiable parameterized function capable of satisfying constraints and the original environment) is converted into an equivalent reinforcement learning problem (composed of a differentiable virtual policy and an extended environment), which can be solved by deep reinforcement learning methods that require the parameterized function to be differentiable. Substituting the above parameterized function and its virtual policy-extended environment implementation into a multi-robot formation navigation method based on deep reinforcement learning then yields the final navigation policy.
Compared with the prior art, the beneficial effects are as follows. The present invention represents the multi-robot formation navigation policy with a parameterized function capable of satisfying constraints, thereby guaranteeing the connectivity of the formation during navigation. Compared with hierarchical methods, the method of the invention guarantees the connectivity constraint while being more plug-and-play: it requires no explicitly designed hierarchy and does not depend on a low-level decision maker that can guarantee the constraints. In addition, the parameterized function is implemented with the virtual policy-extended environment framework, so that it is compatible with deep reinforcement learning algorithms that require the parameterized function to be differentiable.
Brief description of the drawings
Fig. 1 shows the parameterized function capable of satisfying constraints.
Fig. 2 illustrates the parameterized function capable of satisfying constraints by example.
Fig. 3 is a schematic diagram of the policy-environment framework.
Fig. 4 is a schematic diagram of the virtual policy-extended environment framework.
Fig. 5 is the decision flow chart.
Detailed description of the embodiments
The drawings are for illustration only and shall not be construed as limiting the invention. To better illustrate the embodiment, some components in the drawings may be omitted, enlarged or reduced, and do not represent the size of an actual product. Those skilled in the art will understand that some well-known structures and their descriptions may be omitted from the drawings. The positional relationships depicted in the drawings are for illustration only and shall not be construed as limiting the invention.
Embodiment 1
The key of the multi-robot formation navigation method based on deep reinforcement learning and capable of guaranteeing connectivity constraints proposed by the present invention is to represent the multi-robot formation navigation policy with a parameterized function capable of satisfying constraints, thereby guaranteeing the connectivity of the formation during navigation. In addition, the present invention implements this parameterized function with a virtual policy-extended environment framework, so that it is compatible with deep reinforcement learning algorithms that require the parameterized function to be differentiable.
As shown in Fig. 1, the parameterized function capable of satisfying constraints comprises two parts: a general unconstrained parameterized function (such as a neural network) and a constrained optimization module. Because the final control input is produced by the constrained optimization module rather than directly by the unconstrained parameterized function, satisfaction of the constraints is guaranteed.
In the parameterized function capable of satisfying constraints, the value $z_\theta(o)$ that the unconstrained parameterized function with parameters $\theta$ computes from the observation $o$ is no longer the final control input; it is instead passed as an input to the constrained optimization module. The module solves a constrained optimization problem based on the incoming $z_\theta(o)$ and obtains a control input $a$ that guarantees connectivity. Specifically, the objective function $f(z_\theta(o), a)$ of the constrained optimization problem depends on both the variable $a$ to be optimized and the output $z_\theta(o)$ of the unconstrained parameterized function, and the constraint of the problem is the connectivity constraint.
The input-output process of the parameterized function capable of satisfying constraints is illustrated below with the example of Fig. 2. For a given observation $o$, different parameters $\theta_1$ and $\theta_2$ produce different $z_{\theta_1}(o)$ and $z_{\theta_2}(o)$, and hence different objective functions $f(z_{\theta_1}(o), a)$ and $f(z_{\theta_2}(o), a)$. After passing through the constrained optimization module, these different objective functions yield different final control inputs $a_1$ and $a_2$. Because the final control input is always produced by the constrained optimization module, it always satisfies the connectivity constraint.
In summary, the parameterized function capable of satisfying constraints is a function with the observation $o$ as input, the control input $a$ as output and $\theta$ as parameters, whose mathematical form is as follows:
$$a = \arg\min_{a} f(z_\theta(o), a) \quad \text{s.t.} \quad g_i(z_\theta(o), a) \le 0, \quad h_i(z_\theta(o), a) = 0 \tag{1}$$
With $f(z, a)$ as the medium, the parameterized function capable of satisfying constraints has both the expressive power of the unconstrained parameterized function and the restricting power of the constrained optimization problem over the final control input.
Next, the derivation of the connectivity constraint is supplemented. Assume that the kinematic model of the i-th robot is:
$$p_i^{t+1} = p_i^{t} + a_i^{t}\,\Delta t \tag{2}$$
where $p_i^{t}$ and $a_i^{t}$ are respectively the position and control input of the i-th robot at time $t$, and $\Delta t$ is the time interval. With the objective function designed as $f(z, a) = a^{\top}a + z^{\top}a$, the mathematical form of the parameterized function satisfying the connectivity constraint is as follows:
[Formula (3)]
where $N = 3$ is the number of robots in the formation, $d = 3.5$ is the communication distance, and $a^{t}$ and $o^{t}$ are respectively the concatenations of the control inputs and of the observations of all $N$ robots at time $t$, i.e. $a^{t} = [a_1^{t}, \dots, a_N^{t}]$ and $o^{t} = [o_1^{t}, \dots, o_N^{t}]$. The observation $o_i^{t}$ of the i-th robot at time $t$ comprises its perception of the environment (i.e. the point cloud data of a two-dimensional lidar), its own current velocity, the positions of the other robots, and the position of the target point.
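For orientation, one form of the connectivity-constrained problem that is consistent with the kinematic model, the objective $a^{\top}a + z^{\top}a$ and the symbols $N$ and $d$ defined above can be written out as follows. It is a sketch under an explicit assumption, namely that the distance bound is imposed on every pair of robots at the next time step; it is not a reproduction of the patent's own formula (3).

```latex
\begin{aligned}
a^{t} = \arg\min_{a}\quad & a^{\top} a + z_\theta(o^{t})^{\top} a \\
\text{s.t.}\quad & \bigl\| \,(p_i^{t} + a_i\,\Delta t) - (p_j^{t} + a_j\,\Delta t)\, \bigr\| \le d,
\qquad \forall\, i, j \in \{1, \dots, N\}
\end{aligned}
```

Under this assumption the objective is a convex quadratic in $a$ and each constraint bounds the norm of an affine function of $a$, so the problem is convex, which agrees with the convexity stated in the description.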
For each robot $i$, its observation of teammate information can be further defined as:
[Formula (4)]
Substituting formula (4) into the above parameterized function of formula (3) that satisfies the connectivity constraint, and taking the 1st robot as an example, the final mathematical form of the parameterized function satisfying the connectivity constraint is as follows:
[Formula for the 1st robot]
The above constrained optimization problem is a convex optimization problem and can therefore be solved efficiently. In summary, the overall process of multi-robot formation navigation based on deep reinforcement learning with guaranteed connectivity constraints is shown in Fig. 5: each robot uses the shared unconstrained parameterized function $z_\theta$ to compute its own $z_\theta(o_i^{t})$ from its own observation $o_i^{t}$; all robots then obtain every $z_\theta(o_i^{t})$ through information exchange, and each solves an equivalent constrained optimization problem to obtain an $a^{t}$ that satisfies the connectivity constraint; finally, each robot takes the component $a_i^{t}$ corresponding to its own index from $a^{t}$ and executes it as its control input.
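The decision flow can be sketched in Python for two robots moving on a line, where the optimization reduces to clipping the relative control. The sketch rests on assumptions that are not taken from the patent: the shared function is a toy linear map, each robot is assumed to observe its teammate's position relative to itself, and all numerical values are hypothetical.

```python
def shared_fn(theta, own_obs):
    """Toy stand-in for the shared unconstrained function z_theta(o_i)."""
    return theta * own_obs

def solve_joint(z, rel_teammate, own_index, d, dt):
    """Solve the two-robot 1-D problem from one robot's point of view:
    minimise a0^2 + a1^2 + z0*a0 + z1*a1 s.t. |(p0 + a0*dt) - (p1 + a1*dt)| <= d.
    rel_teammate is the teammate's position relative to this robot."""
    dp = -rel_teammate if own_index == 0 else rel_teammate  # recovers p0 - p1
    c0, c1 = -0.5 * z[0], -0.5 * z[1]        # unconstrained minimisers
    u0, s0 = c0 - c1, c0 + c1                # relative and summed controls
    lo, hi = (-d - dp) / dt, (d - dp) / dt   # feasible range of a0 - a1
    u = min(max(u0, lo), hi)
    return [(s0 + u) / 2, (s0 - u) / 2]

theta, d, dt = 1.5, 3.5, 1.0
positions = [0.0, 3.0]
observations = [2.0, -2.0]                   # toy per-robot observations

# Step 1: each robot evaluates the shared function on its own observation;
# step 2: the values are exchanged, so every robot holds the full list z.
z = [shared_fn(theta, o) for o in observations]

joint_solutions, executed = [], []
for i in range(2):
    rel = positions[1 - i] - positions[i]    # teammate position in robot i's frame
    joint = solve_joint(z, rel, i, d, dt)    # step 3: solve the equivalent problem
    joint_solutions.append(joint)
    executed.append(joint[i])                # step 4: keep only the own component

new_positions = [p + a * dt for p, a in zip(positions, executed)]
```

Both robots arrive at the same joint solution because the constraint depends only on relative positions, so each can execute its own component without a central coordinator.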
Because the parameterized function capable of satisfying constraints contains a constrained optimization module, the function as a whole may be non-differentiable even if the unconstrained parameterized function inside it is differentiable. Therefore, if the multi-robot formation navigation policy were represented directly by this function, as shown in Fig. 3, deep reinforcement learning methods that require the parameterized function to be differentiable could not be used.
To make the parameterized function capable of satisfying constraints compatible with such methods, the present invention implements it with a virtual policy-extended environment approach. As shown in Fig. 4, in the virtual policy-extended environment framework the boundary between the agent and the environment no longer lies, as in Fig. 3, between the parameterized function capable of satisfying constraints and the environment, but between the virtual policy and the extended environment. The virtual policy is the unconstrained parameterized function inside the parameterized function capable of satisfying constraints, and the extended environment consists of the constrained optimization module of that function together with the original environment.
Under the virtual policy-extended environment framework, the original reinforcement learning problem (composed of the possibly non-differentiable parameterized function capable of satisfying constraints and the original environment) is converted into an equivalent reinforcement learning problem (composed of a differentiable virtual policy and an extended environment), which can be solved by deep reinforcement learning methods that require the parameterized function to be differentiable. Substituting the above parameterized function and its virtual policy-extended environment implementation into a multi-robot formation navigation method based on deep reinforcement learning then yields the final navigation policy.
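The relocation of the boundary between agent and environment can be sketched in Python as a wrapper around the original environment. Everything here is a hypothetical illustration: `OriginalEnv` is a toy environment whose state is the signed separation between two robots, `optimization_module` is a one-dimensional version of the constrained problem, and `virtual_policy` is a toy linear map standing in for a neural network.

```python
import random

class OriginalEnv:
    """Toy environment: the state s is the signed separation between two robots
    and the action u is their relative velocity."""
    def __init__(self, separation, dt):
        self.s, self.dt = separation, dt

    def step(self, u):
        self.s += u * self.dt
        return self.s

def optimization_module(z, s, d, dt):
    """Minimise u*u + z*u subject to |s + u*dt| <= d (a 1-D convex problem)."""
    u_free = -0.5 * z                        # unconstrained minimiser
    lo, hi = (-d - s) / dt, (d - s) / dt     # feasible interval for u
    return min(max(u_free, lo), hi)

class ExtendedEnv:
    """Extended environment = constrained optimization module + original environment.
    Its action is the raw output z of the virtual policy."""
    def __init__(self, env, d):
        self.env, self.d = env, d

    def step(self, z):
        u = optimization_module(z, self.env.s, self.d, self.env.dt)
        return self.env.step(u)

def virtual_policy(theta, obs):
    """Differentiable unconstrained function; here a toy linear map."""
    return theta * obs

rng = random.Random(0)
env = ExtendedEnv(OriginalEnv(separation=3.0, dt=1.0), d=3.5)
worst = 0.0
for _ in range(50):
    z = virtual_policy(theta=10.0, obs=rng.uniform(-1.0, 1.0))
    worst = max(worst, abs(env.step(z)))     # largest separation ever reached
```

A learning algorithm would only ever interact with `virtual_policy` and `ExtendedEnv.step`, so it differentiates through the policy alone, while every executed action has already passed through the optimization module.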
In this embodiment, the multi-robot formation navigation policy is represented by a parameterized function capable of satisfying constraints, thereby guaranteeing the connectivity of the formation during navigation; the parameterized function is implemented with the virtual policy-extended environment framework, so that it is compatible with deep reinforcement learning algorithms that require the parameterized function to be differentiable.
Obviously, the above embodiment is merely an example given to clearly illustrate the present invention and does not limit its implementations. Those of ordinary skill in the art may make other changes or variations in different forms on the basis of the above description. It is neither necessary nor possible to enumerate all implementations here. Any modification, equivalent replacement or improvement made within the spirit and principle of the present invention shall fall within the scope of protection of the claims of the present invention.

Claims (6)

1. A multi-robot formation navigation method based on deep reinforcement learning and capable of guaranteeing connectivity constraints, characterized in that the multi-robot formation navigation policy is represented by a parameterized function capable of satisfying constraints; the parameterized function capable of satisfying constraints comprises two parts, namely an unconstrained parameterized function and a constrained optimization module; the parameterized function capable of satisfying constraints is a function with the observation $o$ as input, the control input $a$ as output and $\theta$ as parameters, whose mathematical form is as follows:
$$a = \arg\min_{a} f(z_\theta(o), a) \quad \text{s.t.} \quad g_i(z_\theta(o), a) \le 0, \quad h_i(z_\theta(o), a) = 0 \tag{1}$$
In the formula, $z_\theta(o)$ is the unconstrained parameterized function and $f(z_\theta(o), a)$ is the objective function of the constrained optimization problem; $g_i(z_\theta(o), a)$ and $h_i(z_\theta(o), a)$ are respectively the inequality and equality constraint functions of the constrained optimization problem.
2. The multi-robot formation navigation method based on deep reinforcement learning and capable of guaranteeing connectivity constraints according to claim 1, characterized in that the kinematic model of the i-th robot is assumed to be:
$$p_i^{t+1} = p_i^{t} + a_i^{t}\,\Delta t \tag{2}$$
where $p_i^{t}$ and $a_i^{t}$ are respectively the position and control input of the i-th robot at time $t$, and $\Delta t$ is the time interval; with the objective function designed as $f(z, a) = a^{\top}a + z^{\top}a$, the mathematical form of the parameterized function satisfying the connectivity constraint is as follows:
[Formula (3)]
where $N$ is the number of robots in the formation, $d$ is the communication distance, and $a^{t}$ and $o^{t}$ are respectively the concatenations of the control inputs and of the observations of all $N$ robots at time $t$, i.e. $a^{t} = [a_1^{t}, \dots, a_N^{t}]$ and $o^{t} = [o_1^{t}, \dots, o_N^{t}]$; the observation $o_i^{t}$ of the i-th robot at time $t$ comprises its perception of the environment, its own current velocity, the positions of the other robots, and the position of the target point.
3. The multi-robot formation navigation method based on deep reinforcement learning and capable of guaranteeing connectivity constraints according to claim 2, characterized in that, for each robot $i$, its observation of teammate information can be further defined as:
[Formula (4)]
Substituting formula (4) into the above parameterized function of formula (3) that satisfies the connectivity constraint, and taking the 1st robot as an example, the final mathematical form of the parameterized function satisfying the connectivity constraint is as follows:
[Formula for the 1st robot]
4. The multi-robot formation navigation method based on deep reinforcement learning and capable of guaranteeing connectivity constraints according to claim 3, characterized in that each robot uses the shared unconstrained parameterized function $z_\theta$ to compute its own $z_\theta(o_i^{t})$ from its own observation $o_i^{t}$; all robots then obtain every $z_\theta(o_i^{t})$ through information exchange, and each solves an equivalent constrained optimization problem to obtain an $a^{t}$ that satisfies the connectivity constraint; finally, each robot takes the component $a_i^{t}$ corresponding to its own index from $a^{t}$ and executes it as its control input.
5. The multi-robot formation navigation method based on deep reinforcement learning and capable of guaranteeing connectivity constraints according to any one of claims 1 to 4, characterized in that the parameterized function capable of satisfying constraints is implemented with a virtual policy-extended environment approach; under the virtual policy-extended environment framework, the original reinforcement learning problem is converted into an equivalent reinforcement learning problem, so that it can be solved by deep reinforcement learning methods that require the parameterized function to be differentiable; the original reinforcement learning problem is composed of the possibly non-differentiable parameterized function capable of satisfying constraints and the original environment; the equivalent reinforcement learning problem is composed of a differentiable virtual policy and an extended environment.
6. The multi-robot formation navigation method based on deep reinforcement learning and capable of guaranteeing connectivity constraints according to claim 5, characterized in that, letting a be the agent, e be the environment and f be the parameterized function capable of satisfying constraints, f consists of two parts b and c, where b is the unconstrained parameterized function and c is the constrained optimization module; in the code implementation, the virtual policy-extended environment approach is used and the constrained optimization module c is implemented in the data preprocessing part of the environment e, that is, the constrained optimization module c, originally located at the back end of the agent a, is moved to the front end of the environment e.
CN201910795982.0A 2019-08-27 2019-08-27 Multi-robot formation navigation method capable of guaranteeing connectivity constraint and based on deep reinforcement learning Active CN110470306B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201910795982.0A CN110470306B (en) 2019-08-27 2019-08-27 Multi-robot formation navigation method capable of guaranteeing connectivity constraint and based on deep reinforcement learning

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN201910795982.0A CN110470306B (en) 2019-08-27 2019-08-27 Multi-robot formation navigation method capable of guaranteeing connectivity constraint and based on deep reinforcement learning

Publications (2)

Publication Number Publication Date
CN110470306A (en) 2019-11-19
CN110470306B (en) 2023-03-10

Family

ID=68512365

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201910795982.0A Active CN110470306B (en) 2019-08-27 2019-08-27 Multi-robot formation navigation method capable of guaranteeing connectivity constraint and based on deep reinforcement learning

Country Status (1)

Country Link
CN (1) CN110470306B (en)

Cited By (7)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN111781922A (en) * 2020-06-15 2020-10-16 中山大学 Multi-robot collaborative navigation method based on deep reinforcement learning and suitable for complex dynamic scene
CN111897224A (en) * 2020-08-13 2020-11-06 福州大学 Multi-agent formation control method based on actor-critic reinforcement learning and fuzzy logic
CN112051780A (en) * 2020-09-16 2020-12-08 北京理工大学 Brain-computer interface-based mobile robot formation control system and method
CN112711261A (en) * 2020-12-30 2021-04-27 浙江大学 Multi-agent formation planning method based on local visual field
CN112817327A (en) * 2020-12-30 2021-05-18 北京航空航天大学 Multi-unmanned aerial vehicle collaborative search method under communication constraint
CN114326438A (en) * 2021-12-30 2022-04-12 北京理工大学 Safety reinforcement learning four-rotor control system and method based on control barrier function
CN115328143A (en) * 2022-08-26 2022-11-11 齐齐哈尔大学 Master-slave water surface robot recovery guiding method based on environment driving

Citations (8)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
JP2004199160A (en) * 2002-12-16 2004-07-15 Canon Inc Optimum design method
JP2016009354A (en) * 2014-06-25 2016-01-18 日本電信電話株式会社 Behavior control system, method therefor, and program
US20180074524A1 (en) * 2015-07-17 2018-03-15 Mitsubishi Heavy Industries, Ltd. Aircraft control device, aircraft, and method for computing aircraft trajectory
US20180260700A1 (en) * 2017-03-09 2018-09-13 Alphaics Corporation Method and system for implementing reinforcement learning agent using reinforcement learning processor
CN109270934A (en) * 2018-11-01 2019-01-25 中国科学技术大学 Multi-robot formation endurance method based on leader switching
US20190094866A1 (en) * 2017-09-22 2019-03-28 Locus Robotics Corporation Dynamic window approach using optimal reciprocal collision avoidance cost-critic
CN109540151A (en) * 2018-03-25 2019-03-29 哈尔滨工程大学 AUV three-dimensional path planning method based on reinforcement learning
CN110147101A (en) * 2019-05-13 2019-08-20 中山大学 End-to-end distributed multi-robot formation navigation method based on deep reinforcement learning

Non-Patent Citations (1)

* Cited by examiner, † Cited by third party
Title
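王洋: "Research on Optimal Operation Strategies for Active Distribution Networks", China Master's Theses Full-text Database, Engineering Science and Technology II *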

Cited By (12)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN111781922A (en) * 2020-06-15 2020-10-16 中山大学 Multi-robot collaborative navigation method based on deep reinforcement learning and suitable for complex dynamic scene
CN111781922B (en) * 2020-06-15 2021-10-26 中山大学 Multi-robot collaborative navigation method based on deep reinforcement learning
CN111897224A (en) * 2020-08-13 2020-11-06 福州大学 Multi-agent formation control method based on actor-critic reinforcement learning and fuzzy logic
CN112051780A (en) * 2020-09-16 2020-12-08 北京理工大学 Brain-computer interface-based mobile robot formation control system and method
CN112051780B (en) * 2020-09-16 2022-05-17 北京理工大学 Brain-computer interface-based mobile robot formation control system and method
CN112711261A (en) * 2020-12-30 2021-04-27 浙江大学 Multi-agent formation planning method based on local visual field
CN112817327A (en) * 2020-12-30 2021-05-18 北京航空航天大学 Multi-unmanned aerial vehicle collaborative search method under communication constraint
CN112817327B (en) * 2020-12-30 2022-07-08 北京航空航天大学 Multi-unmanned aerial vehicle collaborative search method under communication constraint
CN114326438A (en) * 2021-12-30 2022-04-12 北京理工大学 Safety reinforcement learning four-rotor control system and method based on control barrier function
CN114326438B (en) * 2021-12-30 2023-12-19 北京理工大学 Safety reinforcement learning four-rotor control system and method based on control barrier function
CN115328143A (en) * 2022-08-26 2022-11-11 齐齐哈尔大学 Master-slave water surface robot recovery guiding method based on environment driving
CN115328143B (en) * 2022-08-26 2023-04-18 齐齐哈尔大学 Master-slave water surface robot recovery guiding method based on environment driving

Also Published As

Publication number Publication date
CN110470306B (en) 2023-03-10

Similar Documents

Publication Publication Date Title
CN110470306A (en) Multi-robot formation navigation method capable of guaranteeing connectivity constraint and based on deep reinforcement learning
Deo et al. Trajectory forecasts in unknown environments conditioned on grid-based plans
CN110147101B (en) End-to-end distributed multi-robot formation navigation method based on deep reinforcement learning
Qin et al. Review of autonomous path planning algorithms for mobile robots
Li An effective methodology for solving matrix games with fuzzy payoffs
Xiang et al. Continuous control with deep reinforcement learning for mobile robot navigation
Yang et al. UAV formation trajectory planning algorithms: A review
Cui et al. Learning world transition model for socially aware robot navigation
WO2021116875A1 (en) Formally safe symbolic reinforcement learning on visual inputs
Huang et al. Recoat: A deep learning-based framework for multi-modal motion prediction in autonomous driving application
Wang et al. Oracle-guided deep reinforcement learning for large-scale multi-UAVs flocking and navigation
Dey Applied Genetic Algorithm and Its Variants: Case Studies and New Developments
Tao et al. A path-planning method for wall surface inspection robot based on improved genetic algorithm
Qiu Multi-agent navigation based on deep reinforcement learning and traditional pathfinding algorithm
Li et al. When digital twin meets deep reinforcement learning in multi-UAV path planning
Zhu et al. Tri-HGNN: Learning triple policies fused hierarchical graph neural networks for pedestrian trajectory prediction
Bialas et al. Coverage path planning for unmanned aerial vehicles in complex 3d environments with deep reinforcement learning
Desai et al. Auxiliary tasks for efficient learning of point-goal navigation
CN116718190A (en) Mobile robot path planning method in long-distance dense crowd scene
Wang et al. Vision-Based Autonomous Driving: A Hierarchical Reinforcement Learning Approach
CN114326826B (en) Multi-unmanned aerial vehicle formation transformation method and system
Wu et al. 3D multi-constraint route planning for UAV low-altitude penetration based on multi-agent genetic algorithm
Braglia et al. Online Motion Planning for Safe Human–Robot Cooperation Using B-Splines and Hidden Markov Models
CN115527272A (en) Construction method of pedestrian trajectory prediction model
CN111723941B (en) Rule generation method and device, electronic equipment and storage medium

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant