CN114371626A - Improved optimization method for a discrete control barrier function, optimization system, terminal and medium - Google Patents


Info

Publication number
CN114371626A
Authority
CN
China
Prior art keywords
optimization
barrier function
control
constraint
discrete
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Granted
Application number
CN202210026694.0A
Other languages
Chinese (zh)
Other versions
CN114371626B (en)
Inventor
李石磊
袁志民
李猛
杨智超
叶清
王甲生
何涛
Current Assignee
Naval University of Engineering PLA
Original Assignee
Naval University of Engineering PLA
Priority date
Filing date
Publication date
Application filed by Naval University of Engineering PLA
Priority to CN202210026694.0A
Publication of CN114371626A
Application granted
Publication of CN114371626B
Legal status: Active

Classifications

    • G: PHYSICS
    • G05: CONTROLLING; REGULATING
    • G05B: CONTROL OR REGULATING SYSTEMS IN GENERAL; FUNCTIONAL ELEMENTS OF SUCH SYSTEMS; MONITORING OR TESTING ARRANGEMENTS FOR SUCH SYSTEMS OR ELEMENTS
    • G05B13/00: Adaptive control systems, i.e. systems automatically adjusting themselves to have a performance which is optimum according to some preassigned criterion
    • G05B13/02: Adaptive control systems as above, electric
    • G05B13/04: Adaptive control systems as above, electric, involving the use of models or simulators
    • G05B13/042: Adaptive control systems as above, in which a parameter or coefficient is automatically adjusted to optimise the performance


Abstract

The invention belongs to the technical field of control barrier functions, and discloses an improved optimization method, an optimization system, a terminal and a storage medium for a discrete control barrier function. A feasible state set is defined according to the constraint requirement of the control barrier function in discrete form; the coefficient γ_k of the control barrier function is adjusted dynamically according to how the optimization under multiple task constraints is being solved; γ_k of the control barrier function is treated as an optimization variable, yielding an optimizable control barrier function; the constraint requirement of the discrete control barrier function is changed so that the degree to which h(x_{t+k|t}) satisfies the task constraints is optimized directly; and a preference objective related to S_k is introduced into the optimization objective, so that the motion path obtained by simulation optimization conforms to the preference. The invention can significantly enlarge the feasible region of the optimization algorithm and obtain more flexible optimization solutions. Moreover, its constraint requirement has a clearer physical meaning and is more intuitive to set.

Description

Improved optimization method for a discrete control barrier function, optimization system, terminal and medium
Technical Field
The invention belongs to the technical field of control barrier functions, and particularly relates to an improved optimization method, an optimization system, a terminal and a storage medium for a discrete control barrier function.
Background
Currently, the definition of the control barrier function is closely related to the concept of a control invariant set. If there exists a control law π: ℝⁿ → ℝᵐ such that for any initial condition x(0) ∈ IS the condition x(t) ∈ IS, ∀t ≥ 0, is always satisfied, the set IS is called a control invariant set. Given a closed set C ⊂ ℝⁿ, assume that C satisfies:
C = {x ∈ ℝⁿ : h(x) ≥ 0}
int(C) = {x ∈ ℝⁿ : h(x) > 0}
∂C = {x ∈ ℝⁿ : h(x) = 0}
where int(C) and ∂C denote the interior and the boundary of the set C respectively, and h: ℝⁿ → ℝ is a continuously differentiable function. If for all x ∈ C there exists u ∈ U satisfying:
ḣ(x, u) ≥ −γ(h(x))
then h(x) is called a control barrier function. Here γ(·) is a class-K function, i.e. γ(·) is strictly monotonically increasing and γ(0) = 0; in practical applications γ(·) is often taken as a linear function with constant coefficient, i.e. γ(h(x)) = γ·h(x). It follows from the definition that if the initial value satisfies h(x(0)) ≥ 0, then h(x(t)) ≥ h(x(0))e^{−γt}, i.e. h(x(t)) decays no faster than exponentially, so h(x) ≥ 0 always holds (forward invariance); that is, the set {x | h(x) ≥ 0} is a control invariant set of the system.
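As a plain numerical check of this forward-invariance argument, the following sketch simulates the discrete boundary case of the barrier condition for a hypothetical one-dimensional system; the system, the choice of h, and all numbers are ours for illustration, not the patent's:

```python
import numpy as np

# Hypothetical 1-D sketch (our own example, not the patent's system):
# discrete dynamics x_{k+1} = x_k + u_k, safe set {x >= 0}, h(x) = x.
# The discrete barrier condition h(x_{k+1}) - h(x_k) >= -gamma*h(x_k),
# with 0 < gamma <= 1, gives h(x_k) >= (1 - gamma)^k * h(x_0): h decays
# at most geometrically, so the set {h >= 0} stays forward invariant.
gamma, h0 = 0.2, 2.0
x, traj = h0, [h0]
for k in range(50):
    u = -gamma * x              # boundary controller: h(x_{k+1}) = (1 - gamma)*h(x_k)
    x = x + u
    traj.append(x)

bounds = [h0 * (1.0 - gamma) ** k for k in range(len(traj))]
assert all(h >= b - 1e-9 for h, b in zip(traj, bounds))
assert min(traj) >= 0.0         # the safe set is never left
```

Even under the worst admissible controller (the boundary case), h only approaches zero geometrically and never crosses it.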
In addition, the control barrier function is closely related to the control Lyapunov function. In nonlinear control, if the stability of the system must be guaranteed, i.e. x(t) → 0, then, to avoid solving for the system state directly in order to judge stability, the control Lyapunov function works by constructing a controller: if the controller can make a positive-definite function V(x): ℝⁿ → ℝ approach zero, the stability of the system can be established indirectly. If the controller satisfies:
V̇(x) ≤ −λV(x)
it can be shown that the system converges stably, with V(x*) = 0, i.e. x* = 0. Through the function V and a controller constructed to satisfy V̇(x) ≤ −λV(x), the stability of the system is ensured indirectly. Analogously, the control barrier function ensures the forward invariance of the system by constructing a controller: a task constraint is described in the form h(x) ≥ 0 and, by means of the forward-invariance property of the control barrier function, the control quantity is optimized so that satisfaction of the task constraint is guaranteed indirectly. It is worth noting here that the control Lyapunov function requires V̇(x) ≤ −λV(x) and not merely V̇(x) ≤ 0, so as to guarantee the convergence rate of the system; similarly, the control barrier function constraint ḣ(x, u) ≥ −γ(h(x)) is only a conservative subset of the conditions ensuring forward invariance. When several task constraints expressed by control barrier functions act simultaneously, this may leave the optimization algorithm without a feasible solution. Therefore, a new improved optimization method based on a discrete-time control barrier function is needed.
Through the above analysis, the problems and defects of the prior art are as follows:
(1) The constraint requirement ḣ(x, u) ≥ −γ(h(x)) of the existing control barrier function is only a conservative subset of the conditions ensuring forward invariance of the system; when several task constraints expressed by control barrier functions exert their influence simultaneously, the optimization algorithm may fail to find a feasible solution.
(2) In multi-agent motion trajectory planning for complex dynamic scenes, the dimensionality of the problem grows sharply under traditional methods, making it difficult to meet the requirements on computational efficiency and motion detail; the accuracy of the resulting multi-agent motion trajectory data is low.
Disclosure of Invention
Aiming at the problems in the prior art, the invention provides an improved optimization method, an optimization system, a terminal and a storage medium for a discrete control barrier function; in particular, it designs an improved optimization method built on an optimizable discrete-time control barrier function.
The invention is realized as follows. The improved optimization method for a discrete control barrier function comprises the following steps:
step one, defining a feasible state set according to the constraint requirement of the control barrier function in discretized form;
step two, dynamically adjusting the coefficient γ_k of the control barrier function according to how the optimization under multiple task constraints is being solved;
step three, treating γ_k of the control barrier function as an optimization variable, obtaining an optimizable control barrier function;
step four, changing the constraint requirement of the discrete control barrier function, and directly optimizing the degree to which h(x_{t+k|t}) satisfies the task constraints;
step five, introducing into the optimization objective a preference objective related to S_k, so that the multi-agent motion path data of the complex dynamic scene obtained by simulation optimization conforms to the preference.
Further, in step one, the discretized form of ḣ(x, u) ≥ −γh(x) is:
h_{t+Δt} − h_t ≥ −γh_t
According to the constraint requirement of the control barrier function in discretized form, the feasible state set corresponding to a simulation time point t + k is defined as:
S_{CBF,k} = {x ∈ X : h(x_{t+k+1}) ≥ (1 − γ_k)h(x_{t+k})};
where the range of the feasible state set is jointly determined by h(x_{t+k}) and γ_k. Evidently, the larger γ_k, the lower the requirement on h(x_{t+k+1}) at the next time step; conversely, the smaller γ_k, the higher the requirement on h(x_{t+k+1}).
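As a small illustration of this feasible-set test (the function and argument names are ours, not the patent's), membership in S_{CBF,k} reduces to a one-line inequality check:

```python
# One-line membership test for the feasible set S_CBF,k described in the text
# (function and argument names are ours): a candidate next state is admissible
# iff h(x_{t+k+1}) >= (1 - gamma_k) * h(x_{t+k}).
def in_feasible_set(h_next: float, h_curr: float, gamma_k: float) -> bool:
    return h_next >= (1.0 - gamma_k) * h_curr

# A larger gamma_k lowers the bar on h at the next step; a smaller one raises it.
assert in_feasible_set(h_next=0.5, h_curr=1.0, gamma_k=0.6)      # bar: 0.4
assert not in_feasible_set(h_next=0.5, h_curr=1.0, gamma_k=0.2)  # bar: 0.8
```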
Further, in step two: in the existing definition of the control barrier function, γ_k is a hyper-parameter fixed in advance; if γ_k is set unreasonably, no feasible solution may exist, so γ_k in the control barrier function is adjusted dynamically according to the actual progress of the optimization under multiple task constraints.
Further, in step three, γ_k in the control barrier function is treated as an optimization variable; the result is called an optimizable control barrier function.
Further, in step four: according to the description of the feasible state set, once γ_k is fixed, h(x_{t+k}) at the current time also places a limit on h(x_{t+k+1}). If h(x_{t+k}) at the current time point is large, h(x_{t+k+1}) must take a large value to satisfy the CBF constraint requirement, whereas the essential requirement of forward invariance is in fact only h(x_{t+k+1}) ≥ 0. The constraint requirement of the discrete control barrier function is therefore changed to:
h(x_{t+k+1|t}) ≥ S_k, S_k ≥ 0
where S_k is a newly introduced optimization variable that replaces (1 − γ_k)h(x_{t+k|t}); the degree to which h(x_{t+k|t}) satisfies the task constraints is thus optimized directly. The result is called GOCBF; it allows the multiple task constraints to be adjusted more directly and further enlarges the feasible solution space.
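The widening of the feasible set can be seen in a toy comparison; the names and numbers below are our own illustrative assumptions, not the patent's implementation:

```python
# Toy comparison (our own names and numbers) of the classical discrete CBF
# constraint with the GOCBF form described above, in which the slack S_k >= 0
# is itself an optimization variable.
def cbf_feasible(h_next, h_curr, gamma_k):
    return h_next >= (1.0 - gamma_k) * h_curr

def gocbf_feasible(h_next, S_k):
    return S_k >= 0.0 and h_next >= S_k

h_curr, h_next = 10.0, 1.0
# With gamma_k = 0.5 the classical constraint demands h_next >= 5 and fails...
assert not cbf_feasible(h_next, h_curr, gamma_k=0.5)
# ...while GOCBF accepts the step with e.g. S_k = 0.5, since forward invariance
# essentially only needs h_next >= 0.
assert gocbf_feasible(h_next, S_k=0.5)
```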
Further, in step five, introducing into the optimization objective a preference objective related to S_k, so that the multi-agent motion path data of the complex dynamic scene obtained by simulation optimization conforms to the preference, comprises:
introducing into the optimization objective a preference objective related to S_k, for example Φ(S) = (S − S_0)ᵀ W_S (S − S_0), so that the motion path obtained by simulation optimization conforms to the preference.
Another object of the present invention is to provide a multi-agent motion trajectory optimization system based on a discrete-time control barrier function, applying the above improved optimization method, the system comprising:
a feasible state set definition module, for defining a feasible state set according to the constraint requirement of the control barrier function in discretized form;
a variable dynamic adjustment module, for dynamically adjusting γ_k of the control barrier function according to how the optimization under multiple task constraints is being solved;
an optimizable control barrier function acquisition module, for treating γ_k of the control barrier function as an optimization variable to obtain an optimizable control barrier function;
a constraint requirement changing module, for changing the constraint requirement of the discrete control barrier function and directly optimizing the degree to which h(x_{t+k|t}) satisfies the task constraints;
a motion path preference module, for introducing into the optimization objective a preference objective related to S_k, so that the motion path obtained by simulation optimization conforms to the preference.
It is a further object of the invention to provide a computer device comprising a memory and a processor, the memory storing a computer program which, when executed by the processor, causes the processor to perform the steps of: defining a feasible state set according to the constraint requirement of the control barrier function in discretized form; dynamically adjusting γ_k of the control barrier function according to how the optimization under multiple task constraints is being solved; treating γ_k of the control barrier function as an optimization variable to obtain an optimizable control barrier function; changing the constraint requirement of the discrete control barrier function and directly optimizing the degree to which h(x_{t+k|t}) satisfies the task constraints; and introducing into the optimization objective a preference objective related to S_k, so that the motion path obtained by simulation optimization conforms to the preference.
It is another object of the present invention to provide a computer-readable storage medium storing a computer program which, when executed by a processor, causes the processor to perform the same steps: defining a feasible state set according to the constraint requirement of the control barrier function in discretized form; dynamically adjusting γ_k of the control barrier function according to how the optimization under multiple task constraints is being solved; treating γ_k of the control barrier function as an optimization variable to obtain an optimizable control barrier function; changing the constraint requirement of the discrete control barrier function and directly optimizing the degree to which h(x_{t+k|t}) satisfies the task constraints; and introducing into the optimization objective a preference objective related to S_k, so that the motion path obtained by simulation optimization conforms to the preference.
Another object of the present invention is to provide an information data processing terminal, which is used for implementing the multi-agent motion trajectory optimization system based on the discrete-time control barrier function.
Combining all the above technical schemes, the advantages and positive effects of the invention are: the optimizable discrete-time control barrier function provided by the invention is more universal and flexible; when multiple task constraints are added, it can significantly enlarge the feasible region of the optimization algorithm and yield more flexible optimization solutions; its constraint requirement has a clearer physical meaning and is more intuitive to set.
Aiming at the problems of traditional motion planning algorithms (task constraints confined to geometric space, no consideration of the dynamic characteristics of the agent, and no overall coordination among different task constraints), the invention proposes the optimizable GOCBF function. Based on MPC-GOCBF, dynamic physical simulation and motion planning are combined; following the idea of augmented physical simulation, the improved CBF function is used to describe the multi-task constraints of motion planning in a unified way, so that the explicit analytical solving (or searching) of constraint relations is converted into implicit simulation computation. A physical simulation environment for multi-agent motion planning is constructed and the algorithm is tested and verified; the experimental results demonstrate the flexibility and dynamic adaptability of the algorithm, and the obtained multi-agent motion trajectory data are highly accurate.
Drawings
In order to more clearly illustrate the technical solutions of the embodiments of the present invention, the drawings needed to be used in the embodiments of the present invention will be briefly described below, and it is obvious that the drawings described below are only some embodiments of the present invention, and it is obvious for those skilled in the art that other drawings can be obtained according to the drawings without creative efforts.
Fig. 1 is a flowchart of the improved optimization method for a discrete control barrier function according to an embodiment of the present invention.
FIG. 2 is a block diagram of a multi-agent motion trajectory optimization system based on a discrete-time control barrier function according to an embodiment of the present invention;
in the figure: 1. feasible state set definition module; 2. variable dynamic adjustment module; 3. optimizable control barrier function acquisition module; 4. constraint requirement changing module; 5. motion path preference module.
Fig. 3 is a schematic diagram of an optimization solving process of a model predictive control algorithm according to an embodiment of the present invention.
Fig. 4 is a schematic diagram of generating an intelligent agent motion path optimization based on augmented physics simulation according to an embodiment of the present invention.
Fig. 5 is a schematic diagram of the conservative behavior motion path optimization generation provided in the embodiment of the present invention.
Fig. 5(a) is a schematic diagram of a motion trajectory of an agent according to an embodiment of the present invention.
FIG. 5(b) is a schematic diagram of the control barrier function coefficient γ_k according to an embodiment of the present invention.
Fig. 6 is a schematic diagram of risk-behavior motion path optimization generation provided by the embodiment of the present invention.
Fig. 6(a) is a schematic diagram of a motion trajectory of an agent according to an embodiment of the present invention.
FIG. 6(b) is a schematic diagram of the control barrier function coefficient γ_k according to an embodiment of the present invention.
Fig. 7 is a schematic diagram of motion path optimization generation for a user-defined behavior according to an embodiment of the present invention.
Fig. 7(a) is a schematic diagram of a motion trajectory of an agent according to an embodiment of the present invention.
FIG. 7(b) is a schematic diagram of the control barrier function coefficient γ_k according to an embodiment of the present invention.
FIG. 8 is a schematic diagram of MPC-GOCBF motion path optimization generation according to an embodiment of the present invention.
Fig. 8(a) is a schematic diagram of a motion trajectory of an agent according to an embodiment of the present invention.
Fig. 8(b) is a schematic diagram of the h value of the control barrier function according to the embodiment of the present invention.
Fig. 9 is a schematic diagram of an adaptive cruise simulation test result (S = 25) according to an embodiment of the present invention.
Fig. 9(a) is a schematic diagram of the movement speed of the cruise agent according to the embodiment of the present invention.
Fig. 9(b) is a schematic diagram of the control barrier function h according to an embodiment of the present invention.
Fig. 9(c) is a schematic diagram of the optimized control quantity provided by the embodiment of the present invention.
Fig. 9(d) is a schematic diagram of the S values corresponding to the GOCBF according to the embodiment of the present invention.
Fig. 10 is a schematic diagram of an adaptive cruise simulation test result (S = 40) according to an embodiment of the present invention.
Fig. 10(a) is a schematic diagram of the movement speed of the cruise agent according to the embodiment of the present invention.
Fig. 10(b) is a schematic diagram of the control barrier function h according to an embodiment of the present invention.
Fig. 10(c) is a schematic diagram of the optimized control quantity provided by the embodiment of the present invention.
Fig. 10(d) is a schematic diagram of the S values corresponding to the GOCBF according to the embodiment of the present invention.
FIG. 11 is a schematic diagram of a multi-agent collision avoidance experiment provided by an embodiment of the present invention.
Fig. 11(a) is a screenshot of a simulation process provided by an embodiment of the present invention.
Fig. 11(b) is a schematic diagram of a motion trajectory of an obstacle avoidance intelligent agent according to an embodiment of the present invention.
Detailed Description
In order to make the objects, technical solutions and advantages of the present invention more apparent, the present invention is further described in detail with reference to the following embodiments. It should be understood that the specific embodiments described herein are merely illustrative of the invention and are not intended to limit the invention.
Aiming at the problems in the prior art, the invention provides an improved optimization method for a discrete control barrier function based on a discrete-time control barrier function; the invention is described in detail below with reference to the accompanying drawings.
As shown in fig. 1, the improved optimization method for a discrete control barrier function provided by the embodiment of the present invention includes the following steps:
S101, defining a feasible state set according to the constraint requirement of the control barrier function in discretized form;
S102, dynamically adjusting γ_k of the control barrier function according to how the optimization under multiple task constraints is being solved;
S103, treating γ_k of the control barrier function as an optimization variable, obtaining an optimizable control barrier function;
S104, changing the constraint requirement of the control barrier function in discrete form, and directly optimizing the degree to which h(x_{t+k|t}) satisfies the task constraints;
S105, introducing into the optimization objective a preference objective related to S_k, so that the motion path obtained by simulation optimization conforms to the preference.
In step S105, the multi-agent motion path data of the complex dynamic scene obtained by simulation optimization is made to conform to the preference.
As shown in fig. 2, the multi-agent motion trajectory optimization system based on a discrete-time control barrier function according to the embodiment of the present invention includes:
a feasible state set definition module 1, for defining a feasible state set according to the constraint requirement of the control barrier function in discretized form;
a variable dynamic adjustment module 2, for dynamically adjusting γ_k of the control barrier function according to how the optimization under multiple task constraints is being solved;
an optimizable control barrier function acquisition module 3, for treating γ_k of the control barrier function as an optimization variable to obtain an optimizable control barrier function;
a constraint requirement changing module 4, for changing the constraint requirement of the discrete control barrier function and directly optimizing the degree to which h(x_{t+k|t}) satisfies the task constraints;
a motion path preference module 5, for introducing into the optimization objective a preference objective related to S_k, so that the motion path obtained by simulation optimization conforms to the preference.
The technical solution of the present invention is further described below with reference to specific examples.
Example: design of the optimizable discrete-time control barrier function
The discretized form of ḣ(x, u) ≥ −γh(x) is:
h_{t+Δt} − h_t ≥ −γh_t (1)
According to the constraint requirement of the control barrier function in discretized form, the feasible state set corresponding to a simulation time point t + k is defined as:
S_{CBF,k} = {x ∈ X : h(x_{t+k+1}) ≥ (1 − γ_k)h(x_{t+k})} (2)
According to equation (2), the range of the feasible state set is jointly determined by h(x_{t+k}) and γ_k. Evidently, the larger γ_k, the lower the requirement on h(x_{t+k+1}) at the next time step; conversely, the smaller γ_k, the higher the requirement on h(x_{t+k+1}). In the existing definition of the control barrier function, γ_k is a hyper-parameter determined in advance. If γ_k is set unreasonably, no feasible solution will exist. The invention therefore proposes to adjust γ_k in the control barrier function dynamically according to the actual progress of the optimization under multiple task constraints. Following this idea, the invention further treats γ_k in the control barrier function as an optimization variable, which we call the optimizable control barrier function (Optimizable CBF).
From the description of the feasible state set in equation (2), once γ_k is fixed, h(x_{t+k}) at the current time also places a limit on h(x_{t+k+1}): if h(x_{t+k}) at the current time point is large, h(x_{t+k+1}) must take a large value to satisfy the CBF constraint requirement, whereas the essential requirement of forward invariance is in fact only h(x_{t+k+1}) ≥ 0. The invention further proposes to change the constraint requirement of the control barrier function in discrete form to:
h(x_{t+k+1|t}) ≥ S_k, S_k ≥ 0 (3)
In equation (3), S_k is an optimization variable newly introduced by the invention to replace (1 − γ_k)h(x_{t+k|t}). In this way the invention can directly optimize the degree to which h(x_{t+k|t}) satisfies the task constraints; the result is called GOCBF (General Optimizable CBF). The multiple task constraints can thus be adjusted more directly, and the feasible solution space is effectively enlarged. Moreover, the invention introduces into the optimization objective a preference objective related to S_k, for example Φ(S) = (S − S_0)ᵀ W_S (S − S_0), so that the motion path obtained by simulation optimization conforms to the preference.
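The preference term Φ(S) is a plain weighted quadratic and can be evaluated directly; the particular S, S_0, and weight values below are illustrative choices of ours:

```python
import numpy as np

# Evaluating the preference term Phi(S) = (S - S0)^T W_S (S - S0) from the text.
# The particular S, S0, and weights below are our own illustrative values.
def preference_cost(S, S0, W):
    d = np.asarray(S, dtype=float) - np.asarray(S0, dtype=float)
    return float(d @ np.asarray(W, dtype=float) @ d)

S, S0 = [1.0, 2.0], [0.5, 1.5]
W = np.diag([2.0, 4.0])   # a heavier weight pulls that step's S_k harder toward S0
assert abs(preference_cost(S, S0, W) - 1.5) < 1e-12   # 2*0.25 + 4*0.25
```

Penalizing deviation of the slack profile S from a preferred profile S_0 is what biases the optimized path toward the desired behavior.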
The technical solution of the present invention is further described below with reference to simulation experiments.
Simulation experiment: unified description of multi-task motion planning constraints based on augmented physical simulation
The fundamental goal of motion planning is to find a feasible path from an initial state to a target state in a specific state space. In essence, a search algorithm is guided toward the target point by partial prior heuristic knowledge (specific constraints, or even fully random sampling). This is a typical high-dimensional, ill-posed, redundant problem: at a given point in time and space there are many feasible solutions, and a better one must be selected among them according to the existing task constraint requirements.
In traditional motion planning algorithms, task constraints are confined to geometric space and a path is generated by directly checking whether the geometric constraint relations are satisfied. The physical limits of the agent (such as velocity and acceleration constraints) are usually ignored, the dynamic characteristics of the agent are not considered, and the physical feasibility of the motion path cannot be guaranteed. In addition, geometric constraint relations such as obstacle avoidance and reaching a specific target point are often described directly by inequalities, with no overall coordination among the different task constraints. Faced with multi-agent motion planning in complex dynamic scenes, traditional methods suffer from a sharply growing problem dimension, which makes it difficult to meet the requirements on computational efficiency and motion detail.
The method specifically includes the following steps:
1. In constraint-based path optimization generation algorithms for motion planning, the whole problem is usually described as a single optimization problem. The various task constraint requirements are often described directly by geometric and physical relations, such as keeping the distance d to an obstacle positive (d > 0) or limiting the velocity to a certain range v_min ≤ v ≤ v_max, and the path is obtained directly by optimization. At present, both gradient-free intelligent optimization algorithms with heuristic characteristics and gradient-based optimization methods are applied. Because of the inherent randomness and exploratory nature of heuristic intelligent optimization algorithms, they cannot guarantee that a good result is obtained every time; gradient-based optimization methods easily fall into local optima, and the computation depends strongly on the initial guess: if the initial guess is not set properly, a good planning result is hard to obtain. To ensure that a gradient-based optimization algorithm always finds a good solution, an existing paper first generates a rough guess path with the A* algorithm, then uses it as the initial value of the candidate optimization algorithm, sets the optimization objective as minimizing the deviation of the final optimized value from the initial estimated path, and finally generates the final path by optimization under the specific task constraints. This approach alleviates the strong dependence of gradient-based optimization on the initial estimate, but since an initial estimate of the whole global path must be generated, it is only suitable for offline global path generation and cannot be applied to online real-time generation of planned paths in complex dynamic environments.
In order to realize online dynamic generation of planned paths, many researchers propose to generate paths dynamically under constrained optimization by means of Model Predictive Control (MPC) or Receding Horizon Control. The optimization process of a model predictive control algorithm with prediction horizon N can be described as follows:

$$
\begin{aligned}
\min_{u_{t:t+N-1|t}}\quad & \sum_{k=0}^{N-1} c\big(x_{t+k|t}, u_{t+k|t}\big) && (a)\\
\text{s.t.}\quad & x_{t+k+1|t} = f\big(x_{t+k|t}, u_{t+k|t}\big),\quad k = 0,\dots,N-1 && (b)\\
& x_{t+k|t} \in X,\quad u_{t+k|t} \in U,\quad k = 0,\dots,N-1 && (c)\\
& x_{t|t} = x_t,\quad x_{t+N|t} \in X_f && (d)
\end{aligned}\tag{1.1}
$$

In formula (1.1), (a) is the control objective to be optimized, (b) is the discretized dynamic model of the agent, (c) collects the state and control-input constraints the agent must satisfy at each discrete time point, and (d) gives the initial and terminal states of the agent over the optimization horizon. In the model predictive control optimization, the decision variable is the control sequence, i.e. the control inputs at the N time points of the prediction horizon, $u^{*}_{t:t+N-1|t} = \{u^{*}_{t|t},\dots,u^{*}_{t+N-1|t}\}$. In actual execution, however, only the control quantity at the first time point, $u^{*}_{t|t}$, is applied to the real system, yielding the system state $x_{t+1}$ at the next moment; that state is taken as the new initial state and the optimization solving process is repeated. The whole process is shown in Fig. 3.
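The receding-horizon procedure described above can be sketched in a few lines. The sketch below is only an illustration under assumed toy settings (a scalar integrator x_{k+1} = x_k + u_k, a small discrete control grid, horizon N = 3); it is not the patent's YALMIP/IPOPT implementation, and all names in it are hypothetical.

```python
# Hedged sketch of a receding-horizon (MPC) loop: optimize a length-N control
# sequence by brute force, apply only the first control, then re-plan.
import itertools

N = 3                                  # prediction horizon (assumed)
U_GRID = [-1.0, -0.5, 0.0, 0.5, 1.0]   # admissible control set U (assumed)

def stage_cost(x, u):
    return x * x + 0.1 * u * u         # analogue of c(x, u) = Q(x) + u^T R u

def mpc_step(x0):
    """Optimize the whole length-N sequence, return only u*_{t|t}."""
    best_cost, best_u0 = float("inf"), 0.0
    for seq in itertools.product(U_GRID, repeat=N):
        x, cost = x0, 0.0
        for u in seq:
            cost += stage_cost(x, u)
            x = x + u                  # toy discretized dynamics x_{k+1} = x_k + u_k
        if cost < best_cost:
            best_cost, best_u0 = cost, seq[0]
    return best_u0

x = 2.0
for _ in range(6):                     # receding-horizon execution
    x = x + mpc_step(x)                # apply only the first control, re-plan
print(x)
```

Grid search stands in for the numerical solver only to keep the sketch self-contained; the loop structure (optimize, apply first input, re-measure state, repeat) is the point being illustrated.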
Based on the model predictive control method, the various task requirements of the motion planning process can conveniently be added to the optimization solving process as constraint conditions. Moreover, because model predictive control considers the agent's possible motion trajectory over a future time window when solving, it can effectively improve the quality of the planned path and avoid myopic behavior. However, further improving the planning quality requires increasing the prediction horizon N, so as the number of agents and task constraint requirements grow, the solving efficiency of the system becomes hard to guarantee. Under the model predictive control framework, how to describe the various task constraint requirements appropriately therefore becomes the key to solving the problem quickly.
With the rapid growth of computer processing speed, physical simulation methods have been widely applied to reproduce all kinds of natural phenomena: once the various physical constraints are given, the system automatically generates, through dynamic simulation calculation, physically detailed effects that satisfy the constraint requirements, which is flexible and convenient. Pure physical simulation, however, lacks an effective control mechanism, and a target controller must be constructed to ensure the simulation result evolves toward the intended target. Unlike physical simulation, motion planning algorithms have a natural advantage in long-range target control; but for generating multi-agent navigation behaviors in complex dynamic scenes, the problem dimension grows rapidly, and using a planning algorithm alone to generate the state information of all configuration spaces (Configuration Space) of the multi-agent system during motion cannot guarantee solving efficiency. This opens the possibility of properly combining the two ideas. The invention therefore proposes a method combining dynamic physical simulation with motion planning: motion planning is responsible for target control, the various index constraints are described as real or virtual physical constraints in the workspace (Workspace), the traditional dynamics-based physical simulation is extended, and the unified description of motion planning multi-task constraints is realized on top of the extended physical simulation, so that the explicit analytic solving (or searching) of constraint relations is converted into implicit simulation calculation and the advantages of the two approaches complement each other.
Recently, drawing on the concept of the control fence function (Control Barrier Function), the explicit solving of task constraints based on the agent state has been converted, through forward invariance, into the optimized generation of the control quantity; generating the control quantity by optimization indirectly guarantees satisfaction of the task constraints and can greatly improve calculation efficiency. An existing paper compares the control fence function with the artificial potential field method, proving that the artificial potential field method is a special case of the control fence function and that the overall performance of the control fence function is superior to that of the artificial potential field method.
From the perspective of dynamic physical simulation, the invention considers that the control fence function essentially converts a specific task constraint into a control quantity of the dynamic simulation process, providing a better tool for the augmented physical simulation proposed by the invention. The invention therefore takes the control fence function as its basis and analyzes the unified description of multi-task constraints and the optimized augmented-physical-simulation generation of motion paths under the model predictive control-control fence function (MPC-CBF) framework.
2. Description of the problem
The invention first assumes that all agents move on flat terrain and that each agent is simplified to a cylinder, so the global motion path of each agent reduces to a two-dimensional path-planning problem. Each agent $a_i$ ($i = 1,\dots,N$) is represented by a circle of radius $r_i$; at any time t, the position of agent $a_i$ is denoted $x_i(t)$, its velocity $v_i(t)$ and its acceleration $a_i(t)$, and the speed and acceleration are bounded by maximum values: $\|v_i(t)\| \le v_{\max}$, where $v_{\max}$ denotes the maximum speed the agent can reach, and $\|a_i(t)\| \le a_{\max}$, where $a_{\max}$ denotes the maximum acceleration the agent can reach.
The invention aims to let the agent generate a dynamic path in real time according to the environmental constraint requirements while traveling from its starting position to a specified target area, colliding during this process neither with obstacles in the environment nor with other agent individuals. The whole process can be described by the following optimization framework:

$$
\begin{aligned}
\min_{u}\quad & J(u,x) = \int_{0}^{T} c\big(x(t),u(t)\big)\,dt\\
\text{s.t.}\quad & \dot{x} = f(x,u)\\
& x(0) = x_0,\quad x_e = x_g\\
& x \in X,\quad u \in U_{adm}\big(x(t)\big)
\end{aligned}\tag{2.1}
$$

In formula (2.1), $\dot{x} = f(x,u)$ represents the dynamic model of the agent; $x(0) = x_0$ and $x_e = x_g$ represent the initial and end positions of the agent respectively; $x \in X$, $u \in U_{adm}(x(t))$ represent the state and control constraint requirements of the agent; and $J(u,x)$ is the optimization objective, i.e. the performance metric function of the planned path, where $c(x,u)$ is the cost function, usually expressed as:

$$c(x,u) = Q(x) + u^{T}Ru\tag{2.2}$$

In formula (2.2), $Q(x)$ and $u^{T}Ru$ represent the costs associated with the agent state and the control quantity respectively, where $R = R^{T} > 0$ is a symmetric positive definite matrix. The invention expects the overall cost to be as low as possible during path optimization generation, under the precondition that the task constraints are satisfied.
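The stage cost of (2.2) can be sketched as follows. The goal-distance form chosen for Q(x) and the identity matrix chosen for R are illustrative assumptions, since the text leaves Q(·) and R unspecified at this point.

```python
# Minimal sketch of the stage cost c(x, u) = Q(x) + u^T R u from (2.2),
# with assumed weights: Q(x) penalizes squared distance to the goal,
# R is the 2x2 identity (symmetric positive definite, R = R^T > 0).
def stage_cost(x, u, x_goal, R):
    state = sum((xi - gi) ** 2 for xi, gi in zip(x, x_goal))        # Q(x) term
    effort = sum(R[i][j] * u[i] * u[j]                              # u^T R u term
                 for i in range(len(u)) for j in range(len(u)))
    return state + effort

R = [[1.0, 0.0],
     [0.0, 1.0]]
c = stage_cost([1.0, 2.0], [0.5, 0.0], [0.0, 0.0], R)
print(c)   # (1 + 4) + 0.25
```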
3. Basic background on the control fence function
The definition of the control fence function is closely related to the concept of the control invariant set: if there exists a control function $\pi: X \to U$ such that for any initial condition $x(0) \in IS$ the state always satisfies $x(t) \in IS$ for $t \ge 0$, the set IS is called a control invariant set. Given a closed set $C \subset \mathbb{R}^n$, assume that C satisfies:

$$C = \{x \in \mathbb{R}^n : h(x) \ge 0\},\qquad \partial C = \{x : h(x) = 0\},\qquad \mathrm{int}(C) = \{x : h(x) > 0\}\tag{3.1}$$

In formula (3.1), $\mathrm{int}(C)$ and $\partial C$ denote the interior and boundary of the set C respectively, and $h: \mathbb{R}^n \to \mathbb{R}$ is a continuously differentiable function. If for all $x \in C$ there exists a control input $u \in U$ that satisfies:

$$\sup_{u \in U}\ \dot{h}(x,u) \ge -\gamma\big(h(x)\big)\tag{3.2}$$

then h(x) is called a control fence function. In formula (3.2), γ(·) is a class-K function, i.e. γ(·) is strictly monotonically increasing with γ(0) = 0; in practical applications γ(·) is taken as a linear function with constant coefficient, i.e. γ(h(x)) = γh(x). From definition (3.2), if the initial value satisfies h(x(0)) ≥ 0, then $\dot{h}(x)$ is always bounded below by an exponential decay, so h(x) ≥ 0 is always guaranteed to hold, i.e. the set {x | h(x) ≥ 0} has forward invariance and is a control invariant set of the system.
In addition, the control fence function is closely related to the control Lyapunov function. In nonlinear control, when the stability of the system must be guaranteed, i.e. x(t) → 0, and to avoid solving the system state directly in order to judge stability, a controller is constructed for the control Lyapunov function: if the controller can make a positive definite function V(x), with derivative $\dot{V}(x)$, approach zero, the stability of the system can be judged indirectly. If the controller satisfies $\dot{V}(x) \le -\lambda V(x)$, it can be shown that the system converges stably, with $V(x^{*}) = 0$, i.e. $x^{*} = 0$. Through the function V and by constructing a controller satisfying $\dot{V}(x) \le -\lambda V(x)$, the stability of the system can thus be guaranteed indirectly. Similar to the idea of the control Lyapunov function, the forward invariance of the system is guaranteed indirectly by constructing a controller for the control fence function: when a control fence function is used, the task constraint is described in the form h(x) ≥ 0, and with the help of the forward-invariance property of the control fence function, optimizing the control quantity indirectly guarantees that the task constraint is satisfied. It is worth noting that the control Lyapunov function requires $\dot{V}(x) \le -\lambda V(x)$, not merely $\dot{V}(x) < 0$, in order to guarantee a stable convergence rate of the system; similarly, for the control fence function the constraint requirement $\dot{h}(x) \ge -\gamma h(x)$ is only a conservative subset of the conditions for the forward invariance of the system, which may lead to the optimization algorithm finding no feasible solution when multiple task constraints expressed by control fence functions act simultaneously.
4. Unified task-constraint description and optimized dynamic-physical-simulation generation based on the optimizable control fence function
Based on the control fence function, explicit task-constraint satisfaction checking can be converted into implicit dynamics simulation calculation, but the description of the multi-task constraints must meet feasibility requirements, avoiding the situation in which the optimization calculation finds no feasible solution. The invention therefore proposes an optimizable control fence function on the basis of the control fence function concept and, building on the model predictive control optimization framework, proposes an improved optimization solution approach under discrete-time conditions according to the characteristics of the dynamic simulation calculation.
4.1 unified description of task constraints based on control fence function
Based on the definition of the control fence function, the method first needs to set an invariant safe set, through which the indirect expression of the task constraints is realized.
(1) Description of hard constraints
For hard constraints that the agent must satisfy at all times during its motion, the control fence function can be adopted directly for the description.
For example, for the collision-avoidance task constraint, the control fence function can be set as:

$$h_1(x) = d(x_i, o_i) - (r_i + r_o) \ge 0\tag{4.1}$$

In formula (4.1), $d(x_i, o_i)$ denotes the distance between the agent and obstacle o, and $r_i$, $r_o$ are the radii of the agent and the obstacle respectively. During optimized generation of the motion path, the invention therefore only needs to optimize the agent's control input u so that $\dot{h}_1(x) \ge -\gamma h_1(x)$ is satisfied; by forward invariance, $h_1(x) \ge 0$ is then always guaranteed, i.e. the agent never collides with the various obstacles in the scene.
For range constraints such as those on speed, acceleration and the control quantity the agent itself can apply, the description can be given by adding a pair of control fence functions. For the speed range constraint $\|v\| \le v_{\max}$, i.e. $-v_{\max} \le v \le v_{\max}$, the corresponding control fence functions are:

$$h_2(x) = v_{\max} - v \ge 0,\qquad h_3(x) = v + v_{\max} \ge 0\tag{4.2}$$

With formula (4.2), the description of a two-sided range task constraint is achieved with the help of two control fence functions.
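A minimal sketch of the two-sided range description: the limits below are illustrative (written as a generic interval [v_min, v_max]), and the two functions play the roles of the pair of control fence functions in (4.2).

```python
# Two-sided bound v_min <= v <= v_max expressed as a pair of barrier
# functions; the bound holds exactly when both are nonnegative.
V_MIN, V_MAX = -2.0, 2.0   # illustrative limits

def h_hi(v):
    return V_MAX - v       # upper-bound barrier

def h_lo(v):
    return v - V_MIN       # lower-bound barrier

def in_range(v):
    return h_hi(v) >= 0 and h_lo(v) >= 0

print(in_range(1.5), in_range(2.5), in_range(-3.0))
```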
(2) Description of soft constraints
Soft constraints are targets that should be reached as far as possible, or gradually, during the agent's motion. On the one hand, when a soft constraint can be fully satisfied, its constraint requirement should be met as closely as possible; on the other hand, when it conflicts with other soft constraints, its degree of satisfaction must be reduced appropriately so that the agent satisfies as many task constraints as possible overall. When soft constraints conflict with one another, the priority relations among the different constraints can ensure that high-priority task constraints are satisfied to a higher degree while low-priority soft task constraints relax their requirements appropriately. In the invention, based on the MPC-CBF optimization framework, soft constraints are described in two ways:
one is described in the cost function to be optimized. On the global level, soft constraints which need to be met by the intelligent agent are converted into target quantities to be optimized in the cost function, so that the corresponding soft constraints can be guaranteed to be met to the maximum extent through optimization calculation in the intelligent agent motion path optimization generation process.
For example, for the task constraint requirement that the control quantity applied by the intelligent agent in the motion process is as small as possible (energy is optimal), u can be added into the cost functionTRu item; corresponding arrival specificThe task constraint of the target point can be achieved by adding (x)i-xg)2In the form of directing the agent to gradually reach the target state.
The second is description directly in control fence function form. For more general soft task constraints, at the local level the description can be given by directly adding a relaxation variable to the corresponding control fence function constraint requirement; the constraint requirement (3.2) is then transformed into:

$$\dot{h}(x) \ge -\gamma\, h(x) - \epsilon\tag{4.3}$$

In formula (4.3), ε > 0 is a relaxation variable. Adding ε further relaxes the admissible range of $\dot{h}(x)$, so that when several soft and hard constraints are optimized and solved together, the degree of satisfaction of a soft constraint adjusts automatically through the size of ε: the smaller ε, the higher the satisfaction of the corresponding soft constraint; conversely, the larger the ε obtained by the optimization, the lower the satisfaction of the corresponding soft constraint.
When multiple soft task constraints must be applied simultaneously during motion planning, a priority order between the different soft constraints needs to be set. To realize the modeling expression of the different soft-constraint priorities, the invention adds to the corresponding cost function an optimization term over the relaxation variables of the different soft constraints:

$$\Phi(\epsilon) = \epsilon^{T} W \epsilon = \sum_{i} w_i\, \epsilon_i^2\tag{4.4}$$

In formula (4.4), the larger $w_i$, the smaller the corresponding $\epsilon_i$ in the optimization solving process and the higher the satisfaction of the corresponding soft constraint; thus the relative sizes of the $w_i$ corresponding to the different soft task constraints directly express their importance, and the priorities of the different soft constraints are expressed by the coefficients $w_i$ in the diagonal matrix W. The invention sets a higher $w_i$ for a high-priority soft task constraint and a lower $w_i$ for a low-priority one.
4.2 Agent motion path optimization generation under the augmented physical simulation framework
In traditional motion planning algorithms, task constraints are confined to the geometric space and the path is generated by directly judging whether the geometric constraint relations are satisfied, so the planning result satisfies the specific tasks insufficiently. The invention instead performs physical modeling of the various task constraints based on the control fence function and realizes implicit solution of the task constraints through physical simulation calculation; the specific idea is shown in Fig. 4.
First, following the idea from chapter three of describing time-state logic requirements and spatial constraints separately, the various spatio-temporal task constraints describe their temporal logic relations with a DFA; the spatial constraints are then converted via the CBF into the optimized solution of the control quantities required by the dynamic physical simulation; the agent's own physical property limits are likewise described through the CBF; and specific task targets can be described through the cost function. Finally, the whole augmented physical simulation framework can be converted into the following optimization solution framework:

$$
\begin{aligned}
\min_{u}\quad & J(u,x) = \int_{0}^{T} c\big(x(t),u(t)\big)\,dt + \Phi(\epsilon)\\
\text{s.t.}\quad & \dot{x} = f(x,u)\\
& \dot{h}_i(x) \ge -\gamma\, h_i(x),\quad i = 1,\dots,m\\
& \dot{h}_j(x) \ge -\gamma\, h_j(x) - \epsilon_j,\quad \epsilon_j \ge 0,\quad j = m+1,\dots,n
\end{aligned}\tag{4.5}
$$
In formula (4.5) there are n task constraints, where $h_i$ ($i = 1,\dots,m$) are the hard constraints described based on the CBF and $h_j$ ($j = m+1,\dots,n$) are the soft constraints described based on the CBF. For the optimization calculation, based on the MPC algorithm, the continuous formulation must be converted into a discretized form and the MPC solved with a numerical simulation algorithm; according to formula (1.1), the discretized form of formula (4.5) under the MPC optimization solving framework is:
$$
\begin{aligned}
\min_{u_{t:t+N-1|t}}\quad & \sum_{k=0}^{N-1} c\big(x_{t+k|t}, u_{t+k|t}\big)\\
\text{s.t.}\quad & x_{t+k+1|t} = f\big(x_{t+k|t}, u_{t+k|t}\big)\\
& \Delta h\big(x_{t+k|t}, u_{t+k|t}\big) \ge -\gamma\, h\big(x_{t+k|t}\big)\\
& x_{t+k|t} \in X,\quad u_{t+k|t} \in U\\
& x_{t|t} = x_t
\end{aligned}\tag{4.6}
$$

In formula (4.6), $\Delta h(x_{t+k|t}, u_{t+k|t}) := h(x_{t+k+1}) - h(x_{t+k})$, and $\Delta h(x_{t+k|t}, u_{t+k|t}) \ge -\gamma\, h(x_{t+k|t})$ is the discretized form of $\dot{h}(x) \ge -\gamma h(x)$. This is because in the numerical simulation calculation $\dot{h} \approx (h_{t+\Delta t} - h_t)/\Delta t$, so:

$$\frac{h_{t+\Delta t} - h_t}{\Delta t} \ge -\gamma\, h_t\tag{4.7}$$

$$h_{t+\Delta t} - h_t \ge -\gamma \cdot \Delta t \cdot h_t\tag{4.8}$$

Since Δt is the step size of the simulation calculation and a fixed value, formula (4.8) can be reduced to $h_{t+\Delta t} - h_t \ge -\gamma h_t$; this is equivalent to replacing the original γ·Δt with a new value γ, which should satisfy 0 < γ < 1.
According to formula (4.5), the planning generation of a motion path satisfying specific task constraints can thus be converted into a dynamic simulation calculation process, with the required control input generated by optimization under the MPC framework. Owing to the inherently conservative nature of the control fence function (it satisfies only a conservative subset of the conditions for the system's forward invariance), when n task constraints act simultaneously, a situation may arise in which no feasible control input exists. To improve the feasibility of the system, the invention further proposes the optimizable control fence function.
According to the constraint requirement of the control fence function in discretized form, the feasible state set corresponding to a simulation time point t + k is defined as:

$$S_{CBF,k} = \big\{x \in X : h(x_{t+k+1}) \ge (1 - \gamma_k)\, h(x_{t+k})\big\}\tag{4.9}$$

According to formula (4.9), the range of the feasible state set is jointly determined by $h(x_{t+k})$ and $\gamma_k$: clearly, the larger $\gamma_k$, the lower the requirement on $h(x_{t+k+1})$ at the next time point, and conversely, the smaller $\gamma_k$, the higher the requirement on $h(x_{t+k+1})$. In the existing definition of the control fence function, $\gamma_k$ is a hyper-parameter fixed in advance; if $\gamma_k$ is set unreasonably, no feasible solution will exist. The invention therefore proposes to adjust $\gamma_k$ in the control fence function dynamically according to the actual situation of the multi-task constraint optimization solution. Following this idea, the invention further takes γ in the control fence function as an optimization variable, referred to here as the Optimizable control fence function (Optimizable CBF), and formula (4.6) is modified as:

$$
\begin{aligned}
\min_{u_{t:t+N-1|t},\ \gamma_{t:t+N-1|t}}\quad & \sum_{k=0}^{N-1} c\big(x_{t+k|t}, u_{t+k|t}\big) + \Phi(\gamma_t)\\
\text{s.t.}\quad & x_{t+k+1|t} = f\big(x_{t+k|t}, u_{t+k|t}\big)\\
& h\big(x_{t+k+1|t}\big) \ge (1 - \gamma_k)\, h\big(x_{t+k|t}\big),\quad 0 < \gamma_k < 1\\
& x_{t+k|t} \in X,\quad u_{t+k|t} \in U,\quad x_{t|t} = x_t
\end{aligned}\tag{4.10}
$$
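A toy feasibility check of the idea of optimizing γ_k: in the one-dimensional setting below (all numbers illustrative: a clearance state with drift toward the obstacle and a bounded input), a fixed small γ leaves no admissible input, while jointly searching over (u, γ) restores feasibility. A grid search stands in for the QP/NLP solver.

```python
# One step of h(x') >= (1 - gamma) h(x) with h(x) = x, dynamics
# x' = x + drift + u, |u| <= U_MAX.  Fixed gamma = 0.1 is infeasible here;
# letting the optimizer also pick gamma in (0, 1] recovers a feasible pair.
import itertools

X0, DRIFT, U_MAX = 0.4, -0.5, 0.2                 # illustrative numbers
U_GRID = [i * 0.05 - 0.2 for i in range(9)]       # u in [-0.2, 0.2]
G_GRID = [0.1 * i for i in range(1, 11)]          # gamma in (0, 1]

def feasible(u, gamma):
    x_next = X0 + DRIFT + u
    return x_next >= (1.0 - gamma) * X0

fixed_ok = any(feasible(u, 0.1) for u in U_GRID)  # fixed, too-strict gamma
ocbf_ok = any(feasible(u, g) for u, g in itertools.product(U_GRID, G_GRID))
print(fixed_ok, ocbf_ok)
```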
where $\Phi(\gamma_t)$ is the further introduced optimization objective, which can be arranged into different forms according to specific needs. The invention develops and defines the following three forms:
(1) Preference for conservative behavior

If the motion trajectory of the agent is required to be conservative, satisfying the constraint requirements to as high a degree as possible, the $\gamma_k$ at each time step should be as small as possible; the invention therefore defines the optimization objective:

$$\Phi(\gamma) = \gamma^{T} W_{\gamma}\,\gamma\tag{4.11}$$

With this objective, the optimization process tends to make each $\gamma_k$ as small as possible, thereby realizing conservative behavior generation.
(2) Preference for risk-taking behavior

If the motion trajectory of the agent is required to lean toward risk, completing the task at the minimum margin that still satisfies the constraint requirements, the $\gamma_k$ at each time step should be as large as possible; the invention therefore defines the optimization objective:

$$\Phi(\gamma) = (\gamma - \mathbf{1})^{T} W_{\gamma}\,(\gamma - \mathbf{1})\tag{4.12}$$

With this objective, the optimization process tends to make each $\gamma_k$ as large as possible (close to 1), thereby realizing risk-taking behavior generation.
(3) User-defined behavior

If the user has a specific requirement on $\gamma_k$, namely $\gamma_t = \gamma_0$, the invention defines the optimization objective:

$$\Phi(\gamma) = (\gamma - \gamma_0)^{T} W_{\gamma}\,(\gamma - \gamma_0)\tag{4.13}$$

With this objective, the optimization process tends to keep $\gamma_k$ as close to $\gamma_0$ as possible, thereby realizing user-defined behavior generation.
From the description (4.9) of the feasible state set, once $\gamma_k$ is fixed, the $h(x_{t+k})$ of the current time point also places a limit on $h(x_{t+k+1})$: if $h(x_{t+k})$ at the current time point is large, $h(x_{t+k+1})$ must also take a large value for the CBF constraint requirement to hold, whereas the essential requirement of forward invariance is in fact only $h(x_{t+k+1}) \ge 0$. The invention therefore further proposes changing the discrete-form control fence function constraint requirement to:

$$h\big(x_{t+k+1|t}\big) \ge S_k,\qquad S_k \ge 0\tag{4.14}$$

In formula (4.14), $S_k$ is a newly introduced optimization variable that replaces $(1-\gamma)h(x_t)$; the invention can thus directly optimize the degree to which $h(x_{t+k|t})$ satisfies the task constraint. This is called the GOCBF (Generalized and Optimizable CBF); it adjusts the multiple task constraints more directly and effectively enlarges the range of the feasible solution space. Likewise, the invention introduces an $S_k$-related preference objective, e.g. $\Phi(S) = (S - S_0)^{T} W_S (S - S_0)$, so that the motion path obtained by simulation optimization conforms to the preference.
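A minimal sketch of the effect of the GOCBF bound h(x_{k+1}) ≥ S_k: in the illustrative one-dimensional model below (clearance dynamics h' = h + u with a saturated input; all names are assumptions, not the patent's setup), a larger S forces an earlier corrective action, matching the stated physical meaning of S_k as a directly adjustable clearance.

```python
# Track a desired control u_des while enforcing h + u >= S, |u| <= U_MAX.
U_MAX = 1.0

def gocbf_control(h, u_des, S):
    u = max(u_des, S - h)              # smallest correction giving h + u >= S
    return max(-U_MAX, min(U_MAX, u))  # saturate to the admissible input range

# u_des = -0.8 drives toward the obstacle; larger S forces earlier braking.
u_small_S = gocbf_control(h=1.0, u_des=-0.8, S=0.0)
u_large_S = gocbf_control(h=1.0, u_des=-0.8, S=0.9)
print(u_small_S, u_large_S)
```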
5. Simulation results and analysis
In order to test and verify the MPC-OCBF and MPC-GOCBF augmented physical simulation frameworks proposed by this project, the invention performs simulation experiments on the algorithms in MATLAB, based on the YALMIP optimization modeling language and using the IPOPT optimization package.
5.1 Agent motion path optimization calculation with diversified behavior generation
Consider a point agent whose dynamic model is second order:

$$X_{k+1} = A X_k + B U_k\tag{5.1}$$

where the state variable $X_k = [x, y, v_x, v_y]^{T}$ comprises the position and velocity of the agent in the two-dimensional plane, $U_k = [u_x, u_y]^{T}$ is the control input required by the system, and the state transition matrices A, B are:

$$A = \begin{bmatrix}1 & 0 & \Delta t & 0\\ 0 & 1 & 0 & \Delta t\\ 0 & 0 & 1 & 0\\ 0 & 0 & 0 & 1\end{bmatrix},\qquad B = \begin{bmatrix}\Delta t^2/2 & 0\\ 0 & \Delta t^2/2\\ \Delta t & 0\\ 0 & \Delta t\end{bmatrix}\tag{5.2}$$
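The discrete model (5.1) can be exercised directly. The sample time dt below is an assumption (the excerpt does not state it), and the matrices follow the standard discrete double-integrator form.

```python
# One step of X_{k+1} = A X_k + B U_k for the planar double integrator,
# X = [x, y, vx, vy], U = [ux, uy]; dt = 0.1 is an assumed sample time.
dt = 0.1
A = [[1, 0, dt, 0],
     [0, 1, 0, dt],
     [0, 0, 1,  0],
     [0, 0, 0,  1]]
B = [[0.5 * dt * dt, 0],
     [0, 0.5 * dt * dt],
     [dt, 0],
     [0, dt]]

def step(X, U):
    return [sum(A[i][j] * X[j] for j in range(4)) +
            sum(B[i][j] * U[j] for j in range(2)) for i in range(4)]

X1 = step([0.0, 0.0, 1.0, 0.0], [0.0, 1.0])   # moving along x, accelerating in y
print(X1)
```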
the intelligent agent is limited by the physical characteristics of the intelligent agent, and the physical constraints required to be met are as follows:
Figure BDA0003464972800000152
x in simulation experimentmax,xmin=±5·I4×1,umax,umin=I2×1In which In×nIs an n × n identity matrix. A radius r needs to be avoided in the moving process of the intelligent bodyobsThe corresponding control fence function is defined as 1.5 obstacle:
Figure BDA0003464972800000153
The initial and target positions of the agent are (-5, -5) and (0, 0), and the obstacle is located at $(x_{obs}, y_{obs}) = (-2, -2.25)$. The cost function in the optimization process is defined as:

$$c(x,u) = x_k^{T} Q x_k + u_k^{T} R u_k + x_N^{T} P x_N\tag{5.5}$$

In formula (5.5), $Q = 10\cdot I_{4\times 4}$, $R = I_{2\times 2}$, $P = 100\cdot I_{4\times 4}$, and the prediction window length (MPC horizon) is N = 8.
(1) Conservative-behavior motion path optimization generation

In the simulation experiments, the invention performs the dynamic physical simulation optimization generation of the motion path both with fixed coefficients $\gamma_k$ (0.01 and 0.1) and with the method that makes $\gamma_k$ in the optimization objective as small as possible; the simulation results are shown in Fig. 5.

As can be seen from Fig. 5(b), with the MPC-OCBF algorithm framework proposed by the invention the coefficient $\gamma_k$ adjusts automatically to the actual optimization calculation process: when reaching the target as soon as possible conflicts with avoiding the obstacle, the framework automatically raises $\gamma_k$ to loosen the constraint requirement, satisfying the other constraints as far as possible while still guaranteeing the collision-avoidance constraint; at other times the coefficient $\gamma_k$ stays close to zero, so the collision-avoidance constraint is satisfied to the maximum degree. As can be seen from Fig. 5(a), as the invention gradually increases $W_\gamma$ in the optimization objective, the motion path obtained by the optimization calculation becomes more and more conservative and swings further away from the obstacle.
(2) Risk-taking-behavior motion path optimization generation

In the simulation experiments, the invention performs the dynamic physical simulation optimization generation of the motion path both with fixed coefficients $\gamma_k$ (0.9 and 1.0) and with the method that makes $\gamma_k$ in the optimization objective as large as possible; the simulation results are shown in Fig. 6.

As can be seen from Fig. 6(a), the agent trajectories obtained with different $W_\gamma$ are essentially the same, because the risk-taking behavior objective is consistent with the task constraint requirement of reaching the target point as soon as possible (in a straight line) and the two do not conflict, so the $\gamma_k$ obtained in the optimization calculation always remains close to 1 (as shown in Fig. 6(b)). From Figs. 5 and 6 it can therefore be seen that the MPC-OCBF augmented physical simulation optimization calculation framework proposed by the invention automatically realizes appropriate optimization and adjustment among different task constraints, enlarging the solution space and yielding diversified trajectory paths.
(3) User-defined behavior path optimization generation

In the simulation experiments, the invention sets user-defined coefficients $\gamma_0 = [0.3, 0.5, 0.8]$ and $W_\gamma = 10^2\cdot I_N$; the simulation experiment results are shown in Fig. 7.

As can be seen from Fig. 7(a), compared with the MPC-CBF method with directly fixed coefficient $\gamma_k$, the motion trajectories generated by the proposed MPC-OCBF method do not differ greatly, because with $W_\gamma = 10^2\cdot I_N$ set relatively small, the optimization algorithm framework automatically adjusts $\gamma_k$ upward when the objectives conflict, as shown in Fig. 7(b), ensuring that the resulting path reaches the target position as soon as possible under the condition that the collision-avoidance constraint is satisfied.
(4) MPC-GOCBF algorithm behavior path optimization generation

Finally, the invention performs motion path simulation optimization calculation with the MPC-GOCBF algorithm proposed in this project (formula (4.14)), directly setting $S_k$. In the experiment the invention sets $S_0 = [0, 1, 2]$ and $W_S = 10^2\cdot I_N$; the final results are shown in Fig. 8.

It can be seen from Fig. 8(a) that by setting different $S_k$ the invention obtains motion paths with different behavior characteristics. $S_k$ has a clear physical and practical meaning: the larger $S_k$, the further the agent deviates from the obstacle, and in actual operation a motion path meeting specific requirements can be obtained directly by setting a different $S_k$. It is worth noting that during the simulation optimization calculation the control fence function is not strictly required to stay above $S_0$; instead it is adjusted dynamically according to the actual situation of the task constraints. As shown in Fig. 8(b), when the collision-avoidance constraint requirement conflicts with the task constraint requirement of reaching the target point as soon as possible (in a straight line), the optimization algorithm automatically decreases $S_k$, automatically realizing the optimization adjustment among the multi-task constraints; the whole algorithm framework thus has strong adaptability.
5.2 Adaptive cruise motion path optimization calculation in autonomous driving
In order to further test the performance of the proposed MPC-GOCBF algorithm framework, the invention further tests and compares it on the adaptive cruise control problem in autonomous driving. The dynamic model of the agent is:

$$
\begin{aligned}
\dot{x}_1 &= x_2\\
\dot{x}_2 &= \frac{1}{m}\big(u - F_r(x)\big)\\
\dot{x}_3 &= v_l - x_2
\end{aligned}\tag{5.6}
$$

where $(x_1, x_2)$ are the position and speed of the cruising agent, m is the mass of the agent, $x_3$ is the distance between the cruising agent and the leading agent ahead of it, and the leading agent travels at a fixed speed $v_l$. $F_r(x) = f_0 + f_1 x_2 + f_2 x_2^2$ is the resistance term of the agent dynamics, where $f_0$, $f_1$ and $f_2$ are empirical constants. The range of control input the agent can apply is limited to $-c_d m g \le u \le c_a m g$, with $c_d$, $c_a$ constant coefficients; the values of all the above parameters are fully consistent with those in prior work. The task constraint the cruising agent must satisfy is expressed with a control Lyapunov function:
V=(x2-vd)2 (5.7)
in the formula (7), vdRepresenting the desired cruising speed of the cruising agent.
The control fence function is defined as:

h = x3 − 1.8·x2        (5.8)
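The model and constraints above can be put together in a small sketch, assuming the standard Euler discretization; the parameter values (mass, drag coefficients, lead speed, desired speed) are illustrative assumptions rather than the values used in the invention's experiments.

```python
def resistance(x2, f0=0.1, f1=5.0, f2=0.25):
    """Empirical resistance term Fr(x2) = f0 + f1*x2 + f2*x2**2."""
    return f0 + f1 * x2 + f2 * x2 ** 2

def acc_step(x, u, m=1650.0, v_l=13.89, dt=0.02):
    """One Euler step of the cruise-agent model:
    x1' = x2,  m*x2' = -Fr(x2) + u,  x3' = v_l - x2."""
    x1, x2, x3 = x
    return (x1 + dt * x2,
            x2 + dt * (u - resistance(x2)) / m,
            x3 + dt * (v_l - x2))

def clf_v(x2, v_d=24.0):
    """Control Lyapunov function (5.7): tracking of the desired speed."""
    return (x2 - v_d) ** 2

def cbf_h(x):
    """Control fence function (5.8): time-headway safety condition."""
    _, x2, x3 = x
    return x3 - 1.8 * x2
```

For example, cbf_h((0.0, 20.0, 100.0)) is 64.0 (safe), while cbf_h((0.0, 30.0, 40.0)) is -14.0, i.e. the headway constraint is violated.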
In order to test the limit performance of the MPC-GOCBF algorithm framework, the optimization time window length N of the MPC is taken as 1, so that the algorithm degenerates into a single-step QP optimization problem. The results are compared with the CBF-QP method with fixed coefficient γk = 0.5 and with its improved variant, the optimal-decay CBF-QP method, which adds an extra optimized parameter to CBF-QP. The invention takes S0 = 25 and WS = 10⁴; the simulation results are shown in FIG. 9.
In the simulation experiment, the adaptability of the algorithm framework is tested by setting different initial speeds for the cruise agent: 26, 28, 30 and 32, respectively. As shown in FIG. 9(a), when the initial speed is 30 or 32, CBF-QP (nominal) cannot obtain a feasible solution, whereas the GOCBF algorithm of the invention ensures that the system obtains a feasible solution, achieves the same performance as the optimal-decay CBF-QP method, and guarantees that the safety constraint requirement h ≥ 0 is satisfied throughout, as shown in FIG. 9(b). Meanwhile, S is automatically and dynamically adjusted throughout the process, ensuring the feasibility of the system optimization solution.
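For the single-step case (N = 1) the QP has one scalar control, so its solution can be written in closed form as a projection onto an interval. The sketch below uses the same illustrative Euler model and parameters as above (all assumed, not the patent's) and shows how the fixed-coefficient constraint h(x_next) >= (1 - gamma)*h(x) either clips the desired control or becomes infeasible.

```python
def dcbf_qp_1step(x, u_des, gamma=0.5, m=1650.0, f0=0.1, f1=5.0, f2=0.25,
                  v_l=13.89, dt=0.02, c=0.3, g=9.81):
    """Minimize (u - u_des)**2 subject to the discrete CBF condition
    h(x_next) >= (1 - gamma)*h(x) and -c*m*g <= u <= c*m*g, with
    h = x3 - 1.8*x2 and the Euler ACC model. h(x_next) is affine in u,
    so the QP reduces to projection onto an interval; returns None
    when that interval is empty (the fixed-gamma QP is infeasible)."""
    x1, x2, x3 = x
    fr = f0 + f1 * x2 + f2 * x2 ** 2
    h_now = x3 - 1.8 * x2
    # h(x_next) = a + b*u with:
    a = (x3 + dt * (v_l - x2)) - 1.8 * (x2 - dt * fr / m)
    b = -1.8 * dt / m                        # b < 0: accelerating shrinks h
    u_cbf = ((1.0 - gamma) * h_now - a) / b  # upper bound on u, since b < 0
    u_lo, u_hi = -c * m * g, min(c * m * g, u_cbf)
    if u_hi < u_lo:
        return None
    return min(max(u_des, u_lo), u_hi)
```

With a large gap (x3 = 100 at x2 = 20) the desired input passes through unchanged; once h is already negative, no admissible u can restore the decay condition and the sketch returns None, matching the infeasibility of CBF-QP (nominal) observed above.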
The invention further increases S0 to 40, further strengthening the safety constraint requirement on the distance from the leading vehicle; the simulation result is shown in FIG. 10.
As can be seen from FIG. 10, the GOCBF algorithm proposed by the project strengthens the constraint requirement on the safe distance from the leading vehicle by increasing S0, while the algorithm still ensures the feasibility of the system optimization solution. Compared with FIG. 9, the final value of h also increases from 0 to 9.7, so the safe distance from the leading vehicle is increased and the required cruising behavior is generated.
6. Multi-agent motion planning simulation demonstration software design and development
In order to conveniently test and verify the augmented physical simulation algorithm based on the MPC-GOCBF idea provided by the invention, the project further designed and developed a multi-agent motion planning simulation demonstration software environment for testing and verifying multi-agent motion planning algorithms.
6.1 demonstration software design development
The demonstration software is implemented in Python, with the optimization solution realized using CasADi (https://web.casadi.org/). CasADi provides an efficient open-source optimization framework and is very well suited to solving nonlinear optimization problems (nonlinear optimization) and implementing automatic differentiation (algorithmic differentiation). Compared with other optimization libraries, besides standard C/C++ and MATLAB support it currently provides a complete Python API, so nonlinear optimization problems can be solved conveniently and quickly from Python. The use of CasADi mainly comprises 3 steps: constructing the variables, constructing the objective function, and setting the solver; the whole process is very intuitive and friendly.
First, the CasADi optimization package is imported:
import casadi as ca
opti = ca.Opti()
When constructing the variables to be optimized, optimized variables can be defined through the opti.variable() function, and opti.parameter() intuitively defines related parameters, for example:
opt_controls = opti.variable(N)    # the control quantities to be optimized at N time points in the MPC
opt_states = opti.variable(N + 1)  # the state quantities to be optimized at N+1 time points in the MPC
The opti.subject_to() function is then used to define the relevant constraints and to set the optimization target opt_cost, for example:
opti.subject_to(self.opti.bounded(-self.v_max,v,self.v_max))
opt_cost=self.opt_cost+ca.mtimes([self.opt_controls[i,:],R,self.opt_controls[i,:].T])
The optimization solution is computed using the IPOPT solver package; the specific code is as follows:
# Optimizer configuration
opti.minimize(opt_cost)
opts_setting={'ipopt.max_iter':200,'ipopt.print_level':1,'print_time':0,'ipopt.acceptable_tol':1e-5,'ipopt.acceptable_obj_change_tol':1e-5}
opti.solver('ipopt',opts_setting)
For multi-agent dynamic simulation, the project adopts Robotarium, an open-source multi-robot simulation platform (https://github.com/robotarium/robotarium_python_simulator), based on which simulation modeling of multiple agents can be conveniently realized.
In order to improve the universality of the demonstration software, the project adopts an object-oriented programming approach and designs the MPC-GOCBF algorithm as a class, class CLF_CBF_NMPC(), which completes the concrete solving calculations of the MPC-GOCBF algorithm. The main functions designed are:
# __add_system_constraints: define physical characteristic constraints
# __add_dynamics_constraints: define the agent dynamics model
# __add_safe_constraints: implement the task constraint description based on GOCBF
# __set_cost_func: define the cost function
# solve: perform the concrete MPC-GOCBF optimization solution calculation
6.2 Multi-agent motion path generation test
In order to test the multi-agent motion planning simulation demonstration software and verify the performance of the algorithm provided by the invention, the project uses this software environment to test multi-agent dynamic obstacle avoidance based on the MPC-GOCBF algorithm. The obstacle-avoiding agent needs to pass through a number of dynamic agents to reach a target state; its dynamic simulation model is:
Figure BDA0003464972800000181
Physical characteristics, task requirements and dynamic obstacle-avoidance constraints are added using the simulation software environment, and a simulation experiment is carried out; a screenshot of a single simulation run and the motion path of the agent are shown in FIG. 11.
In the simulation test, the initial positions of the other agents are randomly generated, the simulation is run 100 times, and the number of obstacle agents is varied. The MPC-GOCBF algorithm is tested and evaluated using three evaluation indices (collision-avoidance success rate, average path length and average time) and compared with the optimal-decay CBF method; the results are shown in Table 1.
TABLE 1. Statistics of multi-agent collision-avoidance performance
Figure BDA0003464972800000191
As can be seen from Table 1, as the number of obstacles increases, the success rate of the proposed MPC-GOCBF algorithm decreases, but the overall performance remains stable and the multi-agent collision-avoidance problem is solved well.
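The three evaluation indices can be aggregated over repeated runs with a small helper; the run-record format (a collided flag plus path length and elapsed time) is an assumption for illustration.

```python
def evaluate_runs(runs):
    """runs: list of (collided, path_length, elapsed_time) tuples.
    Returns (success_rate, avg_path_length, avg_time), averaging the
    latter two indices over the successful (collision-free) runs."""
    ok = [r for r in runs if not r[0]]
    rate = len(ok) / len(runs)
    if not ok:
        return rate, float("nan"), float("nan")
    avg_len = sum(r[1] for r in ok) / len(ok)
    avg_time = sum(r[2] for r in ok) / len(ok)
    return rate, avg_len, avg_time
```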
7. Aiming at the problems that in traditional motion planning algorithms the task constraints are limited to the geometric space, the dynamic characteristics of the agent are not considered, and there is no overall trade-off among different task constraints, the invention proposes an optimizable GOCBF function. Based on MPC-GOCBF, dynamic physical simulation and motion planning are combined; based on the augmented physical simulation idea, the improved CBF function is used to realize a unified description of the multi-task constraints of motion planning, so that the explicit constraint-relation analytical solving (or searching) process is converted into implicit simulation calculation. A multi-agent motion planning physical simulation environment is constructed, the algorithm is tested and verified, and the experimental results demonstrate the flexibility and dynamic adaptability of the algorithm.
The above description is only for the purpose of illustrating preferred embodiments of the present invention and is not intended to limit the scope of the invention, which is intended to cover all modifications, equivalents and improvements within the spirit and scope of the invention as defined by the appended claims.

Claims (10)

1. A discrete control fence function improved optimization method based on a discrete-time control fence function, characterized by comprising the following steps:
step one, obtaining a feasible state set according to the control fence function constraint requirement in discretized form;
step two, dynamically adjusting the γk of the control fence function according to the solving conditions of the multiple task constraint optimizations;
step three, taking the γk of the control fence as an optimized variable to obtain an optimizable control fence function;
step four, changing the constraint requirement of the discrete control fence function and directly optimizing the satisfaction degree of the task constraint on h(x(t+k|t));
step five, introducing an Sk-related preference target into the optimization objective, so that the multi-agent motion path data of the complex dynamic scene obtained by simulation optimization conforms to the preference.
2. The discrete control fence function improved optimization method based on a discrete-time control fence function as claimed in claim 1, characterized in that in step one the continuous constraint
ḣ(x) ≥ −γ·h(x)
has the discretized form:
h(t+Δt) − h(t) ≥ −γ·h(t);
according to the control fence function constraint requirement in this discretized form, the feasible state set corresponding to a certain simulation time point t+k is defined as:
SCBF,k = {x ∈ X : h(x(t+k+1)) ≥ (1 − γk)·h(x(t+k))};
where the range of the feasible state set is jointly determined by h(x(t+k)) and γk.
3. The discrete control fence function improved optimization method based on a discrete-time control fence function as claimed in claim 1, characterized in that in step two, γk, which is a predetermined hyperparameter in the existing definition of the control fence function, is dynamically adjusted in the control fence function according to the actual situation of the multiple task constraint optimization solutions.
4. The discrete control fence function improved optimization method based on a discrete-time control fence function as claimed in claim 1, characterized in that in step three, the γk of the control fence is taken as an optimized variable, yielding an optimizable control fence function.
5. The discrete control fence function improved optimization method based on a discrete-time control fence function as claimed in claim 1, characterized in that the constraint requirement of the discrete-time control fence function is changed to:
h(x(t+k|t)) ≥ Sk
where Sk is a newly introduced optimized variable that replaces (1 − γ)·h(x(t)), so that the satisfaction degree of the task constraint is optimized directly on h(x(t+k|t)).
6. The discrete control fence function improved optimization method based on a discrete-time control fence function as claimed in claim 1, characterized in that introducing the Sk-related preference target into the optimization objective in step five, so that the multi-agent motion path data of the complex dynamic scene obtained by simulation optimization conforms to the preference, comprises:
introducing into the optimization objective an Sk-related preference target comprising Φ(S) = (S − S0)ᵀ·WS·(S − S0), so that the motion path obtained by simulation optimization conforms to the preference.
7. A multi-agent motion trail optimization system based on a discrete-time control fence function, applying the discrete control fence function improved optimization method based on a discrete-time control fence function as claimed in any one of claims 1 to 6, characterized in that the system comprises:
a feasible state set definition module for defining a feasible state set according to the control fence function constraint requirement in discretized form;
a variable dynamic adjustment module for dynamically adjusting the γk of the control fence function according to the solving conditions of the multiple task constraint optimizations;
an optimizable control fence function acquisition module for taking the γk of the control fence as an optimized variable to obtain an optimizable control fence function;
a constraint requirement changing module for changing the constraint requirement of the discrete control fence function and directly optimizing the satisfaction degree of the task constraint on h(x(t+k|t));
a motion path preference conformity module for introducing the Sk-related preference target into the optimization objective, so that the motion path obtained by simulation optimization conforms to the preference.
8. A computer device, characterized in that the computer device comprises a memory and a processor, the memory storing a computer program which, when executed by the processor, causes the processor to carry out the steps of:
defining a feasible state set according to the control fence function constraint requirement in discretized form; dynamically adjusting the γk of the control fence function according to the solving conditions of the multiple task constraint optimizations; taking the γk of the control fence as an optimized variable to obtain an optimizable control fence function; changing the constraint requirement of the discrete control fence function and directly optimizing the satisfaction degree of the task constraint on h(x(t+k|t)); introducing the Sk-related preference target into the optimization objective, so that the motion path obtained by simulation optimization conforms to the preference.
9. A computer-readable storage medium storing a computer program which, when executed by a processor, causes the processor to perform the steps of:
defining a feasible state set according to the control fence function constraint requirement in discretized form; dynamically adjusting the γk of the control fence function according to the solving conditions of the multiple task constraint optimizations; taking the γk of the control fence as an optimized variable to obtain an optimizable control fence function; changing the constraint requirement of the discrete control fence function and directly optimizing the satisfaction degree of the task constraint on h(x(t+k|t)); introducing the Sk-related preference target into the optimization objective, so that the motion path obtained by simulation optimization conforms to the preference.
10. An information data processing terminal, characterized in that the information data processing terminal is used for realizing the functions of the multi-agent motion trail optimization system based on a discrete-time control fence function as claimed in claim 7.
CN202210026694.0A 2022-01-11 2022-01-11 Discrete control fence function improvement optimization method, optimization system, terminal and medium Active CN114371626B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202210026694.0A CN114371626B (en) 2022-01-11 2022-01-11 Discrete control fence function improvement optimization method, optimization system, terminal and medium

Publications (2)

Publication Number Publication Date
CN114371626A true CN114371626A (en) 2022-04-19
CN114371626B CN114371626B (en) 2023-07-14

Family

ID=81144332

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202210026694.0A Active CN114371626B (en) 2022-01-11 2022-01-11 Discrete control fence function improvement optimization method, optimization system, terminal and medium

Country Status (1)

Country Link
CN (1) CN114371626B (en)

Citations (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20180357335A1 (en) * 2017-06-08 2018-12-13 Bigwood Technology, Inc. Systems for solving general and user preference-based constrained multi-objective optimization problems
CN110471408A (en) * 2019-07-03 2019-11-19 天津大学 Automatic driving vehicle paths planning method based on decision process
US20200293009A1 (en) * 2019-03-11 2020-09-17 Mitsubishi Electric Research Laboratories, Inc. Model Predictive Control of Systems with Continuous and Discrete Elements of Operations
CN112116830A (en) * 2020-09-02 2020-12-22 南京航空航天大学 Unmanned aerial vehicle dynamic geo-fence planning method based on airspace meshing
CN113190613A (en) * 2021-07-02 2021-07-30 禾多科技(北京)有限公司 Vehicle route information display method and device, electronic equipment and readable medium
CN113238563A (en) * 2021-06-04 2021-08-10 重庆大学 High-real-time automatic driving motion planning method


Non-Patent Citations (3)

* Cited by examiner, † Cited by third party
Title
JUN ZENG: "Enhancing Feasibility and Safety of Nonlinear Model Predictive Control with Discrete-Time Control Barrier Functions", 《2021 60TH IEEE CONFERENCE ON DECISION AND CONTROL (CDC)》 *
LI SHI-LEI: "High-quality trajectory planning for heterogeneous individuals", 《J. CENT. SOUTH UNIV》 *
李石磊: "Optimizable control barrier functions to improve feasibility and add behavior diversity while ensuring safety", 《ELECTRONICS》 *

Also Published As

Publication number Publication date
CN114371626B (en) 2023-07-14


Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant