CN114371626B - Discrete control fence function improvement optimization method, optimization system, terminal and medium - Google Patents



Publication number
CN114371626B
Authority
CN
China
Prior art keywords
optimization
fence function
constraint
function
control fence
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN202210026694.0A
Other languages
Chinese (zh)
Other versions
CN114371626A (en)
Inventor
李石磊
袁志民
李猛
杨智超
叶清
王甲生
何涛
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Naval University of Engineering PLA
Original Assignee
Naval University of Engineering PLA
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Naval University of Engineering PLA filed Critical Naval University of Engineering PLA
Priority to CN202210026694.0A priority Critical patent/CN114371626B/en
Publication of CN114371626A publication Critical patent/CN114371626A/en
Application granted granted Critical
Publication of CN114371626B publication Critical patent/CN114371626B/en
Active legal-status Critical Current
Anticipated expiration legal-status Critical

Classifications

    • G: PHYSICS
    • G05: CONTROLLING; REGULATING
    • G05B: CONTROL OR REGULATING SYSTEMS IN GENERAL; FUNCTIONAL ELEMENTS OF SUCH SYSTEMS; MONITORING OR TESTING ARRANGEMENTS FOR SUCH SYSTEMS OR ELEMENTS
    • G05B13/00: Adaptive control systems, i.e. systems automatically adjusting themselves to have a performance which is optimum according to some preassigned criterion
    • G05B13/02: Adaptive control systems, electric
    • G05B13/04: Adaptive control systems, electric, involving the use of models or simulators
    • G05B13/042: Adaptive control systems in which a parameter or coefficient is automatically adjusted to optimise the performance

Landscapes

  • Engineering & Computer Science (AREA)
  • Health & Medical Sciences (AREA)
  • Artificial Intelligence (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Evolutionary Computation (AREA)
  • Medical Informatics (AREA)
  • Software Systems (AREA)
  • Physics & Mathematics (AREA)
  • General Physics & Mathematics (AREA)
  • Automation & Control Theory (AREA)
  • Management, Administration, Business Operations System, And Electronic Commerce (AREA)

Abstract

The invention belongs to the technical field of control fence functions (control barrier functions, CBF), and discloses an improved optimization method, optimization system, terminal and storage medium for a discrete control fence function. A feasible state set is defined according to the constraint requirement of the discrete control fence function; the coefficient γ_k of the control fence function is dynamically adjusted according to the constraint optimization solving conditions of multiple tasks; γ_k is taken as an additional optimization variable, yielding an optimizable control fence function; the constraint requirement of the discrete-form control fence function is then changed so that the satisfaction degree of each task constraint is optimized directly; finally, a preference target related to S_k is introduced into the optimization objective, so that the motion path obtained by simulation optimization conforms to the given preference. The invention significantly enlarges the feasible-region space of the optimization algorithm and yields more flexible optimization solutions. Meanwhile, the constraint requirements of the invention have a clearer physical meaning and are more intuitive and easier to set.

Description

Discrete control fence function improvement optimization method, optimization system, terminal and medium
Technical Field
The invention belongs to the technical field of control fence functions, and particularly relates to an improved optimization method, an optimization system, a terminal and a storage medium for a discrete control fence function.
Background
Currently, the definition of the control fence function (control barrier function, CBF) is closely related to the concept of the control invariant set. If there exists a control law π: R^n → R^m such that, for any starting condition x(0) ∈ IS, x(t) ∈ IS holds for all t ≥ 0, the set IS is referred to as a control invariant set. Given a closed set C ⊆ R^n, assume that the set C satisfies:

C = {x ∈ R^n : h(x) ≥ 0}
∂C = {x ∈ R^n : h(x) = 0}
Int(C) = {x ∈ R^n : h(x) > 0}

where Int(C) and ∂C respectively represent the interior and the boundary of the set C, and h: R^n → R is a continuously differentiable function. If for all x ∈ C there exists a control input u satisfying:

ḣ(x, u) ≥ -γ(h(x))

then h(x) is called a control fence function. Here γ(·) is a class-K function, i.e. γ(·) is strictly monotonically increasing and satisfies γ(0) = 0; in practical applications γ(·) is often taken as a linear function with a constant coefficient, i.e. γ(h(x)) = γh(x). As can be seen from the definition, if the initial value h(x(0)) ≥ 0, then since h(x) always decays no faster than an exponential, h(x) ≥ 0 is guaranteed for all time, that is, forward invariance holds and the set {x | h(x) ≥ 0} is a control invariant set of the system.
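To make the exponential-decay argument above concrete, the following minimal sketch (illustrative only; the function name and values are assumptions, not part of the patent) iterates the worst case allowed by the discrete condition h_{t+1} ≥ (1 - γ)h_t and checks that h never becomes negative:

```python
def barrier_trajectory(h0, gamma, steps):
    """Worst-case h sequence: each step takes the smallest admissible value
    h_{t+1} = (1 - gamma) * h_t allowed by the discrete CBF condition."""
    hs = [h0]
    for _ in range(steps):
        hs.append((1.0 - gamma) * hs[-1])
    return hs

hs = barrier_trajectory(h0=1.0, gamma=0.2, steps=50)
assert all(h >= 0.0 for h in hs)   # forward invariance: h never goes negative
```

Since 0 < γ ≤ 1 makes each factor (1 - γ) nonnegative, h can shrink toward the boundary h = 0 but never cross it, which is exactly the forward-invariance claim.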
In addition, the control fence function is closely related to the control Lyapunov function. In nonlinear control, if the stability of the system is to be ensured, i.e. x(t) → 0, then instead of judging stability by directly solving for the system state, a control Lyapunov function is constructed: if a controller can be built that makes a positive definite function V(x) approach zero, the stability of the system can be determined indirectly. If the controller satisfies:

V̇(x, u) ≤ -λV(x)

it can be shown that the system converges stably, with V(x*) = 0, i.e. x* = 0. By means of the function V and by constructing such a controller, the stability of the system can be indirectly ensured. Similar to the idea of the control Lyapunov function, the control fence function indirectly ensures the forward invariance of the system by constructing a controller: in use, a task constraint is described in the form h(x) ≥ 0, and by optimizing the control quantity with the help of the forward-invariance concept of the control fence function, satisfaction of the task constraint is guaranteed indirectly. It is worth noting here that the control Lyapunov function requires V̇ ≤ -λV and not merely V̇ ≤ 0, so as to ensure the convergence speed of the system; similarly, the control fence function requires ḣ ≥ -γ(h(x)). This constraint requirement is only a conservative subset of the forward invariance of the system, which may leave the optimization algorithm unable to find a feasible solution when multiple task constraints expressed by control fence functions are applied simultaneously. Thus, there is a need for a new improved optimization method based on the discrete-time control fence function.
Through the above analysis, the problems and defects of the prior art are as follows: (1) The constraint requirement ḣ ≥ -γ(h(x)) of the existing control fence function is only a conservative subset of the forward invariance of the system, which may leave the optimization algorithm unable to find a feasible solution when multiple task constraints expressed by control fence functions are applied simultaneously.
(2) In the prior art, in multi-agent motion trajectory planning for complex dynamic scenes, the dimension of the problem grows sharply, so the traditional methods can hardly meet the requirements on computational efficiency and motion detail, and the accuracy of the obtained multi-agent motion trajectory data is low.
Disclosure of Invention
Aiming at the problems existing in the prior art, the invention provides an improved optimizing method, an optimizing system, a terminal and a storage medium for a discrete control fence function, and particularly designs an improved optimizing method for the discrete control fence function based on an optimizable discrete time control fence function.
The invention is realized in such a way that a discrete control fence function improvement optimization method based on a discrete time control fence function comprises the following steps:
Step one, defining a feasible state set according to the control fence function constraint requirement in discretized form;
Step two, dynamically adjusting the coefficient γ_k of the control fence function according to the constraint optimization solving conditions of multiple tasks;
Step three, taking γ_k of the control fence function as an additional optimization variable to obtain an optimizable control fence function;
Step four, changing the constraint requirement of the discrete-form control fence function so that the satisfaction degree of each task constraint, h(x_{t+k|t}), is optimized directly;
Step five, introducing a preference target related to S_k into the optimization objective, so that the multi-agent motion path data of the complex dynamic scene obtained by simulation optimization conforms to the preference.
Further, in step one, the discretized form of ḣ ≥ -γh is:

h_{t+Δt} - h_t ≥ -γh_t

According to the control fence function constraint requirement in discretized form, the feasible state set corresponding to a simulation time point t+k is defined as:

S_{CBF,k} = {x ∈ X : h(x_{t+k+1}) ≥ (1 - γ_k)h(x_{t+k})};

where the range of the feasible state set is jointly determined by h(x_{t+k}) and γ_k. Obviously, the larger γ_k is, the lower the requirement on the next-step value h(x_{t+k+1}); conversely, the smaller γ_k is, the higher the requirement on h(x_{t+k+1}).
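A one-line membership test for this feasible set, written here as an assumed helper for illustration (not code from the patent), makes the role of γ_k explicit:

```python
def in_feasible_set(h_next, h_curr, gamma_k):
    """Membership test for S_CBF,k: h(x_{t+k+1}) >= (1 - gamma_k) * h(x_{t+k})."""
    return h_next >= (1.0 - gamma_k) * h_curr

# A larger gamma_k weakens the requirement on the next-step barrier value:
assert not in_feasible_set(h_next=0.5, h_curr=1.0, gamma_k=0.2)  # needs >= 0.8
assert in_feasible_set(h_next=0.5, h_curr=1.0, gamma_k=0.6)      # needs >= 0.4
```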
Further, in step two: in the existing control fence function definition, γ_k is a predetermined hyper-parameter; if γ_k is set unreasonably, there may be no feasible solution. Therefore, γ_k in the control fence function is dynamically adjusted according to the actual conditions of the constraint optimization solution of the tasks.
Further, in step three, γ_k in the control fence function is taken as an additional optimization variable; the result is called an optimizable control fence function.
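The effect of treating γ_k as an optimization variable can be sketched numerically. In this illustrative helper (an assumption, not the patent's solver), h_next_max is the best next-step barrier value the controller can reach, and the smallest γ_k keeping the CBF constraint feasible is computed in closed form:

```python
def minimal_feasible_gamma(h_curr, h_next_max):
    """Smallest gamma_k in [0, 1] with h_next_max >= (1 - gamma_k) * h_curr."""
    if h_curr <= 0.0:
        return 0.0  # degenerate case: the bound (1 - gamma_k) * h_curr is non-positive
    return max(0.0, min(1.0, 1.0 - h_next_max / h_curr))

# With h_curr = 1.0 but h_next_max = 0.5, a fixed gamma = 0.2 (requiring
# h_next >= 0.8) is infeasible, while any gamma_k >= 0.5 restores feasibility:
assert minimal_feasible_gamma(1.0, 0.5) == 0.5
```

This is the repair mechanism the optimizable CBF enables: rather than failing when a fixed γ leaves no feasible solution, the optimizer can relax γ_k just enough.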
Further, in step four, according to the description formula of the feasible state set, once γ_k is fixed, h(x_{t+k}) also imposes a constraint on h(x_{t+k+1}): if h(x_{t+k}) is large, then h(x_{t+k+1}) must also take a large value to meet the CBF constraint requirement, whereas the essential requirement of forward invariance is in fact only h(x_{t+k+1}) ≥ 0. The constraint requirement of the control fence function in discrete form is therefore changed to:

h(x_{t+k+1|t}) ≥ S_k, S_k ≥ 0

where S_k is a newly introduced optimization variable substituted for (1 - γ)h(x_t), so that the satisfaction degree of each task constraint h(x_{t+k|t}) is optimized directly. This is called GOCBF; it adjusts multiple task constraints more directly and further enlarges the feasible solution space.
Further, in step five, a preference target related to S_k is introduced into the optimization objective, so that the multi-agent motion path data of the complex dynamic scene obtained by simulation optimization conforms to the preference. Specifically, the optimization objective includes a preference term such as Φ(S) = (S - S_0)^T W_S (S - S_0), so that the motion path obtained by simulation optimization conforms to the preference.
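The quadratic preference term can be sketched as follows; the helper name, the weight matrix W_S, and the preferred profile S_0 are illustrative assumptions, not values from the patent:

```python
def preference_cost(S, S0, W):
    """Phi(S) = (S - S0)^T * W * (S - S0), written without external libraries."""
    d = [a - b for a, b in zip(S, S0)]
    n = len(d)
    return sum(d[i] * W[i][j] * d[j] for i in range(n) for j in range(n))

W_S = [[1.0, 0.0],
       [0.0, 1.0]]                     # identity weights for the illustration
phi = preference_cost([0.4, 0.6], [0.5, 0.5], W_S)
assert abs(phi - 0.02) < 1e-12         # (-0.1)^2 + (0.1)^2
```

Penalizing the deviation of S from a preferred profile S_0 is what lets the optimizer trade off safety margin against the user's preferred behavior (conservative vs. risk-taking paths).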
Another object of the present invention is to provide a multi-agent motion trajectory optimization system based on the discrete-time control fence function, applying the above improved optimization method, the system comprising:
the feasible state set definition module, configured to define a feasible state set according to the control fence function constraint requirement in discretized form;
the variable dynamic adjustment module, configured to dynamically adjust the coefficient γ_k of the control fence function according to the constraint optimization solving conditions of multiple tasks;
the optimizable control fence function acquisition module, configured to take γ_k of the control fence function as an additional optimization variable and obtain an optimizable control fence function;
the constraint requirement changing module, configured to change the constraint requirement of the control fence function in discrete form so that the satisfaction degree of each task constraint h(x_{t+k|t}) is optimized directly;
the motion path preference acquisition module, configured to introduce a preference target related to S_k into the optimization objective, so that the motion path obtained by simulation optimization conforms to the preference.
It is a further object of the present invention to provide a computer device comprising a memory and a processor, the memory storing a computer program which, when executed by the processor, causes the processor to perform the steps of:
defining a feasible state set according to the control fence function constraint requirement in discretized form; dynamically adjusting the coefficient γ_k of the control fence function according to the constraint optimization solving conditions of multiple tasks; taking γ_k of the control fence function as an additional optimization variable to obtain an optimizable control fence function; changing the constraint requirement of the control fence function in discrete form so that the satisfaction degree of each task constraint h(x_{t+k|t}) is optimized directly; and introducing a preference target related to S_k into the optimization objective, so that the motion path obtained by simulation optimization conforms to the preference.
Another object of the present invention is to provide a computer readable storage medium storing a computer program which, when executed by a processor, causes the processor to perform the steps of:
Defining a feasible state set according to the control fence function constraint requirement in discretized form; dynamically adjusting the coefficient γ_k of the control fence function according to the constraint optimization solving conditions of multiple tasks; taking γ_k of the control fence function as an additional optimization variable to obtain an optimizable control fence function; changing the constraint requirement of the control fence function in discrete form so that the satisfaction degree of each task constraint h(x_{t+k|t}) is optimized directly; and introducing a preference target related to S_k into the optimization objective, so that the motion path obtained by simulation optimization conforms to the preference.
The invention further aims to provide an information data processing terminal which is used for realizing the multi-agent motion trail optimization system based on the discrete time control fence function.
Combining all the above technical schemes, the advantages and positive effects of the invention are as follows: the optimizable discrete-time control fence function provided by the invention is more universal and flexible; when multiple task constraints are added, it significantly enlarges the feasible-region space of the optimization algorithm and yields more flexible optimization solutions; its constraint requirement has a clearer physical meaning and is more intuitive and easier to set.
Aiming at the problems that task constraints are limited to geometric space, that the dynamic characteristics of the agents are not considered, and that overall coordination among different task constraints is lacking, the invention proposes the optimizable GOCBF function and combines dynamic physical simulation with motion planning based on MPC-GOCBF. Based on the idea of augmented physical simulation, the improved CBF function provides a unified description of the multi-task constraints of motion planning, so that the explicit analytic solving (or searching) of constraint relations is converted into implicit simulation calculation. A physical simulation environment for multi-agent motion planning is constructed, and the algorithm is tested and verified in it; experimental results demonstrate the flexibility and dynamic adaptability of the algorithm, and the accuracy of the obtained multi-agent motion trajectory data is high.
Drawings
In order to more clearly illustrate the technical solutions of the embodiments of the present invention, the drawings that are needed in the embodiments of the present invention will be briefly described below, and it is obvious that the drawings described below are only some embodiments of the present invention, and other drawings may be obtained according to these drawings without inventive effort for a person skilled in the art.
Fig. 1 is a flowchart of a discrete control fence function improvement optimization method based on a discrete time control fence function according to an embodiment of the present invention.
FIG. 2 is a block diagram of a multi-agent motion trajectory optimization system based on discrete time control fence functions provided by an embodiment of the present invention;
In the figure: 1. feasible state set definition module; 2. variable dynamic adjustment module; 3. optimizable control fence function acquisition module; 4. constraint requirement changing module; 5. motion path preference acquisition module.
Fig. 3 is a schematic diagram of an optimization solving process of a model predictive control algorithm according to an embodiment of the invention.
Fig. 4 is a schematic diagram of generating an optimized movement path of an agent based on augmented physics simulation according to an embodiment of the present invention.
Fig. 5 is a schematic diagram of conservative behavior movement path optimization generation provided by an embodiment of the present invention.
Fig. 5 (a) is a schematic diagram of the agent motion trajectory provided by an embodiment of the present invention.
Fig. 5 (b) is a schematic diagram of the control fence function coefficient γ_k provided by an embodiment of the present invention.
FIG. 6 is a schematic diagram of the generation of an adventure behavior movement path optimization provided by an embodiment of the present invention.
Fig. 6 (a) is a schematic diagram of the agent motion trajectory provided by an embodiment of the present invention.
Fig. 6 (b) is a schematic diagram of the control fence function coefficient γ_k provided by an embodiment of the present invention.
FIG. 7 is a schematic diagram of user-defined behavioral movement path optimization generation provided by an embodiment of the present invention.
Fig. 7 (a) is a schematic diagram of the agent motion trajectory provided by an embodiment of the present invention.
Fig. 7 (b) is a schematic diagram of the control fence function coefficient γ_k provided by an embodiment of the present invention.
FIG. 8 is a schematic diagram of the MPC-GOCBF motion path optimization generation provided by the embodiment of the invention.
Fig. 8 (a) is a schematic diagram of the agent motion trajectory provided by an embodiment of the present invention.
Fig. 8 (b) is a schematic diagram of the h value of the control fence function according to the embodiment of the present invention.
Fig. 9 is a schematic diagram of an adaptive cruise simulation test result (s=25) provided by an embodiment of the present invention.
Fig. 9 (a) is a schematic diagram of a movement speed of a cruising agent according to an embodiment of the present invention.
Fig. 9 (b) is a schematic diagram of a control fence function h according to an embodiment of the present invention.
Fig. 9 (c) is a schematic diagram of an optimized control amount provided by an embodiment of the present invention.
Fig. 9 (d) is a schematic diagram of S values corresponding to the GOCBF provided in the embodiment of the present invention.
Fig. 10 is a schematic diagram of an adaptive cruise simulation test result (s=40) provided by an embodiment of the present invention.
Fig. 10 (a) is a schematic diagram of a movement speed of a cruising agent according to an embodiment of the present invention.
Fig. 10 (b) is a schematic diagram of a control fence function h according to an embodiment of the present invention.
Fig. 10 (c) is a schematic diagram of an optimized control amount provided by an embodiment of the present invention.
Fig. 10 (d) is a schematic diagram of S values corresponding to the GOCBF provided in the embodiment of the present invention.
Fig. 11 is a schematic diagram of a multi-agent collision avoidance experiment according to an embodiment of the present invention.
FIG. 11 (a) is a screenshot of a simulation process provided by an embodiment of the invention.
Fig. 11 (b) is a schematic diagram of a motion track of an obstacle avoidance agent according to an embodiment of the present invention.
Detailed Description
The present invention will be described in further detail with reference to the following examples in order to make the objects, technical solutions and advantages of the present invention more apparent. It should be understood that the specific embodiments described herein are for purposes of illustration only and are not intended to limit the scope of the invention.
Aiming at the problems existing in the prior art, the invention provides an improved optimization method of a discrete control fence function based on a discrete time control fence function, and the invention is described in detail below with reference to the accompanying drawings.
As shown in fig. 1, the discrete control fence function improvement optimization method based on the discrete time control fence function provided by the embodiment of the invention comprises the following steps:
S101, defining a feasible state set according to control fence function constraint requirements in a discretization form;
S102, dynamically adjusting the coefficient γ_k of the control fence function according to the constraint optimization solving conditions of multiple tasks;
S103, taking γ_k of the control fence function as an additional optimization variable to obtain an optimizable control fence function;
S104, changing the constraint requirement of the discrete-form control fence function so that the satisfaction degree of each task constraint h(x_{t+k|t}) is optimized directly;
S105, introducing a preference target related to S_k into the optimization objective, so that the motion path obtained by simulation optimization conforms to the preference.
In step S105, the multi-agent motion path data of the complex dynamic scene obtained by the simulation optimization is thereby made to conform to the preference.
As shown in fig. 2, the multi-agent motion trajectory optimization system based on discrete time control fence function provided by the embodiment of the invention includes:
the feasible state set definition module 1, used for defining a feasible state set according to the control fence function constraint requirement in discretized form;
the variable dynamic adjustment module 2, used for dynamically adjusting the coefficient γ_k of the control fence function according to the constraint optimization solving conditions of multiple tasks;
the optimizable control fence function acquisition module 3, used for taking γ_k of the control fence function as an additional optimization variable and obtaining an optimizable control fence function;
the constraint requirement changing module 4, used for changing the constraint requirement of the control fence function in discrete form so that the satisfaction degree of each task constraint h(x_{t+k|t}) is optimized directly;
the motion path preference acquisition module 5, used for introducing a preference target related to S_k into the optimization objective, so that the motion path obtained by simulation optimization conforms to the preference.
The technical scheme of the invention is further described below with reference to specific embodiments.
Examples: optimized discrete time control fence function design
Figure SMS_23
In the form of discretization of:
h t+Δt -h t ≥-γh t (1)
according to the control fence function constraint requirement of a discretization form, the invention defines a feasible state set corresponding to a certain simulation time point t+k as follows:
S CBF,k ={x∈X:h(x t+k+1 )≥(1-γ k )h(x t+k )} (2)
according to equation (2), the range of the feasible state set is defined by h (x t+k ) And gamma k Co-determination, obviously gamma k The larger the time for the next time h (x t+k+1 ) The lower the demand on (2), and conversely, gamma k The smaller the time for the next time h (x t+k+1 ) The higher the demand for (2). In the existing control fence function definition, gamma k For a pre-determined hyper-parameter. If gamma is k The arrangement is not reasonable enough, which results in the situation of no feasible solution. The invention therefore proposes to control gamma in the fence function k Dynamic adjustment is carried out according to the actual condition of the constraint optimization solution of a plurality of tasks. According to the above thought, the invention further controls the gamma in the fence k As a variable for the generation of optimization, it is called an Optimizable control fence function (Optimizable CBF).
According to the description formula (1) of the feasible state set, at gamma k After fixing, h (x t+k ) Will also be applied to h (x t+k+1 ) Applying a constraint if h (x t+k ) Larger, then h (x t+k+1 ) It must also take a larger value to meet the CBF constraint requirement, whereas the essential requirement for forward invariance is in fact h (x t+k+1 ) 0, therefore the invention further proposes to change the control fence function constraint requirements in discrete form to:
Figure SMS_24
in the formula (3), S k The newly introduced optimization variables are used for replacing (1-gamma) h (x t ) Thus, the invention can directly process h (x t+k|t ) The method and the device optimize the satisfaction degree of the task constraints, namely GOCBF (General and Optimizable CBF), so that the plurality of task constraints are more directly adjusted, and the range of a feasible solution space is effectively improved. Also, the present invention introduces S in the optimization objective k Related preference targets, e.g. phi (S k )=(S T -S 0 )W S (S-S 0 ) Thereby enabling the motion path obtained by simulation optimization to accord with the preference.
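As a toy illustration of one GOCBF optimization step (an assumed 1-D example, solved by brute-force grid search for clarity rather than the QP solver a real implementation would use; all names and values are hypothetical):

```python
def gocbf_step(x, S0, w_pref=1.0, grid=201):
    """One GOCBF step on x_{t+1} = x + u with barrier h(x) = x (keep x >= 0).
    Jointly picks control u and bound S subject to h(x + u) >= S, S >= 0,
    minimizing control effort plus the preference term (S - S0)^2."""
    best = None
    for i in range(grid):
        u = -1.0 + 2.0 * i / (grid - 1)       # candidate u in [-1, 1]
        h_next = x + u
        for j in range(grid):
            S = 2.0 * j / (grid - 1)          # candidate S in [0, 2]
            if h_next < S:
                continue                      # GOCBF constraint h_next >= S
            cost = u * u + w_pref * (S - S0) ** 2
            if best is None or cost < best[0]:
                best = (cost, u, S)
    return best[1], best[2]

u, S = gocbf_step(x=0.5, S0=0.5)
assert 0.5 + u >= S >= 0.0                    # constraint holds at the optimum
```

Because S is a free variable bounded only below by zero, the optimizer can place the safety margin wherever the preference term and control effort dictate, rather than being tied to (1 - γ)h(x_t).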
The technical scheme of the invention is further described below in connection with simulation experiments.
Simulation experiment: motion planning multitask constraint unified description based on augmented physics simulation
The basic goal of motion planning is to find a feasible path from a starting state to a target state in a specific state space. In essence, a search algorithm is guided toward the target point by means of partial a priori heuristic knowledge (specific constraints, or even completely random sampling). This is a typical high-dimensional, ill-posed, redundant problem: many feasible solutions exist at a given space-time point, and a better solution must be selected from them based on the existing task constraint requirements. At this point, a proper description of the various task constraints becomes the key to solving the problem, and directly determines the solving efficiency of the path planning algorithm and the quality of the final planned path.
Conventional motion planning algorithms limit task constraints to geometric space and generate paths directly by judging the satisfaction of geometric constraint relationships. The physical limitations of the agent (such as speed and acceleration constraints) are often not considered; lacking consideration of the agent's dynamic characteristics, the physical feasibility of the motion paths cannot be guaranteed. In addition, geometric constraint relationships such as obstacle avoidance and reaching a specific target point are often described directly by inequalities, lacking overall coordination among different task constraints. At present, facing the multi-agent motion planning problem of complex dynamic scenes, the dimension of the problem grows sharply, and the traditional methods can hardly meet the requirements on computational efficiency and motion detail.
The method specifically comprises the following steps:
1. In constraint-based motion planning path optimization generation algorithms, the overall problem is generally described as an optimization problem. The various task constraint requirements are often described directly by geometric and physical inequality relationships, such as keeping the distance to the obstacle d > 0, or limiting the speed to a certain range v_min ≤ v ≤ v_max, and the path result is then obtained directly by optimization. On the solving side, both gradient-free intelligent optimization algorithms with heuristic characteristics and gradient-based optimization methods have been applied. Owing to their inherent random exploration, heuristic intelligent optimization algorithms cannot guarantee a good result every time; gradient-based optimization methods easily fall into local optima and depend strongly on the Initial Guess, so an inadequately set initial value makes a good planning result hard to obtain. To ensure that a gradient-based optimization algorithm always obtains a good solution, one approach first generates a rough guess path with the A* algorithm, takes this rough path as the initial value of the subsequent optimization, sets the optimization target to minimize the deviation between the final optimized value and the initial estimated path, and finally generates the final path by optimization under the specific task constraints. This approach resolves the strong dependence of gradient-based optimization on the initial estimate, but since a global initial path must be generated first, it is only suitable for off-line global path optimization and cannot generate planned paths online in real time in complex dynamic environments.
To enable online dynamic generation of planned paths, many researchers have proposed using model predictive control (Model Predictive Control, also called Receding Horizon Control) to realize dynamic, constraint-optimization-based path generation. The optimization process of a model predictive control algorithm with prediction horizon N can be described as:

min over u_{t:t+N-1|t} of Σ_{k=0}^{N-1} c(x_{t+k|t}, u_{t+k|t})    (a)
s.t. x_{t+k+1|t} = f(x_{t+k|t}, u_{t+k|t}), k = 0, …, N-1    (b)
x_{t+k|t} ∈ X, u_{t+k|t} ∈ U, k = 0, …, N-1    (c)
x_{t|t} = x_t, x_{t+N|t} ∈ X_f    (d)    (1.1)

In equation (1.1), (a) is the control objective to be optimized; (b) is the discretized dynamic model of the agent; (c) collects the state and control-input constraints the agent must satisfy at each discrete time point; and (d) specifies the initial and terminal states of the agent over the optimization horizon. In the model predictive control optimization, the decision variable is the control sequence, i.e., the control inputs u_{t|t}, …, u_{t+N-1|t} at the N time points of the prediction horizon. In actual execution, however, only the control at the first time point, u_{t|t}, is applied to the real system, which yields the system state x_{t+1} at the next instant; that state is then taken as the new initial state and the optimization is repeated. The whole process is shown in fig. 3.
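The receding-horizon loop described above (solve over N steps, apply only the first control, re-solve from the new state) can be sketched with a deliberately tiny brute-force solver. The 1-D integrator dynamics, the candidate control grid, and the quadratic cost weights below are illustrative assumptions, not the patent's model:

```python
import itertools

def simulate(x, u_seq):
    """Roll the 1-D integrator x_{k+1} = x_k + u_k through a control sequence."""
    xs = [x]
    for u in u_seq:
        xs.append(xs[-1] + u)
    return xs

def mpc_step(x, horizon, u_candidates):
    """Brute-force MPC: score every control sequence of length `horizon`
    with a quadratic cost and return only the FIRST control of the best
    sequence (the receding-horizon principle)."""
    best_cost, best_seq = float("inf"), None
    for seq in itertools.product(u_candidates, repeat=horizon):
        xs = simulate(x, seq)
        cost = sum(xi ** 2 for xi in xs) + 0.1 * sum(ui ** 2 for ui in seq)
        if cost < best_cost:
            best_cost, best_seq = cost, seq
    return best_seq[0]

# Closed loop: only u_{t|t} is applied, then the problem is re-solved.
x = 3.0
for _ in range(10):
    u = mpc_step(x, horizon=4, u_candidates=(-1.0, -0.5, 0.0, 0.5, 1.0))
    x = x + u
```

The loop drives the state to the origin; the exhaustive search stands in for the QP/NLP solver an actual MPC implementation would use.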
Based on the model predictive control method, the various task requirements arising in motion planning can conveniently be added to the optimization as constraint conditions. Moreover, by considering the agent's possible motion trajectory over a future period when solving, model predictive control effectively improves the quality of the planned path and avoids short-sighted behavior. However, further improving planning quality requires increasing the prediction horizon N, so as the number of agents and task-constraint requirements grows, the solution efficiency of the system becomes hard to guarantee. Under the model predictive control framework, therefore, how to describe the various task constraints appropriately becomes the key to solving the problem quickly.
With the rapid growth of computer processing speed, physical simulation has been widely applied to reproducing natural phenomena: once the various physical constraints are given, the system automatically generates physically detailed effects satisfying those constraints through dynamic simulation calculation, which is flexible and convenient. Pure physical simulation, however, lacks an effective control mechanism; a goal controller must be constructed to ensure that the simulation result evolves toward the intended target. Motion-planning algorithms, by contrast, have a natural advantage in long-range goal control, but for multi-agent navigation behavior generation in complex dynamic scenes the problem dimension grows sharply, and generating all the configuration-space (Configuration Space) state information of the agents during motion cannot be done efficiently by a planning algorithm alone. This complementarity makes a proper combination of the two ideas possible. The invention therefore combines dynamic physical simulation with motion planning: motion planning is responsible for goal control; the various index constraints are described as real or virtual physical constraints in the workspace (Workspace); traditional real-dynamics physical simulation is extended and augmented; a unified description of the multi-task constraints of motion planning is realized on the basis of this augmented physical simulation; and the explicit analytic solution (or search) of constraint relations is converted into implicit simulation calculation, thereby fusing the advantages of both.
The recently proposed control fence function (Control Barrier Function, CBF), borrowing the idea of the control Lyapunov function (Control Lyapunov Function), converts the explicit solution of state-based task constraints into the optimal generation of control inputs; satisfaction of the task constraints is guaranteed indirectly by the optimized control inputs, which greatly improves computational efficiency. Prior work has compared the control fence function with the artificial potential field method, showing that the artificial potential field method is a special case of the control fence function and that the overall performance of the control fence function is superior to that of the artificial potential field method.

From the viewpoint of dynamic physical simulation, the control fence function converts specific task constraints into the control inputs of the dynamic simulation process, providing a well-suited tool for the augmented physical simulation proposed by the invention. On this basis, the invention analyzes the unified description of multi-task constraints and the augmented-physical-simulation generation of motion paths under the conceptual framework combining model predictive control and the control fence function (MPC-CBF).
2. Description of the problem
The invention first assumes that all agents move on flat terrain and that each agent is simplified to a cylinder, so the global motion path of each agent reduces to a two-dimensional path-planning problem. Each agent A_i is represented by a circle of radius r_i (i = 1, …, N). At any time t, the position of agent A_i is denoted x_i(t), its velocity v_i(t), and its acceleration a_i(t); velocity and acceleration are bounded by maximum values, i.e., ‖v_i(t)‖ ≤ v_i^max and ‖a_i(t)‖ ≤ a_i^max, where v_i^max denotes the maximum speed the agent can reach and a_i^max the maximum acceleration it can reach.
The aim of the invention is to enable an agent to generate a dynamic path in real time, according to the environmental constraint requirements, while moving from its start position to a designated target area, without colliding with obstacles in the environment or with the other agents. The whole process can be described by the following optimization framework:

min over u of J(u, x) = ∫ c(x(t), u(t)) dt
s.t. ẋ(t) = f(x(t), u(t))
x(0) = x_0, x_e = x_g
x(t) ∈ X, u(t) ∈ U_adm(x(t))    (2.1)

In equation (2.1), ẋ = f(x, u) is the dynamic model of the agent; x(0) = x_0 and x_e = x_g are the start and goal states of the agent, respectively; x(t) ∈ X and u(t) ∈ U_adm(x(t)) express the state and control-input constraint requirements on the agent; and J(u, x) is the optimization objective, i.e., the performance-index measurement function of the planned path, where c(x, u) is the cost function, generally expressed as:
c(x, u) = Q(x) + uᵀRu    (2.2)
In equation (2.2), Q(x) and uᵀRu represent costs on the state and the control input of the agent, respectively, where R = Rᵀ > 0 is a symmetric positive-definite matrix. In the path-optimization generation process, the invention expects the total cost to be as small as possible under the precondition that the task constraints are satisfied.
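A direct transcription of the stage cost (2.2) for a 2-D state and control; taking Q(x) as the squared state norm and R as the identity are illustrative choices, not values fixed by the patent:

```python
import numpy as np

R = np.eye(2)  # R = R^T > 0: symmetric positive-definite control weight

def stage_cost(x, u, Q=lambda x: float(x @ x)):
    """c(x, u) = Q(x) + u^T R u, as in equation (2.2)."""
    return Q(x) + float(u @ R @ u)

c = stage_cost(np.array([1.0, 2.0]), np.array([0.5, 0.0]))
```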
3. Control fence function background knowledge
The definition of the control fence function is closely related to the concept of a control invariant set. If there exists a control law π: X → U such that for any initial condition x(0) ∈ IS the state satisfies x(t) ∈ IS for all t ≥ 0, the set IS is called a control invariant set. Given a closed set C defined by a continuously differentiable function h: ℝⁿ → ℝ, assume that the set C satisfies:

C = {x : h(x) ≥ 0}, ∂C = {x : h(x) = 0}, Int(C) = {x : h(x) > 0}    (3.1)

In equation (3.1), Int(C) and ∂C denote the interior and the boundary of the set C, respectively. If for all x ∈ C there exists u ∈ U satisfying:

sup over u ∈ U of [ ḣ(x, u) + γ(h(x)) ] ≥ 0    (3.2)
then h(x) is called a control fence function. In equation (3.2), γ(·) is a class-K function, i.e., γ(·) is strictly monotonically increasing with γ(0) = 0; in practical applications γ(·) is often taken as a linear function with a constant coefficient, i.e., γ(h(x)) = γh(x). From definition (3.2), if the initial value satisfies h(x(0)) ≥ 0, then since ḣ ≥ −γh(x), h decays no faster than an exponential, so h(x) ≥ 0 is always guaranteed by forward invariance; that is, the set {x : h(x) ≥ 0} is a control invariant set of the system.
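The forward-invariance argument can be checked numerically in discrete time: if every step enforces the decay condition h_{t+1} ≥ (1 − γ)h_t with 0 < γ ≤ 1, then starting from h ≥ 0 the value can only decay geometrically and never becomes negative. A minimal sketch (γ = 0.3 and the initial h are arbitrary illustrative choices):

```python
# Worst case allowed by the constraint: h decays exactly by the factor (1 - gamma)
# at every step. It approaches zero but never crosses it.
gamma = 0.3
h = 5.0
trace = [h]
for _ in range(50):
    h = (1 - gamma) * h
    trace.append(h)
```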
In addition, the control fence function is closely related to the control Lyapunov function. In nonlinear control, if the stability of the system is to be guaranteed, i.e., x(t) → 0, then rather than judging stability by solving the system state directly, a controller is constructed via a control Lyapunov function: if the controller can make a positive-definite function V(x) > 0 approach zero, the stability of the system is established indirectly. If the controller satisfies dV/dt ≤ −α(V(x)), it can be shown that the system converges stably, with V(x*) = 0, i.e., x* = 0. By means of the function V, and by constructing a controller that makes V decrease, the stability of the system is thus indirectly guaranteed. Following the same idea, the control fence function indirectly guarantees the forward invariance of the system by constructing a controller: when a control fence function is used, a task constraint is described as h(x) ≥ 0, and satisfaction of the task constraint is ensured indirectly by optimizing the control input, relying on the forward invariance conferred by the control fence function. It is worth noting that the control Lyapunov function requires dV/dt ≤ −α(V(x)) rather than merely dV/dt ≤ 0, so as to guarantee the convergence rate of the system; similarly, the control fence function requirement ḣ(x, u) ≥ −γ(h(x)) is only a conservative subset of the system's forward invariance, which may leave the optimization algorithm without a feasible solution when multiple task constraints expressed as control fence functions are applied simultaneously.
4. Task constraint unified description and dynamics physical simulation optimization generation based on optimizable control fence function
Based on the control fence function, an explicit task-constraint satisfaction test can be converted into an implicit dynamic-simulation calculation; however, the description of the multiple task constraints must remain feasible, so that the optimization does not fail to find a feasible solution. The invention therefore proposes an optimizable control fence function based on the concept of the control fence function and, building on the model predictive control optimization framework, an improved optimization solution under discrete-time conditions according to the dynamics-simulation characteristics of the optimizable control fence function.
4.1 unified description of task constraints based on control fence functions
Based on the definition of the control fence function, the invention first needs to construct the invariant safe set, through which task constraints are expressed indirectly.
(1) Description of hard constraints
Hard constraints, which the agent must satisfy at all times during its motion, can be described directly with control fence functions.
For example, for a collision-avoidance task constraint, the control fence function may be set as:
h1(x) = d(x_i, o_i) − (r_i + r_o) ≥ 0    (4.1)
In equation (4.1), d(x_i, o_i) is the distance between the agent and the obstacle o, and r_i, r_o are the radii of the agent and the obstacle, respectively. Thus, when the motion path is optimally generated, the invention ensures by optimization that the control input u of the agent satisfies ḣ1(x, u) ≥ −γ(h1(x)); by forward invariance, h1(x) ≥ 0 then always holds, and the agent does not collide with the various obstacles in the scene.
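A minimal sketch of how a distance-type fence function like (4.1) filters control inputs in discrete time; the candidate-velocity grid, γ, and step size below are illustrative assumptions, and the grid search stands in for a proper optimization over u:

```python
import math

def h_collision(x_agent, x_obs, r_agent, r_obs):
    """h1(x) = d(x_i, o_i) - (r_i + r_o); h1 >= 0 means the discs do not overlap."""
    return math.dist(x_agent, x_obs) - (r_agent + r_obs)

def admissible(x_agent, v_candidates, x_obs, r_agent, r_obs, gamma=0.5, dt=1.0):
    """Discrete CBF filter: keep only velocities whose next state satisfies
    h(x_{k+1}) >= (1 - gamma) * h(x_k)."""
    h_now = h_collision(x_agent, x_obs, r_agent, r_obs)
    ok = []
    for v in v_candidates:
        x_next = (x_agent[0] + v[0] * dt, x_agent[1] + v[1] * dt)
        if h_collision(x_next, x_obs, r_agent, r_obs) >= (1 - gamma) * h_now:
            ok.append(v)
    return ok
```

A velocity that would close the gap to the obstacle faster than the allowed decay rate is rejected even before an actual collision occurs.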
For two-sided range constraints, such as the bounds on the agent's own speed, acceleration, and control input, the description can be achieved by adding control fence functions in two forms. For example, the speed range constraint ‖v‖ ≤ v_max, i.e., −v_max ≤ v ≤ v_max, corresponds to the pair of control fence functions:

h2(x) = v_max − v ≥ 0, h3(x) = v + v_max ≥ 0    (4.2)

Using equation (4.2), a two-sided range task constraint is described with two control fence functions.
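The two-sided bound can be checked as a pair of one-sided fence functions (the forms v_max − v ≥ 0 and v + v_max ≥ 0 are assumed here, matching the symmetric-range case; v_max is an arbitrary illustrative value):

```python
v_max = 2.0  # assumed speed bound, for illustration only

def h_upper(v):
    """h2 = v_max - v >= 0 enforces v <= v_max."""
    return v_max - v

def h_lower(v):
    """h3 = v + v_max >= 0 enforces v >= -v_max."""
    return v + v_max
```

Requiring both h2 ≥ 0 and h3 ≥ 0 is exactly the two-sided constraint |v| ≤ v_max.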
(2) Description of Soft constraints
A soft constraint is a target that should be achieved as far as possible, or gradually, during the agent's motion. On one hand, when a soft constraint can be fully satisfied, its constraint requirement should be met to the greatest possible extent; on the other hand, when it conflicts with other constraints, its satisfaction degree must be reduced appropriately so that the agent as a whole satisfies as many task constraints as possible. When soft constraints conflict with each other, the priority relation among the different constraints must be respected: task constraints with high priority should be satisfied to a higher degree, while soft task constraints with low priority may appropriately relax their requirements. In the invention, soft constraints are described in two ways within the MPC-CBF optimization framework:
The first way is to describe the soft constraint in the cost function to be optimized. On the global level, soft constraints the agent should satisfy are converted into target quantities in the cost function, so that during the optimized generation of the agent's motion path the optimization calculation ensures the corresponding soft constraints are satisfied as far as possible.
For example, the task requirement that the control input applied during the agent's motion be as small as possible (energy optimality) can be described by adding a uᵀRu term to the cost function; the task constraint of reaching a particular target point can be expressed by adding a term of the form (x_i − x_g)², guiding the agent gradually toward the target state.
The second way is to describe the soft constraint directly in control fence function form. For more general soft task constraints, on the local level, a relaxation (slack) variable is added directly to the corresponding control fence function constraint requirement; the constraint requirement (3.2) then becomes:

ḣ(x, u) ≥ −γ(h(x)) − ε    (4.3)

In equation (4.3), ε > 0 is the relaxation variable; adding ε further relaxes the requirement ḣ(x, u) ≥ −γ(h(x)). When several soft and hard constraints are optimized and solved together, the satisfaction degree of each soft constraint is adjusted automatically through the value of ε: the smaller ε, the higher the satisfaction of the corresponding soft constraint; conversely, the larger the ε obtained by the optimization, the lower the satisfaction of the corresponding soft constraint.
When multiple soft task constraints must be applied simultaneously during motion planning, a priority order among the different soft constraints needs to be set. To model the priority order of the different soft constraints, the invention adds optimization terms for the corresponding relaxation variables to the cost function:

J′ = J + εᵀWε, W = diag(w_1, …, w_n)    (4.4)

In equation (4.4), the larger w_i, the smaller the corresponding ε_i in the optimization solution process, and the higher the satisfaction of the corresponding soft constraint. The relative sizes of the w_i therefore intuitively express the importance of the different soft task constraints, and the priorities of the soft constraints are described directly through the coefficients w_i of the diagonal matrix W: a high-priority soft task constraint is given a larger w_i, while a low-priority one is given a smaller w_i.
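The effect of the weights w_i can be seen on a toy conflict between two soft constraints that must share a fixed violation budget c. The budget constraint ε_1 + ε_2 = c is an illustrative assumption, not part of the patent's formulation; the closed-form split follows from the Lagrange condition w_1·ε_1 = w_2·ε_2 for minimizing w_1·ε_1² + w_2·ε_2²:

```python
def split_slack(w1, w2, c):
    """Minimize w1*e1^2 + w2*e2^2 subject to e1 + e2 = c.
    The optimal slack is inversely proportional to the weight."""
    e1 = c * w2 / (w1 + w2)
    e2 = c * w1 / (w1 + w2)
    return e1, e2

# The high-priority constraint (w1 = 10) keeps the smaller slack.
e1, e2 = split_slack(w1=10.0, w2=1.0, c=1.1)
```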
4.2 agent motion Path optimization Generation under augmented physical simulation framework
Conventional motion-planning algorithms confine task constraints to geometric space and generate paths directly by testing the satisfaction of geometric constraint relations, so the planning results satisfy the specific tasks insufficiently. The invention instead builds a physical modeling expression of the various task constraints, based on the control fence function, and realizes implicit solution of the task constraints through physical simulation calculation; the specific idea is shown in fig. 4.
First, following the idea of chapter three of describing temporal-logic requirements and spatial constraints separately, the temporal-logic relations of the various spatio-temporal task constraints are described with a DFA; the spatial constraints are then converted, via CBFs, into the optimal solution of the control inputs required by the corresponding dynamic physical simulation; the physical limitations of the agent itself are likewise described through CBFs; and the specific task targets are described through the cost function. The whole augmented-physical-simulation framework can finally be converted into the following optimization framework:
min over u of J(u, x)
s.t. ẋ = f(x, u), x(0) = x_0, x ∈ X, u ∈ U_adm(x)
ḣ_i(x, u) ≥ −γ(h_i(x)), i = 1, …, m
ḣ_j(x, u) ≥ −γ(h_j(x)) − ε_j, j = m + 1, …, n    (4.5)

In equation (4.5) there are n task constraints in total, where h_i (i = 1, …, m) are hard constraints described via CBFs and h_j (j = m + 1, …, n) are soft constraints described via CBFs. In the optimization solution process, equation (4.5) is converted from continuous to discretized form and solved numerically under the MPC framework; according to equation (1.1), the discrete form of equation (4.5) under the MPC optimization framework is:

min over u_{t:t+N-1|t} of Σ_{k=0}^{N-1} c(x_{t+k|t}, u_{t+k|t})
s.t. x_{t+k+1|t} = f(x_{t+k|t}, u_{t+k|t}), k = 0, …, N-1
x_{t+k|t} ∈ X, u_{t+k|t} ∈ U, k = 0, …, N-1
x_{t|t} = x_t
Δh(x_{t+k|t}, u_{t+k|t}) ≥ −γh(x_{t+k|t}), k = 0, …, N-1    (4.6)
In equation (4.6), Δh(x_{t+k|t}, u_{t+k|t}) := h(x_{t+k+1|t}) − h(x_{t+k|t}), and Δh(x_{t+k|t}, u_{t+k|t}) ≥ −γh(x_{t+k|t}) is the discretized version of ḣ(x, u) ≥ −γh(x). Indeed, approximating ḣ ≈ (h_{t+Δt} − h_t)/Δt gives:

(h_{t+Δt} − h_t)/Δt ≥ −γ·h_t    (4.7)
h_{t+Δt} − h_t ≥ −γ·Δt·h_t    (4.8)

Since the simulation calculation step Δt is a fixed value, equation (4.8) can be simplified to h_{t+Δt} − h_t ≥ −γh_t, which amounts to replacing the original γ by the new value γ·Δt; the new value of γ satisfies 0 < γ ≤ 1.
According to equation (4.5), planning a motion path that satisfies the specific task constraints is converted into a dynamic simulation calculation, with the required control inputs generated by optimization under the MPC framework. Owing to the inherent conservatism of the control fence function (it satisfies only a conservative subset of the system's forward invariance), situations may arise in which no feasible control input exists when n task constraints are applied simultaneously. To improve the feasibility of the system, the invention further proposes an optimizable control fence function.
According to the discrete-form control fence function constraint requirement, the invention defines the feasible state set at simulation time point t + k as:

S_CBF,k = {x ∈ X : h(x_{t+k+1}) ≥ (1 − γ_k) h(x_{t+k})}    (4.9)

According to equation (4.9), the range of the feasible state set is determined jointly by h(x_{t+k}) and γ_k: the larger γ_k, the weaker the requirement on h(x_{t+k+1}) at the next step; conversely, the smaller γ_k, the stronger that requirement. In the existing control fence function definition, γ_k is a predetermined hyperparameter, and an unreasonable setting of γ_k can leave the problem without a feasible solution. The invention therefore proposes to adjust γ_k in the control fence function dynamically, according to the actual situation of the multi-task constrained optimization. Following this idea, the invention takes γ_k as an optimization variable, yielding what is here called the Optimizable control fence function (Optimizable CBF); equation (4.6) then becomes:

min over u, γ of Σ_{k=0}^{N-1} c(x_{t+k|t}, u_{t+k|t}) + Φ(γ_t)
s.t. x_{t+k+1|t} = f(x_{t+k|t}, u_{t+k|t}), k = 0, …, N-1
x_{t+k|t} ∈ X, u_{t+k|t} ∈ U, x_{t|t} = x_t
Δh(x_{t+k|t}, u_{t+k|t}) ≥ −γ_k h(x_{t+k|t}), 0 < γ_k ≤ 1    (4.10)

Here Φ(γ_t) is a further introduced optimization objective that can take different forms according to specific needs. The invention develops and defines the following three forms:
(1) Preference for conservative behavior

If the agent's motion trajectory is required to be conservative, satisfying the constraint requirements to as high a degree as possible, the γ_k at each time step should be as small as possible; the invention therefore defines the optimization objective:

Φ(γ_t) = γᵀ W_γ γ    (4.11)

With this objective, the optimization tends to make each γ_k as small as possible, thereby generating conservative behavior.
(2) Preference for risk-taking behavior

If the agent's motion trajectory is required to take risks, completing the task while only minimally satisfying the constraint requirements, the γ_k at each time step should be as large as possible; the invention therefore defines the optimization objective:

Φ(γ_t) = (γ − 1)ᵀ W_γ (γ − 1)    (4.12)

With this objective, the optimization tends to make each γ_k as large as possible, thereby generating risk-taking behavior.
(3) User-defined behavior

If the user has a specific requirement on γ_k, namely γ_t = γ_0, the invention defines the optimization objective:

Φ(γ_t) = (γ − γ_0)ᵀ W_γ (γ − γ_0)    (4.13)

With this objective, the optimization tends to keep each γ_k as close to γ_0 as possible, thereby realizing user-defined behavior generation.
According to the feasible-state-set description in equation (4.9), once γ_k is fixed, h(x_{t+k}) also imposes a constraint on h(x_{t+k+1}): if h(x_{t+k}) is large, then h(x_{t+k+1}) must also take a large value to satisfy the CBF constraint requirement, whereas the essential requirement of forward invariance is in fact only h(x_{t+k+1}) ≥ 0. The invention therefore further proposes to change the discrete-form control fence function constraint requirement to:

h(x_{t+k+1|t}) ≥ S_k, S_k ≥ 0    (4.14)

In equation (4.14), S_k is a newly introduced optimization variable replacing (1 − γ)h(x_{t+k|t}). The invention can thus optimize the satisfaction degree h(x_{t+k|t}) of the task constraints directly; this variant, called GOCBF (General and Optimizable CBF), adjusts the multiple task constraints more directly and effectively enlarges the feasible solution space. Likewise, the invention introduces an S_k-related preference term in the optimization objective, e.g., Φ(S) = (S − S_0)ᵀ W_S (S − S_0), so that the motion path obtained by simulation optimization conforms to the desired preference.
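A single-step, 1-D sketch of the GOCBF idea: the lower bound S on h(x_{k+1}) is itself a decision variable, penalized for deviating from a preferred S_0. The scalar world, the grid search (standing in for the QP solver used in the patent), and the cost weights are illustrative assumptions:

```python
u_grid = [i / 10 for i in range(-10, 11)]  # candidate controls in [-1, 1]

def gocbf_step(x, goal, obstacle, r, S0, w_S):
    """One GOCBF step in a 1-D world: pick (u, S) with S <= h(x_next),
    minimizing tracking cost plus the preference penalty w_S * (S - S0)^2."""
    best = None
    for u in u_grid:
        x_next = x + u
        h_next = abs(x_next - obstacle) - r  # distance-type fence function
        if h_next < 0:                       # hard floor: never enter the obstacle
            continue
        S = min(S0, h_next)                  # best admissible S given S <= h_next
        cost = (x_next - goal) ** 2 + w_S * (S - S0) ** 2
        if best is None or cost < best[0]:
            best = (cost, u, S)
    return best[1], best[2]

# Small w_S: the agent presses toward the goal and lets S collapse toward 0.
u_greedy, S_greedy = gocbf_step(x=0.0, goal=5.0, obstacle=2.0, r=1.0, S0=2.0, w_S=0.01)
# Large w_S: honoring the preferred clearance S0 dominates, so the agent backs off.
u_safe, S_safe = gocbf_step(x=0.0, goal=5.0, obstacle=2.0, r=1.0, S0=2.0, w_S=100.0)
```

The same hard floor h ≥ 0 holds in both cases; only the automatically chosen S, and hence the clearance preference, changes.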
5. Simulation results and analysis
In order to test and verify the MPC-OCBF and MPC-GOCBF augmented-physical-simulation frameworks proposed by the project, the invention carries out simulation experiments on the algorithms in MATLAB, using the YALMIP optimization modeling language and the IPOPT solver.
5.1 Agent motion path optimization calculation to generate diverse behaviors
Consider a point agent with second-order dynamics:

X_{k+1} = A X_k + B U_k    (5.1)

where the state variable X_k = [x, y, v_x, v_y]ᵀ comprises the position and velocity of the agent in the two-dimensional plane, U_k = [u_x, u_y]ᵀ is the control input required by the system, and the state-transition matrices A, B take the standard discrete double-integrator form:

A = [ I₂  Δt·I₂ ; 0₂  I₂ ], B = [ (Δt²/2)·I₂ ; Δt·I₂ ]    (5.2)

with Δt the discretization step.
Due to the physical limitations of the agent, the physical constraints to be satisfied are:

x_min ≤ X_k ≤ x_max, u_min ≤ U_k ≤ u_max    (5.3)

In the simulation experiment, x_max, x_min = ±5·I_{4×1} and u_max, u_min = ±1·I_{2×1}, where I_{n×1} is an n × 1 vector of ones. During its motion the agent must avoid an obstacle of radius r_obs = 1.5; the corresponding control fence function is defined as:

h(X_k) = (x − x_obs)² + (y − y_obs)² − r_obs²    (5.4)
The initial and target positions of the agent are (−5, −5) and (0, 0), and the obstacle is located at (x_obs, y_obs) = (−2, −2.25). The cost function in the optimization process is defined as:

c(x, u) = x_kᵀ Q x_k + u_kᵀ R u_k + x_Nᵀ P x_N    (5.5)

In equation (5.5), Q = 10·I_{4×4}, R = I_{2×2}, P = 100·I_{4×4}, and the prediction window length (MPC horizon) is N = 8.
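Equation (5.1) can be instantiated as the standard discrete double integrator; the sampling time Δt = 0.1 s below is an assumed value, not one stated in this excerpt of the patent:

```python
import numpy as np

dt = 0.1  # assumed sampling time
# Second-order (double-integrator) agent: state [x, y, vx, vy], input [ux, uy].
A = np.block([[np.eye(2), dt * np.eye(2)],
              [np.zeros((2, 2)), np.eye(2)]])
B = np.block([[0.5 * dt ** 2 * np.eye(2)],
              [dt * np.eye(2)]])

X0 = np.array([-5.0, -5.0, 0.0, 0.0])  # initial position used in the experiment
U = np.array([1.0, 1.0])               # one step of maximal control input
X1 = A @ X0 + B @ U                    # velocity integrates the input, position the velocity
```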
(1) Conservative behavior movement path optimization generation
During the simulation experiments, the invention uses fixed coefficients γ_k (0.01 and 0.1) and, separately, the method of making γ_k in the optimization objective as small as possible, to carry out dynamic physical-simulation optimization generation of the motion path; the simulation results are shown in fig. 5.

As can be seen from fig. 5(b), with the MPC-OCBF algorithm framework proposed by the invention, the coefficient γ_k adjusts automatically to the actual course of the optimization calculation: when reaching the target as soon as possible conflicts with the collision-avoidance objective, the framework automatically raises γ_k to relax the constraint requirement, satisfying the other constraints (traveling as short a path as possible) while still guaranteeing that the collision-avoidance constraint is met; at other times the coefficient γ_k stays near zero, so the collision-avoidance constraint is satisfied to the greatest extent. As can be seen from fig. 5(a), as the invention gradually increases W_γ in the optimization objective, the optimized motion path becomes more and more conservative and deviates further from the obstacle.
(2) Risk behavior movement path optimization generation
During the simulation experiments, the invention uses fixed coefficients γ_k (0.9 and 1.0) and, separately, the method of making γ_k in the optimization objective as large as possible, to carry out dynamic physical-simulation optimization generation of the motion path; the simulation results are shown in fig. 6.

As can be seen from fig. 6(a), different W_γ yield essentially the same agent trajectory, because the risk-taking objective is consistent with the task-constraint requirement of reaching the target point as soon as possible (a straight line); since the two do not conflict, the optimized γ_k always remains close to 1 (as shown in fig. 6(b)). From figs. 5 and 6, the MPC-OCBF augmented-physical-simulation optimization framework proposed by the invention automatically makes appropriate optimization adjustments for different task constraints, thereby enlarging the solution space and obtaining diversified trajectory paths.
(3) User-defined behavioral path optimization generation
During the simulation experiments, the invention sets the user-defined coefficients γ_0 = [0.3, 0.5, 0.8] and W_γ = 10²·I_N; the simulation experiment results are shown in fig. 7.

As can be seen from fig. 7(a), compared with the MPC-CBF method with a directly fixed coefficient γ_k, the motion trajectories generated by the proposed MPC-OCBF method differ little, because when the targets conflict, W_γ = 10²·I_N is set relatively small and the optimization framework automatically raises γ_k, as shown in fig. 7(b), ensuring that the generated path reaches the target position as soon as possible while satisfying the collision-avoidance constraint.
(4) MPC-GOCBF algorithm behavior path optimization generation
Finally, the invention uses the MPC-GOCBF algorithm proposed by the project (equation (4.14)) and sets S_k directly for the motion-path simulation optimization calculation. In the experiments the invention sets S_0 = [0, 1, 2] and W_S = 10²·I_N; the final results are shown in fig. 8.

As can be seen from fig. 8(a), by setting different S_0 the invention obtains motion paths with different behavioral characteristics. S_k has a clear physical and practical meaning: the larger S_k, the farther the agent stays from the obstacle, so in actual operation different S_0 can be set directly to obtain a motion path meeting specific requirements. Notably, during the simulation optimization calculation the control fence function does not necessarily strictly satisfy h ≥ S_0; instead, the constraint requirement is adjusted dynamically according to the actual situation of the task constraints. As shown in fig. 8(b), when the collision-avoidance constraint conflicts with the task requirement of reaching the target point as soon as possible (a straight line), the optimization algorithm automatically reduces S_k, realizing automatic adjustment among the multi-task constraints, so the whole algorithm framework has strong adaptability.
5.2 Adaptive cruise motion path optimization calculation in autonomous driving
To further test the performance of the MPC-GOCBF algorithm framework proposed by the project, the invention compares its performance on the adaptive cruise control problem of an agent in autonomous driving. The dynamic model of the agent is:

ẋ_1 = x_2
m·ẋ_2 = −F_r(x_2) + u
ẋ_3 = v_l − x_2    (5.6)

where x_1 and x_2 are the position and speed of the cruising agent, m is the mass of the agent, x_3 is the distance between the cruising agent and the leading agent, and the leading agent travels at a fixed speed v_l. F_r(x_2) = f_0 + f_1·x_2 + f_2·x_2² is the resistance term of the agent's dynamics, where f_0, f_1 and f_2 are empirically determined constants. The range of control inputs the agent can exert is −c_d·m·g ≤ u ≤ c_a·m·g, where c_d, c_a are constant coefficients; the parameter values are fully consistent with the prior art. The task constraint the cruising agent needs to satisfy is represented by the control Lyapunov function:
V = (x_2 − v_d)²    (5.7)
In equation (5.7), v_d denotes the desired cruising speed of the cruising agent.
Defining a control fence function as:
h = x_3 − 1.8·x_2    (5.8)
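The fence function of (5.8) encodes a 1.8-second time headway: the following distance x_3 must exceed 1.8 times the ego speed x_2 for the state to count as safe. A quick check (the numeric speeds and gap are illustrative):

```python
def h_acc(x2, x3):
    """Safe-headway fence function from (5.8): gap minus 1.8 s times ego speed."""
    return x3 - 1.8 * x2

gap_safe = h_acc(20.0, 50.0)    # 50 m gap at 20 m/s: 50 - 36 > 0, safe
gap_unsafe = h_acc(30.0, 50.0)  # same gap at 30 m/s: 50 - 54 < 0, headway violated
```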
in order to test the limit performance of MPC-GOCBF algorithm framework, the invention takes the optimization time window length N=1 of MPC, so that the algorithm is degenerated into a single-step QP optimization problem, and the algorithm result and the fixed coefficient gamma are combined k Compared with the optimal-decay CBF-QP method with the optimized parameters additionally added in the CBF-QP by using the CBF-QP method with the optimized parameters of the CBF-QP of which the number is equal to 0.5, the invention takes the value of S=25 and W S =10 4 The simulation results are shown in fig. 9.
In the simulation experiment, the adaptability of the algorithm framework is tested by setting different initial speeds of the cruising agent: 26, 28, 30 and 32. As the simulation results in FIG. 9(a) show, when the initial speed is 30 or 32, CBF-QP (Nominal) cannot obtain a feasible solution, whereas the GOCBF algorithm of the invention guarantees that the system obtains a feasible solution, achieves the same performance as the optimal-decay CBF-QP method, and ensures that the safety constraint h ≥ 0 is satisfied throughout, as shown in FIG. 9(b). Meanwhile, S is automatically and dynamically adjusted over the whole process, guaranteeing the feasibility of the optimization.
The invention further increases S to S = 40, raising the safety-distance constraint between the vehicle and the vehicle ahead; the simulation result is shown in FIG. 10.
As can be seen from FIG. 10, after the safety-distance constraint with respect to the leading vehicle is strengthened, the GOCBF algorithm proposed by the project still guarantees the feasibility of the optimization. Compared with FIG. 9, the final value of h increases from 0 to 9.7, enlarging the safety distance to the leading vehicle and generating cruise behavior that meets the requirement.
6. Multi-agent motion planning simulation demonstration software design development
In order to conveniently test and verify the augmented physics simulation algorithm based on the MPC-GOCBF idea, the project further designs and develops a multi-agent motion planning simulation demonstration software environment, which is used to test and verify multi-agent motion planning algorithms.
6.1 Demonstration software design and development
The demonstration software is implemented in Python, and the optimization is solved with CasADi (https://web.casadi.org/), an efficient open-source toolkit for nonlinear optimization (nonlinear programming) and automatic differentiation (algorithmic differentiation). Compared with other optimization libraries, it provides standard C/C++ and MATLAB support as well as a Python API, so nonlinear optimization problems can be solved quickly and conveniently from Python. Using CasADi essentially comprises three steps: constructing variables, constructing the objective function, and setting the solver; the whole process is intuitive and friendly.
First, a CasADi optimization package is introduced:
import casadi as ca
opti = ca.Opti()
When constructing the variables to be optimized, optimization variables are defined with the opti.variable() function, and related parameters can be defined intuitively with opti.parameter(), for example:
opt_control = opti.variable(N)      # control quantities to be optimized at the N time points of the MPC horizon
opt_states = opti.variable(N + 1)   # state quantities to be optimized at the N+1 time points of the MPC horizon
Subsequently, the constraints need to be defined with the opti.subject_to() function and the optimization objective opt_cost needs to be set, for example:
opti.subject_to(self.opti.bounded(-self.v_max, v, self.v_max))
opt_cost = self.opt_cost + ca.mtimes([self.opt_controls[i, :], R, self.opt_controls[i, :].T])
For the optimization solution, the method uses the IPOPT solver package; the specific code is as follows:
# Optimizer configuration
opti.minimize(opt_cost)
opts_setting = {'ipopt.max_iter': 200, 'ipopt.print_level': 1, 'print_time': 0,
                'ipopt.acceptable_tol': 1e-5, 'ipopt.acceptable_obj_change_tol': 1e-5}
opti.solver('ipopt', opts_setting)
for the dynamics simulation of multiple intelligent agents, the project adopts a Robotarium-open-source multi-robot simulation platformhttps://github.com/robotarium/robotarium_python_simulator) The simulation modeling of multiple agents can be conveniently realized based on the platform.
In order to improve the generality of the demonstration software, the project adopts an object-oriented programming approach and implements the MPC-GOCBF algorithm as a class, clf_cbf_nmpc(), which completes the specific solution calculations associated with the MPC-GOCBF algorithm. The main methods designed are:
# __add_system_constraints: define physical-property constraints
# __add_dynamics_constraints: define the agent dynamics model
# __add_safe_constraints: implement the GOCBF-based task-constraint description
# __set_cost_func: define the cost function
# solve: perform the specific MPC-GOCBF optimization solution calculations
6.2 Multi-agent motion path generation test
In order to test the multi-agent motion planning simulation demonstration software and verify the performance of the proposed algorithm, the project uses this software environment to test MPC-GOCBF-based dynamic obstacle avoidance of multiple agents. The obstacle-avoidance agent needs to pass through several moving agents to reach a target state; its dynamic simulation model is:

(dynamics model equation, rendered only as an image in the source)
Physical characteristics, task requirements and dynamic obstacle-avoidance constraints are added in the simulation software environment, and a simulation experiment is carried out; a screenshot of a single simulation run and the agent motion paths are shown in FIG. 11.
During the simulation tests, the initial positions of the other agents are generated randomly, the simulation is run 100 times, and the number of obstacle agents is varied. The MPC-GOCBF algorithm is tested and evaluated on three metrics, collision-avoidance success rate, average path length and average time, and compared with the optimal-decay CBF method; the results are shown in Table 1.
TABLE 1 Multi-agent collision avoidance performance statistics
(table data rendered only as an image in the source)
As can be seen from Table 1, the success rate of the proposed MPC-GOCBF algorithm decreases as the number of obstacles increases, but the overall performance remains stable, and the multi-agent collision-avoidance problem is solved well.
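The three evaluation metrics can be computed from a batch of simulation runs as sketched below (the data and field names are hypothetical; success-conditioned averages follow the usual convention of excluding collided runs):

```python
# Hypothetical post-processing of simulation runs into the Table 1 metrics.
def summarize(runs):
    """runs: list of dicts with keys 'collided', 'path_len', 'time'."""
    ok = [r for r in runs if not r['collided']]
    return {
        'success_rate': len(ok) / len(runs),
        'avg_path_len': sum(r['path_len'] for r in ok) / len(ok),
        'avg_time': sum(r['time'] for r in ok) / len(ok),
    }

runs = [{'collided': False, 'path_len': 3.2, 'time': 10.0},
        {'collided': False, 'path_len': 3.0, 'time': 9.0},
        {'collided': True,  'path_len': 1.1, 'time': 2.5}]
print(summarize(runs))
```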
7. Aiming at the problems that task constraints are confined to the geometric space, that the dynamic characteristics of the agent are not considered, and that different task constraints are not weighed against one another, the invention proposes an optimizable GOCBF function. Based on MPC-GOCBF, dynamic physical simulation and motion planning are combined, and the improved CBF function, following the augmented physics simulation idea, provides a unified description of the multi-task constraints of motion planning, converting the explicit analytical solving (or searching) over constraint relations into implicit simulation computation. A multi-agent motion planning physical simulation environment is constructed and the algorithm is tested and verified on it; the experimental results demonstrate the flexibility and dynamic adaptability of the algorithm.
The foregoing is merely illustrative of specific embodiments of the present invention, and the scope of the invention is not limited thereto; any modifications, equivalents, improvements and alternatives that fall within the spirit and principles of the present invention and are apparent to those skilled in the art are within the scope of the present invention.

Claims (8)

1. A discrete control fence function improvement optimization method based on a discrete-time control fence function, characterized by comprising the following steps:
step one, obtaining a feasible state set according to the constraint requirement of the control fence function in discretized form;
step two, dynamically adjusting γ_k of the control fence function according to the constraint optimization solving conditions of a plurality of tasks;
step three, taking γ_k of the control fence as a variable to be optimized, obtaining an optimizable control fence function;
step four, changing the constraint requirement of the control fence function in discrete form, and directly optimizing the degree to which h(x_{t+k|t}) satisfies the task constraint;
step five, introducing a preference objective related to S_k into the optimization objective, so that the multi-agent motion path data of complex dynamic scenes obtained by simulation optimization conforms to the preference;
in the fourth step, the constraint requirement of the control fence function in discrete form is changed to:

h(x_{t+k+1|t}) ≥ S_k

wherein S_k is the newly introduced optimization variable, which replaces (1-γ)h(x_t) so that the degree to which h(x_{t+k|t}) satisfies the task constraint is optimized directly;
the introduction of the preference objective related to S_k into the optimization objective in the fifth step, making the multi-agent motion path data of complex dynamic scenes obtained by simulation optimization conform to the preference, comprises:
introducing into the optimization objective a preference term Φ(S) = (S - S_0)^T W_S (S - S_0), so that the motion path obtained by simulation optimization conforms to the preference.
2. The discrete control fence function improvement optimization method based on the discrete-time control fence function as set forth in claim 1, wherein in said step one the continuous constraint

dh/dt ≥ -γh

is discretized in the form:

h_{t+Δt} - h_t ≥ -γ·h_t;

according to the constraint requirement of the control fence function in discretized form, the feasible state set corresponding to a simulation time point t+k is defined as:

S_CBF,k = {x ∈ X : h(x_{t+k+1}) ≥ (1-γ_k)·h(x_{t+k})};

where the range of the feasible state set is jointly determined by h(x_{t+k}) and γ_k.
3. The discrete control fence function improvement optimization method based on the discrete-time control fence function as set forth in claim 1, wherein in said step two, γ_k, which is a preset hyper-parameter in the existing control fence function definition, is dynamically adjusted according to the actual conditions of the constraint optimization solving of the plurality of tasks.
4. The discrete control fence function improvement optimization method based on the discrete-time control fence function as set forth in claim 1, wherein in said step three, γ_k of the control fence is taken as an optimizable variable to obtain the optimizable control fence function.
5. A multi-agent motion trajectory optimization system based on a discrete time control fence function applying the discrete time control fence function based discrete control fence function improvement optimization method of any one of claims 1 to 4, characterized in that the multi-agent motion trajectory optimization system based on a discrete time control fence function comprises:
the feasible state set definition module is used for defining a feasible state set according to the control fence function constraint requirement in a discretization form;
the variable dynamic adjustment module is used for dynamically adjusting and controlling gamma of the fence function according to the constraint optimization solving conditions of a plurality of tasks k
An optimized control fence function acquisition module for acquiring gamma of the control fence k As a first generation optimized variable, obtaining an optimized control fence function;
a constraint requirement changing module for changing the constraint requirement of the control fence function in a discrete form and directly changing the constraint requirement of h (x t+kt ) Optimizing the satisfaction degree of task constraint;
a motion path conforming preference acquisition module for introducing S in the optimization objective k And the related preference targets are adopted, so that the motion path obtained by simulation optimization accords with the preference.
6. A computer device comprising a memory and a processor, the memory storing a computer program which, when executed by the processor, causes the processor to perform the discrete-time-control-fence-function-based discrete control fence function improvement optimization method of any one of claims 1 to 4.
7. A computer readable storage medium storing a computer program which, when executed by a processor, causes the processor to perform the discrete-time control fence function based discrete-control fence function improvement optimization method of any one of claims 1 to 4.
8. An information data processing terminal, wherein the information data processing terminal is used for realizing the function of the multi-agent motion trail optimization system based on the discrete time control fence function according to claim 5.
CN202210026694.0A 2022-01-11 2022-01-11 Discrete control fence function improvement optimization method, optimization system, terminal and medium Active CN114371626B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202210026694.0A CN114371626B (en) 2022-01-11 2022-01-11 Discrete control fence function improvement optimization method, optimization system, terminal and medium

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202210026694.0A CN114371626B (en) 2022-01-11 2022-01-11 Discrete control fence function improvement optimization method, optimization system, terminal and medium

Publications (2)

Publication Number Publication Date
CN114371626A CN114371626A (en) 2022-04-19
CN114371626B true CN114371626B (en) 2023-07-14

Family

ID=81144332

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202210026694.0A Active CN114371626B (en) 2022-01-11 2022-01-11 Discrete control fence function improvement optimization method, optimization system, terminal and medium

Country Status (1)

Country Link
CN (1) CN114371626B (en)

Citations (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN110471408A (en) * 2019-07-03 2019-11-19 天津大学 Automatic driving vehicle paths planning method based on decision process
CN112116830A (en) * 2020-09-02 2020-12-22 南京航空航天大学 Unmanned aerial vehicle dynamic geo-fence planning method based on airspace meshing
CN113190613A (en) * 2021-07-02 2021-07-30 禾多科技(北京)有限公司 Vehicle route information display method and device, electronic equipment and readable medium
CN113238563A (en) * 2021-06-04 2021-08-10 重庆大学 High-real-time automatic driving motion planning method

Family Cites Families (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US10733332B2 (en) * 2017-06-08 2020-08-04 Bigwood Technology, Inc. Systems for solving general and user preference-based constrained multi-objective optimization problems
US10996639B2 (en) * 2019-03-11 2021-05-04 Mitsubishi Electric Research Laboratories, Inc. Model predictive control of systems with continuous and discrete elements of operations

Patent Citations (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN110471408A (en) * 2019-07-03 2019-11-19 天津大学 Automatic driving vehicle paths planning method based on decision process
CN112116830A (en) * 2020-09-02 2020-12-22 南京航空航天大学 Unmanned aerial vehicle dynamic geo-fence planning method based on airspace meshing
CN113238563A (en) * 2021-06-04 2021-08-10 重庆大学 High-real-time automatic driving motion planning method
CN113190613A (en) * 2021-07-02 2021-07-30 禾多科技(北京)有限公司 Vehicle route information display method and device, electronic equipment and readable medium

Non-Patent Citations (3)

* Cited by examiner, † Cited by third party
Title
Enhancing Feasibility and Safety of Nonlinear Model Predictive Control with Discrete-Time Control Barrier Functions; Jun Zeng; 《2021 60th IEEE Conference on Decision and Control (CDC)》; full text *
High-quality trajectory planning for heterogeneous individuals; LI Shi-lei; 《J. Cent. South Univ》; full text *
Optimizable control barrier functions to improve feasibility and add behavior diversity while ensuring safety; LI Shi-lei; 《Electronics》; full text *

Also Published As

Publication number Publication date
CN114371626A (en) 2022-04-19

Similar Documents

Publication Publication Date Title
Bhattacharyya et al. Simulating emergent properties of human driving behavior using multi-agent reward augmented imitation learning
Amarjyoti Deep reinforcement learning for robotic manipulation-the state of the art
Leottau et al. Decentralized reinforcement learning of robot behaviors
Kwak et al. Introduction to quantum reinforcement learning: Theory and pennylane-based implementation
Rückert et al. Learned graphical models for probabilistic planning provide a new class of movement primitives
Rubies-Royo et al. A classification-based approach for approximate reachability
Zhu et al. An overview of the action space for deep reinforcement learning
Sefati et al. Towards tactical behaviour planning under uncertainties for automated vehicles in urban scenarios
Hesse et al. A reinforcement learning strategy for the swing-up of the double pendulum on a cart
Wu et al. Human-guided reinforcement learning with sim-to-real transfer for autonomous navigation
Bouton et al. Utility decomposition with deep corrections for scalable planning under uncertainty
Zucker et al. Reinforcement planning: RL for optimal planners
Meng et al. Sympocnet: Solving optimal control problems with applications to high-dimensional multiagent path planning problems
Look et al. Differentiable implicit layers
Uchibe Cooperative and competitive reinforcement and imitation learning for a mixture of heterogeneous learning modules
Elsisi et al. Improved bald eagle search algorithm with dimension learning-based hunting for autonomous vehicle including vision dynamics
Oliveira et al. Learning to race through coordinate descent bayesian optimisation
Banerjee et al. A survey on physics informed reinforcement learning: Review and open problems
CN114371626B (en) Discrete control fence function improvement optimization method, optimization system, terminal and medium
Zhang et al. An efficient planning method based on deep reinforcement learning with hybrid actions for autonomous driving on highway
CN115488881A (en) Man-machine sharing autonomous teleoperation method and system based on multi-motor skill prior
Lan et al. Cooperative Guidance of Multiple Missiles: A Hybrid Coevolutionary Approach
Cao et al. Robot motion planning based on improved RRT algorithm and RBF neural network sliding
Yang et al. Reinforcement Learning with Reward Shaping and Hybrid Exploration in Sparse Reward Scenes
García et al. Incremental reinforcement learning for multi-objective robotic tasks

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant