CN111882184A - Multi-agent system null space behavior control dynamic task priority planning method - Google Patents

Multi-agent system null space behavior control dynamic task priority planning method Download PDF

Info

Publication number
CN111882184A
CN111882184A CN202010677791.7A CN202010677791A CN111882184A CN 111882184 A CN111882184 A CN 111882184A CN 202010677791 A CN202010677791 A CN 202010677791A CN 111882184 A CN111882184 A CN 111882184A
Authority
CN
China
Prior art keywords
task
priority
tasks
dynamic
planning
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Granted
Application number
CN202010677791.7A
Other languages
Chinese (zh)
Other versions
CN111882184B (en
Inventor
黄捷
张祯毅
陈宇韬
蔡逢煌
齐义文
王武
柴琴琴
林琼斌
李卓敏
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Fuzhou University
Original Assignee
Fuzhou University
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Fuzhou University filed Critical Fuzhou University
Priority to CN202010677791.7A priority Critical patent/CN111882184B/en
Publication of CN111882184A publication Critical patent/CN111882184A/en
Application granted granted Critical
Publication of CN111882184B publication Critical patent/CN111882184B/en
Active legal-status Critical Current
Anticipated expiration legal-status Critical

Links

Images

Classifications

    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06QINFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES; SYSTEMS OR METHODS SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES, NOT OTHERWISE PROVIDED FOR
    • G06Q10/00Administration; Management
    • G06Q10/06Resources, workflows, human or project management; Enterprise or organisation planning; Enterprise or organisation modelling
    • G06Q10/063Operations research, analysis or management
    • G06Q10/0631Resource planning, allocation, distributing or scheduling for enterprises or organisations
    • G06Q10/06316Sequencing of tasks or work
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06QINFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES; SYSTEMS OR METHODS SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES, NOT OTHERWISE PROVIDED FOR
    • G06Q10/00Administration; Management
    • G06Q10/04Forecasting or optimisation specially adapted for administrative or management purposes, e.g. linear programming or "cutting stock problem"

Landscapes

  • Business, Economics & Management (AREA)
  • Human Resources & Organizations (AREA)
  • Engineering & Computer Science (AREA)
  • Strategic Management (AREA)
  • Economics (AREA)
  • Entrepreneurship & Innovation (AREA)
  • Marketing (AREA)
  • Game Theory and Decision Science (AREA)
  • Development Economics (AREA)
  • Operations Research (AREA)
  • Quality & Reliability (AREA)
  • Tourism & Hospitality (AREA)
  • Physics & Mathematics (AREA)
  • General Business, Economics & Management (AREA)
  • General Physics & Mathematics (AREA)
  • Theoretical Computer Science (AREA)
  • Educational Administration (AREA)
  • Feedback Control In General (AREA)

Abstract

The invention relates to a zero-space behavior control dynamic task priority planning method for a multi-agent system, which comprises the following steps of firstly, establishing a composite task by a zero-space projection method; secondly, converting the dynamic task priority planning problem into a switching mode optimal control problem; and finally, solving the optimal control problem by using a mixed integer optimization method to obtain an optimal composite task track. The method solves the problem of dynamic task priority planning of multi-agent system zero-space behavior control, does not need artificial switching conditions for setting task priorities, reduces the workload of researchers, has good expandability, and can be applied to large-quantity task priority dynamic planning which is difficult to process by a traditional logic method. In addition, the method considers the prediction information of the future state of the agent in the process of switching the task priority, so that the method has more ideal switching effect compared with the traditional logic method.

Description

Multi-agent system null space behavior control dynamic task priority planning method
Technical Field
The invention relates to the technical field of intelligent robots, in particular to a zero-space behavior control dynamic task priority planning method for a multi-agent system.
Background
The multi-agent system has the characteristics of autonomy, fault tolerance, flexibility, expandability, cooperative capability and the like, and is widely applied to the fields of military affairs and civil affairs. Multiple autonomous mobile robots, a typical multi-agent system, are generally required to work in unstructured environments and to perform multiple tasks including moving along preset trajectories, avoiding static and dynamic obstacles, maintaining formation and aggregation, etc. However, these tasks may collide during execution, for example, an autonomous mobile robot cannot move along a preset trajectory while avoiding obstacles appearing on the preset trajectory.
Behavior control has been extensively studied in the last decade as one of the ways to resolve task conflicts. The task priority is set and a collaborative scheme with the priority task is designed to solve the conflict between the tasks. The traditional behavior control method mainly includes a hierarchical control method (the layered control approach) and a motion mode control method (the motor schema control approach), wherein the hierarchical control method is a competitive method, that is, in the process of task execution, only a task with high priority is executed, and a task with low priority can be executed only after the task with high priority is completed; the motion pattern control method is a collaborative method in which the velocity outputs of all tasks are weighted and executed as the final task output, so that none of the tasks is completely completed. Then, the literature (g.antonelliand s.chiaveri, "Kinematic control of planes of autonomus events," ieee transactions on Robotics, vol.22, No.6, pp.1285-1292,2006.) proposes a null-space based behavior control that projects low priority tasks to the null-space of high priority tasks while ensuring that high priority tasks are fully executed, so as to utilize the redundancy of the system to execute sub-priority tasks, which would not be executed if they used the full freedom of the system. Therefore, the null-space behavior control has a good potential in handling task conflicts. However, the priority of a task is determined by the supervision of the task and is usually set in advance and cannot be changed during the execution of the task. This may degrade the control performance of the null-space behavior control method, limiting its application in dynamic and unknown environments. For example, when an obstacle does not exist, setting the obstacle avoidance task to the highest priority level may affect the execution of other tasks.
In this regard, researchers have conducted certain research. A finite state machine is a state transition mechanism in which the state of a system automatically transitions from one state to another state based on a state transition trigger condition. Under the framework of zero-space behavior control, different task priority sequences are hidden in the state of each finite-state machine, and the task priority is planned by designing a trigger condition of state transition. In addition, the fuzzy logic method is to design some fuzzy rules to realize the task priority planning. However, both of these methods require a manual switching rule of task priorities designed in advance, which is very difficult when the number of tasks is very large or in a dynamic and unknown environment, and both of these methods use only current state information to the autonomous robot without predicting a future state, thereby degrading performance of priority switching.
Disclosure of Invention
In view of the above, the invention provides a multi-agent system zero-space behavior control dynamic task priority planning method, which converts a task priority dynamic planning problem into an optimization problem to be processed, and obtains an optimal composite task track by solving the optimization problem, aiming at the defects that the existing zero-space behavior control priority dynamic planning needs a person to set priority switching rules in advance and is difficult to process a large number of task priority dynamic plans.
The invention is realized by adopting the following scheme: a multi-agent system null space behavior control dynamic task priority planning method comprises the following steps:
step S1: designing a basic task rho of an intelligent agent by using a zero-space-based behavior control method, enumerating the possibility of the priorities of all tasks, and combining the designed basic behaviors into a composite task in different priority sequences by using a zero-space projection method;
step S2: converting the dynamic task priority planning problem into a switching mode optimal control problem at each sampling moment by using a model predictive control technology; step S3: and solving the optimal control problem by using a mixed integer optimization method to obtain an optimal composite task track.
Further, the step S1 specifically includes the following steps:
step S11: design of basic tasks:
a basic task is formed by a task variable rho epsilon RmCoding is carried out, which is a code matching with the system configuration x epsilon RnA function of correlation, and can be expressed as:
ρ=g(x) (1)
differentiation thereof gives:
Figure BDA0002583779710000041
wherein J (x) e Rm×nIs a task Jacobian matrix related to system configuration, v belongs toRnIs the speed of the system; by locally linear inverse mapping, the velocity v of the referencedExpressed as:
Figure BDA0002583779710000042
where ρ isdA function of the desired task is represented,
Figure BDA0002583779710000043
representing a pseudo-inverse of a jacobian matrix; in practical applications, the integral of the reference velocity will result in a numerical drift of the position of the agent, and therefore the following form of closed-loop inverse kinematics (closed-loop inverse kinematics) is used to compensate for this numerical drift:
Figure BDA0002583779710000044
where a is a suitably constant positive definite gain matrix,
Figure BDA0002583779710000045
is a task error;
step S12: and (3) constructing a composite task:
a compound task is a combination of a plurality of basic behaviors in a certain priority order; setting up
Figure BDA0002583779710000046
Is a task function where i ∈ Nr,Nr={1,...,r},miA dimension representing a task space; defining a time-dependent priority function yi(i,t):Nr×[0,∞]→Nr(ii) a At the same time, a task hierarchy is defined with the following rules:
a task i with a priority of y (i) cannot interfere with a task j with a priority of y (j), if
Figure BDA0002583779710000047
The mapping relationship from the speed to the task speed is formed by the taskOf the jacobian matrix
Figure BDA0002583779710000048
Figure BDA0002583779710000049
Represents;
the dimension of task i with the lowest priority may be larger than
Figure BDA00025837797100000410
Thus ensuring that dimension n is greater than the total dimension of all tasks;
the value of y (i) is distributed by the task supervisor according to the requirements of the task and the sensor information;
by assigning a given priority to the basic tasks, the speed of the compound task at time t is expressed as:
Figure BDA0002583779710000051
wherein v isi,tIs the speed of task j with priority y (j, t) ═ i; equation (5) is the composite task formed after all the basic tasks are subjected to zero-space projection.
Further, the specific content of step S2 is:
consider a set of n (n ≧ 2) autonomous robots and a series of m (m ≧ 2) tasks for each autonomous robot; defining a state vector
Figure BDA0002583779710000052
Is composed of
Figure BDA0002583779710000053
Wherein each element represents a state of a robot; introducing a binary variable
Figure BDA0002583779710000054
Wherein each element wi(t)∈{0,1}M(ii) a The dynamics of the robot are written as convex combinations of dynamics in different modes:
Figure BDA0002583779710000055
wherein
Figure BDA0002583779710000056
Is wiThe (j) th element of (a),
Figure BDA0002583779710000057
a velocity in the form of formula (5) for the jth task of the ith robot; the total number of the task speeds is M ═ M! (ii) a
Figure BDA0002583779710000059
One constraint of (2) is expressed as:
Figure BDA0002583779710000058
this constraint is called a class of special-ordered-set-of-type-one, which guarantees that at least and only one mode is activated at any time;
defining the tracking error of the ith robot as:
Figure BDA0002583779710000061
wherein
Figure BDA0002583779710000062
Is the desired task function of the ith robot. The cost function is defined as follows:
Figure BDA0002583779710000063
wherein u isiDenotes the relaxation variable, KiAnd PiIs a normal number; thus, one optimal control problem is described as:
Figure BDA0002583779710000064
Figure BDA0002583779710000065
Figure BDA0002583779710000066
Figure BDA0002583779710000067
wherein
Figure BDA0002583779710000068
In the initial state, the state of the device is as follows,
Figure BDA0002583779710000069
the position of an obstacle detected by the ith robot at time T is shown, and T is a prediction range; the relaxation vector is
Figure BDA00025837797100000610
Constraint (10d) ensures that all relaxation variables are non-negative, making (10c) a soft-beam of the local minimum problem.
Further, the specific content of step S3 is:
different initial states at each sampling time in the task execution process
Figure BDA00025837797100000613
Such that the problem (10) (10a) (10b) (10c) (10d) must be solved repeatedly; however, problem (10) contains integer variables that are difficult to solve; the problem (10) can be effectively solved on line by adopting a real-time model predictive control method, and dynamic task priority planning is provided; the algorithm adopts a direct method of first dispersion and then optimization and external convection of an integer relaxation technology;
will time domain [0, T]Divided into N intervals characterised by equidistant grid points, i.e. 0-t0<t1<···<tN-1<tN=T,
Figure BDA00025837797100000611
At each interval, a binary variable is assumed
Figure BDA00025837797100000612
Is constant and its value can only be changed at grid points; the binary variable is relaxed to the real variable and satisfied
Figure BDA0002583779710000071
Using the multi-objective targeting method, the problem (10) is transformed into a nonlinear programming problem and is represented as:
Figure BDA0002583779710000072
Figure BDA0002583779710000073
Figure BDA0002583779710000074
Figure BDA0002583779710000075
Figure BDA0002583779710000076
wherein
Figure BDA0002583779710000077
s is the current sample and the current sample,
Figure BDA0002583779710000078
predicting at sample s, x for k step statei,s|kIs composed of
Figure BDA0002583779710000079
The ith element of (1); in the formula (11), the reaction mixture,
Figure BDA00025837797100000710
is the number of passesThe function resulting from the value-integral solution of the nonlinear dynamical equation (6) and the constraint (11c) ensures that the state trajectory is continuous over the entire prediction range after the solution of the problem (11), i.e. the equation 11(a)11(b)11(c)11(d)11(e), converges; the nonlinear programming problem shown in equation (11) can be solved efficiently by a standard nonlinear programming problem solver using a sequential quadratic programming or interior point method without using an integer optimization algorithm;
after solving equations 11(a), 11(b), 11(c), 11(d), 11(e), a summary rounding method is applied from
Figure BDA00025837797100000711
In-taking binary variables
Figure BDA00025837797100000712
The steps are as follows:
Figure BDA00025837797100000713
Figure BDA0002583779710000081
wherein j is 1, 0, M, k is 0, N-1, and the optimal composite task trajectory is formula (12a) and formula (12b), and the formula (12a) and formula (12b) have the following propositions;
if it is not
Figure BDA0002583779710000082
Is measurable and satisfies
Figure BDA0002583779710000083
Then function
Figure BDA0002583779710000084
Is converted from the formulas (12a) (12b) by using the zero order hold, and satisfies
Figure BDA0002583779710000085
Wherein
Figure BDA0002583779710000086
Satisfy the requirement of
Figure BDA0002583779710000087
Compared with the prior art, the invention has the following beneficial effects:
(1) the invention takes task supervision controlled by zero-space behavior as a research object, and researches aiming at the problem of dynamic task priority planning; the invention can effectively solve the problem of dynamic task priority planning of multi-agent system null space behavior control, avoids the switching rule of artificial predesigned tasks and reduces the workload of researchers. Meanwhile, the proposed method has good expandability and can be used for processing the problem of dynamic task priority planning in a large number of tasks and a dynamic unknown environment, and the traditional method needs a design rule in advance, so that great difficulty can be brought to designers under the conditions of a large number of tasks and an unknown environment, and the designers are difficult to consider all possible complex conditions, thereby influencing the performance of task priority switching. The method does not need to artificially set the switching condition of the task priority, thereby reducing the workload of researchers, having good expandability and being applied to the dynamic planning of a large number of task priorities which are difficult to process by the traditional logic method. In addition, the method not only considers the current state information of the autonomous robot but also considers the future state information in the process of switching the task priorities, and further improves the task priority switching effect.
(2) The method introduces a model predictive control technology into the dynamic task priority planning of the zero-space behavior control, converts the planning problem into an optimization problem at each moment for processing, and obtains the optimal task priority track by solving the designed optimization problem. The method does not need to artificially set task priority switching conditions, and can be applied to large-quantity task priority dynamic planning which is difficult to process by the existing method.
Drawings
FIG. 1 is a schematic block diagram of a method of an embodiment of the invention.
FIG. 2 shows an embodiment of the present invention with N
Figure BDA0002583779710000091
N-1 initial guesses of
Figure BDA0002583779710000092
Example graph of the direct discretization method of initial guessing.
FIG. 3 is a continuous trace plot after problem (11) solution convergence according to an embodiment of the present invention.
Fig. 4 is a robot trajectory diagram of the model predictive control method and the conventional method according to the embodiment of the present invention.
Fig. 5 is a diagram illustrating a distance between the robot and an obstacle according to the embodiment of the present invention.
Fig. 6 is a tracking error map of the robot according to the embodiment of the present invention.
FIG. 7 is a track diagram of a compound task according to an embodiment of the present invention.
Detailed Description
The invention is further explained below with reference to the drawings and the embodiments.
It should be noted that the following detailed description is exemplary and is intended to provide further explanation of the disclosure. Unless defined otherwise, all technical and scientific terms used herein have the same meaning as commonly understood by one of ordinary skill in the art to which this application belongs.
It is noted that the terminology used herein is for the purpose of describing particular embodiments only and is not intended to be limiting of example embodiments according to the present application. As used herein, the singular forms "a", "an" and "the" are intended to include the plural forms as well, and it should be understood that when the terms "comprises" and/or "comprising" are used in this specification, they specify the presence of stated features, steps, operations, devices, components, and/or combinations thereof, unless the context clearly indicates otherwise.
As shown in fig. 1, the present embodiment provides a method for zero-space behavior control dynamic task priority planning for a multi-agent system, which includes the following steps:
step S1: designing a basic task rho of an intelligent agent by using a zero-space-based behavior control method, enumerating the possibility of the priorities of all tasks, and combining the designed basic behaviors into a composite task in different priority sequences by using a zero-space projection method; if there are m basic tasks, there are theoretically m! A composite task;
step S2: converting the dynamic task priority planning problem into a switching mode optimal control problem at each sampling moment by using a model predictive control technology; the method mainly comprises the following steps: introducing a binary variable, performing evagination processing, and constructing a target function and a constraint condition, so as to convert the dynamic task priority planning problem into a switching mode optimal control problem;
step S3: and solving the optimal control problem by using a mixed integer optimization method to obtain an optimal composite task track. The main contents comprise: discretization, relaxation and projection to integer sets.
In this embodiment, the step S1 specifically includes the following steps:
step S11: design of basic tasks:
a basic task is formed by a task variable rho epsilon RmCoding is carried out, which is a code matching with the system configuration x epsilon RnA function of correlation, and can be expressed as:
ρ=g(x) (1)
differentiation thereof gives:
Figure BDA0002583779710000111
wherein J (x) e Rm×nIs a task Jacobian matrix related to system configuration, v belongs to RnIs the speed of the system; by locally linear inverse mapping, the velocity v of the referencedExpressed as:
Figure BDA0002583779710000112
where ρ isdExpressing the expected task function (which is a position function related to time, namely the track of the intelligent agent and is the expected task function of behavior control)
Figure BDA0002583779710000113
Representing a pseudo-inverse of a jacobian matrix; in practical applications, the integral of the reference velocity will result in a numerical drift of the position of the agent, and therefore the following form of closed-loop inverse kinematics (closed-loop inverse kinematics) is used to compensate for this numerical drift:
Figure BDA0002583779710000114
where a is a suitably constant positive definite gain matrix,
Figure BDA0002583779710000115
is a task error;
step S12: and (3) constructing a composite task:
a compound task is a combination of a plurality of basic behaviors in a certain priority order; setting up
Figure BDA0002583779710000116
Is a task function where i ∈ Nr,Nr={1,...,r},miA dimension representing a task space; defining a time-dependent priority function yi(i,t):Nr×[0,∞]→Nr(ii) a At the same time, a task hierarchy is defined with the following rules:
a task i with a priority of y (i) cannot interfere with a task j with a priority of y (j), if
Figure BDA0002583779710000117
The mapping relation from the speed to the task speed is determined by the Jacobian matrix of the task
Figure BDA0002583779710000118
Figure BDA0002583779710000119
Represents;
the dimension of task i with the lowest priority may be larger than
Figure BDA00025837797100001110
Thus ensuring that dimension n is greater than the total dimension of all tasks;
the value of y (i) is distributed by the task supervisor according to the requirements of the task and the sensor information;
by assigning a given priority to the basic tasks, the speed of the compound task at time t is expressed as:
Figure BDA0002583779710000121
wherein v isi,tIs the speed of task j with priority y (j, t) ═ i. Formula (5) is a composite task formed after all basic tasks are subjected to zero-space projection, and is different from the conventional method in that the priority of the tasks is time-varying, for example, the basic task with the highest priority is not fixed and is determined according to the requirements of the tasks, namely, the subsequent optimization problem and the solution thereof. The null space projection method is the part I-J + J in the formula, wherein J + is a Jacobian pseudo inverse matrix of a high-priority task, J is a Jacobian matrix of a low priority, and the physical meaning of the part is to project a task with a suboptimal priority onto the null space of the high-priority task, wherein I-J + J is the so-called null space projection.
In this embodiment, the specific content of step S2 is:
consider a set of n (n ≧ 2) autonomous robots and a series of m (m ≧ 2) tasks for each autonomous robot; defining a state vector
Figure BDA0002583779710000122
Is composed of
Figure BDA0002583779710000123
Wherein each element represents a state of a robot; introducing a binary variable
Figure BDA0002583779710000124
Wherein each element wi(t)∈{0,1}M(ii) a The dynamics of the robot are written as convex combinations of dynamics in different modes:
Figure BDA0002583779710000125
wherein
Figure BDA0002583779710000126
Is wiThe (j) th element of (a),
Figure BDA0002583779710000127
a velocity in the form of formula (5) for the jth task of the ith robot; the total number of the task speeds is M ═ M! (ii) a
Figure BDA0002583779710000131
One constraint of (2) is expressed as:
Figure BDA0002583779710000132
this constraint is called a class of special-ordered-set-of-type-one, which guarantees that at least and only one mode is activated at any time;
defining the tracking error of the ith robot as:
Figure BDA0002583779710000133
wherein
Figure BDA0002583779710000134
Is the expected task function of the ith robot; the cost function is defined as follows:
Figure BDA0002583779710000135
wherein u isiDenotes the relaxation variable, KiAnd PiIs a normal number; thus, one optimal control problem is described as:
Figure BDA0002583779710000136
Figure BDA0002583779710000137
Figure BDA0002583779710000138
Figure BDA0002583779710000139
wherein
Figure BDA00025837797100001310
In the initial state, the state of the device is as follows,
Figure BDA00025837797100001311
the position of an obstacle detected by the ith robot at time T is shown, and T is a prediction range; the relaxation vector is
Figure BDA00025837797100001312
Constraint (10d) ensures that all relaxation variables are non-negative, making (10c) a soft-beam of the local minimum problem.
In this embodiment, the specific content of step S3 is:
different initial states at each sampling time in the task execution process
Figure BDA0002583779710000149
Such that the problem (10) (10a) (10b) (10c) (10d) must be solved repeatedly; however, problem (10) contains integer variables that are difficult to solve; the real-time model prediction control method can be effectively used for on-lineSolving the problem (10), providing a dynamic task priority plan; the algorithm adopts a direct method of first dispersion and then optimization and external convection of an integer relaxation technology; problem 10 is the complete formula of (10a) (10b) (10c) (10d), which constitutes an optimization problem, where 10(a) is a cost function and 10b) (10c) (10d) is its constraint, and the so-called solution optimization problem is to solve 10(a) on the basis of the constraints (10b) (10c) (10d), so that problem 10 refers to the complete formula.
Will time domain [0, T]Divided into N intervals characterised by equidistant grid points, i.e. 0-t0<t1<···<tN-1<tN=T,
Figure BDA0002583779710000141
At each interval, a binary variable is assumed
Figure BDA0002583779710000142
Is constant and its value can only be changed at grid points; the binary variable is relaxed to the real variable and satisfied
Figure BDA0002583779710000143
Using the multi-objective targeting method, the problem (10) is transformed into a nonlinear programming problem and is represented as:
Figure BDA0002583779710000144
Figure BDA0002583779710000145
Figure BDA0002583779710000146
Figure BDA0002583779710000147
Figure BDA0002583779710000148
wherein
Figure BDA0002583779710000151
s is the current sample and the current sample,
Figure BDA0002583779710000152
predicting at sample s, x for k step statei,s|kIs composed of
Figure BDA0002583779710000153
The ith element of (1); an example of such discretization in the initial guess is shown in fig. 2. In the formula (11), the reaction mixture,
Figure BDA0002583779710000154
is a function obtained by solving the nonlinear dynamical equation (6) by numerical integration, and the constraint (11c) ensures that the state trajectory is continuous in the whole prediction range after the solution of the problem (11), i.e. the equation 11(a)11(b)11(c)11(d)11(e), reaches convergence; a diagram of the state of the solution and the pattern trajectory is shown in fig. 3. The nonlinear programming problem shown in equation (11) can be solved efficiently by a standard nonlinear programming problem solver using a sequential quadratic programming or interior point method without using an integer optimization algorithm;
after solving equations 11(a), 11(b), 11(c), 11(d), 11(e), a summary rounding method is applied from
Figure BDA0002583779710000155
In-taking binary variables
Figure BDA0002583779710000156
The steps are as follows:
Figure BDA0002583779710000157
Figure BDA0002583779710000158
where j is 1, 0, M, k is 0, N-1, and the optimal compound task trajectory is equation (12a) and equation (12b),
the optimal composite task track is embodied in a formula (12a) and a formula (12b), because the weight of one composite task is 1 and is activated, and the other composite tasks are 0, which are obtained by solving the optimization problem at the moment, the solution is the optimal composite task at the current moment, and the basic tasks of the composite task are composed in a certain priority order, the priority order at the moment is the optimal priority order, and the optimal composite tasks at different moments can be obtained in the whole time field, so the optimal composite task track is obtained.
The propositions for the formulas (12a) and (12b) are as follows;
if it is not
Figure BDA0002583779710000161
Is measurable and satisfies
Figure BDA0002583779710000162
Then function
Figure BDA0002583779710000163
Is converted from the formulas (12a) (12b) by using the zero order hold, and satisfies
Figure BDA0002583779710000164
Wherein
Figure BDA0002583779710000165
Satisfy the requirement of
Figure BDA0002583779710000166
Preferably, in this embodiment, the basic task ρ of the agent is designed by using a zero-space-based behavior control method, and the basic behaviors are combined into the composite task in different priority orders by using a zero-space projection method.
Preferably, in this embodiment, first, a binary variable is introduced
Figure BDA0002583779710000167
Secondly, carrying out convex treatment; finally, an objective function is constructed
Figure BDA0002583779710000168
And constraint conditions, and further converting the dynamic task priority planning problem into a switching mode optimal control problem at each sampling moment.
Preferably, in this embodiment, in order to process an optimization problem including integer variables, a real-time model prediction control method is provided, and first, a multi-target shooting method (Multiple shooting) is used to discretize the optimization problem of continuous time, then, a relaxation process is performed, and finally, a sum-up-rounding method is used to project a solution to an integer set, so as to obtain an optimal composite task trajectory.
Preferably, in the present embodiment, an embodiment is given to show the effectiveness and superiority of the proposed method for dynamic task priority planning for null-space behavior control of multi-agent system based on model predictive control.
1. Design of basic tasks
a. Moving tasks
In the movement task, the robot moves to a target point along a predetermined trajectory. The corresponding task function may be defined as:
ρB=p∈R2(14)
wherein p ═ px,py]TIs the position of the robot. The speed of the corresponding desired movement task may be expressed as:
Figure BDA0002583779710000171
wherein JB=I2Jacobian matrix for moving tasks, I2The unit matrix is represented by a matrix of units,
Figure BDA0002583779710000172
to move the pseudo-inverse of the task jacobian,ρB,dfor the desired task function, ΛBThe gain matrix is determined positively for a constant of the movement task,
Figure BDA0002583779710000173
is the error of the movement task.
b. Obstacle avoidance task
In the obstacle avoidance task, the robot must avoid the obstacle detected on the preset trajectory by the sensor. The corresponding task function may be defined as:
ρA=||p-pO||∈R (16)
wherein p isO=[pxO,pyO]TIs the location of the obstacle. The speed of the corresponding desired obstacle avoidance task may be expressed as:
Figure BDA0002583779710000174
wherein
Figure BDA0002583779710000175
For the jacobian matrix of the obstacle avoidance task,
Figure BDA0002583779710000176
Figure BDA0002583779710000177
pseudo-inverse of jacobian matrix for obstacle avoidance tasks, ΛAA constant positive definite gain matrix for the obstacle avoidance task,
Figure BDA0002583779710000178
d represents a safe distance for avoiding the task error of the obstacle avoidance task.
2. Construction of composite tasks
The first composite task takes the obstacle avoidance task as the highest priority task, and the final composite task speed obtained according to the zero space projection method is as follows:
and (4) compound task A:
Figure BDA0002583779710000181
wherein
Figure BDA0002583779710000182
Is a zero space of the obstacle avoidance task.
The second compound task takes the mobile task as the highest priority task, and the final compound task speed obtained according to the zero space projection method is as follows:
and (4) compound task B:
Figure BDA0002583779710000183
wherein
Figure BDA0002583779710000184
Is a zero space of the obstacle avoidance task.
3. Simulation comparison and analysis
In the given simulation case, the model predictive control method is compared with a traditional logic method, wherein the traditional logic method switches task priorities by judging the distance between the obstacle and the robot and the size of the safe distance, namely, the obstacle avoidance task is the highest priority when the distance between the obstacle and the robot is smaller than the safe distance, and the mobile task is the highest priority when the distance between the obstacle and the intelligent body is larger than the safe distance. The preset tracks of the three robots are respectively
Figure BDA0002583779710000185
The obstacle avoidance task gain is 15, the moving task gain is 13, the safety distance is 1m, and the positions of the three obstacles are PO1=[10;10.5],PO2=[8;12.5],PO3=[16;11.5]The sampling frequency of the model predictive control method and the sampling frequency of the traditional logic method are both 20HZ, and the prediction time range of the model predictive control method is 3 s. The robot trajectory of the proposed model predictive control method and the conventional method is shown in fig. 4, the distance between the robot and the obstacle is shown in fig. 5, and the tracking error of the robot is shown in fig. 6. The result shows that the proposed model predictive control method is obviously superior to the traditional method under the same sampling frequency. The trajectory of the compound task is shown in fig. 7. The conventional method is as followsWhen an obstacle is encountered, switching is frequently performed in order to avoid the obstacle, resulting in oscillation of the trajectory and violation of the safety distance. On the other hand, the proposed model predictive control method can smoothly complete the task by only using two priority switches under the condition of not violating the safety distance constraint. This can be explained by the fact that the model predictive control method is able to predict obstacles in advance, thereby preparing priority switching in advance. This demonstrates the effectiveness of the proposed model predictive control method and its advantages over traditional logic-based methods.
The above description is only a preferred embodiment of the present invention, and all equivalent changes and modifications made in accordance with the claims of the present invention should be covered by the present invention.

Claims (4)

1. A method for planning the priority of dynamic tasks of a multi-agent system for controlling zero-space behavior is characterized in that: the method comprises the following steps:
step S1: designing a basic task rho of an intelligent agent by using a zero-space-based behavior control method, enumerating the possibility of the priorities of all tasks, and combining the designed basic behaviors into a composite task in different priority sequences by using a zero-space projection method;
step S2: converting the dynamic task priority planning problem into a switching mode optimal control problem at each sampling moment by using a model predictive control technology;
step S3: and solving the optimal control problem by using a mixed integer optimization method to obtain an optimal composite task track.
2. The multi-agent system null-space behavior control dynamic task priority planning method according to claim 1, characterized in that: the step S1 specifically includes the following steps:
step S11: design of basic tasks:
a basic task is formed by a task variable rho epsilon RmCoding is carried out, which is a code matching with the system configuration x epsilon RnA function of correlation, and can be expressed as:
ρ=g(x) (1)
differentiation thereof gives:
Figure FDA0002583779700000011
wherein J (x) e Rm×nIs a task Jacobian matrix related to system configuration, v belongs to RnIs the speed of the system; by locally linear inverse mapping, the velocity v of the referencedExpressed as:
Figure FDA0002583779700000012
where ρ isdA function of the desired task is represented,
Figure FDA0002583779700000013
representing a pseudo-inverse of a jacobian matrix; in practical applications, the integration of the reference velocity will result in a numerical drift of the position of the agent, and therefore the following form of closed loop inverse kinematics is used to compensate for this numerical drift:
Figure FDA0002583779700000021
where a is a suitably constant positive definite gain matrix,
Figure FDA0002583779700000022
is a task error;
step S12: and (3) constructing a composite task:
a compound task is a combination of a plurality of basic behaviors in a certain priority order; setting up
Figure FDA0002583779700000023
Is a task function where i ∈ Nr,Nr={1,...,r},miA dimension representing a task space; defining a time-dependent priority function yi(i,t):Nr×[0,∞]→Nr(ii) a At the same time, a task hierarchy is defined with the following rules:
a task i with a priority of y (i) cannot interfere with a task j with a priority of y (j), if y (i) ≧ y (j),
Figure FDA0002583779700000024
i≠j;
the mapping relation from the speed to the task speed is determined by the Jacobian matrix of the task
Figure FDA0002583779700000025
i∈NrRepresents;
the dimension of task i with the lowest priority may be larger than
Figure FDA0002583779700000026
Thus ensuring that dimension n is greater than the total dimension of all tasks;
the value of y (i) is distributed by the task supervisor according to the requirements of the task and the sensor information;
by assigning a given priority to the basic tasks, the speed of the compound task at time t is expressed as:
Figure FDA0002583779700000027
wherein v isi,tIs the speed of task j with priority y (j, t) ═ i; equation (5) is the composite task formed after all the basic tasks are subjected to zero-space projection.
3. The multi-agent system null-space behavior control dynamic task priority planning method according to claim 2, characterized in that:
consider a set of n (n ≧ 2) autonomous robots and a series of m (m ≧ 2) tasks for each autonomous robot; defining a state vector
Figure FDA0002583779700000031
Is composed of
Figure FDA0002583779700000032
Wherein each element represents a state of a robot; introducing a binary variable
Figure FDA0002583779700000033
Wherein each element wi(t)∈{0,1}M(ii) a The dynamics of the robot are written as convex combinations of dynamics in different modes:
Figure FDA0002583779700000034
wherein
Figure FDA0002583779700000035
Is wiThe (j) th element of (a),
Figure FDA0002583779700000036
a velocity in the form of formula (5) for the jth task of the ith robot; the total number of the task speeds is M ═ M! (ii) a
Figure FDA0002583779700000037
One constraint of (2) is expressed as:
Figure FDA0002583779700000038
this constraint is called a class of special ordered sets, which guarantees that at least and only one mode is activated at any time;
defining the tracking error of the ith robot as:
Figure FDA0002583779700000039
wherein
Figure FDA00025837797000000310
Is a desired task of the ith robotA function; the cost function is defined as follows:
Figure FDA00025837797000000311
wherein u isiDenotes the relaxation variable, KiAnd PiIs a normal number; thus, one optimal control problem is described as:
Figure FDA0002583779700000041
Figure FDA0002583779700000042
Figure FDA0002583779700000043
Figure FDA0002583779700000044
wherein
Figure FDA0002583779700000045
In the initial state, the state of the device is as follows,
Figure FDA0002583779700000046
the position of an obstacle detected by the ith robot at time T is shown, and T is a prediction range; the relaxation vector is
Figure FDA0002583779700000047
Constraint (10d) ensures that all slack variables are non-negative, making (10c) a soft constraint for the local minimum problem.
4. The multi-agent system null-space behavior control dynamic task priority planning method according to claim 1, characterized in that:
during task executionAt each sampling instant, different initial states
Figure FDA00025837797000000413
Such that the problem (10) (10a) (10b) (10c) (10d) must be solved repeatedly; however, problem 10 contains integer variables that are difficult to solve; the problem (10) can be effectively solved on line by adopting a real-time model predictive control method, and dynamic task priority planning is provided; the algorithm adopts a direct method of first dispersion and then optimization and external convection of an integer relaxation technology;
will time domain [0, T]Divided into N intervals characterised by equidistant grid points, i.e. 0-t0<t1<···<tN-1<tN=T,
Figure FDA0002583779700000048
At each interval, a binary variable is assumed
Figure FDA0002583779700000049
Is constant and its value can only be changed at grid points; the binary variable is relaxed to the real variable and satisfied
Figure FDA00025837797000000410
Using the multi-objective targeting method, the problem (10) is transformed into a nonlinear programming problem and is represented as:
Figure FDA00025837797000000411
Figure FDA00025837797000000412
Figure FDA0002583779700000051
Figure FDA0002583779700000052
Figure FDA0002583779700000053
wherein
Figure FDA0002583779700000054
s is the current sample and the current sample,
Figure FDA0002583779700000055
predicting at sample s, x for k step statei,s|kIs composed of
Figure FDA0002583779700000056
The ith element of (1); in the formula (11), the reaction mixture,
Figure FDA0002583779700000057
is a function obtained by solving the nonlinear dynamical equation (6) by numerical integration, and the constraint (11c) ensures that the state trajectory is continuous in the whole prediction range after the solution of the problem (11), i.e. the equation 11(a)11(b)11(c)11(d)11(e), reaches convergence; the nonlinear programming problem shown in equations 11(a)11(b)11(c)11(d)11(e) can be solved efficiently by a standard nonlinear programming problem solver using a sequential quadratic programming or interior point method without using an integer optimization algorithm;
after solving equations 11(a), 11(b), 11(c), 11(d), 11(e), a summary rounding method is applied from
Figure FDA0002583779700000058
In-taking binary variables
Figure FDA0002583779700000059
The steps are as follows:
Figure FDA00025837797000000510
Figure FDA00025837797000000511
wherein j is 1, 0, M, k is 0, N-1, and the optimal composite task trajectory is formula (12a) and formula (12b), and the formula (12a) and formula (12b) have the following propositions;
if it is not
Figure FDA0002583779700000061
Is measurable and satisfies
Figure FDA0002583779700000062
Then function
Figure FDA0002583779700000063
Is converted from the formulas (12a) (12b) by using the zero order hold, and satisfies
Figure FDA0002583779700000064
Wherein
Figure FDA0002583779700000065
Satisfy the requirement of
Figure FDA0002583779700000066
CN202010677791.7A 2020-07-14 2020-07-14 Multi-agent system null space behavior control dynamic task priority planning method Active CN111882184B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202010677791.7A CN111882184B (en) 2020-07-14 2020-07-14 Multi-agent system null space behavior control dynamic task priority planning method

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202010677791.7A CN111882184B (en) 2020-07-14 2020-07-14 Multi-agent system null space behavior control dynamic task priority planning method

Publications (2)

Publication Number Publication Date
CN111882184A true CN111882184A (en) 2020-11-03
CN111882184B CN111882184B (en) 2022-10-14

Family

ID=73150276

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202010677791.7A Active CN111882184B (en) 2020-07-14 2020-07-14 Multi-agent system null space behavior control dynamic task priority planning method

Country Status (1)

Country Link
CN (1) CN111882184B (en)

Cited By (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN112965722A (en) * 2021-03-03 2021-06-15 深圳华大九天科技有限公司 Verilog-A model optimization method, electronic device and computer readable storage medium
CN113467465A (en) * 2021-07-22 2021-10-01 福州大学 Human-in-loop decision modeling and control method for robot system
CN114469642A (en) * 2022-01-20 2022-05-13 深圳华鹊景医疗科技有限公司 Rehabilitation robot control method and device and rehabilitation robot

Citations (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20120078585A1 (en) * 2010-06-29 2012-03-29 University Of Connecticut Method and system for constructing geometric skeletons and medial zones of rigid and non-rigid shapes
CN107065547A (en) * 2017-04-07 2017-08-18 西北工业大学 A kind of autonomous rendezvous strategy of noncooperative target based on kernel method
CN108237532A (en) * 2016-12-23 2018-07-03 深圳光启合众科技有限公司 The gait control method, apparatus and robot of multi-foot robot
CN108829113A (en) * 2018-09-01 2018-11-16 哈尔滨工程大学 A kind of adaptive kernel action amalgamation method of multi-robot formation
CN109079780A (en) * 2018-08-08 2018-12-25 北京理工大学 Distributed mobile mechanical arm task hierarchy optimization control method based on generalized coordinates

Patent Citations (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20120078585A1 (en) * 2010-06-29 2012-03-29 University Of Connecticut Method and system for constructing geometric skeletons and medial zones of rigid and non-rigid shapes
CN108237532A (en) * 2016-12-23 2018-07-03 深圳光启合众科技有限公司 The gait control method, apparatus and robot of multi-foot robot
CN107065547A (en) * 2017-04-07 2017-08-18 西北工业大学 A kind of autonomous rendezvous strategy of noncooperative target based on kernel method
CN109079780A (en) * 2018-08-08 2018-12-25 北京理工大学 Distributed mobile mechanical arm task hierarchy optimization control method based on generalized coordinates
CN108829113A (en) * 2018-09-01 2018-11-16 哈尔滨工程大学 A kind of adaptive kernel action amalgamation method of multi-robot formation

Non-Patent Citations (1)

* Cited by examiner, † Cited by third party
Title
宗立军等: "自由漂浮空间机器人多约束混合整数预测控制", 《宇航学报》, no. 08, 30 August 2016 (2016-08-30) *

Cited By (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN112965722A (en) * 2021-03-03 2021-06-15 深圳华大九天科技有限公司 Verilog-A model optimization method, electronic device and computer readable storage medium
CN113467465A (en) * 2021-07-22 2021-10-01 福州大学 Human-in-loop decision modeling and control method for robot system
CN114469642A (en) * 2022-01-20 2022-05-13 深圳华鹊景医疗科技有限公司 Rehabilitation robot control method and device and rehabilitation robot

Also Published As

Publication number Publication date
CN111882184B (en) 2022-10-14

Similar Documents

Publication Publication Date Title
Siddique Intelligent control: a hybrid approach based on fuzzy logic, neural networks and genetic algorithms
Antsaklis Intelligent control
Inoue et al. “Weak” control for human-in-the-loop systems
Pan et al. Adaptive robust sliding mode trajectory tracking control for 6 degree-of-freedom industrial assembly robot with disturbances
CN111882184B (en) Multi-agent system null space behavior control dynamic task priority planning method
Ebrahimi et al. Intelligent Robust Fuzzy-Parallel Optimization Control of a Continuum Robot Manipulator
Chen et al. Dynamic task priority planning for null-space behavioral control of multi-agent systems
Tan et al. Robust model-free control for redundant robotic manipulators based on zeroing neural networks activated by nonlinear functions
Xiao et al. Time-varying nonholonomic robot consensus formation using model predictive based protocol with switching topology
Zong et al. Obstacle avoidance handling and mixed integer predictive control for space robots
Mellah et al. Adaptive control of bilateral teleoperation system with compensatory neural-fuzzy controllers
Campos et al. PSO tuning for fuzzy PD+ I controller applied to a mobile robot trajectory control
Santibañez et al. Global asymptotic stability of a tracking sectorial fuzzy controller for robot manipulators
Li et al. Event-triggered-based cooperative game optimal tracking control for modular robot manipulator with constrained input
Ngo et al. Adaptive Single-Input Recurrent WCMAC-Based Supervisory Control for De-icing Robot Manipulator
Zhang et al. Trajectory-tracking control of robotic systems via deep reinforcement learning
Crenganis et al. Inverse kinematics of a 7 DOF manipulator using adaptive neuro-fuzzy inference systems
Zhang et al. A behavior-based adaptive dynamic programming method for multiple mobile manipulators coordination control
Xiang et al. Interactive natural motion planning for robot systems based on representation space
Bai et al. Adaptive Hybrid Optimization Learning-Based Accurate Motion Planning of Multi-Joint Arm
Wang et al. Event-driven collision-free path planning for cooperative robots in dynamic environment
Gorkavyy et al. Hybrid model process design of joint operator-robot interaction within a synergistic system
Chen et al. A “look-backward-and-forward” adaptation strategy for assessing parameter estimation error of human motion prediction model
Vasilopoulos et al. Technical Report: A Hierarchical Deliberative-Reactive System Architecture for Task and Motion Planning in Partially Known Environments
Brablc et al. Control of Magnetic Manipulator Using Reinforcement Learning Based on Incrementally Adapted Local Linear Models

Legal Events

Date Code Title Description
PB01 Publication
PB01 Publication
SE01 Entry into force of request for substantive examination
SE01 Entry into force of request for substantive examination
GR01 Patent grant
GR01 Patent grant