CN111596691A

CN111596691A - Decision modeling and cooperative control method and system of multi-robot system based on human-in-loop

Info

Publication number: CN111596691A
Application number: CN202010648652.1A
Authority: CN
Inventors: 黄捷; 吴文华; 王武; 齐义文; 柴琴琴; 林琼斌; 李卓敏
Original assignee: Fuzhou University
Current assignee: Fuzhou University
Priority date: 2020-07-07
Filing date: 2020-07-07
Publication date: 2020-08-28
Anticipated expiration: 2040-07-07
Also published as: CN111596691B

Abstract

The invention relates to a decision modeling and cooperative control method and system for a human-in-the-loop multi-robot system. The method comprises the following steps: acquiring the output information after the robots execute a task, and selecting the robot position deviation information as the human decision information; modeling the human decision-making behavior from that decision information using a human drift diffusion model; and designing a human decision task, which is executed when a robot cannot complete its task by relying on the autonomous control system alone, so as to help the robot complete the task. The invention combines the drift diffusion model with the null-space-based behavior control method to obtain a human drift diffusion model, derives the corresponding decision threshold formula from the speed-accuracy criterion, and can thereby improve the accuracy of human decisions.

Description

Decision modeling and cooperative control method and system of multi-robot system based on human-in-loop

Technical Field

The invention relates to the technical field of robot application, in particular to a decision modeling and cooperative control method and system of a multi-robot system based on a human-in-loop.

Background

In the past decade, multi-robot systems have received much attention because of their loosely coupled network architecture: robots can interact to solve problems that a single robot cannot. In a multi-robot system, formation control is one of the methods by which robots cooperatively execute tasks. Behavior control, one formation control technique, achieves distributed control of a multi-robot system and offers advantages such as flexible obstacle avoidance, but traditional behavior control methods cannot guarantee the stability of formation control.

Therefore, to better realize formation control and improve formation stability, human intervention needs to be introduced. Human interaction with multi-robot systems has been applied successfully with formation control methods such as leader-follower control and with human-robot interaction control frameworks, but these approaches lack an accurate model of the human.

Disclosure of Invention

In view of the above, the present invention provides a decision modeling and cooperative control method and system for a multi-robot system based on a human-in-loop, which combines a drift diffusion model with a behavior control method based on a null space to provide a human drift diffusion model, and obtains a corresponding decision threshold formula according to a speed-accuracy criterion, so that the method can improve the decision accuracy of a human.

The invention is realized by adopting the following scheme: a decision modeling and cooperative control method of a multi-robot system based on a human-in-loop specifically comprises the following steps:

acquiring an output information value after the robot executes a task, and selecting the position deviation information of the robot as the decision information of a person;

using a drift diffusion model of a person as a modeling method, and modeling the decision-making behavior of the person according to the decision-making information of the person;

and designing a human decision task, and executing the human decision task when the robot cannot rely on the autonomous control system to complete the task to help the robot to complete the task smoothly.

Further, the tasks executed by the robot comprise a task of moving to a target point and an obstacle avoidance task; the movement to the target point task is defined by the movement of the robot team to the target point, and once each robot reaches the target point, the multi-robot system stops; when an obstacle exists in the process of moving to a target point, the obstacle avoidance task aims to keep the safe distance between the robot and the obstacle, and if the distance between the robot and the obstacle is smaller than the preset safe distance, obstacle avoidance is carried out.

Further, the robot position deviation information is a deviation between an actual position of the robot and a preset position.

Further, the robot position deviation information is obtained by adopting a null-space-based behavior control method.

Further, the modeling of the human decision behavior according to the human decision information using the human drift diffusion model as a modeling method specifically includes:

the robot position deviation information is used as the human decision information; to reflect the change of the decision information per unit time, the robot speed deviation information is used as the drift rate, and the human decision behavior in the behavior-control-based human-robot interaction system is modeled as:

$$ d\tilde{p}_j(t) = \tilde{v}_j(t)\,dt + \sigma_j\,dW(t) $$

where $\tilde{p}_j$ is the position deviation of the j-th robot, $\tilde{v}_j$ is the speed deviation of the j-th robot, $W(t)$ is a standard Wiener process, and $\sigma_j$ is the standard deviation of the Wiener process; the speed deviation of the robot is the deviation between the preset speed and the actual speed of the robot;
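A drift diffusion process of this kind can be simulated numerically. The following is a minimal sketch using the Euler-Maruyama scheme, assuming a scalar deviation and a constant drift rate; the function name `simulate_ddm` and all numeric values are illustrative assumptions, not part of the patent.

```python
import numpy as np

def simulate_ddm(p0, drift, sigma, dt, n_steps, rng):
    """Euler-Maruyama path of dp = drift*dt + sigma*dW, starting at p0."""
    # Increments of a standard Wiener process are independent N(0, dt) samples.
    dW = rng.normal(0.0, np.sqrt(dt), size=n_steps)
    steps = drift * dt + sigma * dW
    # Prepend 0 so that the returned path starts exactly at p0.
    return p0 + np.concatenate(([0.0], np.cumsum(steps)))

# Hypothetical values: initial deviation 0.2, drift rate 0.5, noise 0.1.
rng = np.random.default_rng(0)
path = simulate_ddm(p0=0.2, drift=0.5, sigma=0.1, dt=0.01, n_steps=1000, rng=rng)
```

With `sigma=0` the path reduces to the deterministic line `p0 + drift * t`, which is a convenient sanity check.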

to obtain an accurate human decision time, a human decision threshold is set; the human decision information evolves continuously over time, and when it reaches the preset threshold, the human must select a behavior from the human behavior set. In the human decision threshold formula, $C_j$ is a constant gain, and the two remaining quantities are the human decision threshold and the initial position deviation.

Further, designing the human decision task specifically comprises:

when the human decision information reaches the threshold, the operator makes a decision; the designed human behavior set comprises two behaviors, supervision (monitoring) and intervention; supervision generates no subtask input for the robot, and the intervention task is designed as:

$$ v_h = J_h^{\dagger}\left(\dot{p}_{hd} + \Lambda_h \tilde{p}_h\right) $$

where $v_h$ is the speed output of the human intervention task, $J_h^{\dagger}$ is the pseudo-inverse of the Jacobian matrix of the human intervention task, $\dot{p}_{hd}$ is the derivative of the desired human intervention task function, $\Lambda_h$ is the gain of the human intervention task, and $\tilde{p}_h$ is the human intervention task deviation.

Further, when the robot cannot complete the task by relying on the autonomous control system, executing the human decision task to help the robot complete the task specifically comprises: setting the intervention task in the human decision task as the highest-priority task, projecting the robot's original autonomously executed task into the null space of the human intervention task, and finally obtaining the speed output command of the robot under human intervention:

$$ v_j = v_h + \left(I - J_h^{\dagger} J_h\right) v_{rj} $$

where $v_j$ is the speed output command of the j-th robot under human intervention, $I - J_h^{\dagger} J_h$ is the null-space projection matrix of the human intervention task, $v_h$ is the speed output of the human intervention task, and $v_{rj}$ is the speed output of the robot's autonomously executed task.
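The effect of giving the intervention task the highest priority can be checked numerically. The sketch below assumes a planar robot and a human task that constrains only the x direction; the helper names and the numbers are hypothetical. It illustrates that the autonomous output survives only in directions the intervention task leaves free, so the intervention itself is executed in full.

```python
import numpy as np

def null_space_projector(J):
    """Return N = I - pinv(J) @ J, which maps any velocity into the null space of J."""
    return np.eye(J.shape[1]) - np.linalg.pinv(J) @ J

def velocity_under_intervention(v_h, J_h, v_r):
    """v_j = v_h + N_h v_r: intervention first, autonomous output in the remaining freedom."""
    return v_h + null_space_projector(J_h) @ v_r

# Hypothetical planar case: the human task only constrains motion along x.
J_h = np.array([[1.0, 0.0]])
v_h = np.linalg.pinv(J_h) @ np.array([0.3])  # human requests 0.3 along x
v_r = np.array([-1.0, 0.5])                  # autonomous output, conflicting in x
v_j = velocity_under_intervention(v_h, J_h, v_r)
```

Here `J_h @ v_j` equals `J_h @ v_h`: the conflicting x component of the autonomous output is removed, while its y component passes through unchanged.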

The invention also provides a decision modeling and cooperative control system for a human-in-the-loop multi-robot system, which comprises a robot output information acquisition module, a human decision behavior modeling module and a cooperative task control module;

the robot output information acquisition module acquires the output information after the robot executes a task, selects the robot position deviation information as the human decision information, and transmits the human decision information to the human decision behavior modeling module;

the human decision behavior modeling module uses a drift diffusion model as the modeling method and models the human decision behavior according to the human decision information;

the cooperative task control module designs a human decision task and, when the robot cannot complete the task by relying on the autonomous control system, executes the human decision task to help the robot complete the task.

The invention also provides another decision modeling and cooperative control system based on a human-in-loop multi-robot system, which comprises a processor, a memory and a computer program stored on the memory and capable of being executed by the processor, wherein when the processor executes the computer program, the method steps as described above can be realized.

The present invention also provides a computer readable storage medium having stored thereon computer program instructions capable, when executed by a processor, of carrying out the method steps as set out above.

Compared with the prior art, the invention has the following beneficial effects. Addressing the human decision problem in human and multi-robot systems, the invention combines the traditional drift diffusion model with the null-space-based behavior control method to obtain a human drift diffusion model suited to modeling human decision behavior in such systems. To obtain an accurate decision time, it provides a formula for setting the human decision threshold in the human and multi-robot interaction system based on this model, and the human makes a decision when the human decision information reaches the threshold. When the human chooses to intervene, the intervention instruction is designed as an intervention task through the null-space-based behavior control method and given the highest priority, so that the robot can quickly identify and completely execute it.

Drawings

Fig. 1 is a schematic diagram of the method according to an embodiment of the present invention.

Fig. 2 is a track diagram of a robot in human-computer interaction according to an embodiment of the present invention.

Fig. 3 is a human decision information evolution diagram in human-computer interaction according to an embodiment of the present invention.

Detailed Description

The invention is further explained below with reference to the drawings and the embodiments.

It should be noted that the following detailed description is exemplary and is intended to provide further explanation of the disclosure. Unless defined otherwise, all technical and scientific terms used herein have the same meaning as commonly understood by one of ordinary skill in the art to which this application belongs.

It is noted that the terminology used herein is for the purpose of describing particular embodiments only and is not intended to limit the example embodiments of the present application. As used herein, the singular forms "a", "an" and "the" are intended to include the plural forms as well, unless the context clearly indicates otherwise. It should also be understood that the terms "comprises" and/or "comprising", when used in this specification, specify the presence of the stated features, steps, operations, devices, components, and/or combinations thereof.

The embodiment provides a decision modeling and cooperative control method for a human-in-the-loop multi-robot system, illustrated with three quadrotor unmanned aerial vehicles and four warning rods. The quadrotors serve as the mobile robots (hereinafter, robots), and the warning rods simulate obstacles in the environment (hereinafter, obstacles). Each quadrotor is equipped with a GPS positioner and sensors. The method specifically comprises the following steps:

In this embodiment, the tasks executed by the robot include a move-to-target task and an obstacle avoidance task. The move-to-target task is defined as the robot team moving to the target point; once every robot reaches its target point, the multi-robot system stops. The move-to-target task function is therefore designed in terms of the robot position. When an obstacle is present on the way to the target point, the obstacle avoidance task aims to keep a safe distance between the robot and the obstacle. Denoting the safe distance by D, the obstacle avoidance task function is designed by comparing the distance between the robot and the obstacle with D in real time; if the distance is smaller than the preset safe distance, obstacle avoidance is carried out.

In this embodiment, the robot position deviation information is the deviation between the actual position of the robot and a preset position. In the behavior-control-based multi-robot system, when a robot executes the move-to-target task and the obstacle avoidance task, the feedback output by the robot tasks includes robot position, robot speed, robot position deviation, robot speed deviation, the distance between the robot and an obstacle, and so on. This feedback must be classified before it is used to select the human decision information. Some of it directly reflects task progress, such as the robot position deviation (the deviation between the actual and preset positions), while the rest, such as the distance between the robot and an obstacle, does not. The robot position deviation information is therefore selected as the human decision information.

In this embodiment, the robot position deviation information is obtained by the null-space-based behavior control method. Using this method, the robot's behavior of moving to a target point and its obstacle avoidance behavior are each designed as tasks. The move-to-target behavior depends on the robot position; its desired task function is designed as the robot's target point, and the robot stops once it reaches the target point. The move-to-target task is therefore designed as:

$$ v_{mj} = J_{mj}^{\dagger}\left(\dot{p}_{mdj} + \Lambda_{mj}\left(p_{mdj} - p_{mj}\right)\right) $$

where $v_{mj}$ is the speed output of the move-to-target task of the j-th robot, $J_{mj}^{\dagger}$ is the pseudo-inverse of the Jacobian matrix of the move-to-target task, $\Lambda_{mj}$ is the gain of the move-to-target task, $\dot{p}_{mdj}$ is the derivative of the desired move-to-target task function, $p_{mdj}$ is the desired move-to-target task function, and $p_{mj}$ is the move-to-target task function.
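The move-to-target task can be sketched numerically as a closed-loop inverse kinematics step. The example below assumes the task function is the robot position itself, so the Jacobian is the identity, and that the target is stationary, so the derivative of the desired task function is zero; `task_velocity` and the numeric values are illustrative assumptions.

```python
import numpy as np

def task_velocity(J, p_des, p, gain, dp_des=None):
    """Task velocity v = pinv(J) @ (dp_des + gain @ (p_des - p))."""
    p_des = np.atleast_1d(p_des).astype(float)
    p = np.atleast_1d(p).astype(float)
    if dp_des is None:
        dp_des = np.zeros_like(p_des)  # stationary target: desired derivative is zero
    return np.linalg.pinv(J) @ (dp_des + gain @ (p_des - p))

# Assumed setup: task function = robot position, hence J = I.
position = np.array([0.0, 0.0, 1.0])
target = np.array([2.0, 1.0, 1.0])
v_m = task_velocity(np.eye(3), target, position, gain=0.5 * np.eye(3))
```

The output velocity points from the robot to the target and vanishes when the two coincide, which matches the requirement that the robot stops on arrival.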

The obstacle avoidance behavior depends on the distance between the robot and the obstacle. A safe obstacle avoidance distance is set for the robot, and the obstacle avoidance task is executed when the distance between the robot and the obstacle falls below it. The obstacle avoidance task is designed as:

$$ v_{aj} = J_{aj}^{\dagger}\,\Lambda_{aj}\left(p_{adj} - p_{aj}\right) $$

where $v_{aj}$ is the speed output of the obstacle avoidance task of the j-th robot, $J_{aj}^{\dagger}$ is the pseudo-inverse of the Jacobian matrix of the obstacle avoidance task, $\Lambda_{aj}$ is the gain of the obstacle avoidance task, $p_{adj}$ is the desired obstacle avoidance task function, and $p_{aj}$ is the obstacle avoidance task function.
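One common way to realize such an obstacle avoidance task, shown here as a sketch under assumptions rather than the patent's exact design, takes the task function to be the robot-obstacle distance and its desired value to be the safe distance. The Jacobian is then the unit vector pointing from the obstacle to the robot, and the task is active only inside the safe distance. The function name and values are hypothetical.

```python
import numpy as np

def obstacle_avoidance_velocity(p, p_obs, safe_dist, gain):
    """Return the avoidance velocity, or zeros when the obstacle is outside safe_dist."""
    offset = p - p_obs
    dist = np.linalg.norm(offset)
    if dist >= safe_dist or dist == 0.0:
        return np.zeros_like(p)           # task inactive, or direction undefined
    J_a = (offset / dist).reshape(1, -1)  # gradient of the distance w.r.t. the position
    return np.linalg.pinv(J_a) @ np.array([gain * (safe_dist - dist)])

p = np.array([1.0, 0.0])
p_obs = np.array([0.0, 0.0])
v_a = obstacle_avoidance_velocity(p, p_obs, safe_dist=2.0, gain=1.0)
v_far = obstacle_avoidance_velocity(np.array([3.0, 0.0]), p_obs, safe_dist=2.0, gain=1.0)
```

In this example the robot is 1 unit from the obstacle with a safe distance of 2, so the task pushes it directly away from the obstacle; at 3 units the task produces no output.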

As shown in Fig. 1, the outputs of the move-to-target task and the obstacle avoidance task are fused. Because robot safety is paramount, the obstacle avoidance task is set as the high-priority task and the move-to-target task as the lower-priority task; the output of the move-to-target task is projected onto the null space of the obstacle avoidance task, giving the total output of the robot task:

$$ v_{rj} = v_{aj} + \left(I - J_{aj}^{\dagger} J_{aj}\right) v_{mj} $$

where $v_{rj}$ is the total speed output of the autonomous tasks of the j-th robot, and $I - J_{aj}^{\dagger} J_{aj}$ is the null-space projection matrix of the obstacle avoidance task.
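The priority-based fusion of the two autonomous tasks can be illustrated with a small planar example. This is a sketch with hypothetical values: the obstacle lies along the x axis, so the avoidance task owns the x direction and the move-to-target output keeps only its component tangential to the obstacle.

```python
import numpy as np

def merge_by_priority(v_high, J_high, v_low):
    """v = v_high + (I - pinv(J_high) @ J_high) @ v_low."""
    N = np.eye(J_high.shape[1]) - np.linalg.pinv(J_high) @ J_high
    return v_high + N @ v_low

# Hypothetical planar case: the obstacle is ahead of the robot along +x.
J_a = np.array([[-1.0, 0.0]])  # unit vector from the obstacle to the robot
v_a = np.array([-0.4, 0.0])    # avoidance output: back away from the obstacle
v_m = np.array([1.0, 1.0])     # move-to-target output: heads into the obstacle
v_r = merge_by_priority(v_a, J_a, v_m)
```

The x component of the move-to-target velocity, which would drive the robot into the obstacle, is discarded, and the robot slides past along y while backing off.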

In this embodiment, modeling the human decision behavior according to the human decision information, using the human drift diffusion model as the modeling method, specifically includes:

combining the traditional drift diffusion model with the null-space-based behavior control method to establish a human drift diffusion model, in which the robot position deviation information obtained by the null-space-based behavior control method is the human decision information and the robot speed deviation information is the drift rate, reflecting the change of the decision information per unit time; the human decision behavior in the behavior-control-based human-robot interaction system is modeled as:

$$ d\tilde{p}_j(t) = \tilde{v}_j(t)\,dt + \sigma_j\,dW(t) $$

where $\tilde{p}_j$ is the position deviation of the j-th robot, $\tilde{v}_j$ is the speed deviation of the j-th robot, $W(t)$ is a standard Wiener process, and $\sigma_j$ is the standard deviation of the Wiener process;

to obtain an accurate human decision time, the human decision threshold formula is derived from the Bayes risk speed-accuracy criterion by minimizing the cost incurred by the human decision, so that speed and accuracy are jointly optimal. The cost function is:

$$ B = c_{1j} T_j + c_{2j} E_j $$

where $c_{1j}$ is the cost associated with the human decision time, $c_{2j}$ is the cost of a human decision error, $T_j$ is the decision time, and $E_j$ is the decision error. In the corresponding expression, the remaining symbol denotes the initial position deviation.
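The trade-off expressed by this cost function can be explored numerically. The sketch below is an illustration under stated assumptions, not the patent's derivation: it uses the textbook closed forms for a symmetric two-threshold drift diffusion model with an unbiased starting point (mean decision time $(z/A)\tanh(Az/\sigma^2)$ and error rate $1/(1+e^{2Az/\sigma^2})$) and finds the threshold $z$ that minimizes $B$ by grid search. The drift, noise and cost values are hypothetical.

```python
import numpy as np

def bayes_risk(z, drift, sigma, c1, c2):
    """B = c1*T + c2*E for thresholds at +/-z and an unbiased starting point."""
    k = drift * z / sigma**2
    mean_time = (z / drift) * np.tanh(k)        # mean decision time
    error_rate = 1.0 / (1.0 + np.exp(2.0 * k))  # probability of reaching the wrong bound
    return c1 * mean_time + c2 * error_rate

# Grid search over candidate thresholds (all values are assumptions).
thresholds = np.linspace(0.01, 2.0, 400)
risks = bayes_risk(thresholds, drift=0.5, sigma=1.0, c1=1.0, c2=10.0)
z_best = thresholds[np.argmin(risks)]
```

A low threshold gives fast but error-prone decisions and a high threshold gives accurate but slow ones, so the minimum of `risks` lies strictly inside the search interval for these values.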

A human decision threshold is then set. The human decision information evolves continuously over time, and when it reaches the preset threshold, the human must select a behavior from the human behavior set. In the human decision threshold formula, $C_j$ is a constant gain, and the remaining quantity defined by the formula is the human decision threshold.
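The moment at which the decision information reaches the threshold can be detected from a sampled trace. The following minimal sketch assumes the threshold has already been computed and is passed in as a number; the trace, the threshold value and the sampling period are hypothetical.

```python
import numpy as np

def first_crossing_time(deviation, threshold, dt):
    """Return the first time at which deviation >= threshold, or None if it never does."""
    hits = np.flatnonzero(np.asarray(deviation) >= threshold)
    return None if hits.size == 0 else hits[0] * dt

# Hypothetical trace of the decision information, sampled every 0.5 s.
deviation = [0.1, 0.15, 0.2, 0.4, 0.7, 0.9]
t_decide = first_crossing_time(deviation, threshold=0.75, dt=0.5)
never = first_crossing_time(deviation, threshold=2.0, dt=0.5)
```

At `t_decide` the operator would be prompted to choose between supervision and intervention; a result of `None` means the robot never needed a human decision during the trace.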

In this embodiment, designing the human decision task specifically includes:

When the human decision information reaches the threshold, the operator makes a decision, so a human behavior set must be designed. In this embodiment the set comprises a supervision behavior and an intervention behavior; since supervision generates no task input to the robot, a task is designed only for the intervention behavior. As shown in Fig. 1, the human intervention task is set as the highest-priority task, and the speed output of the robot tasks is projected onto the null space of the human intervention task to ensure that the intervention task is executed completely. In the example of Fig. 2, robot 2 finds a new obstacle while avoiding obstacle 2. Its distance to obstacle 2 equals its distance to the newly found obstacle, so robot 2 is trapped at a local extremum and cannot escape by relying on the autonomous control system. As Fig. 3 shows, while the robot is trapped the position deviation information keeps evolving until it reaches the decision threshold, at which point the human must decide and selects the intervention behavior. While supervising the task execution, the human observes that the gap between obstacle 2 and the newly found obstacle is wider than the body of robot 2, so the human intervention task is designed as moving to a new target point:

$$ v_h = J_h^{\dagger}\left(\dot{p}_{hd} + \Lambda_h \tilde{p}_h\right) $$

where $v_h$ is the speed output of the human intervention task, $J_h^{\dagger}$ is the pseudo-inverse of the Jacobian matrix of the human intervention task, $\dot{p}_{hd}$ is the derivative of the desired human intervention task function, $\Lambda_h$ is the gain of the human intervention task, and $\tilde{p}_h$ is the human intervention task deviation.

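In this embodiment, when the robot cannot complete a task by relying on the autonomous control system, executing the human decision task to help the robot complete the task specifically includes the following. The designed human intervention task has the same form as the robot's autonomously executed tasks, so the robot can quickly identify and execute it. The intervention task in the human decision task is set as the highest-priority task, the robot's original autonomously executed task is projected into the null space of the human intervention task, and the speed output command of the robot under human intervention is finally obtained:

$$ v_j = v_h + \left(I - J_h^{\dagger} J_h\right) v_{rj} $$

where $v_j$ is the speed output command of the j-th robot under human intervention, $I - J_h^{\dagger} J_h$ is the null-space projection matrix of the human intervention task, $v_h$ is the speed output of the human intervention task, and $v_{rj}$ is the speed output of the robot's autonomously executed task.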

The embodiment also provides a decision modeling and cooperative control system for a human-in-the-loop multi-robot system, which comprises a robot output information acquisition module, a human decision behavior modeling module and a cooperative task control module.

The present embodiment also provides another decision modeling and cooperative control system based on a human-in-loop multi-robot system, which includes a processor, a memory, and a computer program stored in the memory and capable of being executed by the processor, and when the processor executes the computer program, the method steps as described above can be implemented.

The present embodiments also provide a computer readable storage medium having stored thereon computer program instructions capable, when executed by a processor, of carrying out the method steps as set out above.

As will be appreciated by one skilled in the art, embodiments of the present application may be provided as a method, system, or computer program product. Accordingly, the present application may take the form of an entirely hardware embodiment, an entirely software embodiment or an embodiment combining software and hardware aspects. Furthermore, the present application may take the form of a computer program product embodied on one or more computer-usable storage media (including, but not limited to, disk storage, CD-ROM, optical storage, and the like) having computer-usable program code embodied therein.

The present application is described with reference to flowchart illustrations and/or block diagrams of methods, apparatus (systems), and computer program products according to embodiments of the application. It will be understood that each flow and/or block of the flow diagrams and/or block diagrams, and combinations of flows and/or blocks in the flow diagrams and/or block diagrams, can be implemented by computer program instructions. These computer program instructions may be provided to a processor of a general purpose computer, special purpose computer, embedded processor, or other programmable data processing apparatus to produce a machine, such that the instructions, which execute via the processor of the computer or other programmable data processing apparatus, create means for implementing the functions specified in the flowchart flow or flows and/or block diagram block or blocks.

These computer program instructions may also be stored in a computer-readable memory that can direct a computer or other programmable data processing apparatus to function in a particular manner, such that the instructions stored in the computer-readable memory produce an article of manufacture including instruction means which implement the function specified in the flowchart flow or flows and/or block diagram block or blocks.

These computer program instructions may also be loaded onto a computer or other programmable data processing apparatus to cause a series of operational steps to be performed on the computer or other programmable apparatus to produce a computer implemented process such that the instructions which execute on the computer or other programmable apparatus provide steps for implementing the functions specified in the flowchart flow or flows and/or block diagram block or blocks.

The foregoing describes preferred embodiments of the present invention. Other and further embodiments may be devised without departing from the basic scope of the invention, which is determined by the claims that follow. Any simple modification or equivalent change made to the above embodiments according to the technical essence of the present invention falls within the protection scope of the technical solution of the present invention.

Claims

1. A decision modeling and cooperative control method based on a multi-robot system with a person in a loop is characterized by comprising the following steps:

2. The method of claim 1, wherein the tasks performed by the robots include a move to target task and an obstacle avoidance task; the movement to the target point task is defined by the movement of the robot team to the target point, and once each robot reaches the respective target point, the multi-robot system stops; when an obstacle exists in the process of moving to a target point, the obstacle avoidance task aims to keep the safe distance between the robot and the obstacle, and if the distance between the robot and the obstacle is smaller than the preset safe distance, obstacle avoidance is carried out.

3. The method as claimed in claim 1, wherein the robot position deviation information is a deviation between an actual position and a preset position of the robot.

4. The method as claimed in claim 3, wherein the robot position deviation information is obtained by a null-space-based behavior control method.

5. The method as claimed in claim 1, wherein the modeling of the human decision behavior based on the human decision information using a drift diffusion model as a modeling method is specifically:

in the formula, the decision variable is the position deviation of the j-th robot;

in the human decision threshold formula, $C_j$ is a constant gain, and the two remaining quantities are the human decision threshold and the initial position deviation.

6. The method for decision modeling and cooperative control of a multi-robot system based on a human-in-loop as claimed in claim 1, wherein designing the human decision task specifically comprises:

$$ v_h = J_h^{\dagger}\left(\dot{p}_{hd} + \Lambda_h \tilde{p}_h\right) $$

where $v_h$ is the speed output of the human intervention task, $J_h^{\dagger}$ is the pseudo-inverse of the Jacobian matrix of the human intervention task, $\dot{p}_{hd}$ is the derivative of the desired human intervention task function, $\Lambda_h$ is the gain of the human intervention task, and $\tilde{p}_h$ is the human intervention task deviation.

7. The method as claimed in claim 1, wherein the human decision task is executed to help the robot to complete the task smoothly when the robot cannot rely on the autonomous control system to complete the task. The method specifically comprises the following steps: setting an intervention task in a human decision task as a highest priority task, projecting an original robot autonomously executing task into a null space of the human intervention task, and finally obtaining a speed output instruction of the robot under human intervention:

8. A decision modeling and cooperative control system of a multi-robot system based on a human-in-loop is characterized by comprising a robot output information acquisition module, a human decision behavior modeling module and a cooperative task control module,

9. A decision modeling and cooperative control system for a human-in-loop based multi-robot system, comprising a processor, a memory and a computer program stored on the memory and executable by the processor, the computer program, when executed by the processor, being capable of performing the method steps according to any of claims 1-7.

10. A computer-readable storage medium, having stored thereon computer program instructions capable, when executed by a processor, of carrying out the method steps of any one of claims 1 to 7.