CN117270393A

CN117270393A - Intelligent robot cluster cooperative control system

Info

Publication number: CN117270393A
Application number: CN202311296605.5A
Authority: CN
Inventors: 邹应全; 龚宇瑶; 谭文新; 汪洋生
Original assignee: Chongqing University
Current assignee: Chongqing University
Priority date: 2023-10-07
Filing date: 2023-10-07
Publication date: 2023-12-22
Anticipated expiration: 2043-10-07
Also published as: CN117270393B

Abstract

The invention discloses an intelligent robot cluster cooperative control system, which relates to the technical field of robots, and comprises: a robot data acquisition section and a control section; the data acquisition part is configured to acquire the state and control input of each robot in the robot cluster at each moment; the states include: robot position data, robot motion gesture data, and robot sensor readings; the control inputs include: robot target position data and target motion attitude data of the robot; the control section includes: a cooperative control section and a collision control section; the cooperative control part generates a cooperative control strategy based on a cooperative effect so that a control rewarding objective function of the whole robot cluster is maximum; the collision control part predicts the collision of the robots in the robot cluster so as to avoid the occurrence of the collision of the robots. The invention realizes the efficient multi-robot cooperative work and control and improves the task completion efficiency, safety and flexibility.

Description

Intelligent robot cluster cooperative control system

Technical Field

The invention relates to the technical field of robots, in particular to an intelligent robot cluster cooperative control system.

Background

The robot technology has been greatly developed, and is widely applied not only in the fields of manufacturing industry, medical field, military application and the like, but also in the daily life. However, as the number and variety of robots are increasing, the problems of cooperative work and control between robots are becoming more prominent. The traditional robot control method and the cooperative rules are difficult to cope with the complicated robot cluster cooperative control requirement, so that a new robot cooperative control system is necessary to be researched.

Over the past decades, significant advances in robotics have been made, yielding a wide variety of single robot control systems. These systems typically employ conventional PID controllers, feedback control, and the like, for controlling the motion and behavior of the individual robots. These systems perform well in certain tasks such as assembly robots on a production line, robotic assisted surgery in surgery, and the like. However, when multiple robots are required to work cooperatively, conventional single robot control systems are not attractive.

As robotics advances, people began to study the collaborative work of robot clusters. The robot cluster can show surprise potential in various application scenes, such as search rescue, environment monitoring, agricultural automation and the like. However, to achieve efficient collaborative work of robot clusters, a series of technical challenges are faced. Coordinated control of the robot clusters is a complex problem. Conventional centralized control methods typically require a central controller to coordinate the movements of the robots, which can result in single point failures and communication bottlenecks. The decentralized approach can alleviate these problems, but the synergy between robots is still limited. Collision avoidance is a critical issue when multiple robots are moving in a limited space. Existing collision avoidance algorithms are typically based on static maps or sensor data, but these methods often fail to cope with complex interactions inside dynamic environments or robot clusters. The effective coordination rule is designed to be the core of the coordination work of the robot clusters. The existing rules are often based on heuristic methods, and lack theoretical support and expandability. Thus, there is a need for more intelligent and adaptive collaborative rules.

Disclosure of Invention

The invention aims to provide an intelligent robot cluster cooperative control system, which realizes efficient multi-robot cooperative work and control by introducing advanced technologies such as distributed cooperative control, collision prediction and avoidance, intelligent cooperative rules and the like, and improves task completion efficiency, safety and flexibility.

In order to solve the technical problems, the invention provides an intelligent robot cluster cooperative control system, which comprises: a robot data acquisition section and a control section; the data acquisition part is configured to acquire the state and control input of each robot in the robot cluster at each moment; the states include: robot position data, robot motion gesture data, and robot sensor readings; the control inputs include: robot target position data and target motion attitude data of the robot; the control section includes: a cooperative control section and a collision control section; the cooperative control part is configured to generate a cooperative control strategy based on the state and control input of all robots in the robot cluster and based on a cooperative effect, so that a control rewarding objective function of the whole robot cluster is maximum; the collision control part is configured to predict the collision of the robots in the robot cluster on the basis of the cooperative control strategy so as to avoid the occurrence of the collision of the robots.

Further, the executing process of the cooperative control section includes: setting the number of robots as N, and controlling the time step as t; the state of the robot i at the time t is S _i (t); control input U of robot i at time t _i (t); in order to make the control reward objective function of the whole robot cluster be the maximum value, the control reward objective function of each robot is the maximum value; in state S _i (t) and control input U _i (t) as an influence factor, constructing a control reward objective function of the robot i as R _i (t); at R _i (t) control input U for each time step under constraint of maximum value _i (t) optimizing to obtain an optimized control input U _i (t) _Op The method comprises the steps of carrying out a first treatment on the surface of the Setting a cooperative rule F (t), and inputting optimal control U of all robots _i (t) _op Overall input U combined into a cluster _cluster (t); using global input U _cluster (t) controlling the motion of each robot in the robot cluster.

Further, the objective function R _i (t) is expressed using the following formula

Wherein d _ij (t) is the distance of the robot i from the target position calculated from the robot position data and the robot target position data;the motion gesture data of the robot i; />Target action gesture data of the robot i; ρ (t) represents a cooperative parameter of the cluster, which is a set value, and the value of the cooperative parameter may be different in each time step; θ _i (t) represents the response time of robot i at each time step; sigma (sigma) _i (t) represents the sensor readings of robot i; t is t ₀ Is the current moment;

further, at R _i (t) control input U for each time step under constraint of maximum value _i (t) optimizing to obtain an optimized control input U _i (t) _op The method of (1) comprises:

wherein α is a first learning rate;is a laplace operator.

Further, the control of all robots is input U using the following formula in conjunction with the rule F (t) _i (t) overall input U combined into clusters _cluster (t)：

U _cluster (t)＝F(t,U ₁ (t) _op ,U ₂ (t) _op ,…,U _N (t) _op )。

Further, the collaboration rule F (t) is expressed using the following formula:

where lambda is a hyper-parameter that balances the control reward objective function and regularization term,the complexity regularization term is represented, and the complexity of the collaborative rule is described as a set value.

Further, the collision control part performs collision prediction on the robots in the robot cluster based on the cooperative control strategy, and the method comprises the following steps: generating a position vector for each robot from the robot position dataWhere i denotes the number of the robot, and 0 denotes the initial time; the collision objective function is determined as:

wherein J (P) ^(t) ) Is the collision objective function, P ^(t) Is the position vector of the robot cluster at the time t, w _i Is the weight of robot i, f _i Is a local objective function of the robot i,is a global target position vector; minimizing the objective function J (P ^(t) ) In order to find an optimal robot position, wherein the optimal robot position of robot i isDetermining a perception range of each robot; according to the sensing range, in the motion process of the robot, determining whether other robots exist in the sensing range, if so, comparing whether the two robots overlap with the optimal robot positions of the other robots in the sensing range, and if so, adjusting the second learning rate B to redetermine the optimal robot positions of the robots.

Further, the objective function J (P ^(t) )：

Wherein B is the second learning rate, μJ _i (P ^(t) ) Is the gradient of the collision objective function for robot i.

Further, the sensing range of the robot is determined by the following formula:

wherein,a communication direction vector indicating the communication direction vector between the robot i and the robot m at time t; d (D) _mi The distance between the robot i and the robot m at the time t is shown; />Is the perception range vector of the robot i; by calculating->And obtaining the perception range.

Further, the communication direction vector is calculated by the following formula:

wherein D is _im Represents the distance between robot i and robot m, D _max Is the maximum communication range of the robot i.

The intelligent robot cluster cooperative control system has the following beneficial effects: the invention introduces an advanced distributed cooperative control method, and allows each robot to make intelligent decisions according to the state and the surrounding environment. This reduces the reliance on the central controller and improves the collaborative efficiency of the robot cluster. Each robot can respond to task demands independently, thereby achieving a higher level of collaborative work. The invention comprises a collision control part, which can predict the collision and take measures to avoid the collision inside the robot cluster based on an advanced collision prediction algorithm. This greatly improves the safety and stability of the robot work, enabling the robots to work cooperatively in crowded and complex environments without collisions. The invention introduces an intelligent collaborative rule design method, and the rules are not only based on a heuristic method, but also combine machine learning and optimization technology. This means that the robot clusters can intelligently adjust the coordination rules according to task requirements and environmental conditions, and achieve more flexible and adaptive coordination work. The system design of the invention has high expandability and can adapt to robot clusters of different scales and types. The invention can effectively realize cooperative work and control whether a small-scale robot team or a large-scale robot group.

Drawings

In order to more clearly illustrate the embodiments of the present invention or the technical solutions in the prior art, the drawings that are required to be used in the embodiments or the description of the prior art will be briefly described below, and it is obvious that the drawings in the following description are only embodiments of the present invention, and that other drawings can be obtained according to the provided drawings without inventive effort for a person skilled in the art.

Fig. 1 is a schematic system structure diagram of an intelligent robot cluster cooperative control system according to an embodiment of the present invention.

Detailed Description

For the purpose of making the objects, technical solutions and advantages of the embodiments of the present invention more apparent, the technical solutions of the embodiments of the present invention will be clearly and completely described below with reference to the accompanying drawings in the embodiments of the present invention, and it is apparent that the described embodiments are some embodiments of the present invention, but not all embodiments of the present invention. All other embodiments, which can be made by those skilled in the art based on the embodiments of the invention without making any inventive effort, are intended to be within the scope of the invention.

Example 1: referring to fig. 1, an intelligent robot cluster cooperative control system, the system comprising: a robot data acquisition section and a control section; the data acquisition part is configured to acquire the state and control input of each robot in the robot cluster at each moment; the states include: robot position data, robot motion gesture data, and robot sensor readings; the control inputs include: robot target position data and target motion attitude data of the robot; the control section includes: a cooperative control section and a collision control section; the cooperative control part is configured to generate a cooperative control strategy based on the state and control input of all robots in the robot cluster and based on a cooperative effect, so that a control rewarding objective function of the whole robot cluster is maximum; the collision control part is configured to predict the collision of the robots in the robot cluster on the basis of the cooperative control strategy so as to avoid the occurrence of the collision of the robots.

In particular, the robot-based data acquisition portion needs to acquire and monitor information in the environment in real time in order to make decisions and perform tasks. This part typically relies on sensor technology and data transmission protocols. Robots are typically equipped with various sensors, such as lidar, cameras, inertial Measurement Units (IMUs), ultrasonic sensors, and the like. These sensors may measure environmental information around the robot, including obstacle position, terrain, self-position, speed, attitude, and properties of surrounding objects, etc. The data acquired from the sensors needs to be transmitted to the control system for processing. Standard data transmission protocols, such as Wi-Fi, bluetooth, ROS (robot operating system), etc., are typically used to ensure real-time and reliability of the data.

The data acquisition part is used for providing key information required by a robot control system so that the robot can sense and understand the surrounding environment, execute tasks and work together with other robots. The data acquired by the sensor, including the position, motion gesture, sensor reading, etc. of the robot, allows the robot to perceive its own state in real time. For example, the robot may scan the surrounding environment using lidar to obtain the position and shape of the obstacle. The data acquisition portion helps the robot understand the topology and characteristics of the surrounding environment. By analyzing the sensor data, the robot can detect and identify obstacles, target locations, topographical features, and the like. Control inputs to the robot, such as target position and target motion profile data, are typically set by an external system or user. These target data may be transmitted to the robot control system through the data acquisition section to instruct the robot to perform a task. The information of the data acquisition part is not only applicable to a single robot, but also is important for the cooperative work of the robot clusters. By sharing state information, robots can coordinate actions, avoid collisions, and achieve a common goal. The data acquisition section also contributes to improvement of the safety of the robot. For example, with sensor data, the robot may detect a potential collision risk, taking measures to avoid the collision.

Robot-based clusters of cooperative control sections require coordinated actions to achieve a common goal. This requires comprehensive analysis of the robot state information in the cluster and generation of a cooperative control strategy to maximize the control reward objective function. The cooperative control section gathers status information from all robots, including position, attitude, sensor readings, etc. The information is fused and integrated to form a global view of the entire cluster state. Based on the fused state information, the cooperative control section needs to establish a cooperative effect model, that is, interactions and influences between robots. This may include avoiding collisions, collaborative searches, task division, and so forth. The cooperative control section generates a control strategy to maximize the control reward objective function of the entire robot cluster using an optimal control theory or other related method.

The collision control section is responsible for detecting collision risk between robots or with the surrounding environment. It may use various collision detection algorithms such as geometric collision detection or motion collision detection. If the collision control section detects a potential collision, it takes action to avoid the collision, typically by adjusting the trajectory or speed of the robot.

Example 2: on the basis of the above embodiment, the execution process of the cooperative control section includes: setting the number of robots as N, and controlling the time step as t; the state of the robot i at the time t is S _i (t); control input U of robot i at time t _i (t); in order to make the control reward objective function of the whole robot cluster be the maximum value, the control reward objective function of each robot is the maximum value; in state S _i (t) and control input U _i (t) as an influence factor, constructing a control reward objective function of the robot i as R _i (t); at R _i (t) control input U for each time step under constraint of maximum value _i (t) optimizing to obtain an optimized control input U _i (t) _op The method comprises the steps of carrying out a first treatment on the surface of the Setting a cooperative rule F (t), and inputting optimal control U of all robots _i (t) _op Overall input U combined into a cluster _cluster (t); using global input U _cluster (t) controlling the motion of each robot in the robot cluster.

Specifically, the cooperative control is to generate a cooperative control strategy through a cooperative effect based on the state and the control input of the robot cluster so as to maximize the control reward objective function of the whole robot cluster. Cooperative control ensures that the robot clusters can cooperate in an efficient manner. By considering the status and goals of each robot in the cluster, the system can intelligently allocate tasks and resources to maximize the overall performance of the cluster. The control reward objective function is a mathematical function that is used to quantify the performance of the robot cluster. The goal of each robot is to maximize its control prizeThe objective function is excited. The function is used for providing a definite performance index, so that the robot can make optimal decisions at each time step to achieve the aim of cluster cooperative work. The robot optimizes its control inputs at each time step to maximize its control reward objective function. This may involve the use of mathematical optimization methods such as gradient descent or model predictive control. By optimizing the control inputs, the robot can make optimal decisions in a constantly changing environment to achieve its individual control goals and work in concert with other robots. A collaborative rule is a set of rules or algorithms for combining the optimal control inputs of all robots to form an overall input for the cluster. This may include task allocation, path planning, obstacle avoidance strategies, etc. The role of the coordination rules is to ensure that the individual actions in the robot cluster are coordinated to achieve a control reward objective function maximization for the whole cluster. This helps to avoid collisions and collisions, as well as to improve efficiency. Using global input U _cluster (t) controlling the movement of each robot in the robot cluster to ensure that they operate as required by the collaborative rules. By executing the overall control, the robot clusters can work cooperatively in the actual environment to reach the predetermined task objective.

Example 3: on the basis of the above embodiment, the objective function R _i (t) is expressed using the following formula

Wherein d _ij (t) is the distance of the robot i from the target position calculated from the robot position data and the robot target position data;the motion gesture data of the robot i; />Target action gesture data of the robot i; ρ (t) represents the synergy of the clustersThe sexual parameter is a set value, and the value of the cooperative parameter can be different in each time step; θ _i (t) represents the response time of robot i at each time step; sigma (sigma) _i (t) represents the sensor readings of robot i; t is t ₀ Is the current time.

In particular, the method comprises the steps of,the items: consider the square of the inverse of the distance between robot i and the target location. The square of the reciprocal means that the closer the distance, the larger the value of this term, and the farther the distance, the smaller the value. The function of this term is to encourage the robot to approach its target position. It shows that the closer the robot is to the target location, the greater its contribution to the target function, thereby encouraging the robot to quickly approach the target. />The items: this term considers the ratio of the motion gesture of robot i to the target motion gesture. The more consistent the robot's motion is with the target motion, the higher the value of this term. The function of this term is to encourage the robot's motion to coincide with the target motion. It reflects whether the robot operates according to the desired action to achieve the synergy of the task.The items: this term includes the response time of robot i at each time step divided by the total number of robots N. It takes into account the effect of response time on the robot performance. The effect of this term is to take into account the response speed of the robot. If the robot responds to the task faster, the value of this term is greater, encouraging the robot to respond to the task faster. />The items: this term takes into account the square of the inverse of the sensor reading of robot i, characterizing the accuracy of the sensor. The square of the reciprocal means that the more accurate the sensor reading, the greater the value of this term. The effect of this term is that the bonus robot is equipped with high accuracyTo improve the quality of the environmental perception. />The items: this term represents the synergy parameter p (t) of the cluster divided by the total number of robots N. The synergy parameter is used to account for synergy between robots in the cluster. The function of this term is to measure the degree of synergy between robots in a cluster. The stronger the synergy between robots, the greater the value of this term, thereby encouraging better synergy.

This objective function is used to quantify the performance of robot i at time t. It consists of a number of items, each taking into account different aspects of the robot behaviour, including distance from the target, gesture of action, sensor readings etc. The goal is to maximize R _i (t). The purpose of this objective function is to provide a comprehensive performance metric for each robot so that the system can optimize the robot's actions during cooperative control to maximize the overall cluster performance.

Example 4: on the basis of the above embodiment, R is _i (t) control input U for each time step under constraint of maximum value _i (t) optimizing to obtain an optimized control input U _i (t) _op The method of (1) comprises:

wherein α is a first learning rate;is a laplace operator.

Specifically, the learning rate α is a super-parameter for controlling the optimization step. It determines the adjustment amplitude of the control input in each optimization update. The learning rate serves to balance the update rate and stability of the control input. A smaller learning rate may result in smaller control input adjustments that help maintain stability, but may require more iterations to converge. A larger learning rate may lead toFaster convergence, but possibly instability, requires careful selection. Gradient ofIs an objective function R _i (t) relative to the control input U _i Rate of change of (t). It tells us how the objective function will change if the control input is adjusted slightly. The role of the gradient is to direct the optimization algorithm to update the control input in a direction that maximizes the objective function. If the gradient is positive, increasing the control input will increase the value of the objective function and vice versa. Laplacian>Is a control input U _i (t) a second spatial derivative of (c). It describes the curvature and rate of change of the control input. The effect of the laplace operator is to take into account the smoothness of the control input. It helps to reduce drastic changes, make the control input smoother, and helps control stability. Optimizing control input U _i (t) _op By at gradient->And Laplacian>Under the influence of (a) adjusting the original control input U at a learning rate alpha _i A new control input of (t). The effect of this optimization procedure is to cause the control input U _i (t) more adaptive objective function R _i (t) to satisfy as much as possible the constraint condition that maximizes the objective function. It may improve the control strategy of the robot by iterating constantly to optimize performance.

In general, this formula describes a gradient and laplace operator based optimization method for adjusting the control input of the robot at each time step to meet the objective of maximizing the objective function under constraint. The learning rate and the choice of operators depend on the nature of the optimization and the characteristics of the objective function, which need to be carefully adjusted and experimentally determined to ensure the effectiveness and stability of the optimization.

Example 5: on the basis of the above embodiment, the control of all robots is input U in accordance with the rule F (t) using the following formula _i (t) overall input U combined into clusters _cluster (t)：

U _cluster (t)＝F(t,U ₁ (t) _op ,U ₂ (t) _op ,…,U _N (t) _op )。

Specifically, the coordination rule F (t) is a function that inputs the optimal control of each robot U _i (t) op as an input parameter, and then generating the overall input Ucluster (t) of the cluster. The function of this formula is to integrate the individual control inputs of each robot into one overall control input to ensure that the robot cluster operates as required by the collaborative rules. The coordination rules may include task allocation, path planning, coordination obstacle avoidance policies, etc. to ensure that the robots work cooperatively when performing tasks. Overall input U _cluster (t) is generated by a collaborative rule F (t) which represents the overall control input of the entire robot cluster. The overall input serves to enable the robot clusters to operate in a coordinated manner. The method reflects the result of the cooperative rule and guides the action to be taken by the robot when the robot executes the task.

The formula of the collaborative rule F (t) in this embodiment is used to combine the optimal control inputs for each robot into the overall input for the robot cluster. This overall input is a cluster-level control instruction for ensuring that the robots work cooperatively in accordance with a cooperative strategy. The specific form and algorithm of the cooperative rules may be different according to application requirements, and may be designed and adjusted according to task requirements, so as to realize efficient cooperative work of the robot cluster.

Example 6: on the basis of the above embodiment, the cooperative rule F (t) is expressed using the following formula:

where lambda is the trade-off control rewards objectiveThe hyper-parameters of the scalar function and regularization term,the complexity regularization term is represented, and the complexity of the collaborative rule is described as a set value.

Specifically, the core of the overall formula is an overall objective function, expressed as It is composed of two parts. This overall objective function aims to trade off two key factors: the sum of the reward objective functions and the regularization term are controlled. The overall objective is to maximize the value of this overall objective function by choosing an appropriate collaborative rule F (t). />Part represents the sum of the control reward objective functions, where R _i (t) is a control reward objective function for robot i. This section takes into account the individual performance of each robot, which is desirably maximized by selecting a collaboration rule. The performance of each robot is summarized and taken into account to determine the overall performance. Lambda is a hyper-parameter used to balance the weights between the control reward objective function and the regularization term. />The representation complexity regularization term is a set value describing the complexity of the collaborative rule. This part introduces regularization to ensure that the selected collaborative rule is not overly complex. A larger lambda value will emphasize the simplicity of the rule more, while a smaller lambda value will emphasize the performance more. The presence of regularization terms helps to improve the interpretability and maintainability of the collaborative rules. The goal of this optimization problem is to find a collaborative rule F (t) that maximizes the overall objective function. Through the optimization process, the system selects the cooperative rule most suitable for the task requirement to realize high performanceIs cooperated with the cluster of the (c). The choice of λ allows a trade-off between performance and rule complexity. The principle of this formula is to select the best co-rule F (t) by an overall objective function, taking into account both the performance objective and the rule complexity. The method can be used for automatically selecting the cooperative rules adapting to different tasks and environments, and realizing better performance in a cooperative control system.

Example 7: on the basis of the above embodiment, the collision control section performs collision prediction on robots in a robot cluster on the basis of a cooperative control strategy, and the method includes: generating a position vector for each robot from the robot position dataWhere i denotes the number of the robot, and 0 denotes the initial time; the collision objective function is determined as:

wherein J (P) ^(t) ) Is the collision objective function, P ^(t) Is the position vector of the robot cluster at the time t, w _i Is the weight of robot i, f _i Is a local objective function of the robot i,is a global target position vector; minimizing the objective function J (P ^(t) ) In order to find an optimal robot position, wherein the optimal robot position of robot i isDetermining a perception range of each robot; according to the perception range, in the motion process of the robot, determining whether other robots exist in the perception range, if so, comparing whether the two optimal robot positions of the robot and the other robots in the perception range overlap, if so, adjusting a second learning rate B, and repeatingThe optimal robot position of the robot is newly determined.

Specifically, each robot generates a position vector at an initial time t=0This vector records the initial position of the robot. Initial position vector +.>The initial state of the robot is provided as a starting point for collision prediction. Which is used as a reference in the subsequent collision prediction and avoidance process. Collision objective function J (P ^(t) ) Position vector P for measuring robot clusters at time t ^(t) And global target position vector->Collision conditions between the two. It consists of a local objective function f of each robot _i And the weight is weighted and summed. The purpose of the collision objective function is to translate collision prediction into a mathematical optimization problem. By minimizing this function, the position vector of the robot cluster can be made to tend to avoid collisions, thereby improving safety. For the collision objective function J (P ^(t) ) Performing minimization to determine the optimal position +/for each robot>The gradient descent method is an optimization technique for adjusting the position of each robot to minimize the value of the collision objective function. This helps the robot to avoid collisions during movement, ensuring safety. Each robot has a sensing range for detecting whether other robots exist within the sensing range. The sensing range allows the robot to detect the position of other robots around. This is critical for collision detection. If there are other robots in the sensing range, the robot will further check if there is an overlap conflict. If the robot detects an overlap conflict within the sensing range, it will adjust the second learning rate beta and then re-determine its own optimal position. This machineThe system allows the robot to dynamically adjust when a potential collision is detected to avoid the collision. By recalculating the optimal position, the robot can adjust its path of motion, ensuring safety. Collision prediction and collision avoidance are translated into an optimization problem, which minimizes the collision objective function by adjusting the position of the robot. The sensing range is used for detecting the positions of other robots and avoiding collision through dynamic adjustment so as to ensure the safe operation of the robot cluster.

Example 8: on the basis of the above embodiment, the following formula is used to minimize the objective function J (P ^(t) )：

In particular, the gradient descent method is an optimization algorithm for minimizing the objective function. In this formula, it is used to adjust the position vector of the robot so that the collision objective function J (P ^(t) ) Is reduced. By iteratively updating position vectorsThe robot tries to decrease the value of the collision objective function along the gradient direction of the collision objective function, thereby avoiding the collision. The second learning rate B and learning rate μ control the step size of each update. The second learning rate B is a super-parameter that determines the step size of the gradient descent. It is generally tailored to the specific problem and needs of the algorithm. The choice of B affects the convergence speed and stability of the gradient descent. A larger B value may result in a shock or unstable convergence, while a smaller B value may result in a slower convergence rate. The learning rate μ is a constant for adjusting the step size of the gradient descent. μ determines the magnitude of each gradient descent update. A larger learning rate μmay lead to unstable convergence, while a smaller learning rate μmay lead to slow convergence rate. Thus, the choice of μ also needs to be adjusted to the particular problem. This formula describes updating the position vector of the robot by gradient descent to minimize the collision objective function J (P ^(t) ) Thereby realizing collision prediction and collision avoidance targets. The choice of the learning rate μ and the second learning rate B is critical to the effectiveness of the optimization process and needs to be adjusted according to the specific situation. This approach helps to ensure that the robot avoids collisions and remains safe during cooperative work.

Example 9: on the basis of the above embodiment, the perception range of the robot is determined by the following formula:

Specifically, this formula is applied to the communication direction vector Mmi of each pair of robots i and other robots m ^(t) Weighted summation is carried out on the distance Dmi, and a perception range vector of the robot i is calculatedSpecifically, it first calculates a communication direction vector Mmi ^(t) And the reciprocal of the distance Dmi, then add them, and multiply by the communication direction vector +.>This weighted sum represents the degree of perception of the robot i to the other robots in each direction. The main function of the formula is to calculate the perception range vector of robot i +.>This vector describes the direction and distance that the robot i can perceive. By calculating the perception range vector +.>The perceived range of robot i, i.e. the maximum distance that the robot can detect other robots, can be determined. The weighted sum in the formula takes into account the influence of the communication direction and distance. Communication direction vector Mmi ^(t) The relative positions of the other robots in different directions are shown, while the distance Dmi represents the actual distance. Such weighted sum allows robot i to pay more attention to robots that are closer and reduces the impact of robots that are farther away. The communication direction, distance, and other robot contributions are combined to calculate the perception range of the robot. The perception range is key information in a robot cooperative control system and is used for determining the direction and distance that the robot can perceive other robots, so that the robot is helped to avoid collision, realize cooperative work and make safe motion decisions.

Example 10: on the basis of the above embodiment, the communication direction vector is calculated by the following formula:

Specifically, the core principle of the formula is by the position vector difference between robot i and robot mTo determine the direction of communication. Specifically, the method comprisesIt takes into account the following factors: position difference vector +.>The direction and distance from robot m to robot i are shown. />The modulus of the position difference vector, i.e. the actual distance between robot i and robot m, is shown. +.>Part distance weighting the communication direction, where D _im Represents the distance between robot i and robot m, D _max Is the maximum communication range of the robot i. The main function of this formula is to calculate the communication direction vector +.>The vector describes the direction of communication of robot i to robot m. +.>The communication direction vector is distance weighted in part. The influence of the closer robots in the communication direction is larger, while the influence of the farther robots is smaller. />The section makes the communication direction vector a unit vector. This is because the mode of the communication direction does not affect the communication direction, only the direction is important. D (D) _max The maximum communication range of the robot i, i.e. the maximum distance that the robot i can communicate with other robots, is indicated. D (D) _max The function of (a) is to ensure that the modulus of the communication direction vector does not exceed the maximum communication range of the robot. This means that the robot i will only communicate to the extent that it can communicate information. The communication direction is considered as a direction from the robot m to the robot i, and the accuracy of the communication direction vector is ensured by the distance weighting and the unit vectorization.

The present invention has been described in detail above. The principles and embodiments of the present invention have been described herein with reference to specific examples, the description of which is intended only to facilitate an understanding of the method of the present invention and its core ideas. It should be noted that it will be apparent to those skilled in the art that various modifications and adaptations of the invention can be made without departing from the principles of the invention and these modifications and adaptations are intended to be within the scope of the invention as defined in the following claims.

Claims

1. Intelligent robot cluster cooperative control system, its characterized in that, the system includes: a robot data acquisition section and a control section; the data acquisition part is configured to acquire the state and control input of each robot in the robot cluster at each moment; the states include: robot position data, robot motion gesture data, and robot sensor readings; the control inputs include: robot target position data and target motion attitude data of the robot; the control section includes: a cooperative control section and a collision control section; the cooperative control part is configured to generate a cooperative control strategy based on the state and control input of all robots in the robot cluster and based on a cooperative effect, so that a control rewarding objective function of the whole robot cluster is maximum; the collision control part is configured to predict the collision of the robots in the robot cluster on the basis of the cooperative control strategy so as to avoid the occurrence of the collision of the robots.

2. The intelligent robot cluster cooperative control system according to claim 1, wherein the execution process of the cooperative control section includes: setting the number of robots as N, and controlling the time step as t; the state of the robot i at the time t is S _i (t); control input U of robot i at time t _i (t); in order to make the control reward objective function of the whole robot cluster be the maximum value, the control reward objective function of each robot is the maximum value; in state S _i (t) and control input U _i (t) is an influencing factor, constructThe control reward objective function of robot i is R _i (t); at R _i (t) control input U for each time step under constraint of maximum value _i (t) optimizing to obtain an optimized control input U _i (t) _op The method comprises the steps of carrying out a first treatment on the surface of the Setting a cooperative rule F (t), and inputting optimal control U of all robots _i (t) _op Overall input U combined into a cluster _cluster (t); using global input U _cluster (t) controlling the motion of each robot in the robot cluster.

3. The intelligent robot cluster cooperative control system of claim 2, wherein the objective function R _i (t) is expressed using the following formula

Wherein d _ij (t) is the distance of the robot i from the target position calculated from the robot position data and the robot target position data;the motion gesture data of the robot i; />Target action gesture data of the robot i; ρ (t) represents a cooperative parameter of the cluster, which is a set value, and the value of the cooperative parameter may be different in each time step; θ _i (t) represents the response time of robot i at each time step; sigma (sigma) _i (t) represents the sensor readings of robot i; t is t ₀ Is the current time.

4. The intelligent robot cluster cooperative control system of claim 3, wherein R is _i (t) control input U for each time step under constraint of maximum value _i (t) optimizing to obtain an optimized control input U _i (t) _op The method of (1) comprises:

wherein α is a first learning rate;is a laplace operator.

5. The intelligent robot cluster cooperative control system as set forth in claim 4, wherein the cooperative rule F (t) inputs the control of all robots into U using the following formula _i (t) overall input U combined into clusters _cluster (t)：

U _cluster (t)＝F(t,U ₁ (t) _op ,U ₂ (t) _op ,…,U _N (t) _op )。

6. The intelligent robot cluster cooperative control system of claim 5, wherein the cooperative rule F (t) is expressed using the following formula:

7. The intelligent robot cluster cooperative control system according to claim 6, wherein the collision control section performs the collision prediction of the robots in the robot cluster based on the cooperative control strategy, the method comprising: generating a position vector for each robot from the robot position dataWhere i denotes the number of the robot, and 0 denotes the initial time; the collision objective function is determined as:

8. The intelligent robot cluster cooperative control system of claim 7, wherein the objective function J (P ^(t) )：

9. The intelligent robot cluster cooperative control system of claim 7, wherein the perceived range of the robot is determined by the following formula:

10. The intelligent robot cluster cooperative control system of claim 9, wherein the communication direction vector is calculated by the following formula: